Harrison/arize #2180

Merged
merged 2 commits into master from harrison/arize
Mar 30, 2023

Conversation

hwchase17
Contributor

No description provided.

hakantekgul and others added 2 commits March 29, 2023 22:40
**Arize Callback Handler for LangChain <> Arize Integration**
This pull request adds a new Python class, ArizeCallbackHandler, a
callback handler that logs data to Arize, an ML observability
platform, whenever a large language model (LLM) runs within
LangChain. Specifically, it logs prompt/response pairs along with
their embeddings for monitoring and further fine-tuning.

The ArizeCallbackHandler class is implemented as a subclass of
BaseCallbackHandler and contains the following methods for handling
events triggered during LLM execution:

- on_llm_start: records the prompts when an LLM starts
- on_llm_end: logs data to Arize when an LLM ends, including the
prompt, response, embeddings, ID, and timestamp
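The event flow those two methods implement can be sketched in plain Python. This is illustrative only: the class name, the in-memory record list, and the stub event signatures here are assumptions for the sketch, not the actual Arize implementation, which subclasses LangChain's BaseCallbackHandler and ships the records to Arize rather than keeping them in a dict.

```
from datetime import datetime, timezone


class SketchCallbackHandler:
    """Illustrative handler: collects prompt/response pairs across an LLM run."""

    def __init__(self):
        self.prompts = []   # filled by on_llm_start
        self.records = []   # filled by on_llm_end

    def on_llm_start(self, serialized, prompts, **kwargs):
        # Record the prompts when an LLM starts.
        self.prompts = list(prompts)

    def on_llm_end(self, responses, **kwargs):
        # Pair each prompt with its response and a timestamp,
        # then queue the batch for logging.
        ts = datetime.now(timezone.utc).isoformat()
        for prompt, response in zip(self.prompts, responses):
            self.records.append(
                {"prompt": prompt, "response": response, "timestamp": ts}
            )


handler = SketchCallbackHandler()
handler.on_llm_start({}, ["Tell me a joke"])
handler.on_llm_end(["Why did the chicken cross the road?"])
print(handler.records[0]["prompt"])  # → Tell me a joke
```

The key point is that state captured at `on_llm_start` (the prompts) is only complete enough to log once `on_llm_end` supplies the matching responses.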

The constructor of ArizeCallbackHandler takes four optional arguments
(needed for logging to Arize):

- model_id: the ID of the model in Arize (default: None)
- model_version: the version of the model in Arize (default: None)
- SPACE_KEY: the space key for the Arize client (default: None)
- API_KEY: the API key for the Arize client (default: None)

When ArizeCallbackHandler is instantiated, it initializes empty lists to
store the prompt/response pairs and other necessary data. It also
creates an embedding generator for producing embeddings from prompts
and responses; Arize's own embedding generator is used for consistency
and ease of use. Finally, it creates an Arize client and checks that
SPACE_KEY and API_KEY are set to valid values.
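One way to express that constructor check is sketched below. The class name and the choice to raise a ValueError are assumptions for illustration; the real handler may warn rather than raise, and it additionally builds the embedding generator and Arize client described above.

```
class ArizeHandlerSketch:
    """Illustrative constructor showing the SPACE_KEY/API_KEY check."""

    def __init__(self, model_id=None, model_version=None,
                 SPACE_KEY=None, API_KEY=None):
        self.model_id = model_id
        self.model_version = model_version
        self.prompt_records = []  # prompt/response pairs to log later
        # Both credentials are required before anything can be logged.
        if SPACE_KEY is None or API_KEY is None:
            raise ValueError("SPACE_KEY and API_KEY must be set to log to Arize")
        self.space_key = SPACE_KEY
        self.api_key = API_KEY
```

Validating credentials at construction time surfaces a misconfiguration immediately, instead of failing later inside an LLM callback.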

**Motivation and Context**
The purpose of this pull request is to provide a way to log data to
Arize whenever an LLM runs within LangChain. This allows monitoring
the LLM's performance and further fine-tuning the model if necessary.
As Arize develops LLM-supporting capabilities, minor changes may be
needed in the future.

**How Has This Been Tested?**
I tested this code using the Weights & Biases test cases with LLMs,
LLM chains, and LLM agents, and verified that the prompt/response
pairs and their embeddings were logged to Arize correctly. I also
tested with different model IDs and model versions in Arize to ensure
that the data was logged to the correct model.

An example test is provided below: 
```
from langchain.callbacks import ArizeCallbackHandler, StdOutCallbackHandler
from langchain.callbacks.base import CallbackManager
from langchain.llms import OpenAI

arize_callback = ArizeCallbackHandler(
    model_id="llm-langchain-demo",
    model_version="1.0",
    SPACE_KEY="88aa7af",
    API_KEY="7cab3a79306ef1cd9e7"
)

manager = CallbackManager([StdOutCallbackHandler(), arize_callback])

llm = OpenAI(temperature=0, callback_manager=manager, verbose=True)
```

```
llm_result = llm.generate(["Tell me a joke", "Tell me a poem"])
```
@hwchase17 hwchase17 merged commit 65c0c73 into master Mar 30, 2023
@hwchase17 hwchase17 deleted the harrison/arize branch March 30, 2023 05:55