# LangSmith Evaluation with OpenAI Assistants

The OpenAI Assistants API lets you build assistants into your application. An assistant can use tools, models, and knowledge to respond to a user's query. In this demo, we'll build a very simple assistant that answers questions based on knowledge in documents, then trace and evaluate it with LangSmith. Learn more about OpenAI assistants [here](https://platform.openai.com/docs/assistants/overview?context=with-streaming).

First, we'll do some setup. Create a LangSmith API Key by navigating to the settings page in LangSmith, then set the following environment variables.
```
OPENAI_API_KEY=<YOUR OPENAI API KEY>
LANGCHAIN_TRACING_V2=true
LANGCHAIN_PROJECT='oai-test'
LANGCHAIN_API_KEY=<YOUR LANGSMITH API KEY>
```
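
A missing variable tends to surface later as an opaque authentication error, so it can help to check the environment up front. The sketch below is one possible way to do that. `findMissingEnv` and `REQUIRED_ENV` are hypothetical names introduced here (they are not part of the LangSmith or OpenAI SDKs), and the lookup function is passed in so the check is not tied to a particular runtime.

```typescript
// Names of the environment variables listed above.
const REQUIRED_ENV = [
  "OPENAI_API_KEY",
  "LANGCHAIN_TRACING_V2",
  "LANGCHAIN_PROJECT",
  "LANGCHAIN_API_KEY",
];

// Hypothetical helper: returns the names that are unset or blank.
// `lookup` is injected so the same check works in Deno, Node, or tests.
function findMissingEnv(
  names: string[],
  lookup: (name: string) => string | undefined,
): string[] {
  return names.filter((name) => {
    const value = lookup(name);
    return value === undefined || value.trim() === "";
  });
}

// In a Deno notebook this could be called as:
//   const missing = findMissingEnv(REQUIRED_ENV, (n) => Deno.env.get(n));
//   if (missing.length > 0) throw new Error(`Missing: ${missing.join(", ")}`);
```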

In [40]:
import OpenAI from "npm:openai@4.33.1";
import { Client } from "npm:langsmith";
import { traceable } from "npm:langsmith/traceable";
import { wrapOpenAI } from "npm:langsmith/wrappers";

// Initialize the LangSmith and OpenAI clients
const openai = new OpenAI();
const langsmith = new Client();

In [41]:
// Function to upload a file to OpenAI so that the Assistant can use it
// Calls the REST API directly instead of the SDK for Deno compatibility
async function uploadFile(filename: string, path: string, apiKey: string) {
    const fileData = await Deno.readFile(path);
    const formData = new FormData();
    formData.append("purpose", "assistants");
    formData.append("file", new Blob([fileData]), filename);

    const response = await fetch("https://api.openai.com/v1/files", {
        method: "POST",
        headers: {
            "Authorization": `Bearer ${apiKey}`
        },
        body: formData
    });

    if (response.ok) {
        const data = await response.json();
        console.log(`Uploaded file with ID: ${data.id}`);
        return data.id;
    } else {
        console.error("Failed to upload file:", await response.text());
        throw new Error("Failed to upload file");
    }
}

const files = ["alice.txt", "bob.txt", "sarah.txt"];
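const apiKey = Deno.env.get("OPENAI_API_KEY");
if (!apiKey) {
    // Fail early rather than sending "Bearer undefined" to the API
    throw new Error("OPENAI_API_KEY is not set in the environment");
}
const fileIds: string[] = [];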

for (const filename of files) {
    const filePath = `./files/${filename}`;
    try {
        const fileId = await uploadFile(filename, filePath, apiKey);
        fileIds.push(fileId);
    } catch (error) {
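        // `error` is typed `unknown` in strict TypeScript, so narrow it before reading `.message`
        console.error(`Error uploading ${filename}: ${error instanceof Error ? error.message : error}`);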
    }
}

console.log("All file IDs:", fileIds);

Uploaded file with ID: file-1mLrAxdYvZyXm5eYKbGsbfPJ
Uploaded file with ID: file-tXS5BaWZF1AeninxtN2h4L4t
Uploaded file with ID: file-UhneqboUOJLioqtRRexrrfUh
All file IDs: [
  "file-1mLrAxdYvZyXm5eYKbGsbfPJ",
  "file-tXS5BaWZF1AeninxtN2h4L4t",
  "file-UhneqboUOJLioqtRRexrrfUh"
]


In [33]:
// Create the Assistant
const assistant = await openai.beta.assistants.create({
    name: "Investment Assistant",
    instructions: "You are an investment support chatbot. Answer the user's questions based SOLELY on information in the provided documents.",
    tools: [{ type: "retrieval" }],
    model: "gpt-4-turbo",
    file_ids: fileIds
});

async function generateAssistantResponse(input: { question: string }) {
    const thread = await openai.beta.threads.create();
    const message = await openai.beta.threads.messages.create(
      thread.id,
      {
        role: "user",
        content: input.question
      }
    );
    const run = await openai.beta.threads.runs.createAndPoll(thread.id, {
      assistant_id: assistant.id,
    });
    const messages = await openai.beta.threads.messages.list(
        run.thread_id
    );
    return messages.data.reverse()[1].content[0].text.value
}
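
The last line of `generateAssistantResponse` assumes that the second message in the thread is the assistant's reply and that its first content part is text. A more defensive extraction might look like the sketch below. The `ThreadMessage` and `ContentPart` types are simplified stand-ins for the SDK's message types (only the fields this notebook touches), and `latestAssistantText` is a hypothetical helper, not an SDK function. It assumes the list is ordered newest first, which is what the `reverse()` call above implies.

```typescript
// Simplified stand-ins for the message shape used above (an assumption;
// the real SDK types carry more fields and more content-part variants).
type ContentPart = { type: string; text?: { value: string } };
type ThreadMessage = { role: "user" | "assistant"; content: ContentPart[] };

// Hypothetical helper: returns the text of the most recent assistant
// message, or undefined if there is none. Expects newest-first ordering.
function latestAssistantText(messages: ThreadMessage[]): string | undefined {
  for (const message of messages) {
    if (message.role !== "assistant") continue;
    // Join all text parts; skip non-text parts such as images.
    const text = message.content
      .filter((part) => part.type === "text" && part.text !== undefined)
      .map((part) => part.text!.value)
      .join("\n");
    if (text.length > 0) return text;
  }
  return undefined;
}
```

Returning `undefined` instead of throwing lets the caller decide how to handle a run that produced no text, for example a run that failed or timed out.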

In [42]:
// Wrap the function in `traceable` to trace responses to LangSmith
const traceGenerateAssistantResponse = traceable(generateAssistantResponse);

In [28]:
await traceGenerateAssistantResponse({question: "What is Sarah's investment portfolio?"})

"Sarah's investment portfolio includes 6% ownership in Acme Inc and 2% ownership in XYZ Corp."

![Screenshot of Trace View in LangSmith](assistant_trace_view.png)

In [36]:
// Create a dataset to be used for testing using the LangSmith client
const examples = [
  [
    "What is Sarah's investment portfolio?",
    "Sarah's investment portfolio includes 6% ownership in Acme Inc and 2% ownership in XYZ Corp.",
  ],
  [
    "How much of Acme Inc does Bob own?",
    "Bob owns 5% of Acme Inc.",
  ],
  [
    "How much of Acme Inc does Alice own?",
    "Alice owns 2% of Acme Inc."
  ],
];

const datasetName = "OpenAI Assistants Pipeline";
const dataset = await langsmith.createDataset(datasetName);

await Promise.all(
  examples.map(async ([question, answer]) => {
    await langsmith.createExample(
      { question },
      { answer },
      { datasetId: dataset.id }
    );
  })
);

[ undefined, undefined, undefined ]

In [37]:
// Run a set of tests on the dataset and compare them in LangSmith
// First, set up evaluators to run against the test results
import type { RunEvalType, RunEvaluatorLike } from "npm:langchain/smith";
import { runOnDataset, Criteria, LabeledCriteria } from "npm:langchain/smith";

const evaluators: RunEvalType[] = [
  // LangChain's built-in evaluators
  Criteria("conciseness"),
  LabeledCriteria("correctness"),
];
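
The built-in criteria above are graded by an LLM. For this dataset, where every reference answer hinges on ownership percentages, a deterministic check could complement them. The sketch below shows only the scoring logic: `extractPercentages` and `percentageMatch` are hypothetical names, the `EvalResult` type is defined locally rather than imported, and the `{ key, score }` return shape is modeled on the feedback entries LangSmith records. How the prediction and reference are handed to a custom evaluator depends on the `langchain/smith` types, so treat the wiring into `evaluators` as an open assumption to verify.

```typescript
// Local stand-in for an evaluator result (assumption: mirrors the
// `key` and `score` fields of LangSmith feedback).
type EvalResult = { key: string; score: number };

// Pull out figures such as "6%" or "2.5%".
function extractPercentages(text: string): string[] {
  return text.match(/\d+(?:\.\d+)?%/g) ?? [];
}

// Scores 1 when every percentage in the reference also appears in the
// prediction, 0 otherwise.
function percentageMatch(prediction: string, reference: string): EvalResult {
  const expected = extractPercentages(reference);
  const found = new Set(extractPercentages(prediction));
  // With no figures in the reference there is nothing to check, so score 1.
  const allPresent = expected.every((figure) => found.has(figure));
  return { key: "percentage_match", score: allPresent ? 1 : 0 };
}
```

A check like this only confirms that the right numbers appear; it does not verify that they are attached to the right company, so it is best read alongside the `correctness` criterion rather than in place of it.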

In [39]:
// Use `runOnDataset` to run the pipeline against examples in the Dataset
await runOnDataset(traceGenerateAssistantResponse, datasetName, {
  evaluators,
  projectName: "test-oai-assistants-demo",
});


Predicting: ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 100.00% | 3/3

Completed
Running Evaluators: ▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░░░░░░░░░░░░░░░░░░░░░░░░ 33.33% | 1/3

Running Evaluators: ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░░░░░░░░░░ 66.67% | 2/3

Running Evaluators: ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 100.00% | 3/3



{
  projectName: "test-oai-assistants-demo",
  results: {
    "7cb7f3b5-ab41-4f82-9eab-004462500286": {
      execution_time: 6169,
      feedback: [
        {
          id: "bb44f247-e6b2-45dd-affd-a9e4055011c6",
          run_id: "9ba82d51-17ae-47b2-9ef3-a0955397624e",
          key: "conciseness",
          score: 1,
          value: "Y",
          correction: undefined,
          comment: "The criterion is conciseness. This means the submission should be brief and to the point, without un"... 352 more characters,
          feedback_source: [Object],
          feedbackConfig: undefined
        },
        {
          id: "55e374ad-6a6e-4c30-9d50-9f2aa68897c4",
          run_id: "9ba82d51-17ae-47b2-9ef3-a0955397624e",
          key: "correctness",
          score: 1,
          value: "Y",
          correction: un

![Screenshot of Experiment View in LangSmith](assistant_experiment.png)
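
The object returned by `runOnDataset` can also be summarized in code rather than read in the UI. The sketch below averages scores per feedback key. It relies only on fields visible in the printed output (`results`, `feedback`, `key`, `score`); the `EvalResults` and `Feedback` types and the `averageScores` helper are hypothetical names introduced for illustration, not library exports.

```typescript
// Minimal shape of the printed result (an assumption based on the
// output shown; the real object has more fields).
type Feedback = { key: string; score?: number | null };
type EvalResults = {
  projectName: string;
  results: Record<string, { execution_time?: number; feedback: Feedback[] }>;
};

// Hypothetical helper: mean score for each feedback key across all examples.
function averageScores(evalResults: EvalResults): Record<string, number> {
  const totals: Record<string, { sum: number; count: number }> = {};
  for (const row of Object.values(evalResults.results)) {
    for (const fb of row.feedback) {
      if (typeof fb.score !== "number") continue; // skip unscored feedback
      const entry = totals[fb.key] ?? { sum: 0, count: 0 };
      entry.sum += fb.score;
      entry.count += 1;
      totals[fb.key] = entry;
    }
  }
  const averages: Record<string, number> = {};
  for (const [key, { sum, count }] of Object.entries(totals)) {
    averages[key] = sum / count;
  }
  return averages;
}
```

Assigning the result of `runOnDataset` to a variable and passing it to a helper like this would give a quick per-criterion summary to compare across experiment runs.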