# Log manual experiments

Here, we'll do a very basic walkthrough of how you can log individual predictions as LangSmith `experiments`. This is useful if you already have some evaluation flow set up but you still want to take advantage of LangSmith's experiment tracking functionality.

First, create a LangSmith account and API Key, then create an .env file with values for the following variables in the same directory as this notebook:

```
LANGCHAIN_API_KEY=<YOUR LANGSMITH API KEY>
```

In [1]:
import "dotenv/config"; // Load env vars from .env file

[Module: null prototype] { default: {} }

## Create Dataset

First, create a dataset from the inputs (and optional reference outputs) we are evaluating over. These examples let us compare predictions from different models or systems on similar data points.

In [2]:
import { Client } from "langsmith";

const client = new Client();

const datasetName = `My-Dataset-${new Date().toISOString()}`;
const dataset = await client.createDataset(datasetName);
await client.createExample({"input": "Foo"}, {"output": "Bar"}, {datasetId: dataset.id})
await client.createExample({"input": "Foo 2"}, {"output": "Bar 2"}, {datasetId: dataset.id})

{
  inputs: { input: [32m"Foo 2"[39m },
  outputs: { output: [32m"Bar 2"[39m },
  dataset_id: [32m"893dd7d2-8ba6-4a47-9ebb-6fc133aaeba1"[39m,
  source_run_id: [1mnull[22m,
  metadata: [1mnull[22m,
  created_at: [32m"2024-04-17T22:56:51.675000+00:00"[39m,
  id: [32m"3a3b8945-156e-4a35-8e2f-12f7619a77ce"[39m,
  name: [32m""[39m,
  modified_at: [32m"2024-04-17T22:56:51.675000+00:00"[39m
}

## AddPrediction helper

Next, define an `addPrediction` helper function. This does 2 things:
1. Creates an experiment, if it doesn't already exist.
2. Creates a `Run` to represent your prediction on this data point.

It returns the runId you can use for logging feedback.

In [3]:
import { v4 as uuidv4 } from "uuid";

interface addPredictionProps {
    experimentName: string;
    name?: string;
    runId?: string;
    referenceExampleId: string;
    inputs?: unknown;
}

async function addPrediction(client: Client, prediction: unknown, props: addPredictionProps): Promise<void> {
    const { experimentName, runId, referenceExampleId, inputs, name } = props;

    const example = await client.readExample(referenceExampleId);

    await client.createProject({ projectName: experimentName, referenceDatasetId: example.dataset_id, upsert: true });
    const runId_ = runId ?? uuidv4();
    await client.createRun({
        name: name ?? "Tested" ,
        id: runId_,
        inputs: inputs ? {input: inputs}: undefined,
        outputs: {output: prediction}, 
        run_type: 'chain', 
        reference_example_id: referenceExampleId,
        start_time: Date.now(),
        end_time: Date.now(),
        project_name: experimentName,
    });
    return runId_;
}


## Add your first prediction

To log a prediction, at minimum we need:
- The predicted value
- The dataset example we were predicting for
- The experiment name to associate this prediction with

In [10]:
const exampleIds = [];
for await (const example of client.listExamples({datasetName})) {
    exampleIds.push(example.id)
}
console.log(exampleIds)

[
  [32m"3a3b8945-156e-4a35-8e2f-12f7619a77ce"[39m,
  [32m"8164992a-8db6-495e-bd23-bef7769629a9"[39m
]


In [11]:
const predictionOne = "Foo"; 
const experimentName = "MyExperiment"

const runId = await addPrediction(client, predictionOne, {referenceExampleId: exampleIds[0], experimentName})

## Log Feedback

Now you can log any type of feedback metrics for this prediction. For instance, you can score this with continuous values:

In [12]:
await client.createFeedback(runId, "correctness", {score: 1, comment: "This looks impeccable."})

{
  id: [32m"a619d236-a28f-444a-a127-aea3498b0808"[39m,
  run_id: [32m"5abc59ee-0e42-4f5f-b600-7f8e76cb4794"[39m,
  key: [32m"correctness"[39m,
  score: [33m1[39m,
  value: [90mundefined[39m,
  correction: [90mundefined[39m,
  comment: [32m"This looks impeccable."[39m,
  feedback_source: { type: [32m"api"[39m, metadata: {} },
  feedbackConfig: [90mundefined[39m
}

You can also log unstructured notes without a score.

In [13]:
await client.createFeedback(runId, "note", { comment: "I think I could do better though. Not gonna leave a score here."})

{
  id: [32m"2acb6755-6c35-4d24-83c8-9bbb79e85674"[39m,
  run_id: [32m"5abc59ee-0e42-4f5f-b600-7f8e76cb4794"[39m,
  key: [32m"note"[39m,
  score: [90mundefined[39m,
  value: [90mundefined[39m,
  correction: [90mundefined[39m,
  comment: [32m"I think I could do better though. Not gonna leave a score here."[39m,
  feedback_source: { type: [32m"api"[39m, metadata: {} },
  feedbackConfig: [90mundefined[39m
}

## Add more predictions

Continue in a similar way, logging predictions for each example in the dataset.

In [14]:
const predictionTwo = "Bar";
const runId2 = await addPrediction(client, predictionTwo, {referenceExampleId: exampleIds[1], experimentName})

In [15]:
await client.createFeedback(runId, "Correctness", {score: 0, comment: "Completely wrong."})

{
  id: [32m"aa8f1d3f-ff14-47b5-b974-48da07ff2faa"[39m,
  run_id: [32m"5abc59ee-0e42-4f5f-b600-7f8e76cb4794"[39m,
  key: [32m"Correctness"[39m,
  score: [33m0[39m,
  value: [90mundefined[39m,
  correction: [90mundefined[39m,
  comment: [32m"Completely wrong."[39m,
  feedback_source: { type: [32m"api"[39m, metadata: {} },
  feedbackConfig: [90mundefined[39m
}