Description
When setting up evals for an existing system, the reality is that the "AI pipeline" is often not "pure": it depends on many external resources along the way, which makes it hard to simply extract it and run it as part of an experiment.
As such, it is common to have an external tool or script that triggers an endpoint to start the AI pipeline; generally, this is the same endpoint that real users hit in the product. The script can then be pointed at a local development environment, or even production, to generate logs.
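A minimal sketch of such a trigger script, assuming a hypothetical `/run-pipeline` endpoint and a JSON file of inputs; the URL, payload shape, environment variables, and auth header are illustrative only, not part of any real API:

```python
# Sketch of an external script that triggers the product's own endpoint
# for each eval input and collects the responses (and hence the logs).
# PIPELINE_BASE_URL / API_TOKEN / /run-pipeline are assumed names.
import json
import os

import requests

BASE_URL = os.environ.get("PIPELINE_BASE_URL", "http://localhost:8000")


def trigger_pipeline(inputs: list[dict]) -> list[dict]:
    """POST each input to the pipeline endpoint and collect the outputs."""
    results = []
    for item in inputs:
        resp = requests.post(
            f"{BASE_URL}/run-pipeline",
            json=item,
            headers={"Authorization": f"Bearer {os.environ.get('API_TOKEN', '')}"},
            timeout=60,
        )
        resp.raise_for_status()
        results.append(resp.json())
    return results


if __name__ == "__main__":
    with open("eval_inputs.json") as f:
        inputs = json.load(f)
    outputs = trigger_pipeline(inputs)
    print(f"Collected {len(outputs)} outputs")
```

Pointing `PIPELINE_BASE_URL` at local development versus production is then just a configuration change, with no modification to the pipeline itself.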
However, this means the same set of inputs may be run against the same versioned function multiple times, and it would be helpful to be able to annotate and compare these runs separately (i.e. analogous to an A/A test).
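One way this could look, as a rough sketch: tag each pass over the inputs with its own run label so that otherwise identical runs of the same versioned function can be annotated and compared side by side. The `run_fn` callable and the metadata fields here are assumptions for illustration, not an existing interface.

```python
# Sketch: run the same inputs N times against the same versioned function,
# labeling each pass with a distinct run_id so the logs can be compared
# as an A/A test. Field names are illustrative only.
import uuid
from datetime import datetime, timezone


def run_repeated(inputs, run_fn, n_runs=2):
    """Execute run_fn over the same inputs n_runs times, labeling each pass."""
    all_results = []
    for i in range(n_runs):
        run_id = f"aa-run-{i}-{uuid.uuid4().hex[:8]}"
        started_at = datetime.now(timezone.utc).isoformat()
        for item in inputs:
            output = run_fn(item)
            all_results.append({
                "run_id": run_id,          # distinguishes otherwise identical passes
                "started_at": started_at,
                "input": item,
                "output": output,
            })
    return all_results
```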