[JS][Proposal] Streamlined Generation APIs #939

This is a proposed breaking API change for Genkit that streamlines the most common scenarios while keeping flexibility and capability constant. The changes break down into three components:

1. Encouraging default model configurations
2. Streamlining generation to return data directly instead of returning a wrapping response
3. Separating out multi-turn and single-turn use cases

## Default Model Configurations

While one of the strengths of Genkit is the ability to easily swap between multiple models, we find in practice that most people use a single model as their "go-to" with other models swapped in as needed. The same goes for model configuration -- most of the time you're going to want the same settings.

The proposal is to encourage setting a default model (now just called `model`) when initializing Genkit, and to allow model settings to be defined when instantiating a reference to a model.

```ts
import { genkit } from "genkit";
import { vertexAI } from "@genkit-ai/vertexai";

const ai = genkit({
  plugins: [vertexAI()],
  // sets a default model with configuration
  model: vertexAI.geminiModel('gemini-1.5-flash', {safetySettings: [...]}),
});

const claude = vertexAI.anthropicModel('claude-3.5-sonnet', {...claudeSettings});
```

Both model and configuration can still be overridden at call time, but this makes it easier to set a common reusable baseline.
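One way to picture the override order is the sketch below. It is not Genkit code: `ModelRef`, `CallOptions`, and `resolveCall` are hypothetical names, and the shallow merge of config objects is an assumption about how call-time settings could layer on top of the settings attached to a model reference.

```typescript
// Hypothetical types for illustration only; not Genkit's actual API.
interface ModelRef {
  name: string;
  config: Record<string, unknown>;
}

interface CallOptions {
  model?: ModelRef;
  config?: Record<string, unknown>;
}

// Assumed precedence, lowest to highest:
// 1. settings attached to the model reference (default or call-time model)
// 2. config passed at call time
function resolveCall(defaultModel: ModelRef, options: CallOptions = {}): ModelRef {
  const model = options.model ?? defaultModel;
  return {
    name: model.name,
    // Shallow merge: call-time keys win, untouched keys keep their baseline value.
    config: { ...model.config, ...options.config },
  };
}

const flash: ModelRef = {
  name: "gemini-1.5-flash",
  config: { temperature: 0.2, maxOutputTokens: 256 },
};
const claudeRef: ModelRef = { name: "claude-3.5-sonnet", config: { temperature: 0.7 } };

const baseline = resolveCall(flash);
const hotter = resolveCall(flash, { config: { temperature: 1 } });
const swapped = resolveCall(flash, { model: claudeRef });
```

Here `baseline` uses the default model unchanged, `hotter` overrides only `temperature`, and `swapped` replaces the model together with the settings attached to its reference.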

## Streamlining Generation

Most of the time, what you want from a `generate()` call is the data being generated. Today this requires a two-line "get response, get output from response" pattern, which gets tedious in multi-step processes, for example.

The proposal is to simplify `generate` so that it returns text or structured data, depending on the call configuration:

```ts
const jokeText = await ai.generate("Tell a funny joke.");

const fakePerson = await ai.generate({
  prompt: "Generate the information for an imaginary person named Annaka",
  schema: z.object({name: z.string(), job: z.string(), hobbies: z.array(z.string())}),
});
```

This can get more complex if you want it to:

```ts
const jokeAdvanced = await ai.generate({
  model: gpt,
  config: {...},
  prompt: {role: "user", content: [{text: "Tell a funny joke."}]},
});
```

When developers do want to dig into the metadata of the response, they can use a new `generateResponse` method, which will be equivalent to today's `generate`.

```ts
const jokeResponse = await ai.generateResponse("Tell a funny joke.");
console.log(jokeResponse.text());
console.log(jokeResponse.stopReason);
```
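The relationship between the two methods can be sketched as a thin wrapper: `generate` calls `generateResponse` and unwraps the result. Everything below is a stand-in for illustration: `SketchResponse`, `SketchOptions`, the fake model reply, and the `output()` accessor are assumptions, not the proposed implementation.

```typescript
// Hypothetical stand-in for the response object; only the parts needed here.
interface SketchResponse<T> {
  text(): string;
  output(): T | null;
}

interface SketchOptions {
  prompt: string;
  schema?: unknown; // in this sketch, the presence of a schema signals structured output
}

// Fake model call so the sketch runs without a network or a plugin.
async function generateResponse<T>(
  input: string | SketchOptions
): Promise<SketchResponse<T>> {
  const options: SketchOptions = typeof input === "string" ? { prompt: input } : input;
  const raw = options.schema ? '{"name":"Annaka"}' : `echo: ${options.prompt}`;
  return {
    text: () => raw,
    output: () => (options.schema ? (JSON.parse(raw) as T) : null),
  };
}

// generate() unwraps the response: structured output when a schema was
// given, plain text otherwise.
async function generate<T = string>(input: string | SketchOptions): Promise<T | string> {
  const response = await generateResponse<T>(input);
  const wantsData = typeof input !== "string" && input.schema !== undefined;
  return wantsData ? (response.output() as T) : response.text();
}
```

With this shape, callers that only need the data make a single call, while `generateResponse` remains available for metadata.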

Streaming will be supported through `streamGenerate` and `streamGenerateResponse`. With `streamGenerate`, the emitted chunks will be in output form (either a partial data response or a string chunk):

```ts
const {stream, data} = ai.streamGenerate("Tell a really long joke with at least 5 paragraphs.");

for await (const chunk of stream) {
  console.log(chunk); // chunk is just a string
}

console.log(await data); // this is the full result, equivalent to `generate()`

const {stream: responseStream, response} = ai.streamGenerateResponse(...);
for await (const chunk of responseStream) {
  console.log(chunk.text()); // chunk is a Chunk instance
}

console.log((await response).usage);
```
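A minimal sketch of how a `{stream, data}` pair could be produced from a source of string chunks follows. `streamWithResult` and `fakeModel` are hypothetical names, and the design is an assumption: in this sketch `data` settles only once the stream has been fully consumed, whereas a real implementation might drive the source independently of the consumer.

```typescript
// Hypothetical helper: turns an async iterable of string chunks into the
// {stream, data} pair described above.
function streamWithResult(source: AsyncIterable<string>): {
  stream: AsyncIterable<string>;
  data: Promise<string>;
} {
  let resolveData!: (value: string) => void;
  let rejectData!: (reason: unknown) => void;
  const data = new Promise<string>((resolve, reject) => {
    resolveData = resolve;
    rejectData = reject;
  });

  // Relays each chunk to the consumer while accumulating the full text.
  async function* relay(): AsyncGenerator<string> {
    let full = "";
    try {
      for await (const chunk of source) {
        full += chunk;
        yield chunk;
      }
      resolveData(full); // the source is exhausted: publish the full result
    } catch (err) {
      rejectData(err);
      throw err;
    }
  }

  return { stream: relay(), data };
}

// Fake chunk source standing in for a model.
async function* fakeModel(): AsyncGenerator<string> {
  yield "Once ";
  yield "upon ";
  yield "a time.";
}
```

Consuming `stream` with `for await` and then awaiting `data` mirrors the usage shown in the proposal.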

## Multi-Turn Generation

All of the above is great for single-turn generation, but it doesn't really help in a chatbot scenario. Multi-turn use cases are fundamentally different and deserve better attention in the API surface.

The proposal is a new `Chat` class and a new `send()` method that let you explicitly opt in to multi-turn conversational use cases.

```ts
const chat = ai.chat({
  system: "You are a pirate.",
});

const reply = await chat.send("How are you today?");
console.log(reply);
// "Yarr, not too bad, matey. How be ye?"
const {stream, data} = await chat.streamSend("Tell me a long story, ye scurvy sea dog!");
chat.messages(); // equivalent to `toHistory()` in current Genkit
```
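To illustrate what separates multi-turn from single-turn, here is a sketch of a chat object that accumulates history across calls. `ChatSketch`, `Message`, `ModelFn`, and the role names are hypothetical, and the model is a plain function passed in by the caller; none of this is the proposed `Chat` implementation.

```typescript
// Hypothetical message shape and role names, for illustration only.
type Role = "system" | "user" | "model";

interface Message {
  role: Role;
  content: string;
}

// Hypothetical model function: receives the whole history, returns the reply text.
type ModelFn = (history: Message[]) => Promise<string>;

class ChatSketch {
  private history: Message[] = [];
  private model: ModelFn;

  constructor(model: ModelFn, options: { system?: string } = {}) {
    this.model = model;
    if (options.system) {
      this.history.push({ role: "system", content: options.system });
    }
  }

  // Appends the user message, asks the model with the full history,
  // records the reply, and returns the reply text directly.
  async send(text: string): Promise<string> {
    this.history.push({ role: "user", content: text });
    const reply = await this.model([...this.history]);
    this.history.push({ role: "model", content: reply });
    return reply;
  }

  // Returns a copy so callers cannot mutate the internal history.
  messages(): Message[] {
    return [...this.history];
  }
}
```

The key difference from single-turn `generate` in this sketch is that every `send()` sees all earlier turns, so state lives in the chat object rather than in the caller.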
