[JS][Proposal] Streamlined Generation APIs #939

@mbleigh

This is a proposed breaking API change for Genkit that streamlines the most common scenarios while keeping flexibility and capability constant. The changes break down into three components:

  1. Encouraging default model configurations
  2. Streamlining generation to return data directly instead of returning a wrapping response
  3. Separating out multi-turn and single-turn use cases

Default Model Configurations

While one of the strengths of Genkit is the ability to easily swap between multiple models, we find in practice that most people use a single model as their "go-to" with other models swapped in as needed. The same goes for model configuration -- most of the time you're going to want the same settings.

The proposal is to encourage setting a default model (now just called model) when initializing Genkit, and to allow model settings to be defined when instantiating a reference to a model.

import { genkit } from "genkit";
import { vertexAI } from "@genkit-ai/vertexAI";

const ai = genkit({
  plugins: [vertexAI()],
  // sets a default model with configuration
  model: vertexAI.geminiModel('gemini-1.5-flash', {safetySettings: [...]}),
});

const claude = vertexAI.anthropicModel('claude-3.5-sonnet', {...claudeSettings});

Both model and configuration can still be overridden at call time, but this makes it easier to set a common reusable baseline.
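
The override behavior could work roughly as follows. This is a minimal sketch, not Genkit's implementation: `resolveCall` is a hypothetical helper, the model references are plain objects, and the precedence (call-time config shallow-merged over the config attached to the selected model reference) is an assumption that the proposal does not spell out.

```javascript
// Hypothetical model references: a name plus the config they were created with.
const flash = {
  name: "gemini-1.5-flash",
  config: { temperature: 0.2, maxOutputTokens: 256 },
};
const claude = { name: "claude-3.5-sonnet", config: { temperature: 0.7 } };

// Hypothetical helper: picks the model for a call and merges its config.
// Assumed precedence: call-time config wins over the model reference's config.
function resolveCall(defaultModel, callOptions = {}) {
  const model = callOptions.model ?? defaultModel;
  return {
    name: model.name,
    // Shallow merge only; nested settings are replaced, not merged.
    config: { ...model.config, ...callOptions.config },
  };
}

const usesDefault = resolveCall(flash);
const overridesConfig = resolveCall(flash, { config: { temperature: 0.9 } });
const overridesModel = resolveCall(flash, { model: claude });
```

Here `overridesConfig` keeps `maxOutputTokens` from the baseline while replacing `temperature`, and `overridesModel` picks up the configuration attached to `claude` rather than the default model's.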

Streamlining Generation

Most of the time, what you want from a generate() call is the data that is being generated. Today this requires a two-line "get response, get output from response" pattern which gets tedious when working with e.g. multi-step processes.

Proposed is to simplify to a generate API that will return text or structured data depending on call configuration:

const jokeText = await ai.generate("Tell a funny joke.");

const fakePerson = await ai.generate({
  prompt: "Generate the information for an imaginary person named Annaka",
  schema: z.object({name: z.string(), job: z.string(), hobbies: z.array(z.string())}),
});

This can get more complex if you want it to:

const jokeAdvanced = await ai.generate({
  model: gpt,
  config: {...},
  prompt: {role: "user", content: [{text: "Tell a funny joke."}]},
});

When developers do want to dig into the metadata of the response, they can use a new generateResponse method which will be equivalent to generate today.

const jokeResponse = await ai.generateResponse("Tell a funny joke.");
console.log(jokeResponse.text());
console.log(jokeResponse.stopReason);
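
One way to picture the relationship between the two methods is `generate` as a thin wrapper over `generateResponse`. The sketch below is illustrative only: `generateResponse` is a stub that returns canned data instead of calling a model, the presence of `schema` is assumed to be what selects structured output, and no schema validation is performed.

```javascript
// Stub standing in for a real model call (assumption: canned responses only).
async function generateResponse(options) {
  const request = typeof options === "string" ? { prompt: options } : options;
  const raw = request.schema
    ? '{"name":"Annaka","job":"pilot","hobbies":["sailing"]}'
    : "Why did the chicken cross the road?";
  return {
    stopReason: "stop",
    text: () => raw,
    // Structured output is only available when a schema was requested.
    output: () => (request.schema ? JSON.parse(raw) : null),
  };
}

// Sketch of the streamlined call: unwrap the response into plain data.
async function generate(options) {
  const response = await generateResponse(options);
  const wantsData = typeof options !== "string" && Boolean(options.schema);
  return wantsData ? response.output() : response.text();
}
```

With this shape, callers who only need the result use `generate`, while callers who need `stopReason` or other metadata call `generateResponse` directly.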

Streaming will be supported through streamGenerate and streamGenerateResponse. When doing streamGenerate, the chunks emitted will be in output form (either a partial data response or a string chunk):

const {stream, data} = ai.streamGenerate("Tell a really long joke with at least 5 paragraphs.");

for await (const chunk of stream) {
  console.log(chunk); // chunk is just a string
}

console.log(await data); // this is the full result, equivalent to `generate()`

// renamed on destructuring so it does not clash with `stream` above
const {stream: responseStream, response} = ai.streamGenerateResponse(...);
for await (const chunk of responseStream) {
  console.log(chunk.text()); // chunk is a Chunk instance
}

console.log((await response).usage);
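
The `{stream, data}` pair can be modeled as an async iterator plus a promise that settles once the iterator is exhausted. The following is a rough sketch under stated assumptions: `fakeModelChunks` is a made-up chunk source, this `streamGenerate` takes that source directly rather than a prompt, `data` only resolves after the stream has been fully consumed, and error handling is omitted.

```javascript
// Made-up chunk source standing in for a streaming model.
async function* fakeModelChunks() {
  yield "Once upon a time, ";
  yield "a pirate told a joke.";
}

// Sketch: relay string chunks to the caller while accumulating the full text.
function streamGenerate(source) {
  let resolveData;
  const data = new Promise((resolve) => {
    resolveData = resolve;
  });

  async function* relay() {
    let full = "";
    for await (const chunk of source) {
      full += chunk;
      yield chunk;
    }
    // Reached only once the consumer has read every chunk.
    resolveData(full);
  }

  return { stream: relay(), data };
}
```

A caller would iterate `stream` with `for await` and then `await data` to get the concatenated text. A production version would presumably also resolve `data` when the stream is never read and reject it on failure.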

Multi-Turn Generation

All of the above is great if you only have a single-turn generation, but it doesn't really help in a chatbot scenario. Multi-turn use cases are fundamentally different and deserve dedicated attention in the API surface.

The proposal is a new Chat class with a new send() method that lets you explicitly opt in to multi-turn conversational use cases.

const chat = ai.chat({
  system: "You are a pirate.",
});

const reply = await chat.send("How are you today?");
console.log(reply);
// "Yarr, not too bad, matey. How be ye?"
const {stream, data} = await chat.streamSend("Tell me a long story, ye scurvy sea dog!");
chat.messages(); // equivalent to `toHistory()` in current Genkit
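
What distinguishes the multi-turn case is that the object owns the conversation history. A minimal sketch of such a class might look like the following. It is not the proposed implementation: the constructor taking a model function, the `echoModel` stub, and the `"model"` role name for replies are assumptions made for illustration, and `streamSend` is left out.

```javascript
// Sketch of a chat session that accumulates messages across turns.
class Chat {
  constructor(model, { system } = {}) {
    this.model = model; // assumed: async function (messages) => reply text
    this.history = system
      ? [{ role: "system", content: [{ text: system }] }]
      : [];
  }

  // Records the user turn, calls the model with the full history,
  // records the reply, and returns the reply text directly.
  async send(text) {
    this.history.push({ role: "user", content: [{ text }] });
    const reply = await this.model(this.messages());
    this.history.push({ role: "model", content: [{ text: reply }] });
    return reply;
  }

  // Returns a copy so callers cannot mutate the internal history array.
  messages() {
    return [...this.history];
  }
}

// Stub model: reports how many user turns it has seen so far.
const echoModel = async (messages) => {
  const turns = messages.filter((m) => m.role === "user").length;
  return `Yarr, that be message ${turns}, matey.`;
};
```

Because each `send()` passes the accumulated history to the model, the second call sees the first exchange without the caller threading any state through.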
