Ling is a workflow framework that supports streaming of structured content generated by large language models (LLMs). It enables quick responses to content streams produced by agents or bots within the workflow, thereby reducing waiting times.
- Supports data stream output via JSONL protocol
- Automatic correction of token errors in JSON output
- Supports complex asynchronous workflows with multiple agents/bots
- Supports status messages during streaming output
- Supports Server-Sent Events
- HTML and JSON parsers for efficient stream processing
- Compatible with OpenAI and other LLM providers
- MCP (Model Context Protocol) Client Support - Enables tool calling and external service integration
- Provides Client SDK
Complex AI workflows, such as those in Bearbobo Learning Companion, require multiple agents/bots to process structured data collaboratively. Structured output, however, works against real-time responsiveness, because it is hard to consume through a streaming interface while it is still being generated.
The commonly used JSON format is flexible, but a JSON document is only well-formed once it is complete, so it is difficult to parse correctly until all of the content has been output. Other structured formats such as YAML could be adopted, but they are not as powerful and convenient as JSON. Ling is a streaming framework created to address this issue. Its core is a real-time converter that parses the incoming JSON stream character by character and outputs the content in jsonuri form.
For example, consider the following JSON format:
{
  "outline": [
    {
      "topic": "What are clouds made of?"
    },
    {
      "topic": "Why do clouds look soft?"
    }
  ]
  // ...
}
During streaming input, the content may be converted in real time into the following data outputs (using Server-Sent Events):
data: {"uri": "outline/0/topic", "delta": "Wha"}
data: {"uri": "outline/0/topic", "delta": "t"}
data: {"uri": "outline/0/topic", "delta": "are"}
data: {"uri": "outline/0/topic", "delta": "clo"}
data: {"uri": "outline/0/topic", "delta": "uds"}
data: {"uri": "outline/0/topic", "delta": "mad"}
data: {"uri": "outline/0/topic", "delta": "e"}
data: {"uri": "outline/0/topic", "delta": "of"}
data: {"uri": "outline/0/topic", "delta": "?"}
data: {"uri": "outline/1/topic", "delta": "Why"}
data: {"uri": "outline/1/topic", "delta": "do"}
data: {"uri": "outline/1/topic", "delta": "clo"}
data: {"uri": "outline/1/topic", "delta": "uds"}
data: {"uri": "outline/1/topic", "delta": "loo"}
data: {"uri": "outline/1/topic", "delta": "k"}
data: {"uri": "outline/1/topic", "delta": "sof"}
data: {"uri": "outline/1/topic", "delta": "t"}
data: {"uri": "outline/1/topic", "delta": "?"}
...
This method of real-time data transmission facilitates immediate front-end processing and enables responsive UI updates.
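To make the character-by-character conversion more concrete, here is a minimal sketch of such a converter. It is not Ling's actual parser: `createUriStream`, `UriDelta`, and `Frame` are hypothetical names, only objects, arrays, and string values are tracked, and no error correction is attempted.

```typescript
// Illustrative sketch only, not Ling's actual parser. Numbers, booleans, null
// and \uXXXX escapes are not handled, and malformed JSON is not repaired.
type UriDelta = { uri: string; delta: string };

type Frame =
  | { kind: 'object'; key: string; expectKey: boolean }
  | { kind: 'array'; index: number };

// Escape letters that map to control characters; others map to themselves.
const ESCAPES: Record<string, string> = {
  n: '\n', t: '\t', r: '\r', b: '\b', f: '\f',
};

function createUriStream(emit: (msg: UriDelta) => void) {
  const stack: Frame[] = []; // open containers, outermost first
  let inString = false;      // currently inside a "..." literal
  let isKey = false;         // that literal is an object key, not a value
  let escaped = false;       // previous character was a backslash
  let keyBuf = '';

  const path = (): string =>
    stack.map((f) => (f.kind === 'object' ? f.key : String(f.index))).join('/');

  // Feed one chunk of raw JSON text. String characters seen in this chunk are
  // grouped into one delta per uri.
  return function feed(chunk: string): void {
    let pending: UriDelta | null = null;
    const flush = (): void => {
      if (pending) emit(pending);
      pending = null;
    };
    const add = (ch: string): void => {
      if (isKey) {
        keyBuf += ch;
        return;
      }
      const uri = path();
      if (pending && pending.uri !== uri) flush();
      if (pending) pending.delta += ch;
      else pending = { uri, delta: ch };
    };

    for (let i = 0; i < chunk.length; i++) {
      const ch = chunk.charAt(i);
      const top = stack[stack.length - 1];
      if (inString) {
        if (escaped) {
          add(ESCAPES[ch] ?? ch);
          escaped = false;
        } else if (ch === '\\') {
          escaped = true;
        } else if (ch === '"') {
          inString = false;
          if (isKey && top?.kind === 'object') top.key = keyBuf;
        } else {
          add(ch);
        }
        continue;
      }
      if (ch === '"') {
        inString = true;
        isKey = top?.kind === 'object' && top.expectKey;
        keyBuf = '';
      } else if (ch === '{') {
        stack.push({ kind: 'object', key: '', expectKey: true });
      } else if (ch === '[') {
        stack.push({ kind: 'array', index: 0 });
      } else if (ch === '}' || ch === ']') {
        stack.pop();
      } else if (ch === ':' && top?.kind === 'object') {
        top.expectKey = false;
      } else if (ch === ',' && top?.kind === 'object') {
        top.expectKey = true;
      } else if (ch === ',' && top?.kind === 'array') {
        top.index += 1;
      }
    }
    flush();
  };
}

// Usage: feed the JSON five characters at a time, as a model stream might.
const events: UriDelta[] = [];
const feed = createUriStream((msg) => { events.push(msg); });
const json =
  '{"outline":[{"topic":"What are clouds made of?"},{"topic":"Why do clouds look soft?"}]}';
for (let i = 0; i < json.length; i += 5) feed(json.slice(i, i + 5));
console.log(events[0]); // { uri: 'outline/0/topic', delta: 'Wha' }
```

Grouping characters per chunk keeps the number of messages close to the number of chunks received from the model, while each message still carries the path it belongs to.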
npm install @bearbobo/ling
# or
pnpm add @bearbobo/ling
# or
yarn add @bearbobo/ling
Ling supports various LLM providers and models:
- OpenAI: GPT-4, GPT-4-Turbo, GPT-4o, GPT-3.5-Turbo
- Moonshot: moonshot-v1-8k, moonshot-v1-32k, moonshot-v1-128k
- Deepseek
- Qwen: qwen-max-longcontext, qwen-long
- Yi: yi-medium
Server Example:
import 'dotenv/config';

import express from 'express';
import bodyParser from 'body-parser';
import cors from 'cors';

import { Ling } from "@bearbobo/ling";
import type { ChatConfig } from "@bearbobo/ling/types";

import { pipeline } from 'node:stream/promises';

const apiKey = process.env.API_KEY as string;
const model_name = process.env.MODEL_NAME as string;
const endpoint = process.env.ENDPOINT as string;

const app = express();

app.use(cors());
app.use(bodyParser.json());

const port = 3000;

app.get('/', (req, res) => {
  res.send('Hello World!');
});

app.post('/api', async (req, res) => {
  const question = req.body.question;

  const config: ChatConfig = {
    model_name,
    api_key: apiKey,
    endpoint: endpoint,
  };

  // ------- Workflow start --------
  const ling = new Ling(config);
  const bot = ling.createBot(/*'bearbobo'*/);
  bot.addPrompt('Respond to me in JSON format, starting with {.\n[Example]\n{"answer": "My response"}');
  bot.chat(question);
  bot.on('string-response', ({uri, delta}) => {
    // Take the string content inferred from the JSON and send the content of the 'answer' field to the second bot.
    console.log('bot string-response', uri, delta);

    const bot2 = ling.createBot(/*'bearbobo'*/);
    bot2.addPrompt(`Expand the content I gave you into more detailed content, answer me in JSON format, place the detailed answer text in the 'details' field, and place 2-3 related knowledge points in the 'related_question' field.\n[Example]\n{"details": "My detailed answer", "related_question": [...]}`);
    bot2.chat(delta);
    bot2.on('response', (content) => {
      // Stream data push completed.
      console.log('bot2 response finished', content);
    });

    const bot3 = ling.createBot();
    // A template literal is used here because the prompt itself contains single quotes.
    bot3.addPrompt(`Expand the content I gave you into more detailed content, using Chinese. Answer me in JSON format, place the detailed answer in Chinese in the 'details_cn' field.\n[Example]\n{"details_cn": "my answer..."}`);
    bot3.chat(delta);
    bot3.on('response', (content) => {
      // Stream data push completed.
      console.log('bot3 response finished', content);
    });
  });
  ling.close(); // Can be called right away; on close, Ling checks whether all bots have finished.
  // ------- Workflow end --------

  // Set the headers below to stream the data
  res.writeHead(200, {
    'Content-Type': "text/event-stream",
    'Cache-Control': "no-cache",
    'Connection': "keep-alive"
  });

  console.log(ling.stream);

  pipeline((ling.stream as any), res);
});

app.listen(port, () => {
  console.log(`Example app listening at http://localhost:${port}`);
});
Client Example:
<script setup>
import { onMounted, ref } from 'vue';
import { set, get } from 'jsonuri';

const response = ref({
  answer: 'Brief:',
  details: 'Details:',
  details_cn: 'Translation:',
  related_question: [
    '?',
    '?',
    '?'
  ],
});

onMounted(async () => {
  const res = await fetch('http://localhost:3000/api', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      question: 'Can I lie on a cloud?'
    }),
  });
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let done = false;
  let buffer = ''; // Holds a trailing partial line between chunks
  const data = {
    answer: 'Brief:',
    details: 'Details:',
    details_cn: 'Translation:',
    related_question: [],
  };
  while (!done) {
    const { value, done: doneReading } = await reader.read();
    done = doneReading;
    if (!done) {
      // A chunk may end in the middle of a line, so only complete lines are parsed.
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop();
      for (const line of lines) {
        if (!line.trim()) continue;
        const input = JSON.parse(line);
        if (input.uri) {
          // Append the delta to whatever is already stored at this path.
          const current = get(data, input.uri);
          set(data, input.uri, (current || '') + input.delta);
          response.value = { ...data };
        }
      }
    }
  }
});
</script>

<template>
  <h1>Hello~</h1>
  <p>{{ response.answer }}</p>
  <p>{{ response.details }}</p>
  <p>{{ response.details_cn }}</p>
  <p v-for="(item, index) in response.related_question" :key="index"> >>> {{ item }}</p>
</template>
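The client relies on jsonuri's `get` and `set` to append each delta at its path. For readers who want to see that step without the dependency, here is a minimal dependency-free sketch. `appendAt` is a hypothetical helper, not part of jsonuri or Ling, and it assumes every delta is a string.

```typescript
// Hypothetical helper: append `delta` to the string stored at a
// slash-separated path, creating intermediate containers as needed.
function appendAt(target: Record<string, any>, uri: string, delta: string): void {
  const parts = uri.split('/');
  let node: any = target;
  for (let i = 0; i < parts.length - 1; i++) {
    const key = parts[i];
    if (node[key] === undefined || node[key] === null) {
      // Create an array when the next segment looks like an index.
      node[key] = /^\d+$/.test(parts[i + 1]) ? [] : {};
    }
    node = node[key];
  }
  const last = parts[parts.length - 1];
  node[last] = (node[last] ?? '') + delta;
}

// Usage: replay a few messages in the order they would arrive.
const data: Record<string, any> = {};
appendAt(data, 'outline/0/topic', 'Wha');
appendAt(data, 'outline/0/topic', 't');
appendAt(data, 'outline/1/topic', 'Why');
console.log(JSON.stringify(data));
// prints {"outline":[{"topic":"What"},{"topic":"Why"}]}
```

Because every message names its own path, deltas for different fields can arrive interleaved and still land in the right place.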
- string-response: Triggered when a string field in the JSON output by the AI is completed, returning a jsonuri object.
- inference-done: Triggered when the AI has completed its current inference, returning the complete output content. At this point, streaming output may not have ended, and data may still be on its way to the front end.
- response: Triggered when all data generated by the AI during this session has been sent to the front end.
Note: Typically, the string-response event occurs before inference-done, which in turn occurs before response.
Sometimes, we might want to send custom events to the front end to update its status. On the server, you can use ling.sendEvent({event, data}) to push messages to the front end. The front end can then receive and process the JSON objects {event, data} from the stream.
bot.on('inference-done', () => {
  bot.sendEvent({event: 'inference-done', state: 'Outline generated!'});
});
Alternatively, you can push jsonuri status updates directly, which makes it easier for the front end to apply them to its state.
bot.on('inference-done', () => {
  bot.sendEvent({uri: 'state/outline', delta: true});
});
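On the front end, the two message shapes shown above can be told apart by their fields. The following is a minimal sketch of such routing. `dispatch`, `Handlers`, and the message types are hypothetical names, and the field lists are assumptions based only on the examples in this document.

```typescript
// Message shapes taken from the examples above; the field lists are assumptions.
type UriMessage = { uri: string; delta: unknown };
type EventMessage = { event: string; data?: unknown; state?: unknown };
type StreamMessage = UriMessage | EventMessage;

interface Handlers {
  onDelta(uri: string, delta: unknown): void;
  onEvent(name: string, message: EventMessage): void;
}

function isUriMessage(msg: StreamMessage): msg is UriMessage {
  return typeof (msg as UriMessage).uri === 'string';
}

// Hypothetical dispatcher: parse one JSON line and route it by shape.
function dispatch(line: string, handlers: Handlers): void {
  const msg = JSON.parse(line) as StreamMessage;
  if (isUriMessage(msg)) {
    handlers.onDelta(msg.uri, msg.delta);
  } else {
    handlers.onEvent(msg.event, msg);
  }
}

// Usage: record what each handler receives.
const log: string[] = [];
const handlers: Handlers = {
  onDelta: (uri, delta) => { log.push(`delta ${uri}=${String(delta)}`); },
  onEvent: (name, message) => { log.push(`event ${name}: ${String(message.state)}`); },
};

dispatch('{"uri": "state/outline", "delta": true}', handlers);
dispatch('{"event": "inference-done", "state": "Outline generated!"}', handlers);
console.log(log);
// prints [ 'delta state/outline=true', 'event inference-done: Outline generated!' ]
```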
You can force Ling to respond in the Server-Sent Events data format by using ling.setSSE(true). This allows the front end to handle the data using the EventSource API.
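// Encode the question so spaces and '?' are safe inside the query string
const question = encodeURIComponent('Can I lie on a cloud?');
const es = new EventSource(`http://localhost:3000/?question=${question}`);

es.onmessage = (e) => {
  console.log(e.data);
};

es.onopen = () => {
  console.log('Connected');
};

es.onerror = (e) => {
  console.log(e);
};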
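EventSource only issues GET requests, so a POST endpoint such as the /api route in the server example has to be read with fetch and parsed by hand. Here is a minimal sketch of a parser for the `data:` lines. `createSseParser` is a hypothetical name, and the sketch assumes events are separated by a blank line and use `\n` line endings.

```typescript
// Hypothetical helper: collect `data:` lines and hand each complete event's
// payload to `onData`. Only "\n" line endings are handled in this sketch.
function createSseParser(onData: (data: string) => void) {
  let buffer = ''; // text received so far that has not formed a full event

  return function feed(chunk: string): void {
    buffer += chunk;
    let end: number;
    // An event is complete once a blank line has been received.
    while ((end = buffer.indexOf('\n\n')) !== -1) {
      const rawEvent = buffer.slice(0, end);
      buffer = buffer.slice(end + 2);
      const dataLines: string[] = [];
      for (const line of rawEvent.split('\n')) {
        if (line.indexOf('data:') === 0) {
          // Drop the field name and the single optional space after the colon.
          dataLines.push(line.slice(5).replace(/^ /, ''));
        }
      }
      if (dataLines.length > 0) onData(dataLines.join('\n'));
    }
  };
}

// Usage: the first event is split across two network chunks.
const received: string[] = [];
const feedSse = createSseParser((payload) => { received.push(payload); });
feedSse('data: {"uri":"answer","de');
feedSse('lta":"Hi"}\n\ndata: {"uri":"answer","delta":"!"}\n\n');
console.log(received.map((payload) => JSON.parse(payload).delta));
// prints [ 'Hi', '!' ]
```

In a browser, each decoded chunk from `res.body.getReader()` would be passed to the returned function in the same way.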
import { Ling, ChatConfig, ChatOptions } from '@bearbobo/ling';

// Configure LLM provider
const config: ChatConfig = {
  model_name: 'gpt-4-turbo', // or any other supported model
  api_key: 'your-api-key',
  endpoint: 'https://api.openai.com/v1/chat/completions',
  sse: true // Enable Server-Sent Events
};

// Optional settings
const options: ChatOptions = {
  temperature: 0.7,
  max_tokens: 2000
};

// Create Ling instance
const ling = new Ling(config, options);

// Create a bot for chat
const bot = ling.createBot();

// Add system prompt
bot.addPrompt('You are a helpful assistant.');

// Handle streaming response
ling.on('message', (message) => {
  console.log('Received message:', message);
});

// Handle completion event
ling.on('finished', () => {
  console.log('Chat completed');
});

// Handle bot's response
bot.on('string-response', (content) => {
  console.log('Bot response:', content);
});

// Start chat with user message
await bot.chat('Tell me about cloud computing.');

// Close the connection when done
await ling.close();
The main class for managing LLM interactions and workflows.
new Ling(config: ChatConfig, options?: ChatOptions)
Methods:
- createBot(root?: string | null, config?: Partial<ChatConfig>, options?: Partial<ChatOptions>): Creates a new ChatBot instance
- addBot(bot: Bot): Adds an existing Bot to the workflow
- setCustomParams(params: Record<string, string>): Sets custom parameters for template rendering
- setSSE(sse: boolean): Enables or disables Server-Sent Events
- close(): Closes all connections and waits for bots to finish
- cancel(): Cancels all ongoing operations
- sendEvent(event: any): Sends a custom event through the tube
Properties:
- tube: Gets the underlying Tube instance
- model: Gets the model name
- stream: Gets the ReadableStream
- id: Gets the session ID
Events:
- message: Emitted when a message is received
- finished: Emitted when all operations are finished
- canceled: Emitted when operations are canceled
- inference-done: Emitted when a bot completes inference
Handles individual chat interactions with LLMs.
new ChatBot(tube: Tube, config: ChatConfig, options?: ChatOptions)
Methods:
- addPrompt(promptTpl: string, promptData?: Record<string, any>): Adds a system prompt with template support
- setPrompt(promptTpl: string, promptData?: Record<string, string>): Sets a single system prompt
- addHistory(messages: ChatCompletionMessageParam[]): Adds message history
- setHistory(messages: ChatCompletionMessageParam[]): Sets message history
- addFilter(filter: ((data: any) => boolean) | string | RegExp | FilterMap): Adds a filter for messages
- clearFilters(): Clears all filters
- chat(message: string | ChatCompletionContentPart[]): Starts a chat with the given message
- finish(): Marks the bot as finished
Events:
- string-response: Emitted for text responses
- object-response: Emitted for object responses
- inference-done: Emitted when inference is complete
- response: Emitted when the full response is complete
- error: Emitted on errors
interface ChatConfig {
  model_name: string; // LLM model name
  endpoint: string; // API endpoint
  api_key: string; // API key
  api_version?: string; // API version (for some providers)
  session_id?: string; // Custom session ID
  max_tokens?: number; // Maximum tokens to generate
  sse?: boolean; // Enable Server-Sent Events
}
interface ChatOptions {
  temperature?: number; // Controls randomness (0-1)
  presence_penalty?: number; // Penalizes tokens that have already appeared
  frequency_penalty?: number; // Penalizes tokens by how often they have appeared
  stop?: string[]; // Stop sequences
  top_p?: number; // Nucleus sampling parameter
  response_format?: any; // Response format settings
  max_tokens?: number; // Maximum tokens to generate
  quiet?: boolean; // Suppress output
  bot_id?: string; // Custom bot ID
  tool_type?: 'function_call' | 'tool_call'; // Tool calling type for MCP
}
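The server example reads API_KEY, MODEL_NAME, and ENDPOINT from the environment and casts them to string, which hides a missing variable until the first request fails. Here is a minimal sketch of validating them up front. `configFromEnv` and `ChatConfigLike` are hypothetical names; the interface is a local copy of the required ChatConfig fields so the sketch compiles on its own.

```typescript
// Local copy of the required ChatConfig fields documented above.
interface ChatConfigLike {
  model_name: string;
  endpoint: string;
  api_key: string;
}

// Hypothetical helper: build a config from environment variables and fail
// early, with a readable message, when any of them is missing or empty.
function configFromEnv(env: Record<string, string | undefined>): ChatConfigLike {
  const required = ['MODEL_NAME', 'ENDPOINT', 'API_KEY'] as const;
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    model_name: env.MODEL_NAME as string,
    endpoint: env.ENDPOINT as string,
    api_key: env.API_KEY as string,
  };
}

// Usage: in a Node.js server, process.env would be passed instead of a literal.
const envConfig = configFromEnv({
  MODEL_NAME: 'gpt-4o',
  ENDPOINT: 'https://api.openai.com/v1/chat/completions',
  API_KEY: 'your-api-key',
});
console.log(envConfig.model_name); // prints gpt-4o
```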
Ling now supports MCP (Model Context Protocol) for tool calling and external service integration.
import { Ling } from '@bearbobo/ling';
import type { ChatConfig } from '@bearbobo/ling';

const config: ChatConfig = {
  model_name: 'gpt-4',
  api_key: 'your-api-key',
  endpoint: 'https://api.openai.com/v1'
};

const ling = new Ling(config);

// Register MCP servers
ling.registerMSPServers({
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        process.cwd()
      ]
    },
    "brave-search": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-brave-search"]
    }
  }
});

// Create a bot and use the MCP capabilities
const bot = ling.createBot();
bot.chat('Please read the list of files in the current directory for me');
interface McpServerConfig {
  command: string; // Command to start the MCP server
  args: string[]; // Arguments for the command
}

interface McpServersConfig {
  [name: string]: McpServerConfig; // Named MCP server configurations
}
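Both servers in the registration example are started through `npx -y <package>`, so the entries can be built with a small helper. This is only a sketch: `npxServer` is a hypothetical function, the interfaces are local copies of the ones above, and `/tmp/workspace` is a placeholder path. Note that the registration example wraps the map in an `mcpServers` key, while this sketch builds only the inner map of named servers.

```typescript
// Local copies of the interfaces above, so the sketch compiles on its own.
interface McpServerConfig {
  command: string;
  args: string[];
}

interface McpServersConfig {
  [name: string]: McpServerConfig;
}

// Hypothetical helper: describe an MCP server that is launched via npx.
function npxServer(pkg: string, ...extraArgs: string[]): McpServerConfig {
  return { command: 'npx', args: ['-y', pkg, ...extraArgs] };
}

// Usage: the same two servers as in the registration example.
const servers: McpServersConfig = {
  filesystem: npxServer('@modelcontextprotocol/server-filesystem', '/tmp/workspace'),
  'brave-search': npxServer('@modelcontextprotocol/server-brave-search'),
};
console.log(servers.filesystem.args.join(' '));
// prints -y @modelcontextprotocol/server-filesystem /tmp/workspace
```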
Ling Class:
- registerMSPServers(config: McpServersConfig): Register multiple MCP servers
Internal MCPClient Methods (accessed through Ling):
- registerServer(name: string, config: McpServerConfig): Register a single MCP server
- listTools(toolType?: 'function_call' | 'tool_call'): List all available tools from registered servers
- callTool(name: string, args: any): Execute a tool call
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the Apache License - see the LICENSE file for details.