WeHomeBot/ling

Ling (灵)

Ling is a workflow framework that supports streaming of structured content generated by large language models (LLMs). It enables quick responses to content streams produced by agents or bots within the workflow, thereby reducing waiting times.

[Figure: ling workflow]

Core Features

  • Supports data stream output via JSONL protocol
  • Automatic correction of token errors in JSON output
  • Supports complex asynchronous workflows with multiple agents/bots
  • Supports status messages during streaming output
  • Supports Server-Sent Events
  • HTML and JSON parsers for efficient stream processing
  • Compatible with OpenAI and other LLM providers
  • MCP (Model Context Protocol) Client Support - Enables tool calling and external service integration
  • Provides Client SDK

Introduction

Complex AI workflows, such as those in Bearbobo Learning Companion, require multiple agents/bots to process structured data collaboratively. Structured output, however, works against real-time responsiveness: a consumer usually has to wait for the whole structure before it can use any part of it, which cancels out the benefit of a streaming interface.

The commonly used JSON format, although flexible, must be structurally complete to be valid, so it is difficult to parse correctly until all of the content has been output. Other structured formats such as YAML could be adopted, but they are not as powerful and convenient as JSON. Ling is a streaming framework created to address this issue. Its core is a real-time converter that parses an incoming JSON data stream character by character and outputs the content in jsonuri form, as pairs of a path (uri) and a text fragment (delta).

For example, consider the following JSON format:

{
  "outline": [
    {
      "topic": "What are clouds made of?"
    },
    {
      "topic": "Why do clouds look soft?"
    }
  ]
  // ...
}

During streaming input, the content may be converted in real-time into the following data outputs (using Server-sent Events):

data: {"uri": "outline/0/topic", "delta": "What"}
data: {"uri": "outline/0/topic", "delta": " are"}
data: {"uri": "outline/0/topic", "delta": " clo"}
data: {"uri": "outline/0/topic", "delta": "uds"}
data: {"uri": "outline/0/topic", "delta": " mad"}
data: {"uri": "outline/0/topic", "delta": "e"}
data: {"uri": "outline/0/topic", "delta": " of"}
data: {"uri": "outline/0/topic", "delta": "?"}
data: {"uri": "outline/1/topic", "delta": "Why"}
data: {"uri": "outline/1/topic", "delta": " do"}
data: {"uri": "outline/1/topic", "delta": " clo"}
data: {"uri": "outline/1/topic", "delta": "uds"}
data: {"uri": "outline/1/topic", "delta": " loo"}
data: {"uri": "outline/1/topic", "delta": "k"}
data: {"uri": "outline/1/topic", "delta": " sof"}
data: {"uri": "outline/1/topic", "delta": "t"}
data: {"uri": "outline/1/topic", "delta": "?"}
...

This method of real-time data transmission facilitates immediate front-end processing and enables responsive UI updates.
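
To make the conversion step concrete, here is a deliberately simplified sketch of a character-by-character converter. It is not Ling's actual parser: the class name StreamingJsonConverter, its push method, and the Frame and UriDelta types are hypothetical, and the sketch only handles objects, arrays, and string values (numbers, booleans, null, and most escape sequences are ignored).

```typescript
// Simplified sketch, not Ling's real parser: handles objects, arrays and
// string values only; the escapes \" \\ and \n are decoded, others are not.
type Frame =
  | { kind: 'object'; key: string; expectKey: boolean }
  | { kind: 'array'; index: number };
type UriDelta = { uri: string; delta: string };

class StreamingJsonConverter {
  private stack: Frame[] = [];
  private inString = false;
  private isKey = false;
  private escaped = false;
  private keyBuffer = '';

  // Feed one chunk of JSON text; returns the deltas found in that chunk.
  push(chunk: string): UriDelta[] {
    const out: UriDelta[] = [];
    for (let i = 0; i < chunk.length; i++) {
      const ch = chunk.charAt(i);
      const top = this.stack[this.stack.length - 1];
      if (this.inString) {
        if (this.escaped) {
          this.emit(ch === 'n' ? '\n' : ch, out);
          this.escaped = false;
        } else if (ch === '\\') {
          this.escaped = true;
        } else if (ch === '"') {
          this.inString = false;
          if (this.isKey && top !== undefined && top.kind === 'object') {
            top.key = this.keyBuffer; // the key names the next value's path
            top.expectKey = false;
          }
        } else {
          this.emit(ch, out);
        }
        continue;
      }
      switch (ch) {
        case '{':
          this.stack.push({ kind: 'object', key: '', expectKey: true });
          break;
        case '[':
          this.stack.push({ kind: 'array', index: 0 });
          break;
        case '}':
        case ']':
          this.stack.pop();
          break;
        case ',':
          if (top !== undefined && top.kind === 'array') top.index += 1;
          else if (top !== undefined && top.kind === 'object') top.expectKey = true;
          break;
        case '"':
          this.inString = true;
          this.isKey = top !== undefined && top.kind === 'object' && top.expectKey;
          this.keyBuffer = '';
          break;
      }
    }
    return out;
  }

  // Route a string character to the key buffer or to the output deltas.
  private emit(ch: string, out: UriDelta[]): void {
    if (this.isKey) {
      this.keyBuffer += ch;
      return;
    }
    const uri = this.stack
      .map((f) => (f.kind === 'object' ? f.key : String(f.index)))
      .join('/');
    const last = out[out.length - 1];
    if (last && last.uri === uri) last.delta += ch; // merge within one chunk
    else out.push({ uri, delta: ch });
  }
}

const converter = new StreamingJsonConverter();
console.log(converter.push('{"outline":[{"topic":"Wh'));
console.log(converter.push('at"},{"topic":"Why"}]}'));
```

The first call reports the fragment "Wh" for outline/0/topic even though the JSON is still incomplete, which is the property the text above relies on. Merging characters per chunk is a design choice of this sketch; a real converter may batch differently.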

Installation

npm install @bearbobo/ling
# or
pnpm add @bearbobo/ling
# or
yarn add @bearbobo/ling

Supported Models

Ling supports various LLM providers and models:

  • OpenAI: GPT-4, GPT-4-Turbo, GPT-4o, GPT-3.5-Turbo
  • Moonshot: moonshot-v1-8k, moonshot-v1-32k, moonshot-v1-128k
  • Deepseek
  • Qwen: qwen-max-longcontext, qwen-long
  • Yi: yi-medium

Demo

Server Example:

import 'dotenv/config';

import express from 'express';
import bodyParser from 'body-parser';
import cors from 'cors';

import { Ling } from "@bearbobo/ling";
import type { ChatConfig } from "@bearbobo/ling/types";

import { pipeline } from 'node:stream/promises';

const apiKey = process.env.API_KEY as string;
const model_name = process.env.MODEL_NAME as string;
const endpoint = process.env.ENDPOINT as string;

const app = express();

app.use(cors());
app.use(bodyParser.json());

const port = 3000;

app.get('/', (req, res) => {
  res.send('Hello World!');
});

app.post('/api', async (req, res) => {
  const question = req.body.question;

  const config: ChatConfig = {
    model_name,
    api_key: apiKey,
    endpoint: endpoint,
  };

  // ------- The work flow start --------
  const ling = new Ling(config);
  const bot = ling.createBot(/*'bearbobo'*/);
  bot.addPrompt('Respond to me in JSON format, starting with {.\n[Example]\n{"answer": "My response"}');
  bot.chat(question);
  bot.on('string-response', ({uri, delta}) => {
    // Fires when a string field in the JSON output is complete; forward the content of the 'answer' field to the follow-up bots.
    console.log('bot string-response', uri, delta);

    const bot2 = ling.createBot(/*'bearbobo'*/);
    bot2.addPrompt(`Expand the content I gave you into more detailed content, answer me in JSON format, place the detailed answer text in the 'details' field, and place 2-3 related knowledge points in the 'related_question' field.\n[Example]\n{"details": "My detailed answer", "related_question": [...]}`);
    bot2.chat(delta);
    bot2.on('response', (content) => {
      // Stream data push completed.
      console.log('bot2 response finished', content);
    });

    const bot3 = ling.createBot();
    bot3.addPrompt(`Expand the content I gave you into more detailed content, using Chinese. Answer me in JSON format, and place the detailed answer in Chinese in the 'details_cn' field.\n[Example]\n{"details_cn": "my answer..."}`);
    bot3.chat(delta);
    bot3.on('response', (content) => {
      // Stream data push completed.
      console.log('bot3 response finished', content);
    });
  });
  ling.close(); // It can be directly closed, and when closing, it checks whether the status of all bots has been finished.
  // ------- The work flow end --------

  // Set the headers required for streaming the data
  res.writeHead(200, {
    'Content-Type': "text/event-stream",
    'Cache-Control': "no-cache",
    'Connection': "keep-alive"
  });

  console.log(ling.stream);

  pipeline((ling.stream as any), res);
});

app.listen(port, () => {
  console.log(`Example app listening at http://localhost:${port}`);
});

Client

<script setup>
import { onMounted, ref } from 'vue';
import { set, get } from 'jsonuri';

const response = ref({
  answer: 'Brief:',
  details: 'Details:',
  details_cn: 'Translation:',
  related_question: [
    '?',
    '?',
    '?'
  ],
});
onMounted(async () => {
  const res = await fetch('http://localhost:3000/api', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      question: 'Can I lie on a cloud?'
    }),
  });
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let done = false;
  let buffer = ''; // holds a trailing partial line until the next chunk arrives
  const data = {
    answer: 'Brief:',
    details: 'Details:',
    details_cn: 'Translation:',
    related_question: [],
  };
  // Apply one JSONL line ({uri, delta}) to the local data object.
  const handleLine = (line) => {
    if(!line.trim()) return;
    const input = JSON.parse(line);
    if(input.uri) {
      const content = get(data, input.uri);
      set(data, input.uri, (content || '') + input.delta);
      response.value = {...data};
    }
  };
  while (!done) {
    const { value, done: doneReading } = await reader.read();
    done = doneReading;
    if(!done) {
      // stream: true keeps multi-byte characters that span two chunks intact
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop(); // the last element may be an incomplete line
      lines.forEach(handleLine);
    }
  }
  handleLine(buffer); // flush a final line that had no trailing newline
});
</script>

<template>
  <h1>Hello~</h1>
  <p>{{ response.answer }}</p>
  <p>{{ response.details }}</p>
  <p>{{ response.details_cn }}</p>
  <p v-for="(item, index) in response.related_question" :key="index"> >>> {{ item }}</p>
</template>
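
The client above relies on the get and set helpers from jsonuri to merge each {uri, delta} message into local state. As a rough illustration of what that merge step does, here is a dependency-free sketch. The function applyDelta and the type JsonUriEvent are hypothetical names introduced for this example, and the sketch assumes slash-separated paths in which purely numeric segments are array indices.

```typescript
// Hypothetical helper: merges one {uri, delta} message into a plain object.
type JsonUriEvent = { uri: string; delta: string };

function applyDelta(target: Record<string, unknown>, event: JsonUriEvent): void {
  const keys = event.uri.split('/');
  const last = keys.pop() ?? '';
  let node: any = target;
  keys.forEach((key, i) => {
    if (node[key] === undefined) {
      // Create an array when the next path segment is a numeric index.
      const next = i + 1 < keys.length ? keys[i + 1] : last;
      node[key] = /^\d+$/.test(String(next)) ? [] : {};
    }
    node = node[key];
  });
  // Append the fragment to whatever text has already arrived for this path.
  node[last] = (node[last] ?? '') + event.delta;
}

const state: Record<string, unknown> = {};
const events: JsonUriEvent[] = [
  { uri: 'outline/0/topic', delta: 'What' },
  { uri: 'outline/0/topic', delta: ' are' },
  { uri: 'outline/1/topic', delta: 'Why' },
];
for (const e of events) applyDelta(state, e);
console.log(JSON.stringify(state));
```

Because every message carries its full path, messages for different fields can interleave freely and the client never needs to see the complete JSON document.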

Bot Events

string-response

This event is triggered when a string field in the JSON output by the AI is completed. The handler receives a jsonuri object of the form {uri, delta}.

inference-done

This event is triggered when the AI has completed its current inference, returning the complete output content. At this point, streaming output may not have ended, and data continues to be sent to the front end.

response

This event is triggered when all data generated by the AI during this session has been sent to the front end.

Note: Typically, the string-response event occurs before inference-done, which in turn occurs before response.

Custom Event

Sometimes, we might want to send custom events to the front end to update its status. On the server, you can use ling.sendEvent({event, data}) to push messages to the front end. The front end can then receive and process JSON objects {event, data} from the stream.

bot.on('inference-done', () => {
  // Push a custom {event, data} message to the front end
  ling.sendEvent({event: 'inference-done', data: 'Outline generated!'});
});

Alternatively, you can push a jsonuri status update directly, which the front end can apply to its state without extra handling.

bot.on('inference-done', () => {
  // Push a {uri, delta} update that sets state/outline on the front end
  ling.sendEvent({uri: 'state/outline', delta: true});
});
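
On the receiving side, the front end has to tell the two message shapes apart. The following is a minimal sketch of such a dispatcher, assuming each line of the stream is one JSON object and that custom events use the {event, data} shape described above. The names dispatchLine, UriUpdate, LingCustomEvent, and StreamMessage are hypothetical and not part of the Ling client SDK.

```typescript
// Message shapes assumed from the text: {uri, delta} updates and
// {event, data} custom events.
type UriUpdate = { uri: string; delta: unknown };
type LingCustomEvent = { event: string; data?: unknown };
type StreamMessage = UriUpdate | LingCustomEvent;

function isUriUpdate(msg: StreamMessage): msg is UriUpdate {
  return typeof (msg as UriUpdate).uri === 'string';
}

// Hypothetical dispatcher: parses one JSONL line and routes it by shape.
function dispatchLine(
  line: string,
  onUpdate: (update: UriUpdate) => void,
  onEvent: (event: LingCustomEvent) => void,
): void {
  const msg = JSON.parse(line) as StreamMessage;
  if (isUriUpdate(msg)) onUpdate(msg);
  else onEvent(msg);
}

const seen: string[] = [];
const record = (label: string): void => {
  seen.push(label);
};
dispatchLine(
  '{"uri":"state/outline","delta":true}',
  (u) => record(`update:${u.uri}`),
  (e) => record(`event:${e.event}`),
);
dispatchLine(
  '{"event":"inference-done","data":"Outline generated!"}',
  (u) => record(`update:${u.uri}`),
  (e) => record(`event:${e.event}`),
);
console.log(seen);
```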

Server-sent Events

You can force Ling to respond in the Server-Sent Events data format by calling ling.setSSE(true). This allows the front end to handle the data using the EventSource API.

// Encode the question so that spaces and '?' are safe inside the query string
const question = encodeURIComponent('Can I lie on a cloud?');
const es = new EventSource(`http://localhost:3000/?question=${question}`);

es.onmessage = (e) => {
  console.log(e.data);
}

es.onopen = () => {
  console.log('Connected');
}

es.onerror = (e) => {
  console.log(e);
}
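
For reference, a Server-Sent Events message is a block of "data: ..." lines terminated by a blank line, and EventSource delivers the payload as a string in e.data. The sketch below shows that framing for JSON payloads like the ones in the Introduction. The helpers toSseFrame and parseSseFrames are hypothetical, the sketch assumes one single-line JSON payload per message, and it does not claim to reproduce Ling's exact output.

```typescript
// Hypothetical helper: wrap one JSON payload in a single-line SSE message.
function toSseFrame(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Hypothetical helper: split a buffer of complete SSE messages and parse
// each payload. Multi-line data fields, ids and retry hints are not handled.
function parseSseFrames(text: string): unknown[] {
  const prefix = 'data: ';
  return text
    .split('\n\n')
    .filter((frame) => frame.indexOf(prefix) === 0)
    .map((frame) => JSON.parse(frame.slice(prefix.length)));
}

const wire =
  toSseFrame({ uri: 'outline/0/topic', delta: 'What' }) +
  toSseFrame({ uri: 'outline/0/topic', delta: ' are' });
console.log(wire);
console.log(parseSseFrames(wire));
```

With EventSource in the browser, only the JSON.parse(e.data) step is needed, since the framing is handled by the API itself.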

Basic Usage

import { Ling, ChatConfig, ChatOptions } from '@bearbobo/ling';

// Configure LLM provider
const config: ChatConfig = {
  model_name: 'gpt-4-turbo',  // or any other supported model
  api_key: 'your-api-key',
  endpoint: 'https://api.openai.com/v1/chat/completions',
  sse: true  // Enable Server-Sent Events
};

// Optional settings
const options: ChatOptions = {
  temperature: 0.7,
  max_tokens: 2000
};

// Create Ling instance
const ling = new Ling(config, options);

// Create a bot for chat
const bot = ling.createBot();

// Add system prompt
bot.addPrompt('You are a helpful assistant.');

// Handle streaming response
ling.on('message', (message) => {
  console.log('Received message:', message);
});

// Handle completion event
ling.on('finished', () => {
  console.log('Chat completed');
});

// Handle bot's response
bot.on('string-response', (content) => {
  console.log('Bot response:', content);
});

// Start chat with user message
await bot.chat('Tell me about cloud computing.');

// Close the connection when done
await ling.close();

API Reference

Ling Class

The main class for managing LLM interactions and workflows.

new Ling(config: ChatConfig, options?: ChatOptions)

Methods

  • createBot(root?: string | null, config?: Partial<ChatConfig>, options?: Partial<ChatOptions>): Creates a new ChatBot instance
  • addBot(bot: Bot): Adds an existing Bot to the workflow
  • setCustomParams(params: Record<string, string>): Sets custom parameters for template rendering
  • setSSE(sse: boolean): Enables or disables Server-Sent Events
  • close(): Closes all connections and waits for bots to finish
  • cancel(): Cancels all ongoing operations
  • sendEvent(event: any): Sends a custom event through the tube

Properties

  • tube: Gets the underlying Tube instance
  • model: Gets the model name
  • stream: Gets the ReadableStream
  • id: Gets the session ID

Events

  • message: Emitted when a message is received
  • finished: Emitted when all operations are finished
  • canceled: Emitted when operations are canceled
  • inference-done: Emitted when a bot completes inference

ChatBot Class

Handles individual chat interactions with LLMs.

new ChatBot(tube: Tube, config: ChatConfig, options?: ChatOptions)

Methods

  • addPrompt(promptTpl: string, promptData?: Record<string, any>): Adds a system prompt with template support
  • setPrompt(promptTpl: string, promptData?: Record<string, string>): Sets a single system prompt
  • addHistory(messages: ChatCompletionMessageParam[]): Adds message history
  • setHistory(messages: ChatCompletionMessageParam[]): Sets message history
  • addFilter(filter: ((data: any) => boolean) | string | RegExp | FilterMap): Adds a filter for messages
  • clearFilters(): Clears all filters
  • chat(message: string | ChatCompletionContentPart[]): Starts a chat with the given message
  • finish(): Marks the bot as finished

Events

  • string-response: Emitted for text responses
  • object-response: Emitted for object responses
  • inference-done: Emitted when inference is complete
  • response: Emitted when the full response is complete
  • error: Emitted on errors

ChatConfig

interface ChatConfig {
  model_name: string;      // LLM model name
  endpoint: string;        // API endpoint
  api_key: string;         // API key
  api_version?: string;    // API version (for some providers)
  session_id?: string;     // Custom session ID
  max_tokens?: number;     // Maximum tokens to generate
  sse?: boolean;           // Enable Server-Sent Events
}

ChatOptions

interface ChatOptions {
  temperature?: number;        // Controls randomness (0-1)
  presence_penalty?: number;   // Penalizes repetition
  frequency_penalty?: number;  // Penalizes frequency
  stop?: string[];            // Stop sequences
  top_p?: number;             // Nucleus sampling parameter
  response_format?: any;       // Response format settings
  max_tokens?: number;        // Maximum tokens to generate
  quiet?: boolean;            // Suppress output
  bot_id?: string;            // Custom bot ID
  tool_type?: 'function_call' | 'tool_call';  // Tool calling type for MCP
}

MCP Client Support

Ling now supports MCP (Model Context Protocol) for tool calling and external service integration.

MCP Client Usage

import { Ling } from '@bearbobo/ling';
import type { ChatConfig } from '@bearbobo/ling';

const config: ChatConfig = {
  model_name: 'gpt-4',
  api_key: 'your-api-key',
  endpoint: 'https://api.openai.com/v1'
};

const ling = new Ling(config);

// Register MCP servers
ling.registerMSPServers({
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        process.cwd()
      ]
    },
    "brave-search": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-brave-search"]
    }
  }
});

// Create a bot and use the MCP features
const bot = ling.createBot();
bot.chat('Please list the files in the current directory');

MCP Configuration Types

interface McpServerConfig {
  command: string;  // Command to start the MCP server
  args: string[];   // Arguments for the command
}

interface McpServersConfig {
  [name: string]: McpServerConfig;  // Named MCP server configurations
}
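
As an illustration of these types, the sketch below builds a configuration object that satisfies McpServersConfig as declared here and prints the command line each entry would launch. The interfaces are repeated so the example is self-contained. The helper toCommandLines and the path /tmp/project are hypothetical. Note that the usage example earlier wraps the servers in an outer "mcpServers" key, so check which shape your version of Ling expects.

```typescript
interface McpServerConfig {
  command: string;  // Command to start the MCP server
  args: string[];   // Arguments for the command
}

interface McpServersConfig {
  [name: string]: McpServerConfig;  // Named MCP server configurations
}

const servers: McpServersConfig = {
  filesystem: {
    command: 'npx',
    // '/tmp/project' is a placeholder directory for this example
    args: ['-y', '@modelcontextprotocol/server-filesystem', '/tmp/project'],
  },
  'brave-search': {
    command: 'npx',
    args: ['-y', '@modelcontextprotocol/server-brave-search'],
  },
};

// Hypothetical helper: render each entry as "name: command arg1 arg2 ...".
function toCommandLines(config: McpServersConfig): string[] {
  const lines: string[] = [];
  for (const name of Object.keys(config)) {
    const server = config[name];
    if (!server) continue;
    lines.push(`${name}: ${[server.command].concat(server.args).join(' ')}`);
  }
  return lines;
}

console.log(toCommandLines(servers).join('\n'));
```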

MCP Related Methods

Ling Class:

  • registerMSPServers(config: McpServersConfig): Register multiple MCP servers

Internal MCPClient Methods (accessed through Ling):

  • registerServer(name: string, config: McpServerConfig): Register a single MCP server
  • listTools(toolType?: 'function_call' | 'tool_call'): List all available tools from registered servers
  • callTool(name: string, args: any): Execute a tool call

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the Apache License - see the LICENSE file for details.
