Add replicate example and fix event termination (#405)
Fixes Replicate's non-standard SSE termination. They send an event named `done` with `data: {}` (https://replicate.com/docs/streaming). 
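For reference, the terminating SSE frame Replicate sends (per the streaming docs linked above) is an event *named* `done`, rather than the conventional `data: [DONE]` sentinel:

```
event: done
data: {}
```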

**The Problem**
<img width="539" alt="CleanShot 2023-08-01 at 09 29 18@2x" src="https://github.com/vercel-labs/ai/assets/4060187/9a92e64c-5448-465a-b23c-13232c8190a2">
<img width="403" alt="CleanShot 2023-08-01 at 09 29 13@2x" src="https://github.com/vercel-labs/ai/assets/4060187/c54db2db-92d2-43a8-8025-25f3974628fb">

Our current `customParser` abstraction only handles `event.data`, which in the case of termination is `{}`. However, `{}` could also appear as part of the regular response output, so it isn't a sufficient condition for deciding termination. As a workaround, this ships an additional termination check in `AIStreamParser`.

If we don't like this abstraction, we can refactor `customParser` to take the whole `event` and then move this logic into `ReplicateStream`.
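As a standalone sketch of the combined termination condition described above (the `SSEEvent` shape and `isTerminationEvent` name are illustrative, not the SDK's actual internals):

```typescript
// Illustrative sketch: a stream terminates on the standard `data: [DONE]`
// sentinel, or — for Replicate — on an event *named* `done`.
interface SSEEvent {
  type: 'event'
  event?: string // the SSE `event:` field, e.g. 'done' for Replicate
  data?: string // the SSE `data:` payload
}

function isTerminationEvent(e: SSEEvent): boolean {
  return (
    // Standard OpenAI-style sentinel: `data: [DONE]`
    (e.data !== undefined && e.type === 'event' && e.data === '[DONE]') ||
    // Replicate doesn't send [DONE]; it sends an event named `done`
    e.event === 'done'
  )
}
```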
jaredpalmer committed Aug 1, 2023
1 parent 61ec9f4 commit 4df2a49
Showing 18 changed files with 352 additions and 71 deletions.
5 changes: 5 additions & 0 deletions .changeset/tricky-swans-doubt.md
@@ -0,0 +1,5 @@
---
'ai': patch
---

Fix termination of ReplicateStream by removing the terminating `{}` from output
57 changes: 26 additions & 31 deletions docs/pages/docs/guides/providers/replicate.mdx
@@ -6,10 +6,10 @@ import { Steps, Callout } from 'nextra-theme-docs'

# Replicate

-Vercel AI SDK supports streaming responses for certain Replicate models.
+Vercel AI SDK supports streaming responses for certain [Replicate](https://replicate.com) text models (including Llama 2).
You can see supported models [on their website](https://replicate.com/docs/streaming).

-## Guide: Text Completion
+## Guide: Llama 2 Chatbot

<Steps>

@@ -33,7 +33,7 @@ REPLICATE_API_KEY=xxxxxxxxx

### Create a Route Handler

-```tsx filename="app/api/completion/route.ts" showLineNumbers
+```tsx filename="app/api/chat/route.ts" showLineNumbers
import { ReplicateStream, StreamingTextResponse } from 'ai'
import Replicate from 'replicate'

@@ -53,16 +53,18 @@ export async function POST(req: Request) {}
 ```tsx filename="app/api/completion/route.ts" showLineNumbers
 export async function POST(req: Request) {
   // Get the prompt from the request body
-  const { prompt } = await req.json()
+  const { messages } = await req.json()
 
   const response = await replicate.predictions.create({
-    // IMPORTANT! You must enable streaming.
+    // You must enable streaming.
     stream: true,
+    // The model must support streaming. See https://replicate.com/docs/streaming
+    // This is the model ID for Llama 2 70b Chat
+    version: '2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1',
+    // Format the message list into a single string with newlines
     input: {
-      prompt
-    },
-    // IMPORTANT! The model must support streaming. See https://replicate.com/docs/streaming
-    version: '2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1'
+      prompt: messages.map(message => message.content).join('\n')
+    }
   })

// Convert the response into a friendly text-stream
@@ -74,27 +76,26 @@ export async function POST(req: Request) {

### Wire up the UI

-We can use the [`useCompletion`](/docs/api-reference/use-completion) hook to make it easy to wire up the UI. By default, the `useCompletion` hook will use the `POST` Route Handler we created above (it defaults to `/api/completion`). You can override this by passing a `api` prop to `useCompletion({ api: '...'})`.
-Create a Client component with a form that we'll use to gather the prompt from the user and then stream back the completion from.
+By default, the [`useChat`](/docs/api-reference/use-chat) hook will use the `POST` Route Handler we created above (it defaults to `/api/chat`). You can override this by passing a `api` prop to `useChat({ api: '...'})`.

 ```tsx filename="app/page.tsx" showLineNumbers
 'use client'
 
-import { useCompletion } from 'ai/react'
+import { useChat } from 'ai/react'
 
-export default function Completion() {
-  const {
-    completion,
-    input,
-    stop,
-    isLoading,
-    handleInputChange,
-    handleSubmit
-  } = useCompletion({
-    api: '/api/completion'
-  })
+export default function Chat() {
+  const { messages, input, handleInputChange, handleSubmit } = useChat()
 
   return (
     <div className="mx-auto w-full max-w-md py-24 flex flex-col stretch">
+      {messages.map(m => (
+        <div key={m.id}>
+          {m.role === 'user' ? 'User: ' : 'AI: '}
+          {m.content}
+        </div>
+      ))}
+
       <form onSubmit={handleSubmit}>
         <label>
           Say something...
@@ -104,13 +105,7 @@ export default function Completion() {
           onChange={handleInputChange}
         />
       </label>
-      <output>Completion result: {completion}</output>
-      <button type="button" onClick={stop}>
-        Stop
-      </button>
-      <button disabled={isLoading} type="submit">
-        Send
-      </button>
+      <button type="submit">Send</button>
     </form>
   </div>
 )
@@ -123,7 +118,7 @@ export default function Completion() {

It’s common to want to save the result of a completion to a database after streaming it back to the user. The `ReplicateStream` adapter accepts a couple of optional callbacks that can be used to do this.

```tsx filename="app/api/completion/route.ts" showLineNumbers
```tsx filename="app/api/chat/route.ts" showLineNumbers
export async function POST(req: Request) {
// ...

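The save-to-database callback pattern described above can be modeled with a small standalone helper. This is a sketch of the idea only — the callback object here is a simplification, not `ReplicateStream`'s actual signature:

```typescript
// Hypothetical helper illustrating the pattern: forward each chunk to the
// client while accumulating the full completion, then hand the accumulated
// text to `onCompletion` (e.g. a database write) once the stream ends.
type Callbacks = {
  onCompletion?: (completion: string) => void | Promise<void>
}

async function* withCompletionCallback(
  chunks: AsyncIterable<string>,
  callbacks: Callbacks
): AsyncGenerator<string> {
  let completion = ''
  for await (const chunk of chunks) {
    completion += chunk
    yield chunk // stream to the client as chunks arrive
  }
  await callbacks.onCompletion?.(completion) // persist after streaming
}
```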
1 change: 1 addition & 0 deletions examples/next-replicate/.env.local.example
@@ -0,0 +1 @@
REPLICATE_API_KEY=xxxxxx
35 changes: 35 additions & 0 deletions examples/next-replicate/.gitignore
@@ -0,0 +1,35 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.

# dependencies
/node_modules
/.pnp
.pnp.js

# testing
/coverage

# next.js
/.next/
/out/

# production
/build

# misc
.DS_Store
*.pem

# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# local env files
.env*.local

# vercel
.vercel

# typescript
*.tsbuildinfo
next-env.d.ts
42 changes: 42 additions & 0 deletions examples/next-replicate/README.md
@@ -0,0 +1,42 @@
# Vercel AI SDK, Next.js, and Replicate (Llama 2) Chat Example

This example shows how to use the [Vercel AI SDK](https://sdk.vercel.ai/docs) with [Next.js](https://nextjs.org/) and [Meta's Llama 2 70b Chat model](https://replicate.com/replicate/llama-2-70b-chat) hosted on [Replicate](https://replicate.com) to create a ChatGPT-like AI-powered streaming chat bot.

## Deploy your own

Deploy the example using [Vercel](https://vercel.com?utm_source=github&utm_medium=readme&utm_campaign=ai-sdk-example):

[![Deploy with Vercel](https://vercel.com/button)](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Fvercel-labs%2Fai%2Ftree%2Fmain%2Fexamples%2Fnext-replicate&env=REPLICATE_API_KEY&envDescription=Replicate%20API%20Key&envLink=https://replicate.com/account/api-tokens&project-name=vercel-ai-chat-replicate&repository-name=vercel-ai-chat-replicate)

## How to use

Execute [`create-next-app`](https://github.com/vercel/next.js/tree/canary/packages/create-next-app) with [npm](https://docs.npmjs.com/cli/init), [Yarn](https://yarnpkg.com/lang/en/docs/cli/create/), or [pnpm](https://pnpm.io) to bootstrap the example:

```bash
npx create-next-app --example https://github.com/vercel-labs/ai/tree/main/examples/next-replicate next-replicate-app
```

```bash
yarn create next-app --example https://github.com/vercel-labs/ai/tree/main/examples/next-replicate next-replicate-app
```

```bash
pnpm create next-app --example https://github.com/vercel-labs/ai/tree/main/examples/next-replicate next-replicate-app
```

To run the example locally you need to:

1. Sign up at [Replicate's Platform](https://replicate.com/signin).
2. Go to [Replicate's dashboard](https://replicate.com/account/api-tokens) and create an API token.
3. Set the required Replicate environment variable to the token value, as shown in [the example env file](./.env.local.example), but in a new file called `.env.local`.
4. `pnpm install` to install the required dependencies.
5. `pnpm dev` to launch the development server.

## Learn More

To learn more about Replicate, Next.js, and the Vercel AI SDK take a look at the following resources:

- [Vercel AI SDK docs](https://sdk.vercel.ai/docs)
- [Vercel AI Playground](https://play.vercel.ai)
- [Replicate Documentation](https://replicate.com/docs) - learn about Replicate features and API.
- [Next.js Documentation](https://nextjs.org/docs) - learn about Next.js features and API.
29 changes: 29 additions & 0 deletions examples/next-replicate/app/api/chat/route.ts
@@ -0,0 +1,29 @@
// ./app/api/chat/route.ts
import Replicate from 'replicate'
import { type Message, ReplicateStream, StreamingTextResponse } from 'ai'

// Create a Replicate API client (that's edge friendly!)
export const replicate = new Replicate({
  auth: process.env.REPLICATE_API_KEY || ''
})

// IMPORTANT! Set the runtime to edge
export const runtime = 'edge'

export async function POST(req: Request) {
  // Extract the `messages` from the body of the request
  const { messages } = await req.json()

  // Ask Replicate for a streaming chat completion given the prompt
  const prediction = await replicate.predictions.create({
    // Llama-70b-chat
    version: '2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1',
    input: { prompt: messages.map((m: Message) => m.content).join('\n') },
    stream: true
  })

  // Convert the response into a friendly text-stream
  const stream = await ReplicateStream(prediction)
  // Respond with the stream
  return new StreamingTextResponse(stream)
}
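The route above flattens the chat history into a single prompt string with a newline join. As a standalone sketch of that step (note this simple join drops role labels; a production app would typically use Llama 2's `[INST]`-style chat template instead):

```typescript
// Standalone sketch of the prompt flattening used in the route handler.
interface Message {
  role: 'user' | 'assistant' | 'system'
  content: string
}

function toPrompt(messages: Message[]): string {
  // Join message contents into one newline-separated prompt string.
  return messages.map(m => m.content).join('\n')
}
```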
Binary file added examples/next-replicate/app/favicon.ico
3 changes: 3 additions & 0 deletions examples/next-replicate/app/globals.css
@@ -0,0 +1,3 @@
@tailwind base;
@tailwind components;
@tailwind utilities;
21 changes: 21 additions & 0 deletions examples/next-replicate/app/layout.tsx
@@ -0,0 +1,21 @@
import './globals.css'
import { Inter } from 'next/font/google'

const inter = Inter({ subsets: ['latin'] })

export const metadata = {
  title: 'Create Next App',
  description: 'Generated by create next app'
}

export default function RootLayout({
  children
}: {
  children: React.ReactNode
}) {
  return (
    <html lang="en">
      <body className={inter.className}>{children}</body>
    </html>
  )
}
29 changes: 29 additions & 0 deletions examples/next-replicate/app/page.tsx
@@ -0,0 +1,29 @@
'use client'

import { useChat } from 'ai/react'

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat()

  return (
    <div className="flex flex-col w-full max-w-md py-24 mx-auto stretch">
      {messages.length > 0
        ? messages.map(m => (
            <div key={m.id} className="whitespace-pre-wrap">
              {m.role === 'user' ? 'User: ' : 'AI: '}
              {m.content}
            </div>
          ))
        : null}

      <form onSubmit={handleSubmit}>
        <input
          className="fixed bottom-0 w-full max-w-md p-2 mb-8 border border-gray-300 rounded shadow-xl"
          value={input}
          placeholder="Say something..."
          onChange={handleInputChange}
        />
      </form>
    </div>
  )
}
4 changes: 4 additions & 0 deletions examples/next-replicate/next.config.js
@@ -0,0 +1,4 @@
/** @type {import('next').NextConfig} */
const nextConfig = {}

module.exports = nextConfig
29 changes: 29 additions & 0 deletions examples/next-replicate/package.json
@@ -0,0 +1,29 @@
{
  "name": "next-replicate",
  "version": "0.1.0",
  "private": true,
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "start": "next start",
    "lint": "next lint"
  },
  "dependencies": {
    "ai": "^2.1.29",
    "next": "13.4.4-canary.11",
    "react": "18.2.0",
    "react-dom": "^18.2.0",
    "replicate": "^0.14.1"
  },
  "devDependencies": {
    "@types/node": "^17.0.12",
    "@types/react": "18.2.8",
    "@types/react-dom": "18.2.4",
    "autoprefixer": "^10.4.14",
    "eslint": "^7.32.0",
    "eslint-config-next": "13.4.4-canary.11",
    "postcss": "^8.4.23",
    "tailwindcss": "^3.3.2",
    "typescript": "5.1.3"
  }
}
6 changes: 6 additions & 0 deletions examples/next-replicate/postcss.config.js
@@ -0,0 +1,6 @@
module.exports = {
  plugins: {
    tailwindcss: {},
    autoprefixer: {}
  }
}
18 changes: 18 additions & 0 deletions examples/next-replicate/tailwind.config.js
@@ -0,0 +1,18 @@
/** @type {import('tailwindcss').Config} */
module.exports = {
  content: [
    './pages/**/*.{js,ts,jsx,tsx,mdx}',
    './components/**/*.{js,ts,jsx,tsx,mdx}',
    './app/**/*.{js,ts,jsx,tsx,mdx}'
  ],
  theme: {
    extend: {
      backgroundImage: {
        'gradient-radial': 'radial-gradient(var(--tw-gradient-stops))',
        'gradient-conic':
          'conic-gradient(from 180deg at 50% 50%, var(--tw-gradient-stops))'
      }
    }
  },
  plugins: []
}
28 changes: 28 additions & 0 deletions examples/next-replicate/tsconfig.json
@@ -0,0 +1,28 @@
{
  "compilerOptions": {
    "target": "es5",
    "lib": ["dom", "dom.iterable", "esnext"],
    "allowJs": true,
    "skipLibCheck": true,
    "strict": true,
    "forceConsistentCasingInFileNames": true,
    "noEmit": true,
    "esModuleInterop": true,
    "module": "esnext",
    "moduleResolution": "node",
    "resolveJsonModule": true,
    "isolatedModules": true,
    "jsx": "preserve",
    "incremental": true,
    "plugins": [
      {
        "name": "next"
      }
    ],
    "paths": {
      "@/*": ["./*"]
    }
  },
  "include": ["next-env.d.ts", "**/*.ts", "**/*.tsx", ".next/types/**/*.ts"],
  "exclude": ["node_modules"]
}
9 changes: 6 additions & 3 deletions packages/core/streams/ai-stream.ts
@@ -44,9 +44,12 @@ export function createEventStreamTransformer(
   eventSourceParser = createParser(
     (event: ParsedEvent | ReconnectInterval) => {
       if (
-        'data' in event &&
-        event.type === 'event' &&
-        event.data === '[DONE]'
+        ('data' in event &&
+          event.type === 'event' &&
+          event.data === '[DONE]') ||
+        // Replicate doesn't send [DONE] but does send a 'done' event
+        // @see https://replicate.com/docs/streaming
+        (event as any).event === 'done'
       ) {
         controller.terminate()
         return
