
Sweep(slow): count tokens on the server side #17

Open
4 tasks done
CNSeniorious000 opened this issue Nov 30, 2023 · 1 comment · May be fixed by #20

Comments

@CNSeniorious000
Owner

CNSeniorious000 commented Nov 30, 2023

Details

In src/pages/api/generate.ts, add the same message-trimming logic that already exists in src/components/Generator.tsx.
Note, however, that the tiktoken library cannot be used on the server side; only the tiktoken-js library can. The two should expose a similar interface.
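The client-side logic in question boils down to a loop that drops the oldest messages until the conversation fits the token budget. A minimal, self-contained sketch (the `countTokens` stand-in here is hypothetical; the real code counts with a tiktoken encoder):

```typescript
interface ChatMessage { role: string; content: string }

// Hypothetical stand-in counter: ~1 token per 4 characters,
// plus 4 tokens of per-message overhead (as in the real client code).
const countTokens = (messages: ChatMessage[]): number =>
  messages.reduce((sum, m) => sum + 4 + Math.ceil(m.content.length / 4), 0)

// Drop the oldest messages until the list fits `maxTokens`,
// but never trim below `minMessages`.
export function trimMessages(
  messages: ChatMessage[],
  maxTokens: number,
  minMessages = 3,
): ChatMessage[] {
  const trimmed = [...messages]
  while (trimmed.length > minMessages && countTokens(trimmed) > maxTokens)
    trimmed.shift()
  return trimmed
}
```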

Checklist
  • Create src/utils/tiktoken-server.ts (09d7244)
  • Running GitHub Actions for src/utils/tiktoken-server.ts
  • Modify src/pages/api/generate.ts (30a5ea4)
  • Running GitHub Actions for src/pages/api/generate.ts

Flowchart


sweep-ai bot commented Nov 30, 2023

Here's the PR! #20.

Sweep Basic Tier: I'm using GPT-4. You have 4 GPT-4 tickets left for the month and 3 for the day. (tracking ID: 1bfb7247a1)

For more GPT-4 tickets, visit our payment portal. For a one week free trial, try Sweep Pro (unlimited GPT-4 tickets).


Sandbox Execution ✓

Here are the sandbox execution logs prior to making any changes:

Sandbox logs for 117c9ef
Checking src/pages/api/generate.ts for syntax errors...
✅ src/pages/api/generate.ts has no syntax errors! (1/1 ✓)

Sandbox passed on the latest endless, so sandbox checks will be enabled for this issue.


Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I think are relevant in decreasing order of relevance (click to expand). If some file is missing from here, you can mention the path in the ticket description.

import { Index, Match, Show, Switch, batch, createEffect, createSignal, onMount } from 'solid-js'
import { Toaster, toast } from 'solid-toast'
import { useThrottleFn } from 'solidjs-use'
import { generateSignature } from '@/utils/auth'
import { fetchModeration, fetchTitle } from '@/utils/misc'
import { audioChunks, getAudioBlob, startRecording, stopRecording } from '@/utils/record'
import { countTokens } from '@/utils/tiktoken'
import { MessagesEvent } from '@/utils/events'
import IconClear from './icons/Clear'
import MessageItem from './MessageItem'
import SystemRoleSettings from './SystemRoleSettings'
import ErrorMessageItem from './ErrorMessageItem'
import TokenCounter, { encoder } from './TokenCounter'
import type { ChatMessage, ErrorMessage } from '@/types'
import type { Setter } from 'solid-js'
export const minMessages = Number(import.meta.env.PUBLIC_MIN_MESSAGES ?? 3)
export const maxTokens = Number(import.meta.env.PUBLIC_MAX_TOKENS ?? 3000)

// #vercel-disable-blocks
import { ProxyAgent, fetch } from 'undici'
// #vercel-end
import { generatePayload, parseOpenAIStream } from '@/utils/openAI'
import { verifySignature } from '@/utils/auth'
import type { APIRoute } from 'astro'
const apiKey = import.meta.env.OPENAI_API_KEY
const httpsProxy = import.meta.env.HTTPS_PROXY
const baseUrl = ((import.meta.env.OPENAI_API_BASE_URL) || 'https://api.openai.com').trim().replace(/\/$/, '')
const sitePassword = import.meta.env.SITE_PASSWORD
const ua = import.meta.env.UNDICI_UA
const FORWARD_HEADERS = ['origin', 'referer', 'cookie', 'user-agent', 'via']

const storagePassword = localStorage.getItem('pass')
try {
  const controller = new AbortController()
  setController(controller)
  const requestMessageList = [...messageList()]
  let limit = maxTokens
  const systemMsg = currentSystemRoleSettings()
    ? {
      role: 'system',
      content: currentSystemRoleSettings(),
    } as ChatMessage
    : null
  systemMsg && (limit -= countTokens(encoder()!, [systemMsg])!.total)
  while (requestMessageList.length > minMessages && countTokens(encoder()!, requestMessageList)!.total > limit)
    requestMessageList.shift()
  systemMsg && requestMessageList.unshift(systemMsg)
  const timestamp = Date.now()
  const response = await fetch('/api/generate', {
    method: 'POST',
    body: JSON.stringify({
      model: localStorage.getItem('model') || 'gpt-3.5-turbo-1106',
      messages: requestMessageList,
      time: timestamp,
      pass: storagePassword,
      sign: await generateSignature({
        t: timestamp,
        m: requestMessageList?.[requestMessageList.length - 1]?.content || '',
      }),
    }),
    signal: controller.signal,
    headers: localStorage.getItem('apiKey') ? { authorization: `Bearer ${localStorage.getItem('apiKey')}` } : {},
  })
  if (!response.ok) {
    const error = await response.json()
    console.error(error.error)
    setCurrentError(error.error)
    throw new Error('Request failed')
  }

import type { ChatMessage } from '@/types'
import type { Tiktoken } from 'tiktoken'
const countTokensSingleMessage = (enc: Tiktoken, message: ChatMessage) => {
  return 4 + enc.encode(message.content).length // im_start, im_end, role/name, "\n"
}
export const countTokens = (enc: Tiktoken | null, messages: ChatMessage[]) => {
  if (messages.length === 0) return
  if (!enc) return { total: Infinity }
  const lastMsg = messages.at(-1)
  const context = messages.slice(0, -1)
  const countTokens: (message: ChatMessage) => number = countTokensSingleMessage.bind(null, enc)
  const countLastMsg = countTokens(lastMsg!)
  const countContext = context.map(countTokens).reduce((a, b) => a + b, 3) // im_start, "assistant", "\n"
  return { countContext, countLastMsg, total: countContext + countLastMsg }
}
const cl100k_base_json = import.meta.env.PUBLIC_CL100K_BASE_JSON_URL || '/cl100k_base.json'
const tiktoken_bg_wasm = import.meta.env.PUBLIC_TIKTOKEN_BG_WASM_URL || '/tiktoken_bg.wasm'
async function getBPE() {
  return fetch(cl100k_base_json).then(r => r.json())
}
export const initTikToken = async() => {
  const { init } = await import('tiktoken/lite/init')
  const [{ bpe_ranks, special_tokens, pat_str }, { Tiktoken }] = await Promise.all([
    getBPE().catch(console.error),
    import('tiktoken/lite/init'),
    fetch(tiktoken_bg_wasm).then(r => r.arrayBuffer()).then(wasm => init(imports => WebAssembly.instantiate(wasm, imports))),
  ])
  return new Tiktoken(bpe_ranks, special_tokens, pat_str)
}

Step 2: ⌨️ Coding

  • Create src/utils/tiktoken-server.ts (09d7244)
Create src/utils/tiktoken-server.ts with contents:
• Create a new utility file named `tiktoken-server.ts` in the `src/utils` directory for the server-side token counting logic.
• Use `tiktoken-js` instead of `tiktoken` as the server-side equivalent library.
• Define and export a function `countTokensServer` that implements the same logic as `countTokens` from `src/utils/tiktoken.ts`.
• Ensure the function interface matches that of the `countTokens` presently on the client side, taking an encoder and a list of messages as arguments and returning an object with the total token count.
• Make sure to wrap any initializations that are not available on the server, such as fetching base configurations or initializing WebAssembly modules, in a server-compatible manner.
  • Running GitHub Actions for src/utils/tiktoken-server.ts
Check src/utils/tiktoken-server.ts with contents:

Ran GitHub Actions for 09d72442ebe25ea72693afd406fe601d703d1b27:
• Vercel Preview Comments:
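A hedged sketch of what src/utils/tiktoken-server.ts could look like, following the plan above. The encoder is typed structurally so the counting logic stands alone without pinning down the package; in the real file, `enc` would come from the pure-JS tokenizer the issue mentions (e.g. `getEncoding('cl100k_base')` from the js-tiktoken package, if that is the intended "tiktoken-js"), and ChatMessage would be imported from '@/types':

```typescript
// Structural types so this sketch is self-contained; the real file would
// import ChatMessage from '@/types' and obtain the encoder from the
// server-compatible tokenizer library.
interface ChatMessage { role: string; content: string }
interface Encoder { encode(text: string): number[] }

const countTokensSingleMessage = (enc: Encoder, message: ChatMessage): number =>
  4 + enc.encode(message.content).length // im_start, im_end, role/name, "\n"

// Same shape as the client-side countTokens, so callers can be shared.
export const countTokensServer = (enc: Encoder | null, messages: ChatMessage[]) => {
  if (messages.length === 0) return undefined
  if (!enc) return { total: Infinity }
  const lastMsg = messages[messages.length - 1]
  const context = messages.slice(0, -1)
  const countLastMsg = countTokensSingleMessage(enc, lastMsg)
  const countContext = context
    .map(m => countTokensSingleMessage(enc, m))
    .reduce((a, b) => a + b, 3) // im_start, "assistant", "\n" priming tokens
  return { countContext, countLastMsg, total: countContext + countLastMsg }
}
```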

  • Modify src/pages/api/generate.ts (30a5ea4)
Modify src/pages/api/generate.ts with contents:
• In the `post` method of the API route, import the `countTokensServer` function from `src/utils/tiktoken-server.ts`.
• After retrieving the request body, apply the token counting logic to trim the `messages` array, ensuring it remains under a defined token limit.
• Use the constants defined in `src/components/Generator.tsx` like `minMessages` and `maxTokens` to set the lower message limit and token count limit. These may need to be moved to a shared constants file if they are not already.
• Ensure that after implementing the logic, the trimmed `messages` are then passed on for the rest of the processing where the generation payload is created.
  • Running GitHub Actions for src/pages/api/generate.ts
Check src/pages/api/generate.ts with contents:

Ran GitHub Actions for 30a5ea4d0bdc06e092563c96327c3e11eeb3cff2:
• Vercel Preview Comments:
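The handler-side trimming could then mirror the loop in Generator.tsx, including the budget reserved for the system message. A sketch with the counter injected so it runs standalone; `trimWithSystemMessage` is a hypothetical helper name, and in the real route the counter would be `countTokensServer` bound to the server-side encoder:

```typescript
interface ChatMessage { role: string; content: string }
type CountFn = (messages: ChatMessage[]) => { total: number }

// Trim the oldest messages until the list fits the token budget,
// reserving room for the system message and re-attaching it at the front,
// as the client-side code in Generator.tsx does.
export function trimWithSystemMessage(
  messages: ChatMessage[],
  systemMsg: ChatMessage | null,
  countTokens: CountFn,
  maxTokens: number,
  minMessages = 3,
): ChatMessage[] {
  const list = [...messages]
  let limit = maxTokens
  if (systemMsg) limit -= countTokens([systemMsg]).total
  while (list.length > minMessages && countTokens(list).total > limit)
    list.shift()
  if (systemMsg) list.unshift(systemMsg)
  return list
}
```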


Step 3: 🔁 Code Review

I have finished reviewing the code for completeness. I did not find errors for sweep/server-side-token-counting_1.


