Retry upon Azure GPT-4 model overloaded requests error (#23)
- Upgrade to the latest Next.js and migrate the API code to the App Router
- Retry on `That model is currently overloaded with other requests` errors
- Revise the README documentation for clarity and more detailed instructions
- Significantly update the application's Next.js and TypeScript configuration

Resolves #21 #22
blrchen committed Jul 24, 2023
1 parent 79f2953 commit 055d02a
Showing 23 changed files with 1,770 additions and 5,620 deletions.
3 changes: 0 additions & 3 deletions .eslintrc.json

This file was deleted.

3 changes: 0 additions & 3 deletions .vscode/extensions.json

This file was deleted.

6 changes: 0 additions & 6 deletions .vscode/settings.json

This file was deleted.

34 changes: 20 additions & 14 deletions README.en-US.md

[English](./README.en-US.md) | Simplified Chinese

Azure OpenAI Proxy is a tool that transforms OpenAI API requests into Azure OpenAI API requests, allowing OpenAI-compatible applications to use Azure OpenAI seamlessly.

## Prerequisites

An Azure OpenAI account is required to use Azure OpenAI Proxy.

## Azure Deployment

[![Deploy to Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Fscalaone%2Fazure-openai-proxy%2Fmain%2Fdeploy%2Fazure-deploy.json)

Remember to:

- Select the region that matches your Azure OpenAI resource for best performance.
- If deployment fails because the 'proxywebapp' name is already taken, change the resource prefix and redeploy.
- The deployed proxy app is part of a B1 pricing tier Azure web app plan, which can be modified in the Azure Portal after deployment.

## Docker Deployment

To deploy using Docker, execute the following command:

`docker run -d -p 3000:3000 scalaone/azure-openai-proxy`

## Local Execution and Testing (Command Line)

Follow these steps:

1. Install NodeJS 18.
2. Clone the repository in the command line window.
3. Run `npm install` to install the dependencies.
4. Run `npm start` to start the application.
5. Use the script below for testing. Replace `AZURE_RESOURCE_ID`, `AZURE_MODEL_DEPLOYMENT`, and `AZURE_API_KEY` before running. `AZURE_API_VERSION` is optional and defaults to `2023-05-15`.

<details>
<summary>Test script</summary>
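The full test script is collapsed in this view; a minimal equivalent request can be sketched as follows (the message payload is illustrative, and the placeholder values must be replaced as described above):

```bash
curl -X "POST" "http://localhost:3000/v1/chat/completions" \
  -H "Authorization: Bearer AZURE_RESOURCE_ID:AZURE_MODEL_DEPLOYMENT:AZURE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{ "role": "user", "content": "Hello" }]
  }'
```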
</details>

The azure-openai-proxy has been tested and confirmed to work with the following applications:

| Application Name | Docker-compose File |
| --------------------------------------------------------------- | --------------------------------------------------------------- |
| [chatgpt-next-web](https://github.com/Yidadaa/ChatGPT-Next-Web) | [docker-compose.yml](./e2e/chatgpt-next-web/docker-compose.yml) |
| [chatbot-ui](https://github.com/mckaywrigley/chatbot-ui)        | [docker-compose.yml](./e2e/chatbot-ui/docker-compose.yml)       |
| [chatgpt-web](https://github.com/Chanzhaoyu/chatgpt-web)        | [docker-compose.yml](./e2e/chatgpt-web/docker-compose.yml)      |
| [chatgpt-lite](https://github.com/blrchen/chatgpt-lite)         | [docker-compose.yml](./e2e/chatgpt-lite/docker-compose.yml)     |
| [chatgpt-minimal](https://github.com/blrchen/chatgpt-minimal)   | [docker-compose.yml](./e2e/chatgpt-minimal/docker-compose.yml)  |

To test locally, follow these steps:

1. Clone the repository in a command-line window.
2. Update the `OPENAPI_API_KEY` environment variable with `AZURE_RESOURCE_ID:AZURE_MODEL_DEPLOYMENT:AZURE_API_KEY`. Alternatively, update the `OPENAPI_API_KEY` value directly in the docker-compose.yml file.
3. Navigate to the directory containing the `docker-compose.yml` file for the application you want to test.
4. Run the build command: `docker-compose build`.
5. Start the service: `docker-compose up -d`.
6. Access the application locally using the exposed port defined in the docker-compose.yml file. For example, visit http://localhost:3000.
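Each e2e directory referenced above provides its own docker-compose.yml. As a rough sketch (not one of the actual files), the proxy service itself can be brought up like this, with the application container pointing its OpenAI base URL at `http://azure-openai-proxy:3000`:

```yaml
version: '3'
services:
  azure-openai-proxy:
    image: scalaone/azure-openai-proxy
    ports:
      - '3000:3000'
```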

## FAQs

<details>
<summary>Q: What are `AZURE_RESOURCE_ID`, `AZURE_MODEL_DEPLOYMENT`, and `AZURE_API_KEY`?</summary>

A: These can be found in the Azure management portal. Refer to the image below for details:

![resource-and-model](./resource-and-model.jpg)

</details>

<details>
<summary>Q: How can I use the gpt-4 and gpt-4-32k models?</summary>

A: To use the gpt-4 and gpt-4-32k models, use a key in the following format:

`AZURE_RESOURCE_ID:gpt-3.5-turbo|AZURE_MODEL_DEPLOYMENT,gpt-4|AZURE_MODEL_DEPLOYMENT,gpt-4-32k|AZURE_MODEL_DEPLOYMENT:AZURE_API_KEY:AZURE_API_VERSION`

</details>
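The mapping portion of such a key can be illustrated with a small TypeScript sketch mirroring the parsing the proxy performs (the resource id, deployment names, and API key below are all hypothetical):

```typescript
// Hypothetical key: resource id, model|deployment pairs, Azure API key, API version.
const key =
  'myresource:gpt-3.5-turbo|gpt35-deploy,gpt-4|gpt4-deploy,gpt-4-32k|gpt4-32k-deploy:abc123:2023-05-15'
const [resourceId, mapping, azureApiKey, apiVersion] = key.split(':')

// Build the model -> deployment map from the comma-separated `model|deployment` pairs.
const modelMapping = Object.fromEntries(mapping.split(',').map((pair) => pair.split('|')))

console.log(resourceId) // myresource
console.log(modelMapping['gpt-4']) // gpt4-deploy
console.log(azureApiKey) // abc123
console.log(apiVersion) // 2023-05-15
```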

## Contributing

We welcome all PR submissions.

## Disclaimer

26 changes: 16 additions & 10 deletions README.md

[English](./README.en-US.md) | Simplified Chinese

Azure OpenAI Proxy is a proxy for the OpenAI API that converts OpenAI API requests into Azure OpenAI API requests, allowing applications that only support OpenAI to use Azure OpenAI seamlessly.

## Prerequisites

An Azure OpenAI account is required to use Azure OpenAI Proxy.

## Azure 部署

[![Deploy to Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Fscalaone%2Fazure-openai-proxy%2Fmain%2Fdeploy%2Fazure-deploy.json)

Please note:

- Select the region that matches your Azure OpenAI resource for best performance.
- If deployment fails because the 'proxywebapp' name is already taken, simply change the resource prefix and redeploy.
- The deployed proxy app runs on a B1-tier Azure web app plan, which can be changed in the Azure Portal after deployment.

## Docker Deployment

```bash
docker run -d -p 3000:3000 scalaone/azure-openai-proxy
```

## Local Execution and Testing (Command Line)

1. Install NodeJS 18.
2. Clone the repository in a command-line window.
3. Run `npm install` to install dependencies.
4. Run `npm start` to start the application.
5. Use the script below for testing. Before running, replace `AZURE_RESOURCE_ID`, `AZURE_MODEL_DEPLOYMENT`, and `AZURE_API_KEY`. `AZURE_API_VERSION` is optional and currently defaults to `2023-05-15`.

<details>
<summary>Test script</summary>
curl -X "POST" "http://localhost:3000/v1/chat/completions" \

</details>

The following applications have been tested and confirmed to work with azure-openai-proxy:

| Application Name | E2E Docker-compose File |
| --------------------------------------------------------------- | --------------------------------------------------------------- |
| [chatgpt-next-web](https://github.com/Yidadaa/ChatGPT-Next-Web) | [docker-compose.yml](./e2e/chatgpt-next-web/docker-compose.yml) |
| [chatbot-ui](https://github.com/mckaywrigley/chatbot-ui)        | [docker-compose.yml](./e2e/chatbot-ui/docker-compose.yml)       |
| [chatgpt-web](https://github.com/Chanzhaoyu/chatgpt-web)        | [docker-compose.yml](./e2e/chatgpt-web/docker-compose.yml)      |
| [chatgpt-lite](https://github.com/blrchen/chatgpt-lite)         | [docker-compose.yml](./e2e/chatgpt-lite/docker-compose.yml)     |
| [chatgpt-minimal](https://github.com/blrchen/chatgpt-minimal)   | [docker-compose.yml](./e2e/chatgpt-minimal/docker-compose.yml)  |

To run the tests locally, follow these steps:
<details>
<summary>Q: What are `AZURE_RESOURCE_ID`, `AZURE_MODEL_DEPLOYMENT`, and `AZURE_API_KEY`?</summary>

A: These can be found in the Azure management portal; see the annotated image below:

![resource-and-model](./resource-and-model.jpg)
</details>

<details>
<summary>Q: How can I use the gpt-4 and gpt-4-32k models?</summary>
A: To use the gpt-4 and gpt-4-32k models, use a key in the following format:

`AZURE_RESOURCE_ID:gpt-3.5-turbo|AZURE_MODEL_DEPLOYMENT,gpt-4|AZURE_MODEL_DEPLOYMENT,gpt-4-32k|AZURE_MODEL_DEPLOYMENT:AZURE_API_KEY:AZURE_API_VERSION`

File renamed without changes.
17 changes: 17 additions & 0 deletions app/layout.tsx
import type { Metadata } from 'next'
import { Inter } from 'next/font/google'

const inter = Inter({ subsets: ['latin'] })

export const metadata: Metadata = {
  title: 'Create Next App',
  description: 'Generated by create next app'
}

export default function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    <html lang="en">
      <body className={inter.className}>{children}</body>
    </html>
  )
}
8 changes: 8 additions & 0 deletions app/page.tsx
import Image from 'next/image'

export default function Home() {
  return (
    <main>
    </main>
  )
}
116 changes: 116 additions & 0 deletions app/v1/chat/completions/route.ts
import { NextRequest, NextResponse } from 'next/server'

const DEFAULT_API_VERSION = '2023-05-15'
const MAX_RETRY_COUNT = 3
const RETRY_DELAY = 1000

export async function POST(request: NextRequest) {
  const apiKey = request.headers.get('authorization')?.replace('Bearer ', '')
  if (!apiKey) {
    return NextResponse.json({ message: 'Unauthenticated' }, { status: 401 })
  }
  const body = await request.json()

  let retryCount = 0
  while (true) {
    const response = await chat(apiKey, body)
    const status = response.status
    // Successes and client errors are returned immediately; only transient
    // server-side failures are retried.
    if (status < 300 || status === 400) {
      return response
    }
    if (retryCount >= MAX_RETRY_COUNT) {
      return response
    }
    retryCount++
    console.log(`Status is ${status}, retry attempt ${retryCount}`)
    await delay(RETRY_DELAY)
  }
}

async function chat(apiKey: string, body: any) {
  const [resourceId, mapping, azureApiKey, apiVersion] = apiKey.split(':')
  const model = body['model']

  // Resolve the deployment id: the mapping is either a single deployment name
  // or a comma-separated list of `model|deployment` pairs.
  let deploymentId
  if (mapping.includes('|')) {
    const modelMapping = Object.fromEntries(mapping.split(',').map((pair) => pair.split('|')))
    deploymentId = modelMapping[model] || Object.values(modelMapping)[0]
  } else {
    deploymentId = mapping
  }

  const url = `https://${resourceId}.openai.azure.com/openai/deployments/${deploymentId}/chat/completions?api-version=${
    apiVersion || DEFAULT_API_VERSION
  }`
  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'api-key': azureApiKey,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(body)
  })
  console.log(`[${resourceId}][${deploymentId}] ${response.status} ${response.statusText}`)

  let resultStream: ReadableStream | undefined
  let isFirstEventData = true
  const status: number = await new Promise((resolve) => {
    const decoder = new TextDecoder()
    const reader = response.body!.getReader()
    resultStream = new ReadableStream(
      {
        async pull(controller) {
          const { value, done } = await reader.read()
          if (done) {
            controller.close()
            return
          }
          if (isFirstEventData) {
            isFirstEventData = false
            // Inspect the first event to decide whether the upstream error is
            // retryable before streaming the rest of the body through.
            if (shouldRetry(decoder.decode(value))) {
              resolve(500)
            } else {
              resolve(response.status)
            }
          }
          controller.enqueue(value)
        }
      },
      {
        highWaterMark: 1,
        size(chunk) {
          return chunk.length
        }
      }
    )
  })
  return new Response(resultStream, {
    status: status,
    headers: response.headers
  })
}

function delay(ms: number) {
  return new Promise((resolve) => setTimeout(resolve, ms))
}

function shouldRetry(data: string) {
  let shouldRetry = false
  try {
    const json = data.startsWith('data: ') ? data.match(/^data: (.*?)$/m)?.[1] : data
    const jobject = JSON.parse(json!)
    if (
      jobject?.error?.message?.startsWith('That model is currently overloaded with other requests')
    ) {
      shouldRetry = true
    }
  } catch (e) {
    console.error(`first event data string: ${data}`)
    console.error(`parse json error: ${e}`)
  }
  return shouldRetry
}
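The retry predicate can be exercised in isolation against sample stream events. The payloads below are illustrative assumptions about the error shape, not captured responses:

```typescript
// Standalone copy of the proxy's retry predicate, for demonstration only.
function shouldRetry(data: string): boolean {
  try {
    // SSE events carry a `data: ` prefix; strip it before parsing.
    const json = data.startsWith('data: ') ? data.match(/^data: (.*?)$/m)?.[1] : data
    const jobject = JSON.parse(json!)
    return (
      jobject?.error?.message?.startsWith(
        'That model is currently overloaded with other requests'
      ) === true
    )
  } catch {
    return false
  }
}

const overloaded =
  'data: {"error":{"message":"That model is currently overloaded with other requests. Please retry."}}'
const ok = 'data: {"choices":[{"delta":{"content":"Hi"}}]}'
console.log(shouldRetry(overloaded)) // true
console.log(shouldRetry(ok)) // false
```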
