feat(bing): add support for context message (+ demo) and fast "Balanced" mode #216

Merged 11 commits on Mar 21, 2023
1 change: 1 addition & 0 deletions demos/context-demo-text.txt
@@ -0,0 +1 @@
Skip to main content Site Navigation Research Product Developers Safety Company Search Introducing ChatGPT and Whisper APIs Developers can now integrate ChatGPT and Whisper models into their apps and products through our API. Ruby Chen March 1, 2023 Authors Greg Brockman Atty Eleti Elie Georges Joanne Jang Logan Kilpatrick Rachel Lim Luke Miller Michelle Pokrass Product , Announcements ChatGPT and Whisper models are now available on our API, giving developers access to cutting-edge language (not just chat!) and speech-to-text capabilities. Through a series of system-wide optimizations, we’ve achieved 90% cost reduction for ChatGPT since December; we’re now passing through those savings to API users. Developers can now use our open-source Whisper large-v2 model in the API with much faster and cost-effective results. ChatGPT API users can expect continuous model improvements and the option to choose dedicated capacity for deeper control over the models. We’ve also listened closely to feedback from our developers and refined our API terms of service to better meet their needs. Get started Early users of ChatGPT and Whisper APIs Snap Inc., the creator of Snapchat, introduced My AI for Snapchat+ this week. The experimental feature is running on ChatGPT API. My AI offers Snapchatters a friendly, customizable chatbot at their fingertips that offers recommendations, and can even write a haiku for friends in seconds. Snapchat, where communication and messaging is a daily behavior, has 750 million monthly Snapchatters: Play video My AI for Snapchat+ Quizlet is a global learning platform with more than 60 million students using it to study, practice and master whatever they’re learning. Quizlet has worked with OpenAI for the last three years, leveraging GPT-3 across multiple use cases, including vocabulary learning and practice tests. 
With the launch of ChatGPT API, Quizlet is introducing Q-Chat, a fully-adaptive AI tutor that engages students with adaptive questions based on relevant study materials delivered through a fun chat experience: Play video Quizlet Q-Chat Instacart is augmenting the Instacart app to enable customers to ask about food and get inspirational, shoppable answers. This uses ChatGPT alongside Instacart’s own AI and product data from their 75,000+ retail partner store locations to help customers discover ideas for open-ended shopping goals, such as “How do I make great fish tacos?” or “What’s a healthy lunch for my kids?” Instacart plans to launch “Ask Instacart” later this year: Play video Instacart’s Ask Instacart Shop, Shopify’s consumer app, is used by 100 million shoppers to find and engage with the products and brands they love. ChatGPT API is used to power Shop’s new shopping assistant. When shoppers search for products, the shopping assistant makes personalized recommendations based on their requests. Shop’s new AI-powered shopping assistant will streamline in-app shopping by scanning millions of products to quickly find what buyers are looking for—or help them discover something new: Play video Shopify’s Shop app Speak is an AI-powered language learning app focused on building the best path to spoken fluency. They’re the fastest-growing English app in South Korea, and are already using the Whisper API to power a new AI speaking companion product, and rapidly bring it to the rest of the globe. Whisper’s human-level accuracy for language learners of every level unlocks true open-ended conversational practice and highly accurate feedback: Play video The Speak app ChatGPT API Model: The ChatGPT model family we are releasing today, gpt-3.5-turbo, is the same model used in the ChatGPT product. It is priced at $0.002 per 1k tokens, which is 10x cheaper than our existing GPT-3.5 models. 
It’s also our best model for many non-chat use cases—we’ve seen early testers migrate from text-davinci-003 to gpt-3.5-turbo with only a small amount of adjustment needed to their prompts. API: Traditionally, GPT models consume unstructured text, which is represented to the model as a sequence of “tokens.” ChatGPT models instead consume a sequence of messages together with metadata. (For the curious: under the hood, the input is still rendered to the model as a sequence of “tokens” for the model to consume; the raw format used by the model is a new format called Chat Markup Language (“ChatML”).) We’ve created a new endpoint to interact with our ChatGPT models: Request Response Python bindings curl https://api.openai.com/v1/chat/completions \\ -H "Authorization: Bearer $OPENAI_API_KEY" \\ -H "Content-Type: application/json" \\ -d '{ "model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "What is the OpenAI mission?"}] }' To learn more about the ChatGPT API, visit our Chat guide. ChatGPT upgrades We are constantly improving our ChatGPT models, and want to make these enhancements available to developers as well. Developers who use the gpt-3.5-turbo model will always get our recommended stable model, while still having the flexibility to opt for a specific model version. For example, today we’re releasing gpt-3.5-turbo-0301, which will be supported through at least June 1st, and we’ll update gpt-3.5-turbo to a new stable release in April. The models page will provide switchover updates. Dedicated instances We are also now offering dedicated instances for users who want deeper control over the specific model version and system performance. By default, requests are run on compute infrastructure shared with other users, who pay per request. Our API runs on Azure, and with dedicated instances, developers will pay by time period for an allocation of compute infrastructure that’s reserved for serving their requests. 
Developers get full control over the instance’s load (higher load improves throughput but makes each request slower), the option to enable features such as longer context limits, and the ability to pin the model snapshot. Dedicated instances can make economic sense for developers running beyond ~450M tokens per day. Additionally, it enables directly optimizing a developer’s workload against hardware performance, which can dramatically reduce costs relative to shared infrastructure. For dedicated instance inquiries, contact us. Whisper API Whisper, the speech-to-text model we open-sourced in September 2022, has received immense praise from the developer community but can also be hard to run. We’ve now made the large-v2 model available through our API, which gives convenient on-demand access priced at $0.006 / minute. In addition, our highly-optimized serving stack ensures faster performance compared to other services. Whisper API is available through our transcriptions (transcribes in source language) or translations (transcribes into English) endpoints, and accepts a variety of formats (m4a, mp3, mp4, mpeg, mpga, wav, webm): Request Response Python bindings curl https://api.openai.com/v1/audio/transcriptions \\ -H "Authorization: Bearer $OPENAI_API_KEY" \\ -H "Content-Type: multipart/form-data" \\ -F model="whisper-1" \\ -F file="@/path/to/file/openai.mp3" To learn more about the Whisper API, visit our Speech to Text guide. Developer focus Over the past six months, we’ve been collecting feedback from our API customers to understand how we can better serve them. We’ve made concrete changes, such as: Data submitted through the API is no longer used for service improvements (including model training) unless the organization opts in Implementing a default 30-day data retention policy for API users, with options for stricter retention depending on user needs. 
Removing our pre-launch review (unlocked by improving our automated monitoring) Improving developer documentation Simplifying our Terms of Service and Usage Policies, including terms around data ownership: users own the input and output of the models. For the past two months our uptime has not met our own expectations nor that of our users. Our engineering team’s top priority is now stability of production use cases—we know that ensuring AI benefits all of humanity requires being a reliable service provider. Please hold us accountable for improved uptime over the upcoming months! We believe that AI can provide incredible opportunities and economic empowerment to everyone, and the best way to achieve that is to allow everyone to build with it. We hope that the changes we announced today will lead to numerous applications that everyone can benefit from. Start building next-generation apps powered by ChatGPT & Whisper. Get started Authors Greg Brockman View all articles Atty Eleti View all articles Elie Georges View all articles Joanne Jang View all articles Logan Kilpatrick View all articles Rachel Lim View all articles Luke Miller View all articles Michelle Pokrass View all articles Acknowledgments Contributors Jeff Belgum, Jake Berdine, Trevor Cai, Alexander Carney, Brooke Chan, Che Chang, Derek Chen, Ruby Chen, Aidan Clark, Thomas Degry, Steve Dowling, Sheila Dunning, Liam Fedus, Vik Goel, Scott Gray, Aurelia Guy, Jeff Harris, Peter Hoeschele, Angela Jiang, Denny Jin, Jong Wook Kim, Yongjik Kim, Michael Lampe, Daniel Levy, Brad Lightcap, Patricia Lue, Bianca Martin, Christine McLeavey, Luke Metz, Andrey Mishchenko, Vinnie Monaco, Evan Morikawa, Mira Murati, Rohan Nuttall, Alex Paino, Ashley Pantuliano, Mikhail Pavlov, Andrew Peng, Henrique Ponde de Oliveira Pinto, Alec Radford, Kendra Rimbach, Aliisa Rosenthal, Nick Ryder, Ted Sanders, Heather Schmidt, John Schulman, Zarina Stanik, Felipe Such, Nick Turley, Carroll Wainwright, Peter Welinder, Clemens Winter, 
Sherwin Wu, Tao Xu, Qiming Yuan, Barret Zoph Related research View all research GPT-4 Mar 14, 2023 March 14, 2023 Forecasting potential misuses of language models for disinformation campaigns and how to reduce risk Jan 11, 2023 January 11, 2023 Point-E: A system for generating 3D point clouds from complex prompts Dec 16, 2022 December 16, 2022 Scaling laws for reward model overoptimization Oct 19, 2022 October 19, 2022 Research Overview Index Product Overview GPT-4 DALL·E 2 Customer stories Safety standards Pricing Safety Overview Company About Careers Blog Charter OpenAI © 2015 – 2023 Terms & policies Twitter YouTube GitHub SoundCloud LinkedIn Back to top
25 changes: 23 additions & 2 deletions demos/use-bing-client.js
@@ -1,7 +1,15 @@
// eslint-disable-next-line no-unused-vars
import { KeyvFile } from 'keyv-file';
import { fileURLToPath } from 'url';
import path, { dirname } from 'path';
import fs from 'fs';
import { BingAIClient } from '../index.js';

// eslint-disable-next-line no-underscore-dangle
const __filename = fileURLToPath(import.meta.url);
// eslint-disable-next-line no-underscore-dangle
const __dirname = dirname(__filename);

const options = {
// Necessary for some people in different countries, e.g. China (https://cn.bing.com)
host: '',
@@ -15,11 +23,11 @@ const options = {
debug: false,
};

-const bingAIClient = new BingAIClient(options);
+let bingAIClient = new BingAIClient(options);

let response = await bingAIClient.sendMessage('Write a short poem about cats', {
// (Optional) Set a conversation style for this message (default: 'balanced')
-toneStyle: 'balanced', // or creative, precise
+toneStyle: 'balanced', // or creative, precise, fast
onProgress: (token) => {
process.stdout.write(token);
},
@@ -37,6 +45,19 @@ response = await bingAIClient.sendMessage('Now write it in French', {
});
console.log(JSON.stringify(response, null, 2)); // {"jailbreakConversationId":false,"conversationId":"...","conversationSignature":"...","clientId":"...","invocationId":2,"messageId":"...","conversationExpiryTime":"2023-03-08T03:20:23.463914Z","response":"Here is the same poem in French: ...","details":{ /* raw response... */ }}

/*
Sending context data
*/
bingAIClient = new BingAIClient(options);

response = await bingAIClient.sendMessage('Could you provide short and precise takeaways, do not search the web and only use the content from the document. The factual information should be literally from the document. Please memorize the part in the document which mention the factual information, but do not mark them explicitly. The takeaway should be credible, highly readable and informative. Please make the answer short, preferably within 500 characters. Generate the response in English language.', {
context: fs.readFileSync(path.resolve(__dirname, './context-demo-text.txt'), 'utf8'), // ChatGPT API announcement, about 10k characters, scraped from the blog post https://openai.com/blog/introducing-chatgpt-and-whisper-apis
onProgress: (token) => {
process.stdout.write(token);
},
});
console.log(JSON.stringify(response, null, 2)); // {"jailbreakConversationId":false,"conversationId":"...","conversationSignature":"...","clientId":"...","invocationId":2,"messageId":"...","conversationExpiryTime":"2023-03-08T03:20:23.463914Z","response":"Some possible takeaways from the document are... Some early users of ChatGPT and Whisper APIs include Snap Inc., Quizlet, Instacart, Shopify and Speak.","details":{ /* raw response... */ }}

/*
Activate jailbreak mode by setting `jailbreakConversationId` to `true`.
This will return a `jailbreakConversationId` that you can use to continue the conversation.
38 changes: 30 additions & 8 deletions src/BingAIClient.js
@@ -151,15 +151,16 @@ export default class BingAIClient {
} = opts;

const {
-toneStyle = 'balanced', // or creative, precise
+toneStyle = 'balanced', // or creative, precise, fast
invocationId = 0,
systemMessage,
context,
parentMessageId = jailbreakConversationId === true ? crypto.randomUUID() : null,
abortController = new AbortController(),
} = opts;

if (typeof onProgress !== 'function') {
-onProgress = () => {};
+onProgress = () => { };
}

if (jailbreakConversationId || !conversationSignature || !conversationId || !clientId) {
@@ -238,6 +239,7 @@ export default class BingAIClient {
role: 'User',
message,
};

if (jailbreakConversationId) {
conversation.messages.push(userMessage);
}
@@ -253,7 +255,11 @@ export default class BingAIClient {
toneOption = 'h3imaginative';
} else if (toneStyle === 'precise') {
toneOption = 'h3precise';
} else if (toneStyle === 'fast') {
// new "Balanced" mode, allegedly GPT-3.5 turbo
toneOption = 'galileo';
} else {
// old "Balanced" mode
toneOption = 'harmonyv3';
}
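The tone branch above could also be written as a lookup table. The following is only a sketch, not code from this PR: `TONE_OPTIONS` and `resolveToneOption` are hypothetical names, and the `creative` to `h3imaginative` pairing is assumed from the option comment, since the first condition of the chain falls outside this hunk.

```javascript
// Hypothetical helper, not part of the PR: maps a toneStyle value to the
// option string used in the if/else chain shown in the diff.
const TONE_OPTIONS = new Map([
    ['creative', 'h3imaginative'], // assumed pairing; the condition is not visible in the hunk
    ['precise', 'h3precise'],
    ['fast', 'galileo'], // new "Balanced" mode
]);

function resolveToneOption(toneStyle) {
    // Any other value, including 'balanced', falls back to the old "Balanced" mode.
    return TONE_OPTIONS.get(toneStyle) ?? 'harmonyv3';
}

console.log(resolveToneOption('fast')); // galileo
console.log(resolveToneOption('balanced')); // harmonyv3
```

A table keeps the fallback in one place, so adding another tone would be a one-line change.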

@@ -290,19 +296,35 @@ export default class BingAIClient {
id: clientId,
},
conversationId,
previousMessages: [],
},
],
invocationId: invocationId.toString(),
target: 'chat',
type: 4,
};

if (previousMessagesFormatted) {
-obj.arguments[0].previousMessages = [
-{
-text: previousMessagesFormatted,
-author: 'bot',
-},
-];
+obj.arguments[0].previousMessages.push({
+text: previousMessagesFormatted,
+author: 'bot',
+});
}

// simulates document summary function on Edge's Bing sidebar
// unknown character limit, at least up to 7k
if (context) {
obj.arguments[0].previousMessages.push({
author: 'user',
description: context,
contextType: 'WebPage',
messageType: 'Context',
messageId: 'discover-web--page-ping-mriduna-----',
});
}

if (obj.arguments[0].previousMessages.length === 0) {
delete obj.arguments[0].previousMessages;
}
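To make the resulting payload shape easier to check in isolation, the `previousMessages` logic above can be sketched as a standalone function. This is an illustration only: `buildPreviousMessages` is a hypothetical name, and returning `undefined` stands in for the `delete` in the diff. The field values are copied from the hunk.

```javascript
// Hypothetical standalone version of the previousMessages logic in the diff.
function buildPreviousMessages(previousMessagesFormatted, context) {
    const previousMessages = [];

    // Earlier conversation turns (jailbreak mode) are sent as a single bot message.
    if (previousMessagesFormatted) {
        previousMessages.push({
            text: previousMessagesFormatted,
            author: 'bot',
        });
    }

    // The context document is appended as a 'Context' message of type 'WebPage'.
    if (context) {
        previousMessages.push({
            author: 'user',
            description: context,
            contextType: 'WebPage',
            messageType: 'Context',
            messageId: 'discover-web--page-ping-mriduna-----',
        });
    }

    // Mirrors the `delete` in the diff: no key at all when there is nothing to send.
    return previousMessages.length > 0 ? previousMessages : undefined;
}

const example = buildPreviousMessages('', 'Some document text');
console.log(example.length); // 1
console.log(example[0].messageType); // Context
```

When both arguments are set, the bot message comes first and the context message second, matching the order of the two `push` calls in the diff.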

const messagePromise = new Promise((resolve, reject) => {