🎻
transitive-bullshit committed Dec 20, 2023
1 parent 5288d11 commit 025c5fe
Showing 3 changed files with 214 additions and 0 deletions.
@@ -0,0 +1,23 @@
# ByteDance is using GPT-4 to train its models

[ByteDance got busted](https://www.theverge.com/2023/12/15/24003151/bytedance-china-openai-microsoft-competitor-llm?utm_source=bensbites\&utm_medium=referral\&utm_campaign=bytedance-is-using-gpt-4-to-train-its-models) using responses from OpenAI’s AI models to secretly build its own chatbot. Not cool, since that breaks OpenAI and Microsoft’s rules. Alex Heath from The Verge reported this, and OpenAI has since suspended ByteDance’s account on its platform.

## What’s going on here?

ByteDance leaned hard on OpenAI's API to develop Project Seed, knowing it was against OpenAI's rules.

[![](https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/44d72c78-f9fe-4fea-87bf-80ed4f61cd64/image.png?t=1702891730)](https://twitter.com/alexeheath/status/1735758297893085621?utm_source=bensbites\&utm_medium=referral\&utm_campaign=bytedance-is-using-gpt-4-to-train-its-models)

## What does that mean?

ByteDance used OpenAI’s API for pretty much every part of making its chatbot (Project Seed), including training it and testing how well it works.

For context: OpenAI (and most AI companies) don’t allow training competing models on the outputs of their models. [Mistral allows that; we wrote about them.](https://bensbites.beehiiv.com/p/mistral-ai-openai-competitor-rocketed-2bn-12-months)

While ByteDance recently ordered the team to stop using GPT-generated text, the API is still secretly being used to evaluate Seed's performance. The ByteDance team is under mad pressure to match GPT-3.5 by the end of 2023 (which is basically now) and GPT-4 by mid-2024.

OpenAI has suspended ByteDance’s API account in the meantime. However, most of ByteDance’s usage went through Microsoft Azure.

## Why should I care?

Startups have been using synthetic data created by GPT-4 to train models for a few months and haven’t seen much pushback from OpenAI. The same’s been true for open-source models fine-tuned on GPT responses. But with bigger companies like ByteDance doing the same, we’ll likely see OpenAI tighten its enforcement.
@@ -0,0 +1,123 @@
# Daily Digest: Sneaky move from Chinese LLMs

### PLUS: frontend UI, live classes and AI gf.

[Sign up](https://www.bensbites.co/?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) | [Advertise](https://sponsor.bensbites.co/?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) | [Ben’s Bites News](https://news.bensbites.co/?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)\
Daily Digest #308

Hello folks, here’s what we have today:

###### **PICKS**

1. [ByteDance is secretly using OpenAI’s tech to build a competitor](https://www.theverge.com/2023/12/15/24003151/bytedance-china-openai-microsoft-competitor-llm?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) - They got busted **using responses from OpenAI**’s AI models to secretly build their own chatbot. Not cool, since that **breaks OpenAI and Microsoft’s rules.** Alex Heath from The Verge reported this, and OpenAI has since suspended ByteDance’s account on its platform. 🍿 [Our Summary](https://bensbites.beehiiv.com/p/bytedance-using-gpt4-train-models) (also below)

from our sponsor

**Need real-time data for your AI project?**

The Brave Search API can fuel your AI applications with high quality Web data.

- Empower your AI with the ability to search the Web

- Improve LLM responses with context from extra search snippets

- Get real-time info like weather and news

- Autosuggest, Spellcheck, and more…

Get started for **FREE** (up to 2k calls/month) at [brave.com/api](https://brave.com/search/api/?mtm_source=bens-bites\&mtm_medium=newsletter\&mtm_campaign=search-api\&mtm_content=API\&utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms). Paid plans as little as $0.50 CPM.

###### **TOP TOOLS**

- [Osum](https://osum.com?utm_source=bensbites)\* - Helps you perform **deep market research** in seconds.

- [XMind AI](https://xmind.ai/?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) - Get AI to assist in generating ideas for your **mind maps.**

- [v0 by Vercel](https://v0.dev/?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) - Generate **UI in seconds** with text or images. Now open for everyone.

- [TryHairstyle](https://tryhairstyle.com/?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) - **Hairstyle try-on** that saves you time & money.

- [FormsGPT](https://chat.openai.com/g/g-EPIdBLOCz-forms-gpt?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) - GPT to build and share **custom forms** with unlimited submissions.

- [Digi AI](https://digi.ai/?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) - **Romance, reimagined.** New mainstream AI gf/bf startup.

- [App Tracking Transparency AI](https://apptrackingtransparency.ai/?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) - Show your **ATT pop-up** at the perfect time to increase opt-ins.

- [Turbo Art by Modal](https://turbo.art/?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) - **Real-time art** generation with image canvas.

- [Million AI Copilot](https://twitter.com/aidenybai/status/1736579007632626020?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) - Detect and **fix slow components** in your React application.

- [Live Lecture by StudyFetch](https://www.studyfetch.com/features/livelecture?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) - Automatically **take notes** from your live lectures.

[View more →](https://news.bensbites.co/tags/show?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)\
\*sponsored

###### **NEWS**

What’s new:

- **[Salesforce adds RAG](https://twitter.com/clarashih/status/1735543743925924253?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)** and semantic search to its data cloud.

- **[Ola’s founder launches Krutrim AI](https://www.bloomberg.com/news/articles/2023-12-15/ola-founder-s-ai-startup-launches-indian-large-language-model?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)**, an Indian LLM supporting 10 Indian languages.

- Ashley, an **[AI-powered political campaign caller](https://www.reuters.com/technology/meet-ashley-worlds-first-ai-powered-political-campaign-caller-2023-12-12/?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)**, is on duty this election season.

Look out for:

- Google plans new **[‘Pixie’ AI assistant](https://www.theinformation.com/briefings/google-plans-new-pixie-ai-assistant-for-pixel-phones?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)** for Pixel phones.

- OpenAI, Meta, and Microsoft chase **[wearable AI.](https://www.theinformation.com/articles/tech-giants-chase-wearable-ai?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)**

- How could generative AI change **[early-years education?](https://nesta.shorthandstories.com/how-could-generative-ai-change-early-years-education/?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)**

- Deloitte is using AI to **[avoid future layoffs.](https://www.bloomberg.com/news/articles/2023-12-17/ai-could-be-helping-deloitte-avoid-mass-layoffs-in-the-future?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)**

Learn-it kit:

- OpenAI’s official guide for **[prompt engineering.](https://platform.openai.com/docs/guides/prompt-engineering?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)**

- How to **[train a custom GPT](https://blog.llamaindex.ai/how-to-train-a-custom-gpt-on-your-data-with-embedai-llamaindex-8a701d141070?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)** on your data with EmbedAI + LlamaIndex.

- How to build your own **[personal research assistant](https://github.com/LoganGrasby/LlamaResearcher?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)** with Llama-2.

- Deep dive into **[4 NeurIPS 2023 best paper](https://www.youtube.com/watch?v=LkED9wKI1TY\&utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)** award winners.

[View more →](https://news.bensbites.co/tags/news/trending?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)

**Unclassifieds** - short, sponsored links

- **[Mindstone (Online) AI Meetup](https://app.mindstone.com/mindspace/mindstone_ai_meetup/home/event/mindstone_online_ai_meetup_global_edition?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)**, the best **talks** & **demos** on **practical AI**. Tuesday 19th of Dec, 17:00-18:30 UTC. [Sign up and join](https://app.mindstone.com/mindspace/mindstone_ai_meetup/home/event/mindstone_online_ai_meetup_global_edition?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) for FREE.

###### **QUICK BITES**

[ByteDance got busted](https://www.theverge.com/2023/12/15/24003151/bytedance-china-openai-microsoft-competitor-llm?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms) using responses from OpenAI’s AI models to secretly build its own chatbot. Not cool, since that breaks OpenAI and Microsoft’s rules. Alex Heath from The Verge reported this, and OpenAI has since suspended ByteDance’s account on its platform.

**What is going on here?**

ByteDance leaned hard on OpenAI's API to develop Project Seed, knowing it was against OpenAI's rules.

[![](https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/44d72c78-f9fe-4fea-87bf-80ed4f61cd64/image.png?t=1702891730)](https://twitter.com/alexeheath/status/1735758297893085621?utm_source=bensbites\&utm_medium=referral\&utm_campaign=daily-digest-sneaky-move-from-chinese-llms)

**What does this mean?**

ByteDance used OpenAI’s API for pretty much every part of making its chatbot (Project Seed), including training it and testing how well it works.

For context: OpenAI (and most AI companies) don’t allow training competing models on the outputs of their models. [Mistral allows that; we wrote about them.](https://bensbites.beehiiv.com/p/mistral-ai-openai-competitor-rocketed-2bn-12-months)

While ByteDance recently ordered the team to stop using GPT-generated text, the API is still secretly being used to evaluate Seed's performance. The ByteDance team is under mad pressure to match GPT-3.5 by the end of 2023 (which is basically now) and GPT-4 by mid-2024.

OpenAI has suspended ByteDance’s API account in the meantime. However, most of ByteDance’s usage went through Microsoft Azure.

**Why should I care?**

Startups have been using synthetic data created by GPT-4 to train models for a few months and haven’t seen much pushback from OpenAI. The same’s been true for open-source models fine-tuned on GPT responses. But with bigger companies like ByteDance doing the same, we’ll likely see OpenAI tighten its enforcement.

[*Share this story*](https://bensbites.beehiiv.com/p/bytedance-using-gpt4-train-models)

### Ben’s Bites Insights

We have 2 databases, updated daily, which you can access by sharing Ben’s Bites using the link below:

- **All 10k+ links** we’ve covered, easily filterable (1 referral)

- **6k+ AI company funding rounds** from Jan 2022, including investors, amounts, stage, etc. (3 referrals)
68 changes: 68 additions & 0 deletions fixtures/bensbites.beehiiv.com/newsletter.json
@@ -273,6 +273,74 @@
]
},
"posts": [
{
"id": "f45671b2-3746-4297-b9e7-9fb773286b4c",
"publication_id": "447f6e60-e36a-4642-b6f8-46beb19045ec",
"web_title": "Daily Digest: Sneaky move from Chinese LLMs",
"web_subtitle": "PLUS: frontend UI, live classes and AI gf.",
"status": "published",
"override_scheduled_at": "2023-12-18T14:00:00.000Z",
"slug": "daily-digest-sneaky-move-chinese-llms",
"image_url": "https://beehiiv-images-production.s3.amazonaws.com/uploads/asset/file/3686cedf-fa4c-400d-b538-8b2e693d8794/Red_Level_2.png?t=1702892235",
"meta_default_title": "Daily Digest: Sneaky move from Chinese LLMs",
"meta_default_description": "PLUS: frontend UI, live classes and AI gf.",
"meta_og_title": "Daily Digest: Sneaky move from Chinese LLMs",
"meta_og_description": "PLUS: frontend UI, live classes and AI gf.",
"meta_twitter_title": "Daily Digest: Sneaky move from Chinese LLMs",
"meta_twitter_description": "PLUS: frontend UI, live classes and AI gf.",
"audience": "free",
"comments_enabled": true,
"comments_state": "default",
"enforce_gated_content": false,
"enable_popup_on_scroll": true,
"email_capture_title": "Join 100,000+ others",
"email_capture_message": "Stay informed and up to date on AI",
"email_capture_cta": "Subscribe",
"authors": [],
"content_tags": [
{
"id": "64ea972c-91c5-406a-a512-1d6152696293",
"display": "📬 Daily Digest"
}
],
"created_at": "2023-12-18T04:50:39Z",
"updated_at": "2023-12-18T14:00:59Z",
"url": "https://bensbites.beehiiv.com/p/daily-digest-sneaky-move-chinese-llms"
},
{
"id": "39502f8d-cbfb-4a43-ae29-46be34f35b39",
"publication_id": "447f6e60-e36a-4642-b6f8-46beb19045ec",
"web_title": "ByteDance is using GPT-4 to train its models",
"web_subtitle": null,
"status": "published",
"override_scheduled_at": "2023-12-18T09:29:45.067Z",
"slug": "bytedance-using-gpt4-train-models",
"image_url": "https://beehiiv-images-production.s3.amazonaws.com/uploads/asset/file/44d72c78-f9fe-4fea-87bf-80ed4f61cd64/image.png?t=1702891730",
"meta_default_title": "ByteDance is using GPT-4 to train its models",
"meta_default_description": "Not cool since that breaks OpenAI and Microsoft’s rules.",
"meta_og_title": "ByteDance is using GPT-4 to train its models",
"meta_og_description": "Not cool since that breaks OpenAI and Microsoft’s rules.",
"meta_twitter_title": "ByteDance is using GPT-4 to train its models",
"meta_twitter_description": "Not cool since that breaks OpenAI and Microsoft’s rules.",
"audience": "free",
"comments_enabled": true,
"comments_state": "default",
"enforce_gated_content": false,
"enable_popup_on_scroll": true,
"email_capture_title": "Join 100,000+ others",
"email_capture_message": "Stay informed and up to date on AI",
"email_capture_cta": "Subscribe",
"authors": [],
"content_tags": [
{
"id": "4b4f44ed-2510-4e0e-b4d5-74f57e40d0f1",
"display": "🍿 Quick Bites"
}
],
"created_at": "2023-12-18T08:32:04Z",
"updated_at": "2023-12-18T09:30:01Z",
"url": "https://bensbites.beehiiv.com/p/bytedance-using-gpt4-train-models"
},
{
"id": "9d669bd5-0d41-45fa-a3af-b4b1ea0d48de",
"publication_id": "447f6e60-e36a-4642-b6f8-46beb19045ec",

1 comment on commit 025c5fe

@vercel vercel bot commented on 025c5fe Dec 20, 2023