GPT-is-you

Crafting a Personalized Language Model from Instant Messaging Conversations - an AI Adventure

April 2, 2023

Background

This blog post is a tutorial and commentary on hacking a large language model (LLM) to make it ingest thousands of lines of chat history texts between you and another person, and see what comes out. Will it start speaking like you, or the other person, in terms of style and topics? Can it critically analyze previous conversations?

ChatGPT cannot be used for this experiment, as it only accepts a limited number of words (tokens) as input. To feed large custom inputs (megabytes) to a model, the only practical way for now is to fine-tune it on a series of prompt/response pairs. Folklore has it that a fairly modest number of prompts (tens of thousands, not billions) suffices. The steps below are based on fine-tuning the davinci model (GPT-3 series) from OpenAI. To what extent does this strategy work? I wanted to find out. Note that GPT-4 cannot be fine-tuned as of now.

Motivations

LLM chatbots exist, but not yet ones that are personalized to yourself or people you know. The possibility was raised by Ray Kurzweil years ago (https://abcnews.go.com/Technology/futurist-ray-kurzweil-bring-dead-father-back-life/story?id=14267712). A prediction website (https://manifold.markets/firstuserhere/will-we-have-a-popular-llm-finetune) currently estimates at 28% the probability of "a popular LLM fine-tuned on people's personal texts by June 1, 2024". This blog post shows that the probability is now essentially 1. I am posting this to raise philosophical awareness of this recent technical capability, and to let anyone experiment on their own data if they wish. See the Caveats section for an important note on privacy.

Ethics

If you decide to fine-tune a model on your own data, make sure to ask the other person for permission to use your shared chat history! And be explicit about the privacy and moral implications.

Costs

Roughly 60 US dollars, billed by OpenAI, for fine-tuning their largest model (davinci) on 10,000 prompts. The cost scales linearly with chat history length.
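For a rough sense of where that figure comes from, here is a back-of-envelope estimate in Python. The per-token price and the default number of training epochs are assumptions based on OpenAI's published davinci fine-tuning pricing at the time of writing; check the current pricing page before launching a job.

    # Back-of-envelope fine-tuning cost estimate. All figures are assumptions:
    # davinci fine-tuning was priced around 0.03 USD per 1,000 training tokens
    # at the time of writing, and jobs trained for 4 epochs by default.
    PRICE_PER_1K_TOKENS = 0.03   # USD per 1,000 training tokens (assumed)
    N_EPOCHS = 4                 # OpenAI default at the time (assumed)

    def estimate_cost(n_prompts, avg_tokens_per_pair=50):
        """Rough training cost in USD for a prompt/completion corpus."""
        total_tokens = n_prompts * avg_tokens_per_pair
        return total_tokens / 1000 * PRICE_PER_1K_TOKENS * N_EPOCHS

    print(estimate_cost(10_000))  # ~60 USD, consistent with the figure above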

Results

In my tests the generated sentences were no longer in the neutral style of GPT-3, but in my style or that of the other person in the conversation. The style is copied from the training responses, not the training prompts. The fine-tuned LLM spontaneously introduced topics from previous conversations, but never produced exact copies of the training data. It feels as if the LLM captured a part of the personality. To be clear, the LLM did not actually capture a personality and should not be seen as having done so, but it can already be used to analyze and recreate text interactions with someone in a (so far) limited yet effective way. I find the philosophical implications troubling, mainly because I do not fully understand them. The AI of course suffers from the same limitations as GPT-3 and can hallucinate incorrect statements.

Two different corpora were tested: one with 1,000 prompts and another with 10,000 prompts. With the large corpus, the system gave responses in the style and about the topics of previous chats. The small corpus gave less interesting results in terms of topics, but the responses were still in the style of the person. So I'd recommend giving it as much chat history as possible. This is an exploratory report; I did not investigate how the system would behave with fewer (e.g. 100) or more (e.g. 100,000) prompts.

Technical Steps

Requirements:

  1. Export a WhatsApp chat history through the mobile app. This generates a file named "WhatsApp Chat with [XXX].txt".

  2. Run the following command to convert the .txt file to a .json set of prompts:

    python3.9 convert.py "./WhatsApp Chat with [Name of contact].txt" "[Your name]" "[Name of contact]"
    

Inspect the generated file visually. It should be of the form:

{"prompt": "<chat prompt>", "completion": "<chat response>"}
{"prompt": "<chat prompt>", "completion": "<chat response>"}
{"prompt": "<chat prompt>", "completion": "<chat response>"}
...

The convert.py file is provided in this repository. It is adapted from a larger code repository written by someone else (https://github.com/afiqhatta/chat_scrape). The rest of the instructions are adapted from the OpenAI fine-tuning guide (https://platform.openai.com/docs/guides/fine-tuning).
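For readers who prefer to adapt the conversion themselves, here is a minimal sketch of the kind of transformation convert.py performs. It is an illustration only, not the actual script: WhatsApp export formats vary by platform and locale, so the regular expression and the pairing logic below are assumptions to adjust to your own export.

    # Illustrative sketch only -- the actual convert.py in this repository may
    # differ in details. The idea is to pair each message from the contact
    # ("prompt") with your next reply ("completion"), in the JSONL format
    # shown above.
    import json
    import re

    # Assumed export line format, e.g. "3/14/23, 9:26 PM - Alice: hello"
    LINE_RE = re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}, [^-]+ - ([^:]+): (.*)$")

    def to_prompt_pairs(txt_path, my_name, contact_name, out_path):
        pairs, prompt = [], None
        with open(txt_path, encoding="utf-8") as f:
            for line in f:
                m = LINE_RE.match(line.strip())
                if not m:
                    continue  # skip system messages and wrapped lines
                sender, text = m.group(1), m.group(2)
                if sender == contact_name:
                    prompt = text
                elif sender == my_name and prompt is not None:
                    # leading space in the completion follows OpenAI's advice
                    pairs.append({"prompt": prompt, "completion": " " + text})
                    prompt = None
        with open(out_path, "w", encoding="utf-8") as out:
            for p in pairs:
                out.write(json.dumps(p, ensure_ascii=False) + "\n")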

  3. Run the following commands that will interact with the OpenAI API to start a remote fine-tuning job.

    pip install --upgrade openai
    export OPENAI_API_KEY="<OPENAI_API_KEY>"
    openai tools fine_tunes.prepare_data -f output_[XXX].json # the output file generated by convert.py
    openai api fine_tunes.create -t output_[XXX].jsonl --suffix "[XXXSome meaningful model name]" -m davinci
    

A fine-tuning identifier of the form ft-[XXX] is generated and allows you to monitor the progress of the fine-tuning. In my case it took a few hours; this is the longest step.
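If you prefer checking on the job from Python rather than the CLI, here is a minimal sketch assuming the same pre-1.0 openai package installed above (this API has since changed in later versions of the package):

    # Minimal sketch: poll the fine-tuning job from Python (pre-1.0 `openai`).
    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    job = openai.FineTune.retrieve(id="ft-[XXX]")
    print(job["status"])                 # e.g. pending, running, succeeded
    print(job.get("fine_tuned_model"))   # the model name, once training ends

    for event in openai.FineTune.list_events(id="ft-[XXX]")["data"]:
        print(event["created_at"], event["message"])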

  4. Once fine-tuning is finished, the model resides on OpenAI servers. You can check the accuracy of the model using:

    openai api fine_tunes.results -i ft-[XXX] > results.csv
    

In my tests, training token accuracy went from 0.25 initially to close to 0.55 at the end on the large corpus (10,000 prompts). One presumably does not expect accuracy to reach 1.0, yet it did for the short corpus (1,000 prompts), possibly indicating that short chat histories are badly overfitted.
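A quick way to look at the training curve is to read results.csv directly. The column name training_token_accuracy reflects what OpenAI's results file contained at the time of writing; treat it as an assumption and check the header of your own file.

    # Read the training curve out of results.csv (column name is an assumption).
    import csv

    steps, acc = [], []
    with open("results.csv", newline="") as f:
        for row in csv.DictReader(f):
            if row.get("training_token_accuracy"):
                steps.append(int(row["step"]))
                acc.append(float(row["training_token_accuracy"]))

    print(f"first accuracy: {acc[0]:.2f}, last accuracy: {acc[-1]:.2f}")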

Using the model

Go to the OpenAI Playground (https://platform.openai.com/playground?mode=complete) and make sure to select the fine-tuned model and not a default model. The OpenAI fine-tuning page recommends ending the prompt with "\n\n###\n\n", although this does not appear necessary when asking meta-questions about the model.

In the "Stop sequences" textbox on the right of the page, you may add "###" to make sure only one reply is given at a time. Although by not specifying any stop sequence you will get longer replies consisting of multiple responses, which can also be entertaining.

Example prompts

You may replace occurrences of "[YYY]" with a short description of the person. I also suggest translating these prompts into the original language of the conversation, if it is not English.

  1. Learn something about yourself:

System: you are [YYY], with a particular personality, and you have been finetuned with previous conversations with me. This is not a request for a conversation but more of an analysis of your fine-tuning. Can you analyze the prompts, not the responses, of our previous conversations? I.e. from the person who wrote the questions. What is the personality of the person who wrote the prompts?

  2. Learn something about the conversational topics:

System: you are [YYY], with a particular personality, and you have been finetuned with previous conversations with me. This is not a request for a conversation but more of an analysis of your fine-tuning. Can you analyze the topics of the conversations? What were they about?

  3. Attempt a cold reply from your LLM-generated avatar:

You are [YYY], with a particular personality, and you have been finetuned with previous conversations with me. Please reply to this question as if you are [YYY]: Well that was worth a try, don't you think? \n\n###\n\n

ChatGPT would reply to this: "As an AI language model, I cannot observe or experience events, [..], I'd be happy to assist you." Funnily enough, the entire response I got from the LLM tuned to a friend who's generally snarky was: "No."

Does it work?

Here is a question asked to an LLM fine-tuned to reproduce my style.

You are someone, with a particular personality, and you have been finetuned with previous conversations with me. Please reply to this question as if you are this person. What is your occupation?

Notice how I do not give any personal information in the prompt. The response was:

### it's a good question. i'm not sure what you mean by finetuned. do you mean the previous conversations changed my personality or that they made me a better person or what?. occupation: engineer. i guess i can't avoid to say that i'm a french engineer living in paris ###

(Very close, and absolutely in my chatting style. I'm a French scientist working in Paris.)

Caveats

⚠️ Be aware that chat data will be uploaded to OpenAI servers. This is a privacy concern. It can be somewhat mitigated by calling openai api files.list and then openai api files.delete -i [file-id] to remove the chat history from OpenAI servers, although nothing guarantees that backups have not been made in the meantime. I would also suggest deleting the fine-tuned model altogether after usage: openai api models.delete -i <FINE_TUNED_MODEL>.
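For reference, here is a sketch of the same cleanup from Python, assuming the calls of the pre-1.0 openai package mirror the CLI commands above:

    # Sketch: list and delete uploaded files and the fine-tuned model
    # (pre-1.0 `openai` package, assumed equivalent to the CLI commands above).
    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    for f in openai.File.list()["data"]:
        print(f["id"], f["filename"])      # identify the uploaded chat history
        # openai.File.delete(f["id"])      # uncomment to delete the file

    # openai.Model.delete("<FINE_TUNED_MODEL>")  # delete the fine-tuned model itself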

It is unclear how well the method works for short chat histories and various languages. I have only tested French and English.

Interacting with the fine-tuned model can be confusing, as it is not a real person yet it writes in the same style and about the same conversational topics as someone you know.

✍️ Please note: This is a personal project and the views expressed in the blog post are my own, not related to my employer or job.

Conclusion

This post demonstrates the feasibility of creating an LLM that "speaks" like an acquaintance. The results were unreasonably good. It was surely not a question of if, but of when, such an AI would exist and be broadly accessible. I encourage you to try it out while it is still limited to the "dumb" but non-trivial capabilities of GPT-3. This will enable society to collectively decide how to react, and to prepare for a future in which more powerful models such as GPT-4 (and 5, etc.) emerge and the resulting fine-tunes become even more convincing.
