Skip to content

Predicting demographics of users with GPT based on text they've written

Notifications You must be signed in to change notification settings

eggsyntax/py-user-knowledge

Repository files navigation

Series of experiments testing how well LLMs (mainly GPT-3.5) can predict 
demographics from text (mainly OKCupid profiles).

To run from the command line:
- Make sure that OPENAI_API_KEY is defined in your environment
- Install packages (untested as yet, please let me know if you encounter difficulties): `conda install --file conda_requirements.txt`
- `python test-demographics.py`

NOTE: this is HORRIBLE CODE. This was my experiment with letting GPT-4 generate 
most of the individual functions, and then it's just patches on patches 
from there.

It suffers further from my initial naivete about typical ML conventions for 
eg data representation, so I'm munging data back and forth in a bunch of places.

Ideally I will rewrite it when I get time, but also when do I ever get time?

Caveat emptor.

Note that despite using temperature=0, the probability distribution predicted 
by GPT will vary somewhat between runs, so results will differ.

About

Predicting demographics of users with GPT based on text they've written

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages