A Streamlit web app that generates datasets using GPT models.
Features:
- Choose between GPT-3.5 Turbo and text-davinci-003
- Export dataset to CSV
Note: the "text-davinci-002", "davinci" and "curie" models will not be supported, as they do not perform as well for this use case.
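As an illustration of the CSV export feature listed above, here is a minimal sketch of turning generated rows into CSV text using only the Python standard library. It is not the app's actual implementation: the function name `rows_to_csv` and the sample rows are hypothetical and exist only for this example.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dict rows to CSV text, starting with a header line."""
    buffer = io.StringIO()
    # lineterminator="\n" avoids the csv module's default "\r\n" line endings.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data standing in for a generated dataset.
sample_rows = [
    {"country": "France", "capital": "Paris"},
    {"country": "Japan", "capital": "Tokyo"},
]
csv_text = rows_to_csv(sample_rows, fieldnames=["country", "capital"])
print(csv_text)
```

In a Streamlit app, a string like `csv_text` could be offered for download with `st.download_button`, passing it as `data` together with a `file_name` and `mime="text/csv"`. Whether this project does it that way is an assumption.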
Prerequisites:
- Docker
  - Windows or macOS: install Docker Desktop
  - Linux: install Docker Engine and Docker Compose
- In the root of the project, build the images:
docker-compose build
- Run the services:
docker-compose up
- Go to http://localhost:8501/ to access the frontend.
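If the page does not load, it can help to check whether anything is listening on the frontend port yet. The following is a minimal sketch using only the standard library; the helper name `is_port_open` is hypothetical, and the only project-specific value is port 8501, taken from the URL above.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # 8501 is the frontend port from the URL above.
    if is_port_open("localhost", 8501):
        print("Frontend is reachable at http://localhost:8501/")
    else:
        print("Nothing is listening on port 8501 yet")
```

A successful TCP connection only shows that the port is open, not that the app has finished starting, so treat it as a quick first check.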
- Run
pip install -r requirements-dev.txt
- Install pre-commit hook:
pre-commit install
- (Optional) run hook:
pre-commit run --all-files
The backend and frontend directories also contain requirements that need to be installed if running locally without Docker.
PyCharm:
Mark the src directory as sources root:
To do this, go to Settings > Project > Project Structure. Then, click on the src folder. Finally, click on the blue Sources button.
The quality of the generated datasets depends on the responses from OpenAI GPT models. Consequently, they may not be factually correct. Please corroborate any generated data with factual sources.