# AI Scientist Environment Setup
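This notebook sets up an environment for running AI Scientist experiments on the NanoGPT, NanoGPT Lite, 2D Diffusion, and Grokking templates.

In Google Colab, you will be prompted to securely enter your OpenAI and Semantic Scholar (S2) API keys. The later cells reference relative paths such as `data/`, `templates/`, and `launch_scientist.py`, so they assume the notebook's working directory is the root of the AI Scientist repository.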

In [None]:
# Install Miniconda
!wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
!bash miniconda.sh -b -p $HOME/miniconda
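# Each `!` command runs in its own subshell, so a shell hook would not persist.
# Put the `conda` executable on PATH for the rest of the notebook instead.
import os
os.environ["PATH"] = os.path.expanduser("~/miniconda/condabin") + os.pathsep + os.environ["PATH"]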

In [None]:
# Create a new conda environment with Python 3.11
!conda create -n ai_scientist python=3.11 -y
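# `conda activate` would only affect its own subshell; prepend the environment's
# bin directory to PATH so later `!pip` and `!python` commands use it.
import os
os.environ["PATH"] = os.path.expanduser("~/miniconda/envs/ai_scientist/bin") + os.pathsep + os.environ["PATH"]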
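
Each `!` line in a notebook starts a fresh shell process, so an environment change made by one line, such as an activation, is not visible to the next. The sketch below illustrates that behaviour with `subprocess` rather than notebook magics. The helper `shell` and the variable name `AI_SCIENTIST_DEMO_VAR` are hypothetical names used only for this illustration, and a POSIX shell is assumed.

```python
import os
import subprocess


def shell(command):
    """Run one command in a fresh shell, as a notebook `!` line does, and return its stdout."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Start from a clean state so the demonstration is repeatable.
os.environ.pop("AI_SCIENTIST_DEMO_VAR", None)

# A variable exported in one shell is gone by the time the next shell starts.
shell("export AI_SCIENTIST_DEMO_VAR=from_subshell")
after_export = shell("echo ${AI_SCIENTIST_DEMO_VAR:-unset}")

# A variable set on the kernel process is inherited by every later shell.
os.environ["AI_SCIENTIST_DEMO_VAR"] = "from_kernel"
after_environ = shell("echo $AI_SCIENTIST_DEMO_VAR")

print("after export in a subshell:", after_export)
print("after setting os.environ:  ", after_environ)
```

The same reasoning applies to `PATH`: changing `os.environ["PATH"]` in the kernel affects every subsequent `!` command, while changing it inside a single `!` line does not.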

In [None]:
# Install necessary Python packages for LLM APIs
!pip install anthropic aider-chat backoff openai

In [None]:
# Install plotting and PDF-processing tools
!pip install matplotlib pypdf pymupdf4llm

In [None]:
# Install pdflatex via texlive
!sudo apt-get update && sudo apt-get install texlive-full -y
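
Since the TeX Live installation is large and can fail partway, it may be worth confirming that `pdflatex` actually ended up on the `PATH` before continuing. This is a minimal sketch using the standard library; the helper name `tool_available` is hypothetical.

```python
import shutil


def tool_available(name):
    """Return True if an executable called `name` can be found on PATH."""
    return shutil.which(name) is not None


if tool_available("pdflatex"):
    print("pdflatex found at", shutil.which("pdflatex"))
else:
    print("pdflatex not found; re-run the TeX Live installation cell")
```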

In [None]:
# Install common requirements
!pip install torch numpy transformers datasets tiktoken wandb tqdm
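
A quick way to confirm that the packages above are installed is to look them up without importing them. The sketch below checks the interpreter that runs the notebook kernel, which may differ from the interpreter that `!pip` installed into. The helper name `missing_modules` is hypothetical, and the list assumes each package's import name matches its pip name.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Assumption: these import names match the pip package names installed above.
REQUIRED = ["torch", "numpy", "transformers", "datasets", "tiktoken", "wandb", "tqdm"]

missing = missing_modules(REQUIRED)
if missing:
    print("Not importable from this kernel:", ", ".join(missing))
else:
    print("All common requirements are importable.")
```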

### Set up environment variables for API keys in Google Colab

In [None]:
from getpass import getpass
import os

# Prompt the user to enter API keys securely
openai_api_key = getpass('Enter your OpenAI API Key: ')
s2_api_key = getpass('Enter your Semantic Scholar (S2) API Key: ')

# Set the environment variables
os.environ['OPENAI_API_KEY'] = openai_api_key
os.environ['S2_API_KEY'] = s2_api_key
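
An empty prompt response would silently set an empty key, which only surfaces later as an authentication error. The following sketch reports which variables are unset or blank without printing their values. The helper name `unset_keys` is hypothetical.

```python
import os


def unset_keys(names, env=None):
    """Return the variable names that are missing or blank in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


blank = unset_keys(["OPENAI_API_KEY", "S2_API_KEY"])
if blank:
    print("These keys are not set:", ", ".join(blank))
else:
    print("Both API keys are set.")
```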

### Prepare NanoGPT data

In [None]:
!python data/enwik8/prepare.py
!python data/shakespeare_char/prepare.py
!python data/text8/prepare.py

### Set up NanoGPT baseline run

In [None]:
!cd templates/nanoGPT && python experiment.py --out_dir run_0 && python plot.py

### Set up NanoGPT Lite baseline run

In [None]:
!cd templates/nanoGPT_lite && python experiment.py --out_dir run_0 && python plot.py

### Set up 2D Diffusion environment

In [None]:
!git clone https://github.com/gregversteeg/NPEET.git
!cd NPEET && pip install . && pip install scikit-learn

### Set up 2D Diffusion baseline run

In [None]:
!cd templates/2d_diffusion && python experiment.py --out_dir run_0 && python plot.py

### Set up Grokking baseline run

In [None]:
!cd templates/grokking && python experiment.py --out_dir run_0 && python plot.py
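
Each baseline cell above writes to a `run_0` directory inside its template folder. As a rough check that all four baselines ran, the sketch below lists the templates that still lack that directory. It only tests that the directory exists, not that the run finished successfully, and the helper name `missing_baselines` is hypothetical.

```python
from pathlib import Path

# Template folders used by the baseline cells in this notebook.
TEMPLATES = ["nanoGPT", "nanoGPT_lite", "2d_diffusion", "grokking"]


def missing_baselines(root="templates", names=TEMPLATES, run_dir="run_0"):
    """Return the template names under `root` that have no `run_dir` directory."""
    root = Path(root)
    return [name for name in names if not (root / name / run_dir).is_dir()]


print("Templates without a baseline run:", missing_baselines())
```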

### Run paper generation

In [None]:
# No activation step is run here: `conda activate` in a `!` line affects only that line's own subshell.
!python launch_scientist.py --model "gpt-4o-2024-05-13" --experiment nanoGPT_lite --num-ideas 2