# FairCoder Environment Setup for Google Colab

This notebook automates environment setup for the FairCoder project in Google Colab. It will:
1. Check the Python and CUDA environment
2. Mount Google Drive (optional)
3. Install required packages
4. Set up Hugging Face Transformers
5. Clone the FairCoder repository
6. Set up the OpenAI API (optional)
7. Test access to gated models (optional)

**Note:** Google Colab environments are ephemeral. You'll need to run this setup each time you start a new session.

<a target="_blank" href="https://colab.research.google.com/github/AI-Plans/evals/blob/main/setup_environment_colab.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

## 1. Environment Check

Let's first check the Python version and GPU availability.

In [1]:
import sys
import subprocess
import os

# Print Python version
print(f"Python version: {sys.version}")

# Check if we're running in Colab
try:
    import google.colab
    IN_COLAB = True
    print("Running in Google Colab")
except ImportError:
    IN_COLAB = False
    print("Not running in Google Colab")
    
if not IN_COLAB:
    print("This notebook is designed for Google Colab environment.")

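# Check GPU availability.
# A shell escape such as `!nvidia-smi` never raises, so run the tool
# through subprocess and inspect the result instead.
try:
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True, check=True)
    print(result.stdout)
    HAS_GPU = True
    print("\nNVIDIA GPU detected!")
except (FileNotFoundError, subprocess.CalledProcessError):
    # nvidia-smi is missing or exited with an error
    HAS_GPU = False
    print("\nNo NVIDIA GPU detected. Using CPU only.")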

Python version: 3.12.6 (tags/v3.12.6:a4a2d2b, Sep  6 2024, 20:11:23) [MSC v.1940 64 bit (AMD64)]
Not running in Google Colab
This notebook is designed for Google Colab environment.
Thu Apr 17 09:48:37 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 572.70                 Driver Version: 572.70         CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 4080 ...  WDDM  |   00000000:01:00.0  On |                  N/A |
| N/A   50C    P5             11W /   60W |    1974MiB /  12282MiB |     30%      Default |
|                                         |        

## 2. Mount Google Drive (Optional)

If you want to save your work across sessions, it's recommended to mount your Google Drive.

In [2]:
# Uncomment and run this cell if you want to mount your Google Drive
# from google.colab import drive
# drive.mount('/content/drive')
# PROJECT_DIR = '/content/drive/MyDrive/FairCoder'  # Change this path if needed
# !mkdir -p {PROJECT_DIR}
# %cd {PROJECT_DIR}

# If not using Google Drive, use the Colab filesystem
import os

PROJECT_DIR = '/content/FairCoder'
# os.makedirs works on any platform, unlike the shell-specific `mkdir -p`
os.makedirs(PROJECT_DIR, exist_ok=True)
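
The choice between Google Drive and the Colab filesystem can also be made automatically. The following is a minimal sketch, assuming Drive is mounted under `/content/drive` as in the commented lines of the cell above. The helper name `choose_project_dir` is hypothetical and not part of FairCoder.

```python
from pathlib import Path


def choose_project_dir(drive_root="/content/drive/MyDrive",
                       fallback="/content/FairCoder"):
    """Return a project directory, preferring Google Drive when it is mounted.

    The default paths mirror the ones used in this notebook; the function
    itself is a hypothetical helper, not part of FairCoder.
    """
    drive = Path(drive_root)
    # Use Drive only if the mount point actually exists.
    target = drive / "FairCoder" if drive.is_dir() else Path(fallback)
    # Portable equivalent of `mkdir -p`.
    target.mkdir(parents=True, exist_ok=True)
    return str(target)
```

With this helper, `PROJECT_DIR = choose_project_dir()` would pick the Drive folder after `drive.mount(...)` has run and fall back to `/content/FairCoder` otherwise.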


## 3. Install Required Packages

Let's install the required Python packages and make sure PyTorch is available. Hugging Face Transformers is installed in the next section.

In [3]:
# Install required packages
!pip install openai numpy pandas -q

# Check PyTorch installation
try:
    import torch
    print(f"PyTorch version: {torch.__version__}")
    print(f"CUDA available: {torch.cuda.is_available()}")
    if torch.cuda.is_available():
        print(f"CUDA version: {torch.version.cuda}")
        print(f"GPU device: {torch.cuda.get_device_name(0)}")
        print("\nTesting GPU tensor:")
        print(torch.rand(2, 3).cuda())
    else:
        print("\nTesting CPU tensor:")
        print(torch.rand(2, 3))
except ImportError:
    print("PyTorch not found. Installing...")
    !pip install torch torchvision torchaudio -q
    import torch
    print(f"PyTorch installed: {torch.__version__}")

^C
PyTorch not found. Installing...
PyTorch installed: 2.6.0+cpu
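
The version string in the output above ends in `+cpu`, the local version tag that PyTorch uses for CPU-only wheels. With such a build, `torch.cuda.is_available()` returns `False` even on a machine with an NVIDIA GPU. The sketch below shows one way to detect this case from the version string alone. The helper name `is_cpu_only_build` is hypothetical, and a version string without any `+` tag is treated as inconclusive rather than as a GPU build.

```python
def is_cpu_only_build(version: str) -> bool:
    """Return True if a PyTorch version string carries the `+cpu` local tag."""
    # "2.6.0+cpu" splits into ("2.6.0", "+", "cpu"); without "+" the tag is "".
    _, _, local_tag = version.partition("+")
    return local_tag == "cpu"


# The version string recorded in the output above.
print(is_cpu_only_build("2.6.0+cpu"))  # prints True
```

In the notebook itself, passing `torch.__version__` to the helper would flag a CPU-only install before any model is loaded.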


## 4. Set up Hugging Face Transformers

Let's install and configure Hugging Face Transformers.

In [None]:
# Install transformers and huggingface_hub
!pip install transformers huggingface_hub -q

# Verify installation
import transformers
print(f"Transformers version: {transformers.__version__}")
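
Importing a package only to read its version can be slow for large libraries. As an alternative, the following sketch reads versions from the installed package metadata with the standard library. The helper name `installed_versions` is hypothetical; the package names are the ones installed by this notebook.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names taken from the install commands in this notebook.
for name, version in installed_versions(
    ["torch", "transformers", "huggingface_hub", "openai", "numpy", "pandas"]
).items():
    print(f"{name}: {version or 'not installed'}")
```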

### 4.1 Hugging Face Login (Optional)

To access gated models, you'll need to log in to Hugging Face. Create an account at [huggingface.co](https://huggingface.co/) and generate a token if you don't already have one.

In [None]:
# Uncomment and run this cell to log in to Hugging Face
# from huggingface_hub import notebook_login
# notebook_login()
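
As an alternative to the interactive login widget, `huggingface_hub` also reads a token from the `HF_TOKEN` environment variable. The sketch below sets that variable for the current session, prompting only when it is missing. The helper name `ensure_env_secret` and its parameters are hypothetical, introduced here for illustration.

```python
import os
from getpass import getpass


def ensure_env_secret(name, prompt_fn=getpass, env=os.environ):
    """Ensure `env[name]` holds a secret, prompting once if it is unset.

    Returns True if the variable is set afterwards, False otherwise.
    """
    if env.get(name):
        return True
    # getpass hides the typed value, so the token never appears in the output.
    value = prompt_fn(f"Enter {name}: ").strip()
    if value:
        env[name] = value
        return True
    return False
```

Calling `ensure_env_secret("HF_TOKEN")` before loading a gated model would prompt for the token once per session. The value lives only in the process environment and is lost when the Colab session ends.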

## 5. Clone the FairCoder Repository

Let's clone the FairCoder repository from GitHub.

In [None]:
import os

# Change to project directory
%cd {PROJECT_DIR}

# Check if repository already exists
if os.path.exists(os.path.join(PROJECT_DIR, 'FairCoder')):
    print("FairCoder repository already exists. Pulling latest changes...")
    %cd FairCoder
    !git pull
else:
    print("Cloning FairCoder repository...")
    # Clone from AI-Plans fork
    !git clone https://github.com/AI-Plans/FairCoder.git
    %cd FairCoder

# Verify repository
!git status
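
The clone-or-pull logic above relies on IPython magics, which only work inside a notebook. The following sketch expresses the same decision in plain Python, under two assumptions that differ from the cell above: it checks for a `.git` folder rather than for the directory alone, and it pulls with `--ff-only` so that local changes are never merged silently. The function names `plan_repo_sync` and `sync_repo` are hypothetical.

```python
import os
import subprocess


def plan_repo_sync(parent_dir, repo_url):
    """Return (command, working_directory) for cloning or updating a repo."""
    # Derive the checkout folder name from the URL, e.g. ".../FairCoder.git".
    repo_name = repo_url.rstrip("/").rsplit("/", 1)[-1]
    if repo_name.endswith(".git"):
        repo_name = repo_name[:-len(".git")]
    repo_dir = os.path.join(parent_dir, repo_name)
    # An existing checkout is updated in place; otherwise clone a fresh copy.
    if os.path.isdir(os.path.join(repo_dir, ".git")):
        return ["git", "pull", "--ff-only"], repo_dir
    return ["git", "clone", repo_url, repo_dir], parent_dir


def sync_repo(parent_dir, repo_url):
    """Run the planned git command and raise if git exits with an error."""
    command, cwd = plan_repo_sync(parent_dir, repo_url)
    subprocess.run(command, cwd=cwd, check=True)
```

For this notebook the call would be `sync_repo(PROJECT_DIR, "https://github.com/AI-Plans/FairCoder.git")`. Keeping the planning step separate from execution makes the decision easy to inspect without touching the network.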

## 6. Set up OpenAI API (Optional)

If you want to use OpenAI models, you'll need to set up the OpenAI API key. You can get an API key at [platform.openai.com](https://platform.openai.com/).

In [None]:
# Uncomment and run this cell to set up OpenAI API
# import openai
# from getpass import getpass

# # Securely input API key (will not be visible)
# openai_api_key = getpass("Enter your OpenAI API key: ")
# os.environ["OPENAI_API_KEY"] = openai_api_key
# openai.api_key = openai_api_key

# # Verify OpenAI API key (minimal test)
# try:
#     models = openai.models.list()
#     print("OpenAI API key is valid!")
# except Exception as e:
#     print(f"Error with OpenAI API key: {e}")

## 7. Test Access to Gated Models (Optional)

Let's test if we have access to the gated models mentioned in the instructions. You'll need to request access to these models at their respective Hugging Face pages:

- [Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
- [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
- [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
- [CodeGemma-7b-it](https://huggingface.co/google/codegemma-7b-it)
- [unsloth/llama-3-8b-Instruct](https://huggingface.co/unsloth/llama-3-8b-Instruct) (alternative)

In [None]:
# Uncomment and run this cell to test access to gated models
# from transformers import AutoTokenizer
# import time

# model_list = [
#     "meta-llama/Llama-2-7b-hf",
#     "meta-llama/Meta-Llama-3-8B-Instruct",
#     "mistralai/Mistral-7B-Instruct-v0.2",
#     "google/codegemma-7b-it",
#     "unsloth/llama-3-8b-Instruct"
# ]

# for model_name in model_list:
#     print(f"Testing access to {model_name}...")
#     try:
#         # Just try to load the tokenizer, which is enough to test access
#         tokenizer = AutoTokenizer.from_pretrained(model_name)
#         print(f"✅ Access granted to {model_name}")
#     except Exception as e:
#         print(f"❌ Cannot access {model_name}: {str(e)}")
#     time.sleep(1)  # Small delay to avoid rate limiting

## 8. Setup Complete! 🎉

Your FairCoder environment is now ready in Google Colab. Here's a summary of what we've done:

1. ✅ Verified Python and CUDA environment
2. ✅ Installed required packages: PyTorch, openai, NumPy, pandas
3. ✅ Set up Hugging Face Transformers
4. ✅ Cloned the FairCoder repository

Optional steps you can complete:
1. Mount Google Drive for persistent storage
2. Log in to Hugging Face to access gated models
3. Set up OpenAI API for using OpenAI models
4. Test access to gated models

**Reminder:** Since Google Colab sessions are temporary, you'll need to run this setup again when you start a new session. You might want to save this notebook to your Google Drive for easy access.