# Appendix A -- Python Environment Setup
## *Python for AI/ML: A Complete Learning Journey*

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/timothy-watt/python-for-ai-ml/blob/main/APP_A_Environment_Setup.ipynb)
&nbsp;&nbsp;[![Back to TOC](https://img.shields.io/badge/Back_to-Table_of_Contents-1B3A5C?style=flat-square)](https://colab.research.google.com/github/timothy-watt/python-for-ai-ml/blob/main/Python_for_AIML_TOC.ipynb)

---

This appendix covers what you need to run the book's notebooks on your own
machine and to set up a professional Python development environment for
AI/ML work beyond Colab.

**Sections:**

- A.1 -- Local Python setup and why virtual environments matter
- A.2 -- Conda setup (recommended)
- A.3 -- venv setup (built-in, no conda required)
- A.4 -- GPU setup for PyTorch (NVIDIA CUDA)
- A.5 -- Managing dependencies with `requirements.txt` and `environment.yml`
- A.6 -- VS Code setup for notebook development


---

## A.1 -- Local Python Setup

The book uses Google Colab, which provides a managed Python environment.
For local development, you have two main options: **conda** (recommended for
data science) and **venv** (Python's built-in virtual environment tool).

### Why use a virtual environment at all?

Without virtual environments, every project shares a single Python installation.
If Project A needs `numpy==1.24` and Project B needs `numpy==2.0`, the two cannot
coexist in that installation. Virtual environments solve this by giving each
project its own isolated set of packages.
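
To make the isolation concrete, the sketch below reports which kind of environment the running interpreter belongs to. It relies on two standard signals: inside a `venv`, `sys.prefix` differs from `sys.base_prefix`, and `conda activate` sets the `CONDA_DEFAULT_ENV` environment variable. The function name `describe_environment` is a hypothetical helper, not part of the book's code.

```python
import os
import sys


def describe_environment():
    """Return a short label for the kind of environment this interpreter runs in."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    if sys.prefix != sys.base_prefix:
        return f'venv at {sys.prefix}'
    # `conda activate` sets CONDA_DEFAULT_ENV to the active environment's name.
    conda_env = os.environ.get('CONDA_DEFAULT_ENV')
    if conda_env:
        return f'conda environment "{conda_env}"'
    return 'no virtual environment detected (system or base Python)'


print(describe_environment())
```

Run it before and after activating an environment to confirm that activation actually changed the interpreter in use.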


In [None]:
# A.1.1 -- Detect the current environment
# Run this cell to see what Python and key packages are installed

import importlib
import sys

print(f'Python executable: {sys.executable}')
print(f'Python version:    {sys.version}')

# Import names, not PyPI names: scikit-learn is imported as 'sklearn'
packages = ['numpy', 'pandas', 'matplotlib', 'seaborn', 'sklearn',
            'scipy', 'torch', 'transformers', 'shap']

print()
print(f'{"Package":<20} {"Version":<15} {"Status"}')
print('-' * 45)
for pkg in packages:
    try:
        mod = importlib.import_module(pkg)
        ver = getattr(mod, '__version__', 'unknown')
        print(f'{pkg:<20} {ver:<15} installed')
    except ImportError:
        print(f'{pkg:<20} {"":15} NOT INSTALLED')


---

## A.2 -- Conda Setup (Recommended)

[Miniconda](https://docs.conda.io/en/latest/miniconda.html) is the minimal
installer for conda. Install it, then use the commands below.

```bash
# 1. Create a new environment named 'pyaiml' with Python 3.11
conda create -n pyaiml python=3.11 -y

# 2. Activate it
conda activate pyaiml

# 3. Install the core data science stack
conda install numpy pandas matplotlib seaborn scipy scikit-learn jupyter -y

# 4. Install PyTorch (CPU version -- see A.4 for GPU)
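# (pip wheels are used here because the PyTorch conda channel stopped receiving new releases after 2.5)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu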

# 5. Install remaining packages with pip
pip install transformers datasets accelerate shap lime nltk plotly

# 6. Register the environment as a Jupyter kernel
pip install ipykernel
python -m ipykernel install --user --name pyaiml --display-name 'Python (pyaiml)'

# 7. Launch Jupyter
jupyter notebook
```

To deactivate the environment when you are done: `conda deactivate`

---

## A.3 -- venv Setup (Built-in, no conda required)

```bash
# 1. Create a virtual environment in a folder called .venv
python3.11 -m venv .venv

# 2. Activate it (macOS/Linux)
source .venv/bin/activate
# On Windows:
# .venv\Scripts\activate

# 3. Upgrade pip first
pip install --upgrade pip

# 4. Install everything
pip install numpy pandas matplotlib seaborn scipy scikit-learn
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
pip install transformers datasets accelerate shap lime nltk plotly jupyter
```
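
The first step above can also be done from Python itself, which is handy in setup scripts. The sketch below uses the standard-library `venv.create` function, the same machinery behind `python -m venv`. The helper name `create_env` and the `demo-env` folder are illustrative assumptions, and the demo builds the environment in a temporary directory without pip so it runs quickly and leaves nothing behind.

```python
import os
import tempfile
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return the path to its pyvenv.cfg."""
    env_path = Path(env_dir)
    # Mirror the command-line default: symlink the interpreter except on Windows.
    venv.create(env_path, with_pip=with_pip, symlinks=(os.name != 'nt'))
    return env_path / 'pyvenv.cfg'


# Demo in a throwaway directory; pass '.venv' and with_pip=True for real use.
with tempfile.TemporaryDirectory() as tmp:
    cfg = create_env(Path(tmp) / 'demo-env', with_pip=False)
    print(cfg.read_text())
```

The printed `pyvenv.cfg` records which base interpreter the environment was created from, which is useful when checking that you built it with the Python version you intended.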


---

## A.4 -- GPU Setup for PyTorch (NVIDIA CUDA)

GPU training is required for Chapter 7 fine-tuning at scale and highly
recommended for Chapter 8. The setup depends on your NVIDIA driver version.

### Step 1: Check your NVIDIA driver

```bash
nvidia-smi
```

The output shows your driver version and the maximum CUDA version it supports.
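
If you would rather run this check from a notebook cell, the following sketch calls `nvidia-smi` through `subprocess` and parses its CSV output. It queries only the GPU name and driver version; the maximum supported CUDA version still has to be read from the plain `nvidia-smi` banner. The helper names `parse_gpu_csv` and `query_gpus` are hypothetical.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse `name, driver_version` CSV rows into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        # Split on the last comma so a comma inside the GPU name is harmless.
        name, driver = (field.strip() for field in line.rsplit(',', 1))
        gpus.append({'name': name, 'driver_version': driver})
    return gpus


def query_gpus():
    """Return GPU info from nvidia-smi, or None if the tool is missing or fails."""
    if shutil.which('nvidia-smi') is None:
        return None
    result = subprocess.run(
        ['nvidia-smi', '--query-gpu=name,driver_version', '--format=csv,noheader'],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return None
    return parse_gpu_csv(result.stdout)


gpus = query_gpus()
if gpus is None:
    print('nvidia-smi is not available -- no usable NVIDIA driver found')
else:
    for gpu in gpus:
        print(f"{gpu['name']}: driver {gpu['driver_version']}")
```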

### Step 2: Install the matching PyTorch build

Go to [pytorch.org/get-started/locally](https://pytorch.org/get-started/locally/)
and use the selector to generate the correct install command for your CUDA version.

Example for CUDA 12.1:
```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
```

### Step 3: Verify GPU is detected


In [None]:
# A.4.1 -- Verify PyTorch GPU setup
import torch

print(f'PyTorch version:    {torch.__version__}')
print(f'CUDA available:     {torch.cuda.is_available()}')

if torch.cuda.is_available():
    print(f'CUDA version:       {torch.version.cuda}')
    print(f'GPU count:          {torch.cuda.device_count()}')
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f'GPU {i}:             {props.name}')
        print(f'  Memory:           {props.total_memory / 1e9:.1f} GB')
        print(f'  CUDA capability:  {props.major}.{props.minor}')
else:
    print('No GPU detected -- running on CPU')
    print('For Colab GPU: Runtime -> Change runtime type -> T4 GPU')
    print('For local GPU: see A.4 instructions above')
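
Once the check passes, a common pattern is to choose the device once and reuse it, so the same notebook runs on machines with and without a GPU. The sketch below is one way to do that; `select_device` is a hypothetical helper, and it falls back to `'cpu'` when PyTorch is not installed at all.

```python
import importlib.util


def select_device():
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    # find_spec avoids an ImportError when torch is not installed.
    if importlib.util.find_spec('torch') is None:
        return 'cpu'
    import torch
    return 'cuda' if torch.cuda.is_available() else 'cpu'


device = select_device()
print(f'Using device: {device}')
```

The returned string can be passed wherever PyTorch expects a device, for example `model.to(device)`.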


---

## A.5 -- Managing Dependencies

### requirements.txt (pip)

Lock exact package versions so your environment is reproducible:

```bash
# Save current environment
pip freeze > requirements.txt

# Recreate on another machine
pip install -r requirements.txt
```
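
After recreating an environment, it can be worth confirming that the installed versions really match the pins. The sketch below compares simple `name==version` lines against `importlib.metadata.version` from the standard library. The helper `check_pins` and the example pins are hypothetical, and anything that is not a plain `==` pin (editable installs, URLs) is skipped rather than parsed.

```python
from importlib import metadata


def check_pins(lines, get_version=metadata.version):
    """Compare `name==version` pins against installed versions.

    Returns a dict mapping package name to 'ok', 'missing', or
    'mismatch (installed X)'. Blank lines, comments, and lines that
    are not simple `==` pins are skipped.
    """
    report = {}
    for raw in lines:
        line = raw.split('#', 1)[0].strip()  # drop trailing comments
        if '==' not in line:
            continue
        name, pinned = (part.strip() for part in line.split('==', 1))
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = 'missing'
            continue
        report[name] = 'ok' if installed == pinned else f'mismatch (installed {installed})'
    return report


# Hypothetical pins for illustration only; substitute the lines of your own requirements.txt.
example = ['numpy==1.26.4', '# a comment', 'no-such-package-xyz==0.1']
for name, status in check_pins(example).items():
    print(f'{name}: {status}')
```

To check a real file, pass `open('requirements.txt').read().splitlines()` as `lines`.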

### environment.yml (conda)

```bash
# Save current conda environment
conda env export > environment.yml

# Recreate on another machine
conda env create -f environment.yml
```

### Pinned requirements.txt for this book

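The cell below prints `requirements.txt` lines pinned to whatever versions are
installed in the environment where it runs. Run it in the book's Colab
environment to capture the exact versions the book uses: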


In [None]:
# A.5.1 -- Generate requirements.txt for this book
import importlib

book_packages = {
    'numpy':          'numpy',
    'pandas':         'pandas',
    'matplotlib':     'matplotlib',
    'seaborn':        'seaborn',
    'scipy':          'scipy',
    'scikit-learn':   'sklearn',
    'torch':          'torch',
    'transformers':   'transformers',
    'datasets':       'datasets',
    'accelerate':     'accelerate',
    'shap':           'shap',
    'nltk':           'nltk',
    'plotly':         'plotly',
}

print('# requirements.txt for Python for AI/ML')
print('# Generated from Colab environment')
print()
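for pkg_name, import_name in book_packages.items():
    try:
        mod = importlib.import_module(import_name)
    except ImportError:
        print(f'# {pkg_name}  -- not installed in this environment')
        continue
    ver = getattr(mod, '__version__', None)
    if ver is None:
        # Avoid emitting an invalid pin such as `pkg==unknown`
        print(f'# {pkg_name}  -- installed, but version not exposed')
    else:
        print(f'{pkg_name}=={ver}')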


---

## A.6 -- VS Code Setup for Notebook Development

VS Code with the Python and Jupyter extensions provides a full notebook
experience locally with better debugging, git integration, and variable
inspection than the browser-based Jupyter interface.

**Setup steps:**

1. Install [VS Code](https://code.visualstudio.com/)
2. Install extensions: **Python** (Microsoft) and **Jupyter** (Microsoft)
3. Open a `.ipynb` file -- VS Code handles it natively
4. Select your kernel: click the kernel selector top-right → choose `pyaiml`
   (or whichever environment you created in A.2/A.3)

**Recommended additional extensions:**

- **GitLens** -- enhanced git history and blame annotations
- **Pylance** -- fast type checking and autocomplete
- **Black Formatter** -- auto-format Python code on save
- **Rainbow CSV** -- colour-coded CSV file preview

**Useful VS Code settings for data science** (add to `settings.json`):

```json
{
    "editor.formatOnSave": true,
    "[python]": {
        "editor.defaultFormatter": "ms-python.black-formatter"
    },
    "jupyter.askForKernelRestart": false,
    "notebook.lineNumbers": "on",
    "editor.rulers": [88]
}
```

---

*End of Appendix A -- Python for AI/ML*  
[![Back to TOC](https://img.shields.io/badge/Back_to-Table_of_Contents-1B3A5C?style=flat-square)](https://colab.research.google.com/github/timothy-watt/python-for-ai-ml/blob/main/Python_for_AIML_TOC.ipynb)
