# DAT540: Generative AI

* Lecturers: Vinay Setty, Petra Galuscakova.
* Teaching Assistant: Gabriel Iturra-Bocaz.

## Setting up Python, Conda, and Visual Studio Code (VS Code)


This Colab notebook explains:
- What [**Python**](https://www.python.org/) is (in a practical sense)
- What [**Conda**](https://anaconda.org/) is and when to use it
- What [**VS Code**](https://code.visualstudio.com/) is and how it fits into your workflow
- How to install libraries with **pip** vs **conda**
- The real differences between pip and conda (and common mistakes)

> **Note about Colab:** Google Colab already comes with Python and many libraries installed.  
> Colab is great for learning and experiments, but **Conda is mainly a local tool**.  
> We’ll still show Conda commands (for your laptop/HPC), and show how installs work in Colab (pip + optional micromamba).

---

## 1) What is Python?
**Python** is:
- A programming language (the code you write)
- A runtime/interpreter (the program that executes `.py` files)
- An ecosystem of libraries (NumPy, pandas, PyTorch, etc.)

When people say "install Python", they usually mean:
- Install the **Python interpreter**
- Install a **package manager** (like pip) or a distribution that bundles one (like Anaconda or Miniconda, which ship with Conda)


## 2) What is Conda?
**Conda** is both:
1. An **environment manager** (create isolated environments)
2. A **package manager** (install libraries)

It’s especially useful when:
- You need compiled packages (NumPy/SciPy, CUDA toolkits, etc.)
- You want reproducible environments (exact versions)
- You work across machines (laptop + server + HPC)

Conda environments prevent “dependency conflicts” by isolating:
- Python version
- Installed libraries and their versions
- (Often) system-level compiled dependencies
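
To see which environment a running interpreter belongs to, you can inspect `sys.prefix`. The sketch below is a heuristic, not an official API: it assumes that conda environments contain a `conda-meta` directory and that `conda activate` sets the `CONDA_DEFAULT_ENV` variable. The function name `describe_environment` is made up for this example.

```python
import os
import sys


def describe_environment() -> dict:
    """Return basic facts about the environment of the running interpreter."""
    prefix = sys.prefix
    return {
        "python_version": ".".join(map(str, sys.version_info[:3])),
        "prefix": prefix,
        # Heuristic: conda keeps package metadata in <prefix>/conda-meta
        "looks_like_conda": os.path.isdir(os.path.join(prefix, "conda-meta")),
        # Set by `conda activate`; None if no conda env is active
        "conda_env_name": os.environ.get("CONDA_DEFAULT_ENV"),
        # True inside a venv/virtualenv, where prefix differs from base_prefix
        "looks_like_venv": sys.prefix != sys.base_prefix,
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Running this in two different environments should print two different `prefix` values, which is the isolation described above.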


## 3) What is Visual Studio Code (VS Code)?
**VS Code** is a code editor / IDE. It helps you:
- Write and navigate code
- Run Python scripts
- Use a debugger
- Manage Git
- Work with notebooks (`.ipynb`)
- Select Python interpreters (important!)

### VS Code + Conda (how they connect)
If you use Conda, VS Code should be set to the correct interpreter:
- `Python: Select Interpreter` → choose your conda environment

If VS Code uses the wrong interpreter, you’ll see errors like:
- `ModuleNotFoundError` for a package you did install, because it was installed into a different environment
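
A quick way to investigate this kind of error is to ask the running interpreter whether it can see the module at all. The following is a minimal sketch using the standard-library `importlib.util.find_spec`; the helper name `diagnose_import` and the module name `surely_not_installed_pkg` are placeholders invented for this example.

```python
import importlib.util
import sys


def diagnose_import(module_name: str) -> str:
    """Report whether the running interpreter can find a top-level module."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return (
            f"'{module_name}' is NOT visible to {sys.executable}. "
            f"Install it with: {sys.executable} -m pip install <package>"
        )
    return f"'{module_name}' found at {spec.origin} (interpreter: {sys.executable})"


print(diagnose_import("json"))  # standard library, should be found
print(diagnose_import("surely_not_installed_pkg"))  # placeholder for a missing module
```

If the interpreter path printed here is not the one from your conda environment, the fix is to switch the interpreter in VS Code rather than to reinstall the package.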


# Part A — Installing libraries in Colab
Colab is already a Linux machine with Python installed.  
In Colab, you typically install packages using **pip** (not conda).

### Basic rule in Colab
- Use: `pip install ...`
- Restart the runtime only if needed (for example, after upgrading a package that was already imported)
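
If a notebook has to behave differently in Colab and on your laptop, you can check where it is running. This is a small sketch based on a common heuristic (the `google.colab` module being loaded in the Colab runtime), not a guaranteed interface; `in_colab` is a name chosen for this example.

```python
import platform
import sys


def in_colab() -> bool:
    """Heuristic: the Colab runtime has the `google.colab` module loaded."""
    return "google.colab" in sys.modules


print("Operating system:", platform.system())
print("Running in Colab:", in_colab())
```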


In [None]:
# Check your Python version in Colab
import sys
print(sys.version)

3.12.12 (main, Oct 10 2025, 08:52:57) [GCC 11.4.0]


In [None]:
# See where Python is coming from (the interpreter path)
import sys
print(sys.executable)

/usr/bin/python3


## A1) Install a library with pip (Colab)
Example: install `rich` (a nice printing library).


In [None]:
!pip -q install rich

In [None]:
from rich import print
print("[bold green]pip install worked![/bold green]")

## A2) Check what pip installed (version + location)
Useful for debugging.


In [None]:
!python -m pip show rich

Name: rich
Version: 13.9.4
Summary: Render rich text, tables, progress bars, syntax highlighting, markdown and more to the terminal
Home-page: https://github.com/Textualize/rich
Author: Will McGugan
Author-email: willmcgugan@gmail.com
License: MIT
Location: /usr/local/lib/python3.12/dist-packages
Requires: markdown-it-py, pygments
Required-by: bigframes, flax, keras, keras-hub, plum-dispatch, pymc, typer
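
The same information is available from inside Python through the standard-library `importlib.metadata` module. A minimal sketch (the helper name `show_package` is made up for this example):

```python
from importlib import metadata


def show_package(name: str):
    """Return name, version and install location, or None if not installed."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    return {
        "name": dist.metadata["Name"],
        "version": dist.version,
        # Directory that contains the installed package files
        "location": str(dist.locate_file("")),
    }


print(show_package("rich"))  # None if rich is not installed in this interpreter
```

Because this runs inside the interpreter, it reports what *your running Python* sees, which may differ from what a `pip` command on your shell `PATH` reports.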


## A3) Installing a specific version with pip
Sometimes reproducibility requires pinning a version:


In [None]:
# Example (commented out): pin a version
# !pip install "rich==13.7.1"


# Part B — Conda vs pip (core differences)

## 1) What they manage
### pip
- Installs Python packages from **PyPI** (Python Package Index)
- Mostly pure-Python packages, but it can also install prebuilt wheels that contain compiled code

### conda
- Installs packages from **conda channels** (like conda-forge)
- Can manage **non-Python dependencies** too:
  - system libraries
  - compilers
  - CUDA toolkits (depending on setup)
  - MKL/OpenBLAS, etc.

## 2) Reproducibility
- Both can be reproducible if you pin versions.
- Conda environments are often easier to reproduce across OSes for scientific stacks.

## 3) Common advice
- Prefer **conda** for heavy scientific / compiled stacks when working locally (NumPy/SciPy/PyTorch/CUDA).
- Prefer **pip** when:
  - the package is new and not on conda yet
  - you’re in Colab
  - you’re in a venv/poetry/pip-tools workflow

## 4) Mixing conda and pip (allowed, but do it carefully)
Best practice:
1. Create environment with conda
2. Install as much as possible with conda
3. Use pip **only for the missing packages**
4. If you use pip, run it as: `python -m pip install ...` (ensures correct interpreter)


# Part C — Conda commands (for your laptop / server / HPC)
Colab usually does **not** come with conda.  
So treat this section as a reference: copy the commands into a terminal on your own machine.

## C1) Create a conda environment
```bash
conda create -n myenv python=3.11 -y
conda activate myenv
```

## C2) Install packages with conda
```bash
conda install numpy pandas scikit-learn -y
```

## C3) Install packages from conda-forge (recommended channel)
```bash
conda install -c conda-forge faiss-cpu -y
```

## C4) Export and recreate environment (reproducibility)
Export:
```bash
conda env export --no-builds > environment.yml
```

Create from file:
```bash
conda env create -f environment.yml
conda activate <env_name_from_yml>
```
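
An `environment.yml` is a plain YAML file listing the environment name, the channels, and the dependencies; packages installed with pip appear in a nested `pip:` list. The sketch below builds such a file as text so you can see the layout. The helper `render_environment_yml` and the package choices are illustrative only, and a real `conda env export` will list many more packages.

```python
def render_environment_yml(name, conda_deps, pip_deps=(), channel="conda-forge"):
    """Build the text of a minimal environment.yml (hypothetical helper)."""
    lines = [f"name: {name}", "channels:", f"  - {channel}", "dependencies:"]
    lines += [f"  - {dep}" for dep in conda_deps]
    if pip_deps:
        # pip-installed packages go in a nested `pip:` list
        lines += ["  - pip", "  - pip:"]
        lines += [f"    - {dep}" for dep in pip_deps]
    return "\n".join(lines) + "\n"


print(render_environment_yml("myenv", ["python=3.11", "numpy", "pandas"], ["tiktoken"]))
```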


# Part D — pip commands (works in Colab and locally)

## D1) Install a package
```bash
pip install numpy
```

## D2) Install a specific version
```bash
pip install "numpy==1.26.4"
```

## D3) Save your environment (requirements)
```bash
pip freeze > requirements.txt
```

## D4) Recreate
```bash
pip install -r requirements.txt
```
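
After recreating an environment, it can be useful to confirm that the installed versions match the pins. The following is a simplified sketch, not a replacement for pip's own resolver: it only understands exact `name==version` lines and ignores everything else. The helper name `check_pins` and the example pins are assumptions for illustration.

```python
from importlib import metadata


def check_pins(requirements_text: str) -> list:
    """Compare `name==version` lines against the installed versions.

    Returns (name, wanted, installed) tuples for every mismatch;
    installed is None when the package is missing.
    Lines without an exact `==` pin are skipped in this simplified sketch.
    """
    mismatches = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue
        name, wanted = (part.strip() for part in line.split("==", 1))
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches.append((name, wanted, installed))
    return mismatches


# Hypothetical requirements content; normally read from requirements.txt
example = "numpy==1.26.4\nrich==13.7.1  # pinned for the course\n"
for name, wanted, installed in check_pins(example):
    print(f"{name}: wanted {wanted}, installed {installed}")
```

An empty result means every exact pin in the text matches what the running interpreter has installed.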

### Tip: use `python -m pip`
In multi-Python setups (conda/envs/VS Code), do:
```bash
python -m pip install <package>
```
so you install into the Python you’re actually running.
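
The same idea works from inside a script or notebook: `sys.executable` is the path of the interpreter that is running your code, so a pip command built from it always targets the right environment. A minimal sketch (the helper name `pip_install_command` is made up here, and the command is only printed, not executed):

```python
import sys


def pip_install_command(*packages: str) -> list:
    """Build a pip command bound to the interpreter that runs this code."""
    return [sys.executable, "-m", "pip", "install", *packages]


cmd = pip_install_command("rich")
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```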


# Part E — Optional: Using a Conda-like tool inside Colab (micromamba)
If you really want conda-style installs in Colab, a common approach is **micromamba**.
It is a small standalone executable that can create conda-style environments without a full conda installation.

This is optional and usually unnecessary: Colab + pip is enough for most work.

Below is a minimal example that:
- Installs micromamba
- Creates an environment
- Installs packages via conda-forge

> If this cell fails, don’t worry: the Colab runtime changes over time, and these steps can break without notice.


In [None]:
# OPTIONAL: micromamba in Colab (may take a minute)
# Uncomment to try.

# !curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
# !./bin/micromamba --version

# Create env in a local folder (not system-wide)
# !./bin/micromamba create -y -p ./mamba_env -c conda-forge python=3.11 numpy pandas

# Use the env's python to run code:
# !./mamba_env/bin/python -c "import numpy, pandas; print(numpy.__version__, pandas.__version__)"
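
The last command above runs code with the environment's own interpreter instead of the notebook's. The same can be done from Python with `subprocess`. This is a sketch under the assumption that the interpreter path you pass exists; it is demonstrated with the current interpreter so it runs anywhere, and `run_with_interpreter` is a name invented for this example.

```python
import subprocess
import sys


def run_with_interpreter(python_path: str, code: str) -> str:
    """Run a snippet with a given Python executable and return its stdout."""
    result = subprocess.run(
        [python_path, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the snippet fails
    )
    return result.stdout.strip()


# Demonstrated with the current interpreter; for the micromamba env above
# you would pass "./mamba_env/bin/python" instead (only if you created it).
print(run_with_interpreter(sys.executable, "import sys; print(sys.prefix)"))
```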


# Part F — VS Code setup (practical checklist)

Visual Studio Code (VS Code) is a lightweight but powerful code editor used to write, run, and debug Python and many other languages.
It becomes a full IDE through extensions, especially the Python and Jupyter extensions, which let you run scripts, notebooks, and use a debugger.
VS Code does not manage libraries itself. It uses whichever Python interpreter you select (from a conda environment or a venv), so choosing the correct interpreter is crucial. You can download it here: https://code.visualstudio.com/

## F1) Install these in VS Code
- **Python** extension (by Microsoft)
- **Jupyter** extension (for notebooks)

## F2) Choose the correct interpreter
In VS Code:
- `Ctrl/Cmd + Shift + P` → `Python: Select Interpreter`
- Pick the interpreter from your conda env (its entry in the list includes the environment name, e.g. `myenv`)

## F3) Verify in the VS Code terminal
Open VS Code terminal and run:
```bash
python -c "import sys; print(sys.executable)"
python -m pip -V
```
Make sure these point to the same environment.

## F4) Common bug pattern
You installed with pip in one environment, but run code in another.
Fix: select the correct interpreter + install using `python -m pip`.


# Quick decision guide: pip or conda?

Use **pip** when:
- You’re in Colab
- The package is only on PyPI (or newest version is on PyPI)
- You’re using venv/poetry workflows

Use **conda** when:
- You need heavy scientific stacks / compiled dependencies
- You want easier cross-platform reproducibility
- You rely on CUDA toolkits / native libs (common in ML research)

Mix them only when needed:
- conda first, pip second (for missing packages)


# Mini exercise (Colab): install and import a few libraries
Try installing something you don’t already have, then import it.


In [None]:
# Example: install 'tiktoken' (often not preinstalled)
!pip -q install tiktoken

In [None]:
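import tiktoken
from importlib.metadata import version
# Package metadata works even when a module has no __version__ attribute
print("tiktoken version:", version("tiktoken"))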