# Module 2a: Understanding Environment Management
## Why Your Projects Need Separate Toolboxes

**Time required:** 15 minutes (reading only — no installation)  
**Prerequisites:** Module 1b (Python extension installed)  
**What you'll learn:** Why we need virtual environments and package managers

## The Problem: Why Can't We Just Install Python?

You might be wondering: *"Why all this complexity? Can't I just install Python and start coding?"*

Technically, yes. But here's what happens in practice:

### Scenario 1: Version Conflicts

Imagine you have two projects:

- **Project A** (from 2022): Uses `pandas` version 1.4, which your old analysis scripts depend on
- **Project B** (new): Needs `pandas` version 2.0 for new features

If Python packages are installed globally (for your whole computer), you can only have ONE version of pandas. Updating for Project B breaks Project A. This is called **dependency hell**.

### Scenario 2: The "Works on My Machine" Problem

You write a script that works perfectly on your computer. You send it to a colleague, and it crashes. Why?

- You have `numpy` version 1.24, they have 1.21
- You have `matplotlib` 3.8, they have 3.5
- Some function you used doesn't exist in their older versions

Without a way to specify *exactly* which versions you used, sharing code becomes frustrating.
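
One low-tech way to spot this kind of mismatch is to have both machines print the versions they actually have installed. The sketch below is an illustration only: it uses the standard-library `importlib.metadata` module, the helper name `installed_versions` is made up for this example, and the package names are simply the ones from the scenario above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Map each package name to its installed version, or None if it is missing."""
    report = {}
    for name in package_names:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed for this interpreter.
            report[name] = None
    return report


if __name__ == "__main__":
    for name, found in installed_versions(["numpy", "matplotlib"]).items():
        print(f"{name}: {found or 'not installed'}")
```

If you and your colleague both run this and compare the output, version differences show up immediately. It diagnoses the problem but does not fix it; that is what lockfiles are for.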

### Scenario 3: System Python Chaos

Many operating systems ship with a Python installation (most Linux distributions, and macOS via Apple's developer tools). This "system Python" is used by the operating system and its tools. If you install incompatible packages into it, you can break those tools.

```{warning}
Never modify the system Python on macOS or Linux. It can break your operating system's tools!
```

## The Solution: Virtual Environments

A **virtual environment** is an isolated Python setup for a single project. Each project gets its own:

- Python version
- Installed packages
- Package versions

### The Laboratory Analogy

Think of it like scientific laboratories:

| Without Virtual Environments | With Virtual Environments |
|------------------------------|---------------------------|
| One shared lab for all experiments | Each experiment has its own lab |
| Equipment conflicts between experiments | Equipment isolated per experiment |
| Changing one setup affects everything | Changes only affect that experiment |
| Hard to reproduce exact conditions | Easy to recreate the same setup |

### How It Works (Conceptually)

```
Your Computer
├── Project A (flood-analysis/)
│   └── .venv/                    ← Virtual environment
│       ├── Python 3.12
│       ├── pandas 1.4.0
│       └── matplotlib 3.5.0
│
├── Project B (groundwater-model/)
│   └── .venv/                    ← Different virtual environment
│       ├── Python 3.12
│       ├── pandas 2.0.0          ← Different version!
│       ├── matplotlib 3.8.0
│       └── flopy 3.4.0
│
└── Project C (rainfall-stats/)
    └── .venv/                    ← Yet another environment
        ├── Python 3.11           ← Even different Python version!
        └── scipy 1.11.0
```

Each project is completely independent. You can:
- Use different Python versions per project
- Have different package versions that would otherwise conflict
- Delete a project without affecting others
- Share exact requirements with colleagues
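
To check which of these environments a script is actually running in, you can ask Python itself. This is a minimal sketch using only the standard library; the function name `in_virtual_environment` is made up for this example, and the check assumes an environment created by `venv` or uv (both follow the same layout).

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside an environment, sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Environment root:", sys.prefix)
    print("Inside a virtual environment:", in_virtual_environment())
```

Run from Project A, the interpreter path should point inside `flood-analysis/.venv/`; run from Project B, inside `groundwater-model/.venv/`.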

## The Traditional Approach (And Why It's Painful)

Historically, Python developers used these tools:

### 1. Manual Python Installation
- Download Python from python.org
- Run the installer
- Hope it doesn't conflict with existing installations
- Repeat for each Python version you need

### 2. pip (Package Installer for Python)
- The standard tool for installing packages
- `pip install pandas` installs pandas
- **Problem:** Can be slow on large dependency sets, and resolving conflicting version requirements sometimes fails or takes a long time

### 3. venv (Virtual Environment Module)
- Built into Python 3.3+
- `python -m venv .venv` creates an environment
- **Problem:** You need Python installed first, manual activation required
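
The same kind of environment can also be created from Python itself, which makes it easy to peek at what `python -m venv .venv` produces. The sketch below is an illustration only: it uses the standard-library `venv` module, builds the environment in a temporary folder, and skips installing pip (`with_pip=False`) so it runs quickly and offline. The helper name `create_demo_environment` is made up for this example.

```python
import tempfile
import venv
from pathlib import Path


def create_demo_environment(parent_dir):
    """Create a bare virtual environment under parent_dir and return its path."""
    env_dir = Path(parent_dir) / ".venv"
    # with_pip=False keeps the demo fast and offline; a real project would
    # normally want pip (or uv) available to install packages.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        env_dir = create_demo_environment(scratch)
        # pyvenv.cfg is the marker file that identifies the folder as an environment.
        print((env_dir / "pyvenv.cfg").read_text())
        print(sorted(entry.name for entry in env_dir.iterdir()))
```

The printed `pyvenv.cfg` records which Python installation the environment was created from, which is why Python has to be installed first.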

### The Traditional Workflow

```bash
# 1. Install Python manually (varies by OS)
# 2. Create a virtual environment
python -m venv .venv

# 3. Activate it (different command per OS!)
# Windows:
.venv\Scripts\activate
# Mac/Linux:
source .venv/bin/activate

# 4. Install packages (slow)
pip install pandas numpy matplotlib

# 5. Remember to activate every time you work on the project!
```

**Problems with this approach:**
- Multiple steps to remember
- Different commands for different operating systems
- Forgetting to activate leads to confusing errors
- pip is slow and sometimes fails on complex dependencies
- Managing multiple Python versions is painful

## The Modern Approach: uv

**uv** is a modern Python package and project manager that addresses each of these problems. It's:

- **Fast**: Written in Rust; Astral's published benchmarks report it as 10-100x faster than pip
- **Simple**: One tool covers Python installation, environments, and packages
- **Automatic**: Handles Python installation and environments for you
- **Reliable**: Better dependency resolution, fewer conflicts

### What uv Does For You

| Task | Traditional Way | With uv |
|------|-----------------|----------|
| Install Python | Download from python.org, run installer | `uv python install 3.12` |
| Create project | Multiple manual steps | `uv init my-project` |
| Create environment | `python -m venv .venv` + activate | Automatic! |
| Install packages | `pip install pandas` (slow) | `uv add pandas` (fast) |
| Run scripts | Activate env first, then `python script.py` | `uv run script.py` |
| Share with colleagues | Create requirements.txt manually | `pyproject.toml` + `uv.lock` created automatically |

### The uv Workflow (Preview)

Here's what working with uv looks like (you'll do this in the next module):

```bash
# Create a new project (creates the folder and config files)
uv init my-water-project
cd my-water-project

# Add packages (creates the environment on first use, then installs into it)
uv add pandas numpy matplotlib

# Run your script (automatically uses the right environment)
uv run analyze_discharge.py
```

That's it! No activation commands, no remembering which environment you're in, no slow installations.

```{note}
uv was released in 2024 by Astral (the company behind the popular `ruff` Python linter). It's quickly becoming the recommended tool for Python project management.
```

## Key Files You'll Encounter

When using modern Python tools, you'll see these files in your projects:

### `pyproject.toml`

The main configuration file for your project. Contains:
- Project name and description
- Python version requirement
- List of dependencies (packages your project needs)

```toml
[project]
name = "flood-analysis"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = [
    "pandas>=2.0.0",
    "matplotlib>=3.7.0",
]
```

### `uv.lock`

A "lockfile" that records the *exact* versions of every package installed. This ensures:
- You get the same versions every time
- Colleagues get the same versions you have
- Reproducible results!
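
The difference between a version *range* (as in `pyproject.toml`) and an *exact* version (as recorded in a lockfile) can be sketched in a few lines. This is a deliberately simplified illustration: it only handles plain `major.minor.patch` numbers, whereas real tools follow the full Python versioning rules. The helper names and the list of candidate versions are assumptions made for this example.

```python
def parse_version(text):
    """Turn '2.1.4' into (2, 1, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def at_least(candidate, minimum):
    """True if candidate satisfies a '>=minimum' requirement."""
    return parse_version(candidate) >= parse_version(minimum)


# Example version numbers, chosen only to illustrate the comparison.
candidates = ["1.5.3", "2.0.0", "2.1.4", "2.2.3"]

# A range accepts every version at or above the minimum...
allowed_by_range = [v for v in candidates if at_least(v, "2.0.0")]
# ...while an exact pin accepts exactly one.
allowed_by_pin = [v for v in candidates if v == "2.1.4"]

print("pandas>=2.0.0 accepts:", allowed_by_range)
print("pandas==2.1.4 accepts:", allowed_by_pin)
```

With only the range, two people installing on different days could end up with different versions. The lockfile removes that ambiguity.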

### `.venv/` folder

The virtual environment itself. Contains:
- A Python interpreter (usually a lightweight link to a shared installation, not a full copy)
- All installed packages
- Scripts and executables

```{tip}
The `.venv` folder can be large (hundreds of MB). It's not stored in Git; it's recreated from `pyproject.toml` and `uv.lock` when needed.
```
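
If you are curious where the packages physically live, the running interpreter can tell you. This is a small sketch using the standard-library `sysconfig` and `importlib.metadata` modules; the function names are made up for this example.

```python
import sysconfig
from importlib.metadata import distributions


def site_packages_dir():
    """Return the folder where this interpreter installs pure-Python packages."""
    return sysconfig.get_paths()["purelib"]


def installed_package_names():
    """Return the sorted names of the packages visible to this interpreter."""
    names = {dist.metadata["Name"] for dist in distributions()}
    # Skip entries with missing metadata rather than failing on them.
    return sorted(name for name in names if name)


if __name__ == "__main__":
    print("Packages are installed in:", site_packages_dir())
    print("Number of installed packages:", len(installed_package_names()))
```

Inside a project environment, the reported folder should sit somewhere under that project's `.venv/`.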

### `.python-version`

A small file specifying which Python version to use:

```
3.12
```

uv reads this and automatically uses (or installs) the right Python version.
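
To make the idea concrete, here is a rough sketch of the kind of check involved: read the requested version and compare it with the interpreter that is running. This is not uv's actual implementation. The helper names are made up for this example, and it only handles plain numeric requests such as `3.12`, not other forms the file may contain.

```python
import sys
import tempfile
from pathlib import Path


def read_requested_version(project_dir):
    """Return the version string stored in .python-version, or None if absent."""
    version_file = Path(project_dir) / ".python-version"
    if not version_file.is_file():
        return None
    return version_file.read_text().strip()


def matches_request(requested, running=sys.version_info):
    """Check whether an interpreter version matches a request like '3.12'."""
    wanted = tuple(int(part) for part in requested.split("."))
    # Compare only as many components as the request specifies.
    return tuple(running[: len(wanted)]) == wanted


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        (Path(scratch) / ".python-version").write_text("3.12\n")
        requested = read_requested_version(scratch)
        print("Requested:", requested)
        print("Running interpreter matches:", matches_request(requested))
```

A request of `3.12` matches any 3.12.x interpreter, which is why the file usually lists only the major and minor version.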

## Why This Matters for Water Modellers

As a water modeller, proper environment management gives you:

### 1. Reproducible Analyses

When you write a flood frequency analysis in 2024, you want it to produce the *same results* in 2027. Locked dependencies remove package-version drift as a reason for the results to differ.

### 2. Easy Collaboration

Share your project folder with a colleague. They run `uv sync` and get *exactly* your environment. No more "it works on my machine" problems.

### 3. Clean Separation

Your groundwater modelling project (FloPy, MODFLOW) doesn't interfere with your surface water analysis (HydroMT, xarray). Each has its own environment.

### 4. Professional Practice

Consultancies, research institutions, and government agencies increasingly require reproducible workflows. Using virtual environments is standard practice.

### Real Example

Imagine this scenario:

```
2024: You analyze discharge data for a client
      - pandas 2.0, scipy 1.11, your custom scripts
      - Results: Q100 = 450 m³/s

2027: Client asks you to update the analysis with new data
      - You open the project, run `uv sync`
      - Same packages, same versions, same methodology
      - Add new data, rerun → Updated Q100 = 470 m³/s
      - Results are directly comparable!
```

Without environment management, you might get different results just because package versions changed—not because the data or methodology changed.

## Common Questions

### "Do I need to understand all this to use Python?"

No! You just need to know:
1. `uv init` creates a project
2. `uv add` installs packages
3. `uv run` runs your scripts

The conceptual understanding helps when things go wrong, but day-to-day usage is simple.

### "Why not just use Anaconda/conda?"

Conda is another popular tool, especially in scientific computing. However:
- It's heavier (larger installation)
- Mixing conda and pip packages can cause issues
- uv is faster and simpler for pure Python workflows

If your workplace uses conda, that's fine! The concepts are the same. We teach uv because it's modern, fast, and beginner-friendly.

### "What if I already have Python installed?"

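That's okay! uv can use an existing Python if it matches the version your project asks for; otherwise it downloads its own managed copy. Either way, packages are installed into the project's `.venv`, so uv won't modify your existing installation.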

### "Is uv stable enough to use?"

Yes. Despite being new (2024), uv is backed by Astral, a well-funded company with experienced developers. It's already used by many companies and recommended by Python experts.

## Summary: The Key Concepts

Before moving on, make sure you understand these concepts:

| Concept | What It Means |
|---------|---------------|
| **Virtual Environment** | An isolated Python setup for one project |
| **Package Manager** | A tool that installs and manages packages (uv, pip) |
| **Dependencies** | Packages your project needs to work |
| **Lockfile** | A file recording exact package versions (`uv.lock`) |
| **pyproject.toml** | Your project's configuration file |

### The Mental Model

```
┌─────────────────────────────────────────────────────────┐
│  Your Project Folder                                    │
│  ├── pyproject.toml    ← "Recipe" (what you need)      │
│  ├── uv.lock           ← "Shopping list" (exact items) │
│  ├── .venv/            ← "Kitchen" (where tools live)  │
│  └── your_scripts.py   ← Your actual work              │
└─────────────────────────────────────────────────────────┘
```

- `pyproject.toml` says "I need pandas and matplotlib"
- `uv.lock` says "specifically pandas 2.1.4 and matplotlib 3.8.2"
- `.venv/` contains the actual installed packages
- Your scripts use those packages to do real work

## What's Next?

Now that you understand *why* we need environment management, let's set it up!

In the next module, you'll:

1. Install uv on your computer
2. Create your first Python project
3. Install packages (pandas, numpy, matplotlib)
4. Run a test script to verify everything works

This is where the real fun begins—after Module 2b, you'll have a fully working Python environment!

---

**Ready?** Continue to [Module 2b: Installing and Using uv](02b_installing_uv.ipynb)