# Week 1 — Part 01: Environment Setup Lab

**Estimated Time:** 45-60 minutes

**Prerequisites:** Basic command line knowledge

---

## What success looks like (end of Part 01)

- You can create a new virtual environment (`.venv`).
- You can install `pandas` inside that environment.
- You can produce a `requirements.txt` and use it to reproduce installs later.

**Checkpoint evidence** (minimum):

- `python --version` prints the Python version you intend to use for this course.
- `pip --version` prints a path inside `.venv` (not the system-wide installation).
- `pip freeze > requirements.txt` produces a non-empty `requirements.txt`.

---

## Learning Objectives

By completing this lab, you will:

- ✅ Create isolated Python environments using venv
- ✅ Install and manage dependencies with pip
- ✅ Understand the importance of reproducible environments
- ✅ Save and restore project dependencies
- ✅ Test environment reproducibility

## Key Concepts

- **Virtual Environments**: Isolated Python environments that prevent dependency conflicts
- **Dependency Management**: Recording and reproducing exact package versions
- **Reproducibility**: Ensuring consistent results across different environments and runs
- **Requirements Files**: Text files that list project dependencies

---

## Exercise 1: Understanding Virtual Environments

Virtual environments are isolated Python environments that allow you to install packages for a specific project without affecting other projects or your system Python installation. This prevents dependency conflicts between different projects.

Why do we need virtual environments?

1. Different projects often require different versions of the same package
2. Installing packages globally can lead to conflicts
3. Virtual environments make it easy to reproduce your development environment on other machines
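
One way to check whether the current interpreter belongs to a virtual environment is to compare `sys.prefix` with `sys.base_prefix`; the two differ inside an environment created by `venv`. A minimal sketch (the helper name `in_virtualenv` is illustrative, not part of any library):

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtualenv())
```

Running this before and after setting up `.venv` shows which interpreter the notebook kernel is using.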

In [None]:
# Check Python version
!python --version

### Task 1.1: Create a Virtual Environment

Let's create a virtual environment for our project. This will be an isolated Python environment where we can install our project dependencies.

In [None]:
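# Create a virtual environment named .venv
!python -m venv .venv

# `!` commands do not raise on failure, so confirm the environment exists
import os
created = os.path.isfile(os.path.join(".venv", "pyvenv.cfg"))
print("Virtual environment created successfully!" if created else "Creation failed; check the output above.")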
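
The same step can also be done from Python with the standard-library `venv` module. The sketch below is one possible approach, not the lab's required method: it creates a throwaway environment in a temporary directory (`demo_env` and `create_env` are placeholder names) and skips installing pip to keep it fast.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path, with_pip: bool = False) -> Path:
    """Create a virtual environment at env_dir and return the path to its config file."""
    venv.create(env_dir, with_pip=with_pip)
    # Every environment created by venv contains a pyvenv.cfg marker file.
    return env_dir / "pyvenv.cfg"


with tempfile.TemporaryDirectory() as tmp:
    cfg = create_env(Path(tmp) / "demo_env")
    print("Config file present:", cfg.exists())
```

Passing `with_pip=True` would also bootstrap pip into the new environment, which is what `python -m venv` does by default.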

### Task 1.2: Activate the Virtual Environment

Now that we've created our virtual environment, we need to activate it. When activated, any Python packages we install will be isolated to this environment.

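Note: Activation commands differ between operating systems, so the cell below prints the command for each one. Run the matching command in your terminal: each `!` command in a notebook runs in its own subshell, so activating there does not carry over to later cells or to the notebook kernel.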

In [None]:
# Display activation commands for different operating systems
print("To activate your environment, run one of the following commands:")
print("\nLinux/macOS:\n  source .venv/bin/activate")
print("\nWindows (Command Prompt):\n  .venv\\Scripts\\activate.bat")
print("\nWindows (PowerShell):\n  .venv\\Scripts\\Activate.ps1")
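
Activation mainly puts the environment's executables first on `PATH`. For a single command, calling the environment's interpreter by its full path has the same effect without activating. A small sketch of where that interpreter lives on each platform, assuming the default `venv` layout (`venv_python` is a hypothetical helper name):

```python
import os
from pathlib import Path


def venv_python(env_dir: str, os_name: str = os.name) -> Path:
    """Return the expected path of the interpreter inside a venv directory."""
    # venv places executables in Scripts\ on Windows and bin/ elsewhere.
    if os_name == "nt":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


print(venv_python(".venv"))
```

For example, running the printed path followed by `-m pip --version` in a terminal reports the pip that belongs to `.venv`.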

### Task 1.3: Upgrade pip

It's good practice to upgrade pip to the latest version after creating a new virtual environment.

In [None]:
# Upgrade pip to the latest version
!python -m pip install --upgrade pip

# Verify pip version; the path in the output should point inside .venv
!pip --version

---

## Exercise 2: Installing Dependencies

Now that we have our virtual environment set up, let's install some dependencies for our project.

### Task 2.1: Install Required Dependencies

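For this week's exercises, we need pandas for data manipulation. Let's install it in our virtual environment. This assumes the notebook kernel was started from the activated `.venv`; otherwise `!pip` may install into a different environment than the one the kernel imports from.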

In [None]:
# Install pandas in our virtual environment
!pip install pandas

# Verify installation
try:
    import pandas as pd
    print("Pandas installed successfully!")
    print(f"Pandas version: {pd.__version__}")
except ImportError:
    print("Failed to import pandas. Please check your installation.")

### Task 2.2: Install Multiple Dependencies

Let's install a few more packages that we'll need for future weeks.

In [None]:
# Install additional dependencies
!pip install scikit-learn matplotlib

# Verify installations
try:
    import sklearn
    print("Scikit-learn installed successfully!")
    print(f"Scikit-learn version: {sklearn.__version__}")
except ImportError:
    print("Failed to import scikit-learn. Please check your installation.")

try:
    import matplotlib
    print("Matplotlib installed successfully!")
    print(f"Matplotlib version: {matplotlib.__version__}")
except ImportError:
    print("Failed to import matplotlib. Please check your installation.")
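
Installed versions can also be read from package metadata without importing the packages themselves, which is useful when an import is slow or has side effects. A minimal sketch using the standard-library `importlib.metadata` (the helper name `installed_version` is illustrative):

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name in ("pandas", "scikit-learn", "matplotlib"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Note that the lookup uses the distribution name (`scikit-learn`), not the import name (`sklearn`).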

---

## Exercise 3: Managing Dependencies

To ensure reproducibility, we need to record the exact versions of all packages we've installed. This allows others (or ourselves in the future) to recreate the exact same environment.

### Task 3.1: Save Dependencies to requirements.txt

Let's generate a requirements.txt file with the exact package versions we've installed.

In [None]:
# Generate requirements.txt with exact package versions
!pip freeze > requirements.txt

# Display the contents of requirements.txt
with open('requirements.txt', 'r') as f:
    requirements = f.read()
    print("requirements.txt contents:")
    print(requirements)

### Task 3.2: Understanding requirements.txt

Let's examine the contents of our requirements.txt file to understand what information it contains.

In [None]:
# Parse and display information about our dependencies
with open('requirements.txt', 'r') as f:
    requirements = f.readlines()

print("Dependency Information:")
for req in requirements:
    req = req.strip()
    if req and not req.startswith('#'):
        package_info = req.split('==')
        if len(package_info) == 2:
            package_name, package_version = package_info
            print(f"  {package_name}: {package_version}")
        else:
            # Entries without "==", such as editable (-e) or "name @ URL" installs
            print(f"  {req}")
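
Requirements files written by hand often use operators other than `==`. The sketch below extends the same idea to the common comparison operators. It handles only simple `name<operator>version` lines and is not a full parser for the requirements format; `PIN_PATTERN` and `parse_requirement` are hypothetical names, and the sample version numbers are made up.

```python
import re
from typing import Optional, Tuple

# Matches "name<operator>version" for the common comparison operators.
PIN_PATTERN = re.compile(r"^([A-Za-z0-9._-]+)\s*(==|>=|<=|~=|!=|<|>)\s*([^\s;,]+)")


def parse_requirement(line: str) -> Optional[Tuple[str, str, str]]:
    """Split a simple requirement line into (name, operator, version)."""
    line = line.split("#", 1)[0].strip()  # drop comments and whitespace
    match = PIN_PATTERN.match(line)
    return match.groups() if match else None


# Illustrative lines; the version numbers are made up for the example.
sample = ["pandas==2.0.0", "scikit-learn>=1.0.0  # range", "# comment only", ""]
for entry in sample:
    print(repr(entry), "->", parse_requirement(entry))
```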

---

## Exercise 4: Testing Environment Reproducibility

Let's test if we can recreate our environment from scratch using only the requirements.txt file.

### Task 4.1: Create a New Environment

Let's create a new virtual environment and try to install our dependencies from requirements.txt.

In [None]:
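# Create a new virtual environment for testing
!python -m venv test_env

# `!` commands do not raise on failure, so confirm the environment exists
import os
created = os.path.isfile(os.path.join("test_env", "pyvenv.cfg"))
print("Test environment created successfully!" if created else "Creation failed; check the output above.")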

### Task 4.2: Install Dependencies from requirements.txt

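Next, install the dependencies into the new environment from the requirements.txt file.

Note: The cell below only prints the commands and the file contents; it does not install anything. To perform the installation, activate `test_env` in a terminal and run the commands yourself.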

In [None]:
# In a real scenario, you would activate the test environment first
# For this notebook, we'll simulate the installation

print("To install dependencies in the test environment, you would run:")
print("  source test_env/bin/activate  # On Linux/macOS")
print("  test_env\\Scripts\\activate.bat  # On Windows (Command Prompt)")
print("  pip install -r requirements.txt")

# For demonstration purposes, let's just show what packages would be installed
with open('requirements.txt', 'r') as f:
    requirements = f.read()
    print("\nPackages that would be installed:")
    print(requirements)
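
To perform the installation from Python rather than from an activated terminal, one option is to call the test environment's own interpreter with `-m pip`. The sketch below assumes the default `venv` directory layout and uses a hypothetical helper, `install_requirements`, which only builds the command unless `dry_run=False` is passed.

```python
import os
import subprocess
from pathlib import Path
from typing import List


def install_requirements(
    env_dir: str, requirements: str = "requirements.txt", dry_run: bool = True
) -> List[str]:
    """Build (and optionally run) the pip command that installs into env_dir."""
    # venv places executables in Scripts\ on Windows and bin/ elsewhere.
    subdir, exe = ("Scripts", "python.exe") if os.name == "nt" else ("bin", "python")
    python = Path(env_dir) / subdir / exe
    cmd = [str(python), "-m", "pip", "install", "-r", requirements]
    if not dry_run:
        # check=True raises CalledProcessError if pip exits with an error.
        subprocess.run(cmd, check=True)
    return cmd


print("Would run:", " ".join(install_requirements("test_env")))
```

Calling `install_requirements("test_env", dry_run=False)` runs the command; it requires `test_env` and `requirements.txt` to exist.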

### Task 4.3: Verify Installation

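The cell below prints the commands you would run inside the activated test environment, then runs the same checks in the current environment for demonstration.

Note: Only the commands run inside `test_env` confirm that the environment was reproduced; the check in the current environment just shows what the output looks like.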

In [None]:
# In a real scenario, you would run these commands in the activated test environment
print("To verify installation in the test environment, you would run:")
print("  python -c \"import pandas; print('Pandas version:', pandas.__version__)\"")
print("  python -c \"import sklearn; print('Scikit-learn version:', sklearn.__version__)\"")
print("  python -c \"import matplotlib; print('Matplotlib version:', matplotlib.__version__)\"")

# For demonstration purposes, let's verify in our current environment
print("\nVerification in current environment:")
try:
    import pandas as pd
    print(f"  Pandas version: {pd.__version__}")
except ImportError:
    print("  Pandas not available")

try:
    import sklearn
    print(f"  Scikit-learn version: {sklearn.__version__}")
except ImportError:
    print("  Scikit-learn not available")

try:
    import matplotlib
    print(f"  Matplotlib version: {matplotlib.__version__}")
except ImportError:
    print("  Matplotlib not available")
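
A stricter check compares each pinned version in requirements.txt with what is installed. The sketch below is one way to do that with `importlib.metadata`; it reports on whichever interpreter runs it, so it should be run with the interpreter of the environment being verified. `find_mismatches` is a hypothetical helper, and the sample pin is deliberately a package that should not exist.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable


def find_mismatches(lines: Iterable[str]) -> Dict[str, str]:
    """Compare `name==version` lines with what is installed in this interpreter."""
    problems = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip blanks, comments, and non-pinned entries
        name, pinned = line.split("==", 1)
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems[name] = f"pinned {pinned}, not installed"
            continue
        if installed != pinned:
            problems[name] = f"pinned {pinned}, installed {installed}"
    return problems


# Hypothetical pin for a package that should not exist anywhere.
print(find_mismatches(["not-a-real-package-xyz==1.0.0", "# a comment"]))
```

Passing the lines of the real requirements.txt and getting an empty dictionary back indicates that every pinned package is installed at the pinned version.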

---

## Exercise 5: Practice Challenges

Now it's your turn to apply what you've learned. Try to complete the following challenges:

### Challenge 5.1: Create a Project Setup Script

Create a shell script that automates the entire environment setup process:

1. Create a virtual environment
2. Activate it
3. Upgrade pip
4. Install dependencies from requirements.txt (if it exists)
5. If requirements.txt doesn't exist, install a default set of packages

Bonus: Make the script work on both Linux/macOS and Windows.

In [None]:
# TODO: Create a project setup script
#
# Goal:
# - Create a script that a teammate can run on a fresh clone/folder.
# - It should create a venv, upgrade pip, and install dependencies.
#
# Hint: start with a minimal script, then add OS detection if you want.
setup_script = ""  # TODO

print("TODO: write a setup script string, then save it to setup_project.sh")

### Challenge 5.2: Dependency Management Best Practices

Research and implement best practices for dependency management:

1. Create separate requirements files for different environments (development, testing, production)
2. Use version ranges instead of exact versions for some packages
3. Add comments to your requirements.txt to explain why certain packages are needed
4. Regularly update dependencies and check for security vulnerabilities

In [None]:
# TODO: Create separate requirements files for different environments
#
# Goal:
# - requirements-dev.txt (dev tools like pytest, jupyter)
# - requirements-prod.txt (only runtime deps)
#
# Try this yourself first; a reference solution is in the appendix.
print("TODO: create requirements-dev.txt and requirements-prod.txt")

---

## Summary and Key Takeaways

### Concepts Practiced

- **Virtual Environments**: Isolated Python environments for project-specific dependencies
- **Dependency Management**: Using pip and requirements.txt for reproducible environments
- **Environment Reproducibility**: Ensuring consistent environments across different machines
- **Automation**: Creating scripts to automate environment setup

### Best Practices

- ✅ Always use virtual environments for Python projects
- ✅ Record exact dependency versions in requirements.txt
- ✅ Regularly update and audit dependencies
- ✅ Use separate requirements files for different environments
- ✅ Test environment reproducibility regularly

### Next Steps

- Practice creating more complex environment setup scripts
- Learn about alternative dependency management tools like conda or poetry
- Explore tools for dependency security scanning
- Investigate containerization with Docker for even more reproducible environments

---

## Appendix: Solutions (peek only after trying)

Reference implementations for the two challenge tasks.

In [None]:
# Solution for Challenge 5.1
setup_script = '''\
#!/bin/bash

set -euo pipefail

echo "Setting up project environment..."

python -m venv .venv

# shellcheck disable=SC1091
source .venv/bin/activate

python -m pip install --upgrade pip

if [ -f requirements.txt ]; then
  echo "Installing dependencies from requirements.txt..."
  pip install -r requirements.txt
else
  echo "Installing default packages..."
  pip install pandas scikit-learn matplotlib
fi

echo "Environment setup complete"
'''

with open('setup_project.sh', 'w', encoding='utf-8') as f:
    f.write(setup_script)

print("Wrote setup_project.sh")


# Solution for Challenge 5.2

# Development requirements
# Runtime packages plus tools used only during development (for example jupyter and pytest)

dev_requirements = '''\
# Development requirements
pandas>=1.3.0
scikit-learn>=1.0.0
matplotlib>=3.4.0
jupyter>=1.0.0
pytest>=6.2.0
'''

# Production requirements
# These packages are needed for production deployment

prod_requirements = '''\
# Production requirements
pandas>=1.3.0
scikit-learn>=1.0.0
'''

with open('requirements-dev.txt', 'w', encoding='utf-8') as f:
    f.write(dev_requirements)

with open('requirements-prod.txt', 'w', encoding='utf-8') as f:
    f.write(prod_requirements)

print("Wrote requirements-dev.txt and requirements-prod.txt")
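
An alternative layout avoids repeating the runtime packages: pip accepts a `-r <file>` line inside a requirements file, so the development file can include the production file and add only the extra tools. A hedged sketch that writes both files into a temporary directory so nothing from the solution above is overwritten (`write_layered_requirements` is an illustrative name; the version bounds are copied from the solution):

```python
import tempfile
from pathlib import Path

# Illustrative file contents; the version bounds mirror the solution above.
prod = "pandas>=1.3.0\nscikit-learn>=1.0.0\n"
# "-r <file>" tells pip to also install everything listed in that file.
dev = "-r requirements-prod.txt\nmatplotlib>=3.4.0\njupyter>=1.0.0\npytest>=6.2.0\n"


def write_layered_requirements(target_dir: Path) -> None:
    """Write a production file and a development file that includes it."""
    (target_dir / "requirements-prod.txt").write_text(prod, encoding="utf-8")
    (target_dir / "requirements-dev.txt").write_text(dev, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    write_layered_requirements(Path(tmp))
    print((Path(tmp) / "requirements-dev.txt").read_text(encoding="utf-8"))
```

With this layout, `pip install -r requirements-dev.txt` installs the production packages as well, and a change to a runtime bound only needs to be made in one file.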
