# Virtual Environments & Dependency Management

## Why Use a Virtual Environment?

A **virtual environment** is an isolated Python environment that allows you to:
- Keep project dependencies separate from system-wide packages
- Avoid version conflicts between different projects
- Make projects more reproducible for others

Using virtual environments is a **best practice** in Python development, especially when working on AI, machine learning, and data science projects.

---


## Step 1: Checking Python Version
Before creating a virtual environment, check if Python is installed and which version you are using. Open a terminal and execute the following command:

```bash
python --version
```

Most students are probably working with Python 3.10 or higher by now. The newest Python release is sometimes not yet fully supported by all packages.
We will work with PyTorch 2.10, which works with Python 3.10 through 3.13. Do not use Python 3.14 for now.

If you run into problems because you have a different Python or PyTorch version, check the compatibility matrix at the link below to find out which version you need:

[https://github.com/pytorch/pytorch/blob/main/RELEASE.md](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
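
The version check can also be done from Python itself. The snippet below is a minimal sketch that compares the running interpreter against the 3.10 to 3.13 range stated above; the helper name `is_supported_python` and the two constants are illustrative, not part of any library.

```python
import sys

# Supported range taken from the text above: Python 3.10 through 3.13.
MIN_SUPPORTED = (3, 10)
MAX_SUPPORTED = (3, 13)


def is_supported_python(version=None):
    """Return True if the (major, minor) version lies within the supported range."""
    major_minor = tuple(version or sys.version_info)[:2]
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "outside the supported range"
    print(f"Python {current} is {status} for this course.")
```

If the range changes for a later PyTorch release, only the two constants need updating.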

---


## Step 2: Creating a Virtual Environment
You can create a virtual environment using the built-in `venv` module. These instructions assume you're working in the terminal, but as you may recall, you can also create a virtual environment through the VS Code interface.

### Windows:
```bash
python -m venv venv
```

### macOS/Linux:
```bash
python3 -m venv venv
```


This creates a folder called `venv/` that contains a clean Python installation. You can change `venv` in the examples above to any other name. Here, for example, it might make sense to name it `ai-venv`. You can also put a `.` in front of the name (`.venv` or `.ai-venv`) to create a hidden folder, which some developers prefer because it keeps the project folder cleaner.
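
The same `venv` module can be called from Python, which is roughly what `python -m venv` does behind the scenes. This is a minimal sketch; the helper name `create_environment` and the folder name `.ai-venv` are only illustrative, and the demo writes to a temporary directory so it does not touch your project.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its path."""
    env_path = Path(env_dir)
    # with_pip=True bootstraps pip into the new environment,
    # which is what `python -m venv` does by default.
    venv.EnvBuilder(with_pip=with_pip).create(env_path)
    return env_path


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so no project folder is touched.
    # with_pip=False keeps the demo fast; a real environment should keep pip.
    with tempfile.TemporaryDirectory() as tmp:
        env = create_environment(Path(tmp) / ".ai-venv", with_pip=False)
        print("Created:", sorted(p.name for p in env.iterdir()))
```

The created folder contains a `pyvenv.cfg` file, which is how Python recognises it as a virtual environment.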

---


## Step 3: Activating the Virtual Environment
Once created, the environment must be **activated**. VS Code may offer to activate it for you automatically, but the terminal then does not always show that you are working in the venv, so I prefer to activate it myself.

### Windows (Command Prompt):
```bash
venv\Scripts\activate
```

### Windows (PowerShell):
```bash
venv\Scripts\Activate.ps1
```

### macOS/Linux:
```bash
source venv/bin/activate
```

After activation, you should see `(venv)` at the beginning of your terminal prompt, indicating the virtual environment is active.
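
If the prompt does not show `(venv)` and you want to be sure, you can ask Python directly. This is a small sketch; `in_virtualenv` is an illustrative helper name, while `sys.prefix`, `sys.base_prefix`, and the `VIRTUAL_ENV` variable are standard.

```python
import os
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtualenv())
    print("Interpreter:", sys.executable)
    # The activate scripts set VIRTUAL_ENV; it is absent when nothing is activated.
    print("VIRTUAL_ENV:", os.environ.get("VIRTUAL_ENV", "Not set"))
```

This is also a quick way to confirm which interpreter a notebook kernel is actually using.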

---


## Step 4: Installing Dependencies from `requirements.txt`
This part is likely new for many students. It is how virtual environments are typically used in real projects where several people work on the same code. A `requirements.txt` file lists the packages the project needs. If a project has one, you can install everything it lists using:

```bash
pip install -r requirements.txt
```

To check installed packages (in a notebook cell, prefix the command with `!`):
```bash
pip list
```
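
The same information is available from within Python through the standard `importlib.metadata` module. The sketch below is one way to list what is installed in the environment the interpreter is running in; `installed_packages` is an illustrative helper name.

```python
from importlib import metadata


def installed_packages():
    """Return {package name: version} for the current environment."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken or missing metadata
            packages[name] = dist.version
    return packages


if __name__ == "__main__":
    listing = sorted(installed_packages().items(), key=lambda item: item[0].lower())
    for name, version in listing:
        print(f"{name}=={version}")
```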

### Updates to Dependencies

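If new packages are needed later, we will push a new `requirements.txt` file. Re-run the install with `--upgrade`: pip installs the packages that are missing and upgrades the listed packages to the newest version the file allows, while packages that are already up to date are left alone: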
```bash
pip install -r requirements.txt --upgrade
```
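
To see in advance which listed packages are not installed yet, a short script can compare the file against the environment. This is a simplified sketch: the helper names are illustrative, and the parsing only handles plain lines such as `torch>=2.0` or `numpy==2.1.0`, not URLs or other advanced requirement syntax.

```python
import re
from importlib import metadata
from pathlib import Path

# Leading package name of a requirement line such as "torch>=2.0" or "numpy==2.1.0".
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def requirement_names(lines):
    """Extract package names from requirements.txt lines (simplified parsing)."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options like --index-url
            continue
        match = _NAME.match(line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")  # assumed to sit in the current folder
    if req_file.exists():
        names = requirement_names(req_file.read_text().splitlines())
        print("Not installed yet:", missing_packages(names) or "nothing")
```

The script only reports what is missing; it does not check whether installed versions match the file, which pip does during installation.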

### More Refined Approaches

This is a helpful way to ensure that everyone has the necessary packages. In an internship or job you may encounter more refined approaches, such as the (simpler) `pip-tools` or the (more advanced) `poetry`, which help manage dependencies.

---
## Step 5: Installing the Correct Version of PyTorch

### Hardware

PyTorch can run on a dedicated NVIDIA GPU, a dedicated AMD GPU, or on your CPU (which is much slower).
The first step is to find out which GPU you have, and then to install either the CUDA version suited to your device (NVIDIA) or the ROCm platform (AMD).
Afterwards, you can install PyTorch. Go to [https://pytorch.org/get-started/locally](https://pytorch.org/get-started/locally) and select your platform (Linux/Windows) and your compute option (CUDA version, ROCm, or CPU). The page gives you a command to run inside your virtual environment to install torch, such as `pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu126`.
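
The install commands in this section all follow the same pattern: the package names plus an index URL that ends in a tag for the compute platform. The sketch below only builds and prints such a command; `pytorch_install_command` is an illustrative helper, and which tags actually exist for a given PyTorch release is an assumption you should confirm on the selector page.

```python
import sys

# Base URL taken from the example commands in this section.
WHEEL_INDEX = "https://download.pytorch.org/whl"


def pytorch_install_command(compute_tag, packages=("torch", "torchvision")):
    """Build the pip command for a given wheel tag, e.g. "cu126", "cu128" or "cpu"."""
    return [
        sys.executable, "-m", "pip", "install", *packages,
        "--index-url", f"{WHEEL_INDEX}/{compute_tag}",
    ]


if __name__ == "__main__":
    # Only prints the command; copy it or pass the list to subprocess.run().
    print(" ".join(pytorch_install_command("cu126")))
```

Using `sys.executable -m pip` instead of a bare `pip` makes sure the packages land in the environment of the interpreter that runs the script.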



#### Dedicated NVIDIA GPU
* Install CUDA: If you haven't already, install the CUDA toolkit, which provides the drivers and libraries needed to use NVIDIA GPUs with PyTorch. You can download it from the [NVIDIA website](https://developer.nvidia.com/cuda-toolkit). First find out which CUDA version is suited to your device.

* Other things I needed to do to get CUDA working:
    * run `pip uninstall torch torchvision torchaudio`: removes the CPU-only version of PyTorch
    * run `wmic path win32_VideoController get name` (Windows): shows which GPU model you have
    * run `nvidia-smi`: shows your driver version, CUDA version, GPU model, and current GPU utilization
    * run `pip install nvidia-pyindex --use-pep517 --no-cache-dir` or `pip install nvidia-cuda-runtime-cu12`: I had issues with pyindex, so I tried these install options. Here `cu12` is specific to my CUDA version.
    * run `pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128`: this is specific to my CUDA version; your link will depend on your `nvidia-smi` output

* Run the script below to check if you have CUDA available:


---


```python
import os
import sys

import torch

print("System Information:")
print("Python version:", sys.version)
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())

try:
    print("\nCUDA Details:")
    print("CUDA version:", torch.version.cuda)
    print("Number of CUDA devices:", torch.cuda.device_count())

    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            print(f"\nDevice {i} Details:")
            print("Device name:", torch.cuda.get_device_name(i))
            print("Device properties:", torch.cuda.get_device_properties(i))
except Exception as e:
    print("Error retrieving CUDA information:", str(e))

print("\nEnvironment Checks:")
print("CUDA_HOME:", os.environ.get('CUDA_HOME', 'Not set'))
print("PATH environment variable contains CUDA paths:",
      any('cuda' in path.lower() for path in os.environ.get('PATH', '').split(os.pathsep)))
```

Output:

```
System Information:
Python version: 3.13.12 (tags/v3.13.12:1cbe481, Feb  3 2026, 18:22:25) [MSC v.1944 64 bit (AMD64)]
PyTorch version: 2.10.0+cpu
CUDA available: False

CUDA Details:
CUDA version: None
Number of CUDA devices: 0

Environment Checks:
CUDA_HOME: Not set
PATH environment variable contains CUDA paths: False
```
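
The `nvidia-smi` step from the list above can also be scripted, which is handy when PyTorch reports no CUDA device and you want to know whether the driver sees the GPU at all. This is a minimal sketch; `query_nvidia_gpus` is an illustrative helper, and it returns `None` on machines where `nvidia-smi` is not installed or cannot reach a GPU.

```python
import shutil
import subprocess


def query_nvidia_gpus():
    """Return one "name, driver version" line per GPU, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA driver tools on PATH
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return None  # the tool exists but could not talk to a GPU or driver
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = query_nvidia_gpus()
    if gpus is None:
        print("nvidia-smi is not available; no NVIDIA driver was detected.")
    else:
        for line in gpus:
            print("GPU:", line)
```

If this script lists your GPU but the check above still prints `CUDA available: False`, the installed PyTorch build is most likely the CPU-only one.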


#### Dedicated AMD GPU
* Install ROCm: Install the ROCm platform, which is AMD's equivalent of CUDA. You can find installation instructions on the [AMD website](https://www.amd.com/en/products/software/rocm.html).
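* Set PyTorch to use ROCm: ROCm builds of PyTorch reuse the `cuda` device name, so the same code selects the AMD GPU:

```python
import torch

device = torch.device("cuda")  # On a ROCm build, "cuda" refers to the AMD GPU
print("Using device:", device)
```

Output:

```
Using device: cuda
```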


#### Integrated GPU/Only CPU

* Check for CUDA/ROCm support: Some integrated GPUs have limited support for CUDA or ROCm. Check the specifications of your integrated GPU and the PyTorch documentation to see whether it is supported.
* Install drivers: If your integrated GPU supports CUDA or ROCm, install the appropriate drivers and libraries.
* Use the same code as for dedicated GPUs: If your integrated GPU is supported, you can move the model and data to the GPU with the same code as for dedicated GPUs. Keep in mind that integrated GPUs typically have less memory and processing power than dedicated GPUs, so training may be slower.
* CPU only: If no supported GPU is available, PyTorch runs on the CPU, which works but is much slower.
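
A common way to write code that works on all of these setups is to pick the device at runtime and fall back to the CPU. The sketch below assumes nothing beyond `torch.cuda.is_available()`; the helper name `pick_device_name` is illustrative, and the `try`/`except` around the import is only there so the snippet also runs where PyTorch is not installed yet.

```python
try:
    import torch
except ImportError:  # PyTorch is not installed in this environment
    torch = None


def pick_device_name():
    """Return "cuda" when PyTorch can see a CUDA/ROCm GPU, otherwise "cpu"."""
    if torch is not None and torch.cuda.is_available():
        return "cuda"
    return "cpu"


if __name__ == "__main__":
    name = pick_device_name()
    print("Using device:", name)
    if torch is not None:
        device = torch.device(name)
        # Tensors and models are moved to the chosen device with .to(device).
        x = torch.ones(2, 2).to(device)
        print("Tensor lives on:", x.device)
```

With this pattern the same notebook runs unchanged on a laptop without a GPU and on a machine with one.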

### Considerations
* GPU Memory: Be mindful of GPU memory limits, especially when working with large datasets or complex models. You may need to reduce the batch size or use techniques such as gradient accumulation so that each step fits into GPU memory.
* Mixed Precision Training: Consider using mixed precision training (`torch.amp`; older code uses `torch.cuda.amp`) to potentially speed up training on NVIDIA GPUs.
* Multiple GPUs: If you have multiple GPUs, you can use PyTorch's `torch.nn.DataParallel` or `torch.nn.parallel.DistributedDataParallel` to distribute the training workload across them.
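
Gradient accumulation, mentioned in the first point, splits one large batch into several micro-batches that each fit in memory and sums their gradients before updating the weights. The sketch below uses plain Python and a hypothetical one-parameter model (`w * x` with mean squared error), not PyTorch, to show why each micro-batch gradient is scaled by the number of accumulation steps: the accumulated value then matches the full-batch gradient. All function names are illustrative.

```python
def full_batch_gradient(w, xs, ys):
    """Gradient of mean((w*x - y)**2) with respect to w over the whole batch."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_gradient(w, xs, ys, micro_batch_size):
    """Sum micro-batch gradients, each scaled by 1 / number of micro-batches."""
    # Assumes len(xs) is an exact multiple of micro_batch_size.
    n_micro = len(xs) // micro_batch_size
    total = 0.0
    for k in range(n_micro):
        start, stop = k * micro_batch_size, (k + 1) * micro_batch_size
        total += full_batch_gradient(w, xs[start:stop], ys[start:stop]) / n_micro
    return total


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0, 4.0, 6.0, 8.0]
    print("Full batch:       ", full_batch_gradient(0.5, xs, ys))
    print("Two micro-batches:", accumulated_gradient(0.5, xs, ys, micro_batch_size=2))
```

In a PyTorch training loop the corresponding steps are typically dividing the loss by the number of accumulation steps before calling `backward()`, and calling `optimizer.step()` only after the last micro-batch.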


## Step 6: Deactivating the Virtual Environment
When you're done working, deactivate the virtual environment:
```bash
deactivate
```

---



## Step 7: Adding New Dependencies
For the course material, you shouldn't need to update the `requirements.txt` file yourself. For your AI project, however, you may find it convenient to work in a similar way. If you install additional packages, update `requirements.txt` so others can reproduce your setup:

```bash
pip freeze > requirements.txt
```

This overwrites the file with the packages installed in the active environment and their exact versions.

---



## Summary
- **Create** a virtual environment: `python -m venv venv` (Windows) or `python3 -m venv venv` (macOS/Linux)
- **Activate** it: `source venv/bin/activate` (macOS/Linux) or `venv\Scripts\Activate.ps1` (Windows PowerShell)
- **Install dependencies**: `pip install -r requirements.txt`
- **Get PyTorch and your GPU working**: this depends on your hardware, so there is no one-line summary; follow Step 5 and the selector at [https://pytorch.org/get-started/locally](https://pytorch.org/get-started/locally)
- **Deactivate** when finished: `deactivate`
- **Update dependencies**: `pip freeze > requirements.txt`

Using a virtual environment, and especially a dependency management tool, helps ensure **reproducibility** and prevents dependency conflicts in Python projects.

---
