# 🚀 Notebook 01: Environment Setup & Verification

**Phase 1: Foundations**  
**Goal:** To build and verify a robust, professional development environment—the essential first step for any serious AI practitioner.

---

## 📋 Objectives

A correct setup prevents countless hours of debugging and ensures your work is reproducible. By the end of this notebook, you will have:
1.  **Verified Your Python Environment:** Confirmed that you are using a compatible version of Python, the language of modern AI.
2.  **Confirmed Core Libraries:** Ensured that all essential data science and machine learning packages are installed and ready.
3.  **Inspected Your Hardware:** Checked for GPU availability, which is critical for accelerating deep learning tasks.
4.  **Mastered Jupyter Essentials:** Learned key commands and shortcuts to make your interactive development workflow fast and efficient.
5.  **Configured Git for Version Control:** Set up the industry-standard tool for tracking changes and collaborating on code.

**Estimated Time:** 30-60 minutes

---

## 📚 Why This Matters

In data science and AI, your **environment is everything**. An inconsistent or poorly configured setup leads to errors, non-reproducible results, and immense frustration. This initial verification step ensures that the foundation of your "digital lab" is solid, so you can focus on learning and building, confident that your tools will work as expected.

**Let's begin by ensuring our foundation is rock-solid! 🎯**

## 1️⃣ Step 1: Python Environment Verification

First, let's confirm that your Python environment is set up correctly. The following code cell will print key details about your Python installation, including its version, the path to the executable, and the underlying operating system.

**Why this matters:** A successful AI project starts with a compatible Python version (this course requires **3.11 or newer**). Different versions can have subtle differences in library support and behavior. This check ensures you won't run into unexpected compatibility issues later on. It also confirms you are using the Python interpreter you intended, especially if you have multiple versions installed.

In [None]:
import sys
import platform

# --- Python Environment Information ---
print("🐍 Python Environment Information")
print("=" * 50)
print(f"✅ Python Version: {sys.version}")
print(f"✅ Executable Path: {sys.executable}")
print(f"✅ Platform: {platform.platform()}")
print(f"✅ Architecture: {platform.machine()}")
print("=" * 50)

# --- Compatibility Check ---
# We verify that the Python version is 3.11 or higher, which is required for many modern libraries
# and syntax features used in this course.
# Comparing against a tuple checks major and minor together, so a future 4.0 would also pass.
version_info = sys.version_info
if version_info >= (3, 11):
    print("\n🎉 Success! Your Python version is compatible (3.11+).")
else:
    print("\n❌ Action Required: Please upgrade your Python to version 3.11 or higher.")
    print("   - If using Conda: `conda install python=3.11`")
    print("   - If using Homebrew (macOS): `brew upgrade python`")

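A version check like this reduces to a tuple comparison, because `sys.version_info` compares element by element like an ordinary tuple. The following is a minimal sketch; `meets_minimum` is a hypothetical helper name introduced only for illustration and is not part of the course code.

```python
import sys


def meets_minimum(version, minimum=(3, 11)):
    """Return True if `version` is at least `minimum`.

    Hypothetical helper for illustration. Both arguments are tuple-like,
    such as (3, 11); only the major and minor fields are compared.
    """
    return tuple(version[:2]) >= tuple(minimum)


# Checking the fields separately (major >= 3 and minor >= 11) would wrongly
# reject a version such as (4, 0); the tuple comparison does not.
print(meets_minimum((3, 10)))           # False
print(meets_minimum((3, 12)))           # True
print(meets_minimum((4, 0)))            # True
print(meets_minimum(sys.version_info))  # depends on your interpreter
```
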
## 2️⃣ Step 2: Core Package Verification

Next, we'll verify that all the essential Python packages for data science and machine learning are installed. These libraries form the bedrock of our toolkit for the entire program. We will dynamically import each one and check its version.

**Why this matters:** Just like a chef needs their knives, an AI engineer needs their libraries. This step ensures your environment is equipped with the correct tools for:
- **Numerical Computing (`numpy`):** For high-performance array operations.
- **Data Manipulation (`pandas`):** For working with structured data in DataFrames.
- **Visualization (`matplotlib`, `seaborn`):** For plotting and exploring data.
- **Machine Learning (`scikit-learn`):** For classical ML algorithms and tools.
- **Deep Learning (`torch`):** The foundational framework for building neural networks.

Ensuring these are installed correctly now prevents `ImportError` messages in future notebooks.

In [None]:
import importlib
import subprocess
import sys  # Needed below for sys.executable when installing missing packages

# --- List of Critical Packages ---
# These packages are essential for the foundational modules of the course.
packages_to_check = [
    "numpy",          # For numerical operations
    "pandas",         # For data manipulation and analysis
    "matplotlib",     # For plotting and visualizations
    "seaborn",        # For statistical data visualization
    "scikit-learn",   # For machine learning algorithms
    "jupyter",        # For interactive notebooks
    "torch",          # For deep learning (used in later modules)
    "langchain",      # For building LLM applications
    "transformers",   # For state-of-the-art NLP models
]

print("📦 Checking Core Package Installations...")
print("=" * 50)

missing_packages = []
installed_packages = []

# --- Verification Loop ---
for package in packages_to_check:
    try:
        # Dynamically import the module to check for its existence.
        # We handle the special case where 'scikit-learn' is imported as 'sklearn'.
        module_name = "sklearn" if package == "scikit-learn" else package
        module = importlib.import_module(module_name)
        
        # Retrieve the version number from the imported module.
        version = getattr(module, "__version__", "version not found")
        print(f"✅ {package:<15} | Installed (Version: {version})")
        installed_packages.append(package)
        
    except ImportError:
        print(f"❌ {package:<15} | Not Installed")
        missing_packages.append(package)

# --- Summary and Action Items ---
print("=" * 50)
print(f"\n📊 Summary: {len(installed_packages)} out of {len(packages_to_check)} essential packages are installed.")

if missing_packages:
    print(f"\n⚠️ Action Required: The following packages are missing: {', '.join(missing_packages)}")
    print("   Attempting to install them now...")
    
    # Construct the pip install command
    command = [sys.executable, "-m", "pip", "install"] + missing_packages + ["--quiet"]
    
    # Execute the command
    try:
        subprocess.check_call(command)
        print("\n✅ Successfully installed missing packages.")
        print("   Please rerun this cell to verify the installation.")
    except subprocess.CalledProcessError as e:
        print(f"\n❌ Error during installation: {e}")
        print("   Please try installing them manually by running this command in your terminal:")
        print(f"   {sys.executable} -m pip install {' '.join(missing_packages)}")
else:
    print("\n🎉 Success! All core packages are installed and ready to go.")

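Not every package exposes a `__version__` attribute, and the import name can differ from the name used with pip (`sklearn` versus `scikit-learn`). One alternative, sketched below, reads the version from installed package metadata with the standard-library `importlib.metadata` module, which takes the pip name and does not import the package. `installed_version` is a hypothetical helper name used only for this illustration.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed.

    Hypothetical helper for illustration. `distribution_name` is the name
    used with pip (e.g. "scikit-learn"), not the import name ("sklearn").
    """
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


for name in ["numpy", "scikit-learn", "torch"]:
    version = installed_version(name)
    status = version if version is not None else "not installed"
    print(f"{name:<15} | {status}")
```

Because nothing is imported, this lookup is fast, but it only confirms that the package metadata is present; an actual import, as in the cell above, remains the stronger test that a package works.
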
## 3️⃣ Step 3: GPU Availability Check

For deep learning tasks, having a Graphics Processing Unit (GPU) can accelerate model training by orders of magnitude. This step checks if your system has a CUDA-enabled GPU that PyTorch—our deep learning framework of choice—can recognize and use.

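**Why this matters:** While not strictly required for the initial modules, knowing your GPU status is crucial for the deep learning sections of this program. PyTorch keeps tensors and models on the CPU unless code explicitly moves them to a GPU, so everything still runs without one, only more slowly. Note that this check covers NVIDIA CUDA devices only; other accelerators, such as Apple-silicon GPUs, are not reported by it. Knowing which hardware you'll be using matters for training time in later weeks: a "T4" or "A100" GPU, common in cloud environments like Colab or Codespaces, can train deep learning models many times faster than a CPU (the often-quoted 10-50x range is a rough guide that depends on the model and batch size).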

In [None]:
import torch

print("🖥️  Hardware Check: GPU Availability")
print("=" * 50)

# --- Check for CUDA-enabled GPU ---
# PyTorch's `cuda.is_available()` is the standard and most reliable way to check for a usable GPU.
cuda_available = torch.cuda.is_available()
print(f"CUDA Support Detected: {cuda_available}")

if cuda_available:
    # If a GPU is found, print its details for confirmation.
    gpu_count = torch.cuda.device_count()
    gpu_name = torch.cuda.get_device_name(0)
    # Get memory properties and convert bytes to gigabytes (GB).
    gpu_memory = torch.cuda.get_device_properties(0).total_memory / (1024**3)
    
    print("✅ Success! GPU is available and configured.")
    print(f"   - GPU Name: {gpu_name}")
    print(f"   - Number of GPUs: {gpu_count}")
    print(f"   - Total GPU Memory: {gpu_memory:.2f} GB")
else:
    # If no CUDA GPU is found, inform the user that the CPU will be used for computations.
    print("⚠️ No CUDA GPU detected. Operations will run on the CPU.")
    print("   - Note: This is perfectly fine for the initial modules, but deep learning tasks in later weeks will be significantly slower.")
    print("   - Consider using a GPU-enabled environment (like a paid Google Colab or a GPU-powered GitHub Codespace) for Weeks 3 and beyond.")

print("=" * 50)

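Since `torch.cuda.is_available()` only reports CUDA devices, a common pattern is to choose a device string once and reuse it everywhere. The sketch below assumes you may also want to use Apple-silicon GPUs through PyTorch's MPS backend (present in PyTorch 1.12 and later); `pick_device` is a hypothetical helper name, and the fallback to `"cpu"` when PyTorch is missing is an illustrative choice.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch can use.

    Hypothetical helper for illustration. Falls back to "cpu" if PyTorch
    is not installed, so the function can be called in any environment.
    """
    try:
        import torch
    except ImportError:
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"

    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

The returned string can be passed to `torch.device(...)` or to `.to(...)` when moving tensors and models.
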
## 4️⃣ Step 4: Jupyter Notebook Essentials

Jupyter Notebooks are the primary tool for interactive data science and AI development. They allow you to mix executable code, explanatory text, images, and plots in a single document. Knowing a few key shortcuts and "magic commands" can dramatically improve your productivity.

**Why this matters:** Efficiency is key in an iterative process like model development. Mastering these commands will help you write, run, and debug code faster, allowing you to focus more on analysis and experimentation and less on manual clicking and waiting. It's the difference between a clunky workflow and a fluid one.

### 🎯 Essential Keyboard Shortcuts

Jupyter has two main modes for interacting with cells:
- **Command Mode** (blue cell border in the classic Notebook interface): For notebook-level actions. Press `Esc` to enter.
- **Edit Mode** (green cell border in the classic Notebook interface): For typing code or text into a cell. Press `Enter` to enter.

**Command Mode (Press `Esc`)**
-   `A`: Insert a new cell **A**bove the current one.
-   `B`: Insert a new cell **B**elow the current one.
-   `D, D` (press `D` twice): **D**elete the current cell.
-   `M`: Convert the current cell to **M**arkdown (for text).
-   `Y`: Convert the current cell back to code.
-   `Z`: Undo cell deletion.
-   `Shift + Enter`: Run the current cell and select the one below.
-   `Ctrl + Enter` (or `Cmd + Enter`): Run the current cell and stay on it.

**Edit Mode (Press `Enter`)**
-   `Tab`: Trigger code completion or indent the current line.
-   `Shift + Tab`: Show the docstring (documentation) for the function you're currently calling. A lifesaver!
-   `Ctrl + /` (or `Cmd + /` on Mac): Comment or uncomment the selected lines of code.

### 💡 Magic Commands

Magic commands, prefixed with `%` (line magic) or `%%` (cell magic), provide powerful extensions to the notebook's functionality. Here are a few of the most useful ones:

In [None]:
# --- Magic Command Examples ---

# %timeit: Measures the execution time of a single line of code by running it multiple times.
# This is great for quickly comparing the performance of different approaches.
print("Timing a simple sum operation with %timeit:")
%timeit sum(range(1000))
print("-" * 40)

# %time: Measures the execution time of a single statement, running it only once.
# Its cell-magic form, %%time, times an entire cell but must be the first line of
# that cell, so it cannot appear in the middle of this one.
print("\nTiming a single statement with %time:")
%time total = sum(range(1000))
print(f"The sum is {total}")
print("-" * 40)

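# !: Executes a shell command directly from the notebook.
# This is useful for file operations, package installations, or checking system status.
# `ls -l` assumes a Unix-like shell (macOS, Linux, Colab, Codespaces); on Windows, use `!dir` instead.
print("\nRunning a shell command to list files:")
!ls -l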
print("-" * 40)

# %matplotlib inline: Ensures Matplotlib plots are rendered directly within the notebook.
# This is a standard command to include at the top of most data science notebooks.
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

plt.style.use('seaborn-v0_8-whitegrid') # Use a nice style for the plot
fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(np.random.randn(50).cumsum(), marker='o', linestyle='--', color='b')
ax.set_title("A Matplotlib Plot Rendered Inline")
ax.set_xlabel("X-axis")
ax.set_ylabel("Y-axis")
plt.show()
print("-" * 40)

# %whos: Lists all variables currently in the namespace, along with their type and value/info.
# This is incredibly useful for debugging and keeping track of what's in memory.
print("\nListing all active variables with %whos:")
a_variable = 10
b_string = "hello world"
c_list = [1, 2, 3, 4, 5]
%whos

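Magic commands only work inside IPython and Jupyter. In a plain Python script, the standard-library `timeit` module offers a measurement similar to `%timeit`. This is a minimal sketch, with the loop and repeat counts chosen arbitrarily for illustration.

```python
import timeit

# Run the statement 1,000 times per measurement, and take 5 measurements.
durations = timeit.repeat("sum(range(1000))", number=1000, repeat=5)

# The minimum is usually the most stable estimate, since slower runs mostly
# reflect interference from other processes on the machine.
best_per_call = min(durations) / 1000
print(f"Best of 5: {best_per_call * 1e6:.2f} µs per call")
```
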
## 5️⃣ Step 5: Git Setup for Version Control

Version control is a non-negotiable skill in modern software development and data science. We use **Git** to track changes to our code, collaborate with others, and maintain a complete history of our projects. This final step ensures Git is installed and configured on your system.

**Why this matters:** Git is your safety net. It saves you from the chaos of manual file versioning (e.g., `project_final.ipynb`, `project_final_v2.ipynb`, `project_final_v2_actually_final.ipynb`). It allows you to experiment freely, knowing you can always revert to a previous working state if something breaks. For this course, it enables you to save your progress, pull updates to the course materials, and manage your own solutions effectively.

In [None]:
# --- Git Verification and Configuration ---

# 1. Check if Git is installed by checking its version.
# The `!` prefix allows us to run a shell command directly from a Jupyter cell.
print("Checking Git installation...")
!git --version
print("-" * 40)

# 2. (Optional but Recommended) Configure your Git user name and email.
#    If this is your first time using Git on this machine, you should configure it.
#    This information is embedded into every commit you make.
#
#    Uncomment and run the following lines after replacing the placeholder text
#    with your actual name and the email address associated with your GitHub account.

# print("\nConfiguring Git user (if not already set)...")
# !git config --global user.name "Your Name"
# !git config --global user.email "your.email@example.com"
# print("Git user configured successfully!")
# print("-" * 40)


# 3. Display the current Git configuration to verify.
#    This command lists all the current settings. Look for `user.name` and `user.email`.
print("\nCurrent Git configuration:")
!git config --list

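The `!git` commands above print their output but do not tell Python whether Git is installed or configured. If you want a check that can be used programmatically, one option is to query Git through `subprocess`. The sketch below is read-only and never changes your configuration; `git_identity` is a hypothetical helper name introduced for illustration.

```python
import shutil
import subprocess


def git_identity():
    """Return the global Git user.name and user.email, or None for unset values.

    Hypothetical helper for illustration; it only reads configuration.
    """
    identity = {"user.name": None, "user.email": None}
    if shutil.which("git") is None:
        return identity  # Git is not on PATH, so nothing can be read.

    for key in identity:
        result = subprocess.run(
            ["git", "config", "--global", "--get", key],
            capture_output=True,
            text=True,
        )
        # A non-zero exit code means the key is not set (or could not be read).
        value = result.stdout.strip()
        if result.returncode == 0 and value:
            identity[key] = value
    return identity


for key, value in git_identity().items():
    print(f"{key}: {value if value is not None else 'not set'}")
```
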
## 🎉 Congratulations! Your Environment is Ready!

You have successfully completed the environment setup and verification process. This is a critical first step, and you are now ready to dive into the core content of the course.

### Summary of Your Achievements:
-   ✅ **Python Verified**: Your Python installation is correct (3.11+) and ready for modern AI development.
-   ✅ **Packages Installed**: All essential data science and deep learning libraries are present and accounted for.
-   ✅ **Hardware Checked**: You know whether a GPU is available to accelerate your deep learning tasks in later weeks.
-   ✅ **Jupyter Mastered**: You are equipped with key shortcuts and magic commands to boost your productivity.
-   ✅ **Git Configured**: You are ready to use industry-standard version control for your projects.

### 📚 Next Steps

You are now fully prepared to begin your learning journey. The next notebook will dive into the fundamentals of the Python programming language, tailored specifically for data science and AI.

---

<div align="center" style="font-size: 1.2em; font-weight: bold;">
    <a href="./02_python_essentials.ipynb">Continue to Notebook 02: Python Essentials →</a>
</div>