# Notebook 00 - Environment Setup

**Purpose:** Verify the Python version, confirm dependencies are installed, check CPU-only execution, create the folder tree, and verify the TabPFN API key.

**Checkpoint:** No CUDA warnings on import.

## Cell 1: Verify Python 3.12

In [1]:
import sys

print(f"Python version: {sys.version}")
print(f"Python executable: {sys.executable}")

# Verify Python 3.12
assert sys.version_info[:2] == (3, 12), f"Python 3.12 required, got {sys.version_info[:2]}"
print("\n✅ Python 3.12 verified")

Python version: 3.12.3 (main, Aug 14 2025, 17:47:21) [GCC 13.3.0]
Python executable: /home/mikhailarutyunov/projects/time-series-flu/.venv/bin/python3

✅ Python 3.12 verified
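
The `assert` in the cell above is skipped when Python runs with `-O`. As a minimal sketch (the helper name `check_python` and its `actual` parameter are hypothetical, not part of the notebook), the same check can raise an explicit exception instead:

```python
import sys


def check_python(required=(3, 12), actual=None):
    """Raise RuntimeError unless the (major, minor) version equals `required`.

    `actual` defaults to the running interpreter; it is a hypothetical
    parameter that lets the check be exercised on any Python version.
    """
    actual = tuple(sys.version_info[:2]) if actual is None else tuple(actual)
    if actual != tuple(required):
        raise RuntimeError(
            f"Python {required[0]}.{required[1]} required, "
            f"got {actual[0]}.{actual[1]}"
        )
    return actual


# Exercise the check with an explicit version so the example runs anywhere.
print(check_python(actual=(3, 12)))
```

In the notebook itself the call would simply be `check_python()`, which inspects the running interpreter.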


## Cell 2: Install and Import Packages

In [2]:
# Note: Packages already installed via uv
# If needed, run: uv pip install ruptures statsmodels lightgbm scikit-learn pandas numpy matplotlib seaborn tabpfn-client python-dotenv
print("📦 Packages should be pre-installed via uv")

📦 Packages should be pre-installed via uv


In [3]:
# Import core packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import warnings

# Statistical packages
import ruptures
import statsmodels.api as sm
from statsmodels.tsa.deterministic import DeterministicProcess, Fourier
from lightgbm import LGBMRegressor

# TabPFN
import tabpfn_client
from dotenv import load_dotenv
import os

print("✅ All packages imported successfully")
print("\nPackage versions:")
print(f"  pandas: {pd.__version__}")
print(f"  numpy: {np.__version__}")
print(f"  statsmodels: {sm.__version__}")
print(f"  ruptures: {ruptures.__version__}")

✅ All packages imported successfully

Package versions:
  pandas: 2.3.1
  numpy: 2.3.3
  statsmodels: 0.14.5
  ruptures: v1.1.10
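
The cell above reports versions for four packages only. A possible way to cover every distribution named in the `uv pip install` comment, without importing each one, is `importlib.metadata`. This is a sketch: the helper `report_versions` and the `PACKAGES` list are illustrative names, not part of the notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in distributions:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names copied from the install comment in this notebook.
PACKAGES = [
    "ruptures", "statsmodels", "lightgbm", "scikit-learn", "pandas",
    "numpy", "matplotlib", "seaborn", "tabpfn-client", "python-dotenv",
]

for name, ver in report_versions(PACKAGES).items():
    print(f"  {name}: {ver or 'NOT INSTALLED'}")
```

Note that this looks up distribution names (`scikit-learn`, `python-dotenv`), which can differ from import names (`sklearn`, `dotenv`).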


In [4]:
# Pin NumPy's global random seed
np.random.seed(42)

# Suppress all Python warnings for cleaner output
# (note: this also hides any warnings raised by later imports)
warnings.filterwarnings('ignore')

print("✅ Random seeds pinned (np.random.seed=42)")

✅ Random seeds pinned (np.random.seed=42)
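
Because `warnings.filterwarnings('ignore')` silences everything, it could also hide a warning that matters for the checkpoint. One alternative, sketched here under the assumption that `FutureWarning` and `DeprecationWarning` are the noisy categories (the notebook does not say which ones are), is to filter by category:

```python
import warnings


def quiet_known_noise():
    """Ignore only selected categories; all other warnings stay visible."""
    # Assumed noisy categories; adjust to what the libraries actually emit.
    warnings.filterwarnings("ignore", category=FutureWarning)
    warnings.filterwarnings("ignore", category=DeprecationWarning)


# Demonstrate the effect inside an isolated, recorded warnings context.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # known starting state for the demo
    quiet_known_noise()
    warnings.warn("old API", FutureWarning)
    warnings.warn("something may be wrong", UserWarning)

print([w.category.__name__ for w in caught])  # ['UserWarning']
```

The `FutureWarning` is dropped while the `UserWarning` is still recorded, so unexpected warnings remain visible.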


In [5]:
# Verify CPU-only execution (no CUDA)
# torch is optional at this stage; it is installed later together with chronos (nb/03)
try:
    import torch
    if torch.cuda.is_available():
        print("⚠️  WARNING: CUDA is available but should NOT be used")
        print(f"   CUDA devices: {torch.cuda.device_count()}")
        print("   Force CPU usage in model configs")
    else:
        print("✅ CPU-only execution verified (no CUDA available)")
except ImportError:
    print("✅ PyTorch not installed yet (will install with chronos in nb/03)")

✅ PyTorch not installed yet (will install with chronos in nb/03)
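
The check above only reports whether CUDA is available. To actively keep later notebooks on the CPU, one common approach is to hide CUDA devices through the `CUDA_VISIBLE_DEVICES` environment variable before any CUDA-aware library is imported. The helper name `force_cpu_only` is hypothetical, and this is a sketch rather than the project's own configuration:

```python
import importlib.util
import os


def force_cpu_only():
    """Hide CUDA devices from libraries imported afterwards.

    Returns True if torch is installed (checked without importing it).
    Must run before torch initializes CUDA to have any effect.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = ""
    return importlib.util.find_spec("torch") is not None


torch_installed = force_cpu_only()
print(f"torch installed: {torch_installed}")
print(f"CUDA_VISIBLE_DEVICES={os.environ['CUDA_VISIBLE_DEVICES']!r}")
```

With the variable set to an empty string, a later `torch.cuda.is_available()` call should report no usable devices, even on a machine with a GPU.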


## Cell 3: Create Folder Tree and Verify TabPFN API

In [6]:
# Project root (assumes the notebook is run from a direct subdirectory of the project)
project_root = Path.cwd().parent

# Create directories
data_dir = project_root / "data"
results_dir = project_root / "results"
forecasts_dir = results_dir / "forecasts"
figures_dir = results_dir / "figures"

# Ensure all directories exist
data_dir.mkdir(parents=True, exist_ok=True)
forecasts_dir.mkdir(parents=True, exist_ok=True)
figures_dir.mkdir(parents=True, exist_ok=True)

print("✅ Directory structure verified:")
print(f"   Data: {data_dir}")
print(f"   Forecasts: {forecasts_dir}")
print(f"   Figures: {figures_dir}")

# Verify data file exists
data_file = data_dir / "NHSdata_dailypercentace_flupositive.csv"
if data_file.exists():
    print(f"\n✅ Data file found: {data_file.name}")
else:
    print(f"\n⚠️  WARNING: Data file not found: {data_file}")

✅ Directory structure verified:
   Data: /home/mikhailarutyunov/projects/time-series-flu/data
   Forecasts: /home/mikhailarutyunov/projects/time-series-flu/results/forecasts
   Figures: /home/mikhailarutyunov/projects/time-series-flu/results/figures

✅ Data file found: NHSdata_dailypercentace_flupositive.csv
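
`Path.cwd().parent` resolves to the project root only when the notebook is launched from a direct subdirectory. A more defensive option, sketched below, walks upward until a marker is found. The helper `find_project_root` and the choice of `data` as the marker are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def find_project_root(start, marker="data"):
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No parent of {start} contains {marker!r}")


# Demo on a throwaway tree that mimics a project with data/ and nb/ folders.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve() / "time-series-flu"
    (root / "data").mkdir(parents=True)
    (root / "nb").mkdir()
    found_matches = find_project_root(root / "nb") == root

print(found_matches)  # True
```

In the notebook, the call would be `find_project_root(Path.cwd())`, which works regardless of how deep the working directory sits inside the project.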


In [7]:
# Load and verify TabPFN API key
env_file = project_root / ".env"

if env_file.exists():
    load_dotenv(env_file)
    api_key = os.getenv("PRIORLABS_API_KEY")
    
    if api_key:
        # Set TabPFN access token (following examples/06 pattern)
        tabpfn_client.set_access_token(api_key)
        print("✅ TabPFN client initialized (API key loaded from .env)")
    else:
        print("⚠️  WARNING: PRIORLABS_API_KEY not found in .env file")
        print("   TabPFN will not be available in rolling forecast loop")
else:
    print("⚠️  WARNING: .env file not found")
    print("   TabPFN will not be available in rolling forecast loop")

✅ TabPFN client initialized (API key loaded from .env)
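
The key lookup can be separated from the TabPFN call so that it is easy to test and never prints the secret. The sketch below uses a hypothetical helper, `describe_api_key`, which treats an empty or whitespace-only value as missing and reports only the key length:

```python
import os


def describe_api_key(env=None, name="PRIORLABS_API_KEY"):
    """Return (key or None, status message) without exposing the key itself."""
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        return None, f"{name} is missing or empty"
    return key, f"{name} loaded ({len(key)} characters)"


# Demo with explicit dictionaries instead of the real environment.
print(describe_api_key({"PRIORLABS_API_KEY": "  "})[1])
print(describe_api_key({"PRIORLABS_API_KEY": "abc123"})[1])
```

After `load_dotenv(env_file)`, calling `describe_api_key()` with no arguments reads the real environment, and the returned key can then be passed to `tabpfn_client.set_access_token` as in the cell above.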


## Summary

Environment setup complete. All checkpoints passed:
- ✅ Python 3.12 verified
- ✅ Core packages installed and imported
- ✅ Random seeds pinned
- ✅ CPU-only execution (PyTorch not installed yet, so no CUDA warnings)
- ✅ Directory structure created
- ✅ Data file verified
- ✅ TabPFN API key loaded

Ready to proceed to Notebook 01 (Data Preparation).