## Section 1: System Requirements and Prerequisites

Before starting, ensure you have:

### Minimum Requirements:
- **OS:** Windows 10+, macOS 10.14+, or Linux (Ubuntu 18.04+)
- **Python Version:** 3.8 or higher (3.10 or newer preferred)
- **RAM:** 2 GB minimum (4 GB recommended)
- **Disk Space:** 2-3 GB for Python and libraries
- **Internet:** Required for downloading packages
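
The checks in the list above that Python's standard library can perform (interpreter version and free disk space) can be sketched as follows. The names `check_requirements`, `MIN_PYTHON`, and `MIN_FREE_GB` are illustrative assumptions, and the 3 GB threshold is simply the upper end of the disk estimate above. RAM is not checked because the standard library has no portable way to query it.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)   # minimum version from the requirements list
MIN_FREE_GB = 3       # assumption: upper end of the 2-3 GB disk estimate


def check_requirements(path="."):
    """Return requirement name -> bool for the checks the standard library supports."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "python_version": sys.version_info[:2] >= MIN_PYTHON,
        "disk_space": free_gb >= MIN_FREE_GB,
    }


for name, ok in check_requirements().items():
    print(f"{'OK  ' if ok else 'FAIL'} {name}")
```

A `FAIL` on `disk_space` refers to the drive that holds the current working directory; pass a different `path` to check another drive.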

### Recommended Setup:
- **RAM:** 8 GB or more
- **Disk Space:** 5+ GB (for datasets and future projects)
- **IDE:** VS Code, PyCharm, or Jupyter Notebook

### Prerequisite Knowledge:
- Basic command line/terminal usage
- Understanding of Python basics
- Familiarity with file systems

## Section 2: Install Python and Package Manager

### Step 1: Download Python

**All Platforms (Windows, macOS, Linux):**
1. Visit: https://www.python.org/downloads/
2. Download a recent stable release such as Python 3.11 or 3.12
3. Run the installer appropriate for your OS

**Windows Specific:**
- Run the `.exe` file
- ✓ **CHECK:** "Add Python to PATH" (IMPORTANT! Newer installers label this box "Add python.exe to PATH")
- Click "Install Now"

**macOS Specific:**
- Run the `.pkg` file
- Follow the installation wizard
- PATH is automatically configured

**Linux Specific (Debian/Ubuntu):**
```bash
sudo apt-get update
sudo apt-get install python3 python3-pip
```

### Step 2: Verify Python Installation

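Run the following in a Python session or notebook cell. From a terminal, `python --version` reports the same version (on macOS/Linux the command is usually `python3`):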

In [None]:
# Check Python version
import sys
print(f"Python Version: {sys.version}")
print(f"Python Executable: {sys.executable}")

**Expected Output:** a version string of 3.8 or higher (for example, 3.11.x), followed by the path to the Python executable

### Step 3: Verify pip (Python Package Manager)

Pip is used to install Python packages. Check if it's installed:

In [None]:
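import subprocess
import sys

# Ask the current interpreter's pip for its version
result = subprocess.run([sys.executable, "-m", "pip", "--version"], capture_output=True, text=True)
if result.returncode == 0:
    print(result.stdout.strip())
    print("pip is successfully installed!")
else:
    print(result.stderr.strip() or "pip was not found for this interpreter.")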

**Expected Output:** pip x.x.x from ... (python 3.x)
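
If you want the pip version as a value rather than printed text, the output format shown above can be parsed with a short regular expression. This is a minimal sketch: `parse_pip_version` and `installed_pip_version` are hypothetical helper names, and the path in `sample` is made up for illustration.

```python
import re
import subprocess
import sys

# Matches the leading "pip <version>" in the output of `pip --version`.
PIP_VERSION_RE = re.compile(r"^pip\s+(\S+)")


def parse_pip_version(output):
    """Return the version string from `pip --version` output, or None if it does not match."""
    match = PIP_VERSION_RE.match(output.strip())
    return match.group(1) if match else None


def installed_pip_version():
    """Ask the current interpreter's pip for its version; None if pip is unavailable."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        capture_output=True,
        text=True,
    )
    return parse_pip_version(result.stdout) if result.returncode == 0 else None


# Illustrative sample line; the path is not taken from a real machine.
sample = "pip 23.0.1 from /usr/lib/python3/site-packages/pip (python 3.11)"
print(parse_pip_version(sample))  # prints 23.0.1
print(installed_pip_version())
```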

## Section 3: Install Jupyter Notebook

### Installation Command:

Open Terminal/Command Prompt and run:

```bash
pip install jupyter notebook ipython
```

### Verify Jupyter Installation:

In [None]:
# Verify Jupyter installation
try:
    import jupyter
    import notebook
    print(f"✓ Jupyter installed")
    print(f"✓ Jupyter location: {jupyter.__file__}")
except ImportError as e:
    print(f"✗ Jupyter not found: {e}")
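
The import check above confirms that the modules load but does not report their versions. One way to get those, assuming Python 3.8 or higher, is `importlib.metadata`, which reads the installed package metadata without importing anything. The helper name `installed_version` is an assumption for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Distribution names as passed to `pip install` in the command above.
for name in ("jupyter", "notebook", "ipython"):
    found = installed_version(name)
    print(f"{name:<10} {found if found else 'not installed'}")
```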

### Launching Jupyter Notebook:

In Terminal/Command Prompt, navigate to your project folder and run:

```bash
jupyter notebook
```

This will:
1. Start a local Jupyter server (by default at http://localhost:8888)
2. Open your default web browser
3. Display a file browser interface
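
Before launching, you can check whether something is already listening on the default port. The sketch below uses only the standard `socket` module; `port_in_use` and `first_free_port` are hypothetical helper names, and a port that is free at check time may still be taken by the time the server starts.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def first_free_port(start=8888, attempts=10):
    """Return the first port in [start, start + attempts) with no listener, or None."""
    for port in range(start, start + attempts):
        if not port_in_use(port):
            return port
    return None


print(f"Port 8888 in use: {port_in_use(8888)}")
print(f"First free port from 8888: {first_free_port()}")
```

The port returned by `first_free_port()` can be passed to `jupyter notebook --port <number>`.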

## Section 4: Install Essential ML Libraries

### Installation Command (All at Once):

```bash
pip install numpy pandas matplotlib scikit-learn seaborn
```

Or install individually:

```bash
pip install numpy           # Numerical computing
pip install pandas          # Data manipulation
pip install matplotlib      # Plotting and visualization
pip install scikit-learn    # Machine learning algorithms
pip install seaborn         # Statistical data visualization
```

### Library Purposes:

| Library | Purpose | Use Case |
|---------|---------|----------|
| **NumPy** | Numerical computing | Array operations, mathematical functions |
| **Pandas** | Data manipulation | Loading, cleaning, transforming datasets |
| **Matplotlib** | Plotting/Visualization | Line plots, histograms, scatter plots |
| **Scikit-learn** | Machine Learning | Algorithms, model training, evaluation |
| **Seaborn** | Statistical Visualization | Advanced statistical plots |
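
To see which of the libraries in the table still need installing, you can look each one up without importing it. This is a sketch under the assumption that the import names are as listed in `REQUIRED`; note that scikit-learn is installed as `scikit-learn` but imported as `sklearn`. The function name `missing_packages` is illustrative.

```python
import importlib.util

# pip distribution name -> import name (they differ for scikit-learn).
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "seaborn": "seaborn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import name cannot be found by the current interpreter."""
    return [
        pip_name
        for pip_name, import_name in required.items()
        if importlib.util.find_spec(import_name) is None
    ]


missing = missing_packages()
if missing:
    print("Missing:", ", ".join(missing))
    print("Install with: pip install " + " ".join(missing))
else:
    print("All required libraries are importable.")
```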

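### Installation Progress:

For each package, pip prints output similar to the following (version numbers and file names depend on your platform and the current release):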

```
Collecting numpy
Downloading numpy-1.24.0-cp311-cp311-win_amd64.whl
Installing collected packages: numpy
Successfully installed numpy-1.24.0
```

## Section 5: Verify All Installations

Run the following cells to verify that all libraries are installed correctly:

In [None]:
# Import all required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
from sklearn import __version__ as sklearn_version
import seaborn as sns

print("✓ All libraries imported successfully!")

In [None]:
# Display version information
import sys  # needed if this cell runs in a fresh kernel
print("="*60)
print("INSTALLED PACKAGE VERSIONS")
print("="*60)
print(f"Python Version:        {sys.version.split()[0]}")
print(f"NumPy Version:         {np.__version__}")
print(f"Pandas Version:        {pd.__version__}")
print(f"Matplotlib Version:    {matplotlib.__version__}")
print(f"Scikit-learn Version:  {sklearn_version}")
print(f"Seaborn Version:       {sns.__version__}")
print("="*60)

In [None]:
# Display system information
import platform
import os

print("="*60)
print("SYSTEM INFORMATION")
print("="*60)
print(f"Operating System:      {platform.system()} {platform.release()}")
print(f"Processor:             {platform.processor()}")
print(f"Architecture:          {platform.architecture()[0]}")
print(f"Current Working Dir:   {os.getcwd()}")
print("="*60)

## Section 6: Create Project Directory Structure

Set up a proper project structure for organizing your ML work:

In [None]:
import os
from pathlib import Path

# Define project structure
project_dirs = [
    'data/raw',              # Store raw datasets
    'data/processed',        # Store processed datasets
    'notebooks',             # Jupyter notebooks
    'scripts',               # Python scripts
    'models',                # Saved models
    'results',               # Analysis results
    'visualizations',        # Generated plots and charts
]

print("Creating project directory structure...")
print("="*60)

for directory in project_dirs:
    Path(directory).mkdir(parents=True, exist_ok=True)
    print(f"✓ Created: {directory}")

print("="*60)
print("Project structure created successfully!")

In [None]:
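# Display the created directory structure
print("\nCurrent Directory Structure:")
print("="*60)

for root, dirs, files in os.walk('.'):
    # Prune hidden directories in place so os.walk skips them (e.g. .git, .ipynb_checkpoints)
    dirs[:] = sorted(d for d in dirs if not d.startswith('.'))
    # Depth below the starting directory: 0 for '.', 1 for './data', 2 for './data/raw'
    level = os.path.normpath(root).count(os.sep) + (0 if root == '.' else 1)
    indent = ' ' * 2 * level
    name = os.path.basename(os.path.abspath(root))
    print(f'{indent}{name}/')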
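
The cells above create the folders relative to the current working directory. If you would rather create the same layout under an explicit project root, a variant along these lines may help. `create_project` and the folder name `ml_project` are assumptions for illustration, and the demo runs in a temporary directory so nothing is left behind.

```python
from pathlib import Path
import tempfile

PROJECT_DIRS = [
    "data/raw", "data/processed", "notebooks", "scripts",
    "models", "results", "visualizations",
]


def create_project(root, subdirs=PROJECT_DIRS):
    """Create each subdirectory under root and return the resulting paths."""
    root = Path(root)
    created = []
    for sub in subdirs:
        path = root / sub
        path.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(path)
    return created


# Demonstrate in a temporary directory; "ml_project" is a hypothetical name.
with tempfile.TemporaryDirectory() as tmp:
    for path in create_project(Path(tmp) / "ml_project"):
        print(path.relative_to(tmp))
```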

## Section 7: Test Your Environment - Library Demonstrations

### 7.1: NumPy - Numerical Computing

In [None]:
# NumPy: Working with arrays
print("NumPy Demonstration")
print("="*60)

# Create arrays
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])

print(f"1D Array: {arr1}")
print(f"Shape: {arr1.shape}")
print(f"\n2D Array:\n{arr2}")
print(f"Shape: {arr2.shape}")

# Basic operations
print(f"\nSum: {np.sum(arr1)}")
print(f"Mean: {np.mean(arr1)}")
print(f"Std Dev: {np.std(arr1):.2f}")
print(f"Max: {np.max(arr1)}")
print(f"Min: {np.min(arr1)}")

### 7.2: Pandas - Data Manipulation

In [None]:
# Pandas: Creating and manipulating DataFrames
print("\nPandas Demonstration")
print("="*60)

# Create a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 28, 32],
    'Score': [85.5, 90.0, 78.5, 88.0, 92.5],
    'Department': ['IT', 'HR', 'Finance', 'IT', 'Sales']
}

df = pd.DataFrame(data)

print("\nDataFrame:")
print(df)

print("\nDataFrame Info:")
print(f"Shape: {df.shape}")
print(f"Columns: {df.columns.tolist()}")

print("\nStatistical Summary:")
print(df.describe())

### 7.3: Matplotlib - Data Visualization

In [None]:
# Matplotlib: Creating visualizations
print("\nMatplotlib Demonstration")
print("="*60)

# Create sample data
x = np.linspace(0, 10, 100)
y_sin = np.sin(x)
y_cos = np.cos(x)

# Create figure with subplots
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
fig.suptitle('Matplotlib Visualization Examples', fontsize=16, fontweight='bold')

# Plot 1: Line plot
axes[0, 0].plot(x, y_sin, 'b-', label='sin(x)', linewidth=2)
axes[0, 0].plot(x, y_cos, 'r-', label='cos(x)', linewidth=2)
axes[0, 0].set_title('Trigonometric Functions')
axes[0, 0].set_xlabel('X')
axes[0, 0].set_ylabel('Y')
axes[0, 0].legend()
axes[0, 0].grid(True, alpha=0.3)

# Plot 2: Scatter plot
scatter_x = np.random.randn(50)
scatter_y = np.random.randn(50)
axes[0, 1].scatter(scatter_x, scatter_y, c='green', s=100, alpha=0.6)
axes[0, 1].set_title('Scatter Plot')
axes[0, 1].set_xlabel('X')
axes[0, 1].set_ylabel('Y')
axes[0, 1].grid(True, alpha=0.3)

# Plot 3: Histogram
data_hist = np.random.normal(0, 1, 1000)
axes[1, 0].hist(data_hist, bins=30, color='purple', alpha=0.7, edgecolor='black')
axes[1, 0].set_title('Distribution Histogram')
axes[1, 0].set_xlabel('Value')
axes[1, 0].set_ylabel('Frequency')
axes[1, 0].grid(True, alpha=0.3, axis='y')

# Plot 4: Bar plot
categories = ['A', 'B', 'C', 'D', 'E']
values = [23, 45, 56, 78, 32]
colors = ['red', 'blue', 'green', 'yellow', 'orange']
axes[1, 1].bar(categories, values, color=colors, alpha=0.7, edgecolor='black')
axes[1, 1].set_title('Bar Chart')
axes[1, 1].set_xlabel('Category')
axes[1, 1].set_ylabel('Value')
axes[1, 1].grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()

print("✓ Matplotlib visualizations created successfully!")

### 7.4: Scikit-learn - Machine Learning Basics

In [None]:
# Scikit-learn: Simple Machine Learning Example
print("\nScikit-learn Demonstration")
print("="*60)

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

print(f"\nDataset Information:")
print(f"Number of samples: {X.shape[0]}")
print(f"Number of features: {X.shape[1]}")
print(f"Number of classes: {len(np.unique(y))}")
print(f"Feature names: {iris.feature_names}")
print(f"Target names: {iris.target_names}")

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(f"\nTraining set size: {X_train.shape[0]}")
print(f"Testing set size: {X_test.shape[0]}")

# Train a Random Forest classifier
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f"\nModel Accuracy: {accuracy:.2%}")

print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=iris.target_names))

### 7.5: Seaborn - Statistical Visualization

In [None]:
# Seaborn: Statistical visualization
print("\nSeaborn Demonstration")
print("="*60)

# Create sample data
data = pd.DataFrame({
    'X': np.random.randn(100),
    'Y': np.random.randn(100),
    'Category': np.random.choice(['A', 'B', 'C'], 100)
})

# Create visualizations
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
fig.suptitle('Seaborn Statistical Visualizations', fontsize=16, fontweight='bold')

# Plot 1: Scatter with regression line
sns.regplot(data=data, x='X', y='Y', ax=axes[0], scatter_kws={'alpha': 0.6})
axes[0].set_title('Scatter Plot with Regression Line')

# Plot 2: Box plot by category
# Note: seaborn 0.13+ emits a FutureWarning for `palette` without `hue`; the plot still renders
sns.boxplot(data=data, x='Category', y='Y', ax=axes[1], palette='Set2')
axes[1].set_title('Box Plot by Category')

plt.tight_layout()
plt.show()

print("✓ Seaborn visualizations created successfully!")

## Section 8: Final Environment Check

Run a comprehensive check that your environment is ready for ML development:

In [None]:
# Final comprehensive check
print("\n" + "="*60)
print("FINAL ENVIRONMENT VERIFICATION REPORT")
print("="*60)

import importlib.util

checks = {
    'Python': sys.version.split()[0],
    'NumPy': np.__version__,
    'Pandas': pd.__version__,
    'Matplotlib': matplotlib.__version__,
    'Scikit-learn': sklearn_version,
    'Seaborn': sns.__version__,
    # find_spec checks whether the package is installed, not whether it was imported this session
    'Jupyter': 'Installed' if importlib.util.find_spec('jupyter') else 'Not Found'
}

print("\nInstalled Packages:")
for package, version in checks.items():
    status = "✓" if version != "Not Found" else "✗"
    print(f"  {status} {package:<15} : {version}")

print("\n" + "-"*60)
print("Directory Structure:")
for directory in project_dirs:
    exists = "✓" if os.path.exists(directory) else "✗"
    print(f"  {exists} {directory}")

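# Report success only if every package and directory check passed
all_ok = 'Not Found' not in checks.values() and all(os.path.exists(d) for d in project_dirs)

print("\n" + "="*60)
print("✅ ENVIRONMENT SETUP COMPLETE!" if all_ok else "✗ ENVIRONMENT SETUP INCOMPLETE: fix the items marked ✗ above")
print("="*60)
if all_ok:
    print("\nYou are ready to start Machine Learning development!")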
print("\nNext Steps:")
print("  1. Review the Practical 1 installation guide")
print("  2. Start Practical 2: Data Preprocessing")
print("  3. Explore the datasets in the 'data/raw' folder")
print("  4. Create your own notebooks in the 'notebooks' folder")

## Troubleshooting Guide

### Common Issues and Solutions:

**Issue:** `ModuleNotFoundError: No module named 'numpy'`  
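**Solution:** Run `python -m pip install numpy` in a terminal (or `%pip install numpy` in a notebook cell) so the package is installed for the interpreter you are actually using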

**Issue:** `'python' is not recognized as an internal or external command` (Windows)  
**Solution:** Reinstall Python with the "Add Python to PATH" option checked, then open a new Command Prompt

**Issue:** Jupyter port already in use  
**Solution:** Run `jupyter notebook --port 8889`

**Issue:** Permission denied on macOS/Linux  
**Solution:** Use `pip install --user package_name`

**Issue:** Installation very slow  
**Solution:** Check your internet connection, or point pip at a closer PyPI mirror with its `--index-url` option
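
One common cause of `ModuleNotFoundError` right after a successful install is that `pip` on the command line belongs to a different Python than the one running your notebook. The sketch below builds an install command tied to the current interpreter; `pip_install_command` and `describe_interpreter` are hypothetical helper names, and the actual install call is left commented out.

```python
import subprocess
import sys


def pip_install_command(*packages):
    """Build a pip command that targets the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", *packages]


def describe_interpreter():
    """Return the interpreter path and version, for comparing terminal and notebook."""
    return {"executable": sys.executable, "version": sys.version.split()[0]}


print(describe_interpreter())
print(" ".join(pip_install_command("numpy")))
# To actually install, uncomment the next line:
# subprocess.run(pip_install_command("numpy"), check=True)
```

If the `executable` printed in a notebook cell differs from the one printed in your terminal, packages installed from the terminal will not be visible in the notebook.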

---

## Resources

- Official Python Documentation: https://docs.python.org/3/
- NumPy Documentation: https://numpy.org/doc/
- Pandas Documentation: https://pandas.pydata.org/docs/
- Matplotlib Documentation: https://matplotlib.org/
- Scikit-learn Documentation: https://scikit-learn.org/
- Seaborn Documentation: https://seaborn.pydata.org/
- Jupyter Documentation: https://jupyter.org/

---

## Conclusion

Congratulations! You have successfully:
- ✅ Installed Python and essential packages
- ✅ Set up Jupyter Notebook
- ✅ Installed all ML libraries (NumPy, Pandas, Matplotlib, Scikit-learn, Seaborn)
- ✅ Created a proper project structure
- ✅ Tested all libraries with sample code
- ✅ Learned basic usage of each library

**Your Machine Learning development environment is ready for Practical 2!**