# Environment Verification - Week 1 Workshop

Welcome to the INRIVA AI Academy Week 1 Workshop! This notebook will verify that your environment is set up correctly.

## Instructions
1. Run each cell by pressing **Shift + Enter**
2. Look for ✅ (success) or ❌ (error) indicators
3. If you see any errors, ask for help before we continue

Let's begin!

## 1. Basic Imports Test

In [None]:
# Test all required imports
import sys
print(f"Python version: {sys.version}")

try:
    import metaflow
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    from sklearn.datasets import load_iris
    
    print("\n✅ All imports successful!")
    print(f"Metaflow version: {metaflow.__version__}")
    print(f"Pandas version: {pd.__version__}")
    print(f"NumPy version: {np.__version__}")
    
except ImportError as e:
    print(f"❌ Import error: {e}")
    print("Please check your environment setup!")
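
The `try` block above stops at the first failing import, so one missing package hides the status of the rest. As a minimal sketch using only the standard library, the cell below checks each package separately. `REQUIRED` and `check_packages` are illustrative names, not part of the workshop materials; the package list simply mirrors the imports above.

```python
from importlib import metadata
from importlib.util import find_spec

# Import name -> installed distribution name (they differ for scikit-learn).
# This mapping is an assumption that mirrors the imports in the cell above.
REQUIRED = {
    "metaflow": "metaflow",
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "sklearn": "scikit-learn",
}

def check_packages(required=REQUIRED):
    """Return {import_name: version string, or None if the package is not importable}."""
    report = {}
    for import_name, dist_name in required.items():
        if find_spec(import_name) is None:
            report[import_name] = None
            continue
        try:
            report[import_name] = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            # Importable, but no distribution metadata was found under this name.
            report[import_name] = "unknown"
    return report

for name, version in check_packages().items():
    mark = "✅" if version else "❌"
    print(f"{mark} {name}: {version or 'not installed'}")
```

Any line marked ❌ names a package that still needs to be installed in the active environment.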

## 2. Metaflow Basic Test

In [None]:
# Test Metaflow basic functionality
from metaflow import FlowSpec, step, NBRunner

class VerificationFlow(FlowSpec):
    """
    Simple flow to test Metaflow setup
    """
    
    @step
    def start(self):
        print("🚀 Metaflow verification starting...")
        self.message = "Hello from Metaflow!"
        self.next(self.end)
    
    @step
    def end(self):
        print(f"✅ {self.message}")
        print("🎉 Metaflow verification complete!")

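# Run the flow in the notebook and confirm that it finished successfully
try:
    run = NBRunner(VerificationFlow).nbrun()
    if run.successful:
        print("✅ Metaflow flow ran successfully!")
    else:
        print("❌ Metaflow run finished but did not succeed")
except Exception as e:
    print(f"❌ Metaflow error: {e}")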

## 3. Data Science Stack Test

In [None]:
# Test pandas functionality
print("Testing pandas...")
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': ['x', 'y', 'x', 'z', 'y']
})
print(f"✅ Created DataFrame with shape: {df.shape}")
print(df.head())

# Test numpy functionality
print("\nTesting numpy...")
arr = np.random.normal(0, 1, 100)
print(f"✅ Created numpy array with mean: {arr.mean():.3f}, std: {arr.std():.3f}")

# Test matplotlib
print("\nTesting matplotlib...")
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot([1, 2, 3, 4], [1, 4, 2, 3])
ax.set_title('Test Plot')
plt.show()
print("✅ Matplotlib plot created successfully!")

## 4. Machine Learning Test

In [None]:
# Test scikit-learn functionality
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

print("Testing scikit-learn with iris dataset...")

# Load data
iris = load_iris()
X, y = iris.data, iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train model
model = RandomForestClassifier(n_estimators=10, random_state=42)
model.fit(X_train, y_train)

# Evaluate
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)

print("✅ Model trained successfully!")
print(f"   Training samples: {len(X_train)}")
print(f"   Test samples: {len(X_test)}")
print(f"   Accuracy: {accuracy:.3f}")

if accuracy > 0.8:
    print("🎉 Great! Everything is working perfectly!")
else:
    print("⚠️ Accuracy seems low, but the setup is working")

## 5. Workshop Data Access Test

In [None]:
# Test access to datasets we'll use in the workshop
from sklearn.datasets import load_wine

print("Testing workshop datasets...")

# Wine dataset (main workshop dataset)
wine = load_wine()
print(f"✅ Wine dataset loaded: {wine.data.shape[0]} samples, {wine.data.shape[1]} features")
print(f"   Classes: {wine.target_names}")

# Convert to pandas for exploration
wine_df = pd.DataFrame(wine.data, columns=wine.feature_names)
wine_df['target'] = wine.target

print(f"✅ Converted to pandas DataFrame: {wine_df.shape}")
print("\nFirst few rows:")
print(wine_df.head())

# Quick visualization test
plt.figure(figsize=(8, 4))

plt.subplot(1, 2, 1)
# Sort by class label so the bars appear in class order (0, 1, 2)
wine_df['target'].value_counts().sort_index().plot(kind='bar')
plt.title('Wine Class Distribution')

plt.subplot(1, 2, 2)
wine_df['alcohol'].hist(bins=15, alpha=0.7)
plt.title('Alcohol Content Distribution')

plt.tight_layout()
plt.show()

print("✅ Visualization test successful!")

## 🎉 Verification Complete!

If you see all ✅ symbols above, you're ready for the workshop!
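
If scrolling back through every cell is tedious, the pass/fail pattern used above can be collected into a single summary. The sketch below is one way to do that, assuming each check is a plain callable that raises on failure. `run_checks` and `import_check` are hypothetical helper names, and the demo covers only the import checks rather than every test in this notebook.

```python
def run_checks(checks):
    """Run each (name, callable) pair; return {name: error message, or None on success}."""
    results = {}
    for name, check in checks:
        try:
            check()
            results[name] = None
        except Exception as exc:
            # Record the failure and keep going so every check is reported.
            results[name] = f"{type(exc).__name__}: {exc}"
    return results

def import_check(module_name):
    """Return a zero-argument callable that imports the module when called."""
    return lambda: __import__(module_name)

checks = [
    (name, import_check(name))
    for name in ("metaflow", "pandas", "numpy", "matplotlib", "seaborn", "sklearn")
]

results = run_checks(checks)
for name, error in results.items():
    print(f"❌ {name}: {error}" if error else f"✅ {name}")

failed = [name for name, error in results.items() if error]
if failed:
    print(f"Ask for help with: {', '.join(failed)}")
else:
    print("🎉 All checks passed!")
```

Further checks, such as the model training test, could be added to the list as functions that raise an exception when something goes wrong.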

### What's Next?
1. **Keep this environment active**
2. **Join the Google Meet** at workshop time
3. **Be ready to code along** with the facilitator
4. **Ask questions freely** - we're here to help!

### Need Help?
- ❌ **If you see any errors**: Post in Google Chat #urgent-help
- 🤔 **If something doesn't look right**: Ask during the workshop
- 📧 **For other questions**: Email the facilitator

**See you in the workshop! 🚀**