[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/susanavenda/data_cambridge/blob/main/Untitled-1.ipynb)

# Data & AI Course — Project Prep

University of Cambridge | Getting ready for class projects

This notebook helps you verify your setup and refresh key skills before starting your Data & AI course project.

## 1. Environment Check

Run this cell to confirm that Python and the core libraries are installed.

In [None]:
import sys
print(f"Python: {sys.version}")

# Core data science stack
try:
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    print("✓ numpy, pandas, matplotlib, seaborn — ready")
except ImportError as e:
    # e.name holds the module that failed to import
    print(f"Missing package: {e.name}. Install with: !pip install numpy pandas matplotlib seaborn")
    raise

## 2. Data Basics (Quick Refresh)

Load data, inspect, and visualise — skills you'll use in every project.

In [None]:
# Example: create a small synthetic dataset
import pandas as pd
import numpy as np

np.random.seed(42)  # fixed seed so the same values appear on every run

df = pd.DataFrame({
    "x": np.random.randn(100),
    "y": np.random.randn(100),
    "label": np.random.choice(["A", "B", "C"], 100)
})

print(df.head())
print(f"\nShape: {df.shape}")
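
Beyond `head()` and `shape`, a few more pandas calls cover most first-look inspection. The sketch below is a minimal example on the same kind of synthetic data; the seed value and the names `summary`, `counts`, and `group_means` are illustrative assumptions, not part of the course material.

```python
import numpy as np
import pandas as pd

# Rebuild a small synthetic dataset (seed 0 is an arbitrary choice)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.standard_normal(100),
    "y": rng.standard_normal(100),
    "label": rng.choice(["A", "B", "C"], 100),
})

# Column dtypes and missing-value counts
print(df.dtypes)
print(df.isna().sum())

# Summary statistics for the numeric columns
summary = df.describe()
print(summary)

# Rows per category, and the mean of x and y within each category
counts = df["label"].value_counts()
group_means = df.groupby("label")[["x", "y"]].mean()
print(counts)
print(group_means)
```

Checking dtypes and missing values before plotting tends to catch loading problems early, for example numeric columns that were read as text.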

In [None]:
# Quick visualisation
import matplotlib.pyplot as plt
import seaborn as sns

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
sns.scatterplot(data=df, x="x", y="y", hue="label", ax=axes[0])
sns.histplot(data=df, x="x", hue="label", ax=axes[1], alpha=0.6)
plt.tight_layout()
plt.show()

## 3. ML / AI Libraries (Optional)

If your project uses scikit-learn or other ML tools, check that they are installed:

In [None]:
# Uncomment and run if scikit-learn is not installed:
# !pip install scikit-learn

try:
    import sklearn
    print(f"scikit-learn: {sklearn.__version__}")
except ImportError:
    print("Run: !pip install scikit-learn")
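
Once the import succeeds, a typical scikit-learn workflow is split, fit, score. The following is a minimal sketch on synthetic data, assuming a logistic-regression classifier purely for illustration; the data, the seed, and the variable names are made up for this example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic two-feature data; the label depends on the sum of the features
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hold out 25% of the rows for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit on the training rows only, then score on the held-out rows
model = LogisticRegression()
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Passing `random_state` to `train_test_split` keeps the split identical across runs, which makes scores comparable when you change the model.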

## 4. Project Tips

- **Reproducibility:** Set a fixed seed before any random operation, e.g. `np.random.seed(42)` or a `np.random.default_rng(42)` generator, so results repeat across runs.
- **Structure:** Separate data loading, cleaning, analysis, and visualisation into clear sections.
- **Document:** Use markdown cells to explain your steps and findings.
- **Save:** In Colab, use **File → Save a copy in Drive** or **File → Save a copy in GitHub** to keep your work.
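
The reproducibility and structure tips can be combined in one small pattern. This is a sketch, not a required layout: the function names (`load_data`, `clean_data`, `analyse`) and the seed are hypothetical, `load_data` generates synthetic values where a real project would read a file, and it uses NumPy's `np.random.default_rng` generator as an alternative to `np.random.seed`.

```python
import numpy as np
import pandas as pd

SEED = 42  # arbitrary; any fixed integer works


def load_data(seed: int) -> pd.DataFrame:
    """Stand-in for real loading: synthetic data from a seeded generator."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "x": rng.standard_normal(100),
        "label": rng.choice(["A", "B", "C"], 100),
    })


def clean_data(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows with missing values and reset the index."""
    return df.dropna().reset_index(drop=True)


def analyse(df: pd.DataFrame) -> pd.Series:
    """Mean of x within each label."""
    return df.groupby("label")["x"].mean()


result = analyse(clean_data(load_data(SEED)))
print(result)

# Same seed, same data: a second run reproduces the result
rerun = analyse(clean_data(load_data(SEED)))
print(result.equals(rerun))
```

Keeping each stage in its own function makes it easy to rerun one step in isolation, and passing the seed explicitly avoids hidden global state.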