# 🐍 Welcome to FB2NEP: Python + Colab Onboarding
In this notebook, you'll learn how to:
- Open and explore a dataset
- Understand Python syntax basics
- Use Colab for your weekly assignments

In [None]:
# 🛠️ Install packages (Colab already includes these; run this only if an import below fails)
!pip install -q pandas matplotlib seaborn

In [None]:
# 📦 Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

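# Show every column when a DataFrame is displayed, instead of truncating with '...'
pd.set_option('display.max_columns', None)
# Use a white background with grid lines for all plots
sns.set_theme(style='whitegrid')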

## 📂 Step 1: Load the Dataset
We use a synthetic dataset of 1,000 observations, where each row represents one person.
The cell below reads it straight from GitHub: `pd.read_csv` accepts a URL as well as a local file path.

In [None]:
# 🔗 Load data from GitHub
url = 'https://raw.githubusercontent.com/ggkuhnle/FB2NEP_datascience/main/data/fb2nep_data.csv'
df = pd.read_csv(url)
df.head()

## 🔍 Step 2: Inspect the Data

In [None]:
# Number of rows and columns
print(f'This dataset has {df.shape[0]} rows and {df.shape[1]} columns.')
df.columns.tolist()
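
Beyond the shape and column names, it often helps to check each column's data type and how many values are missing. The sketch below uses a small made-up table called `toy` so that it does not overwrite `df`; the `age` column and all the values are assumptions for illustration, not taken from the course dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (NOT the course dataset); 'age' is an assumed column name
toy = pd.DataFrame({
    'age': [34, 51, np.nan, 45],
    'sex': ['F', 'M', 'F', None],
    'BMI': [22.5, 27.1, 24.3, 30.2],
})

# Data type of each column: numeric columns are int or float, text columns are object or string
print(toy.dtypes)

# Number of missing values in each column
missing = toy.isna().sum()
print(missing)
```

Running the same two calls on `df` (`df.dtypes` and `df.isna().sum()`) should show whether any columns need attention before you summarise them.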

## 📊 Step 3: Summary Statistics

In [None]:
df.describe(include='all')
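
With `include='all'`, `describe` reports different statistics for numeric and text columns, and fills the rows that do not apply with `NaN`. A minimal sketch on made-up data (the table `toy`, the `age` column and the values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical toy data (NOT the course dataset)
toy = pd.DataFrame({
    'age': [30, 40, 50, 60],
    'sex': ['F', 'F', 'M', 'F'],
})

summary = toy.describe(include='all')
print(summary)

# Numeric column: mean, std, min, quartiles and max are filled in
print(summary.loc['mean', 'age'])

# Text column: unique (number of categories), top (most frequent value), freq (its count)
print(summary.loc[['unique', 'top', 'freq'], 'sex'])
```

Note that `top` and `freq` describe only the most frequent category, so they do not give the count of every category.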

## ✏️ Task 1: Describe Your Dataset
Answer the following based on the output above:
- What is the average age?
- How many participants are female, and how many are male?
- What is the most common social class?
- What is the range (minimum to maximum) of nutrient intake?
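
If the summary table alone does not answer a question, individual columns can be queried directly. The following is a sketch on made-up data, not a model answer: the table `toy`, the `age` column and all values are assumptions, so swap in `df` and the real column names from Step 2.

```python
import pandas as pd

# Hypothetical toy data (NOT the course dataset)
toy = pd.DataFrame({
    'age': [25, 35, 45, 55, 65],
    'sex': ['M', 'F', 'F', 'M', 'F'],
    'BMI': [21.0, 25.5, 28.0, 23.5, 30.0],
})

# Average of a numeric column
mean_age = toy['age'].mean()

# Count of rows in each category
sex_counts = toy['sex'].value_counts()

# Most common category (mode() returns a Series, so take its first entry)
most_common_sex = toy['sex'].mode()[0]

# Range of a numeric column as (minimum, maximum)
bmi_range = (toy['BMI'].min(), toy['BMI'].max())

print(mean_age, most_common_sex, bmi_range)
print(sex_counts)
```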

## 📉 Step 4: Plotting – BMI by Sex

In [None]:
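# Histogram of BMI, one outline per sex.
# common_norm=False scales each group to area 1 on its own,
# so the shapes stay comparable even if the groups differ in size.
sns.histplot(data=df, x='BMI', hue='sex', element='step',
             stat='density', common_norm=False)
plt.title('BMI Distribution by Sex')
plt.xlabel('BMI')
plt.ylabel('Density')
plt.show()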
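
A histogram shows the shape of each distribution; a grouped summary puts numbers next to it. The sketch below assumes the same `sex` and `BMI` column names used in the plot, but runs on a made-up table called `toy` with invented values.

```python
import pandas as pd

# Hypothetical toy data (NOT the course dataset)
toy = pd.DataFrame({
    'sex': ['F', 'F', 'M', 'M'],
    'BMI': [22.0, 26.0, 25.0, 29.0],
})

# Split the rows by sex, then compute count, mean and median of BMI within each group
by_sex = toy.groupby('sex')['BMI'].agg(['count', 'mean', 'median'])
print(by_sex)
```

Replacing `toy` with `df` should give the group sizes and typical BMI values that correspond to the two outlines in the plot.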

## ✍️ Task 2: Reflection
- What kinds of questions could this dataset help answer?
- What might be its limitations?

You can write your response below or submit via Blackboard.

🎉 Well done! You're now ready to move into Week 2.