# 🧪 Mini Lab: Pandas Basics - Fitness Tracker Analysis

**Dataset:** `fitness_tracker.csv`

> Use this notebook to complete the lab. Follow the instructions in each section.
> Do **not** include solutions in this file if you're submitting as an assignment.


## 🎯 Objective
Practice foundational pandas operations using a small fitness tracker dataset.
You'll apply what you've learned so far: loading, inspecting, sorting, working with columns, string operations,
unique values, concatenation, and saving.


## 🛠 Setup
Make sure `pandas` is installed. The dataset should be in the same folder as this notebook.

If needed, install:
```bash
pip install pandas
```


## 📂 Step 1: Load the Data
1. Import pandas as `pd`.
2. Load the CSV file into a DataFrame called `df`.
3. Display the first 5 rows.
4. Show the DataFrame's shape and info summary.
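
The loading and inspection calls listed above can be tried on a tiny in-memory CSV first. This is a minimal sketch, not the lab solution: the three rows and the name `sample` are made up for illustration, and `io.StringIO` stands in for the real file on disk.

```python
import io
import pandas as pd

# Hypothetical three-row sample standing in for the real CSV file
csv_text = """User_ID,Name,Activity,Duration_min
1,Sam,Walk,20
2,Kim,Row,
3,Lee,Walk,35
"""

sample = pd.read_csv(io.StringIO(csv_text))

# A notebook echoes only the last expression of a cell, so print each result
print(sample.head())   # first rows (up to 5)
print(sample.shape)    # (rows, columns)
sample.info()          # dtypes and non-null counts; info() prints by itself
```

The empty `Duration_min` field in the second row is read as `NaN`, which is why `info()` reports one fewer non-null value for that column.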


In [63]:
# === Step 1: Load the Data (Instructions Only) ===
# TODO:
# 1) import pandas as pd
# 2) df = pd.read_csv('fitness_tracker.csv')
# 3) display first 5 rows
# 4) show shape and info


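import pandas as pd

# The CSV sits in the same folder as the notebook (see Setup), so a relative path keeps the cell portable.
df = pd.read_csv('fitness_tracker.csv')

# A notebook echoes only the last expression of a cell, so print the shape explicitly.
print(df.shape)  # (rows, columns)
df.info()        # dtypes and non-null counts
df.head()        # first 5 rows, echoed as the cell output

(8, 7)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 7 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   User_ID          8 non-null      int64  
 1   Name             8 non-null      object 
 2   Activity         8 non-null      object 
 3   Duration_min     7 non-null      float64
 4   Calories_burned  7 non-null      float64
 5   Date             8 non-null      object 
 6   Device           8 non-null      object 
dtypes: float64(2), int64(1), object(4)
memory usage: 580.0+ bytes

|   | User_ID | Name | Activity | Duration_min | Calories_burned | Date | Device |
|---|---|---|---|---|---|---|---|
| 0 | 101 | Ana | Running | 30.0 | 250.0 | 2025-10-01 | Fitbit |
| 1 | 102 | Mark | Cycling | 45.0 | 400.0 | 2025-10-01 | Garmin |
| 2 | 103 | Jo | Run | 25.0 | 200.0 | 2025-10-02 | Fitbit |
| 3 | 104 | Lea | Yoga | NaN | 180.0 | 2025-10-03 | Apple Watch |
| 4 | 105 | Marco | running | 60.0 | NaN | 2025-10-03 | Garmin |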


## 📊 Step 2: Inspect and Sort
1. Sort the DataFrame by `Calories_burned` in descending order.
2. Sort again by `Duration_min` in ascending order.
3. Print only the top 3 rows after sorting by calories.
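
The two sorts above are independent of each other, and `sort_values` returns a new DataFrame each time instead of reordering the original. A small sketch on made-up data (the frame `sample` and its values are hypothetical) shows both sorts and where missing values end up:

```python
import pandas as pd

# Hypothetical data; None becomes NaN in the float column
sample = pd.DataFrame({
    'Name': ['Sam', 'Kim', 'Lee', 'Pat'],
    'Duration_min': [20.0, 50.0, 35.0, 10.0],
    'Calories_burned': [150.0, None, 310.0, 90.0],
})

# Highest calories first; rows with NaN are placed last by default
by_calories = sample.sort_values('Calories_burned', ascending=False)

# Shortest duration first (ascending=True is the default)
by_duration = sample.sort_values('Duration_min')

top3 = by_calories.head(3)
print(top3)
```

Because missing values sort to the end in either direction, a row with no calorie reading does not appear in the top 3 here.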


In [64]:
# === Step 2: Inspect and Sort (Instructions Only) ===
# TODO:
# - sort by 'Calories_burned' descending
# - sort by 'Duration_min' ascending
# - print top 3 rows after sorting by calories

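by_calories = df.sort_values('Calories_burned', ascending=False)  # highest calories first
by_duration = df.sort_values('Duration_min', ascending=True)      # shortest duration first
by_calories.head(3)  # top 3 rows by calories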


|   | User_ID | Name | Activity | Duration_min | Calories_burned | Date | Device |
|---|---|---|---|---|---|---|---|
| 6 | 107 | Jae | Cycling | 55.0 | 420.0 | 2025-10-04 | Apple Watch |
| 1 | 102 | Mark | Cycling | 45.0 | 400.0 | 2025-10-01 | Garmin |
| 5 | 106 | Mia | Swim | 40.0 | 300.0 | 2025-10-04 | Fitbit |


## 🧱 Step 3: Work with Columns
1. Rename the column `Duration_min` → `DurationMinutes`.
2. Add a new column called `Calories_per_min` = `Calories_burned` / `DurationMinutes`.
3. Drop the column `Date`.
4. Replace any missing `Calories_burned` values with `0`.
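
One detail matters for all four operations above: `rename`, `drop`, and `fillna` return new objects by default, so the result has to be assigned back (or `inplace=True` used) for the change to stick. The following is a sketch on hypothetical data, with `sample` as an illustrative name:

```python
import pandas as pd

# Hypothetical data with one missing value in each numeric column
sample = pd.DataFrame({
    'Duration_min': [20.0, 50.0, None],
    'Calories_burned': [150.0, None, 90.0],
    'Date': ['2025-01-01', '2025-01-02', '2025-01-03'],
})

# Each call returns a new object, so assign the result back
sample = sample.rename(columns={'Duration_min': 'DurationMinutes'})
sample['Calories_per_min'] = sample['Calories_burned'] / sample['DurationMinutes']
sample = sample.drop(columns='Date')
sample['Calories_burned'] = sample['Calories_burned'].fillna(0)

print(sample)
```

The ratio is computed before the fill, so rows that were missing a value keep `NaN` in `Calories_per_min` instead of a misleading `0`.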


In [69]:
# === Step 3: Work with Columns (Instructions Only) ===
# TODO:
# - rename 'Duration_min' to 'DurationMinutes'
# - create 'Calories_per_min' as Calories_burned / DurationMinutes
# - drop 'Date'
# - replace missing values in 'Calories_burned' with 0
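df.rename(columns={'Duration_min': 'DurationMinutes'}, inplace=True)
df['Calories_per_min'] = df['Calories_burned'] / df['DurationMinutes']

# drop() and fillna() return new objects, so store the results; df itself still has 'Date'.
df_clean = df.drop(columns='Date')
df_clean['Calories_burned'] = df_clean['Calories_burned'].fillna(0)
df_clean

|   | User_ID | Name | Activity | DurationMinutes | Calories_burned | Device | Calories_per_min |
|---|---|---|---|---|---|---|---|
| 0 | 101 | Ana | Running | 30.0 | 250.0 | Fitbit | 8.333333 |
| 1 | 102 | Mark | Cycling | 45.0 | 400.0 | Garmin | 8.888889 |
| 2 | 103 | Jo | Run | 25.0 | 200.0 | Fitbit | 8.0 |
| 3 | 104 | Lea | Yoga | NaN | 180.0 | Apple Watch | NaN |
| 4 | 105 | Marco | running | 60.0 | 0.0 | Garmin | NaN |
| 5 | 106 | Mia | Swim | 40.0 | 300.0 | Fitbit | 7.5 |
| 6 | 107 | Jae | Cycling | 55.0 | 420.0 | Apple Watch | 7.636364 |
| 7 | 108 | Anna | Yoga | 50.0 | 200.0 | Fitbit | 4.0 |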


## 🧵 Step 4: String Operations
1. Convert all `Activity` names to lowercase.
2. Replace any occurrence of `"run"` with `"running"` in the `Activity` column.
3. Extract only the first 3 letters of each `Device` value and store it in a new column called `Device_Code`.
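
The replacement in item 2 needs some care: a substring replace also rewrites the `run` inside values that already read `running`. The sketch below contrasts the two approaches on a made-up Series. The names `activities`, `naive`, and `exact` are illustrative only.

```python
import pandas as pd

# Hypothetical activity labels with mixed casing
activities = pd.Series(['Running', 'Run', 'running', 'Yoga'])
lowered = activities.str.lower()

# Substring replacement: also hits the 'run' inside 'running'
naive = lowered.str.replace('run', 'running', regex=False)

# Series.replace matches whole values only, so 'running' is left alone
exact = lowered.replace('run', 'running')

# Slicing through the .str accessor takes the first 3 characters of each value
devices = pd.Series(['Fitbit', 'Apple Watch'])
codes = devices.str[:3]

print(naive.tolist())
print(exact.tolist())
print(codes.tolist())
```

A regular expression anchored to the whole string, such as `str.replace(r'^run$', 'running', regex=True)`, would be another way to get the whole-value behavior.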


In [101]:
# === Step 4: String Operations (Instructions Only) ===
# TODO:
# - to lowercase on 'Activity'
# - replace 'run' with 'running' in 'Activity'
# - create 'Device_Code' as first 3 letters of 'Device'
df['Activity'] = df['Activity'].str.lower()
df['Device_Code'] = df['Device'].str[:3]
df.head()

|   | User_ID | Name | Activity | DurationMinutes | Calories_burned | Date | Device | Calories_per_min | Device_Code |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 101 | Ana | running | 30.0 | 250.0 | 2025-10-01 | Fitbit | 8.333333 | Fit |
| 1 | 102 | Mark | cycling | 45.0 | 400.0 | 2025-10-01 | Garmin | 8.888889 | Gar |
| 2 | 103 | Jo | run | 25.0 | 200.0 | 2025-10-02 | Fitbit | 8.0 | Fit |
| 3 | 104 | Lea | yoga | NaN | 180.0 | 2025-10-03 | Apple Watch | NaN | App |
| 4 | 105 | Marco | running | 60.0 | NaN | 2025-10-03 | Garmin | NaN | Gar |


## 🔍 Step 5: Exploring Values
1. Show all **unique activities**.
2. Display how many **unique devices** there are.
3. Count how many times each activity appears.
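
The three questions above map onto three different methods: `unique()` returns the distinct values, `nunique()` returns how many there are, and `value_counts()` returns the frequency of each. A short sketch on hypothetical data (the frame `sample` is made up):

```python
import pandas as pd

# Hypothetical data
sample = pd.DataFrame({
    'Activity': ['walk', 'row', 'walk', 'yoga', 'walk'],
    'Device': ['A', 'B', 'A', 'A', 'B'],
})

unique_activities = sample['Activity'].unique()   # distinct values, in order of first appearance
n_devices = sample['Device'].nunique()            # number of distinct values
counts = sample['Activity'].value_counts()        # frequency per value, most common first

print(unique_activities)
print(n_devices)
print(counts)
```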


In [111]:
# === Step 5: Exploring Values (Instructions Only) ===
# TODO:
# - show unique activities
# - count unique devices
# - show counts per activity
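print(df['Activity'].unique())   # distinct activity names
print(df['Device'].nunique())    # number of distinct devices
df['Activity'].value_counts()    # rows per activity

['running' 'cycling' 'run' 'yoga' 'swim']
3
Activity
running    2
cycling    2
yoga       2
run        1
swim       1
Name: count, dtype: int64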

## 🧩 Step 6: Concatenate
1. Create a new small DataFrame with two new rows (pick any made-up users).
2. Concatenate this new DataFrame with the original one.
3. Display the new shape of the combined DataFrame.
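
When the new rows share the original column names and types, `pd.concat` stacks them without introducing missing values or changing dtypes. A minimal sketch with hypothetical frames (`original`, `new_rows`, and their contents are made up):

```python
import pandas as pd

# Hypothetical frames that share the same columns and dtypes
original = pd.DataFrame({
    'User_ID': [1, 2],
    'Name': ['Sam', 'Kim'],
    'Activity': ['walk', 'row'],
})
new_rows = pd.DataFrame({
    'User_ID': [3, 4],
    'Name': ['Lee', 'Pat'],
    'Activity': ['yoga', 'walk'],
})

# ignore_index=True renumbers the combined rows 0..n-1
combined = pd.concat([original, new_rows], ignore_index=True)

print(combined.shape)
print(combined)
```

Without `ignore_index=True`, each frame keeps its own labels and the combined index would read 0, 1, 0, 1.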


In [117]:
# === Step 6: Concatenate (Instructions Only) ===
# TODO:
# - create a small new DataFrame with 2 rows matching the original schema
# - concatenate with the original df
# - display new shape
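df_new = pd.DataFrame({
    'User_ID': [109, 110],  # integers, matching the int64 User_ID column in df
    'Name': ['Eva', 'Adan'],
})

# Columns that df_new lacks are filled with NaN; ignore_index=True renumbers the rows 0-9.
df_final = pd.concat([df, df_new], ignore_index=True)
print(df_final.shape)
df_final

(10, 9)

|   | User_ID | Name | Activity | DurationMinutes | Calories_burned | Date | Device | Calories_per_min | Device_Code |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 101 | Ana | running | 30.0 | 250.0 | 2025-10-01 | Fitbit | 8.333333 | Fit |
| 1 | 102 | Mark | cycling | 45.0 | 400.0 | 2025-10-01 | Garmin | 8.888889 | Gar |
| 2 | 103 | Jo | run | 25.0 | 200.0 | 2025-10-02 | Fitbit | 8.0 | Fit |
| 3 | 104 | Lea | yoga | NaN | 180.0 | 2025-10-03 | Apple Watch | NaN | App |
| 4 | 105 | Marco | running | 60.0 | NaN | 2025-10-03 | Garmin | NaN | Gar |
| 5 | 106 | Mia | swim | 40.0 | 300.0 | 2025-10-04 | Fitbit | 7.5 | Fit |
| 6 | 107 | Jae | cycling | 55.0 | 420.0 | 2025-10-04 | Apple Watch | 7.636364 | App |
| 7 | 108 | Anna | yoga | 50.0 | 200.0 | 2025-10-05 | Fitbit | 4.0 | Fit |
| 8 | 109 | Eva | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 9 | 110 | Adan | NaN | NaN | NaN | NaN | NaN | NaN | NaN |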


## 💾 Step 7: Save Your Work
1. Save the final version of your DataFrame as `fitness_tracker_cleaned.csv`.
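
Passing `index=False` to `to_csv` keeps the row labels out of the file, so reading it back does not produce an extra unnamed column. The round trip below is a sketch on hypothetical data. It writes to a temporary directory, and the file name `sample_cleaned.csv` is made up so the lab's own output file is not touched.

```python
import os
import tempfile
import pandas as pd

# Hypothetical data with one missing value
sample = pd.DataFrame({
    'User_ID': [1, 2],
    'Calories_burned': [150.0, None],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample_cleaned.csv')
    sample.to_csv(path, index=False)   # no index column in the file
    restored = pd.read_csv(path)       # read back while the file still exists

print(restored)
```

Missing values are written as empty fields and come back as `NaN` when the file is read again.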


In [118]:
# === Step 7: Save Your Work (Instructions Only) ===
# TODO:
# - save the final DataFrame to 'fitness_tracker_cleaned.csv' (index=False)
df_final.to_csv('fitness_tracker_cleaned.csv', index=False)

---

✅ **End of Lab**
