You have been going to the gym for a while and have been tracking your exercises in a CSV file. Are you making progress?

Some notes on the file:
- the file with your results is at ./data/workout-log.csv
- for body weight exercises, you note down 1 for weight
- you may have exercises listed, but no reps. This means that you didn't finish all the sets

Before determining whether your workouts are working, you need to make sure that your data is consistent and clean. Let's do that.

In [3]:
# import pandas 
import pandas as pd 

In [4]:
# read in your data file 
df = pd.read_csv("./data/workout-log.csv")

In [5]:
# look at the top few rows to make sure that it loaded ok
df.head()

Unnamed: 0,WorkoutName,BodyArea,Date,Exercise,Lb,Reps,Set
0,push 2,chest,8/11/22,bench press,35,10.0,1
1,push 2,chest,8/11/22,bench press,35,10.0,1
2,push 2,chest,8/11/22,bench press,35,9.0,1
3,push 2,chest,8/11/22,bench press,35,8.0,1
4,push 2,chest,7/30/22,bench press,35,10.0,1


In [6]:
# explore what exercises you did for each body area... 
# group by body area and exercise, look at the mean of each group

area_exercise_grp = df.groupby(['BodyArea', 'Exercise'])
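# numeric_only=True skips text columns such as WorkoutName and Date,
# which recent pandas versions refuse to average
area_exercise_grp.mean(numeric_only=True)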

Unnamed: 0_level_0,Unnamed: 1_level_0,Lb,Reps,Set
BodyArea,Exercise,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
chest,bench press,32.324841,9.77707,1.643312
chest,push-up,1.0,10.4,3.133333
chest,push-ups,1.0,10.666667,4.0
legs,deadlift,20.95,13.716667,2.6
legs,squat,2.457627,13.542373,2.016949
triceps,skull crusher,15.0,12.310345,2.0
triceps,squat,1.0,15.0,4.0


In [7]:
# get a count of empty values in all columns
df.isna().sum()

WorkoutName    45
BodyArea        0
Date            0
Exercise        0
Lb              0
Reps            3
Set             0
dtype: int64

What you should be able to notice so far: 
1. push-ups are listed as 2 different things: 'push-up' and 'push-ups'
2. squats are listed in both 'legs' and 'triceps'
3. reps are missing for 3 exercises 
4. WorkoutName is missing in 45 instances
5. There's a weird "Unnamed: 7" column that seems to be just junk
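
Label problems like the first two items can also be surfaced programmatically rather than by eye. Here is a minimal sketch, assuming a small made-up frame (the `toy` data is hypothetical and not taken from the workout log): it lists the distinct exercise names and counts how many body areas each exercise is filed under.

```python
import pandas as pd

# Hypothetical toy data that mimics the inconsistencies described above
toy = pd.DataFrame({
    "BodyArea": ["chest", "chest", "legs", "triceps"],
    "Exercise": ["push-up", "push-ups", "squat", "squat"],
})

# Distinct exercise names, sorted so near-duplicates end up side by side
exercise_names = sorted(toy["Exercise"].unique())

# Number of distinct body areas each exercise is filed under
areas_per_exercise = toy.groupby("Exercise")["BodyArea"].nunique()

# Exercises that appear under more than one body area
multi_area = areas_per_exercise[areas_per_exercise > 1].index.tolist()

print(exercise_names)
print(multi_area)
```

On the real data you would run the same two checks against `df` instead of `toy`.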

**Let's fix this...**

In [8]:
# rename 'push-ups' to 'push-up'
# replace() returns a new Series, so assign the result back to the column
df["Exercise"] = df["Exercise"].replace("push-ups", "push-up")

# try to select rows with "push-ups" in Exercise - you should not get any such rows
df.loc[df.Exercise == "push-ups"]

Unnamed: 0,WorkoutName,BodyArea,Date,Exercise,Lb,Reps,Set
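
Keep in mind that `Series.replace` returns a new Series and leaves the original untouched, so the result has to be assigned back for the rename to stick. A minimal sketch on a made-up frame (`toy` is hypothetical, not the workout log) shows the difference:

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({"Exercise": ["push-up", "push-ups", "squat"]})

# replace() returns a new Series; toy itself is unchanged by this call
renamed = toy["Exercise"].replace("push-ups", "push-up")
still_has_plural = bool((toy["Exercise"] == "push-ups").any())

# Assigning the result back to the column makes the change stick
toy["Exercise"] = renamed
has_plural_after = bool((toy["Exercise"] == "push-ups").any())

print(still_has_plural, has_plural_after)
```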


In [9]:
# set all squats to be 'legs'
# hint: you may need to do an internet search on how to change the value of one column based on the value of another column
# hint: this documentation may be helpful: https://www.statology.org/pandas-loc-multiple-conditions/
df.loc[df.Exercise == "squat", 'BodyArea'] = "legs"

# try to select rows with "triceps" in BodyArea and "squat" in Exercise - you should not get any such rows
df.loc[((df.BodyArea == "triceps") & (df.Exercise == 'squat'))]

Unnamed: 0,WorkoutName,BodyArea,Date,Exercise,Lb,Reps,Set
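
The pattern used in this cell is `df.loc[row_mask, column] = value`: the boolean mask picks the rows and the column label picks what to overwrite. A minimal sketch on a made-up frame (`toy` is hypothetical, not the workout log):

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    "BodyArea": ["legs", "triceps", "triceps"],
    "Exercise": ["squat", "squat", "skull crusher"],
})

# Boolean mask: True for rows whose Exercise is "squat"
is_squat = toy["Exercise"] == "squat"

# Row selector = mask, column selector = "BodyArea"; only matching rows change
toy.loc[is_squat, "BodyArea"] = "legs"

# Combine two conditions with & (each wrapped in parentheses)
leftover = toy.loc[(toy["BodyArea"] == "triceps") & (toy["Exercise"] == "squat")]

print(toy)
print(len(leftover))
```

The skull crusher row keeps its 'triceps' label because the mask is False for it.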


In [11]:
# remove the 3 lines with no reps.
# hint: see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html
df = df.dropna(subset=['Reps'])

# confirm that there are no lines with nan in "Reps" column

df.isna().sum()

WorkoutName    45
BodyArea        0
Date            0
Exercise        0
Lb              0
Reps            0
Set             0
dtype: int64
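
Passing `subset=['Reps']` restricts the check to that one column, which is why the rows with a missing WorkoutName are still counted above. A minimal sketch on a made-up frame (`toy` is hypothetical, not the workout log):

```python
import pandas as pd

# Hypothetical toy data: one row lacks WorkoutName, another lacks Reps
toy = pd.DataFrame({
    "WorkoutName": ["push 1", None, "push 1"],
    "Reps": [10.0, 12.0, None],
})

# Only rows with a missing value in "Reps" are dropped;
# the row with a missing WorkoutName survives
cleaned = toy.dropna(subset=["Reps"])

missing_counts = cleaned.isna().sum()
print(missing_counts)
```

Calling `dropna()` without `subset` would instead drop every row that has a missing value in any column.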

In [None]:
# drop the WorkoutName and Unnamed: 7 columns.
# they don't affect the effectiveness of your workout in any way
# errors="ignore" skips any listed column that is not present in df

df = df.drop(columns=['WorkoutName', 'Unnamed: 7'], errors="ignore")

# confirm that the columns are gone
df.head()

Unnamed: 0,BodyArea,Date,Exercise,Lb,Reps,Set
0,chest,8/11/22,bench press,35,10.0,1
1,chest,8/11/22,bench press,35,10.0,1
2,chest,8/11/22,bench press,35,9.0,1
3,chest,8/11/22,bench press,35,8.0,1
4,chest,7/30/22,bench press,35,10.0,1
