# Counting Sheep, Courting Sleep:

## Using Python Data Analysis Tools to Find the Perfect Recipe for a Good Night's Sleep

First, we will import the pandas library to help us manipulate our data.

In [2]:
import pandas as pd

Then we will use pandas to read in our sleep tracking CSV file and, to make sure the data came in correctly, we will check the first five rows.

In [3]:
sleep_df = pd.read_csv('sleep_data_tracking_2023.csv')
sleep_df.head()

Unnamed: 0,Timestamp,sleep_date,day_of_the_week,hours_sleep,awake_pct,rem_pct,core_pct,deep_pct,hr_bpm_min,hr_bpm_max,resp_rate_min,resp_rate_max,melatonin,magnesium,neuriva,chamomile,bath,shower,meditation
0,2/8/2023 9:10:02,2/7/2023,Tuesday,7.3,3.0,12.0,80.0,5.0,61.0,69.0,8.0,12.5,No,Yes,No,No,Yes,No,Yes
1,2/9/2023 7:56:47,2/8/2023,Wednesday,6.83,5.0,22.0,67.0,6.0,62.0,78.0,8.0,12.5,Yes,Yes,No,No,No,No,Yes
2,2/10/2023 12:10:24,2/9/2023,Thursday,6.68,4.0,11.0,79.0,6.0,61.0,73.0,8.5,12.5,Yes,Yes,No,No,Yes,No,No
3,2/12/2023 17:32:05,2/10/2023,Friday,7.5,4.0,14.0,78.0,4.0,59.0,70.0,9.0,14.0,No,Yes,No,No,No,No,Yes
4,2/12/2023 17:34:12,2/11/2023,Saturday,6.55,3.0,21.0,71.0,5.0,58.0,76.0,8.0,15.0,No,No,No,No,No,No,No


Looks good! Now let's get a feel for our DataFrame and the data types it contains.

In [4]:
sleep_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 31 entries, 0 to 30
Data columns (total 19 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   Timestamp        31 non-null     object 
 1   sleep_date       31 non-null     object 
 2   day_of_the_week  31 non-null     object 
 3   hours_sleep      31 non-null     float64
 4   awake_pct        30 non-null     float64
 5   rem_pct          30 non-null     float64
 6   core_pct         30 non-null     float64
 7   deep_pct         30 non-null     float64
 8   hr_bpm_min       30 non-null     float64
 9   hr_bpm_max       30 non-null     float64
 10  resp_rate_min    30 non-null     float64
 11  resp_rate_max    30 non-null     float64
 12  melatonin        31 non-null     object 
 13  magnesium        31 non-null     object 
 14  neuriva          31 non-null     object 
 15  chamomile        31 non-null     object 
 16  bath             31 non-null     object 
 17  shower           3

We will be dealing with floats and objects in this DataFrame. Now let's get a count of the number of rows and columns.

In [5]:
sleep_df.shape

(31, 19)

Our data set has 31 rows and 19 columns. Comparing the row count to the non-null counts in the output of cell 4 above, our data is fairly clean: the eight columns from awake_pct through resp_rate_max each show 30 non-null values out of 31, so each is missing one value. We will clean those up later.
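As a quick cross-check on the non-null counts, missing values can also be counted directly. The sketch below uses a tiny stand-in DataFrame (`demo_df`, with made-up rows and only three of the columns) rather than the real `sleep_df`, so the values are illustrative assumptions; the same two calls should work unchanged on `sleep_df`.

```python
import pandas as pd

# Hypothetical stand-in for sleep_df: three rows, one missing awake_pct reading
demo_df = pd.DataFrame({
    "hours_sleep": [7.3, 6.83, 6.68],
    "awake_pct": [3.0, None, 4.0],
    "melatonin": ["No", "Yes", "Yes"],
})

# Number of missing values in each column
null_counts = demo_df.isna().sum()
print(null_counts)

# The rows that contain at least one missing value
rows_with_nulls = demo_df[demo_df.isna().any(axis=1)]
print(rows_with_nulls)
```

Looking at the affected rows, not just the counts, shows whether the gaps all fall on the same night or are scattered across several.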

The "Timestamp" column came along from our Google Sheet because it can't be deleted in Sheets; it can only be hidden. It simply records when each submission went into the Google Form, which we don't need for our analysis, so we can delete it.

In [6]:
del sleep_df['Timestamp']
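The `del` statement above modifies the DataFrame in place. An equivalent alternative, sketched here on a small hypothetical `demo_df` rather than the real data, is `DataFrame.drop(columns=...)`, which returns a new DataFrame and leaves the original untouched unless the result is assigned back.

```python
import pandas as pd

# Hypothetical stand-in with the same column names as the first rows of sleep_df
demo_df = pd.DataFrame({
    "Timestamp": ["2/8/2023 9:10:02", "2/9/2023 7:56:47"],
    "sleep_date": ["2/7/2023", "2/8/2023"],
    "hours_sleep": [7.3, 6.83],
})

# drop() returns a copy without the column; demo_df itself is unchanged
trimmed_df = demo_df.drop(columns=["Timestamp"])

print(trimmed_df.columns.tolist())
print(demo_df.shape, trimmed_df.shape)
```

Either form is fine for a one-off cleanup; `drop` can be handier when the original frame should stay available for comparison.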

Let's check our first five rows to make sure that worked.

In [7]:
sleep_df.head()

Unnamed: 0,sleep_date,day_of_the_week,hours_sleep,awake_pct,rem_pct,core_pct,deep_pct,hr_bpm_min,hr_bpm_max,resp_rate_min,resp_rate_max,melatonin,magnesium,neuriva,chamomile,bath,shower,meditation
0,2/7/2023,Tuesday,7.3,3.0,12.0,80.0,5.0,61.0,69.0,8.0,12.5,No,Yes,No,No,Yes,No,Yes
1,2/8/2023,Wednesday,6.83,5.0,22.0,67.0,6.0,62.0,78.0,8.0,12.5,Yes,Yes,No,No,No,No,Yes
2,2/9/2023,Thursday,6.68,4.0,11.0,79.0,6.0,61.0,73.0,8.5,12.5,Yes,Yes,No,No,Yes,No,No
3,2/10/2023,Friday,7.5,4.0,14.0,78.0,4.0,59.0,70.0,9.0,14.0,No,Yes,No,No,No,No,Yes
4,2/11/2023,Saturday,6.55,3.0,21.0,71.0,5.0,58.0,76.0,8.0,15.0,No,No,No,No,No,No,No


And now let's get a new count of the number of rows and columns.

In [8]:
sleep_df.shape

(31, 18)
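The `info()` output earlier lists sleep_date as an `object` column, meaning the dates are stored as plain strings. If date arithmetic or sorting is needed later, one option is to parse them with `pd.to_datetime`. This is a sketch on two dates taken from the first rows shown above, and the `%m/%d/%Y` format is an assumption based on how those dates are displayed.

```python
import pandas as pd

# Two sleep_date values as they appear in the first rows of the data
dates = pd.Series(["2/7/2023", "2/8/2023"], name="sleep_date")

# Parse month/day/year strings into datetime64 values
parsed = pd.to_datetime(dates, format="%m/%d/%Y")

print(parsed.dtype)
# Day names derived from the parsed dates can be compared with day_of_the_week
print(parsed.dt.day_name().tolist())
```

Comparing the derived day names against the day_of_the_week column is a cheap sanity check that the dates were entered correctly.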

Next, let's check whether any sleep_date appears more than once.

In [10]:
dups = sleep_df[sleep_df.duplicated('sleep_date', keep=False)]
print(dups)

   sleep_date day_of_the_week  hours_sleep  awake_pct  rem_pct  core_pct  \
16  2/23/2023        Thursday         6.48        2.0     17.0      74.0   
17  2/23/2023        Thursday         6.53        4.0     20.0      74.0   

    deep_pct  hr_bpm_min  hr_bpm_max  resp_rate_min  resp_rate_max melatonin  \
16       7.0        59.0        73.0            8.5           13.0        No   
17       2.0        62.0        77.0            9.0           14.5        No   

   magnesium neuriva chamomile bath shower meditation  
16       Yes     Yes        No  Yes     No        Yes  
17       Yes     Yes        No  Yes     No        Yes  
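The output shows two entries for 2/23/2023 with different readings. The notebook does not say here how that duplicate is resolved, so the following is only one possible approach: keeping the first entry per date with `drop_duplicates`. The `demo_df` below is a hypothetical stand-in; the two 2/23/2023 hours_sleep values come from the output above, while the 2/22/2023 row is made up for illustration.

```python
import pandas as pd

# Stand-in data: the 2/22/2023 row is invented, the 2/23/2023 rows mirror the output above
demo_df = pd.DataFrame({
    "sleep_date": ["2/22/2023", "2/23/2023", "2/23/2023"],
    "hours_sleep": [7.0, 6.48, 6.53],
})

# Keep only the first row for each sleep_date (an assumption, not the author's stated choice)
deduped_df = (
    demo_df.drop_duplicates(subset="sleep_date", keep="first")
    .reset_index(drop=True)
)

print(deduped_df)
```

Passing `keep="last"` would retain the later entry instead; which one is correct depends on which submission reflects the actual night, something only the person who logged the data can judge.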
