# DataFrame

DataFrame is probably the most important and most commonly used object in pandas. A DataFrame is essentially a collection of Series that share a common index, arranged as a table of rows and columns. For an in-depth study of DataFrames, see [this tutorial](https://www.datacamp.com/community/tutorials/pandas-tutorial-dataframe-python) from DataCamp.
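
To make the "collection of Series" idea concrete, here is a minimal sketch that builds a DataFrame from two Series with the same index. The names `heights`, `weights`, and `people_df` are illustrative and do not appear elsewhere in this notebook.

```python
import pandas as pd

# Two Series that share the same index labels (illustrative data).
heights = pd.Series([72, 69], index=['A', 'B'], name='Height')
weights = pd.Series([186, 205], index=['A', 'B'], name='Weight')

# Each Series becomes one column; the shared index becomes the row labels.
people_df = pd.DataFrame({'Height': heights, 'Weight': weights})

# Selecting a single column gives back a Series.
height_column = people_df['Height']
print(people_df)
```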

In [1]:
import pandas as pd

With the import in place, let's define the problem we are going to solve using a DataFrame.

## Problem 1

You are given the heights (in inches) and weights (in lbs) of 5 people, as shown in the table below:

| Person | Height(inch) | Weight(lbs) |
|--------|--------------|-------------|
|    A   |     72       |      186    |
|    B   |     69       |      205    |
|    C   |     70       |      201    |
|    D   |     62       |      125    |
|    E   |     57       |      89     |

It is later found that the table is missing entries for persons F and G, whose heights are 65 inches and 60 inches respectively. The weight of F is 121 lbs, but the weight of G is not available. In addition, every recorded height is 5 inches less than it should be, and every recorded weight is 5 lbs more than it should be. Find the correct Body Mass Index (BMI) for each person, where possible.
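
As a sanity check on what the DataFrame steps should produce, the calculation for a single person can be sketched in plain Python. This sketch assumes the usual definition of BMI as weight in kilograms divided by the square of height in meters; the helper name `corrected_bmi` is hypothetical.

```python
# Conversion factors (the same values used later in this notebook).
LBS_TO_KG = 0.453592
INCH_TO_METER = 0.0254

def corrected_bmi(recorded_height_inch, recorded_weight_lbs):
    """Return the BMI after undoing the two recording errors."""
    # Recorded heights are 5 inches too low.
    height_m = (recorded_height_inch + 5) * INCH_TO_METER
    # Recorded weights are 5 lbs too high.
    weight_kg = (recorded_weight_lbs - 5) * LBS_TO_KG
    return weight_kg / height_m ** 2

bmi_a = corrected_bmi(72, 186)   # person A from the table
print(round(bmi_a, 2))           # 21.46
```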

In [2]:
height_weight_df = pd.DataFrame(data=[['A',72,186],['B',69,205],['C',70,201],['D',62,125],['E',57,89]], columns=['Person','Height','Weight'])

In [3]:
height_weight_df.head(n=3)

|   | Person | Height | Weight |
|---|--------|--------|--------|
| 0 | A | 72 | 186 |
| 1 | B | 69 | 205 |
| 2 | C | 70 | 201 |


In [4]:
height_weight_df.describe()

|       | Height | Weight |
|-------|--------|--------|
| count | 5.0 | 5.0 |
| mean | 66.0 | 161.2 |
| std | 6.284903 | 51.577127 |
| min | 57.0 | 89.0 |
| 25% | 62.0 | 125.0 |
| 50% | 69.0 | 186.0 |
| 75% | 70.0 | 201.0 |
| max | 72.0 | 205.0 |


In [5]:
height_weight_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
Person    5 non-null object
Height    5 non-null int64
Weight    5 non-null int64
dtypes: int64(2), object(1)
memory usage: 200.0+ bytes


In [6]:
height_weight_df.set_index(['Person'])

| Person | Height | Weight |
|--------|--------|--------|
| A | 72 | 186 |
| B | 69 | 205 |
| C | 70 | 201 |
| D | 62 | 125 |
| E | 57 | 89 |
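
By default, `set_index` returns a new DataFrame and leaves the original untouched, which is why the next cell repeats the call with `inplace=True`. A minimal sketch on a throwaway frame (the names `demo_df` and `indexed_df` are illustrative):

```python
import pandas as pd

demo_df = pd.DataFrame({'Person': ['A', 'B'], 'Height': [72, 69]})

# Without inplace=True the result is a new object; demo_df keeps its RangeIndex.
indexed_df = demo_df.set_index('Person')
print(list(demo_df.columns))     # 'Person' is still a regular column here
print(list(indexed_df.columns))  # 'Person' has moved into the index

# Reassigning the result is an alternative to inplace=True.
demo_df = demo_df.set_index('Person')
```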


In [7]:
height_weight_df.set_index(['Person'],inplace=True)

In [8]:
height_weight_df

| Person | Height | Weight |
|--------|--------|--------|
| A | 72 | 186 |
| B | 69 | 205 |
| C | 70 | 201 |
| D | 62 | 125 |
| E | 57 | 89 |


In [9]:
height_weight_df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 5 entries, A to E
Data columns (total 2 columns):
Height    5 non-null int64
Weight    5 non-null int64
dtypes: int64(2)
memory usage: 120.0+ bytes


In [10]:
height_weight_df['Height']

Person
A    72
B    69
C    70
D    62
E    57
Name: Height, dtype: int64

In [11]:
height_weight_df['Weight']

Person
A    186
B    205
C    201
D    125
E     89
Name: Weight, dtype: int64

In [12]:
type(height_weight_df['Height'])

pandas.core.series.Series

In [13]:
missing_df = pd.DataFrame(data=[[65,121],[60]],index=['F','G'],columns=['Height','Weight'])

In [14]:
missing_df

|   | Height | Weight |
|---|--------|--------|
| F | 65 | 121.0 |
| G | 60 | NaN |


In [15]:
# DataFrame.append was removed in pandas 2.0; pd.concat stacks the rows the same way
updated_df = pd.concat([height_weight_df, missing_df])
updated_df

|   | Height | Weight |
|---|--------|--------|
| A | 72 | 186.0 |
| B | 69 | 205.0 |
| C | 70 | 201.0 |
| D | 62 | 125.0 |
| E | 57 | 89.0 |
| F | 65 | 121.0 |
| G | 60 | NaN |


In [16]:
updated_df['Height'] += 5
updated_df['Weight'] -=5

In [17]:
updated_df

|   | Height | Weight |
|---|--------|--------|
| A | 77 | 181.0 |
| B | 74 | 200.0 |
| C | 75 | 196.0 |
| D | 67 | 120.0 |
| E | 62 | 84.0 |
| F | 70 | 116.0 |
| G | 65 | NaN |


In [18]:
updated_df.fillna(updated_df.mean())

|   | Height | Weight |
|---|--------|--------|
| A | 77 | 181.0 |
| B | 74 | 200.0 |
| C | 75 | 196.0 |
| D | 67 | 120.0 |
| E | 62 | 84.0 |
| F | 70 | 116.0 |
| G | 65 | 149.5 |


In [19]:
updated_df

|   | Height | Weight |
|---|--------|--------|
| A | 77 | 181.0 |
| B | 74 | 200.0 |
| C | 75 | 196.0 |
| D | 67 | 120.0 |
| E | 62 | 84.0 |
| F | 70 | 116.0 |
| G | 65 | NaN |
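
The value 149.5 that `fillna` produced for G is the mean of the six known weights, and because the `fillna` result was not assigned back, `updated_df` still holds the missing value. The sketch below reproduces both observations on a standalone Series (the variable names are illustrative) and contrasts filling with dropping:

```python
import numpy as np
import pandas as pd

weights = pd.Series([181.0, 200.0, 196.0, 120.0, 84.0, 116.0, np.nan],
                    index=list('ABCDEFG'), name='Weight')

# mean() skips NaN by default, so it averages the six known weights.
mean_weight = weights.mean()

# fillna returns a new Series; `weights` itself still holds NaN for G.
filled = weights.fillna(mean_weight)

# dropna removes the entry instead of inventing a value for it.
dropped = weights.dropna()

print(mean_weight, filled['G'], weights.isna().sum(), len(dropped))
```

Filling G's weight with the mean would yield a BMI that is not based on a real measurement, which is presumably why the notebook drops the row instead.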


In [20]:
updated_df.dropna(inplace=True)

In [21]:
updated_df

|   | Height | Weight |
|---|--------|--------|
| A | 77 | 181.0 |
| B | 74 | 200.0 |
| C | 75 | 196.0 |
| D | 67 | 120.0 |
| E | 62 | 84.0 |
| F | 70 | 116.0 |


In [22]:
lbs_to_kg_ratio = 0.453592
inch_to_meter_ratio = 0.0254

In [23]:
updated_df['Height'] *= inch_to_meter_ratio
updated_df['Weight'] *= lbs_to_kg_ratio

In [24]:
updated_df

|   | Height | Weight |
|---|--------|--------|
| A | 1.9558 | 82.100152 |
| B | 1.8796 | 90.7184 |
| C | 1.905 | 88.904032 |
| D | 1.7018 | 54.43104 |
| E | 1.5748 | 38.101728 |
| F | 1.778 | 52.616672 |


In [25]:
updated_df['BMI'] = updated_df['Weight']/(updated_df['Height']**2)

In [26]:
updated_df

|   | Height | Weight | BMI |
|---|--------|--------|-----|
| A | 1.9558 | 82.100152 | 21.46323 |
| B | 1.8796 | 90.7184 | 25.678196 |
| C | 1.905 | 88.904032 | 24.498049 |
| D | 1.7018 | 54.43104 | 18.794449 |
| E | 1.5748 | 38.101728 | 15.363631 |
| F | 1.778 | 52.616672 | 16.644083 |


In [27]:
updated_df.to_csv('BMI.csv',index_label='Person')
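
The `index_label` argument names the index column in the CSV, so the file can be read back with that column restored as the index. A minimal round-trip sketch, using an in-memory buffer instead of a file and a small illustrative frame (`bmi_df` and `restored_df` are hypothetical names):

```python
import io
import pandas as pd

bmi_df = pd.DataFrame({'BMI': [21.46, 25.68]}, index=['A', 'B'])

# Write to an in-memory buffer so the sketch leaves no file behind.
buffer = io.StringIO()
bmi_df.to_csv(buffer, index_label='Person')
buffer.seek(0)

# index_col tells read_csv to turn the 'Person' column back into the index.
restored_df = pd.read_csv(buffer, index_col='Person')
print(restored_df)
```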

## Problem 2

In [28]:
travel_df = pd.read_csv('travel-times.csv')

In [29]:
travel_df.head()

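|   | Date | StartTime | DayOfWeek | GoingTo | Distance | MaxSpeed | AvgSpeed | AvgMovingSpeed | FuelEconomy | TotalTime | MovingTime | Take407All | Comments |
|---|------|-----------|-----------|---------|----------|----------|----------|----------------|-------------|-----------|------------|------------|----------|
| 0 | 1/6/2012 | 16:37 | Friday | Home | 51.29 | 127.4 | 78.3 | 84.8 | NaN | 39.3 | 36.3 | No | NaN |
| 1 | 1/6/2012 | 08:20 | Friday | GSK | 51.63 | 130.3 | 81.8 | 88.9 | NaN | 37.9 | 34.9 | No | NaN |
| 2 | 1/4/2012 | 16:17 | Wednesday | Home | 51.27 | 127.4 | 82.0 | 85.8 | NaN | 37.5 | 35.9 | No | NaN |
| 3 | 1/4/2012 | 07:53 | Wednesday | GSK | 49.17 | 132.3 | 74.2 | 82.9 | NaN | 39.8 | 35.6 | No | NaN |
| 4 | 1/3/2012 | 18:57 | Tuesday | Home | 51.15 | 136.2 | 83.4 | 88.1 | NaN | 36.8 | 34.8 | No | NaN |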


In [30]:
travel_df.describe()

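|       | Distance | MaxSpeed | AvgSpeed | AvgMovingSpeed | TotalTime | MovingTime |
|-------|----------|----------|----------|----------------|-----------|------------|
| count | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 | 205.0 |
| mean | 50.981512 | 127.591707 | 74.477561 | 81.97561 | 41.90439 | 37.871707 |
| std | 1.321205 | 4.12845 | 11.409816 | 10.111544 | 6.849476 | 4.835072 |
| min | 48.32 | 112.2 | 38.1 | 50.3 | 28.2 | 27.1 |
| 25% | 50.65 | 124.9 | 68.9 | 76.6 | 38.4 | 35.7 |
| 50% | 51.14 | 127.4 | 73.6 | 81.4 | 41.3 | 37.6 |
| 75% | 51.63 | 129.8 | 79.9 | 86.0 | 44.4 | 39.9 |
| max | 60.32 | 140.9 | 107.7 | 112.1 | 82.3 | 62.4 |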


In [31]:
travel_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 205 entries, 0 to 204
Data columns (total 13 columns):
Date              205 non-null object
StartTime         205 non-null object
DayOfWeek         205 non-null object
GoingTo           205 non-null object
Distance          205 non-null float64
MaxSpeed          205 non-null float64
AvgSpeed          205 non-null float64
AvgMovingSpeed    205 non-null float64
FuelEconomy       188 non-null object
TotalTime         205 non-null float64
MovingTime        205 non-null float64
Take407All        205 non-null object
Comments          24 non-null object
dtypes: float64(6), object(7)
memory usage: 20.9+ KB


In [32]:
travel_df.drop(labels=['Comments'],axis=1,inplace=True)

In [33]:
travel_df.head()

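|   | Date | StartTime | DayOfWeek | GoingTo | Distance | MaxSpeed | AvgSpeed | AvgMovingSpeed | FuelEconomy | TotalTime | MovingTime | Take407All |
|---|------|-----------|-----------|---------|----------|----------|----------|----------------|-------------|-----------|------------|------------|
| 0 | 1/6/2012 | 16:37 | Friday | Home | 51.29 | 127.4 | 78.3 | 84.8 | NaN | 39.3 | 36.3 | No |
| 1 | 1/6/2012 | 08:20 | Friday | GSK | 51.63 | 130.3 | 81.8 | 88.9 | NaN | 37.9 | 34.9 | No |
| 2 | 1/4/2012 | 16:17 | Wednesday | Home | 51.27 | 127.4 | 82.0 | 85.8 | NaN | 37.5 | 35.9 | No |
| 3 | 1/4/2012 | 07:53 | Wednesday | GSK | 49.17 | 132.3 | 74.2 | 82.9 | NaN | 39.8 | 35.6 | No |
| 4 | 1/3/2012 | 18:57 | Tuesday | Home | 51.15 | 136.2 | 83.4 | 88.1 | NaN | 36.8 | 34.8 | No |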


In [34]:
travel_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 205 entries, 0 to 204
Data columns (total 12 columns):
Date              205 non-null object
StartTime         205 non-null object
DayOfWeek         205 non-null object
GoingTo           205 non-null object
Distance          205 non-null float64
MaxSpeed          205 non-null float64
AvgSpeed          205 non-null float64
AvgMovingSpeed    205 non-null float64
FuelEconomy       188 non-null object
TotalTime         205 non-null float64
MovingTime        205 non-null float64
Take407All        205 non-null object
dtypes: float64(6), object(6)
memory usage: 19.3+ KB


In [35]:
travel_df['FuelEconomy'] = pd.to_numeric(travel_df['FuelEconomy'],errors='coerce')

In [36]:
travel_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 205 entries, 0 to 204
Data columns (total 12 columns):
Date              205 non-null object
StartTime         205 non-null object
DayOfWeek         205 non-null object
GoingTo           205 non-null object
Distance          205 non-null float64
MaxSpeed          205 non-null float64
AvgSpeed          205 non-null float64
AvgMovingSpeed    205 non-null float64
FuelEconomy       186 non-null float64
TotalTime         205 non-null float64
MovingTime        205 non-null float64
Take407All        205 non-null object
dtypes: float64(7), object(5)
memory usage: 19.3+ KB
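
The non-null count for `FuelEconomy` drops from 188 to 186 after the conversion, which suggests that two entries held text that could not be parsed as a number. With `errors='coerce'`, `pd.to_numeric` replaces such entries with `NaN` instead of raising an error. A minimal sketch with made-up values (the `'-'` placeholder is an assumption, not taken from the data file):

```python
import pandas as pd

# Illustrative values; '-' stands in for any non-numeric placeholder.
raw = pd.Series(['8.89', '-', None, '9.08'], dtype=object)

# errors='coerce' turns anything that cannot be parsed into NaN.
numeric = pd.to_numeric(raw, errors='coerce')

print(numeric.dtype)          # float64
print(numeric.notna().sum())  # 2
```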


In [37]:
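# Assign the result back; calling fillna(inplace=True) on a selected column may not update travel_df in recent pandas versions
travel_df['FuelEconomy'] = travel_df['FuelEconomy'].fillna(travel_df['FuelEconomy'].mean())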

In [38]:
travel_df.head()

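|   | Date | StartTime | DayOfWeek | GoingTo | Distance | MaxSpeed | AvgSpeed | AvgMovingSpeed | FuelEconomy | TotalTime | MovingTime | Take407All |
|---|------|-----------|-----------|---------|----------|----------|----------|----------------|-------------|-----------|------------|------------|
| 0 | 1/6/2012 | 16:37 | Friday | Home | 51.29 | 127.4 | 78.3 | 84.8 | 8.690591 | 39.3 | 36.3 | No |
| 1 | 1/6/2012 | 08:20 | Friday | GSK | 51.63 | 130.3 | 81.8 | 88.9 | 8.690591 | 37.9 | 34.9 | No |
| 2 | 1/4/2012 | 16:17 | Wednesday | Home | 51.27 | 127.4 | 82.0 | 85.8 | 8.690591 | 37.5 | 35.9 | No |
| 3 | 1/4/2012 | 07:53 | Wednesday | GSK | 49.17 | 132.3 | 74.2 | 82.9 | 8.690591 | 39.8 | 35.6 | No |
| 4 | 1/3/2012 | 18:57 | Tuesday | Home | 51.15 | 136.2 | 83.4 | 88.1 | 8.690591 | 36.8 | 34.8 | No |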


In [39]:
travel_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 205 entries, 0 to 204
Data columns (total 12 columns):
Date              205 non-null object
StartTime         205 non-null object
DayOfWeek         205 non-null object
GoingTo           205 non-null object
Distance          205 non-null float64
MaxSpeed          205 non-null float64
AvgSpeed          205 non-null float64
AvgMovingSpeed    205 non-null float64
FuelEconomy       205 non-null float64
TotalTime         205 non-null float64
MovingTime        205 non-null float64
Take407All        205 non-null object
dtypes: float64(7), object(5)
memory usage: 19.3+ KB


In [40]:
travel_df_by_date = travel_df.groupby(['Date','DayOfWeek'])
type(travel_df_by_date)

pandas.core.groupby.DataFrameGroupBy

In [41]:
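# numeric_only=True keeps the text columns (StartTime, GoingTo, Take407All) out of the sum
travel_df_by_date_combined = travel_df_by_date.sum(numeric_only=True)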

In [42]:
travel_df_by_date_combined.head()

| Date | DayOfWeek | Distance | MaxSpeed | AvgSpeed | AvgMovingSpeed | FuelEconomy | TotalTime | MovingTime |
|------|-----------|----------|----------|----------|----------------|-------------|-----------|------------|
| 1/2/2012 | Monday | 100.38 | 251.5 | 160.4 | 173.2 | 17.381183 | 75.1 | 69.6 |
| 1/3/2012 | Tuesday | 102.95 | 272.0 | 167.9 | 176.9 | 17.381183 | 73.6 | 69.8 |
| 1/4/2012 | Wednesday | 100.44 | 259.7 | 156.2 | 168.7 | 17.381183 | 77.3 | 71.5 |
| 1/6/2012 | Friday | 102.92 | 257.7 | 160.1 | 173.7 | 17.381183 | 77.2 | 71.2 |
| 10/11/2011 | Tuesday | 100.46 | 265.9 | 153.0 | 171.6 | 15.62 | 80.3 | 71.0 |


In [43]:
travel_df_by_date_combined['MaxSpeed'] = travel_df_by_date['MaxSpeed'].max()
travel_df_by_date_combined['AvgSpeed'] = travel_df_by_date['AvgSpeed'].mean()
travel_df_by_date_combined['AvgMovingSpeed'] = travel_df_by_date['AvgMovingSpeed'].mean()
travel_df_by_date_combined['FuelEconomy'] = travel_df_by_date['FuelEconomy'].mean()

In [44]:
travel_df_by_date_combined.head()

| Date | DayOfWeek | Distance | MaxSpeed | AvgSpeed | AvgMovingSpeed | FuelEconomy | TotalTime | MovingTime |
|------|-----------|----------|----------|----------|----------------|-------------|-----------|------------|
| 1/2/2012 | Monday | 100.38 | 128.3 | 80.2 | 86.6 | 8.690591 | 75.1 | 69.6 |
| 1/3/2012 | Tuesday | 102.95 | 136.2 | 83.95 | 88.45 | 8.690591 | 73.6 | 69.8 |
| 1/4/2012 | Wednesday | 100.44 | 132.3 | 78.1 | 84.35 | 8.690591 | 77.3 | 71.5 |
| 1/6/2012 | Friday | 102.92 | 130.3 | 80.05 | 86.85 | 8.690591 | 77.2 | 71.2 |
| 10/11/2011 | Tuesday | 100.46 | 135.1 | 76.5 | 85.8 | 7.81 | 80.3 | 71.0 |
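
Summing every column and then overwriting the ones that need a different statistic works, but the same result can be reached in one step by passing a column-to-function mapping to `agg`. This is an alternative sketch on a small made-up frame, not the approach the notebook itself takes; `trips` and `per_day` are illustrative names.

```python
import pandas as pd

trips = pd.DataFrame({
    'Date': ['1/2/2012', '1/2/2012', '1/3/2012', '1/3/2012'],
    'Distance': [50.0, 50.5, 51.0, 52.0],
    'MaxSpeed': [120.0, 128.0, 136.0, 130.0],
    'AvgSpeed': [80.0, 82.0, 84.0, 83.0],
})

# Each column gets the aggregation that makes sense for it:
# distances add up, the maximum speed is a maximum, average speeds are averaged.
per_day = trips.groupby('Date').agg({
    'Distance': 'sum',
    'MaxSpeed': 'max',
    'AvgSpeed': 'mean',
})
print(per_day)
```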


In [45]:
travel_df_by_date_combined.tail()

| Date | DayOfWeek | Distance | MaxSpeed | AvgSpeed | AvgMovingSpeed | FuelEconomy | TotalTime | MovingTime |
|------|-----------|----------|----------|----------|----------------|-------------|-----------|------------|
| 9/28/2011 | Wednesday | 101.88 | 128.8 | 90.7 | 93.6 | 8.93 | 68.5 | 66.4 |
| 9/29/2011 | Thursday | 102.04 | 128.4 | 74.4 | 79.4 | 8.93 | 83.6 | 77.9 |
| 9/6/2011 | Tuesday | 107.24 | 132.5 | 95.25 | 98.15 | 8.5 | 67.6 | 65.6 |
| 9/7/2011 | Wednesday | 100.42 | 132.8 | 64.0 | 71.55 | 8.5 | 95.2 | 84.5 |
| 9/8/2011 | Thursday | 100.17 | 137.0 | 60.2 | 70.4 | 8.5 | 100.8 | 86.2 |


In [46]:
travel_df_by_date_combined.iloc[0]

Distance          100.380000
MaxSpeed          128.300000
AvgSpeed           80.200000
AvgMovingSpeed     86.600000
FuelEconomy         8.690591
TotalTime          75.100000
MovingTime         69.600000
Name: (1/2/2012, Monday), dtype: float64

In [47]:
travel_df_by_date_combined.iloc[-1]

Distance          100.17
MaxSpeed          137.00
AvgSpeed           60.20
AvgMovingSpeed     70.40
FuelEconomy         8.50
TotalTime         100.80
MovingTime         86.20
Name: (9/8/2011, Thursday), dtype: float64

In [48]:
travel_df_by_date_combined.iloc[10]

Distance          103.07
MaxSpeed          125.20
AvgSpeed           70.70
AvgMovingSpeed     79.40
FuelEconomy         8.75
TotalTime          87.50
MovingTime         77.90
Name: (10/20/2011, Thursday), dtype: float64

In [49]:
travel_df_by_date_combined.reset_index(inplace=True)

In [50]:
travel_df_by_date_combined

|   | Date | DayOfWeek | Distance | MaxSpeed | AvgSpeed | AvgMovingSpeed | FuelEconomy | TotalTime | MovingTime |
|---|------|-----------|----------|----------|----------|----------------|-------------|-----------|------------|
| 0 | 1/2/2012 | Monday | 100.38 | 128.3 | 80.20 | 86.60 | 8.690591 | 75.1 | 69.6 |
| 1 | 1/3/2012 | Tuesday | 102.95 | 136.2 | 83.95 | 88.45 | 8.690591 | 73.6 | 69.8 |
| 2 | 1/4/2012 | Wednesday | 100.44 | 132.3 | 78.10 | 84.35 | 8.690591 | 77.3 | 71.5 |
| 3 | 1/6/2012 | Friday | 102.92 | 130.3 | 80.05 | 86.85 | 8.690591 | 77.2 | 71.2 |
| 4 | 10/11/2011 | Tuesday | 100.46 | 135.1 | 76.50 | 85.80 | 7.810000 | 80.3 | 71.0 |
| 5 | 10/12/2011 | Wednesday | 101.98 | 128.4 | 59.60 | 66.55 | 8.750000 | 102.7 | 92.0 |
| 6 | 10/13/2011 | Thursday | 50.66 | 128.3 | 105.50 | 111.30 | 8.750000 | 28.8 | 27.3 |
| 7 | 10/17/2011 | Monday | 101.91 | 137.1 | 86.15 | 91.60 | 8.750000 | 71.5 | 67.4 |
| 8 | 10/18/2011 | Tuesday | 103.10 | 133.1 | 75.15 | 82.15 | 8.750000 | 82.8 | 75.4 |
| 9 | 10/19/2011 | Wednesday | 102.98 | 130.2 | 69.55 | 77.45 | 8.750000 | 89.6 | 80.0 |


In [51]:
travel_df_by_date_combined.set_index(['Date'],inplace=True)

In [52]:
travel_df_by_date_combined

| Date | DayOfWeek | Distance | MaxSpeed | AvgSpeed | AvgMovingSpeed | FuelEconomy | TotalTime | MovingTime |
|------|-----------|----------|----------|----------|----------------|-------------|-----------|------------|
| 1/2/2012 | Monday | 100.38 | 128.3 | 80.20 | 86.60 | 8.690591 | 75.1 | 69.6 |
| 1/3/2012 | Tuesday | 102.95 | 136.2 | 83.95 | 88.45 | 8.690591 | 73.6 | 69.8 |
| 1/4/2012 | Wednesday | 100.44 | 132.3 | 78.10 | 84.35 | 8.690591 | 77.3 | 71.5 |
| 1/6/2012 | Friday | 102.92 | 130.3 | 80.05 | 86.85 | 8.690591 | 77.2 | 71.2 |
| 10/11/2011 | Tuesday | 100.46 | 135.1 | 76.50 | 85.80 | 7.810000 | 80.3 | 71.0 |
| 10/12/2011 | Wednesday | 101.98 | 128.4 | 59.60 | 66.55 | 8.750000 | 102.7 | 92.0 |
| 10/13/2011 | Thursday | 50.66 | 128.3 | 105.50 | 111.30 | 8.750000 | 28.8 | 27.3 |
| 10/17/2011 | Monday | 101.91 | 137.1 | 86.15 | 91.60 | 8.750000 | 71.5 | 67.4 |
| 10/18/2011 | Tuesday | 103.10 | 133.1 | 75.15 | 82.15 | 8.750000 | 82.8 | 75.4 |
| 10/19/2011 | Wednesday | 102.98 | 130.2 | 69.55 | 77.45 | 8.750000 | 89.6 | 80.0 |


In [53]:
type(travel_df_by_date_combined)

pandas.core.frame.DataFrame

In [54]:
distance_above_ninety = travel_df_by_date_combined['Distance']>90
distance_above_ninety

Date
1/2/2012       True
1/3/2012       True
1/4/2012       True
1/6/2012       True
10/11/2011     True
10/12/2011     True
10/13/2011    False
10/17/2011     True
10/18/2011     True
10/19/2011     True
10/20/2011     True
10/21/2011    False
10/24/2011    False
10/25/2011     True
10/26/2011    False
10/27/2011    False
10/28/2011    False
10/3/2011      True
10/31/2011     True
10/4/2011      True
10/5/2011      True
10/6/2011      True
10/7/2011      True
11/1/2011     False
11/10/2011     True
11/14/2011     True
11/15/2011     True
11/16/2011     True
11/17/2011     True
11/2/2011      True
              ...  
8/22/2011      True
8/23/2011      True
8/24/2011      True
8/25/2011      True
8/26/2011      True
8/29/2011      True
8/3/2011       True
8/30/2011      True
8/31/2011      True
8/4/2011       True
8/5/2011       True
8/8/2011       True
8/9/2011       True
9/1/2011       True
9/12/2011      True
9/13/2011      True
9/14/2011      True
9/15/2011      True
9/19/2011      

In [55]:
distance_below_hundred = travel_df_by_date_combined['Distance']<100
distance_below_hundred

Date
1/2/2012      False
1/3/2012      False
1/4/2012      False
1/6/2012      False
10/11/2011    False
10/12/2011    False
10/13/2011     True
10/17/2011    False
10/18/2011    False
10/19/2011    False
10/20/2011    False
10/21/2011     True
10/24/2011     True
10/25/2011    False
10/26/2011     True
10/27/2011     True
10/28/2011     True
10/3/2011     False
10/31/2011    False
10/4/2011     False
10/5/2011     False
10/6/2011     False
10/7/2011     False
11/1/2011      True
11/10/2011    False
11/14/2011    False
11/15/2011    False
11/16/2011    False
11/17/2011    False
11/2/2011     False
              ...  
8/22/2011     False
8/23/2011     False
8/24/2011     False
8/25/2011     False
8/26/2011      True
8/29/2011     False
8/3/2011      False
8/30/2011     False
8/31/2011     False
8/4/2011      False
8/5/2011      False
8/8/2011      False
8/9/2011      False
9/1/2011      False
9/12/2011     False
9/13/2011     False
9/14/2011     False
9/15/2011     False
9/19/2011     F

In [56]:
distance_above_ninety & distance_below_hundred

Date
1/2/2012      False
1/3/2012      False
1/4/2012      False
1/6/2012      False
10/11/2011    False
10/12/2011    False
10/13/2011    False
10/17/2011    False
10/18/2011    False
10/19/2011    False
10/20/2011    False
10/21/2011    False
10/24/2011    False
10/25/2011    False
10/26/2011    False
10/27/2011    False
10/28/2011    False
10/3/2011     False
10/31/2011    False
10/4/2011     False
10/5/2011     False
10/6/2011     False
10/7/2011     False
11/1/2011     False
11/10/2011    False
11/14/2011    False
11/15/2011    False
11/16/2011    False
11/17/2011    False
11/2/2011     False
              ...  
8/22/2011     False
8/23/2011     False
8/24/2011     False
8/25/2011     False
8/26/2011      True
8/29/2011     False
8/3/2011      False
8/30/2011     False
8/31/2011     False
8/4/2011      False
8/5/2011      False
8/8/2011      False
8/9/2011      False
9/1/2011      False
9/12/2011     False
9/13/2011     False
9/14/2011     False
9/15/2011     False
9/19/2011     F

In [57]:
ninety_to_hundred_df = travel_df_by_date_combined[distance_above_ninety & distance_below_hundred]
ninety_to_hundred_df

| Date | DayOfWeek | Distance | MaxSpeed | AvgSpeed | AvgMovingSpeed | FuelEconomy | TotalTime | MovingTime |
|------|-----------|----------|----------|----------|----------------|-------------|-----------|------------|
| 7/25/2011 | Monday | 99.37 | 126.6 | 66.9 | 78.6 | 8.45 | 96.8 | 75.9 |
| 7/27/2011 | Wednesday | 99.8 | 124.9 | 69.35 | 74.85 | 8.45 | 86.4 | 80.2 |
| 7/29/2011 | Friday | 99.75 | 135.6 | 90.45 | 94.05 | 8.45 | 68.4 | 65.5 |
| 8/26/2011 | Friday | 99.89 | 132.7 | 78.7 | 85.1 | 8.54 | 76.2 | 70.5 |
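
The filtering above relies on boolean masks: each comparison yields a True/False Series aligned on the same index, `&` combines the masks element-wise, and indexing with the combined mask keeps only the rows marked True. A compact sketch using three distances taken from the tables above (the variable names are illustrative):

```python
import pandas as pd

distances = pd.Series([100.38, 50.66, 99.89],
                      index=['1/2/2012', '10/13/2011', '8/26/2011'],
                      name='Distance')

# Each comparison produces a boolean Series with the same index.
above_ninety = distances > 90
below_hundred = distances < 100

# When the comparisons are written inline, each one needs its own
# parentheses because & binds more tightly than > and <.
in_range = distances[(distances > 90) & (distances < 100)]
print(in_range)
```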


In [58]:
ninety_to_hundred_df.loc[['8/26/2011'],['Distance','FuelEconomy']]

| Date | Distance | FuelEconomy |
|------|----------|-------------|
| 8/26/2011 | 99.89 | 8.54 |


In [59]:
# numeric_only=True restricts the sum to the numeric columns shown below
travel_df.groupby(['DayOfWeek']).sum(numeric_only=True)

| DayOfWeek | Distance | MaxSpeed | AvgSpeed | AvgMovingSpeed | FuelEconomy | TotalTime | MovingTime |
|-----------|----------|----------|----------|----------------|-------------|-----------|------------|
| Friday | 1375.89 | 3444.1 | 2204.8 | 2374.3 | 234.201183 | 1023.9 | 948.1 |
| Monday | 1981.04 | 4953.7 | 2854.7 | 3174.8 | 340.032957 | 1684.7 | 1487.7 |
| Thursday | 2239.72 | 5631.4 | 3272.1 | 3643.6 | 384.471183 | 1811.8 | 1646.4 |
| Tuesday | 2454.12 | 6155.3 | 3541.5 | 3882.9 | 414.212957 | 2041.0 | 1844.5 |
| Wednesday | 2400.44 | 5971.8 | 3394.8 | 3729.4 | 408.652957 | 2029.0 | 1837.0 |


In [60]:
travel_df.groupby(['DayOfWeek']).max()

| DayOfWeek | Date | StartTime | GoingTo | Distance | MaxSpeed | AvgSpeed | AvgMovingSpeed | FuelEconomy | TotalTime | MovingTime | Take407All |
|-----------|------|-----------|---------|----------|----------|----------|----------------|-------------|-----------|------------|------------|
| Friday | 9/2/2011 | 20:31 | Home | 55.57 | 135.6 | 107.7 | 112.1 | 9.76 | 47.9 | 43.2 | Yes |
| Monday | 9/26/2011 | 17:38 | Home | 54.52 | 137.8 | 104.4 | 106.2 | 10.05 | 82.3 | 62.4 | Yes |
| Thursday | 9/8/2011 | 17:58 | Home | 52.42 | 137.7 | 106.8 | 111.3 | 10.05 | 61.2 | 48.9 | Yes |
| Tuesday | 9/6/2011 | 18:57 | Home | 54.36 | 138.0 | 95.4 | 102.6 | 9.53 | 70.5 | 59.8 | Yes |
| Wednesday | 9/7/2011 | 18:05 | Home | 60.32 | 140.9 | 103.4 | 108.0 | 9.76 | 54.8 | 48.5 | Yes |


In [62]:
travel_df.groupby(['DayOfWeek'])['AvgMovingSpeed'].describe()

| DayOfWeek | count | mean | std | min | 25% | 50% | 75% | max |
|-----------|-------|------|-----|-----|-----|-----|-----|-----|
| Friday | 27.0 | 87.937037 | 9.288792 | 77.1 | 81.7 | 86.9 | 89.95 | 112.1 |
| Monday | 39.0 | 81.405128 | 10.600297 | 50.3 | 75.6 | 82.4 | 85.8 | 106.2 |
| Thursday | 44.0 | 82.809091 | 10.534082 | 63.1 | 77.05 | 80.85 | 85.55 | 111.3 |
| Tuesday | 48.0 | 80.89375 | 9.921338 | 51.5 | 75.55 | 81.05 | 85.65 | 102.6 |
| Wednesday | 47.0 | 79.348936 | 8.801207 | 65.8 | 74.1 | 78.7 | 81.75 | 108.0 |
