<H1>Pandas</H1>

* Pandas is a powerful, open-source Python library
* It is built on top of the NumPy library
* It is well suited to working with tabular data, such as spreadsheets or SQL tables
* Pandas is widely used in data science

<h3 style="color:skyblue;">What can Pandas do?</h3>

* Data set cleaning, merging, and joining.
* Easy handling of missing data (represented as NaN) in floating point as well as non-floating point data.
* Columns can be inserted and deleted from DataFrame and higher dimensional objects.
* Powerful group by functionality for performing split-apply-combine operations on data sets.
* Data visualization
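
The split-apply-combine idea from the list above can be sketched with `groupby()`. This is a minimal illustration; the `sales` table and its city names are invented for the example and are not part of this notebook's datasets.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration
sales = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Delhi"],
    "amount": [100, 200, 300, 400],
})

# split the rows by city, apply sum() to each group,
# and combine the results into one Series indexed by city
totals = sales.groupby("city")["amount"].sum()
print(totals)  # Delhi -> 600, Pune -> 400
```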

<h3 style="color:skyblue;">Install pandas</h3>

In [1]:
# !pip install pandas 

<h3 style="color:skyblue;">Basic data structures in pandas</h3> <br>
Pandas provides two types of classes for handling data: <br>

1. <b>Series:</b> <br>
   a one-dimensional labeled array holding data of any type (integers, strings, Python objects etc.)
   
2. <b>DataFrame:</b> <br>
   a two-dimensional data structure that holds data like a two-dimensional array or a table with rows and columns

<h3 style="color:skyblue;">Pandas Series</h3> 

* A Pandas Series is like a column in a table.

In [2]:
# import pandas under its conventional alias
import pandas as pd

# a simple list of values
data = [1, 2, 3, 4]

# build a Series and give each value a custom index label
ser = pd.Series(data, index = ['a', 'b', 'c', 'd'])

print(ser)
print("The type of pandas series is:", type(ser))

a    1
b    2
c    3
d    4
dtype: int64
The type of pandas series is: <class 'pandas.core.series.Series'>


<b>Note:</b>  

* A Pandas Series can be thought of as a single column in an Excel sheet. <br>
* The axis labels are collectively called the index.
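
As a small sketch of how the index is used, the labels of a Series can look up values much like dictionary keys. The Series below mirrors the one built earlier; nothing else is assumed.

```python
import pandas as pd

ser = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])

print(ser.index)         # the axis labels, collectively the index
print(ser["b"])          # one value looked up by its label: 2
print(ser[["a", "d"]])   # several labels at once return a smaller Series
```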

In [3]:
import pandas as pd

vec = {"vehicle": "car", "cost": "expensive", "color": "white"}

# myvar = pd.Series(vec)  # would keep all three keys as index labels

# passing index keeps only the listed keys
myvar = pd.Series(vec, index = ["vehicle", "cost"])

print(myvar)

vehicle          car
cost       expensive
dtype: object


<h3 style="color:skyblue;">DataFrames</h3> 

* A DataFrame is a two-dimensional table with labeled rows and columns
* A Series is a single column, and a DataFrame is the whole table
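
The relationship between the two structures can be checked directly: selecting one column of a DataFrame gives back a Series. The sketch below uses a small subject/marks table; the variable names `table` and `marks` are chosen only for this illustration.

```python
import pandas as pd

table = pd.DataFrame({
    "subject": ["science", "maths", "social"],
    "marks": [80, 99, 45],
})

marks = table["marks"]   # a single column of the table
print(type(marks))       # <class 'pandas.core.series.Series'>
print(table.shape)       # (3, 2): three rows, two columns
```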

In [4]:
import pandas as pd

data = {
  "subject": ["science", "maths", "social"],
  "marks": [80, 99, 45]
}

#load data into a DataFrame object
df = pd.DataFrame(data)

# print(df.to_csv("c.csv"))
print("The type of pandas dataframe is:", type(df))

The type of pandas dataframe is: <class 'pandas.core.frame.DataFrame'>


<h3 style="color:green;">REMEMBER:</h3> <b> Pandas uses the loc attribute to return one or more rows selected by index label </b>

In [5]:
print(df.loc[1])

subject    maths
marks         99
Name: 1, dtype: object


In [6]:
print(df.loc[[0, 2]])

   subject  marks
0  science     80
2   social     45


<h3 style="color:skyblue;">Read CSV Files</h3> 

* A CSV (comma-separated values) file is a common way to store large datasets
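
To see what `read_csv()` does without needing a file on disk, the sketch below parses CSV text held in memory. The two-row sample is made up for illustration; `io.StringIO` simply lets a string stand in for a file.

```python
import io
import pandas as pd

# Hypothetical CSV content: a header line followed by two data rows
csv_text = "Duration,Pulse\n60,110\n45,117\n"

# read_csv accepts a file path or any file-like object
sample = pd.read_csv(io.StringIO(csv_text))
print(sample)
```

The first line of the text becomes the column names, and each following line becomes one row.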

In [8]:
import pandas as pd

df = pd.read_csv('data/heart_failure_clinical_records_dataset.csv')
df.head()

Unnamed: 0,age,anaemia,creatinine_phosphokinase,diabetes,ejection_fraction,high_blood_pressure,platelets,serum_creatinine,serum_sodium,sex,smoking,time,DEATH_EVENT
0,75.0,0,582,0,20,1,265000.0,1.9,130,1,0,4,1
1,55.0,0,7861,0,38,0,263358.03,1.1,136,1,0,6,1
2,65.0,0,146,0,20,0,162000.0,1.3,129,1,1,7,1
3,50.0,1,111,0,20,0,210000.0,1.9,137,1,0,7,1
4,65.0,1,160,1,20,0,327000.0,2.7,116,0,0,8,1


In [9]:
df.head(10) # view first 10 rows

Unnamed: 0,age,anaemia,creatinine_phosphokinase,diabetes,ejection_fraction,high_blood_pressure,platelets,serum_creatinine,serum_sodium,sex,smoking,time,DEATH_EVENT
0,75.0,0,582,0,20,1,265000.0,1.9,130,1,0,4,1
1,55.0,0,7861,0,38,0,263358.03,1.1,136,1,0,6,1
2,65.0,0,146,0,20,0,162000.0,1.3,129,1,1,7,1
3,50.0,1,111,0,20,0,210000.0,1.9,137,1,0,7,1
4,65.0,1,160,1,20,0,327000.0,2.7,116,0,0,8,1
5,90.0,1,47,0,40,1,204000.0,2.1,132,1,1,8,1
6,75.0,1,246,0,15,0,127000.0,1.2,137,1,0,10,1
7,60.0,1,315,1,60,0,454000.0,1.1,131,1,1,10,1
8,65.0,0,157,0,65,0,263358.03,1.5,138,0,0,10,1
9,80.0,1,123,0,35,1,388000.0,9.4,133,1,1,10,1


In [10]:
df.tail() #view last five rows

Unnamed: 0,age,anaemia,creatinine_phosphokinase,diabetes,ejection_fraction,high_blood_pressure,platelets,serum_creatinine,serum_sodium,sex,smoking,time,DEATH_EVENT
294,62.0,0,61,1,38,1,155000.0,1.1,143,1,1,270,0
295,55.0,0,1820,0,38,0,270000.0,1.2,139,0,0,271,0
296,45.0,0,2060,1,60,0,742000.0,0.8,138,0,0,278,0
297,45.0,0,2413,0,38,0,140000.0,1.4,140,1,1,280,0
298,50.0,0,196,0,45,0,395000.0,1.6,136,1,1,285,0


In [11]:
print(df.info())  # gives information about dataset

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 299 entries, 0 to 298
Data columns (total 13 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   age                       299 non-null    float64
 1   anaemia                   299 non-null    int64  
 2   creatinine_phosphokinase  299 non-null    int64  
 3   diabetes                  299 non-null    int64  
 4   ejection_fraction         299 non-null    int64  
 5   high_blood_pressure       299 non-null    int64  
 6   platelets                 299 non-null    float64
 7   serum_creatinine          299 non-null    float64
 8   serum_sodium              299 non-null    int64  
 9   sex                       299 non-null    int64  
 10  smoking                   299 non-null    int64  
 11  time                      299 non-null    int64  
 12  DEATH_EVENT               299 non-null    int64  
dtypes: float64(3), int64(10)
memory usage: 30.5 KB
None


In [12]:
df.isna().sum()

age                         0
anaemia                     0
creatinine_phosphokinase    0
diabetes                    0
ejection_fraction           0
high_blood_pressure         0
platelets                   0
serum_creatinine            0
serum_sodium                0
sex                         0
smoking                     0
time                        0
DEATH_EVENT                 0
dtype: int64

<b>Note</b> : Empty values, or null values, can distort the results when analyzing data

In [52]:
# read data.csv 
# find first five rows, 50 rows

<h2 style="color:skyblue;"> Cleaning Data</h2>

* GIGO (Garbage In Garbage Out)

Some possible cases of bad data are:

* Empty cells
* Data in wrong format
* Wrong data
* Duplicates
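
As a sketch of the last case in the list, duplicate rows can be found with `duplicated()` and removed with `drop_duplicates()`. The three-row table is invented for the example.

```python
import pandas as pd

# Hypothetical data: the second row repeats the first
workouts = pd.DataFrame({"Duration": [60, 60, 45], "Pulse": [110, 110, 117]})

# True marks a row that repeats an earlier row
print(workouts.duplicated())

# returns a new DataFrame without the repeated rows
deduped = workouts.drop_duplicates()
print(deduped)
```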

<h3 style="color:blue;"> Dealing with empty cells</h3> <br>
There are two ways to deal with empty cells. <br>

* Remove rows
* Replace empty values

<h3 style="color:violet;"> Removing rows </h3> 

* We remove rows that contain empty cells with dropna()

In [2]:
import pandas as pd

df = pd.read_csv('data/fitness.csv')
df.head(20)

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110,130,409.1
1,60,117,145,479.0
2,60,103,135,340.0
3,45,109,175,282.4
4,45,117,148,406.0
5,60,102,127,300.0
6,60,110,136,374.0
7,45,104,134,253.3
8,30,109,133,195.1
9,60,98,124,269.0


In [3]:
df1 = df.dropna()

<h3 style="color:green;">REMEMBER:</h3> <br> 

* <b> dropna() method returns a new DataFrame, and will not change the original</b>
* <b> use the inplace = True argument, if you want to change the original DataFrame </b>
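
The difference can be seen on a tiny table. This is a minimal sketch with made-up values; `raw` and `cleaned` are names chosen for the illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one empty cell
raw = pd.DataFrame({"Duration": [60, 45, 30], "Calories": [409.1, np.nan, 195.1]})

cleaned = raw.dropna()           # returns a new DataFrame
print(len(raw), len(cleaned))    # 3 2: the original still has all rows

raw.dropna(inplace=True)         # now the original itself is changed
print(len(raw))                  # 2
```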

In [4]:
df.dropna(inplace = True)

In [5]:
df.head(40)

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110,130,409.1
1,60,117,145,479.0
2,60,103,135,340.0
3,45,109,175,282.4
4,45,117,148,406.0
5,60,102,127,300.0
6,60,110,136,374.0
7,45,104,134,253.3
8,30,109,133,195.1
9,60,98,124,269.0


<h3 style="color:violet;"> Replace Empty Values </h3> 

* The <b>fillna()</b> method allows us to replace empty cells with a value

In [6]:
df.fillna(20, inplace = True)

In [7]:
df.head(20)

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110,130,409.1
1,60,117,145,479.0
2,60,103,135,340.0
3,45,109,175,282.4
4,45,117,148,406.0
5,60,102,127,300.0
6,60,110,136,374.0
7,45,104,134,253.3
8,30,109,133,195.1
9,60,98,124,269.0


In [9]:
import pandas as pd

df = pd.read_csv('data/fitness.csv')
df.head()

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110,130,409.1
1,60,117,145,479.0
2,60,103,135,340.0
3,45,109,175,282.4
4,45,117,148,406.0


In [10]:
df["Pulse"]

0      110
1      117
2      103
3      109
4      117
      ... 
164    105
165    110
166    115
167    120
168    125
Name: Pulse, Length: 169, dtype: int64

In [11]:
df["Calories"] = df["Calories"].fillna(130)  # replace empty cells in one column only

In [12]:
df.head(20)

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110,130,409.1
1,60,117,145,479.0
2,60,103,135,340.0
3,45,109,175,282.4
4,45,117,148,406.0
5,60,102,127,300.0
6,60,110,136,374.0
7,45,104,134,253.3
8,30,109,133,195.1
9,60,98,124,269.0


<b>Note: </b> We can use mean, median and mode to replace empty values. <br>

Pandas provides the mean(), median(), and mode() methods for this

In [15]:
import pandas as pd

df = pd.read_csv('data/fitness.csv')

# mean of the column; median() or mode()[0] can be used in the same way
x = df["Calories"].mean()
print(x)
df["Calories"] = df["Calories"].fillna(x)

375.79024390243904


In [14]:
df.head(20)

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110,130,409.1
1,60,117,145,479.0
2,60,103,135,340.0
3,45,109,175,282.4
4,45,117,148,406.0
5,60,102,127,300.0
6,60,110,136,374.0
7,45,104,134,253.3
8,30,109,133,195.1
9,60,98,124,269.0


In [17]:
import pandas as pd

df = pd.read_csv("data/fitness.csv")
df.head()

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110,130,409.1
1,60,117,145,479.0
2,60,103,135,340.0
3,45,109,175,282.4
4,45,117,148,406.0


In [18]:
df.describe() # prints the summary statistics of all numeric columns

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
count,169.0,169.0,169.0,164.0
mean,63.846154,107.461538,134.047337,375.790244
std,42.299949,14.510259,16.450434,266.379919
min,15.0,80.0,100.0,50.3
25%,45.0,100.0,124.0,250.925
50%,60.0,105.0,131.0,318.6
75%,60.0,111.0,141.0,387.6
max,300.0,159.0,184.0,1860.4


In [19]:
df.describe(percentiles=[0.3, 0.5, 0.7])  # summary statistics with specific percentiles 

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
count,169.0,169.0,169.0,164.0
mean,63.846154,107.461538,134.047337,375.790244
std,42.299949,14.510259,16.450434,266.379919
min,15.0,80.0,100.0,50.3
30%,45.0,100.0,125.4,270.4
50%,60.0,105.0,131.0,318.6
70%,60.0,110.0,138.6,374.0
max,300.0,159.0,184.0,1860.4


In [20]:
df.describe().T  # transpose of rows and columns 

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
Duration,169.0,63.846154,42.299949,15.0,45.0,60.0,60.0,300.0
Pulse,169.0,107.461538,14.510259,80.0,100.0,105.0,111.0,159.0
Maxpulse,169.0,134.047337,16.450434,100.0,124.0,131.0,141.0,184.0
Calories,164.0,375.790244,266.379919,50.3,250.925,318.6,387.6,1860.4


In [21]:
df.shape # gives the number of rows and columns 

(169, 4)

In [22]:
df.shape[0]  # number of rows only

169

In [23]:
df.shape[1]  # number of columns only

4

In [24]:
# gives name of column only
df.columns 

Index(['Duration', 'Pulse', 'Maxpulse', 'Calories'], dtype='object')

In [25]:
#can be converted into list using list() function  
list(df.columns)

['Duration', 'Pulse', 'Maxpulse', 'Calories']

<b>Note:</b>  isnull() method checks if there is null value

In [26]:
#check null value
df.isnull()

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,False,False,False,False
1,False,False,False,False
2,False,False,False,False
3,False,False,False,False
4,False,False,False,False
...,...,...,...,...
164,False,False,False,False
165,False,False,False,False
166,False,False,False,False
167,False,False,False,False


In [27]:
# use sum() to count the null values in each column
df.isnull().sum()

Duration    0
Pulse       0
Maxpulse    0
Calories    5
dtype: int64

In [28]:
# double sum provides total null values
df.isnull().sum().sum()

5

<h3 style="color:blue;"> Slicing and Extracting Data in pandas </h3>

<h4 style="color:blue;"> Working in columns </h4>

In [5]:
df.head()

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110.0,130,409.1
1,60,117.0,145,479.0
2,60,103.0,135,340.0
3,45,109.0,175,282.4
4,45,117.0,148,406.0


In [7]:
#select a single column
df['Maxpulse']

0      130
1      145
2      135
3      175
4      148
      ... 
164    140
165    145
166    145
167    150
168    150
Name: Maxpulse, Length: 169, dtype: int64

In [8]:
# select two or more columns
df[['Duration', 'Pulse']]

Unnamed: 0,Duration,Pulse
0,60,110.0
1,60,117.0
2,60,103.0
3,45,109.0
4,45,117.0
...,...,...
164,60,105.0
165,60,110.0
166,60,115.0
167,75,120.0


<h4 style="color:blue;"> Working in rows </h4>

In [10]:
#select a single row
df[df.index==56]

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
56,60,118.0,121,413.0


<b>NOTE:</b> two or more rows can be obtained by .isin() method instead of a == operator

In [22]:
# select two or more rows with a boolean mask built by isin()
df[df.index.isin(range(5,10))]

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
5,60,102.0,45,300.0
6,60,110.0,136,374.0
7,45,104.0,134,253.3
8,30,109.0,133,195.1
9,60,98.0,124,269.0


<h4 style="color:blue;"> Using .loc[] and .iloc[] to fetch rows </h4>


In [23]:
df.loc[1]

Duration     60.0
Pulse       117.0
Maxpulse    145.0
Calories    479.0
Name: 1, dtype: float64

<b>REMEMBER:</b> .loc[] returns a pandas Series when given a single label, and a DataFrame when given a slice or a list of labels
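
What `.loc[]` returns depends on how the rows are requested. The sketch below uses a small made-up table to compare a single label with a one-element list of labels.

```python
import pandas as pd

# Hypothetical data for illustration
small = pd.DataFrame({"Duration": [60, 45, 30], "Pulse": [110, 117, 103]})

one_row = small.loc[1]       # single label
as_frame = small.loc[[1]]    # list of labels, even with one element

print(type(one_row))    # <class 'pandas.core.series.Series'>
print(type(as_frame))   # <class 'pandas.core.frame.DataFrame'>
```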

In [24]:
df.loc[10:20]

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
10,60,103.0,147,329.3
11,60,100.0,120,250.7
12,60,,128,345.3
13,60,104.0,132,379.3
14,60,104.0,132,379.3
15,60,104.0,132,379.3
16,60,100.0,120,300.0
17,45,90.0,112,
18,60,103.0,123,323.0
19,45,97.0,125,243.0


In [12]:
df.iloc[10]

Duration     60.0
Pulse       103.0
Maxpulse    147.0
Calories    329.3
Name: 10, dtype: float64

In [66]:
df.iloc[10:30]

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
10,60,103.0,147,329.3
11,60,100.0,120,250.7
12,60,,128,345.3
13,60,104.0,132,379.3
14,45,104.0,132,379.3
15,60,104.0,132,379.3
16,60,100.0,120,300.0
17,45,90.0,112,
18,60,103.0,123,323.0
19,45,97.0,125,243.0
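
One difference between the two accessors is easy to miss: a `.loc[]` slice selects by label and includes the end label, while an `.iloc[]` slice selects by position and excludes the end position. A minimal sketch on made-up data:

```python
import pandas as pd

# Hypothetical data with the default index 0..9
numbers = pd.DataFrame({"value": range(100, 110)})

by_label = numbers.loc[2:5]       # labels 2, 3, 4 and 5
by_position = numbers.iloc[2:5]   # positions 2, 3 and 4

print(len(by_label))      # 4
print(len(by_position))   # 3
```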


In [13]:
df.loc[[10, 20, 30, 45]]

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
10,60,103.0,147,329.3
20,52,108.0,131,364.2
30,60,92.0,115,243.0
45,60,99.0,119,273.0


In [27]:
df.iloc[[10, 20, 30]]

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
10,60,103.0,147,329.3
20,52,108.0,131,364.2
30,60,92.0,115,243.0


In [19]:
df.loc[100:110, ['Duration', 'Maxpulse']]

Unnamed: 0,Duration,Maxpulse
100,20,112
101,90,110
102,90,100
103,90,100
104,30,108
105,30,128
106,180,120
107,30,120
108,90,120
109,210,184


In [18]:
# df1.to_json("modified.json")

In [21]:
df.iloc[5:10, :2]

Unnamed: 0,Duration,Pulse
5,60,102.0
6,60,110.0
7,45,104.0
8,30,109.0
9,60,98.0


In [2]:
df.head(10)

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110.0,130,409.1
1,60,117.0,145,479.0
2,60,103.0,135,340.0
3,45,109.0,175,282.4
4,45,117.0,148,406.0
5,60,102.0,45,300.0
6,60,110.0,136,374.0
7,45,104.0,134,253.3
8,30,109.0,133,195.1
9,60,98.0,124,269.0


In [5]:
df.loc[df['Duration'] == 60, 'Duration'] = 12  # set Duration to 12 wherever it equals 60

In [6]:
df

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,12,110.0,130,409.1
1,12,117.0,145,479.0
2,12,103.0,135,340.0
3,45,109.0,175,282.4
4,45,117.0,148,406.0
...,...,...,...,...
164,12,105.0,140,290.8
165,12,110.0,145,300.0
166,12,115.0,145,310.2
167,75,120.0,150,320.4


In [78]:
df.iloc[5:10, :4]

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
5,60,102.0,148,300.0
6,80,110.0,136,374.0
7,45,104.0,134,253.3
8,30,109.0,133,195.1
9,60,98.0,124,269.0


<h4 style="color:blue;"> Conditional slicing </h4>

pandas lets you filter data by conditions over row/column values
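
Several conditions can be combined in one filter. The sketch below assumes a small made-up table; note that each condition needs its own parentheses, and that `&` (and) and `|` (or) are used rather than the Python keywords `and` and `or`.

```python
import pandas as pd

# Hypothetical data for illustration
log = pd.DataFrame({"Duration": [60, 45, 90, 30], "Pulse": [110, 117, 103, 98]})

# rows where both conditions hold
both = log[(log["Duration"] >= 45) & (log["Pulse"] > 105)]

# rows where at least one condition holds
either = log[(log["Duration"] > 80) | (log["Pulse"] < 100)]

print(both)
print(either)
```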


In [7]:
df[df.Duration == 12]

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,12,110.0,130,409.1
1,12,117.0,145,479.0
2,12,103.0,135,340.0
5,12,102.0,45,300.0
6,12,110.0,136,374.0
...,...,...,...,...
157,12,100.0,120,270.4
158,12,114.0,150,382.8
164,12,105.0,140,290.8
165,12,110.0,145,300.0


In [83]:
# displays 'Pulse', 'Maxpulse', 'Calories' for rows whose 'Duration' is greater than 75
df.loc[df['Duration'] > 75, ['Pulse', 'Maxpulse', 'Calories']]

Unnamed: 0,Pulse,Maxpulse,Calories
6,110.0,136,374.0
51,123.0,146,643.1
60,108.0,160,1376.0
61,110.0,137,1034.4
62,109.0,135,853.0
65,90.0,130,800.4
66,105.0,135,873.4
67,107.0,130,816.0
69,108.0,143,1500.2
70,97.0,129,1115.0


In [8]:
df1 = df.copy()

In [9]:
df1.head()

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,12,110.0,130,409.1
1,12,117.0,145,479.0
2,12,103.0,135,340.0
3,45,109.0,175,282.4
4,45,117.0,148,406.0


In [10]:
#rename columns 

df1.rename(columns = {'Calories':'Cal'}, inplace = True)
df1.head()

Unnamed: 0,Duration,Pulse,Maxpulse,Cal
0,12,110.0,130,409.1
1,12,117.0,145,479.0
2,12,103.0,135,340.0
3,45,109.0,175,282.4
4,45,117.0,148,406.0


In [11]:
df1.columns = ['Duration', 'Pulse', 'Max', 'Calories']
df1.head()

Unnamed: 0,Duration,Pulse,Max,Calories
0,12,110.0,130,409.1
1,12,117.0,145,479.0
2,12,103.0,135,340.0
3,45,109.0,175,282.4
4,45,117.0,148,406.0


<h3 style="color:blue;"> Cleaning Data of Wrong Format </h3> <br>

* Data in the wrong format makes analysis difficult or even impossible
* There are two options for fixing data in the wrong format: <br>

  1. Remove the rows
  2. Convert the values into a consistent format
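
One possible way to carry out the second option is sketched below for a column where numbers were stored as text. The data is invented for the example; `pd.to_numeric` with `errors="coerce"` turns anything it cannot parse into NaN, which can then be removed or replaced like any other empty cell.

```python
import pandas as pd

# Hypothetical data: numbers stored as text, plus one unusable entry
messy = pd.DataFrame({"Duration": ["60", "45", "sixty"]})

# convert to numbers; values that cannot be parsed become NaN
messy["Duration"] = pd.to_numeric(messy["Duration"], errors="coerce")

print(messy)
print(messy["Duration"].isna().sum())   # 1
```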

<h3 style="color:blue;"> Removing Rows </h3>

In [29]:
import pandas as pd

df = pd.read_csv("data/fitness.csv")
df.head(6)

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110,130,409.1
1,60,117,145,479.0
2,60,103,135,340.0
3,45,109,175,282.4
4,45,117,148,406.0
5,60,102,127,300.0


In [30]:
df.dropna(subset=['Maxpulse'], inplace = True)

In [31]:
df.head(8)

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110,130,409.1
1,60,117,145,479.0
2,60,103,135,340.0
3,45,109,175,282.4
4,45,117,148,406.0
5,60,102,127,300.0
6,60,110,136,374.0
7,45,104,134,253.3


<h3 style="color:blue;"> Convert Into a Correct Format </h3>

In [32]:
df["Maxpulse"] = df["Maxpulse"].fillna(130)

In [33]:
df.head(20)

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110,130,409.1
1,60,117,145,479.0
2,60,103,135,340.0
3,45,109,175,282.4
4,45,117,148,406.0
5,60,102,127,300.0
6,60,110,136,374.0
7,45,104,134,253.3
8,30,109,133,195.1
9,60,98,124,269.0


<h3 style="color:blue;"> Fixing Wrong Data </h3>

Wrong data can be fixed in two ways: <br>
 1. Replacing values
 2. Removing rows
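
Both approaches can also be written without an explicit loop. The sketch below is one possible vectorized version on made-up data; the limit of 120 is an assumption chosen for the illustration.

```python
import pandas as pd

# Hypothetical data: two durations are implausibly large
sessions = pd.DataFrame({"Duration": [60, 450, 45, 300]})

# 1. replacing values: cap everything above 120 at 120
capped = sessions.copy()
capped["Duration"] = capped["Duration"].clip(upper=120)

# 2. removing rows: keep only rows within the limit
kept = sessions[sessions["Duration"] <= 120]

print(capped)
print(kept)
```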

<h4 style="color:blue;"> Replacing values </h4>

In [34]:
df.loc[6, 'Duration'] = 45

In [35]:
df.head(8)

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110,130,409.1
1,60,117,145,479.0
2,60,103,135,340.0
3,45,109,175,282.4
4,45,117,148,406.0
5,60,102,127,300.0
6,45,110,136,374.0
7,45,104,134,253.3


In [36]:
df.iloc[6,0]=29
df.head(8)

Unnamed: 0,Duration,Pulse,Maxpulse,Calories
0,60,110,130,409.1
1,60,117,145,479.0
2,60,103,135,340.0
3,45,109,175,282.4
4,45,117,148,406.0
5,60,102,127,300.0
6,29,110,136,374.0
7,45,104,134,253.3


In [6]:
if 'Duration' in df.columns:   
    for x in df.index:
        if df.loc[x, "Duration"] > 120:
            df.loc[x, "Duration"] = 120

In [8]:
df.head()

Unnamed: 0,Duration
0,120
1,110
2,120
3,100
4,120


<h4 style="color:blue;"> Removing Rows </h4>

In [3]:
import pandas as pd

df = pd.read_csv('data/fitness.csv')

for x in df.index:
  if df.loc[x, "Duration"] > 120:
    df.drop(x, inplace = True)

#remember to include the 'inplace = True' argument to make the changes in the original DataFrame object instead of returning a copy

print(df.to_string())

     Duration  Pulse  Maxpulse  Calories
0          60  110.0       130     409.1
1          60  117.0       145     479.0
2          60  103.0       135     340.0
3          45  109.0       175     282.4
4          45  117.0       148     406.0
5          60  102.0       148     300.0
7          45  104.0       134     253.3
8          30  109.0       133     195.1
9          60   98.0       124     269.0
10         60  103.0       147     329.3
11         60  100.0       120     250.7
12         60    NaN       128     345.3
13         60  104.0       132     379.3
14         45  104.0       132     379.3
15         60  104.0       132     379.3
16         60  100.0       120     300.0
17         45   90.0       112       NaN
18         60  103.0       123     323.0
19         45   97.0       125     243.0
20         45  108.0       131     364.2
21         45  100.0       119     282.0
22         60  130.0       101     300.0
23         45  105.0       132     246.0
24         60  1