# Python for Data Science Practice Session 1: Economics and Finance

# Employees Performance Analysis

The [Productivity Prediction of Garment Employees Data Set](https://archive.ics.uci.edu/ml/datasets/Productivity+Prediction+of+Garment+Employees) includes important attributes of the garment manufacturing process and the productivity of the employees which had been collected manually and also been validated by the industry experts.

In this notebook, we will assume that this dataset includes data from just one company. The company's management is interested in extracting some specific information about their employee's performance. The tasks we are going to work on are:
1. **Performance check** - get specified columns of a sample from the dataset
2. **Performance ranking** - filter and sort the dataset following specified rules
3. **Teams Ranking** - create a scoreboard for each team
4. **Lottery** - take a random sample out of rows that satisfy given conditions

If you are struggling with anything, check the **Tips** section at the end of the notebook. At the end of some tasks, you can find a number in parentheses that references the Tips section.

Let's get started!

The first step is to import the libraries that we will be using - in our case, `pandas`. Import it as `pd`, so that we could refer to it easier in the future.

In [2]:
import pandas as pd

Now we need to import the dataset. The dataset is available [here](https://archive.ics.uci.edu/ml/machine-learning-databases/00597/) (click on the `garments_worker_productivity.csv`, and it should download automatically). Save it as `all_data`. *(1)*

In [3]:
all_data = pd.read_csv("garments_worker_productivity.csv",sep=",")

Now, check if you have correctly imported and saved `all_data`.

In [4]:
all_data

Unnamed: 0,date,quarter,department,day,team,targeted_productivity,smv,wip,over_time,incentive,idle_time,idle_men,no_of_style_change,no_of_workers,actual_productivity
0,1/1/2015,Quarter1,sweing,Thursday,8,0.80,26.16,1108.0,7080,98,0.0,0,0,59.0,0.940725
1,1/1/2015,Quarter1,finishing,Thursday,1,0.75,3.94,,960,0,0.0,0,0,8.0,0.886500
2,1/1/2015,Quarter1,sweing,Thursday,11,0.80,11.41,968.0,3660,50,0.0,0,0,30.5,0.800570
3,1/1/2015,Quarter1,sweing,Thursday,12,0.80,11.41,968.0,3660,50,0.0,0,0,30.5,0.800570
4,1/1/2015,Quarter1,sweing,Thursday,6,0.80,25.90,1170.0,1920,50,0.0,0,0,56.0,0.800382
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1192,3/11/2015,Quarter2,finishing,Wednesday,10,0.75,2.90,,960,0,0.0,0,0,8.0,0.628333
1193,3/11/2015,Quarter2,finishing,Wednesday,8,0.70,3.90,,960,0,0.0,0,0,8.0,0.625625
1194,3/11/2015,Quarter2,finishing,Wednesday,7,0.65,3.90,,960,0,0.0,0,0,8.0,0.625625
1195,3/11/2015,Quarter2,finishing,Wednesday,9,0.75,2.90,,1800,0,0.0,0,0,15.0,0.505889


First, to get a grasp of a dataset, check the number of rows and columns.

In [5]:
all_data.shape

(1197, 15)

Any missing data could cause serious problems for the program. Inspect `all_data` to see the count of values in each column. Base on those pieces of information you can see if any values are missing.

In [6]:
all_data.count(0)

date                     1197
quarter                  1197
department               1197
day                      1197
team                     1197
targeted_productivity    1197
smv                      1197
wip                       691
over_time                1197
incentive                1197
idle_time                1197
idle_men                 1197
no_of_style_change       1197
no_of_workers            1197
actual_productivity      1197
dtype: int64

As the row `count` tells us, some values are missing in the `wip` column - we need to keep that in mind.

## Performance check

Let's begin this part of the notebook by showing the sample of four rows of our dataset.

In [7]:
all_data.sample(4)

Unnamed: 0,date,quarter,department,day,team,targeted_productivity,smv,wip,over_time,incentive,idle_time,idle_men,no_of_style_change,no_of_workers,actual_productivity
1074,3/5/2015,Quarter1,finishing,Thursday,3,0.8,4.6,,1800,0,0.0,0,0,15.0,0.810111
25,1/3/2015,Quarter1,sweing,Saturday,3,0.8,28.08,913.0,6540,50,0.0,0,0,54.5,0.800323
108,1/7/2015,Quarter1,sweing,Wednesday,8,0.8,25.9,1135.0,10170,60,0.0,0,0,56.5,0.850137
479,1/28/2015,Quarter4,finishing,Wednesday,6,0.35,2.9,,1800,0,0.0,0,0,15.0,0.977556


As you can see, the sample includes a lot of different columns. To make working on the data easier, you can limit the displayed data using `loc[]`.

(Take for example `date`,`department`,`team`,`no_of_workers`,`targeted_productivity`,`actual_productivity`.)

In [8]:
all_data.loc[:,['date','department','team','no_of_workers','targeted_productivity','actual_productivity']].sample(4)

Unnamed: 0,date,department,team,no_of_workers,targeted_productivity,actual_productivity
965,2/28/2015,finishing,5,18.0,0.8,0.921605
685,2/9/2015,finishing,8,15.0,0.35,0.707111
674,2/9/2015,finishing,2,18.0,0.8,1.057963
290,1/17/2015,sweing,4,56.5,0.7,0.700542


In case you find something concerning in the sample, you might want to save it for further analysis. Let's save it and call the variable `random_check`.

In [9]:
#random_check =
random_check=all_data.loc[:,['date','department','team','no_of_workers','targeted_productivity','actual_productivity']].sample(4)

You can check whether you have saved the data correctly by showing your variable.

In [10]:
random_check

Unnamed: 0,date,department,team,no_of_workers,targeted_productivity,actual_productivity
266,1/15/2015,sweing,7,56.5,0.8,0.850137
959,2/26/2015,finishing,10,8.0,0.7,0.410833
52,1/4/2015,finishing,2,8.0,0.8,0.792104
736,2/12/2015,sweing,5,59.0,0.75,0.750799


If you think that some values in your sample are concerning (too high or too low), you can take the mean to check whether your observations should concern you. Let's say that the number of workers in your sample seems too low. Check if that's the case by taking the mean.

In [11]:
all_data["no_of_workers"].mean()


34.60985797827903

# Performance Ranking

Let's begin once again by showing our dataset.

In [12]:
all_data

Unnamed: 0,date,quarter,department,day,team,targeted_productivity,smv,wip,over_time,incentive,idle_time,idle_men,no_of_style_change,no_of_workers,actual_productivity
0,1/1/2015,Quarter1,sweing,Thursday,8,0.80,26.16,1108.0,7080,98,0.0,0,0,59.0,0.940725
1,1/1/2015,Quarter1,finishing,Thursday,1,0.75,3.94,,960,0,0.0,0,0,8.0,0.886500
2,1/1/2015,Quarter1,sweing,Thursday,11,0.80,11.41,968.0,3660,50,0.0,0,0,30.5,0.800570
3,1/1/2015,Quarter1,sweing,Thursday,12,0.80,11.41,968.0,3660,50,0.0,0,0,30.5,0.800570
4,1/1/2015,Quarter1,sweing,Thursday,6,0.80,25.90,1170.0,1920,50,0.0,0,0,56.0,0.800382
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1192,3/11/2015,Quarter2,finishing,Wednesday,10,0.75,2.90,,960,0,0.0,0,0,8.0,0.628333
1193,3/11/2015,Quarter2,finishing,Wednesday,8,0.70,3.90,,960,0,0.0,0,0,8.0,0.625625
1194,3/11/2015,Quarter2,finishing,Wednesday,7,0.65,3.90,,960,0,0.0,0,0,8.0,0.625625
1195,3/11/2015,Quarter2,finishing,Wednesday,9,0.75,2.90,,1800,0,0.0,0,0,15.0,0.505889


Sometimes you are interested only in some particular rows. To focus only on the data that you are interested in, you can use `loc[]`. (Show only the rows where `actual_productivity` is lower than `targeted_productivity`)

In [13]:
all_data.loc[all_data['actual_productivity']<all_data['targeted_productivity']]

Unnamed: 0,date,quarter,department,day,team,targeted_productivity,smv,wip,over_time,incentive,idle_time,idle_men,no_of_style_change,no_of_workers,actual_productivity
11,1/1/2015,Quarter1,sweing,Thursday,10,0.75,19.31,578.0,6480,45,0.0,0,0,54.0,0.712205
12,1/1/2015,Quarter1,sweing,Thursday,5,0.80,11.41,668.0,3660,50,0.0,0,0,30.5,0.707046
14,1/1/2015,Quarter1,finishing,Thursday,8,0.75,2.90,,960,0,0.0,0,0,8.0,0.676667
15,1/1/2015,Quarter1,finishing,Thursday,4,0.75,3.94,,2160,0,0.0,0,0,18.0,0.593056
16,1/1/2015,Quarter1,finishing,Thursday,7,0.80,2.90,,960,0,0.0,0,0,8.0,0.540729
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1192,3/11/2015,Quarter2,finishing,Wednesday,10,0.75,2.90,,960,0,0.0,0,0,8.0,0.628333
1193,3/11/2015,Quarter2,finishing,Wednesday,8,0.70,3.90,,960,0,0.0,0,0,8.0,0.625625
1194,3/11/2015,Quarter2,finishing,Wednesday,7,0.65,3.90,,960,0,0.0,0,0,8.0,0.625625
1195,3/11/2015,Quarter2,finishing,Wednesday,9,0.75,2.90,,1800,0,0.0,0,0,15.0,0.505889


We will be working on those data later. To access them easier, save them as `below_target`.

In [14]:
below_target=all_data.loc[all_data['actual_productivity']<all_data['targeted_productivity']]

Check if you have correctly saved the data filtered data.

In [15]:
below_target

Unnamed: 0,date,quarter,department,day,team,targeted_productivity,smv,wip,over_time,incentive,idle_time,idle_men,no_of_style_change,no_of_workers,actual_productivity
11,1/1/2015,Quarter1,sweing,Thursday,10,0.75,19.31,578.0,6480,45,0.0,0,0,54.0,0.712205
12,1/1/2015,Quarter1,sweing,Thursday,5,0.80,11.41,668.0,3660,50,0.0,0,0,30.5,0.707046
14,1/1/2015,Quarter1,finishing,Thursday,8,0.75,2.90,,960,0,0.0,0,0,8.0,0.676667
15,1/1/2015,Quarter1,finishing,Thursday,4,0.75,3.94,,2160,0,0.0,0,0,18.0,0.593056
16,1/1/2015,Quarter1,finishing,Thursday,7,0.80,2.90,,960,0,0.0,0,0,8.0,0.540729
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1192,3/11/2015,Quarter2,finishing,Wednesday,10,0.75,2.90,,960,0,0.0,0,0,8.0,0.628333
1193,3/11/2015,Quarter2,finishing,Wednesday,8,0.70,3.90,,960,0,0.0,0,0,8.0,0.625625
1194,3/11/2015,Quarter2,finishing,Wednesday,7,0.65,3.90,,960,0,0.0,0,0,8.0,0.625625
1195,3/11/2015,Quarter2,finishing,Wednesday,9,0.75,2.90,,1800,0,0.0,0,0,15.0,0.505889


If you want to filter the data once again, you can do it using `loc[]` (keep only rows with positive `wip`). *(2)*

In [16]:
below_target=below_target.loc[all_data['wip']>0]

Once you got your data filtered, you can sort it by your chosen value. You can also combine it with `.head()` or `.tail()` to see only a specified number of 'best' or 'worst' rows. (sort data by `wip` and show 25 best rows) *(3)*

In [21]:
below_target.sort_values('wip').tail(25)


Unnamed: 0,date,quarter,department,day,team,targeted_productivity,smv,wip,over_time,incentive,idle_time,idle_men,no_of_style_change,no_of_workers,actual_productivity
558,2/1/2015,Quarter1,sweing,Sunday,8,0.6,24.26,1196.0,6600,0,0.0,0,0,55.0,0.466821
56,1/4/2015,Quarter1,sweing,Sunday,10,0.7,28.08,1202.0,6900,40,0.0,0,0,57.5,0.699965
192,1/11/2015,Quarter2,sweing,Sunday,5,0.6,20.79,1208.0,7980,0,0.0,0,0,57.0,0.45298
1096,3/7/2015,Quarter1,sweing,Saturday,9,0.75,18.79,1226.0,3480,45,0.0,0,0,51.0,0.749987
476,1/27/2015,Quarter4,sweing,Tuesday,11,0.5,48.18,1282.0,6480,23,0.0,0,0,54.0,0.370467
235,1/13/2015,Quarter2,sweing,Tuesday,5,0.7,20.79,1297.0,10440,0,0.0,0,0,58.0,0.52681
1038,3/3/2015,Quarter1,sweing,Tuesday,10,0.7,21.82,1313.0,5040,30,0.0,0,0,51.0,0.699984
781,2/15/2015,Quarter3,sweing,Sunday,3,0.6,30.1,1361.0,6960,30,0.0,0,1,58.0,0.475718
594,2/3/2015,Quarter1,sweing,Tuesday,8,0.7,24.26,1363.0,6660,0,0.0,0,0,55.5,0.586465
316,1/18/2015,Quarter3,sweing,Sunday,1,0.8,49.1,1381.0,10350,24,0.0,0,0,57.5,0.403242


## Teams Ranking

Now, based on the commands that you have done before, take `team`,`targeted_productivity`,`actual_productivity`,`over_time`,`wip` columns, and save it as `teams`. Make a copy, not a reference.

In [22]:
teams =all_data.loc[:,["team","targeted_productivity","actual_productivity","over_time","wip"]].copy()

As we remember, our dataset has some missing values in `wip` column. To prevent any errors, fill them with zeros. *(4)*

In [23]:
teams['wip']=teams['wip'].fillna(0)

Now use the command that you have used at the beginning to check if you have correctly filled missing values.

In [24]:
teams.count(0)

team                     1197
targeted_productivity    1197
actual_productivity      1197
over_time                1197
wip                      1197
dtype: int64

If you want to combine values by chosen category (for example mean value of each column for each team), you can use `groupby`. Save the result of grouping as a new dataframe named `teams_performance`. *(5)*

In [25]:
teams_performance= teams.groupby("team").mean()

Let's check if you correctly grouped data. Show `teams_performance`.

In [26]:
teams_performance

Unnamed: 0_level_0,targeted_productivity,actual_productivity,over_time,wip
team,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
1,0.746667,0.821054,4793.428571,858.238095
2,0.739908,0.770855,4384.954128,693.559633
3,0.742105,0.80388,5375.684211,860.410526
4,0.717619,0.770035,5449.714286,684.780952
5,0.673656,0.697981,5330.967742,482.548387
6,0.731383,0.685385,3369.095745,587.840426
7,0.714271,0.668006,4857.1875,572.635417
8,0.708257,0.674148,4312.293578,505.733945
9,0.758173,0.734462,4519.038462,715.923077
10,0.7385,0.719736,4736.7,871.15


Because actual productivity is a poor reflection of productivity, create a new column called `relative_productivity`, which is equal to `actual_productivity` divided by `targeted_productivity`.

In [50]:
teams_performance['relative_productivity'] = teams_performance['actual_productivity']/teams_performance['targeted_productivity']

Unnamed: 0_level_0,targeted_productivity,actual_productivity,over_time,wip,relative_productivity
team,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
1,0.746667,0.821054,4793.428571,858.238095,1.099626
2,0.739908,0.770855,4384.954128,693.559633,1.041825
3,0.742105,0.80388,5375.684211,860.410526,1.083242
4,0.717619,0.770035,5449.714286,684.780952,1.073041
5,0.673656,0.697981,5330.967742,482.548387,1.036109
6,0.731383,0.685385,3369.095745,587.840426,0.937109
7,0.714271,0.668006,4857.1875,572.635417,0.935227
8,0.708257,0.674148,4312.293578,505.733945,0.951841
9,0.758173,0.734462,4519.038462,715.923077,0.968726
10,0.7385,0.719736,4736.7,871.15,0.974592


To get a grasp of the relative productivity distribution, show minimum and maximum of relative productivity. *(6)*

In [46]:
print("max:", teams_performance['relative_productivity'].max())
print("min:", teams_performance['relative_productivity'].min())


max: 1.0996264050892857
min: 0.9352271918039957


Now, to create a simple scoring system, follow the commands listed below:

Create `rel_min` and `rel_max` which are respectively minimum and maximum values of `relative_productivity` column. *(7)*

In [29]:
rel_min= teams_performance['relative_productivity'].min()

In [30]:
rel_max =teams_performance['relative_productivity'].max()

Do the same for `over_time` and `wip` columns (name them `time_min`,`time_max`, `wip_min` and `wip_max`)

In [31]:
time_min = teams_performance['over_time'].min()

In [32]:
time_max = teams_performance['over_time'].max()

In [33]:
wip_min =teams_performance['wip'].min()

In [34]:
wip_max =teams_performance['wip'].max()

Create a new empty dataframe called `score_board` with the same indexes as `teams_performance`. *(8)*

In [35]:
score_board = pd.DataFrame(index=range(1,12,1))

Create columns `productivity_points`, `overtime_points` and `wip_penalty` using given formula:

$$ points = \frac{(value - min\_value) \cdot max\_points}{max\_value - min\_value} $$

* min_value - minimum value of given column
* max_value - maximum value of given column
* max_points - points for a maximum score (100 for `productivity_points`, 50 for `overtime_points` and (-30) for `wip_penalty`

In [36]:
score_board['performance_points'] =((teams_performance['relative_productivity']-rel_min)*100)/(rel_max-rel_min)
score_board['performance_points']

1     100.000000
2      64.840998
3      90.033960
4      83.828891
5      61.363775
6       1.144378
7       0.000000
8      10.105916
9      20.376454
10     23.944652
11     20.396791
Name: performance_points, dtype: float64

In [40]:

score_board['overtime_points'] = ((teams_performance['over_time']-time_min)*50)/(time_max-time_min)
score_board['overtime_points']

1     34.607132
2     25.026558
3     48.263660
4     50.000000
5     47.214856
6      1.200085
7     36.102567
8     23.322340
9     28.171443
10    33.276590
11    24.030817
Name: overtime_points, dtype: float64

In [41]:
score_board['wip_penalty'] = ((teams_performance['wip']-wip_min)*-30)/(wip_max-wip_min)
score_board['wip_penalty']

1    -29.003202
2    -16.290044
3    -29.170914
4    -15.612331
5     -0.000000
6     -8.128533
7     -6.954708
8     -1.789922
9    -18.016499
10   -30.000000
11   -12.857064
Name: wip_penalty, dtype: float64

Now round all the numbers in `score_board` to intiger. *(9)*

In [48]:
score_board = score_board.round(0)
score_board

Unnamed: 0_level_0,targeted_productivity,actual_productivity,over_time,wip,relative_productivity
team,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
1,0.746667,0.821054,4793.428571,858.238095,1.099626
2,0.739908,0.770855,4384.954128,693.559633,1.041825
3,0.742105,0.80388,5375.684211,860.410526,1.083242
4,0.717619,0.770035,5449.714286,684.780952,1.073041
5,0.673656,0.697981,5330.967742,482.548387,1.036109
6,0.731383,0.685385,3369.095745,587.840426,0.937109
7,0.714271,0.668006,4857.1875,572.635417,0.935227
8,0.708257,0.674148,4312.293578,505.733945,0.951841
9,0.758173,0.734462,4519.038462,715.923077,0.968726
10,0.7385,0.719736,4736.7,871.15,0.974592


Add up all scores in a new column called `total_score`.

In [44]:
score_board['total_score']= score_board['performance_points']+score_board['overtime_points']+score_board['wip_penalty']

Show the `score_board` to check if everything is alright.

In [45]:
score_board

Unnamed: 0,performance_points,overtime_points,wip_penalty,total_score
1,100.0,35.0,-29.0,106.0
2,65.0,25.0,-16.0,74.0
3,90.0,48.0,-29.0,109.0
4,84.0,50.0,-16.0,118.0
5,61.0,47.0,-0.0,108.0
6,1.0,1.0,-8.0,-6.0
7,0.0,36.0,-7.0,29.0
8,10.0,23.0,-2.0,31.0
9,20.0,28.0,-18.0,30.0
10,24.0,33.0,-30.0,27.0


Once you have finished your scoring system, you can show the 3 worst teams ranked by `total_score`.

In [52]:
score_board.sort_values('total_score').head(3)


Unnamed: 0,performance_points,overtime_points,wip_penalty,total_score
6,1.0,1.0,-8.0,-6.0
10,24.0,33.0,-30.0,27.0
7,0.0,36.0,-7.0,29.0


You can also easily show the 3 best teams.

In [53]:
score_board.sort_values('total_score').tail(3)


Unnamed: 0,performance_points,overtime_points,wip_penalty,total_score
5,61.0,47.0,-0.0,108.0
3,90.0,48.0,-29.0,109.0
4,84.0,50.0,-16.0,118.0


## Lottery

Let's say that the company is running the lottery for the rows with a productivity of more than 0.95. Filter the rows following this rule.

In [57]:
all_data.loc[all_data['actual_productivity']>0.95]

Unnamed: 0,date,quarter,department,day,team,targeted_productivity,smv,wip,over_time,incentive,idle_time,idle_men,no_of_style_change,no_of_workers,actual_productivity
19,1/3/2015,Quarter1,finishing,Saturday,4,0.80,4.15,,6600,0,0.0,0,0,20.0,0.988025
20,1/3/2015,Quarter1,finishing,Saturday,11,0.75,2.90,,5640,0,0.0,0,0,17.0,0.987880
21,1/3/2015,Quarter1,finishing,Saturday,9,0.80,4.15,,960,0,0.0,0,0,8.0,0.956271
40,1/4/2015,Quarter1,finishing,Sunday,3,0.75,4.15,,1560,0,0.0,0,0,8.0,0.991389
61,1/5/2015,Quarter1,finishing,Monday,1,0.80,3.94,,1920,0,0.0,0,0,8.0,0.961059
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1025,3/3/2015,Quarter1,finishing,Tuesday,7,0.80,4.60,,4200,0,0.0,0,0,10.0,0.999533
1068,3/5/2015,Quarter1,finishing,Thursday,8,0.80,4.60,,2640,0,0.0,0,0,22.0,0.980985
1069,3/5/2015,Quarter1,finishing,Thursday,2,0.60,3.90,,960,0,0.0,0,0,8.0,0.950625
1106,3/8/2015,Quarter2,finishing,Sunday,3,0.80,4.60,,1440,0,0.0,0,0,12.0,0.951944


Unfortunately, the lottery is only for 3 rows, so you need to take them out of the dataset. Also, the lottery organisers are interested only in columns: `date`, `team` and `actual_productivity`.

In [66]:
all_data.loc[all_data['actual_productivity']>0.95,['date','team','actual_productivity']].sample(3)

Unnamed: 0,date,team,actual_productivity
280,1/17/2015,10,0.974621
529,1/31/2015,8,0.971867
520,1/31/2015,2,0.971867


You also need to save the results so that you can easily send them over to lottery organisers (save them as `lottery_results.csv`).

In [67]:
all_data.loc[all_data['actual_productivity']>0.95,['date','team','actual_productivity']].sample(3).to_csv('lottery_results.csv', index=False)

# Tips:
* (1)   Make sure that our dataset is **in the same folder** as the notebook. The data are sepereated by **comma**. -> [help](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html)
* (2) Use the format: *dataframe = dataframe.loc[condition]*
* (3) -> [help](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html)
* (4) Use the format: *dataframe['column_name'] = dataframe['column_name'].fillna()*. -> [help](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.fillna.html)
* (5) Use the format: *new_dataframe = dataframe.groupby().mean()*. -> [help](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html)
* (6) Use `.describe()`
* (7) Combine `.describe()` with `loc[]`
* (8) -> [help](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html)
* (9) -> [help](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.round.html)