# Describing data with summary statistics
---

This worksheet is tied to the pandas Getting Started tutorials. It picks out particular tutorials and links them into a single theme.

We will focus on describing data.  This carries the least risk of bias and inaccurate conclusions, because it deals only with the data presented to us.

Each exercise will ask you to work through one tutorial on the Getting Started page, to try the code from the tutorial here and to try a second, similar action.

---

The practice data from the tutorials comes from a dataset on Titanic passengers.


### Exercise 1 - open the Titanic dataset
---

The Titanic dataset is stored at this URL:
https://raw.githubusercontent.com/pandas-dev/pandas/master/doc/data/titanic.csv

Read the dataset into a pandas dataframe that you will call **titanic**.

**Test output**:  
The shape of the dataframe will be (891, 12)

In [1]:
import pandas as pd

titanic = pd.read_csv("https://raw.githubusercontent.com/pandas-dev/pandas/master/doc/data/titanic.csv")

print(titanic)


     PassengerId  Survived  Pclass  \
0              1         0       3   
1              2         1       1   
2              3         1       3   
3              4         1       1   
4              5         0       3   
..           ...       ...     ...   
886          887         0       2   
887          888         1       1   
888          889         0       3   
889          890         1       1   
890          891         0       3   

                                                  Name     Sex   Age  SibSp  \
0                              Braund, Mr. Owen Harris    male  22.0      1   
1    Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   
2                               Heikkinen, Miss. Laina  female  26.0      0   
3         Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   
4                             Allen, Mr. William Henry    male  35.0      0   
..                                                 ...     ...   ... 
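The test output for this exercise refers to the shape of the dataframe, which is easier to check with `.shape` than by printing the whole table. The following is a minimal sketch of that check. It assumes a tiny made-up CSV held in memory (the `csv_text` string and the name `sample` are illustrative only) instead of the real Titanic URL, so that it runs without a network connection.

```python
import io
import pandas as pd

# Hypothetical three-row sample standing in for the real Titanic CSV
csv_text = """PassengerId,Survived,Pclass,Sex,Age
1,0,3,male,22
2,1,1,female,38
3,1,3,female,
"""

# read_csv accepts a file-like object as well as a URL or path
sample = pd.read_csv(io.StringIO(csv_text))

# shape is a (rows, columns) tuple
print(sample.shape)

# head() shows the first rows without printing the whole table
print(sample.head(2))
```

With the real dataset, the same `.shape` check on `titanic` is what the test output above refers to.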

### Exercise 2 - get summary information about the dataframe
---

Read through the tutorials:  
[What kind of data does pandas handle?](https://pandas.pydata.org/docs/getting_started/intro_tutorials/01_table_oriented.html#)  
[How do I read and write tabular data?](https://pandas.pydata.org/docs/getting_started/intro_tutorials/02_read_write.html)

Use pandas functions to display the following:
1.  A technical summary of the data (info())
2.  A description of the numerical data (describe())
3.  Display the Series 'Age'

**Test output**:   
1.  The info should show that there are only 204 values in the Cabin series, out of 891 records.  
2.  The description should show 7 columns and a mean age of 29.699118
3.  The Age series should have values of type float64 and Length 891

In [5]:
titanic.info()
titanic.describe()
titanic['Age']

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  891 non-null    int64  
 1   Survived     891 non-null    int64  
 2   Pclass       891 non-null    int64  
 3   Name         891 non-null    object 
 4   Sex          891 non-null    object 
 5   Age          714 non-null    float64
 6   SibSp        891 non-null    int64  
 7   Parch        891 non-null    int64  
 8   Ticket       891 non-null    object 
 9   Fare         891 non-null    float64
 10  Cabin        204 non-null    object 
 11  Embarked     889 non-null    object 
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB


0      22.0
1      38.0
2      26.0
3      35.0
4      35.0
       ... 
886    27.0
887    19.0
888     NaN
889    26.0
890    32.0
Name: Age, Length: 891, dtype: float64
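In the cell above only the `info()` report and the `Age` series appear, because a notebook cell displays just the value of its last expression and `describe()` returns a table rather than printing one. A minimal sketch of showing all three results, assuming a small made-up dataframe (`df` and its values are illustrative only):

```python
import pandas as pd

# Made-up data for illustration only; None becomes NaN
df = pd.DataFrame({
    "Age": [22.0, 38.0, None, 35.0],
    "Fare": [7.25, 71.28, 7.92, 53.10],
})

# info() prints its report itself
df.info()

# describe() returns a DataFrame, so print it explicitly
summary = df.describe()
print(summary)

# Selecting one column gives a Series
print(df["Age"])
```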

### Exercise 3 - aggregating statistics
---

Read through the tutorial:  
[How to calculate summary statistics?](https://pandas.pydata.org/docs/getting_started/intro_tutorials/06_calculate_statistics.html#)  

Use pandas functions to display the following summary statistics from the titanic dataset:  

1.  The average (mean) age of passengers  
2.  The median age and fare  
3.  The mean fare
4.  The modal fare and gender

**Test output**:   
29.699118, Age 28.0000 Fare 14.4542, 32.2042079685746, Fare 8.05 Sex male 


In [4]:
print(titanic['Age'].mean())
print(titanic[['Age','Fare']].median())
print(titanic['Fare'].mean())
print(titanic[['Fare','Sex']].mode())

29.69911764705882
Age     28.0000
Fare    14.4542
dtype: float64
32.2042079685746
   Fare   Sex
0  8.05  male
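Two details of these statistics are worth seeing on a small scale: `mean()` and `median()` skip missing values by default, and `mode()` returns a collection rather than a single number because several values can tie. The sketch below uses made-up values (`ages` and `fares` are illustrative only).

```python
import pandas as pd

# Made-up ages with one missing value
ages = pd.Series([20.0, 30.0, None, 40.0])

# The missing value is ignored by default
print(ages.mean())

# With skipna=False the result is NaN as soon as any value is missing
print(ages.mean(skipna=False))

# Made-up fares where two values are equally common
fares = pd.Series([8.05, 8.05, 13.0, 13.0, 26.0])

# mode() returns every tied value, sorted
modes = fares.mode()
print(modes)

# Take the first entry when a single value is wanted
print(modes.iloc[0])
```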


### Exercise 4 - displaying other statistics
---

Take a look at the list of methods available for giving summary statistics [here](https://pandas.pydata.org/docs/user_guide/basics.html#basics-stats) 

Use pandas functions, and your existing knowledge, to display the following summary statistics from the titanic dataset:

1.  The total number of passengers on the Titanic
2.  The age of the youngest passenger
3.  The most expensive ticket price
4.  The range of ticket prices
5.  The number of passengers with cabins
6.  The code for the port where the highest number of passengers embarked
7.  The most populous gender
8.  The standard deviation for age and fare

**Test output**:  
891, 0.42, 512.3292, 512.3292, 204, S, male, Age 14.526497 Fare 49.693429

In [90]:
titanic.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  891 non-null    int64  
 1   Survived     891 non-null    int64  
 2   Pclass       891 non-null    int64  
 3   Name         891 non-null    object 
 4   Sex          891 non-null    object 
 5   Age          714 non-null    float64
 6   SibSp        891 non-null    int64  
 7   Parch        891 non-null    int64  
 8   Ticket       891 non-null    object 
 9   Fare         891 non-null    float64
 10  Cabin        204 non-null    object 
 11  Embarked     889 non-null    object 
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB


In [33]:
print(titanic['PassengerId'].count())
print(titanic['Age'].min())
print(titanic['Fare'].max())
print((titanic['Fare'].max()) - (titanic['Fare'].min()))
print(titanic['Cabin'].count())
print(titanic['Embarked'].mode())
print(titanic['Sex'].mode())
print(titanic[['Age','Fare']].std())


891
0.42
512.3292
512.3292
204
0    S
dtype: object
0    male
dtype: object
Age     14.526497
Fare    49.693429
dtype: float64
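The answers above rely on `count()` ignoring missing values, which is why the `Cabin` count is lower than the number of rows. A minimal sketch of the difference, plus one alternative way to find the most common category, assuming a made-up dataframe (`df` and its values are illustrative only):

```python
import pandas as pd

# Made-up rows; None marks a missing cabin
df = pd.DataFrame({
    "Cabin": ["C85", None, "E46", None],
    "Embarked": ["S", "C", "S", "Q"],
    "Fare": [7.25, 71.28, 0.0, 53.10],
})

# len() counts every row, including rows with missing values
n_rows = len(df)

# count() counts only the non-missing values in the column
n_cabins = df["Cabin"].count()

# Range is the largest value minus the smallest
fare_range = df["Fare"].max() - df["Fare"].min()

# value_counts() tallies each category; idxmax() returns the most frequent label
top_port = df["Embarked"].value_counts().idxmax()

print(n_rows, n_cabins, fare_range, top_port)
```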


### Exercise 5 - aggregating statistics grouped by category
---

Refer again to the tutorial  
[How to calculate summary statistics?](https://pandas.pydata.org/docs/getting_started/intro_tutorials/06_calculate_statistics.html#)   
looking particularly at the section on Aggregating statistics grouped by category.

1.  What is the mean age for male versus female Titanic passengers?
2.  What is the mean ticket fare price for each of the sex and cabin class combinations?
3.  What is the mean ticket fare price for passengers who embarked at each port?
4.  Which passenger class had the highest number of survivors (for now, just show the statistics - it may not be meaningful yet)?

**Test output**:  
1.  female 27.915709 male 30.726645
2.  
```
female  1         106.125798
        2          21.970121
        3          16.118810
male    1          67.226127
        2          19.741782
        3          12.661633
```
3.  
```
C    59.954144
Q    13.276030
S    27.079812
```
4. 

```
Survived  Pclass
0         1          80
          2          97
          3         372
1         1         136
          2          87
          3         119
```






In [69]:
print(titanic[["Sex", "Age"]].groupby("Sex").mean())
print(titanic[['Fare','Sex','Pclass']].groupby(['Sex','Pclass']).mean())
print(titanic[['Fare','Embarked']].groupby('Embarked').mean())
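# Count passengers in each Survived/Pclass combination (max() would only report the largest class number)
print(titanic.groupby(['Survived','Pclass'])['PassengerId'].count())

              Age
Sex              
female  27.915709
male    30.726645
                     Fare
Sex    Pclass            
female 1       106.125798
       2        21.970121
       3        16.118810
male   1        67.226127
       2        19.741782
       3        12.661633
               Fare
Embarked           
C         59.954144
Q         13.276030
S         27.079812
Survived  Pclass
0         1          80
          2          97
          3         372
1         1         136
          2          87
          3         119
Name: PassengerId, dtype: int64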
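Question 4 asks which class had the most survivors, which means counting rows in each `Survived` and `Pclass` combination and then looking only at the survivors. One possible way to do this is sketched below on made-up data (`df` and its values are illustrative only); `size()` is used here as an alternative to counting a specific column.

```python
import pandas as pd

# Made-up passengers: 1 = survived, 0 = did not survive
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 1],
    "Pclass":   [3, 1, 1, 3, 2, 2, 1],
})

# size() counts the rows in each Survived/Pclass combination
counts = df.groupby(["Survived", "Pclass"]).size()
print(counts)

# Keep only the survivors (Survived == 1); the result is indexed by Pclass
survivors_by_class = counts.loc[1]

# idxmax() returns the class label with the highest count
best_class = survivors_by_class.idxmax()
print(best_class)
```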


### Exercise 6 - an aggregation of different statistics
---

Use the function titanic.agg() as shown in the tutorial  
[How to calculate summary statistics?](https://pandas.pydata.org/docs/getting_started/intro_tutorials/06_calculate_statistics.html#)  

1.  Display

```
     {
         "Age": ["min", "max", "median", "skew"],
         "Fare": ["min", "max", "median", "mean"]
     }
```
2.  Display:  
min, max and mean for Age  
min, max and standard deviation for Fare  
count for Cabin

**Test output**:   
1.  	
```
              Age        Fare
max     80.000000  512.329200
mean          NaN   32.204208
median  28.000000   14.454200
min      0.420000    0.000000
skew     0.389108         NaN
```

2.   
```
             Age        Fare  Cabin
count        NaN         NaN  204.0
max    80.000000  512.329200    NaN
mean   29.699118         NaN    NaN
min     0.420000    0.000000    NaN
std          NaN   49.693429    NaN
```




In [72]:
print(titanic.agg({"Age": ["min", "max", "median", "skew"],"Fare": ["min", "max", "median", "mean"],}))
print(titanic.agg({'Age':['min','max','mean'],'Fare':['min','max','std'],'Cabin':['count']}))

              Age        Fare
max     80.000000  512.329200
mean          NaN   32.204208
median  28.000000   14.454200
min      0.420000    0.000000
skew     0.389108         NaN
             Age        Fare  Cabin
count        NaN         NaN  204.0
max    80.000000  512.329200    NaN
mean   29.699118         NaN    NaN
min     0.420000    0.000000    NaN
std          NaN   49.693429    NaN
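The `NaN` entries in these tables are not missing data. They appear wherever a statistic was not requested for that column, because `agg()` builds one table whose rows are the union of all requested statistics. A minimal sketch on made-up numbers (`df` and its values are illustrative only):

```python
import pandas as pd

# Made-up values chosen so the results are easy to check by hand
df = pd.DataFrame({
    "Age": [1.0, 2.0, 6.0],
    "Fare": [10.0, 20.0, 60.0],
})

# "mean" is requested only for Age and "std" only for Fare
result = df.agg({"Age": ["min", "mean"], "Fare": ["min", "std"]})
print(result)

# A requested statistic holds a number; one that was not requested is NaN
print(result.loc["mean", "Age"])
print(result.loc["mean", "Fare"])
```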


### Exercise 7 - count by category
---

Read the section Count number of records by category in the tutorial  
[How to calculate summary statistics?](https://pandas.pydata.org/docs/getting_started/intro_tutorials/06_calculate_statistics.html#)

1. Display the number of passengers of each gender who had a ticket
2. Display the number of passengers who embarked at each port and had a ticket
3. Calculate the percentage of PassengerIds who survived the sinking of the Titanic (*Hint:  try getting the PassengerIds with a count for survived or not.  Store this value in a new variable, which will contain a list/array.  The second item in this list will be the number who survived.  You can use this number and the count of PassengerIds to calculate the percentage*)

**Test output**:  
1.  female 314, male 577
2.  C 168, Q 77, S 644
3.  38.38383838383838



In [81]:
print(titanic[["Sex","Ticket"]].groupby('Sex').count())
print(titanic[['Embarked','Ticket']].groupby('Embarked').count())
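survived_counts = titanic[['PassengerId','Survived']].groupby('Survived').count()
print(survived_counts)
# Row label 1 holds the number of survivors; divide by the total number of PassengerIds
print(survived_counts.loc[1, 'PassengerId'] / titanic['PassengerId'].count() * 100)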


        Ticket
Sex           
female     314
male       577
          Ticket
Embarked        
C            168
Q             77
S            644
          PassengerId
Survived             
0                 549
1                 342
38.38383838383838
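The percentage in question 3 can be reached in more than one way. The sketch below, on a made-up `Survived` column (illustrative values only), compares the count-based route described in the hint with the shortcut of taking the mean of a 0/1 column.

```python
import pandas as pd

# Made-up outcomes: 1 = survived, 0 = did not survive
survived = pd.Series([0, 1, 1, 0, 0], name="Survived")

# Route 1: count each outcome, then divide the survivors by the total
counts = survived.value_counts()
pct_from_counts = counts.loc[1] / len(survived) * 100

# Route 2: the mean of a 0/1 column is the fraction of ones
pct_from_mean = survived.mean() * 100

print(pct_from_counts, pct_from_mean)
```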


### Exercise 8 - summary happiness statistics
---

Open the data set here: https://github.com/futureCodersSE/working-with-data/blob/main/Happiness-Data/2019.xlsx?raw=true

It contains data on people's perception of happiness levels in a number of countries across the world.

1.  Display the number of records in the set  
2.  Display the description of the numerical data  
3.  Display the highest GDP and life expectancy  
4.  Display the mean, max and min for Freedom,  mean, max, min and skew for Generosity and mean, min, max and std for GDP  

**Test output**:  
1.  156
2.  Table showing count, mean, std, min, 25%, 50%, 75%, max for 8 columns
3.  GDP 0.905147, life expectancy 0.725244  
4.  


```
      Freedom to make life choices  Generosity  GDP per capita
max                       0.631000    0.566000        1.684000
mean                      0.392571    0.184846        0.905147
min                       0.000000    0.000000        0.000000
skew                           NaN    0.745942             NaN
std                            NaN         NaN        0.398389
```




In [95]:
url = 'https://github.com/futureCodersSE/working-with-data/blob/main/Happiness-Data/2019.xlsx?raw=true'

happy = pd.read_excel(url)
print(happy.index)
print(happy.describe())
print(happy[['GDP per capita','Healthy life expectancy']].max())
print(happy.agg({'Freedom to make life choices':['mean','max','min'],'Generosity':['mean','max','min','skew'],'GDP per capita':['mean','min','max','std']}))

RangeIndex(start=0, stop=156, step=1)
       Overall rank       Score  GDP per capita  Social support  \
count    156.000000  156.000000      156.000000      156.000000   
mean      78.500000    5.407096        0.905147        1.208814   
std       45.177428    1.113120        0.398389        0.299191   
min        1.000000    2.853000        0.000000        0.000000   
25%       39.750000    4.544500        0.602750        1.055750   
50%       78.500000    5.379500        0.960000        1.271500   
75%      117.250000    6.184500        1.232500        1.452500   
max      156.000000    7.769000        1.684000        1.624000   

       Healthy life expectancy  Freedom to make life choices  Generosity  \
count               156.000000                    156.000000  156.000000   
mean                  0.725244                      0.392571    0.184846   
std                   0.242124                      0.143289    0.095254   
min                   0.000000                      0.

### Exercise 9 - migration data
---

Open the dataset at this URL: https://github.com/futureCodersSE/working-with-data/blob/main/Data%20sets/public_use-talent-migration.xlsx?raw=true and read the sheet named *Country Migration*.

1.  Describe the dataset  
2.  Show summary information
3.  Display the mean net per 10K migration in each of the years 2015 to 2019
4.  Display the mean, max and min migration for the year 2019 for each of the regions (*base_country_wb_region*)
5.  Display the median net migration for the years 2015 and 2019 for the base countries by income level
6.  Display the number of target countries in each income level
7.  Display the mean net migration for all five years, for each income level

**Test output**:  
1  count, mean, std, min, 25%, 50%, 75%, max for 9 columns

2  shows 16 columns with non-null count of 4148 in each column

3  
```
net_per_10K_2015    0.461757
net_per_10K_2016    0.150248
net_per_10K_2017   -0.080272
net_per_10K_2018   -0.040591
net_per_10K_2019   -0.022743
dtype: float64
```

4  
```
                           net_per_10K_2019
                                 mean    max     min
base_country_wb_region
East Asia & Pacific          0.198827  21.57   -9.88
Europe & Central Asia        0.208974  87.71  -21.34
Latin America & Caribbean   -0.904602  21.15  -31.75
Middle East & North Africa  -0.107655  55.60  -50.33
North America                0.239246  23.20   -0.29
South Asia                  -0.514577  13.72  -24.89
Sub-Saharan Africa          -0.279729  37.11  -21.54
```

5  
```
                        net_per_10K_2015  net_per_10K_2019
base_country_wb_income
High Income                         0.02              0.04
Low Income                          0.42             -0.05
Lower Middle Income                -0.02             -0.07
Upper Middle Income                -0.03             -0.08
```

6  
```
base_country_wb_income
High Income            2415
Low Income              185
Lower Middle Income     653
Upper Middle Income     895
Name: target_country_name, dtype: int64 
```

7  
```
                        net_per_10K_2015  net_per_10K_2016  net_per_10K_2017  net_per_10K_2018  net_per_10K_2019
base_country_wb_income
High Income                     0.505482          0.391379          0.314178          0.379201          0.401470
Low Income                      1.876432          0.798270         -0.684865         -0.677784         -0.681459
Lower Middle Income             0.591654         -0.029893         -0.519433         -0.527136         -0.476616
Upper Middle Income            -0.043419         -0.502916         -0.699240         -0.686626         -0.700101
```






In [97]:
url = 'https://github.com/futureCodersSE/working-with-data/blob/main/Data%20sets/public_use-talent-migration.xlsx?raw=true'
migos = pd.read_excel(url,sheet_name='Country Migration')


In [105]:
print(migos.describe())
print(migos.info())
print(migos[['net_per_10K_2015','net_per_10K_2016','net_per_10K_2017','net_per_10K_2018','net_per_10K_2019']].mean())
print(migos.agg({'net_per_10K_2019':['mean','max','min']}))

          base_lat    base_long   target_lat  target_long  net_per_10K_2015  \
count  4148.000000  4148.000000  4148.000000  4148.000000       4148.000000   
mean     28.418022    21.698305    28.418022    21.698305          0.461757   
std      25.086012    61.937381    25.086012    61.937381          5.006530   
min     -40.900557  -106.346771   -40.900557  -106.346771        -37.010000   
25%      14.058324    -3.435973    14.058324    -3.435973         -0.150000   
50%      35.861660    19.145136    35.861660    19.145136          0.000000   
75%      47.516231    53.688046    47.516231    53.688046          0.240000   
max      64.963051   179.414413    64.963051   179.414413        150.680000   

       net_per_10K_2016  net_per_10K_2017  net_per_10K_2018  net_per_10K_2019  
count       4148.000000       4148.000000       4148.000000       4148.000000  
mean           0.150248         -0.080272         -0.040591         -0.022743  
std            4.201118          3.203092       

AttributeError: 'dict' object has no attribute 'groupby'
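The `AttributeError` above suggests that `groupby` was called on a Python dict (for example the dict passed to `agg`) rather than on the dataframe. Grouping has to happen on the dataframe first, and the aggregation is applied to the grouped result. The sketch below shows one possible shape for questions 4 to 6. It uses made-up rows that only borrow the column names from the test output (`migration` and all values are illustrative, not the real sheet).

```python
import pandas as pd

# Made-up rows that reuse column names from the Country Migration sheet
migration = pd.DataFrame({
    "base_country_wb_region": ["South Asia", "South Asia", "North America", "North America"],
    "base_country_wb_income": ["Low Income", "Low Income", "High Income", "High Income"],
    "target_country_name": ["A", "B", "C", "D"],
    "net_per_10K_2015": [1.0, 3.0, -2.0, 4.0],
    "net_per_10K_2019": [-1.0, 5.0, 0.5, 1.5],
})

# Question 4 pattern: group the dataframe first, then aggregate one column
by_region = (
    migration.groupby("base_country_wb_region")["net_per_10K_2019"]
    .agg(["mean", "max", "min"])
)
print(by_region)

# Question 5 pattern: median of two year columns for each income level
median_by_income = (
    migration.groupby("base_country_wb_income")[["net_per_10K_2015", "net_per_10K_2019"]]
    .median()
)
print(median_by_income)

# Question 6 pattern: number of target countries in each income level
targets_by_income = (
    migration.groupby("base_country_wb_income")["target_country_name"].count()
)
print(targets_by_income)
```

Question 7 follows the same pattern as question 5, with all five year columns selected and `mean()` in place of `median()`.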

### Exercise 10 - calculating range over a grouped series
---

Open the dataset at this URL: https://github.com/futureCodersSE/working-with-data/blob/main/Data%20sets/public_use-talent-migration.xlsx?raw=true and read the sheet named *Skill Migration*.

1.  Display the max for each skill group category of net migration for the year 2017
2.  Assign the max for each skill group category of net migration for the year 2017 to a variable called **max_skill_migration** and print `max_skill_migration`
3.  Create a second variable called **min_skill_migration** and assign to it the min for each skill group category of net migration for the year 2017, print `min_skill_migration`

4.  You now have two series, `max_skill_migration` and `min_skill_migration`, each of which is a pandas Series indexed by skill group category (the values are held in a NumPy array).  You can perform calculations on these two series in the same way as you would on individual data items; pandas matches the values up by index label.

So, you can calculate the range for each skill category by subtracting `min_skill_migration` from `max_skill_migration` to get a new series **skill_migration_range**:

`skill_migration_range = max_skill_migration - min_skill_migration`

Try it out.

5.  Now calculate the range for the year 2019
6.  Now calculate the range for countries grouped by base country income level for the year 2015

**Test output**:  
1 and 2  
```
skill_group_category
Business Skills                1048.20
Disruptive Tech Skills         1478.56
Soft Skills                    1572.35
Specialized Industry Skills    1906.14
Tech Skills                    1336.78
Name: net_per_10K_2017, dtype: float64
```

3    
```
skill_group_category
Business Skills               -3471.35
Disruptive Tech Skills        -2646.19
Soft Skills                   -2542.23
Specialized Industry Skills   -6604.67
Tech Skills                   -6060.98
Name: net_per_10K_2017, dtype: float64
```

4  
```
skill_group_category
Business Skills                4519.55
Disruptive Tech Skills         4124.75
Soft Skills                    4114.58
Specialized Industry Skills    8510.81
Tech Skills                    7397.76
Name: net_per_10K_2017, dtype: float64
```

5  
```
skill_group_category
Business Skills                4543.96
Disruptive Tech Skills         3651.81
Soft Skills                    5528.47
Specialized Industry Skills    4036.44
Tech Skills                    3424.45
Name: net_per_10K_2019, dtype: float64
```

6  
```
wb_income
High income            4246.50
Low income             4556.42
Lower middle income    2148.36
Upper middle income    4045.43
Name: net_per_10K_2015, dtype: float64
```





In [116]:
url = 'https://github.com/futureCodersSE/working-with-data/blob/main/Data%20sets/public_use-talent-migration.xlsx?raw=true'

skill_migration = pd.read_excel(url, sheet_name='Skill Migration')
# Show the column names, non-null counts and dtypes
skill_migration.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 17617 entries, 0 to 17616
Data columns (total 12 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   country_code          17617 non-null  object 
 1   country_name          17617 non-null  object 
 2   wb_income             17617 non-null  object 
 3   wb_region             17617 non-null  object 
 4   skill_group_id        17617 non-null  int64  
 5   skill_group_category  17617 non-null  object 
 6   skill_group_name      17617 non-null  object 
 7   net_per_10K_2015      17617 non-null  float64
 8   net_per_10K_2016      17617 non-null  float64
 9   net_per_10K_2017      17617 non-null  float64
 10  net_per_10K_2018      17617 non-null  float64
 11  net_per_10K_2019      17617 non-null  float64
dtypes: float64(5), int64(1), object(6)
memory usage: 1.6+ MB


In [121]:
skill_migration

Unnamed: 0,country_code,country_name,wb_income,wb_region,skill_group_id,skill_group_category,skill_group_name,net_per_10K_2015,net_per_10K_2016,net_per_10K_2017,net_per_10K_2018,net_per_10K_2019
0,af,Afghanistan,Low income,South Asia,2549,Tech Skills,Information Management,-791.59,-705.88,-550.04,-680.92,-1208.79
1,af,Afghanistan,Low income,South Asia,2608,Business Skills,Operational Efficiency,-1610.25,-933.55,-776.06,-532.22,-790.09
2,af,Afghanistan,Low income,South Asia,3806,Specialized Industry Skills,National Security,-1731.45,-769.68,-756.59,-600.44,-767.64
3,af,Afghanistan,Low income,South Asia,50321,Tech Skills,Software Testing,-957.50,-828.54,-964.73,-406.50,-739.51
4,af,Afghanistan,Low income,South Asia,1606,Specialized Industry Skills,Navy,-1510.71,-841.17,-842.32,-581.71,-718.64
...,...,...,...,...,...,...,...,...,...,...,...,...
17612,zw,Zimbabwe,Low income,Sub-Saharan Africa,12666,Specialized Industry Skills,Teaching,71.18,30.68,-18.85,-68.89,-93.70
17613,zw,Zimbabwe,Low income,Sub-Saharan Africa,1235,Specialized Industry Skills,Mining,8.97,-112.85,-35.87,-65.38,-93.46
17614,zw,Zimbabwe,Low income,Sub-Saharan Africa,43756,Specialized Industry Skills,Personal Coaching,-53.45,-59.70,-88.01,-55.90,-82.23
17615,zw,Zimbabwe,Low income,Sub-Saharan Africa,1724,Specialized Industry Skills,Public Health,15.25,-65.53,-57.22,-39.39,-32.14


In [124]:
skill_migration[['skill_group_category','net_per_10K_2017']].groupby('skill_group_category').max()

                             net_per_10K_2017
skill_group_category
Business Skills                       1048.20
Disruptive Tech Skills                1478.56
Soft Skills                           1572.35
Specialized Industry Skills           1906.14
Tech Skills                           1336.78


In [125]:
max_skill_migration = skill_migration[['skill_group_category','net_per_10K_2017']].groupby('skill_group_category').max()
max_skill_migration

                             net_per_10K_2017
skill_group_category
Business Skills                       1048.20
Disruptive Tech Skills                1478.56
Soft Skills                           1572.35
Specialized Industry Skills           1906.14
Tech Skills                           1336.78


In [126]:
min_skill_migration = skill_migration[['skill_group_category','net_per_10K_2017']].groupby('skill_group_category').min()
min_skill_migration

                             net_per_10K_2017
skill_group_category
Business Skills                      -3471.35
Disruptive Tech Skills               -2646.19
Soft Skills                          -2542.23
Specialized Industry Skills          -6604.67
Tech Skills                          -6060.98


In [127]:
skill_migration_range = max_skill_migration - min_skill_migration
skill_migration_range

                             net_per_10K_2017
skill_group_category
Business Skills                       4519.55
Disruptive Tech Skills                4124.75
Soft Skills                           4114.58
Specialized Industry Skills           8510.81
Tech Skills                           7397.76


In [128]:
max_skill_migration = skill_migration[['skill_group_category','net_per_10K_2019']].groupby('skill_group_category').max()
min_skill_migration = skill_migration[['skill_group_category','net_per_10K_2019']].groupby('skill_group_category').min()
skill_migration_range = max_skill_migration - min_skill_migration
skill_migration_range

                             net_per_10K_2019
skill_group_category
Business Skills                       4543.96
Disruptive Tech Skills                3651.81
Soft Skills                           5528.47
Specialized Industry Skills           4036.44
Tech Skills                           3424.45


In [135]:
max_skill_migration = skill_migration[['net_per_10K_2015','wb_income']].groupby('wb_income').max()
min_skill_migration = skill_migration[['net_per_10K_2015','wb_income']].groupby('wb_income').min()
skill_migration_range = max_skill_migration - min_skill_migration
skill_migration_range

                     net_per_10K_2015
wb_income
High income                   4246.50
Low income                    4556.42
Lower middle income           2148.36
Upper middle income           4045.43
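The range calculation in this exercise works because subtracting one grouped result from another lines the values up by their index labels. A minimal sketch on made-up data (`df` and its values are illustrative only), which also shows an assumed alternative that computes the range in a single `agg()` call:

```python
import pandas as pd

# Made-up rows using two of the skill group categories
df = pd.DataFrame({
    "skill_group_category": ["Tech Skills", "Tech Skills", "Soft Skills", "Soft Skills"],
    "net_per_10K_2017": [10.0, -30.0, 5.0, -1.0],
})

# Selecting one column after groupby gives a Series per statistic
grouped = df.groupby("skill_group_category")["net_per_10K_2017"]
max_by_cat = grouped.max()
min_by_cat = grouped.min()

# Subtraction pairs values that share the same category label
range_by_cat = max_by_cat - min_by_cat
print(range_by_cat)

# Alternative: compute max minus min inside each group in one step
same_range = grouped.agg(lambda s: s.max() - s.min())
print(same_range)
```

Selecting the column after `groupby` returns a Series, which matches the `Name: ..., dtype: float64` form shown in the test output; selecting columns with double brackets before `groupby`, as in the cells above, returns a one-column dataframe with the same numbers.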


# Reflection
----

## What skills have you demonstrated in completing this notebook?

Your answer: 
- various ways of summarizing statistics
- using aggregation of summary stats
- calculating the range between different values in arrays

## What caused you the most difficulty?

Your answer: Getting the order of columns and the right number of parentheses, brackets and commas was challenging.  I also wasn't fully able to crack number 9.