# Part II - (FIFA19 Data Explanation)
## by (Ayeni Trust)

>**Before you start**: You must have the README.md file ready, including a summary of the main findings that reflects on the steps taken during the data exploration (Part I notebook). The README.md file should also describe the key insights that will be conveyed by the explanatory slide deck (Part II outcome).



## Investigation Overview


> I primarily investigated the characteristics each playing position requires and the factors that affect a player's ability to become a good footballer. The main features I worked with are:
- age 
- nationality 
- overall 
- potential
- club 
- value 
- wage 
- international_reputation 
- work_rate 
- body_type



## Dataset Overview


In [None]:
# import all packages and set plots to be embedded inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb

%matplotlib inline

# suppress warnings from final output
import warnings
warnings.simplefilter("ignore")

In [None]:
# load in the dataset into a pandas dataframe
df = pd.read_csv('kl.csv',  encoding = 'Windows-1252')
df.shape

> Note that the above cells have been set as "Skip"-type slides. That means
that when the notebook is rendered as HTML slides, those cells won't show up.

### Dataset Structure
This dataset contains detailed information on every player in the most recent version of the FIFA 19 database. There are 18,207 observations (players) and 89 features in all. The player attributes include Age, Nationality, Overall, Potential, Club, Value, Wage, Preferred Foot, International Reputation, Weak Foot, Skill Moves, Work Rate, Position, Jersey Number, Joined, Loaned From, Contract Valid Until, Height, Weight, and various skill ratings.

> Observation
- I started with some basic descriptive statistics and found that `75%` of the players in the sample are under the age of `29`.
- The oldest player is `45` years old, while the youngest is only `16` years old.
- The most valuable player is valued at `€118,500,000`, while the least valuable is valued at `€10,000`.
- The average weekly wage is `€9,583`, while the highest-paid player earns `€565,000`.
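
Figures like these can be read directly off `DataFrame.describe()`. The sketch below is a minimal illustration on a small made-up sample, not on the real `kl.csv`; the column names `age`, `value`, and `wage` are assumed to match the cleaned dataset.

```python
import pandas as pd

# Hypothetical mini-sample standing in for the cleaned FIFA 19 data
sample = pd.DataFrame({
    "age":   [16, 21, 24, 27, 45],
    "value": [10_000, 500_000, 2_000_000, 118_500_000, 60_000],
    "wage":  [1_000, 3_000, 9_000, 565_000, 2_000],
})

# describe() reports count, mean, std, min, quartiles and max per column
summary = sample[["age", "value", "wage"]].describe()

age_q3 = summary.loc["75%", "age"]      # 75% of players are at or below this age
youngest = summary.loc["min", "age"]
oldest = summary.loc["max", "age"]
top_value = summary.loc["max", "value"]
mean_wage = summary.loc["mean", "wage"]

print(summary)
```

On the real data, the same rows of `summary` (`75%`, `min`, `max`, `mean`) would give the numbers quoted in the observations.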

# Univariate Explanation
## (Visualization 1)


> Countries with the highest number of players

In [None]:
# ten most common nationalities in the dataset
nation = df.nationality.value_counts().head(10)

# plot
g_bar = nation.plot.bar(color = 'blue', fontsize = 15)

# figure size (width, height)
g_bar.figure.set_size_inches(10, 8);

# add labels
plt.title('Countries with the most footballers', color = 'white', fontsize = 15)
plt.xlabel('Country', color = 'white', fontsize = 15)
plt.ylabel('Number of players', color = 'white', fontsize = 15);

![](./1.png)

- England has the most footballers in the data. Japan ranks `9th` in the list.

## (Visualization 2)

> Most played position

In [None]:
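# count players per position so the bars can be ordered by frequency
position_count = df['position'].value_counts()

# plt.subplots returns a (figure, axes) pair
fig, ax = plt.subplots(1, 1, figsize=(10, 8))

base_color = sb.color_palette()[0]
sb.countplot(data = df, y = 'position', color = base_color, order = position_count.index)
plt.ylabel('Position', color = 'white', fontsize = 15)
plt.xlabel('Frequency', color = 'white', fontsize = 15);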

![](./2.png)

- The most played position is `striker (ST)`.

## (Visualization 3)

> Most common preferred (stronger) foot


In [None]:
df['preferred_foot'].value_counts().plot.bar(title="Right/Left foot")
plt.ylabel('Frequency', color = 'white', fontsize = '17')
plt.xlabel('Foot', color = 'white', fontsize = '17');

![](./3.png)

- Most players prefer their `right` foot as their stronger foot, while fewer than 4,000 players prefer their `left` foot.
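
The bar chart shows absolute counts; the same split can also be expressed as shares. A minimal sketch on a made-up sample (the real column is assumed to be named `preferred_foot`, as in the plotting cell above):

```python
import pandas as pd

# Hypothetical sample of preferred-foot labels, not the real data
feet = pd.Series(
    ["Right", "Right", "Left", "Right", "Right", "Left", "Right", "Right"],
    name="preferred_foot",
)

counts = feet.value_counts()                # absolute frequencies
shares = feet.value_counts(normalize=True)  # fractions that sum to 1

print(counts)
print((shares * 100).round(1))              # shares as percentages
```

Applied to `df['preferred_foot']`, `shares` would state the right/left split as a proportion rather than a raw count.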

# Bivariate Explanation
## (Visualization 4)

> Correlation between value and wage


In [None]:
# scatterplot of value and wage
df.plot(x='value', y='wage', kind='scatter', title = 'Scatter plot for value and wage');

![](./4.png)

- Strong positive correlation between `value` and `wage`
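
The strength of this relationship can be quantified with a correlation coefficient. The sketch below uses made-up numbers rather than the real dataset and assumes `value` and `wage` are already numeric; it computes Pearson's coefficient (linear association) and a rank-based Spearman coefficient, which is less sensitive to the few extremely highly paid players.

```python
import pandas as pd

# Hypothetical values and wages in euros, not taken from the real data
sample = pd.DataFrame({
    "value": [10_000, 500_000, 2_000_000, 20_000_000, 118_500_000],
    "wage":  [1_000, 3_000, 9_000, 80_000, 565_000],
})

# Pearson correlation: strength of the linear relationship
pearson = sample["value"].corr(sample["wage"])

# Spearman correlation, computed as the Pearson correlation of the ranks
spearman = sample["value"].rank().corr(sample["wage"].rank())

print(f"Pearson: {pearson:.3f}, Spearman: {spearman:.3f}")
```

A coefficient close to 1 on the real data would back up the visual impression from the scatter plot.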

## (Visualization 5)

> Players and their body types: which body type do most players have?


In [None]:
# age histograms, one facet per body type
# bin edges from 10 up to (but not including) 55, in steps of 3 years
bins = np.arange(10, 55, 3)
g = sb.FacetGrid(data = df, col = 'body_type')
g.map(plt.hist, 'age', bins = bins);

![](./5.png)

- Most players have a `normal` body type, while few have a `stocky` body type.

## (Visualization 6)

> Countries with the highest total player potential


In [None]:
# Group by nationality, sum the potential ratings, and keep the top 5 countries.
p_nation = df.groupby('nationality')['potential'].sum().sort_values(ascending = False).head(5)

# plot
p_nation.plot.bar(title="Top countries with potential players")
plt.ylabel('Potential', color = 'black', fontsize = '15')
plt.xlabel('Country', color = 'black', fontsize = '15');

![](./6.png)

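- England has the highest total `potential` of any country. Because the plot sums `potential` over players, this largely reflects that England also has the most players in the data, rather than showing that England is the best place for players to reach their full potential.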
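
A sum of `potential` grows with the number of players a country has, so it mixes squad size with player quality. One way to separate the two is to compare the mean potential per country as well, restricted to countries with enough players. This is a sketch on made-up data; the `min_players` threshold is a hypothetical choice, not something used in the original analysis.

```python
import pandas as pd

# Hypothetical players, not taken from the real data
sample = pd.DataFrame({
    "nationality": ["England"] * 4 + ["Brazil"] * 2 + ["Japan"],
    "potential":   [70, 72, 68, 74, 85, 83, 90],
})

# Total, average, and number of players per country
by_nation = sample.groupby("nationality")["potential"].agg(["sum", "mean", "count"])

# Ignore countries with very few players so a single player cannot top the ranking
min_players = 2  # hypothetical threshold
ranked = (
    by_nation[by_nation["count"] >= min_players]
    .sort_values("mean", ascending=False)
)

print(by_nation)
print(ranked)
```

In this toy sample England leads on the sum simply because it has the most players, while Brazil leads on the mean.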

# Multivariate Explanation
## (Visualization 7)

> Facet plot on best foot for playing


In [None]:
# scatter of shot power vs. finishing, colored by preferred foot
f = sb.FacetGrid(data = df, hue = 'preferred_foot', height = 8, aspect = 1.4, palette = 'viridis_r')
f.map(plt.scatter, 'shot_power', 'finishing')
f.add_legend();

![](./7.png)

- The facet plot shows that most players are right-footed.

## (Visualization 8)

> Facet plot on body type of players


In [None]:
# scatter of shot power vs. finishing, colored by body type
b = sb.FacetGrid(data = df, hue = 'body_type', height = 8, aspect = 1.4, palette = 'viridis_r')
b.map(plt.scatter, 'shot_power', 'finishing')
b.add_legend();

![](./8.png)

### Generate Slideshow
Once you're ready to generate your slideshow, use the `jupyter nbconvert` command to generate the HTML slide show.  

In [1]:
# Use this command if you are running this file in local
!jupyter nbconvert Part_II_slide_deck_template.ipynb --to slides --post serve --no-input --no-prompt

[NbConvertApp] Converting notebook Part_II_slide_deck_template.ipynb to slides
[NbConvertApp] Writing 581736 bytes to Part_II_slide_deck_template.slides.html
[NbConvertApp] Redirecting reveal.js requests to https://cdnjs.cloudflare.com/ajax/libs/reveal.js/3.5.0
Traceback (most recent call last):
  File "C:\Users\trust\Anaconda3\Scripts\jupyter-nbconvert-script.py", line 10, in <module>
    sys.exit(main())
  File "C:\Users\trust\Anaconda3\lib\site-packages\jupyter_core\application.py", line 269, in launch_instance
    return super().launch_instance(argv=argv, **kwargs)
  File "C:\Users\trust\Anaconda3\lib\site-packages\traitlets\config\application.py", line 846, in launch_instance
    app.start()
  File "C:\Users\trust\Anaconda3\lib\site-packages\nbconvert\nbconvertapp.py", line 350, in start
    self.convert_notebooks()
  File "C:\Users\trust\Anaconda3\lib\site-packages\nbconvert\nbconvertapp.py", line 524, in convert_notebooks
    self.convert_single_notebook(notebook_filename)
  Fil

> In the classroom workspace, the generated HTML slideshow will be placed in the home folder. 

> On local machines, the command above should open a tab in your web browser where you can scroll through your presentation. Sub-slides can be accessed by pressing 'down' when viewing their parent slide. Make sure you remove all of the quote-formatted guide notes like this one before you finish your presentation! Finally, you can stop the kernel.

### Submission
If you are using classroom workspace, you can choose from the following two ways of submission:

1. **Submit from the workspace**. Make sure you have removed the example project from the /home/workspace directory. You must submit the following files:
   - Part_I_notebook.ipynb
   - Part_I_notebook.html or pdf
   - Part_II_notebook.ipynb
   - Part_I_slides.html
   - README.md
   - dataset (optional)


2. **Submit a zip file on the last page of this project lesson**. In this case, open the Jupyter terminal and run the command below to generate a ZIP file. 
```bash
zip -r my_project.zip .
```
The command above will zip every file in your /home/workspace directory. Next, you can download the ZIP file to your local machine and follow the instructions on the last page of this project lesson.
