# Pandas Data Cleaning

In this challenge you will preprocess a dataset for the video game FIFA19 (https://www.kaggle.com/karangadiya/fifa19). The dataset contains both in-game data and information about the players' real-life careers.

**1) Read the CSV file into a pandas dataframe**

The data you'll be working with is in a file located at `'../data/fifa.csv'`. Use your knowledge of pandas to create a new dataframe from the CSV data.

Check the contents of the first few rows of your dataframe, then show the size of the dataframe.

In [1]:
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings('ignore')

In [2]:
# code here to read in `fifa.csv` and assign a dataframe to the variable `df`
df = pd.read_csv('../data/fifa.csv')

In [3]:
# code here to see the size of the dataframe
df.shape

(20000, 89)

**2) Drop n/a rows for "Release Clause"**

Drop rows for which "Release Clause" is none or not given. A release clause is the part of a soccer player's contract that deals with being bought out by another team. After you have dropped the rows, see how many rows remain.
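As a minimal sketch of the idea, the snippet below drops rows with a missing value in one column. The `toy` dataframe and its player names are made up for illustration and are not part of the FIFA data; `DataFrame.dropna` with `subset` is one common way to do this, assuming missing values are stored as `NaN`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the FIFA dataframe.
toy = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "Release Clause": [1000.0, np.nan, 2500.0],
})

# Keep only the rows where "Release Clause" is present.
cleaned = toy.dropna(subset=["Release Clause"])

# Confirm how many rows remain and that no missing values are left.
print(cleaned.shape)
print(cleaned["Release Clause"].isna().sum())
```

Restricting `dropna` to a `subset` leaves rows that are missing values in other columns untouched, which matters when only one column is being cleaned.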

In [4]:
df.head()

Unnamed: 0.1,Unnamed: 0,ID,Name,Age,Photo,Nationality,Flag,Overall,Potential,Club,...,Composure,Marking,StandingTackle,SlidingTackle,GKDiving,GKHandling,GKKicking,GKPositioning,GKReflexes,Release Clause
0,0,158023,L. Messi,31,https://cdn.sofifa.org/players/4/19/158023.png,Argentina,https://cdn.sofifa.org/flags/52.png,94,94,FC Barcelona,...,96.0,33.0,28.0,26.0,6.0,11.0,15.0,14.0,8.0,226500.0
1,1,20801,Cristiano Ronaldo,33,https://cdn.sofifa.org/players/4/19/20801.png,Portugal,https://cdn.sofifa.org/flags/38.png,94,94,Juventus,...,95.0,28.0,31.0,23.0,7.0,11.0,15.0,14.0,11.0,127100.0
2,2,190871,Neymar Jr,26,https://cdn.sofifa.org/players/4/19/190871.png,Brazil,https://cdn.sofifa.org/flags/54.png,92,93,Paris Saint-Germain,...,94.0,27.0,24.0,33.0,9.0,9.0,15.0,15.0,11.0,228100.0
3,3,193080,De Gea,27,https://cdn.sofifa.org/players/4/19/193080.png,Spain,https://cdn.sofifa.org/flags/45.png,91,93,Manchester United,...,68.0,15.0,21.0,13.0,90.0,85.0,87.0,88.0,94.0,138600.0
4,4,192985,K. De Bruyne,27,https://cdn.sofifa.org/players/4/19/192985.png,Belgium,https://cdn.sofifa.org/flags/7.png,91,92,Manchester City,...,88.0,68.0,58.0,51.0,15.0,13.0,5.0,10.0,13.0,196400.0


In [5]:
# code here to drop n/a rows: keep only rows with a "Release Clause" value
df = df.dropna(subset=["Release Clause"])

In [8]:
# now check how many rows are left 
print(df.shape)
df.describe()

(18273, 89)


Unnamed: 0.1,Unnamed: 0,ID,Age,Overall,Potential,Special,International Reputation,Weak Foot,Skill Moves,Jersey Number,...,Composure,Marking,StandingTackle,SlidingTackle,GKDiving,GKHandling,GKKicking,GKPositioning,GKReflexes,Release Clause
count,18273.0,18273.0,18273.0,18273.0,18273.0,18273.0,18273.0,18273.0,18273.0,18273.0,...,18273.0,18273.0,18273.0,18273.0,18273.0,18273.0,18273.0,18273.0,18273.0,18273.0
mean,5019.238822,213547.541345,25.312975,66.172495,71.03486,1597.474635,1.105073,2.943852,2.353199,19.489958,...,58.614951,47.474087,47.877524,45.835714,16.654682,16.414382,16.264106,16.417884,16.737208,231391.862146
std,2903.698314,30434.170106,4.694085,6.690273,6.023137,270.475284,0.382282,0.659299,0.74914,15.956295,...,11.238803,19.786674,21.541282,21.173974,17.699206,16.918385,16.502271,17.044046,17.970736,305836.410351
min,0.0,16.0,16.0,46.0,48.0,731.0,1.0,1.0,1.0,1.0,...,3.0,3.0,2.0,3.0,1.0,1.0,1.0,1.0,1.0,1000.0
25%,2503.0,199498.0,22.0,62.0,67.0,1461.0,1.0,3.0,2.0,8.0,...,52.0,30.0,27.0,24.0,8.0,8.0,8.0,8.0,8.0,2100.0
50%,5014.0,220948.0,25.0,66.0,71.0,1639.0,1.0,3.0,2.0,17.0,...,60.0,53.0,55.0,53.0,11.0,11.0,11.0,11.0,11.0,17300.0
75%,7550.0,236302.0,29.0,70.0,75.0,1781.0,1.0,3.0,3.0,26.0,...,66.0,64.0,66.0,64.0,14.0,14.0,14.0,14.0,14.0,439000.0
max,9999.0,246620.0,45.0,94.0,95.0,2346.0,5.0,5.0,5.0,99.0,...,96.0,94.0,93.0,91.0,90.0,92.0,91.0,90.0,94.0,999000.0


**3) Convert the Release Clause Price from Euros to Dollars**

Now that there are no n/a values, we can change the values in the `Release Clause` column from Euro to Dollar amounts.

Assume the current Exchange Rate is
`1 Euro = 1.2 Dollars`
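The conversion can be sketched on a small made-up dataframe first. The name `EUR_TO_USD` and the `toy` values are assumptions for illustration; only the 1.2 rate comes from the exercise.

```python
import pandas as pd

# Exchange rate assumed by the exercise: 1 euro = 1.2 dollars.
EUR_TO_USD = 1.2

# Hypothetical toy values in euros.
toy = pd.DataFrame({"Release Clause": [1000.0, 2500.0]})

# Multiplying a column by a scalar converts every row at once.
toy["Release Clause"] = toy["Release Clause"] * EUR_TO_USD

print(toy)
```

Keeping the rate in a named constant makes it easy to update later and avoids a bare `1.2` whose meaning is unclear.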

In [9]:
# code here to convert the column of euros to dollars
df['Release Clause'] = df['Release Clause'] * 1.2
df.head()

Unnamed: 0.1,Unnamed: 0,ID,Name,Age,Photo,Nationality,Flag,Overall,Potential,Club,...,Composure,Marking,StandingTackle,SlidingTackle,GKDiving,GKHandling,GKKicking,GKPositioning,GKReflexes,Release Clause
0,0,158023,L. Messi,31,https://cdn.sofifa.org/players/4/19/158023.png,Argentina,https://cdn.sofifa.org/flags/52.png,94,94,FC Barcelona,...,96.0,33.0,28.0,26.0,6.0,11.0,15.0,14.0,8.0,271800.0
1,1,20801,Cristiano Ronaldo,33,https://cdn.sofifa.org/players/4/19/20801.png,Portugal,https://cdn.sofifa.org/flags/38.png,94,94,Juventus,...,95.0,28.0,31.0,23.0,7.0,11.0,15.0,14.0,11.0,152520.0
2,2,190871,Neymar Jr,26,https://cdn.sofifa.org/players/4/19/190871.png,Brazil,https://cdn.sofifa.org/flags/54.png,92,93,Paris Saint-Germain,...,94.0,27.0,24.0,33.0,9.0,9.0,15.0,15.0,11.0,273720.0
3,3,193080,De Gea,27,https://cdn.sofifa.org/players/4/19/193080.png,Spain,https://cdn.sofifa.org/flags/45.png,91,93,Manchester United,...,68.0,15.0,21.0,13.0,90.0,85.0,87.0,88.0,94.0,166320.0
4,4,192985,K. De Bruyne,27,https://cdn.sofifa.org/players/4/19/192985.png,Belgium,https://cdn.sofifa.org/flags/7.png,91,92,Manchester City,...,88.0,68.0,58.0,51.0,15.0,13.0,5.0,10.0,13.0,235680.0
