# Pandas Data Cleaning

In this challenge you will preprocess a dataset for the video game FIFA 19 (https://www.kaggle.com/karangadiya/fifa19). The dataset contains both in-game data and information about the players' real-life careers.

**1) Read the CSV file into a pandas dataframe**

The data you'll be working with is in a file located at `'../data/fifa.csv'`. Use your knowledge of pandas to create a new dataframe from the CSV data.

Check the contents of the first few rows of your dataframe, then show the size of the dataframe.

In [35]:
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings('ignore')

In [36]:
# code here to read in `fifa.csv` and assign a dataframe to the variable `df`
df = pd.read_csv('../data/fifa.csv')
df.head()

Unnamed: 0.1,Unnamed: 0,ID,Name,Age,Photo,Nationality,Flag,Overall,Potential,Club,...,Composure,Marking,StandingTackle,SlidingTackle,GKDiving,GKHandling,GKKicking,GKPositioning,GKReflexes,Release Clause
0,0,158023,L. Messi,31,https://cdn.sofifa.org/players/4/19/158023.png,Argentina,https://cdn.sofifa.org/flags/52.png,94,94,FC Barcelona,...,96.0,33.0,28.0,26.0,6.0,11.0,15.0,14.0,8.0,226500.0
1,1,20801,Cristiano Ronaldo,33,https://cdn.sofifa.org/players/4/19/20801.png,Portugal,https://cdn.sofifa.org/flags/38.png,94,94,Juventus,...,95.0,28.0,31.0,23.0,7.0,11.0,15.0,14.0,11.0,127100.0
2,2,190871,Neymar Jr,26,https://cdn.sofifa.org/players/4/19/190871.png,Brazil,https://cdn.sofifa.org/flags/54.png,92,93,Paris Saint-Germain,...,94.0,27.0,24.0,33.0,9.0,9.0,15.0,15.0,11.0,228100.0
3,3,193080,De Gea,27,https://cdn.sofifa.org/players/4/19/193080.png,Spain,https://cdn.sofifa.org/flags/45.png,91,93,Manchester United,...,68.0,15.0,21.0,13.0,90.0,85.0,87.0,88.0,94.0,138600.0
4,4,192985,K. De Bruyne,27,https://cdn.sofifa.org/players/4/19/192985.png,Belgium,https://cdn.sofifa.org/flags/7.png,91,92,Manchester City,...,88.0,68.0,58.0,51.0,15.0,13.0,5.0,10.0,13.0,196400.0


In [37]:
# code here to see the size of the dataframe
df.shape

(20000, 89)
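
As a minimal sketch of the same read-and-inspect step, the snippet below loads a small in-memory CSV instead of the real file. The three sample rows and the names `sample_csv` and `sample_df` are made up for illustration and are not taken from `fifa.csv`.

```python
import io
import pandas as pd

# Hypothetical three-row sample standing in for '../data/fifa.csv'
sample_csv = io.StringIO(
    "ID,Name,Age,Release Clause\n"
    "1,Player A,31,226500.0\n"
    "2,Player B,33,\n"
    "3,Player C,26,228100.0\n"
)

# read_csv accepts a path or any file-like object
sample_df = pd.read_csv(sample_csv)

# head(n) returns the first n rows; shape is a (rows, columns) tuple
first_rows = sample_df.head(2)
n_rows, n_cols = sample_df.shape

print(first_rows)
print(n_rows, n_cols)
```

Note that the empty "Release Clause" field in the second row is read in as `NaN`, which is how pandas represents a missing value.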

**2) Drop n/a rows for "Release Clause"**

A release clause is the part of a soccer player's contract that deals with being bought out by another team. Drop the rows for which "Release Clause" is missing or not given. After you have dropped them, check how many rows remain.

In [38]:
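# code here to drop n/a rows
# NaN > 0 evaluates to False, so this filter removes missing values
# (and any non-positive ones); .copy() avoids modifying a view later on
df = df[df['Release Clause'] > 0].copy()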


In [32]:
# now check how many rows are left
df['Release Clause'].count()

18273
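
The filter `df['Release Clause'] > 0` and pandas' own `dropna` are not quite the same operation. The sketch below compares them on a hypothetical four-row frame (the names `toy`, `dropped`, and `filtered` and all values are invented for illustration): `dropna(subset=...)` removes only missing values, while the `> 0` comparison also removes zeros.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one missing value and one zero
toy = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "Release Clause": [226500.0, np.nan, 0.0, 127100.0],
})

# Remove only the rows where "Release Clause" is NaN
dropped = toy.dropna(subset=["Release Clause"])

# Remove NaN rows *and* rows with a non-positive value
filtered = toy[toy["Release Clause"] > 0]

rows_after_dropna = len(dropped)
rows_after_filter = len(filtered)
print(rows_after_dropna, rows_after_filter)
```

Which one is appropriate depends on whether a value of zero should count as "not given" in the real data, which this sketch does not settle.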

**3) Convert the Release Clause Price from Euros to Dollars**

Now that there are no n/a values, we can change the values in the `Release Clause` column from Euro to Dollar amounts.

Assume the current Exchange Rate is
`1 Euro = 1.2 Dollars`

In [34]:
# code here to convert the column from euros to dollars
# note: this overwrites the column in place, so run this cell only once
exchange_rate = 1.2
df['Release Clause'] = df['Release Clause'] * exchange_rate
df['Release Clause']

0        271800.0
1        152520.0
2        273720.0
3        166320.0
4        235680.0
           ...   
19995    171600.0
19996    135600.0
19997    198000.0
19998    171600.0
19999    198000.0
Name: Release Clause, Length: 18273, dtype: float64
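
Because the assignment overwrites `Release Clause` in place, running the conversion cell more than once multiplies the values by the exchange rate each time. One way to make the step safe to re-run, sketched below on hypothetical toy data, is to write the result to a separate column. The column name `Release Clause USD` is an assumption for illustration and is not part of the dataset.

```python
import pandas as pd

EXCHANGE_RATE = 1.2  # 1 euro = 1.2 dollars, as assumed in the exercise

# Hypothetical two-row frame with euro amounts
toy = pd.DataFrame({"Release Clause": [226500.0, 127100.0]})

def add_usd_column(frame):
    """Return a copy with a dollar column derived from the euro column."""
    out = frame.copy()
    # Vectorized multiplication; no lambda or map needed
    out["Release Clause USD"] = out["Release Clause"] * EXCHANGE_RATE
    return out

# Applying the function twice gives the same result as applying it once,
# because the euro column is never modified
once = add_usd_column(toy)
twice = add_usd_column(once)
print(twice)
```

The exercise itself asks for the original column to be changed, so this is an alternative design rather than the expected answer.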