## Project ETL

### FIFA

* `fifa_db` database, created in MySQL Workbench, with the following tables:

  * `teams` table that contains the columns `team_id`, `team_name`, `league_id`, `transfer_budget`, `country_id`.
  * `players_mv` table that contains the columns `player_name`, `player_marketValue`.
  * `players` table that contains the columns `player_id`, `player_name`, `age`, `nationality`, `overall`, `potencial`,
    `club`, `player_value`, `wage`, `relase_clause`.
    
* **Extract**

  * Load each CSV into a pandas DataFrame.

* **Transform**

  * Copy only the columns needed into a new DataFrame.

  * Rename columns to fit the tables created in the database.

  * Handle any duplicates.

  * Set index to the previously created primary key.

* **Load**

  * Create a connection to the database.

  * Check for a successful connection to the database and confirm that the tables have been created.

  * Append DataFrames to tables.

* Confirm a successful **Load** by querying the database.
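
The Transform steps listed above can be sketched on a small in-memory DataFrame. This is a minimal illustration, not the notebook's actual cells: the sample rows, the `Photo` column, and the choice of `player_id` as the primary key are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample rows standing in for a raw CSV; the second row is a duplicate.
raw_df = pd.DataFrame({
    "ID": [158023, 158023, 20801],
    "Name": ["L. Messi", "L. Messi", "Cristiano Ronaldo"],
    "Age": [31, 31, 33],
    "Photo": ["a.png", "a.png", "b.png"],  # assumed extra column, not needed in the database
})

# 1. Copy only the columns needed into a new DataFrame.
players_df = raw_df[["ID", "Name", "Age"]].copy()

# 2. Rename columns to match the database table.
players_df = players_df.rename(
    columns={"ID": "player_id", "Name": "player_name", "Age": "age"}
)

# 3. Drop rows that repeat a primary-key value, keeping the first occurrence.
players_df = players_df.drop_duplicates(subset="player_id", keep="first")

# 4. Use the (assumed) primary key as the index.
players_df = players_df.set_index("player_id")

print(players_df)
```

If the primary key is moved into the index like this, `to_sql` would need `index=True` (its default) so that the key column is written to the table.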


In [1]:
import pandas as pd
from sqlalchemy import create_engine

### Extract CSVs into DataFrames

In [2]:
teams_file = "../Resources/teams.csv"
teams_data_df = pd.read_csv(teams_file, encoding='UTF-8')
teams_data_df.head()

Unnamed: 0,team_id,team_name,team_league,rival_team_id,attack,midfield,defence,transfer_budget,country
0,1,FC Barcelona,LaLiga Santander,2,87,86,85,188.000.000,Spain
1,2,Real Madrid,LaLiga Santander,1,83,88,86,188.500.000,Spain
2,3,Juventus,Serie A TIM,13,89,84,85,90.000.000,Italy
3,4,Manchester City,Premier League,12,86,88,83,170.000.000,England
4,5,FC Bayern,Bundesliga,15,85,85,85,100.000.000,Germany


In [3]:
marketValue_file = "../Resources/marketValue_player.csv"
marketValue_data_df = pd.read_csv(marketValue_file, encoding='UTF-8')
marketValue_data_df.head()

Unnamed: 0,Player,Market_Value
0,Kylian Mbappé,"200,00 Mill. €"
1,Neymar,"180,00 Mill. €"
2,Lionel Messi,"160,00 Mill. €"
3,Mohamed Salah,"150,00 Mill. €"
4,Harry Kane,"150,00 Mill. €"


In [4]:
datosFifa_file = "../Resources/jugadores.csv"
datosFifa_data_df = pd.read_csv(datosFifa_file, encoding='UTF-8')
datosFifa_data_df.head()

Unnamed: 0,ID,Name,Age,Nationality,Overall,Potential,Club,Value,Wage,Release Clause
0,158023,L. Messi,31,Argentina,94,94,FC Barcelona,€110.5M,€565K,€226.5M
1,20801,Cristiano Ronaldo,33,Portugal,94,94,Juventus,€77M,€405K,€127.1M
2,190871,Neymar Jr,26,Brazil,92,93,Paris Saint-Germain,€118.5M,€290K,€228.1M
3,193080,De Gea,27,Spain,91,93,Manchester United,€72M,€260K,€138.6M
4,192985,K. De Bruyne,27,Belgium,91,92,Manchester City,€102M,€355K,€196.4M


### Transform DataFrames

In [5]:
teams_transformed_df = teams_data_df[['team_id','team_name','team_league','rival_team_id','attack','midfield','defence','transfer_budget','country']].copy()
teams_transformed_df.head()

Unnamed: 0,team_id,team_name,team_league,rival_team_id,attack,midfield,defence,transfer_budget,country
0,1,FC Barcelona,LaLiga Santander,2,87,86,85,188.000.000,Spain
1,2,Real Madrid,LaLiga Santander,1,83,88,86,188.500.000,Spain
2,3,Juventus,Serie A TIM,13,89,84,85,90.000.000,Italy
3,4,Manchester City,Premier League,12,86,88,83,170.000.000,England
4,5,FC Bayern,Bundesliga,15,85,85,85,100.000.000,Germany


In [6]:
marketValue_transformed_df = marketValue_data_df[['Player','Market_Value']].copy()
# Rename columns
marketValue_transformed_df = marketValue_transformed_df.rename(columns={'Player':'player_name', 'Market_Value':'player_marketValue'})
marketValue_transformed_df.head()

Unnamed: 0,player_name,player_marketValue
0,Kylian Mbappé,"200,00 Mill. €"
1,Neymar,"180,00 Mill. €"
2,Lionel Messi,"160,00 Mill. €"
3,Mohamed Salah,"150,00 Mill. €"
4,Harry Kane,"150,00 Mill. €"


In [7]:
datosFifa_transformed_df = datosFifa_data_df[['ID','Name','Age','Nationality','Overall','Potential','Club','Value','Wage','Release Clause']].copy()
# Rename columns
datosFifa_transformed_df = datosFifa_transformed_df.rename(columns={'ID':'player_id','Name':'player_name','Age':'age','Nationality':'nationality','Overall':'overall','Potential':'potencial','Club':'club','Value':'player_value','Wage':'wage','Release Clause':'relase_clause'})
datosFifa_transformed_df.head()

Unnamed: 0,player_id,player_name,age,nationality,overall,potencial,club,player_value,wage,relase_clause
0,158023,L. Messi,31,Argentina,94,94,FC Barcelona,€110.5M,€565K,€226.5M
1,20801,Cristiano Ronaldo,33,Portugal,94,94,Juventus,€77M,€405K,€127.1M
2,190871,Neymar Jr,26,Brazil,92,93,Paris Saint-Germain,€118.5M,€290K,€228.1M
3,193080,De Gea,27,Spain,91,93,Manchester United,€72M,€260K,€138.6M
4,192985,K. De Bruyne,27,Belgium,91,92,Manchester City,€102M,€355K,€196.4M


### Database connection

In [8]:
rds_connection_string = 'student:facil@127.0.0.1/fifa_db?charset=utf8mb4'

In [9]:
engine = create_engine(f'mysql://{rds_connection_string}')

### Checking tables

In [10]:
from sqlalchemy import inspect  # Engine.table_names() was removed in SQLAlchemy 2.0
inspect(engine).get_table_names()

['players', 'players_mv', 'teams']
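
Instead of reading the listing by eye, the check can be automated by comparing it against the tables the load step expects. The helper below is a hypothetical sketch (`missing_tables` is not part of pandas or SQLAlchemy); the `existing` list is copied from the output above.

```python
def missing_tables(existing, expected):
    """Return the expected table names that are absent, sorted alphabetically."""
    return sorted(set(expected) - set(existing))


existing = ['players', 'players_mv', 'teams']   # table listing shown above
expected = ['teams', 'players_mv', 'players']   # tables the load step writes to

missing = missing_tables(existing, expected)
if missing:
    raise RuntimeError(f"Tables not found in fifa_db: {missing}")
print("All expected tables exist.")
```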

### Load DataFrames into database

In [11]:
teams_transformed_df.to_sql(name='teams', con=engine, if_exists='append', index=False)

In [12]:
marketValue_transformed_df.to_sql(name='players_mv', con=engine, if_exists='append', index=False)

In [13]:
datosFifa_transformed_df.to_sql(name='players', con=engine, if_exists='append', index=False)
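
The append-then-query pattern used here can be tried without a MySQL server. The sketch below swaps in an in-memory SQLite database through the standard-library `sqlite3` module, which pandas accepts as a connection. The table definition and the two sample rows are assumptions for illustration only.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database used only for illustration (the notebook targets MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teams (team_id INTEGER PRIMARY KEY, team_name TEXT)")

teams_df = pd.DataFrame(
    {"team_id": [1, 2], "team_name": ["FC Barcelona", "Real Madrid"]}
)

# if_exists='append' inserts rows into the existing table instead of replacing it;
# index=False keeps the DataFrame's positional index out of the table.
teams_df.to_sql(name="teams", con=conn, if_exists="append", index=False)

# Read the rows back to confirm the load.
loaded_df = pd.read_sql_query(
    "SELECT team_id, team_name FROM teams ORDER BY team_id", con=conn
)
print(loaded_df)
conn.close()
```

Because `append` never removes existing rows, running a load cell a second time against a table with a primary key would be expected to fail on the duplicate keys, which is one reason to handle duplicates during the Transform step.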

### Confirm data has been added by querying each table

In [14]:
pd.read_sql_query('select * from teams', con=engine).head()

Unnamed: 0,team_id,team_name,team_league,rival_team_id,attack,midfield,defence,transfer_budget,country
0,1,FC Barcelona,LaLiga Santander,2,87,86,85,188.000.000,Spain
1,2,Real Madrid,LaLiga Santander,1,83,88,86,188.500.000,Spain
2,3,Juventus,Serie A TIM,13,89,84,85,90.000.000,Italy
3,4,Manchester City,Premier League,12,86,88,83,170.000.000,England
4,5,FC Bayern,Bundesliga,15,85,85,85,100.000.000,Germany


In [15]:
pd.read_sql_query('select * from players_mv', con=engine).head()

Unnamed: 0,player_name,player_marketValue
0,Kylian Mbappé,"200,00 Mill. €"
1,Neymar,"180,00 Mill. €"
2,Lionel Messi,"160,00 Mill. €"
3,Mohamed Salah,"150,00 Mill. €"
4,Harry Kane,"150,00 Mill. €"


In [16]:
pd.read_sql_query('select * from players', con=engine).head()

Unnamed: 0,player_id,player_name,age,nationality,overall,potencial,club,player_value,wage,relase_clause
0,158023,L. Messi,31,Argentina,94,94,FC Barcelona,€110.5M,€565K,€226.5M
1,20801,Cristiano Ronaldo,33,Portugal,94,94,Juventus,€77M,€405K,€127.1M
2,190871,Neymar Jr,26,Brazil,92,93,Paris Saint-Germain,€118.5M,€290K,€228.1M
3,193080,De Gea,27,Spain,91,93,Manchester United,€72M,€260K,€138.6M
4,192985,K. De Bruyne,27,Belgium,91,92,Manchester City,€102M,€355K,€196.4M
