# Search and Filter DataFrames in PySpark HW

Now it's time to put what you've learned into action with a homework assignment!

In case you need it again, here is the link to the documentation listing all of the functions available in the pyspark.sql.functions library:
http://spark.apache.org/docs/latest/api/python/pyspark.sql.html#module-pyspark.sql.functions


### First set up your Spark Session!
Alright so first things first, let's start up our pyspark instance.

In [1]:
import pandas as pd
import numpy as np
import datetime as dt
from PIL import Image
import requests
import IPython

pd.options.display.max_rows = None
pd.options.display.max_columns = None

import findspark
findspark.init()
import pyspark
from pyspark.sql import SparkSession
from pyspark.sql.functions import *  # avg, count, expr, ...
from pyspark.sql.types import *

In [2]:
# initialize
sc = pyspark.SparkContext()
spark = SparkSession(sc)
spark.sparkContext.appName = 'searchandF'
# count the registered block managers (driver included) as a rough stand-in for the number of cores
print('%d cores'%spark._jsc.sc().getExecutorMemoryStatus().keySet().size())
spark

1 cores


## Read in the DataFrame for this Notebook

We will continue to use the fifa19.csv file for this notebook. Make sure that you are writing the correct path to the file.

In [3]:
# define the schema
schem = StructType([StructField('_c0', IntegerType()),
                    StructField('ID', IntegerType()),
                    StructField('Name', StringType()),
                    StructField('Age', IntegerType()),
                    StructField('Photo', StringType()),
                    StructField('Nationality', StringType()),
                    StructField('Flag', StringType()),
                    StructField('Overall', IntegerType()),
                    StructField('Potential', IntegerType()),
                    StructField('Club', StringType()),
                    StructField('Club Logo', StringType()),
                    StructField('Value', StringType()),
                    StructField('Wage', StringType()),
                    StructField('Special', IntegerType()),
                    StructField('Preferred Foot', StringType()),
                    StructField('International Reputation', IntegerType()),
                    StructField('Weak Foot', IntegerType()),
                    StructField('Skill Moves', IntegerType()),
                    StructField('Work Rate', StringType()),
                    StructField('Body Type', StringType()),
                    StructField('Real Face', StringType()),
                    StructField('Position', StringType()),
                    StructField('Jersey Number', IntegerType()),
                    StructField('Joined', StringType()),
                    StructField('Loaned From', StringType()),
                    StructField('Contract Valid Until', IntegerType()),
                    StructField('Height', StringType()),
                    StructField('Weight', StringType()),
                    StructField('LS', StringType()),
                    StructField('ST', StringType()),
                    StructField('RS', StringType()),
                    StructField('LW', StringType()),
                    StructField('LF', StringType()),
                    StructField('CF', StringType()),
                    StructField('RF', StringType()),
                    StructField('RW', StringType()),
                    StructField('LAM', StringType()),
                    StructField('CAM', StringType()),
                    StructField('RAM', StringType()),
                    StructField('LM', StringType()),
                    StructField('LCM', StringType()),
                    StructField('CM', StringType()),
                    StructField('RCM', StringType()),
                    StructField('RM', StringType()),
                    StructField('LWB', StringType()),
                    StructField('LDM', StringType()),
                    StructField('CDM', StringType()),
                    StructField('RDM', StringType()),
                    StructField('RWB', StringType()),
                    StructField('LB', StringType()),
                    StructField('LCB', StringType()),
                    StructField('CB', StringType()),
                    StructField('RCB', StringType()),
                    StructField('RB', StringType()),
                    StructField('Crossing', IntegerType()),
                    StructField('Finishing', IntegerType()),
                    StructField('HeadingAccuracy', IntegerType()),
                    StructField('ShortPassing', IntegerType()),
                    StructField('Volleys', IntegerType()),
                    StructField('Dribbling', IntegerType()),
                    StructField('Curve', IntegerType()),
                    StructField('FKAccuracy', IntegerType()),
                    StructField('LongPassing', IntegerType()),
                    StructField('BallControl', IntegerType()),
                    StructField('Acceleration', IntegerType()),
                    StructField('SprintSpeed', IntegerType()),
                    StructField('Agility', IntegerType()),
                    StructField('Reactions', IntegerType()),
                    StructField('Balance', IntegerType()),
                    StructField('ShotPower', IntegerType()),
                    StructField('Jumping', IntegerType()),
                    StructField('Stamina', IntegerType()),
                    StructField('Strength', IntegerType()),
                    StructField('LongShots', IntegerType()),
                    StructField('Aggression', IntegerType()),
                    StructField('Interceptions', IntegerType()),
                    StructField('Positioning', IntegerType()),
                    StructField('Vision', IntegerType()),
                    StructField('Penalties', IntegerType()),
                    StructField('Composure', IntegerType()),
                    StructField('Marking', IntegerType()),
                    StructField('StandingTackle', IntegerType()),
                    StructField('SlidingTackle', IntegerType()),
                    StructField('GKDiving', IntegerType()),
                    StructField('GKHandling', IntegerType()),
                    StructField('GKKicking', IntegerType()),
                    StructField('GKPositioning', IntegerType()),
                    StructField('GKReflexes', IntegerType()),
                    StructField('Release Clause', StringType())])

# load data, then drop the first column (the unnamed row index, _c0)
fil = '../../data/fifa19.csv'
fifa = spark.read.format('csv').options(header=True).schema(schem).load(fil)
fifa = fifa.select(fifa.columns[1:])

# take a look at the first 10 rows
display(fifa.limit(10).toPandas())

Unnamed: 0,ID,Name,Age,Photo,Nationality,Flag,Overall,Potential,Club,Club Logo,Value,Wage,Special,Preferred Foot,International Reputation,Weak Foot,Skill Moves,Work Rate,Body Type,Real Face,Position,Jersey Number,Joined,Loaned From,Contract Valid Until,Height,Weight,LS,ST,RS,LW,LF,CF,RF,RW,LAM,CAM,RAM,LM,LCM,CM,RCM,RM,LWB,LDM,CDM,RDM,RWB,LB,LCB,CB,RCB,RB,Crossing,Finishing,HeadingAccuracy,ShortPassing,Volleys,Dribbling,Curve,FKAccuracy,LongPassing,BallControl,Acceleration,SprintSpeed,Agility,Reactions,Balance,ShotPower,Jumping,Stamina,Strength,LongShots,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Marking,StandingTackle,SlidingTackle,GKDiving,GKHandling,GKKicking,GKPositioning,GKReflexes,Release Clause
0,158023,L. Messi,31,https://cdn.sofifa.org/players/4/19/158023.png,Argentina,https://cdn.sofifa.org/flags/52.png,94,94,FC Barcelona,https://cdn.sofifa.org/teams/2/light/241.png,€110.5M,€565K,2202,Left,5,4,4,Medium/ Medium,Messi,Yes,RF,10,"Jul 1, 2004",,2021,5'7,159lbs,88+2,88+2,88+2,92+2,93+2,93+2,93+2,92+2,93+2,93+2,93+2,91+2,84+2,84+2,84+2,91+2,64+2,61+2,61+2,61+2,64+2,59+2,47+2,47+2,47+2,59+2,84,95,70,90,86,97,93,94,87,96,91,86,91,95,95,85,68,72,59,94,48,22,94,94,75,96,33,28,26,6,11,15,14,8,€226.5M
1,20801,Cristiano Ronaldo,33,https://cdn.sofifa.org/players/4/19/20801.png,Portugal,https://cdn.sofifa.org/flags/38.png,94,94,Juventus,https://cdn.sofifa.org/teams/2/light/45.png,€77M,€405K,2228,Right,5,4,5,High/ Low,C. Ronaldo,Yes,ST,7,"Jul 10, 2018",,2022,6'2,183lbs,91+3,91+3,91+3,89+3,90+3,90+3,90+3,89+3,88+3,88+3,88+3,88+3,81+3,81+3,81+3,88+3,65+3,61+3,61+3,61+3,65+3,61+3,53+3,53+3,53+3,61+3,84,94,89,81,87,88,81,76,77,94,89,91,87,96,70,95,95,88,79,93,63,29,95,82,85,95,28,31,23,7,11,15,14,11,€127.1M
2,190871,Neymar Jr,26,https://cdn.sofifa.org/players/4/19/190871.png,Brazil,https://cdn.sofifa.org/flags/54.png,92,93,Paris Saint-Germain,https://cdn.sofifa.org/teams/2/light/73.png,€118.5M,€290K,2143,Right,5,5,5,High/ Medium,Neymar,Yes,LW,10,"Aug 3, 2017",,2022,5'9,150lbs,84+3,84+3,84+3,89+3,89+3,89+3,89+3,89+3,89+3,89+3,89+3,88+3,81+3,81+3,81+3,88+3,65+3,60+3,60+3,60+3,65+3,60+3,47+3,47+3,47+3,60+3,79,87,62,84,84,96,88,87,78,95,94,90,96,94,84,80,61,81,49,82,56,36,89,87,81,94,27,24,33,9,9,15,15,11,€228.1M
3,193080,De Gea,27,https://cdn.sofifa.org/players/4/19/193080.png,Spain,https://cdn.sofifa.org/flags/45.png,91,93,Manchester United,https://cdn.sofifa.org/teams/2/light/11.png,€72M,€260K,1471,Right,4,3,1,Medium/ Medium,Lean,Yes,GK,1,"Jul 1, 2011",,2020,6'4,168lbs,,,,,,,,,,,,,,,,,,,,,,,,,,,17,13,21,50,13,18,21,19,51,42,57,58,60,90,43,31,67,43,64,12,38,30,12,68,40,68,15,21,13,90,85,87,88,94,€138.6M
4,192985,K. De Bruyne,27,https://cdn.sofifa.org/players/4/19/192985.png,Belgium,https://cdn.sofifa.org/flags/7.png,91,92,Manchester City,https://cdn.sofifa.org/teams/2/light/10.png,€102M,€355K,2281,Right,4,5,4,High/ High,Normal,Yes,RCM,7,"Aug 30, 2015",,2023,5'11,154lbs,82+3,82+3,82+3,87+3,87+3,87+3,87+3,87+3,88+3,88+3,88+3,88+3,87+3,87+3,87+3,88+3,77+3,77+3,77+3,77+3,77+3,73+3,66+3,66+3,66+3,73+3,93,82,55,92,82,86,85,83,91,91,78,76,79,91,77,91,63,90,75,91,76,61,87,94,79,88,68,58,51,15,13,5,10,13,€196.4M
5,183277,E. Hazard,27,https://cdn.sofifa.org/players/4/19/183277.png,Belgium,https://cdn.sofifa.org/flags/7.png,91,91,Chelsea,https://cdn.sofifa.org/teams/2/light/5.png,€93M,€340K,2142,Right,4,4,4,High/ Medium,Normal,Yes,LF,10,"Jul 1, 2012",,2020,5'8,163lbs,83+3,83+3,83+3,89+3,88+3,88+3,88+3,89+3,89+3,89+3,89+3,89+3,82+3,82+3,82+3,89+3,66+3,63+3,63+3,63+3,66+3,60+3,49+3,49+3,49+3,60+3,81,84,61,89,80,95,83,79,83,94,94,88,95,90,94,82,56,83,66,80,54,41,87,89,86,91,34,27,22,11,12,6,8,8,€172.1M
6,177003,L. Modrić,32,https://cdn.sofifa.org/players/4/19/177003.png,Croatia,https://cdn.sofifa.org/flags/10.png,91,91,Real Madrid,https://cdn.sofifa.org/teams/2/light/243.png,€67M,€420K,2280,Right,4,4,4,High/ High,Lean,Yes,RCM,10,"Aug 1, 2012",,2020,5'8,146lbs,77+3,77+3,77+3,85+3,84+3,84+3,84+3,85+3,87+3,87+3,87+3,86+3,88+3,88+3,88+3,86+3,82+3,81+3,81+3,81+3,82+3,79+3,71+3,71+3,71+3,79+3,86,72,55,93,76,90,85,78,88,93,80,72,93,90,94,79,68,89,58,82,62,83,79,92,82,84,60,76,73,13,9,7,14,9,€137.4M
7,176580,L. Suárez,31,https://cdn.sofifa.org/players/4/19/176580.png,Uruguay,https://cdn.sofifa.org/flags/60.png,91,91,FC Barcelona,https://cdn.sofifa.org/teams/2/light/241.png,€80M,€455K,2346,Right,5,4,3,High/ Medium,Normal,Yes,RS,9,"Jul 11, 2014",,2021,6'0,190lbs,87+5,87+5,87+5,86+5,87+5,87+5,87+5,86+5,85+5,85+5,85+5,84+5,79+5,79+5,79+5,84+5,69+5,68+5,68+5,68+5,69+5,66+5,63+5,63+5,63+5,66+5,77,93,77,82,88,87,86,84,64,90,86,75,82,92,83,86,69,90,83,85,87,41,92,84,85,85,62,45,38,27,25,31,33,37,€164M
8,155862,Sergio Ramos,32,https://cdn.sofifa.org/players/4/19/155862.png,Spain,https://cdn.sofifa.org/flags/45.png,91,91,Real Madrid,https://cdn.sofifa.org/teams/2/light/243.png,€51M,€380K,2201,Right,4,3,3,High/ Medium,Normal,Yes,RCB,15,"Aug 1, 2005",,2020,6'0,181lbs,73+3,73+3,73+3,70+3,71+3,71+3,71+3,70+3,71+3,71+3,71+3,72+3,75+3,75+3,75+3,72+3,81+3,84+3,84+3,84+3,81+3,84+3,87+3,87+3,87+3,84+3,66,60,91,78,66,63,74,72,77,84,76,75,78,85,66,79,93,84,83,59,88,90,60,63,75,82,87,92,91,11,8,9,7,11,€104.6M
9,200389,J. Oblak,25,https://cdn.sofifa.org/players/4/19/200389.png,Slovenia,https://cdn.sofifa.org/flags/44.png,90,93,Atlético Madrid,https://cdn.sofifa.org/teams/2/light/240.png,€68M,€94K,1331,Right,3,3,1,Medium/ Medium,Normal,Yes,GK,1,"Jul 16, 2014",,2021,6'2,192lbs,,,,,,,,,,,,,,,,,,,,,,,,,,,13,11,15,29,13,12,13,14,26,16,43,60,67,86,49,22,76,41,78,12,34,19,11,70,11,70,27,12,18,86,92,78,88,89,€144.5M
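
Typing out one `StructField` per column is explicit but long. As a possible shortcut, `DataFrameReader.schema()` also accepts a DDL-formatted string, and that string can be generated from a list of column names. The sketch below is plain Python and covers only the first eleven columns of the file; `columns`, `int_columns` and `build_ddl` are illustrative names, not part of the notebook.

```python
# Column names in file order (only the first eleven, for brevity).
columns = ['_c0', 'ID', 'Name', 'Age', 'Photo', 'Nationality',
           'Flag', 'Overall', 'Potential', 'Club', 'Club Logo']

# Columns to read as integers; everything else is read as a string.
int_columns = {'_c0', 'ID', 'Age', 'Overall', 'Potential'}


def build_ddl(names, int_names):
    """Build a DDL schema string, back-quoting names so spaces are allowed."""
    fields = []
    for name in names:
        spark_type = 'INT' if name in int_names else 'STRING'
        fields.append(f'`{name}` {spark_type}')
    return ', '.join(fields)


schema_ddl = build_ddl(columns, int_columns)
print(schema_ddl)
```

With the full column list, the resulting string could be passed as `.schema(schema_ddl)` in place of `schem`. The order of `columns` has to match the file, since a user-supplied CSV schema is matched to the columns by position by default.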


## About this dataframe

The **fifa19.csv** dataset includes a list of all the FIFA 2019 players and their attributes listed below: 

 - **General:** Age, Nationality, Overall, Potential, Club
 - **Metrics:** Value, Wage
 - **Player Descriptive:** Preferred Foot, International Reputation, Weak Foot, Skill Moves, Work Rate, Position, Jersey Number, Joined, Loaned From, Contract Valid Until, Height, Weight
 - **Position:** LS, ST, RS, LW, LF, CF, RF, RW, LAM, CAM, RAM, LM, LCM, CM, RCM, RM, LWB, LDM, CDM, RDM, RWB, LB, LCB, CB, RCB, RB
 - **Other:** Crossing, Finishing, HeadingAccuracy, ShortPassing, Volleys, Dribbling, Curve, FKAccuracy, LongPassing, BallControl, Acceleration, SprintSpeed, Agility, Reactions, Balance, ShotPower, Jumping, Stamina, Strength, LongShots, Aggression, Interceptions, Positioning, Vision, Penalties, Composure, Marking, StandingTackle, SlidingTackle, GKDiving, GKHandling, GKKicking, GKPositioning, GKReflexes, and Release Clause.

**Source:** https://www.kaggle.com/karangadiya/fifa19
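
Note that Value, Wage and Release Clause are stored as strings such as `€110.5M` or `€565K`, so they cannot be compared numerically as they are. A minimal plain-Python sketch of how such a string could be split into a currency symbol and an amount follows. The helper name `parse_money` is hypothetical, and it assumes the only suffixes are `K` and `M`.

```python
from decimal import Decimal

# Multipliers for the suffixes seen in the preview above (assumed complete).
SUFFIXES = {'K': 1_000, 'M': 1_000_000}


def parse_money(text):
    """Split a string like '€110.5M' into (currency symbol, whole amount)."""
    if not text:
        return None  # null or empty input
    symbol, number = text[0], text[1:]
    multiplier = 1
    if number and number[-1] in SUFFIXES:
        multiplier = SUFFIXES[number[-1]]
        number = number[:-1]
    # Decimal avoids floating-point rounding on values such as 127.1
    return symbol, int(Decimal(number) * multiplier)


print(parse_money('€110.5M'))  # ('€', 110500000)
print(parse_money('€565K'))    # ('€', 565000)
```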

Use the .toPandas() method to view the first few lines of the dataset so we know what we are working with. 

In [4]:
display(fifa.limit(10).toPandas())

Unnamed: 0,ID,Name,Age,Photo,Nationality,Flag,Overall,Potential,Club,Club Logo,Value,Wage,Special,Preferred Foot,International Reputation,Weak Foot,Skill Moves,Work Rate,Body Type,Real Face,Position,Jersey Number,Joined,Loaned From,Contract Valid Until,Height,Weight,LS,ST,RS,LW,LF,CF,RF,RW,LAM,CAM,RAM,LM,LCM,CM,RCM,RM,LWB,LDM,CDM,RDM,RWB,LB,LCB,CB,RCB,RB,Crossing,Finishing,HeadingAccuracy,ShortPassing,Volleys,Dribbling,Curve,FKAccuracy,LongPassing,BallControl,Acceleration,SprintSpeed,Agility,Reactions,Balance,ShotPower,Jumping,Stamina,Strength,LongShots,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Marking,StandingTackle,SlidingTackle,GKDiving,GKHandling,GKKicking,GKPositioning,GKReflexes,Release Clause
0,158023,L. Messi,31,https://cdn.sofifa.org/players/4/19/158023.png,Argentina,https://cdn.sofifa.org/flags/52.png,94,94,FC Barcelona,https://cdn.sofifa.org/teams/2/light/241.png,€110.5M,€565K,2202,Left,5,4,4,Medium/ Medium,Messi,Yes,RF,10,"Jul 1, 2004",,2021,5'7,159lbs,88+2,88+2,88+2,92+2,93+2,93+2,93+2,92+2,93+2,93+2,93+2,91+2,84+2,84+2,84+2,91+2,64+2,61+2,61+2,61+2,64+2,59+2,47+2,47+2,47+2,59+2,84,95,70,90,86,97,93,94,87,96,91,86,91,95,95,85,68,72,59,94,48,22,94,94,75,96,33,28,26,6,11,15,14,8,€226.5M
1,20801,Cristiano Ronaldo,33,https://cdn.sofifa.org/players/4/19/20801.png,Portugal,https://cdn.sofifa.org/flags/38.png,94,94,Juventus,https://cdn.sofifa.org/teams/2/light/45.png,€77M,€405K,2228,Right,5,4,5,High/ Low,C. Ronaldo,Yes,ST,7,"Jul 10, 2018",,2022,6'2,183lbs,91+3,91+3,91+3,89+3,90+3,90+3,90+3,89+3,88+3,88+3,88+3,88+3,81+3,81+3,81+3,88+3,65+3,61+3,61+3,61+3,65+3,61+3,53+3,53+3,53+3,61+3,84,94,89,81,87,88,81,76,77,94,89,91,87,96,70,95,95,88,79,93,63,29,95,82,85,95,28,31,23,7,11,15,14,11,€127.1M
2,190871,Neymar Jr,26,https://cdn.sofifa.org/players/4/19/190871.png,Brazil,https://cdn.sofifa.org/flags/54.png,92,93,Paris Saint-Germain,https://cdn.sofifa.org/teams/2/light/73.png,€118.5M,€290K,2143,Right,5,5,5,High/ Medium,Neymar,Yes,LW,10,"Aug 3, 2017",,2022,5'9,150lbs,84+3,84+3,84+3,89+3,89+3,89+3,89+3,89+3,89+3,89+3,89+3,88+3,81+3,81+3,81+3,88+3,65+3,60+3,60+3,60+3,65+3,60+3,47+3,47+3,47+3,60+3,79,87,62,84,84,96,88,87,78,95,94,90,96,94,84,80,61,81,49,82,56,36,89,87,81,94,27,24,33,9,9,15,15,11,€228.1M
3,193080,De Gea,27,https://cdn.sofifa.org/players/4/19/193080.png,Spain,https://cdn.sofifa.org/flags/45.png,91,93,Manchester United,https://cdn.sofifa.org/teams/2/light/11.png,€72M,€260K,1471,Right,4,3,1,Medium/ Medium,Lean,Yes,GK,1,"Jul 1, 2011",,2020,6'4,168lbs,,,,,,,,,,,,,,,,,,,,,,,,,,,17,13,21,50,13,18,21,19,51,42,57,58,60,90,43,31,67,43,64,12,38,30,12,68,40,68,15,21,13,90,85,87,88,94,€138.6M
4,192985,K. De Bruyne,27,https://cdn.sofifa.org/players/4/19/192985.png,Belgium,https://cdn.sofifa.org/flags/7.png,91,92,Manchester City,https://cdn.sofifa.org/teams/2/light/10.png,€102M,€355K,2281,Right,4,5,4,High/ High,Normal,Yes,RCM,7,"Aug 30, 2015",,2023,5'11,154lbs,82+3,82+3,82+3,87+3,87+3,87+3,87+3,87+3,88+3,88+3,88+3,88+3,87+3,87+3,87+3,88+3,77+3,77+3,77+3,77+3,77+3,73+3,66+3,66+3,66+3,73+3,93,82,55,92,82,86,85,83,91,91,78,76,79,91,77,91,63,90,75,91,76,61,87,94,79,88,68,58,51,15,13,5,10,13,€196.4M
5,183277,E. Hazard,27,https://cdn.sofifa.org/players/4/19/183277.png,Belgium,https://cdn.sofifa.org/flags/7.png,91,91,Chelsea,https://cdn.sofifa.org/teams/2/light/5.png,€93M,€340K,2142,Right,4,4,4,High/ Medium,Normal,Yes,LF,10,"Jul 1, 2012",,2020,5'8,163lbs,83+3,83+3,83+3,89+3,88+3,88+3,88+3,89+3,89+3,89+3,89+3,89+3,82+3,82+3,82+3,89+3,66+3,63+3,63+3,63+3,66+3,60+3,49+3,49+3,49+3,60+3,81,84,61,89,80,95,83,79,83,94,94,88,95,90,94,82,56,83,66,80,54,41,87,89,86,91,34,27,22,11,12,6,8,8,€172.1M
6,177003,L. Modrić,32,https://cdn.sofifa.org/players/4/19/177003.png,Croatia,https://cdn.sofifa.org/flags/10.png,91,91,Real Madrid,https://cdn.sofifa.org/teams/2/light/243.png,€67M,€420K,2280,Right,4,4,4,High/ High,Lean,Yes,RCM,10,"Aug 1, 2012",,2020,5'8,146lbs,77+3,77+3,77+3,85+3,84+3,84+3,84+3,85+3,87+3,87+3,87+3,86+3,88+3,88+3,88+3,86+3,82+3,81+3,81+3,81+3,82+3,79+3,71+3,71+3,71+3,79+3,86,72,55,93,76,90,85,78,88,93,80,72,93,90,94,79,68,89,58,82,62,83,79,92,82,84,60,76,73,13,9,7,14,9,€137.4M
7,176580,L. Suárez,31,https://cdn.sofifa.org/players/4/19/176580.png,Uruguay,https://cdn.sofifa.org/flags/60.png,91,91,FC Barcelona,https://cdn.sofifa.org/teams/2/light/241.png,€80M,€455K,2346,Right,5,4,3,High/ Medium,Normal,Yes,RS,9,"Jul 11, 2014",,2021,6'0,190lbs,87+5,87+5,87+5,86+5,87+5,87+5,87+5,86+5,85+5,85+5,85+5,84+5,79+5,79+5,79+5,84+5,69+5,68+5,68+5,68+5,69+5,66+5,63+5,63+5,63+5,66+5,77,93,77,82,88,87,86,84,64,90,86,75,82,92,83,86,69,90,83,85,87,41,92,84,85,85,62,45,38,27,25,31,33,37,€164M
8,155862,Sergio Ramos,32,https://cdn.sofifa.org/players/4/19/155862.png,Spain,https://cdn.sofifa.org/flags/45.png,91,91,Real Madrid,https://cdn.sofifa.org/teams/2/light/243.png,€51M,€380K,2201,Right,4,3,3,High/ Medium,Normal,Yes,RCB,15,"Aug 1, 2005",,2020,6'0,181lbs,73+3,73+3,73+3,70+3,71+3,71+3,71+3,70+3,71+3,71+3,71+3,72+3,75+3,75+3,75+3,72+3,81+3,84+3,84+3,84+3,81+3,84+3,87+3,87+3,87+3,84+3,66,60,91,78,66,63,74,72,77,84,76,75,78,85,66,79,93,84,83,59,88,90,60,63,75,82,87,92,91,11,8,9,7,11,€104.6M
9,200389,J. Oblak,25,https://cdn.sofifa.org/players/4/19/200389.png,Slovenia,https://cdn.sofifa.org/flags/44.png,90,93,Atlético Madrid,https://cdn.sofifa.org/teams/2/light/240.png,€68M,€94K,1331,Right,3,3,1,Medium/ Medium,Normal,Yes,GK,1,"Jul 16, 2014",,2021,6'2,192lbs,,,,,,,,,,,,,,,,,,,,,,,,,,,13,11,15,29,13,12,13,14,26,16,43,60,67,86,49,22,76,41,78,12,34,19,11,70,11,70,27,12,18,86,92,78,88,89,€144.5M


Now print the schema of the dataset so we can see the data types of all the variables.

In [5]:
# not really necessary, as I defined the schema myself
fifa.printSchema()

root
 |-- ID: integer (nullable = true)
 |-- Name: string (nullable = true)
 |-- Age: integer (nullable = true)
 |-- Photo: string (nullable = true)
 |-- Nationality: string (nullable = true)
 |-- Flag: string (nullable = true)
 |-- Overall: integer (nullable = true)
 |-- Potential: integer (nullable = true)
 |-- Club: string (nullable = true)
 |-- Club Logo: string (nullable = true)
 |-- Value: string (nullable = true)
 |-- Wage: string (nullable = true)
 |-- Special: integer (nullable = true)
 |-- Preferred Foot: string (nullable = true)
 |-- International Reputation: integer (nullable = true)
 |-- Weak Foot: integer (nullable = true)
 |-- Skill Moves: integer (nullable = true)
 |-- Work Rate: string (nullable = true)
 |-- Body Type: string (nullable = true)
 |-- Real Face: string (nullable = true)
 |-- Position: string (nullable = true)
 |-- Jersey Number: integer (nullable = true)
 |-- Joined: string (nullable = true)
 |-- Loaned From: string (nullable = true)
 |-- Contract Valid U

## Now let's get started!

### First things first..... import the pyspark sql functions library

Since we know we will be using it a lot.

In [None]:
# already did this

### 1. Select the Name and Position of each player in the dataframe

In [6]:
fifa.select('Name', 'Position').show(5, False)

+-----------------+--------+
|Name             |Position|
+-----------------+--------+
|L. Messi         |RF      |
|Cristiano Ronaldo|ST      |
|Neymar Jr        |LW      |
|De Gea           |GK      |
|K. De Bruyne     |RCM     |
+-----------------+--------+
only showing top 5 rows



### 1.1 Display the same results from above sorted by the players' names

In [7]:
fifa.select('Name', 'Position').orderBy(col('Name')).show(5, False)

+-------------+--------+
|Name         |Position|
+-------------+--------+
|A. Abang     |ST      |
|A. Abdellaoui|LB      |
|A. Abdennour |CB      |
|A. Abdi      |CM      |
|A. Abdu Jaber|ST      |
+-------------+--------+
only showing top 5 rows



### 2. Select only the players who belong to a club beginning with FC

In [8]:
fcs = fifa.select('Name', 'Club').where(col('Club').like('FC%'))
# show() prints the table itself and returns None, so it needs no display() wrapper
fcs.show(5, False)
display(fcs.groupBy('Club').agg(count('Name').alias('Roster')).toPandas())

+---------------+-----------------+
|Name           |Club             |
+---------------+-----------------+
|L. Messi       |FC Barcelona     |
|L. Suárez      |FC Barcelona     |
|R. Lewandowski |FC Bayern München|
|M. ter Stegen  |FC Barcelona     |
|Sergio Busquets|FC Barcelona     |
+---------------+-----------------+
only showing top 5 rows

Unnamed: 0,Club,Roster
0,FC Luzern,27
1,FC St. Gallen,25
2,FC Seoul,28
3,FC Red Bull Salzburg,27
4,FC Augsburg,31
5,FC Emmen,30
6,FC Schalke 04,29
7,FC Erzgebirge Aue,29
8,FC Thun,26
9,FC København,23
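
`like` follows SQL wildcard rules: `%` matches any run of characters (including none), `_` matches exactly one character, and the comparison is case-sensitive. To make that concrete, here is a rough plain-Python emulation. `sql_like` is an illustrative helper, not a Spark function, and it ignores escape characters.

```python
import re


def sql_like(value, pattern):
    """Return True if value matches a SQL LIKE pattern (no escape support)."""
    regex_parts = []
    for ch in pattern:
        if ch == '%':
            regex_parts.append('.*')   # any run of characters, possibly empty
        elif ch == '_':
            regex_parts.append('.')    # exactly one character
        else:
            regex_parts.append(re.escape(ch))
    # fullmatch: the pattern must cover the whole value, as LIKE does
    return re.fullmatch(''.join(regex_parts), value, flags=re.DOTALL) is not None


# Illustrative club names; only some start with "FC"
clubs = ['FC Barcelona', 'Real Madrid', 'AFC Example', 'fc lowercase']
print([club for club in clubs if sql_like(club, 'FC%')])  # ['FC Barcelona']
```

Because the pattern has to cover the whole value, `'FC%'` only matches names that start with `FC`, while `'%FC%'` would also match names that merely contain it.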


### 3. Who is the oldest player in the dataset and how old are they?

Display only the name and age of the oldest player.

In [9]:
fifa.select('Name', 'Age').orderBy(col('Age').desc()).show(1)

+--------+---+
|    Name|Age|
+--------+---+
|O. Pérez| 45|
+--------+---+
only showing top 1 row
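
Sorting and showing one row returns a single player even if several share the maximum age. If ties matter, one option is to compute the maximum first and then filter on it. The idea in plain Python, with made-up rows:

```python
# Made-up (name, age) rows for illustration; not taken from the dataset.
players = [('Player A', 45), ('Player B', 31), ('Player C', 45), ('Player D', 27)]

# Step 1: find the maximum age.
max_age = max(age for _, age in players)

# Step 2: keep every row that has that age, so ties are not lost.
oldest = [(name, age) for name, age in players if age == max_age]

print(oldest)  # [('Player A', 45), ('Player C', 45)]
```

In PySpark the same two steps would be an aggregation with `max` followed by a `where` on the result.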



### 4. Select only the following players from the dataframe:

 - L. Messi
 - Cristiano Ronaldo

In [10]:
display(fifa.where(fifa.Name.isin('L. Messi', 'Cristiano Ronaldo')).toPandas())

Unnamed: 0,ID,Name,Age,Photo,Nationality,Flag,Overall,Potential,Club,Club Logo,Value,Wage,Special,Preferred Foot,International Reputation,Weak Foot,Skill Moves,Work Rate,Body Type,Real Face,Position,Jersey Number,Joined,Loaned From,Contract Valid Until,Height,Weight,LS,ST,RS,LW,LF,CF,RF,RW,LAM,CAM,RAM,LM,LCM,CM,RCM,RM,LWB,LDM,CDM,RDM,RWB,LB,LCB,CB,RCB,RB,Crossing,Finishing,HeadingAccuracy,ShortPassing,Volleys,Dribbling,Curve,FKAccuracy,LongPassing,BallControl,Acceleration,SprintSpeed,Agility,Reactions,Balance,ShotPower,Jumping,Stamina,Strength,LongShots,Aggression,Interceptions,Positioning,Vision,Penalties,Composure,Marking,StandingTackle,SlidingTackle,GKDiving,GKHandling,GKKicking,GKPositioning,GKReflexes,Release Clause
0,158023,L. Messi,31,https://cdn.sofifa.org/players/4/19/158023.png,Argentina,https://cdn.sofifa.org/flags/52.png,94,94,FC Barcelona,https://cdn.sofifa.org/teams/2/light/241.png,€110.5M,€565K,2202,Left,5,4,4,Medium/ Medium,Messi,Yes,RF,10,"Jul 1, 2004",,2021,5'7,159lbs,88+2,88+2,88+2,92+2,93+2,93+2,93+2,92+2,93+2,93+2,93+2,91+2,84+2,84+2,84+2,91+2,64+2,61+2,61+2,61+2,64+2,59+2,47+2,47+2,47+2,59+2,84,95,70,90,86,97,93,94,87,96,91,86,91,95,95,85,68,72,59,94,48,22,94,94,75,96,33,28,26,6,11,15,14,8,€226.5M
1,20801,Cristiano Ronaldo,33,https://cdn.sofifa.org/players/4/19/20801.png,Portugal,https://cdn.sofifa.org/flags/38.png,94,94,Juventus,https://cdn.sofifa.org/teams/2/light/45.png,€77M,€405K,2228,Right,5,4,5,High/ Low,C. Ronaldo,Yes,ST,7,"Jul 10, 2018",,2022,6'2,183lbs,91+3,91+3,91+3,89+3,90+3,90+3,90+3,89+3,88+3,88+3,88+3,88+3,81+3,81+3,81+3,88+3,65+3,61+3,61+3,61+3,65+3,61+3,53+3,53+3,53+3,61+3,84,94,89,81,87,88,81,76,77,94,89,91,87,96,70,95,95,88,79,93,63,29,95,82,85,95,28,31,23,7,11,15,14,11,€127.1M


### 5. Can you select the first character from the Release Clause variable which indicates the currency used?

In [12]:
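# substr() positions are 1-based, so (1, 1) is the first character
rc = fifa.select('Name', 'Club', 'Release Clause', col('Release Clause').substr(1,1).alias('Release Currency'))
# count(column) skips nulls, which is why the null group below reports 0
rc.groupBy('Release Currency').agg(count('Release Currency').alias('Players')).show()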

+----------------+-------+
|Release Currency|Players|
+----------------+-------+
|            null|      0|
|               €|  16643|
+----------------+-------+
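
The `null | 0` row in the output can look odd: players without a release clause form their own group, but counting a column only counts its non-null values, so that group reports zero. The plain-Python sketch below mimics both ways of counting on a small made-up sample, where `None` stands in for a null.

```python
from collections import defaultdict

# Hypothetical sample of 'Release Clause' values; None stands in for a null.
release_clauses = ['€226.5M', '€127.1M', None, '€138.6M']


def first_char(value):
    """Return the first character, or None for a null input."""
    return None if value is None else value[:1]


count_non_null = defaultdict(int)  # behaves like count('Release Currency')
count_rows = defaultdict(int)      # behaves like count('*')

for clause in release_clauses:
    currency = first_char(clause)
    count_rows[currency] += 1
    # counting a column skips nulls, so the null group stays at 0
    count_non_null[currency] += 0 if currency is None else 1

print(dict(count_non_null))  # {'€': 3, None: 0}
print(dict(count_rows))      # {'€': 3, None: 1}
```

To see how many players actually lack a release clause, counting rows (for example `count('*')`) rather than the column itself would be the variant to try.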



### 6. Can you select only the players who are over the age of 40?

In [11]:
fifa.select('Name', 'Age').where(col('Age') > 40).orderBy('Age').show(5, False)

+------------+---+
|Name        |Age|
+------------+---+
|B. Nivet    |41 |
|J. Villar   |41 |
|H. Sulaimani|41 |
|M. Tyler    |41 |
|C. Muñoz    |41 |
+------------+---+
only showing top 5 rows



### That's it for now... Great job!