# Ex2 - Filtering and Sorting Data

This time we are going to pull data directly from the internet.

### Step 1. Import the necessary libraries

In [None]:
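using DotEnv
using Pkg

# Load variables from a local .env file into ENV,
# then activate the project environment stored at ENV_PATH.
DotEnv.load!()
path = ENV["ENV_PATH"]
Pkg.activate(path)

using CSV
using DataFrames
using Downloads
using Statistics # mean, std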

### Step 2. Import the dataset from this [address](https://raw.githubusercontent.com/guipsamora/pandas_exercises/master/02_Filtering_%26_Sorting/Euro12/Euro_2012_stats_TEAM.csv). 

### Step 3. Assign it to a variable called euro12.

In [3]:
url = "https://raw.githubusercontent.com/guipsamora/pandas_exercises/master/02_Filtering_%26_Sorting/Euro12/Euro_2012_stats_TEAM.csv"
file = Downloads.download(url)
euro12 = CSV.read(file, DataFrame);

### Step 4. Select only the Goals column.

In [3]:
euro12[!, :Goals]

16-element Vector{Int64}:
  4
  4
  4
  5
  3
 10
  5
  6
  2
  2
  6
  1
  5
 12
  5
  2
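
The `!` row selector used above returns the stored column itself, while `:` returns a copy. The following is a minimal sketch of that difference on a small stand-in table (three rows copied from outputs in this notebook), not on the full `euro12` data; the variable names are illustrative.

```julia
using DataFrames

# Stand-in table: Team/Goals values taken from rows shown in this notebook.
df = DataFrame(Team = ["Germany", "Greece", "Spain"], Goals = [10, 5, 12])

goals_same = df[!, :Goals]   # the column vector stored in df, no copy
goals_copy = df[:, :Goals]   # an independent copy of the column

goals_copy[1] = 0            # changes only the copy
goals_same[2] = 99           # also changes df, because it is the same vector
```

For read-only selection, as in this step, either form gives the same values; the distinction matters only when the result is mutated afterwards.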

### Step 5. How many teams participated in Euro 2012?

In [4]:
n_team = length(unique(euro12[!, :Team]))
@show n_team;

n_team = 16


### Step 6. What is the number of columns in the dataset?

In [6]:
n_columns = size(euro12, 2)
@show n_columns;

n_columns = 35
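
As a small aside, DataFrames.jl also provides `nrow` and `ncol`, which can stand in for `size(df, 1)` and `size(df, 2)`. The sketch below uses a three-row stand-in table rather than the downloaded data, so the counts differ from the 16 teams and 35 columns reported above.

```julia
using DataFrames

# Stand-in table: values taken from rows shown in this notebook.
df = DataFrame(Team = ["Germany", "Greece", "Spain"], Goals = [10, 5, 12])

n_team = length(unique(df.Team))  # number of distinct team names
n_rows = nrow(df)                 # same as size(df, 1)
n_columns = ncol(df)              # same as size(df, 2)
```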


### Step 7. Select only the Team, Yellow Cards and Red Cards columns and assign them to a DataFrame called discipline.

In [4]:
discipline = euro12[!, ["Team", "Yellow Cards", "Red Cards"]]
discipline

Row,Team,Yellow Cards,Red Cards
,String31,Int64,Int64
1,Croatia,9,0
2,Czech Republic,7,0
3,Denmark,4,0
4,England,5,0
5,France,6,0
6,Germany,4,0
7,Greece,9,1
8,Italy,16,0
9,Netherlands,5,0
10,Poland,7,1


### Step 8. Sort the teams by Red Cards, then by Yellow Cards

In [5]:
sort(discipline, ["Red Cards", "Yellow Cards"], rev=true)

Row,Team,Yellow Cards,Red Cards
,String31,Int64,Int64
1,Greece,9,1
2,Poland,7,1
3,Republic of Ireland,6,1
4,Italy,16,0
5,Portugal,12,0
6,Spain,11,0
7,Croatia,9,0
8,Czech Republic,7,0
9,Sweden,7,0
10,France,6,0
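
The call above applies `rev=true` to both sort columns. If the two columns should be ordered in different directions, `order` can wrap an individual column. This is a sketch on four rows copied from the output above, assuming that a per-column direction is what is wanted; the name `mixed` is illustrative.

```julia
using DataFrames

# Stand-in table: four rows copied from the sorted output above.
discipline = DataFrame(
    "Team" => ["Greece", "Poland", "Italy", "Croatia"],
    "Yellow Cards" => [9, 7, 16, 9],
    "Red Cards" => [1, 1, 0, 0],
)

# Red Cards descending, then Yellow Cards ascending within ties.
mixed = sort(discipline, [order("Red Cards", rev = true), "Yellow Cards"])
```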


### Step 9. Calculate the mean number of Yellow Cards per team

In [9]:
round(mean(discipline[!, "Yellow Cards"]), digits=0)

7.0
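
The same aggregate can be expressed with `combine`, which returns the result as a one-row DataFrame instead of a bare number. The sketch below uses only the first four rows of `discipline` shown earlier, so its mean differs from the full-table value above; the output column name `mean_yellow` is made up for illustration.

```julia
using DataFrames
using Statistics  # mean

# Stand-in table: first four rows of the discipline output shown earlier.
discipline = DataFrame(
    "Team" => ["Croatia", "Czech Republic", "Denmark", "England"],
    "Yellow Cards" => [9, 7, 4, 5],
)

# source column => function => name of the result column
stats = combine(discipline, "Yellow Cards" => mean => "mean_yellow")
```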

### Step 10. Filter teams that scored more than 6 goals

In [11]:
euro12[euro12[!, :Goals] .> 6, :]

Row,Team,Goals,Shots on target,Shots off target,Shooting Accuracy,% Goals-to-shots,Total shots (inc. Blocked),Hit Woodwork,Penalty goals,Penalties not scored,Headed goals,Passes,Passes completed,Passing Accuracy,Touches,Crosses,Dribbles,Corners Taken,Tackles,Clearances,Interceptions,Clearances off line,Clean Sheets,Blocks,Goals conceded,Saves made,Saves-to-shots ratio,Fouls Won,Fouls Conceded,Offsides,Yellow Cards,Red Cards,Subs on,Subs off,Players Used
,String31,Int64,Int64,Int64,String7,String7,Int64,Int64,Int64,Int64,Int64,Int64,Int64,String7,Int64,Int64,Int64,Int64,Int64,Int64,Int64,Int64?,Int64,Int64,Int64,Int64,String7,Int64,Int64,Int64,Int64,Int64,Int64,Int64,Int64
1,Germany,10,32,32,47.8%,15.6%,80,2,1,0,2,2774,2427,87.4%,3761,101,60,35,91,73,69,0,1,11,6,10,62.6%,63,49,12,4,0,15,15,17
2,Spain,12,42,33,55.9%,16.0%,100,0,1,0,2,4317,3820,88.4%,5585,69,106,44,122,102,79,0,5,8,1,15,93.8%,102,83,19,11,0,17,17,18
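
Boolean-mask indexing is one way to filter rows; `filter` and `subset` are two alternatives that read closer to the question being asked. A minimal sketch on a stand-in table with Team and Goals values taken from this notebook:

```julia
using DataFrames

# Stand-in table: values taken from rows shown in this notebook.
euro = DataFrame(Team = ["Germany", "Greece", "Spain", "Italy"], Goals = [10, 5, 12, 6])

# filter takes a column => predicate pair; the predicate sees one value at a time.
by_filter = filter(:Goals => >(6), euro)

# subset works on whole columns, so ByRow applies the predicate element-wise.
by_subset = subset(euro, :Goals => ByRow(>(6)))
```

Both calls return a new DataFrame and leave `euro` unchanged.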


### Step 11. Select the teams that start with G

In [14]:
euro12[first.(euro12[!, :Team], 1) .== "G", :]

Row,Team,Goals,Shots on target,Shots off target,Shooting Accuracy,% Goals-to-shots,Total shots (inc. Blocked),Hit Woodwork,Penalty goals,Penalties not scored,Headed goals,Passes,Passes completed,Passing Accuracy,Touches,Crosses,Dribbles,Corners Taken,Tackles,Clearances,Interceptions,Clearances off line,Clean Sheets,Blocks,Goals conceded,Saves made,Saves-to-shots ratio,Fouls Won,Fouls Conceded,Offsides,Yellow Cards,Red Cards,Subs on,Subs off,Players Used
,String31,Int64,Int64,Int64,String7,String7,Int64,Int64,Int64,Int64,Int64,Int64,Int64,String7,Int64,Int64,Int64,Int64,Int64,Int64,Int64,Int64?,Int64,Int64,Int64,Int64,String7,Int64,Int64,Int64,Int64,Int64,Int64,Int64,Int64
1,Germany,10,32,32,47.8%,15.6%,80,2,1,0,2,2774,2427,87.4%,3761,101,60,35,91,73,69,0,1,11,6,10,62.6%,63,49,12,4,0,15,15,17
2,Greece,5,8,18,30.7%,19.2%,32,1,1,1,0,1187,911,76.7%,2016,52,53,10,65,123,87,0,1,23,7,13,65.1%,67,48,12,9,1,12,12,20
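
Comparing `first.(…, 1)` against `"G"` works; `startswith` states the same test more directly. A sketch on a stand-in table (team names and goals taken from this notebook):

```julia
using DataFrames

# Stand-in table: values taken from rows shown in this notebook.
euro = DataFrame(Team = ["Germany", "Greece", "Spain", "Italy"], Goals = [10, 5, 12, 6])

# Broadcast startswith over the Team column; "G" is treated as a scalar.
starts_with_g = euro[startswith.(euro.Team, "G"), :]
```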


### Step 12. Select the first 7 columns

In [16]:
euro12[!, 1:7]

Row,Team,Goals,Shots on target,Shots off target,Shooting Accuracy,% Goals-to-shots,Total shots (inc. Blocked)
,String31,Int64,Int64,Int64,String7,String7,Int64
1,Croatia,4,13,12,51.9%,16.0%,32
2,Czech Republic,4,13,18,41.9%,12.9%,39
3,Denmark,4,10,10,50.0%,20.0%,27
4,England,5,11,18,50.0%,17.2%,40
5,France,3,22,24,37.9%,6.5%,65
6,Germany,10,32,32,47.8%,15.6%,80
7,Greece,5,8,18,30.7%,19.2%,32
8,Italy,6,34,45,43.0%,7.5%,110
9,Netherlands,2,12,36,25.0%,4.1%,60
10,Poland,2,15,23,39.4%,5.2%,48


### Step 13. Select all columns except the last 3.

In [20]:
euro12[!, 1:end-3]

Row,Team,Goals,Shots on target,Shots off target,Shooting Accuracy,% Goals-to-shots,Total shots (inc. Blocked),Hit Woodwork,Penalty goals,Penalties not scored,Headed goals,Passes,Passes completed,Passing Accuracy,Touches,Crosses,Dribbles,Corners Taken,Tackles,Clearances,Interceptions,Clearances off line,Clean Sheets,Blocks,Goals conceded,Saves made,Saves-to-shots ratio,Fouls Won,Fouls Conceded,Offsides,Yellow Cards,Red Cards
,String31,Int64,Int64,Int64,String7,String7,Int64,Int64,Int64,Int64,Int64,Int64,Int64,String7,Int64,Int64,Int64,Int64,Int64,Int64,Int64,Int64?,Int64,Int64,Int64,Int64,String7,Int64,Int64,Int64,Int64,Int64
1,Croatia,4,13,12,51.9%,16.0%,32,0,0,0,2,1076,828,76.9%,1706,60,42,14,49,83,56,missing,0,10,3,13,81.3%,41,62,2,9,0
2,Czech Republic,4,13,18,41.9%,12.9%,39,0,0,0,0,1565,1223,78.1%,2358,46,68,21,62,98,37,2,1,10,6,9,60.1%,53,73,8,7,0
3,Denmark,4,10,10,50.0%,20.0%,27,1,0,0,3,1298,1082,83.3%,1873,43,32,16,40,61,59,0,1,10,5,10,66.7%,25,38,8,4,0
4,England,5,11,18,50.0%,17.2%,40,0,0,0,3,1488,1200,80.6%,2440,58,60,16,86,106,72,1,2,29,3,22,88.1%,43,45,6,5,0
5,France,3,22,24,37.9%,6.5%,65,1,0,0,0,2066,1803,87.2%,2909,55,76,28,71,76,58,0,1,7,5,6,54.6%,36,51,5,6,0
6,Germany,10,32,32,47.8%,15.6%,80,2,1,0,2,2774,2427,87.4%,3761,101,60,35,91,73,69,0,1,11,6,10,62.6%,63,49,12,4,0
7,Greece,5,8,18,30.7%,19.2%,32,1,1,1,0,1187,911,76.7%,2016,52,53,10,65,123,87,0,1,23,7,13,65.1%,67,48,12,9,1
8,Italy,6,34,45,43.0%,7.5%,110,2,0,0,2,3016,2531,83.9%,4363,75,75,30,98,137,136,1,2,18,7,20,74.1%,101,89,16,16,0
9,Netherlands,2,12,36,25.0%,4.1%,60,2,0,0,0,1556,1381,88.7%,2163,50,49,22,34,41,41,0,0,9,5,12,70.6%,35,30,3,5,0
10,Poland,2,15,23,39.4%,5.2%,48,0,0,0,1,1059,852,80.4%,1724,55,39,14,67,87,62,0,0,8,3,6,66.7%,48,56,3,7,1
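
Dropping trailing columns can also be written as an exclusion with the `Not` selector, which may read more naturally than computing the range to keep. The sketch below uses a five-column stand-in table (Germany and Spain values copied from the Step 10 output), so only two columns remain.

```julia
using DataFrames

# Stand-in table: five of the 35 columns, values copied from the Step 10 output.
euro = DataFrame(
    "Team" => ["Germany", "Spain"],
    "Goals" => [10, 12],
    "Subs on" => [15, 17],
    "Subs off" => [15, 17],
    "Players Used" => [17, 18],
)

# Exclude the last three column positions instead of listing the ones to keep.
trimmed = select(euro, Not(ncol(euro)-2:ncol(euro)))
```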


### Step 14. Present only the Shooting Accuracy from England, Italy and Russia

In [16]:
euro12[in.(euro12.Team, Ref(Set(["England", "Italy", "Russia"]))), ["Team","Shooting Accuracy"]]

Row,Team,Shooting Accuracy
,String31,String7
1,England,50.0%
2,Italy,43.0%
3,Russia,22.5%
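
In the cell above, `Ref(...)` keeps the set of names from being broadcast element by element, so each team is tested against the whole set. An equivalent sketch with `subset` and the one-argument form of `in` is shown below on a stand-in table (values copied from outputs in this notebook); the name `wanted` is illustrative.

```julia
using DataFrames

# Stand-in table: values copied from outputs shown in this notebook.
euro = DataFrame(
    "Team" => ["England", "Germany", "Italy", "Russia"],
    "Shooting Accuracy" => ["50.0%", "47.8%", "43.0%", "22.5%"],
)

wanted = ["England", "Italy", "Russia"]

# in(wanted) builds a predicate x -> x in wanted; ByRow applies it per team.
picked = subset(euro, "Team" => ByRow(in(wanted)))
accuracy = picked[!, ["Team", "Shooting Accuracy"]]
```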
