![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

# Soccer Analytics

Welcome to a Jupyter notebook on soccer analytics. This notebook is a free resource and is part of the Callysto project, which brings data science skills to grades 5 to 12 classrooms. 

In this notebook, we’ll code our own visualizations using data from the UEFA Champions League. 

We’ll code the visualizations in Python, a programming language that is widely used by data scientists. Programming languages are how people give instructions to computers, and Python uses many English words, which makes the code easier to read.

“Run” the cells to see the graphs.
Click “Cell” and select “Run All.” This will import the data and run all the code to create the data visualizations (scroll back to the top after you’ve run the cells).   

![instructions](https://github.com/callysto/data-viz-of-the-week/blob/main/images/instructions.png?raw=true)

In [None]:
# import Python libraries

import pandas as pd
import plotly.graph_objects as go
import plotly.express as px

## Accessing UEFA Champions League data to create graphs with Python

## Inputting data from a table to create a bar graph

Data source: https://www.uefa.com/uefachampionsleague/season=2021/statistics/round=2001252/clubs/kind=goaltypes/index.html

The data source for this dataset is the UEFA Champions League website. The data is live, which means that it is updated when games are played. Some teams may have played more games than others. 

We can use Python to link to the dataset on the website and create a table. You can rerun the code cell if the data takes a long time to load by selecting the cell and clicking stop ⏹ then run ▶️ from the menu. Or, you can launch the <a href = "https://www.uefa.com/uefachampionsleague/" target="_blank">Champions League website</a> and access the Stats page.

The dataset is small, with 32 rows and 10 columns. A smaller dataset makes it easier to see all the data at once, which can be helpful when you are just learning how to analyze data through coding.

Let’s take a look at the columns: Data is displayed for the team, the total goals, the technique used to score (left foot, right foot, header, other), the area (inside or outside) from which the shot was taken, as well as whether the goal was a penalty shot. 

### 🔎 Which techniques lead to goal scoring?

In [None]:
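try:
    url = 'https://www.uefa.com/uefachampionsleague/season=2021/statistics/round=2001252/clubs/kind=goaltypes/index.html'
    df1 = pd.read_html(url)[0]  # read_html returns a list of tables; keep the first one
    display(df1)
except Exception:  # the page may be slow or unavailable
    print("Please rerun cell")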

Let's use a Python library, plotly, to create a bar graph and a double bar graph for techniques used to score a goal. A Python library is a toolkit of code, and the plotly library is used to create visualizations. Since the data is live, it may have changed. You can update or change the code. Follow the directions after the # in the code cell for the visualizations.

In [None]:
Type = ['left foot', 'right foot', 'header', 'other'] 
Count = [4, 13, 5, 0] # input counts for each technique in the same order as Type

px.bar(x=Type, 
       y=Count, 
       title='Bayern: Technique of goal', # change the name of the team, if you enter data for another team
       labels={'x':'Technique', 'y':'Count'}) 

# run the cell with Shift+Enter to update the graph
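Instead of typing the counts by hand, the counts could also be read from a team's row in the table. The cell below is a minimal sketch of that idea, not the method used in this notebook. It uses a tiny stand-in table so that it runs even when the website is slow. The column names and the variable names `sample`, `team`, and `technique_columns` are made up for illustration and may not match the live table, so check `df1.columns` before adapting it.

```python
import pandas as pd

# Stand-in table: the column names here are assumptions, not the real UEFA headers.
# The counts are the Bayern and Barcelona values used in this notebook's graphs.
sample = pd.DataFrame({
    'Team': ['Bayern', 'Barcelona'],
    'Left foot': [4, 6],
    'Right foot': [13, 10],
    'Header': [5, 1],
    'Other': [0, 1],
})

team = 'Bayern'  # change the team name to look up another row
technique_columns = ['Left foot', 'Right foot', 'Header', 'Other']

# Keep only the row for the chosen team, then pull out the four counts as a list
team_row = sample[sample['Team'] == team]
counts = team_row[technique_columns].iloc[0].tolist()

print(counts)
```

The resulting `counts` list is in the same order as `Type`, so it could be passed to `px.bar` as the `y` values.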

## Inputting data from a table to create a double bar graph

### 🔎 Comparing teams: Do similar techniques lead to goal scoring?

In [None]:
Type = ['left foot', 'right foot', 'header', 'other'] 
y1 = [4, 13, 5, 0] # input counts for each technique in the same order as Type for a team 
y2 = [6, 10, 1, 1] # input counts for each technique in the same order as Type for another team 

fig = go.Figure()
fig.add_trace(go.Bar(x=Type, y=y1, name='Bayern')) # change team name, if needed
fig.add_trace(go.Bar(x=Type, y=y2, name='Barcelona')) # change team name, if needed

fig.update_layout(title_text='Technique of goal', xaxis_title='Technique', yaxis_title='Count')
fig.show() # run the cell with Shift+Enter to update the graph
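Because some teams may have played more games than others, raw counts can be hard to compare. One option, sketched below, is to convert each team's counts into a percentage of that team's total goals. This is an optional extra rather than part of the original graph, and the helper name `to_percent` is made up for this example.

```python
Type = ['left foot', 'right foot', 'header', 'other']
y1 = [4, 13, 5, 0]  # Bayern counts from the double bar graph cell
y2 = [6, 10, 1, 1]  # Barcelona counts from the double bar graph cell

def to_percent(counts):
    """Convert a list of counts into percentages of the list's total."""
    total = sum(counts)
    if total == 0:  # a team with no goals: avoid dividing by zero
        return [0.0 for count in counts]
    return [round(100 * count / total, 1) for count in counts]

p1 = to_percent(y1)
p2 = to_percent(y2)

# Print each technique with both teams' percentages side by side
for technique, bayern, barcelona in zip(Type, p1, p2):
    print(technique, bayern, barcelona)
```

To graph the percentages, pass `p1` and `p2` as the `y` values in the two `go.Bar` calls and change `yaxis_title` to something like 'Percent of goals'.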

## Inputting data to create a circle graph

We can use Python to link to another dataset from the Champions League website and create a table. The dataset is small with 32 rows and 7 columns. 

Let’s take a look at the columns: Data is displayed for the team, total attempts, average attempts per game, and attempt types (on target, off target, blocked, against woodwork). On target attempts are shots headed into the net, so they include goals as well as shots saved by the goalkeeper. The other three types do not reach the net: off target is when the shot misses the net, blocked is when the shot is blocked, likely by the opponent's defense, and against woodwork is when the shot hits the frame of the net.

### 🔎 How do on target attempts compare with off target, blocked, and woodwork attempts?

In [None]:
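try:
    url = 'https://www.uefa.com/uefachampionsleague/season=2021/statistics/round=2001252/clubs/kind=attempts/index.html'
    df2 = pd.read_html(url)[0]  # read_html returns a list of tables; keep the first one
    display(df2)  # a bare `df2` inside a try block would not be shown
except Exception:  # the page may be slow or unavailable
    print("Please rerun cell")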

Since the data is live, it may have changed. You can update or change the code. Follow the directions after the # in the code cell for the visualizations. 

In [None]:
px.pie(names = ['On target', 'Off target', 'Blocked', 'Against woodwork'], 
       values = [44, 42, 34, 4], # input counts for each type of attempt
       title = 'Bayern: Goal attempts') # update team name, if needed

Notice that plotly converts the counts into percentages on the circle graph. 
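The percentages come from dividing each count by the total number of attempts. The short cell below works through that arithmetic with the same counts as the circle graph. It is a sketch of the calculation only; the variable names are chosen for this example, and the labels on the graph may be rounded differently.

```python
names = ['On target', 'Off target', 'Blocked', 'Against woodwork']
values = [44, 42, 34, 4]  # counts from the circle graph cell

total = sum(values)  # all attempts added together

# Each slice is its count divided by the total, written as a percentage
percents = [round(100 * value / total, 1) for value in values]

for name, percent in zip(names, percents):
    print(name, percent, '%')
```

If you change the counts for another team, the percentages change too, but they always add up to about 100.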

Now that we've created graphs by typing in data that we read from tables, let's create graphs from data that has already been collected. Click on the <a href = "https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fcallysto%2Flesson-plans&branch=master&subPath=notebooks/sports/soccer-partIII.ipynb&depth=1" target="_blank">next notebook</a> to explore how to create our own CSV ("comma-separated values") file. 

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)