![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

# Visualizing Data

In this notebook we'll visualize data using the [Plotly Express](https://plotly.com/python/plotly-express/) library, imported as `px`.

## Bar Graphs

We will start with the same data from our last notebook and create a bar graph.

In [None]:
import pandas as pd
import plotly.express as px
data = pd.read_csv('https://raw.githubusercontent.com/callysto/basketball-and-data-science/main/content/data/nba-players/Pascal_Siakam.csv')
px.bar(data, x='Season', y='FG', title='Siakam Field Goals by Season')

It is also possible to rename the axis titles.

In [None]:
px.bar(data, x='Season', y='FG', title='Siakam Field Goals by Season').update_yaxes(title='Field Goals')

If we prefer a horizontal bar graph we can use `orientation='h'` and switch the `x` and `y` columns.

In [None]:
px.bar(data, x='FG', y='Season', title='Siakam Field Goals by Season', orientation='h')

If we want to include multiple columns from our dataset, we can put them in a list using `[]` brackets.

In [None]:
px.bar(data, x='Season', y=['FG', 'FGA'], title='Siakam Field Goals by Season')

By default the bars are stacked, but we can use `barmode='group'` to put them side by side.

In [None]:
px.bar(data, x='Season', y=['FG', 'FGA'], barmode='group', title='Siakam Field Goals by Season')

### Exercise

---

Create a bar chart with `'Age'` on the x-axis and `'FG%', '2P%', '3P%'` on the y-axis. Do you see any changes in these values over time?

---

## Scatter Plots

A good way to see whether two columns are related is to use a scatter plot.

In [None]:
px.scatter(data, x='FGA', y='FG', title='Siakam Field Goals versus Field Goal Attempts')

We can see that, generally, the more attempts Pascal Siakam made in a season, the more field goals he scored.

Just like with bar graphs, we can change the axis titles. To keep the code readable, we will store the scatter plot figure in a variable called `fig`, update its axes, and then use the `.show()` method to display it.

In [None]:
fig = px.scatter(data, x='FGA', y='FG', title='Siakam Field Goals versus Field Goal Attempts')
fig.update_xaxes(title='Field Goal Attempts').update_yaxes(title='Field Goals')
fig.show()

A scatter plot can also include a line of best fit, called a trendline. We often use the [ordinary least squares](https://en.wikipedia.org/wiki/Ordinary_least_squares) method of calculating the trendline.

In [None]:
px.scatter(data, x='FGA', y='FG', title='Siakam Field Goals versus Field Goal Attempts', trendline='ols')
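
Note that `trendline='ols'` relies on the `statsmodels` package being installed. To get a feel for what a least squares line is, the sketch below fits one with NumPy's `polyfit` instead of Plotly. The numbers are made up for illustration and are not Siakam's real statistics.

```python
import numpy as np

# Made-up attempts and field goals for four seasons (illustration only)
fga = np.array([200, 400, 600, 800])
fg = np.array([100, 190, 310, 400])

# Fit a straight line fg = slope * fga + intercept by least squares
slope, intercept = np.polyfit(fga, fg, 1)
print(f'slope: {slope:.2f}, intercept: {intercept:.1f}')

# Use the line to estimate field goals for a hypothetical 500 attempts
predicted = slope * 500 + intercept
print(f'predicted field goals: {predicted:.0f}')
```

The slope estimates how many extra field goals go with each extra attempt in this sample.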

We can also add options like `color` and `size` to visualize other columns from our data.

In [None]:
px.scatter(data, x='FGA', y='FG', title='Siakam Field Goals versus Field Goal Attempts', color='Season', size='FG%')

### Exercise

---

Create a scatter plot with assists per game (`'AST'`) on the x-axis, points per game (`'PTS'`) on the y-axis, and `color='Age'`. Include a trendline.

What do you observe about the relationship between these columns?

---

## Other Visualizations

Plotly has functions to create many different types of visualizations, listed on the [Plotly Express page](https://plotly.com/python/plotly-express/). Let's try a pie chart.

In [None]:
px.pie(data, values='FG', names='Season', title='Siakam Field Goals by Season')

In the next notebook we will look at [statistics](03-statistics.ipynb).

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)