# Creating Graphs

## Objectives

Students will be able to:
- Use the [Plotly Express](https://plotly.com/python/plotly-express/) library in [Python](https://www.python.org/) to create data visualizations
- Load data from a CSV file into a [pandas](https://pandas.pydata.org/) [dataframe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) (like a spreadsheet)
- Filter the data
- Rename axis titles
- Compare multiple data columns

## Importing and Filtering Data

Let's work with [a data set about Pascal Siakam](../Data/Pascal_Siakam.csv), using the [pandas](https://pandas.pydata.org/) library to create a [dataframe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html).

We also use a filter to only include data up to and including the '2022-23' season (`SEASON_ID`). We'll display the resulting dataframe by including `df` on the last line of the cell.

In [None]:
import pandas as pd

url = 'https://raw.githubusercontent.com/Data-Dunkers/data-dunkers-modules/main/data-dunkers/Data/Pascal_Siakam.csv'
df = pd.read_csv(url)

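# True for each row whose season is '2022-23' or earlier
# (named season_filter so it does not hide Python's built-in filter function)
season_filter = df['SEASON_ID'] <= '2022-23'
df = df[season_filter]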

df
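
To see what the filter step does, here is a minimal sketch using a tiny hand-made dataframe. The `sample` dataframe and its numbers are made up for illustration and are not from the Siakam data set.

```python
import pandas as pd

# Hypothetical sample rows; the real values come from the CSV file.
sample = pd.DataFrame({
    'SEASON_ID': ['2021-22', '2022-23', '2023-24'],
    'FGM': [100, 200, 300],
})

# Comparing a column to a value gives one True/False per row.
mask = sample['SEASON_ID'] <= '2022-23'
print(mask)

# Indexing the dataframe with the mask keeps only the rows marked True.
kept = sample[mask]
print(kept)
```

The season IDs are text, so `<=` compares them alphabetically. That works here because every season is written in the same `YYYY-YY` format, which makes alphabetical order match chronological order.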

## Graphing

We will use the [Plotly Express](https://plotly.com/python/plotly-express/) library for graphing. This library can be imported by using the following command:

`import plotly.express as px`

The `px` part means we can refer to the library as `px` any time we want to use it.

### Bar Graphs

To create a [bar graph](https://plotly.com/python/bar-charts/), we will use the function `px.bar()`, tell it which dataframe to use (`df`), and state which column we want on each axis. In this example, we are plotting Siakam's field goals made (`FGM`) by season.

In [None]:
import plotly.express as px
px.bar(df, x='SEASON_ID', y='FGM')

We can also add a title to the graph.

In [None]:
px.bar(df, x='SEASON_ID', y='FGM', title='Siakam Field Goals by Season')

#### Axis Labels

Let's rename the y-axis with `.update_yaxes(title='Field Goals')`:

In [None]:
px.bar(df, x='SEASON_ID', y='FGM', title='Siakam Field Goals by Season').update_yaxes(title='Field Goals')

How would you do the x-axis label? Use the code cell below to make the x-axis label `'Season'`.

In [None]:
px.bar(df, x='SEASON_ID', y='FGM', title='Siakam Field Goals by Season').update_yaxes(title='Field Goals')

#### Multiple Columns

If we want to include multiple columns on the y-axis, we can list them inside `[ ]` brackets.

In [None]:
px.bar(df, x='SEASON_ID', y=['FGM', 'FGA'], title='Siakam Field Goals by Season')

Notice that the default is to stack the bars. We can use `barmode='group'` to put them side by side.

In [None]:
px.bar(df, x='SEASON_ID', y=['FGM', 'FGA'], barmode='group', title='Siakam Field Goals by Season')

#### Exercise

Create a bar chart with `'AGE'` on the x-axis and `['FG_PCT', 'FG2_PCT', 'FG3_PCT']` on the y-axis.

What changes do you see in these values over time?

### Line Graphs

Using Plotly Express and the same dataframe, we can also create [line graphs](https://plotly.com/python/line-charts/).

In [None]:
px.line(df, x='SEASON_ID', y='FGM', title='Siakam Field Goals by Season')

### Scatter Plots

[Scatter plots](https://plotly.com/python/line-and-scatter/) are similar to line graphs, without the connecting lines.

In [None]:
px.scatter(df, x='SEASON_ID', y='FGM', title='Siakam Field Goals by Season')

#### Displaying More Data

We can color code the points by another value from the data set. Let's put numeric values on the axes and color the points by season.

In [None]:
px.scatter(df, x='FGA', y='FGM', title='Siakam Field Goals Made versus Field Goal Attempts', color='SEASON_ID')

We can also make the size of each data point proportional to one of the data columns, for example with `size='AST'`.

In [None]:
px.scatter(df, x='FGA', y='FGM', title='Siakam Field Goals versus Field Goal Attempts', color='SEASON_ID', size='AST')

#### Trendlines

To help us draw conclusions, we can add a line of best fit, which we call a trendline. We often use the [ordinary least squares](https://en.wikipedia.org/wiki/Ordinary_least_squares) (OLS) method to calculate the parameters of the trendline. Plotly Express needs the `statsmodels` library to be installed in order to draw an OLS trendline.

In [None]:
px.scatter(df, x='FGA', y='FGM', title='Siakam Field Goals Made versus Field Goal Attempts', trendline='ols')
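
To get a feel for what a line of best fit is, the sketch below calculates one directly with NumPy's `polyfit` instead of Plotly Express. The numbers are made up for illustration, and this is only one way to do a least squares fit, not necessarily how Plotly Express does it internally.

```python
import numpy as np

# Hypothetical attempts (x) and made shots (y) for four seasons.
fga = np.array([200, 400, 600, 800])
fgm = np.array([100, 190, 310, 400])

# Fit a straight line (degree 1): fgm is approximately slope * fga + intercept.
slope, intercept = np.polyfit(fga, fgm, 1)
print(slope, intercept)

# Use the line to estimate made shots for 500 attempts.
estimate = slope * 500 + intercept
print(estimate)
```

The slope tells us roughly how many extra field goals go with each extra attempt in this sample.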

#### Exercise

Create a scatter plot with assists per game (`'AST'`) on the x-axis, points per game (`'PTS'`) on the y-axis, and `color='AGE'`. Include a trendline.

What do you observe about the relationship between these columns?

### Pie Charts

When creating a [pie chart](https://plotly.com/python/pie-charts/), we use `values=` and `names=` instead of `x=` and `y=`.

In [None]:
px.pie(df, values='FGM', names='SEASON_ID', title='Siakam Field Goals by Season')
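
Each slice of a pie chart shows one value as a share of the total. As a sketch of that calculation, here is the same idea done by hand with pandas, using a made-up `sample` dataframe.

```python
import pandas as pd

# Hypothetical sample rows; the totals are chosen to make the math easy.
sample = pd.DataFrame({
    'SEASON_ID': ['2020-21', '2021-22', '2022-23'],
    'FGM': [100, 300, 600],
})

# Divide each value by the total, then convert to a percentage.
share = (sample['FGM'] / sample['FGM'].sum() * 100).round(1)
print(share)
```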

### Histograms

[Histograms](https://plotly.com/python/histograms/) are a type of bar graph that groups data into ranges, called bins. The height of each bar shows the number of items in that bin, so we don't use `y=`.

For this we will use a different data set, with data about all of the [2023 Toronto Raptors players](https://en.wikipedia.org/wiki/2022%E2%80%9323_Toronto_Raptors_season).

In [None]:
url = 'https://raw.githubusercontent.com/Data-Dunkers/data-dunkers-modules/main/data-dunkers/Data/raptors-2023.csv'
raptors_df = pd.read_csv(url)

px.histogram(raptors_df, x='FG%', title='Raptors Field Goal Percentage')

We can see that most players had a field goal percentage between 0.45 and 0.499.

Let's change the number of bins with `nbins=15`.

In [None]:
px.histogram(raptors_df, x='FG%', title='Raptors Field Goal Percentage', nbins=15)
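
To see what sorting values into bins means, here is a small sketch that counts values per bin with NumPy's `histogram` function. The percentages and bin edges are made up for illustration, and Plotly Express picks its own bin edges, so its bars may not match these exactly.

```python
import numpy as np

# Hypothetical field goal percentages for seven players.
fg_pct = np.array([0.41, 0.44, 0.46, 0.47, 0.48, 0.52, 0.56])

# Bin edges: 0.40 to 0.45, 0.45 to 0.50, 0.50 to 0.55, 0.55 to 0.60.
edges = [0.40, 0.45, 0.50, 0.55, 0.60]

# counts[i] is how many values fall between edges[i] and edges[i + 1].
counts, _ = np.histogram(fg_pct, bins=edges)
print(counts)
```

Fewer, wider bins give a smoother overall shape, while more, narrower bins show more detail.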

We can also use color to display another column, for example `color='Age'`.

In [None]:
px.histogram(raptors_df, x='FG%', title='Raptors Field Goal Percentage', color='Age')

#### Exercise

Create a histogram that shows free throw percentages.

## Conclusion

In this notebook we created data visualizations from a [pandas dataframe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) using [Plotly Express](https://plotly.com/python/plotly-express/).

Back to [Lessons](../Lessons.ipynb)