This notebook has been adapted, with permission, from https://github.com/callysto/basketball-and-data-science/blob/main/content/01-introduction.ipynb.

(Open in 
[Callysto](https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https://github.com/pbeens/Data-Analysis&branch=main&subPath=BADS/01-Intro/01-03-sorting-data.ipynb&depth=1) | [Colab](https://githubtocolab.com/pbeens/Data-Analysis/blob/main/BADS/01-Intro/01-03-sorting-data.ipynb)) 

# Let’s Get Our Data

In [None]:
import pandas as pd

# URL of the CSV file containing data for Pascal Siakam
url = 'https://raw.githubusercontent.com/callysto/basketball-and-data-science/main/content/data/nba-players/Pascal_Siakam.csv'

# Read the CSV file into a pandas DataFrame
df = pd.read_csv(url)

# Display the DataFrame
display(df)

You can view the raw CSV file [here](https://raw.githubusercontent.com/callysto/basketball-and-data-science/main/content/data/nba-players/Pascal_Siakam.csv).

As a reminder, here are our columns:

|Column|Meaning|Column|Meaning|
|:-:|-|:-:|-|
| **2P** | 2-Point Field Goals Per Game | **2P%** | 2-Point Field Goal Percentage |
| **2PA** | 2-Point Field Goal Attempts Per Game | **3P** | 3-Point Field Goals Per Game |
| **3P%** | 3-Point Field Goal Percentage | **3PA** | 3-Point Field Goal Attempts Per Game |
| **AST** | Assists Per Game | **BLK** | Blocks Per Game |
| **DRB** | Defensive Rebounds Per Game | **eFG%** | Effective Field Goal Percentage* |
| **FG** | Field Goals Per Game | **FG%** | Field Goal Percentage |
| **FGA** | Field Goal Attempts Per Game | **FT** | Free Throws Per Game |
| **FT%** | Free Throw Percentage | **FTA** | Free Throw Attempts Per Game |
| **G** | Games | **GS** | Games Started |
| **Lg** | League | **MP** | Minutes Played Per Game |
| **ORB** | Offensive Rebounds Per Game | **PF** | Personal Fouls Per Game |
| **PTS** | Points Per Game | **Pos** | Position |
| **STL** | Steals Per Game | **TOV** | Turnovers Per Game |
| **TRB** | Total Rebounds Per Game | | |

<span style="font-size:10px">*This statistic adjusts for the fact that a 3-point field goal is worth one more point than a 2-point field goal.</span>

# Dropping a Line by Index Number

Notice we still have that last line, "[Career](https://raw.githubusercontent.com/callysto/basketball-and-data-science/main/content/data/nba-players/Pascal_Siakam.csv)"? Let's drop that.

Look closely and you'll see that it's index #7. 

In [None]:
display(df.drop(7))

What happens if we change the "7" to another number? Try it! (Then change it back to 7!)

Let's display the dataframe again, to see if the change is permanent.

In [None]:
display(df)

Nope, the Career line is still there. By default, `drop()` returns a new dataframe and leaves the original untouched.

Let's now tell the program we want to make this permanent by using the `inplace=True` argument in `drop()`.

In [None]:
df.drop(7, inplace=True)

display(df)

That's better!
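
As a minimal sketch of the same idea on a tiny stand-in table: the numbers below are made up for illustration (they are not taken from the Siakam CSV), and the names `df_demo`, `dropped`, and `df_seasons` are hypothetical. It shows that `drop()` leaves the original alone, and that filtering on the `Season` column is one way to avoid hard-coding the index number.

```python
import pandas as pd

# Made-up numbers for illustration only; not taken from the Siakam CSV.
df_demo = pd.DataFrame({
    'Season': ['2021-22', '2022-23', 'Career'],
    'PTS': [10.0, 20.0, 15.0],
})

# drop() returns a new DataFrame; df_demo itself still has all three rows.
dropped = df_demo.drop(2)
print(len(df_demo), len(dropped))

# Alternative to a hard-coded index: keep every row whose Season is not 'Career'.
df_seasons = df_demo[df_demo['Season'] != 'Career']
print(df_seasons)
```

Assigning the result back (`df = df.drop(7)`) has the same effect as `inplace=True`, and some people find it easier to read.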

# Sorting

What if we want to sort the data by personal fouls (PF)?

In [None]:
display(df.sort_values('PF'))

What if we want them in descending order instead of ascending? Simply add `ascending=False` to the arguments in `sort_values()`. (The default is `ascending=True`.)

In [None]:
df.sort_values('PF', ascending=False)
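
Like `drop()`, `sort_values()` returns a sorted copy rather than reordering the original. The sketch below uses made-up values and hypothetical names (`df_demo`, `by_pf_desc`, `renumbered`) to show this, and to show that the sorted rows keep their original index labels unless you renumber them.

```python
import pandas as pd

# Made-up values for illustration only; not taken from the Siakam CSV.
df_demo = pd.DataFrame({
    'Season': ['A', 'B', 'C'],
    'PF': [2.5, 3.0, 2.0],
})

# Sort from most to fewest personal fouls.
by_pf_desc = df_demo.sort_values('PF', ascending=False)
print(by_pf_desc)

# The original order is unchanged because the result was not assigned back.
print(list(df_demo['PF']))

# reset_index(drop=True) renumbers the sorted rows starting from 0.
renumbered = by_pf_desc.reset_index(drop=True)
print(renumbered)
```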

Let's sort on two columns: first by Blocks Per Game (BLK), then by Steals Per Game (STL), which is used only to break ties in BLK. Notice that now we put the column names in a list (`[ ]`).



In [None]:
df.sort_values(['BLK', 'STL'])
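
To see the tie-breaking at work, here is a small sketch with made-up values chosen so that BLK has ties (the names `df_demo`, `both_asc`, and `mixed` are hypothetical). It also shows that `ascending` can take a list, one entry per column, if you want a different direction for each.

```python
import pandas as pd

# Made-up values for illustration only; BLK deliberately contains ties.
df_demo = pd.DataFrame({
    'Season': ['A', 'B', 'C', 'D'],
    'BLK': [0.5, 0.8, 0.5, 0.8],
    'STL': [1.0, 0.9, 0.7, 1.2],
})

# BLK is the primary sort key; STL only decides the order within equal BLK values.
both_asc = df_demo.sort_values(['BLK', 'STL'])
print(list(both_asc['Season']))  # ['C', 'A', 'B', 'D']

# One direction per column: BLK descending, STL ascending.
mixed = df_demo.sort_values(['BLK', 'STL'], ascending=[False, True])
print(list(mixed['Season']))  # ['B', 'D', 'C', 'A']
```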

Let's reduce the columns we're looking at and save it in a new dataframe named `df_2`. 

If we wanted to keep working with just these columns, we would use `df_2` from now on.

In [None]:
df_2 = df[['G', 'GS', 'MP', 'FG', 'FGA']]

display(df_2)
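
A note on the double brackets: the outer pair does the selecting and the inner pair is an ordinary Python list of column names. The sketch below uses made-up values and hypothetical names (`df_demo`, `subset`, `bench_games`). It adds `.copy()`, which is optional here but is a common precaution if you plan to add or change columns in the smaller dataframe.

```python
import pandas as pd

# Made-up values for illustration only; not taken from the Siakam CSV.
df_demo = pd.DataFrame({
    'G': [10, 20],
    'GS': [5, 20],
    'MP': [15.0, 30.0],
    'PTS': [4.0, 16.0],
})

# Select two columns and take an independent copy of them.
subset = df_demo[['G', 'GS']].copy()

# Adding a column to the copy does not affect the original dataframe.
subset['bench_games'] = subset['G'] - subset['GS']

print(list(subset.columns))
print(list(df_demo.columns))
```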

# Exercise

Modify the program below to display only the columns 'Season', 'FG%', '2P%', and '3P%', sorted by 'FG%'.

In [None]:
import pandas as pd

url = 'https://raw.githubusercontent.com/callysto/basketball-and-data-science/main/content/data/nba-players/Pascal_Siakam.csv'

df = pd.read_csv(url)

display(df)

# Extra Challenge

Produce this graph using the code stub below. You can view the raw data [here](https://raw.githubusercontent.com/pbeens/Data-Analysis/main/Data/raptors-2023.csv).

![raptors-2023-top-5-points.png](https://raw.githubusercontent.com/pbeens/Data-Analysis/b411303d899ad240197f905ceb288644f769ea5c/Images/raptors-2023-top-5-points.png)



In [None]:
import pandas as pd
import plotly.express as px

url = r'https://raw.githubusercontent.com/pbeens/Data-Analysis/main/Data/raptors-2023.csv'

# put the rest of the code here!

---
Next Lesson: [Bar Graphs](../02-visualize/02-01-bar-graphs.ipynb) ([GitHub link](https://github.com/pbeens/Data-Analysis/blob/f74aee1f8912a8a1e80ec13c277203f62bebadc2/BADS/02-visualize/02-01-bar-graphs.ipynb))