# Playing with Columns

(Open in [Callysto](https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https://github.com/pbeens/Data-Dunkers&branch=main&subPath=Demos/columns.ipynb&depth=1) | [Colab](https://githubtocolab.com/pbeens/Data-Dunkers/blob/main/Demos/columns.ipynb))

*A guidebook for this lesson is available [here](https://github.com/pbeens/Data-Dunkers/blob/main/Demos/Guides/columns-guide.md) or as a [PDF](https://github.com/pbeens/Data-Dunkers/blob/main/Demos/Guides/columns-guide.pdf) for download.*

## Lesson Objectives

By the end of this lesson, students will be able to:
- Utilize the Pandas library to load a CSV file into a DataFrame and display its contents.
- Access and manipulate specific columns within a DataFrame using the `columns` attribute and indexing.
- Employ Python loops to iterate over column names, enhancing their understanding of Python data structures and control flow.
- Display specific columns by filtering DataFrame columns using list notation to focus on particular statistics like Field Goals Made (FGM) and Field Goal Attempts (FGA).
- Understand the organizational structure of data within a DataFrame, including how to access and manipulate data for analysis purposes, focusing on performance metrics like shooting percentages.

## Let’s Get Our Data

In [None]:
import pandas as pd

# URL of the CSV file containing data for Pascal Siakam
url = 'https://raw.githubusercontent.com/pbeens/Data-Dunkers/main/Data/Pascal_Siakam.csv'

# Read the CSV file into a pandas DataFrame named df
df = pd.read_csv(url)

# Display the DataFrame
display(df)

You can inspect the raw CSV file [here](https://raw.githubusercontent.com/pbeens/Data-Dunkers/main/Data/Pascal_Siakam.csv).

Notice that last row? Why might you have to keep that in mind?
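To look more closely at the final row, one option is `tail()`. The sketch below uses a tiny stand-in DataFrame with made-up numbers (not the real Siakam data) so it runs without downloading anything; on the real data you would call `df.tail(1)` instead.

```python
import pandas as pd

# Hypothetical stand-in data for illustration only; these are not real statistics.
sample = pd.DataFrame({
    'SEASON_ID': ['2020-21', '2021-22', '2022-23'],
    'GP': [50, 55, 60],
})

# tail(1) returns a DataFrame containing only the last row
last_row = sample.tail(1)
print(last_row)
```

Comparing that row with the rows above it is a quick way to decide whether it should be treated differently in later calculations.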

## Looking at the Columns

There are lots of columns, so let's list them using the `columns` attribute. 

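Notice there aren't any parentheses when we use `columns`. That's because `columns` is an attribute (a piece of data stored on the DataFrame), not a method that we call.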

In [None]:
display(df.columns)
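Because `columns` is an attribute, we can use it like any other collection of names. Here is a minimal sketch using a small made-up DataFrame (the numbers are hypothetical, not real player data):

```python
import pandas as pd

# Hypothetical stand-in data for illustration only
sample = pd.DataFrame({'FGM': [5, 7], 'FGA': [10, 14], 'PTS': [13, 18]})

# Attribute access: no parentheses after columns
column_names = sample.columns

# Convert the column names to a plain Python list
name_list = sample.columns.tolist()
print(name_list)

# Count the columns and check whether a particular name is present
print(len(column_names))
print('FGA' in column_names)
```

The same three lines work on `df` once the CSV file has been loaded.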

Another way to look at the columns is to use a `for` loop. Looping is an important concept in programming.

In [None]:
for column in df.columns:
    print(column)
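If you also want to know each column's position, Python's built-in `enumerate()` can number the names as the loop runs. This is only a sketch on a small made-up DataFrame (hypothetical numbers); the loop works the same way on `df.columns`.

```python
import pandas as pd

# Hypothetical stand-in data for illustration only
sample = pd.DataFrame({'FGM': [5, 7], 'FGA': [10, 14], 'PTS': [13, 18]})

# enumerate() pairs each column name with its position, starting at 0
numbered = list(enumerate(sample.columns))

for position, column in numbered:
    print(position, column)
```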

Here's what all the columns mean:

| Field Name | Definition | Field Name | Definition |
|---|---|---|---|
| AST | The total number of assists a player has made. | FTM | The total number of free throws the player has made. |
| BLK | The total number of opponent shots a player has deflected or prevented. | GP | The number of games in which the player has appeared. |
| DREB | The total number of rebounds a player has grabbed on the defensive end. | GS | The number of games in which the player was in the starting lineup. |
| FG_PCT | The percentage of field goal attempts that are successful. | MIN | The total number of minutes the player has played. |
| FG2_PCT | The percentage of two-point field goal attempts that are successful. | OREB | The total number of rebounds a player has grabbed on the offensive end. |
| FG2A | The total number of two-point field goal attempts by the player. | PF | The total number of personal fouls committed by the player. |
| FG2M | The total number of two-point field goals a player has made. | PLAYER_AGE | The age of the player. |
| FG3_PCT | The percentage of three-point field goal attempts that are successful. | PTS | The total number of points a player has scored. |
| FG3A | The total number of three-point field goal attempts by the player. | REB | The total number of rebounds (offensive + defensive) a player has collected. |
| FG3M | The total number of three-point field goals a player has made. | SEASON_ID | The identifier for the basketball season. |
| FGA | The total number of field goal attempts by the player. | STL | The total number of times a player has successfully taken the ball away from an opponent. |
| FGM | The total number of field goals a player has made. | TEAM_ABBREVIATION | The abbreviated name of the team. |
| FT_PCT | The percentage of free throw attempts that are successful. | TEAM_ID | A unique identifier for the team. |
| FTA | The total number of free throw attempts by the player. | TOV | The total number of times a player loses possession of the ball. | 

Let's look at just the FGM (Field Goals Made) column.

In [None]:
display(df['FGM'])
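Selecting one column with a single pair of brackets gives back a pandas Series, which is a single labelled column of values. A small sketch with made-up numbers (not real player data):

```python
import pandas as pd

# Hypothetical stand-in data for illustration only
sample = pd.DataFrame({'FGM': [5, 7], 'FGA': [10, 14], 'PTS': [13, 18]})

# One column name inside one pair of brackets returns a Series
fgm = sample['FGM']
print(type(fgm).__name__)

# A Series has handy methods of its own, such as max()
largest = fgm.max()
print(largest)
```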

What if we also want FGA (Field Goal Attempts)? Notice that we now put the column names in a list, which is enclosed in its own square brackets (`[ ]`), so there are two pairs of brackets.

In [None]:
display(df[['FGM', 'FGA']])
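The inner brackets matter: a list of names returns a smaller DataFrame, while leaving them out makes pandas look for a single column with a combined name. The sketch below uses made-up numbers (hypothetical, not real player data) and also shows one way the two columns could be combined into a shooting percentage.

```python
import pandas as pd

# Hypothetical stand-in data for illustration only
sample = pd.DataFrame({'FGM': [5, 7], 'FGA': [10, 14], 'PTS': [13, 18]})

# A list of names inside the brackets returns a DataFrame with just those columns
shots = sample[['FGM', 'FGA']]
print(type(shots).__name__)
print(shots.shape)  # (rows, columns)

# Without the inner list, pandas raises a KeyError
try:
    sample['FGM', 'FGA']
    missing_list_failed = False
except KeyError:
    missing_list_failed = True
print(missing_list_failed)

# Dividing makes by attempts gives a shooting percentage for each row
computed_pct = shots['FGM'] / shots['FGA']
print(computed_pct)
```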

## Exercise

Modify the code below to display just the columns for Field Goal Percentage, 2-Point Field Goal Percentage, 3-Point Field Goal Percentage, and Free Throw Percentage. 

*Hint: Start by recording the names of those fields from the table above.*

In [None]:
import pandas as pd

# URL of the CSV file containing data for Pascal Siakam
url = 'https://raw.githubusercontent.com/pbeens/Data-Dunkers/main/Data/Pascal_Siakam.csv'

# Read the CSV file into a pandas DataFrame
df = pd.read_csv(url)

# ... (rest of your code)

# Display the DataFrame
display(df)

---
*Report issues or give us feedback about this notebook [here](https://docs.google.com/forms/d/e/1FAIpQLSdMRX2hPqZyD8-argFJXxB3ABQdLk3aUH1CAfmMEtcFAlWzCw/viewform?usp=pp_url&entry.1771525592=Module%20Resources%20%28the%20Jupyter%20notebooks%2C%20PPTS%20or%20additional%20resources%29&entry.1364186163=Columns).*

---
Back to [Lessons](https://github.com/pbeens/Data-Dunkers/blob/main/Lessons.ipynb)

---
This notebook has been adapted from https://github.com/callysto/basketball-and-data-science/blob/main/content/01-introduction.ipynb, with permission.