# Getting Data From a CSV File (Hoops Activity)

Open this notebook in [Callysto](https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https://github.com/pbeens/Data-Dunkers&branch=main&subPath=Demos/hoops_data_from_csv.ipynb&depth=1) | [Colab](https://githubtocolab.com/pbeens/Data-Dunkers/blob/main/Demos/hoops_data_from_csv.ipynb).

## Lesson Objectives

By the end of this lesson, students will be able to:
- Use the Pandas library to load data from a CSV file into a DataFrame.
- Display the top and bottom rows of data using the Pandas `head()` and `tail()` methods.

## Program Setup 

Run this first code block only if these libraries haven't already been installed. In most environments you only need to do this once. You can skip it for now, but if you get an error message about a library not being installed, go ahead and run it.

In [None]:
%pip install pandas -q
%pip install plotly -q

## Introduction

There are many ways to import data. The most common sources are the program itself, a CSV (comma-separated values) file, an Excel spreadsheet, a Google Sheet, or a webpage.

In this demo, we will show how to get data from a CSV file.

## Setup & Input


In this example program, we first import the **Pandas** library using `import pandas as pd`. We still need `plotly.express`, so that is imported as well. We then use the `pd.read_csv()` function to read the [CSV file](https://gist.githubusercontent.com/pbeens/84fcbb472943dc74a8c22cc7fcef1e42/raw/9dd73742f45b6a953d76f88e22026ec8af8c8593/Hoops_Data.csv) into a **Pandas DataFrame**.

In [None]:
# import plotly.express and pandas
import plotly.express as px
import pandas as pd

# Read the CSV file into a DataFrame named df
# url = 'https://docs.google.com/spreadsheets/d/1peJis68KbNsD1jKwW32g3VOGCqMmdfjwGdGcpeukli8/export?format=csv' # Single Shot Data
url = 'https://docs.google.com/spreadsheets/d/1BFQvIypVAtZTxELURWg69qCXPdkD0c4Lc0CXM4WVIxo/export?format=csv' # 5 Shot Data
df = pd.read_csv(url)
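
If you want to try `pd.read_csv()` without an internet connection, here is a minimal sketch that reads CSV text stored in a string. The column names and values below are made up for illustration and are not the real hoops data; `csv_text` and `sample_df` are hypothetical names.

```python
import io
import pandas as pd

# Hypothetical sample data: these column names and values are made up
# for illustration and are NOT the real hoops data set.
csv_text = """Name,Distance,Shots Made
Ava,1,4
Ben,2,3
Cam,3,1
"""

# pd.read_csv() accepts a URL, a file path, or a file-like object.
# io.StringIO wraps the text so it can be read like a file.
sample_df = pd.read_csv(io.StringIO(csv_text))

# shape gives (number of rows, number of columns)
print(sample_df.shape)
```

The first line of the CSV text becomes the column names, and each following line becomes one row of the DataFrame.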

## Process

Just for fun, let's look at the first few rows of the data we just loaded. We use the Pandas `head()` method for this:

In [None]:
# Display the first 5 rows of the data
print(df.head())

What about the bottom rows? Let's look at only the last 2 rows.

In [None]:
# Display the last 2 rows of the data
print(df.tail(2))

You'll see that Pandas has inserted an index column before the data. We won't worry about that at this time because it won't affect us here.
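
To see that the index is a set of row labels and not one of the data columns, here is a small sketch. It uses a made-up DataFrame (the names and numbers are hypothetical, not the hoops data).

```python
import pandas as pd

# Hypothetical data, made up for illustration only.
sample_df = pd.DataFrame({
    "Name": ["Ava", "Ben", "Cam", "Dee"],
    "Shots Made": [4, 3, 1, 5],
})

# The index holds the row labels (0, 1, 2, ...); it is not a data column.
print(list(sample_df.index))
print(list(sample_df.columns))

# tail(2) keeps the original row labels of the last two rows.
print(list(sample_df.tail(2).index))
```

Notice that the rows returned by `tail(2)` keep their original labels instead of starting again at 0.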

Besides using `head()` to have a quick look at the data, data scientists also often check which columns are included in the data file. To do that, we use the `df.columns` attribute. Here's how:

In [None]:
# Display the column names
print(df.columns)

Does that look familiar? Note that the case of the letters is important, so always pay attention to that. 
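
Here is a short sketch of why case matters, using a made-up DataFrame (the column names are hypothetical, not the ones in the hoops data). Asking for a column with the wrong capitalization raises a `KeyError`.

```python
import pandas as pd

# Hypothetical data, made up for illustration only.
sample_df = pd.DataFrame({"Name": ["Ava", "Ben"], "Shots Made": [4, 3]})

# This works because the capitalization matches the column name exactly.
print(sample_df["Name"].tolist())

# This fails because "name" and "Name" are different column names.
try:
    sample_df["name"]
except KeyError:
    print("No column called 'name'; check the capitalization.")
```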