In [None]:
# This line imports everything from the datascience module, including Table.
from datascience import *


## DataSets: Adding or changing a dataset

## Part 1: Changing a dataset 
<div class="alert alert-block alert-info">
There are two goals in this section:

- Understand the paths to data files that are stored with your notebook
- Be able to change the location and path to a data file
</div>

### Understanding Paths

<div class="alert alert-block alert-success">
If you would like to change the dataset used in a notebook, your first step is to determine the path to the dataset relative to the notebook you are working with. The second step is to copy that path into the `Table.read_table` function.

For example, we are updating the player_data.csv and salary_data.csv files to reflect 2024 NBA data. Later we will add in the WNBA player data.

Sometimes it is helpful to know where the data files you need are located relative to your notebook. There are two commands we can run:
- `%pwd` -- "print working directory" -- displays your current directory
- `%ls -l` -- "list in long format" -- lists all the files in the current directory, with details such as size and modification date

Run the cells below to observe these commands:
</div>
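The same information is also available from plain Python, which can be handy outside a notebook where the `%` commands do not exist. The following is a minimal sketch using the standard-library `pathlib` module; the variable names `current_dir` and `csv_files` are illustrative and not part of this notebook.

```python
from pathlib import Path

# Path.cwd() returns the directory the notebook is running in, like %pwd.
current_dir = Path.cwd()
print(current_dir)

# Collect the names of the .csv files in that directory, like a filtered %ls.
csv_files = sorted(p.name for p in current_dir.glob("*.csv"))
print(csv_files)
```

If the data files sit next to the notebook, their names should appear in the printed list.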


In [None]:
%pwd

In [None]:
%ls -l

### Changing the path to a dataset

<div class="alert alert-block alert-success">
The `%pwd` and `%ls -l` commands should have shown that you are in the nba-demo directory and that your data files are present.

The paths to the 2024 data files are:
- player_data_nba_2024.csv
- salary_data_nba_2024.csv

You might want to run the cell below to see the data we currently have. 

Then change the paths below by replacing `player_data.csv` and `salary_data.csv` with the appropriate new file names. Once the change is made, run the cell and take a look at the data again.
</div>
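Before editing the paths, it can help to confirm that the new files really are in the current directory, since a mistyped name makes `Table.read_table` fail with a file-not-found error. This is a small sketch using only the standard library; the names `new_files` and `missing` are illustrative.

```python
from pathlib import Path

# The two 2024 file names listed above.
new_files = ["player_data_nba_2024.csv", "salary_data_nba_2024.csv"]

# Keep only the names that do NOT exist relative to the current directory.
missing = [name for name in new_files if not Path(name).exists()]
print(missing)
```

An empty list means both files were found and the paths are safe to use.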


In [None]:
nba_player_data = Table.read_table("player_data.csv")
nba_salary_data = Table.read_table("salary_data.csv")

# The show method immediately displays the contents of a table. 
nba_player_data.show(3)
nba_salary_data.show(3)



### Storing datasets in a sub-folder

<div class="alert alert-block alert-success">
There is a big difference! The original data is from the 2014-15 season.

You might want to load and display both the 2014 and 2024 datasets in the same cell to easily see the difference in salaries. 

Before we do that, we can practice moving our datasets into a different folder.

- Create a folder named `data` inside the nba-demo folder.
- Move the .csv files into the `data` folder.
- Run `%pwd` and `%ls -l` again to see the difference.
</div>

In [None]:
%pwd

In [None]:
%ls -l

<div class="alert alert-block alert-success">
Now, instead of seeing the csv files, you see the folder "data/". We need to include the folder name, "data/", in the path when we use the `Table.read_table` function.

In the cell below:
- Read in the 2014 and 2024 data with `Table.read_table`.
- Show the first few records of each.
- Be careful! The path to the data files is now: "data/name_of_file.csv"
</div>
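As an illustration of how a sub-folder path is put together, the sketch below joins the folder name and a file name with `pathlib`. The variable names are hypothetical; the resulting string is what you would pass to `Table.read_table`.

```python
from pathlib import Path

data_dir = Path("data")                 # the sub-folder created above
file_name = "player_data_nba_2024.csv"  # one of the 2024 files

# as_posix() writes the path with forward slashes on every operating system.
path_in_subfolder = (data_dir / file_name).as_posix()
print(path_in_subfolder)  # data/player_data_nba_2024.csv
```

Typing the string "data/player_data_nba_2024.csv" directly works just as well; building it this way mainly avoids mistakes with slashes.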

In [None]:

nba_player_data_2024 = ...
nba_salary_data_2024 = ...

# The show method immediately displays the contents of a table. 
nba_salary_data_2024.show(3)

nba_player_data_2014 = ...
nba_salary_data_2014 = ...

nba_salary_data_2014.show(3)


### Where are we?

<div class="alert alert-block alert-success">
In ten years, the salary of the highest-paid player in the NBA jumped by $28,000,000!

Inflation explains part of that, but not all of it: $51,000,000 in 2024 is worth about $38,000,000 in 2014 dollars.

We know how to read in csv files both from the directory our notebook is currently in as well as from a sub-folder. 

Next, we will load the WNBA salary data from a URL (a web address).
</div>
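The salary comparison can be checked with a few lines of arithmetic. This sketch uses only the rounded figures quoted in the text and assumes that $51,000,000 is the top salary in the 2024 data; the variable names are illustrative.

```python
top_2024 = 51_000_000                  # assumed top 2024 salary
nominal_jump = 28_000_000              # increase quoted in the text
top_2024_in_2014_dollars = 38_000_000  # inflation-adjusted figure from the text

# Implied top salary in the 2014 data.
top_2014 = top_2024 - nominal_jump

# Increase that remains after adjusting for inflation.
real_jump = top_2024_in_2014_dollars - top_2014
real_growth = top_2024_in_2014_dollars / top_2014

print(top_2014, real_jump, round(real_growth, 2))
```

Under these assumptions the top salary still grew by roughly two thirds after inflation is taken into account.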

## Part 2: Change the data file to read from a URL
<div class="alert alert-block alert-info">
In this section, we are aiming for you to be able to:

- Load and display data from a URL
- Create your own Code cell

</div>

<div class="alert alert-block alert-success">

It is pretty straightforward to load a dataset from a URL instead of storing the dataset with the notebook itself.

We want to add WNBA player data to this notebook, which includes statistics and salary data for WNBA players. We have put the dataset in a git repository on GitHub. The dataset is available at this URL:
- https://github.com/ucb-ds/demo-datasets/raw/main/wnba_data.csv

You can create a "Code" cell below that loads and shows the data from a URL by replacing the path in `Table.read_table` with the URL above.

To create a "Code" cell, move to the top-right of this cell and click the second-to-last button, the one with the "+" sign underneath it.

![Image of "Code" button to click to create "Code" cell](code1.png "'Code' button to click to create 'Code' cell")

Copy the contents of a cell earlier in the notebook that loads in player data, then change the path to the WNBA data URL. You may want to change the variable names as well to be descriptive of the dataset we are loading.
</div>
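A URL goes in the same place as a file path, but it is structured differently. The sketch below takes the WNBA URL apart with the standard-library `urllib.parse` module, without downloading anything; the variable names are illustrative. The final commented line shows how the URL would be passed to `Table.read_table` in this notebook.

```python
from urllib.parse import urlparse

wnba_url = "https://github.com/ucb-ds/demo-datasets/raw/main/wnba_data.csv"

parts = urlparse(wnba_url)
scheme = parts.scheme                  # protocol used to fetch the file
host = parts.netloc                    # server that hosts the file
file_name = parts.path.split("/")[-1]  # last piece of the path

print(scheme, host, file_name)  # https github.com wnba_data.csv

# In the notebook, with datascience imported:
# wnba_data = Table.read_table(wnba_url)
```

Unlike a local path, the URL does not depend on which directory the notebook is in, so it keeps working if the notebook is moved.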

## Summary
<div class="alert alert-block alert-info">
Here we have practiced several ways of loading datasets into your notebook, displaying those datasets, and creating and editing Code cells.
</div>