# Class 3: The basics of Python continued

In this notebook we will continue learning some of the basic syntax and data structures of Python. We will use what we learn today throughout the rest of the semester, so make sure you understand the main ideas and try to practice more on your own.

## Notes on the class Jupyter setup

If you have the *ydata123_2023e* environment set up correctly, you can get the class code using the code below (which presumably you've already done given that you are seeing this notebook).  

In [None]:
import YData

YData.download.download_class_code(3)   # get class 3 code

YData.download.download_class_code(3, True) # get the code with the answers

There are also similar functions to download the homework:

In [None]:
YData.download.download_homework(1)  # downloads the first homework 

If you are using Google Colab, you should install the polars and YData packages by uncommenting and running the code below.

In [None]:
# !pip install polars
# !pip install https://github.com/emeyers/YData_package/tarball/master

If you are using Google Colab, you should also uncomment and run the code below to mount your Google Drive.

In [None]:
# from google.colab import drive
# drive.mount('/content/drive')

## Review of the basics of Python

Let's review the basics of Python with a "number journey"...
1. Create a string called `string_holding_a_number` that holds the value `"2"`
2. Multiply this string by 3 to create a longer string and store it in the name `string_holding_a_bigger_number`
3. Convert this string to an integer and store it in the name `just_an_int`
4. Divide `just_an_int` by -8 and save it to the name `a_negative_float`
5. Take the absolute value of `a_negative_float` and save it to the name `a_positive_float`
6. Print out the value of `a_positive_float` and also the type of `a_positive_float`
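
One possible way to work through the six steps above is sketched below. It uses only the names given in the steps; the values in the comments follow from starting with the string `"2"`.

```python
# Step 1: a string that holds the character "2" (not the number 2)
string_holding_a_number = "2"

# Step 2: multiplying a string by an integer repeats the string
string_holding_a_bigger_number = string_holding_a_number * 3   # "222"

# Step 3: convert the string to an integer
just_an_int = int(string_holding_a_bigger_number)   # 222

# Step 4: dividing with / always produces a float
a_negative_float = just_an_int / -8   # -27.75

# Step 5: abs() returns the absolute value
a_positive_float = abs(a_negative_float)   # 27.75

# Step 6: print the value and its type
print(a_positive_float)
print(type(a_positive_float))
```

Note that step 2 repeats the string rather than doing arithmetic, which is why the result is `"222"` and not `"6"`.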


### Any questions???

If so, please ask them now...

## Lists 

Lists are a *data structure* that can hold multiple values. 

We use the square brackets to create lists; e.g., `my_list = [1, 2, 3]`

We can access elements using square brackets; e.g., `my_list[2]`. 

Let's explore lists!
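
As a starting point, here is a minimal sketch of the list operations covered in the cells below. All names other than `my_list` are illustrative and chosen only for this example.

```python
my_list = [1, 2, 3]

# Indexing starts at 0, so index 2 is the third element
third_element = my_list[2]            # 3

# Lists can hold elements of different types
mixed_list = [1, "two", 3.0]

# Concatenation with + creates a new list
combined = my_list + [4, 5]           # [1, 2, 3, 4, 5]

# len() gives the number of elements
num_elements = len(combined)          # 5

# Slicing includes the start index and excludes the stop index
first_two = combined[0:2]             # [1, 2]

# sum() and max() work when every element is a number
total = sum(my_list)                  # 6
largest = max(my_list)                # 3

# append() modifies the list in place and returns None
append_result = my_list.append(4)     # my_list is now [1, 2, 3, 4]

# sorted() returns a new sorted list; .sort() sorts in place
unsorted_numbers = [5, 1, 4]
sorted_copy = sorted(unsorted_numbers)   # [1, 4, 5]; the original is unchanged
unsorted_numbers.sort()                  # now the original is [1, 4, 5] too
```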

In [None]:
# A list of numbers



In [None]:
# A list of strings



In [None]:
# Lists can hold elements of different types



In [None]:
# We can access elements of a list also using square brackets


In [None]:
# concatenating lists



In [None]:
# getting the number of elements in a list using the len() function 


In [None]:
# slicing lists


In [None]:
# start at a different index


In [None]:
# what does this do? 


In [None]:
# If a list is all numbers we can sum the values, or get the maximum value



In [None]:
# We can't sum values that are not numbers


In [None]:
# We can append values on to a list using the append() method. 
# Note, this modifies the original list and returns a value of None!!!





In [None]:
# We can also sort a list. Note this modifies the original list!

number_list = [1, 52, 5, 124, 2, 5, 1, 4, 4, 5, 98]






In [None]:
# if we want to save the original list we can create a copy

number_list = [1, 52, 5, 124, 2, 5, 1, 4, 4, 5, 98]







In [None]:
# one can store lists inside of other lists










## Dictionaries

Dictionaries allow us to look up values. In particular, we provide a "key" and the dictionary returns a "value". 

We can create dictionaries using the syntax: 

`my_dict = {"key1": 1, "key2": 20}`
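
A short sketch of these ideas follows. Apart from `my_dict`, the names and values are made up for illustration.

```python
my_dict = {"key1": 1, "key2": 20}

# Look up a value by putting its key in square brackets
value = my_dict["key2"]               # 20

# Values can themselves be lists
dict_of_lists = {"evens": [2, 4], "odds": [1, 3]}
first_even = dict_of_lists["evens"][0]   # 2

# zip() pairs up two lists of the same length, and dict() turns the pairs into a dictionary
keys = ["a", "b", "c"]
values = [1, 2, 3]
zipped_dict = dict(zip(keys, values))    # {"a": 1, "b": 2, "c": 3}
```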


In [None]:
# we can access elements using square brackets 


In [None]:
# values in dictionaries can be lists



In [None]:
# We can create a dictionary from two lists of the same length using the dict() and zip() functions








## Example: NBA Salaries

Let's look at the salaries of basketball players in the NBA! The data we will analyze contains information about each player, including their salary from the 2015-2016 season listed in millions of dollars.  

We will load the data as a polars DataFrame, which is a data structure we will discuss more in a couple of weeks. We will then convert the data to lists and dictionaries to explore it further. 

This table can be found online: https://www.statcrunch.com/app/index.php?dataid=1843341
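
To preview the kind of list and dictionary analysis we will do, here is a small sketch on placeholder data. The player names and salary values below are invented for illustration and are not taken from the real table; only the names `player_list` and `salary_list` match the cells that follow.

```python
import statistics

# Placeholder values only: these are NOT real players or salaries
player_list = ["Player A", "Player B", "Player C"]
salary_list = [1.5, 10.0, 3.5]

# Largest and smallest values in the list
highest = max(salary_list)                       # 10.0
lowest = min(salary_list)                        # 1.5

# The mean is the sum divided by the number of elements
average = sum(salary_list) / len(salary_list)    # 5.0

# The statistics module can compute the median for us
middle = statistics.median(salary_list)          # 3.5

# Pair each player with a salary so we can look salaries up by name
salary_by_player = dict(zip(player_list, salary_list))
player_b_salary = salary_by_player["Player B"]   # 10.0
```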


In [None]:
import polars as pl

nba = pl.read_csv("nba_salaries_2015_16.csv")  # load in the data

nba.head()  # show the first 5 rows


In [None]:
# get the salaries as a list

salary_list = nba["SALARY"].to_list()
player_list = nba["PLAYER"].to_list()

salary_list[0:10]

In [None]:
# What are the maximum and minimum salaries? 



In [None]:
# What is the average salary?



In [None]:
# we can also use the mean() and median() functions in the statistics module to get the mean and the median values

import statistics





In [None]:
# What was Stephen Curry's salary in the 2015-2016 season? 



In [None]:
# Visualize a histogram of the data with vertical lines at the mean and the median
# Don't worry about this code for now. We will go over creating visualizations soon....

import matplotlib.pyplot as plt
%matplotlib inline

plt.hist(nba["SALARY"], bins=20, edgecolor='k', color = 'c');
plt.xlabel("Salary (million $)");
plt.ylabel("Count");
plt.axvline(statistics.mean(salary_list), color='r', linestyle='dashed', linewidth=1, label = "Mean");
plt.axvline(statistics.median(salary_list), color='b', linestyle='dashed', linewidth=1, label = "Median");
plt.legend();

# View the counts in the different histogram bins
import numpy as np
counts, bins = np.histogram(salary_list, bins = 10)
dict(zip(list(zip(np.round(bins[0:-1], 1), np.round(bins[1:], 1))), counts))


We will learn much easier ways to manipulate structured data tables when we learn how to use the polars package. 

## Loops

Loops allow us to repeat a process many times. They are particularly useful in conjunction with lists to process and store multiple values. 
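
A minimal sketch of a `for` loop over a list, and over numbers produced by `range()`, is shown below. The list `words` is an illustrative name used only here.

```python
words = ["red", "green", "blue"]

# The loop variable takes on each element of the list in turn
for word in words:
    print(word)

# range(3) produces the numbers 0, 1, 2 (the stop value is excluded)
for i in range(3):
    print(i)
```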


In [None]:
a_list = ["first", "second", "third", "fourth"]





In [None]:
# looping over numbers using the range() function




In [None]:
# Can you print the squares of the numbers from 1 to 6? 




We can use a loop to build up values in a list...
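
The usual pattern is to start with an empty list and append to it inside the loop. The sketch below uses cubes of the numbers 1 to 4 as an illustrative example; the name `cubes` is not part of the original notebook.

```python
# Start with an empty list
cubes = []

# range(1, 5) produces 1, 2, 3, 4
for i in range(1, 5):
    cubes.append(i ** 3)   # add the cube of i to the end of the list

print(cubes)   # [1, 8, 27, 64]
```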

In [None]:
# create a list that has the squares of the numbers 1 to 6








How can we sum the numbers 1 to 10? Or, to use mathematical notation, how can we compute $\sum_{i=1}^{10} i$ ?
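
One way to do this, sketched below, is to keep a running total that is updated on each pass through the loop. The names `running_total` and `builtin_total` are illustrative.

```python
# Start the running total at 0
running_total = 0

# range(1, 11) produces 1 through 10 (the stop value 11 is excluded)
for i in range(1, 11):
    running_total = running_total + i

print(running_total)   # 55

# The built-in sum() function gives the same answer without an explicit loop
builtin_total = sum(range(1, 11))
```

Both approaches agree with the closed-form result $\frac{10 \cdot 11}{2} = 55$.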


In [None]:
# we can use enumerate(my_list) to get both values from a list and sequential index numbers







## Comparison

We can do simple mathematical and string comparisons in Python which return Boolean values.
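
The sketch below collects a few such comparisons in one place. All variable names are illustrative.

```python
# A comparison evaluates to a Boolean value
is_bigger = 5 > 3                     # True
comparison_type = type(is_bigger)     # bool

# Use == to test equality (a single = is assignment)
is_equal = 3 == 3                     # True

# Comparisons can be chained to test whether a value is between two others
is_between = 1 < 2 < 3                # True

# `and` is True only if both sides are True; `or` is True if either side is True
both_true = (5 > 3) and (2 > 4)       # False
either_true = (5 > 3) or (2 > 4)      # True

# In arithmetic, True behaves like 1 and False behaves like 0
true_plus_true = True + True          # 2

# Lowercase strings compare alphabetically
apple_first = "apple" < "banana"      # True

# A word sorts before a longer word that starts with the same letters
cat_first = "cat" < "catalog"         # True
```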

In [None]:
# basic math comparison


In [None]:
# checking the type of a basic math comparison


In [None]:
# another basic math comparison


In [None]:
# We can type in Boolean values ourselves


In [None]:
# We use == to compare whether two items are equal (not 3 = 3)


In [None]:
# we can compare whether a value is between two values


In [None]:
# we can also do mathematical operations between logical comparisons


In [None]:
# we can use the `and` keyword to combine multiple logical statements 


In [None]:
# we can also use the `or` keyword to combine multiple logical statements 


In [None]:
# We can also compare strings


In [None]:
# Strings compare alphabetically


In [None]:
# A shorter word comes before a longer word that starts with the same letters


## Conditional Statements 

Conditional statements allow us to execute particular pieces of code when certain conditions are met; i.e., they execute a piece of code when a Boolean value is True. 

Let's explore!
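
As a minimal sketch, the loop below uses `if` and `else` to label each number from 1 to 5 as even or odd. The name `labels` is illustrative; it stores the results so we can inspect them after the loop.

```python
labels = []

for number in range(1, 6):
    # number % 2 is the remainder after dividing by 2
    if number % 2 == 0:
        label = "even"    # runs only when the condition is True
    else:
        label = "odd"     # runs only when the condition is False
    labels.append(label)
    print(number, "is", label)
```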

In [None]:
# let's look at a conditional statement in a loop












