## Python Basics

Here are some extra resources for learning Python:

**Getting Started with Python**:

* https://www.codecademy.com/learn/python
* http://docs.python-guide.org/en/latest/intro/learning/
* https://learnpythonthehardway.org/book/
* https://www.codementor.io/learn-python-online

**Learning Python in Notebooks**:

* http://mbakker7.github.io/exploratory_computing_with_python/

This is handy to always have available for reference:

**Python Reference**:

* https://docs.python.org/3/reference/

Shoutout to the Michigan Data Science Team (MDST) for creating most of this code, and specifically to Casper Guo and Weile Zheng.

## How to use Jupyter Notebook

Jupyter Notebook is a tool that lets you develop documents that combine code, visualizations, and explanatory text.

For our purposes, we are going to teach you the basics of Jupyter Notebook so that you can interact with this tutorial. To learn more about it, I highly recommend the Jupyter Notebook site (https://docs.jupyter.org/en/latest/) or MDST (https://mdst-club.notion.site/MDST-Onboarding-3d1b3591dd224115a548325a7d66a723).

In [1]:
# Jupyter has code cells for writing and running Python code.
# What's in this cell is not Python code; these lines are comments. You can start a comment by putting a hash sign (#) at the beginning of a line.
# Comments are for other humans only. The computer will ignore them when executing your programs.
# Pro Tip: You can comment and uncomment many lines at once by highlighting them and pressing CTRL + / or CMD + /

You can run a cell by pressing `CTRL + ENTER` or `SHIFT + ENTER`. Running a cell will either render the contained Markdown as nicely formatted text or execute the contained code.

If you are new to Python, you should run every cell in this notebook. If you already have some familiarity, you can use the Table of Contents to skip ahead.

# Table of Contents

- [1. Data Types](#1.-Data-Types)
    - [1.0 Your First Python Program](#1.0-Your-First-Python-Program)
    - [1.1 Data Type](#1.1-Data-Type)
    - [1.2 Container (list, tuple, dictionary, set)](#1.2-Container)
- [2. Control Flow](#2.-Control-Flow)
- [3. Iterating](#3.-Iterating)
    - [3.0 For Loops](#3.0-For-Loops)
- [4. Functions](#4.-Functions)
    - [4.0 Import & Library](#4.0-Import-&-Library)
    - [4.1 Built-in Function](#4.1-Built-in-Function)
    - [4.2 Custom Function](#4.2-Custom-Function)
    - [4.3 Lambda Function](#4.3-Lambda-Function-(Optional))
    - [4.4 Type Hinting](#4.4-Type-Hinting-(Optional))
- [5. Numpy](#5.-Numpy)
    - [5.0 Array](#5.0-Array)
    - [5.1 Math](#5.1-Math)
- [6. Pandas](#6.-Pandas)
    - [6.0 Dataframes & Series](#6.0-Dataframes-&-Series)
    - [6.1 Indexing](#6.1-Indexing)
    - [6.2 Data Transformation](#6.2-Data-Transformation)
    - [6.3 Grouping & Aggregating](#6.3-Grouping-&-Aggregating)
    - [6.4 Lambda Functions in Pandas](#6.4-Leveraging-Lambda-Functions-In-Pandas-(Optional))

The checkpoint proceeds in the same order so you can follow along.

## 1. Data Types

### 1.0 Your First Python Program

In [None]:
# Tradition demands that we do this
# Try running this cell

print("Hello World")

In [None]:
# Notebooks will automatically print the output of the last line of each cell when they are run

413 * 5791

### 1.1 Data Type

#### 1.1.0 Ints and Floats

Python distinguishes between integers and decimal numbers (floats).

In [None]:
type(0)

In [None]:
type(0.0)

Basic arithmetic is straightforward in Python.

In [None]:
3 + 2

In [None]:
1.1 - 9.0

In [None]:
3 * 5

In [None]:
# When two numbers, regardless of whether they are int or float, are divided, Python returns the result as if the operation is done on a calculator
# This is known as float division
print(1/2)
print(1.5/2.4)

In [None]:
# There is also integer division (also called floor division), written with //
# In Python, it always rounds the float division result down to the nearest integer
14 // 5

In [None]:
# You can also find the remainders of divisions
# Also known as taking the modulus
13 % 5

In [None]:
# exponent
10 ** 3

ints and floats are mostly interchangeable and can also be cast (i.e. converted) to the other type.

In [None]:
float(3)

In [None]:
int(2.9)
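
One detail worth knowing when casting: `int()` drops the fractional part (it truncates toward zero), whereas `//` rounds down. The two agree for positive numbers but differ for negative ones. Here is a small sketch; the variable names are made up for illustration only.

```python
# int() truncates toward zero, so the fractional part is simply dropped
truncated_positive = int(2.9)
truncated_negative = int(-2.9)

# // floors, which means rounding toward negative infinity
floored_negative = -29 // 10

# round() goes to the nearest integer instead of truncating
rounded = round(2.9)

print(truncated_positive, truncated_negative, floored_negative, rounded)
```

If you want the nearest whole number rather than the truncated one, `round()` is usually the better fit than `int()`.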

#### 1.1.1 Strings

Strings are Python's representation of text.

In [None]:
# They can either be surrounded by double quotes...
type("apple")

In [None]:
# or single quotes
type('apple')

In [None]:
# You can piece two strings together (aka concatenate) using the plus sign
"Hello" + " World"

Python provides many functions for manipulating strings. 

In [None]:
# Uppercase
"like so".upper()

In [None]:
# Lowercase
"LIKE SO".lower()

In [None]:
# Title case
"like so".title()

In [None]:
# Count the number of characters, including whitespace
len("like so")

In [None]:
# Remove spaces on either side of a string
"    like so  ".strip()

In [None]:
# Split a string into a list of words 
"like so".split()

You can find a comprehensive list of these functions [here](https://www.w3schools.com/python/python_ref_string.asp).
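
String methods can be chained, and each one returns a new string (or list) rather than changing the original. The sketch below combines a few of the methods above to tidy up a line of text. The input string is a hypothetical example, and `join()` is a standard string method that was not shown above.

```python
raw_line = "   the quick brown fox   "  # hypothetical input

cleaned = raw_line.strip()              # remove surrounding whitespace
words = cleaned.split()                 # break into a list of words
headline = " ".join(words).title()      # glue back together, then title-case

print(words)
print(headline)
print(raw_line == "   the quick brown fox   ")  # the original is unchanged
```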

#### 1.1.2 Boolean Values

There are two boolean values in Python: `True` and `False`. They are case sensitive and must be typed exactly as such.

Now time for some basic [boolean algebra](https://en.wikipedia.org/wiki/Boolean_algebra).

In [None]:
print(not True)
print(not False)

In [None]:
# and, or conjunction, only evaluates to True when every boolean value involved is True
print(True and True)
print(True and False)
print(False and False)

In [None]:
# or, or disjunction, evaluates to True whenever at least one involved boolean value is True
print(True or True)
print(True or False)
print(False or False)

In [None]:
# All the non-zero numbers are treated as True
print(bool(1 and True))
print(bool(0 and True))

In [None]:
# All non-empty strings, even if the string is all whitespace, are treated as True
print(bool(True and ""))
print(bool(True and "    "))
print(bool(True and "False"))
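
The `bool()` calls in the cells above are there for a reason: `and` and `or` hand back one of their operands rather than always producing `True` or `False`. A minimal sketch, with illustrative variable names:

```python
# `and` returns the first falsy operand, or the last operand if all are truthy
first = 0 and True
second = 1 and True

# `or` returns the first truthy operand, or the last operand if none are truthy
third = "" or "fallback"

print(first, second, third)
print(bool(first), bool(second), bool(third))
```

Wrapping the expression in `bool()` converts whatever operand comes back into a plain `True` or `False`.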

We will use boolean values much more extensively when we encounter control flow and if statements.

#### 1.1.3 Variables

In [None]:
# Python automatically figures out what type your variables are
# Once the cell is run, the variables are made available everywhere else in the notebook

x = 4
y = 5

In [None]:
# We can do arithmetic with those variables in another cell
4*x + 5*y

In [None]:
# There are some shorthands for updating variables
# Instead of x = x + 2
# We can simply do:

x += 2
x

# You can do the same for -, *, and /

In [None]:
# In Python, snake case is the norm for multi-word variable names

michigan_data_science_club_abbreviation = "MROBOSUB"

In [None]:
# The values stored inside the variables can be overwritten later by referring back to the variable name
# Python allows changing the data type of the variable when it is overwritten

x = "like"
y = " so"

x + y

### 1.2 Container

#### 1.2.0 List

In [None]:
# You can create (aka initialize) an empty list with the square brackets
empty_list = []

# or with the list() command
another_empty_list = list()

In [None]:
# Or you can create lists by listing the elements it should contain
nonempty_list = [32, 'MROBOSUB', True]

Once a list is created, you can retrieve elements inside with its index.

Python uses 0-indexing, meaning the first element is on index 0. 

In [None]:
# Retrieve an element by putting its index in a square bracket after the list's name
nonempty_list[1]

In [None]:
# This works similarly for strings
mdst = "MROBOSUB"
mdst[2]

In [None]:
# You can chain indices as well
nonempty_list[1][2]

Negative numbers index from the end. Think of it as -1 wrapping around to the last element in the list. -2 is then the second-to-last element in the list, and so on.

In [None]:
nonempty_list[-2]

Be careful to not use an index that doesn't exist in a list. Python won't know what to do and will throw an error.

In [None]:
# Getting the first element in an empty list doesn't make sense.

print(empty_list[0])

In [None]:
# Neither does finding the fifth element in a three-element list

print(nonempty_list[4])

You can use slicing to get sublists/substrings.

Syntax: `[start:end:step]`

The slice includes the start index (inclusive) but not the end index (exclusive).

In [None]:
sample_list = [0, 1, 2, 3 , 4, 5, 6, 7, 8, 9, 10]

In [None]:
# Getting the fourth to eighth element
# If you don't specify the step, Python assumes you want every element in the range

sample_list[3:8]

In [None]:
# When end is not specified, Python includes everything from the start index onward
sample_list[5:]

In [None]:
# Similarly, when start is not specified, Python includes everything before the end index but excludes the end index itself
sample_list[:-5]

In [None]:
# When neither start nor end is specified, Python applies the step argument to the entire list
# step = 2 means to take 2 steps forward each time an element is selected. In other words, it selects every other element

sample_list[::2]

In [None]:
# A neat trick for reversing a list, try to understand what it's doing
sample_list[::-1]
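
The same `[start:end:step]` syntax works on strings. The sketch below uses a hypothetical example word and also shows that, unlike indexing, a slice that runs past the end does not raise an error.

```python
word = "notebook"  # hypothetical example string

first_four = word[:4]        # characters at indices 0, 1, 2, 3
last_four = word[-4:]        # the final four characters
every_other = word[::2]      # indices 0, 2, 4, 6
backwards = word[::-1]       # step of -1 walks from the end to the start
past_the_end = word[5:100]   # slices are clipped to the available range

print(first_four, last_four, every_other, backwards, past_the_end)
```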

You can add an element to an existing list ...

In [None]:
# at the end ...
sample_list.append(11)

# or somewhere in the middle
# syntax: insert(index, new_value)
sample_list.insert(1, 0.5)

print(sample_list)

or remove an element ...

In [None]:
# remove the first instance of a given value in the list
sample_list.remove(0.5)

# or remove the element at a specified index
sample_list.pop(0)

sample_list

or change an element using its index ...

In [None]:
sample_list[-1] = 12
sample_list

or many other things ...

See the full range of possibilities [here](https://www.w3schools.com/python/python_ref_list.asp).

If you thought typing out every number from 0 to 10 was an inefficient way of creating a list, you will be glad to learn about the `range()` function. 

Syntax: `range(start (inclusive), end (exclusive), step)`

Pro tip: if you only specify `end`, Python will give you every integer from 0 up to the one before `end`.

In [None]:
# let's recreate the list of numbers from 0 to 10 using range()
# range() returns a range object, not a list, so we need to convert it with list()
sample_list = list(range(11))
sample_list
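
To see how `start`, `end`, and `step` interact, here is a short sketch. The variable names are chosen only for this illustration.

```python
# start=2, end=11 (exclusive), step=2
evens = list(range(2, 11, 2))

# A negative step counts down; the end value is still excluded
countdown = list(range(10, 0, -3))

# When start equals end there is nothing to produce
nothing = list(range(5, 5))

print(evens)
print(countdown)
print(nothing)
```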

#### 1.2.1 Tuple

Python tuples are list-like data structures with one important difference.

In [None]:
# You can create them with parentheses, or create an empty one with tuple()

empty_tuple = tuple()

sample_tuple = (1, 2, 3, 4)

print(empty_tuple, sample_tuple)

Indexing tuples is just like indexing lists.

In [None]:
print(sample_tuple[1], sample_tuple[-3])

Crucially, tuples can NOT be modified once created. 

Tuples are *immutable*. While this property makes them less versatile than lists, it sometimes comes in handy. For example, tuples can be used as keys in dictionaries (next section). 

In [None]:
# try to overwrite an item in a tuple

try:
    sample_tuple[-1] = 10
except TypeError as e:
    print(e)

#### 1.2.2 Dictionary

A dictionary stores pairs of keys and values, where each key is associated with one value.

In [None]:
# You can create an empty dictionary in two ways
empty_dict1 = dict()
empty_dict2 = {}

print(empty_dict1, empty_dict2)

In [None]:
# You can also create dictionaries by specifying the key:value pairs
panda_express_pricing = {"Bowl":5.80, "Plate":6.80, "Bigger Plate":8.30}

You index a dictionary with a key to get its associated value.

In [None]:
bowl_price = panda_express_pricing["Bowl"]
bowl_price

Be careful to not index a key that doesn't exist in the dictionary because that will cause an error.

If you are not sure whether a key is in the dictionary or not, use the [get](https://www.w3schools.com/python/ref_dictionary_get.asp) method to be safe.

In [None]:
# Try to order a buffet at Panda Express (this raises a KeyError because the key does not exist)
buffet_price = panda_express_pricing["Buffet"]
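
The cell above raises a `KeyError`. A sketch of the safer lookup with `get` follows; it uses a hypothetical stand-in dictionary named `menu` so it can be run on its own.

```python
menu = {"Bowl": 5.80, "Plate": 6.80}  # hypothetical stand-in dictionary

# get() returns None instead of raising an error when the key is missing
missing = menu.get("Buffet")

# An optional second argument supplies a default value to return instead
with_default = menu.get("Buffet", 0.0)

# For keys that do exist, get() behaves like normal indexing
present = menu.get("Bowl", 0.0)

print(missing, with_default, present)
```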

It follows that you can change the value associated with a key.

In [None]:
# let's say Panda Express has a sale on the bowls
panda_express_pricing["Bowl"] = 5.00
bowl_price = panda_express_pricing["Bowl"]
bowl_price

There is, however, no easy way to modify the key associated with a value.
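
One common workaround, shown here only as a sketch with a hypothetical `menu` dictionary, is to remove the old key with `pop()` (which returns its value) and store that value under the new key.

```python
menu = {"Bowl": 5.80, "Plate": 6.80}  # hypothetical stand-in dictionary

# pop() removes "Bowl" and returns its value, which is stored under the new key
menu["Small Bowl"] = menu.pop("Bowl")

print(menu)
```

Note that the renamed entry moves to the end of the dictionary, because it is technically a brand-new key.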

In [None]:
# You can see a list of all the keys in a dictionary
panda_express_pricing.keys()

In [None]:
# Or a list of all values
panda_express_pricing.values()

In [None]:
# Or a list of key value pairs, represented as tuples
panda_express_pricing.items()

See everything you can do with dictionaries [here](https://www.w3schools.com/python/python_ref_dictionary.asp).
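
The tuple section mentioned that tuples can serve as dictionary keys while lists cannot. A minimal sketch of that difference, using made-up data:

```python
# Tuples are immutable, so they are allowed as keys
treasure_map = {(0, 0): "origin", (1, 2): "treasure"}  # hypothetical data
found = treasure_map[(1, 2)]

# Lists are mutable, so using one as a key raises a TypeError
try:
    bad_map = {[0, 0]: "origin"}
    list_key_error = None
except TypeError as e:
    list_key_error = str(e)

print(found)
print(list_key_error)
```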

#### 1.2.3 Set

Sets store unique elements.

In [None]:
# An empty set must be created with set(), because {} creates an empty dictionary
# Non-empty sets can also be written with curly braces, e.g. {1, 2, 3}

s = set([1,2,3,1,2,3])
s

In [None]:
# Add new elements to a set
s.add(3)
s.add(4)
s

In [None]:
# Remove elements in the set 
s.discard(1)
s.discard(2)

There are many set operations that can be performed between two sets. We will not go into them here. You can see a list on this [page](https://www.w3schools.com/python/python_ref_set.asp).
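
For a quick taste without going into detail, the three most common operations look roughly like this. The sets `a` and `b` are made up for the illustration.

```python
a = set([1, 2, 3, 4])
b = set([3, 4, 5])

in_either = a | b    # union: elements in a, in b, or in both
in_both = a & b      # intersection: elements in both a and b
only_in_a = a - b    # difference: elements in a that are not in b

print(in_either, in_both, only_in_a)
```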

#### 1.2.4 Container Utilities

You can use `len()` to find the number of items in each of the above four containers.

In [None]:
l = [1,2,3]
t = (1,2,3)
d = {1:'a', 2:'b', 3:'c'}
s = set([1, 2, 3])

print(len(l), len(t), len(d), len(s))

And use the `in` keyword to check if an element is in the container or not.

For dictionaries, you can only use this to check whether a key is in the dictionary or not.

In [None]:
print(1 in l)
print(4 in t)
print(2 in d)
print(0 in s)
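
To make the dictionary caveat concrete, here is a sketch with a hypothetical dictionary. Checking a value requires going through `.values()` explicitly. The last line shows that `in` also works on strings, where it looks for a substring.

```python
letters_by_number = {1: 'a', 2: 'b', 3: 'c'}  # hypothetical dictionary

key_found = 2 in letters_by_number                  # 2 is a key
value_as_key = 'a' in letters_by_number             # 'a' is a value, not a key
value_found = 'a' in letters_by_number.values()     # search the values instead
substring_found = "so" in "like so"                 # substring check on strings

print(key_found, value_as_key, value_found, substring_found)
```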

## 2. Control Flow

You can use `if` statements to execute different actions in different scenarios.

Before we dive in, a quick aside on comparing numbers:
- Use `==` to check equality
- Use `!=` to check inequality
- Use `<`, `>`, `>=`, and `<=` to compare two numbers

In [None]:
# Here is the general idea of if statements
# if (condition evaluates to true):
#   execute code here

to_print_or_not_to_print = True

if to_print_or_not_to_print:
    # Most code editors will automatically indent the lines inside an if statement for you 
    # You can indent with tabs or spaces and by any amount (four spaces is the most common convention)
    # Just be consistent and never mix tabs and spaces! Your code will not work without consistent indentation!
    
    print("The first block of code is executed")

to_print_or_not_to_print = False

if to_print_or_not_to_print:
    print("The second block of code is executed")


We can use more complex conditions for `if` statements.

In [None]:
if 4 < 5 and 6 >= 6 and len(list(range(3))) == 3:
    print("The first block of code is executed")

if 4 != 4 or 6 > 7 or -1 < 0:
    print("The second block of code is executed")

An `if ... else` scheme can handle both when the condition is true and false.

In [None]:
to_print_or_not_to_print = True

if to_print_or_not_to_print:
    # indented
    print("printing")
# unindented
else:
    # indented
    print("not printing")

to_print_or_not_to_print = False

if to_print_or_not_to_print:
    print("printing")
else:
    print("not printing")

`if ... elif ... else` schemes can handle many different scenarios.

You can have `elif` without `else` but all `elif` must appear before `else`.

In [None]:
uniqname = "ENTER YOUR UNIQNAME HERE"

if len(uniqname) <= 4:
    print("Short")
elif len(uniqname) < 8:
    print("Medium")
else: 
    print("Long")
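
One thing to keep in mind with `if ... elif ... else` is that only the first branch whose condition is true gets executed, even if later conditions would also be true. A small sketch with a made-up score and made-up grade cutoffs:

```python
score = 85  # hypothetical value

# 85 >= 70 is also true, but Python stops at the first matching branch
if score >= 90:
    grade = "A"
elif score >= 80:
    grade = "B"
elif score >= 70:
    grade = "C"
else:
    grade = "F"

print(grade)
```

This is why the order of the conditions matters: listing `score >= 70` first would assign "C" to every score of 70 or above.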

## 3. Iterating

### 3.0 For Loops

Lists, tuples, sets, dictionaries, strings, and ranges are all *iterables*. That just means we can move through them in a certain order.

This property is useful for simplifying repeated actions. 

Say we have a list of numbers and we want to print each of them, doubled. 

We can use the index to access, multiply, and print each of them but that's inefficient.

For loops to the rescue.

In [None]:
nums = list(range(5))

for num in nums: 
    print(num*2)


In [None]:
# What is actually going on here?
#
# in nums specifies the iterable to go through, nums in this case
# num is the loop variable: it takes the value of each element in turn. i, j, and k are common loop variable names but num makes more sense here
#
# for num in nums: 
#     indent!
#     num is set to an element in the nums list and the action is executed
#     print(num*2)
#     num is set to the next element in the nums list
#
# in this case, we iterated through the elements of the list

In [None]:
# Another common pattern is to iterate through the indices 
# let's print out the indices that hold an even number

for i in range(len(nums)):
    # range(len(nums)) gives all the indices in the nums list
    # nums has 5 elements so range(len(nums)) looks like 0, 1, 2, 3, 4
    # you will see this all the time in for loops

    if nums[i] % 2 == 0:
        print(i)

One more example: 

Make a new list containing the items in nums squared.


In [None]:
squared_nums = []

for num in nums:
    squared_nums.append(num ** 2)

squared_nums

Sometimes it is useful to iterate through both the element and index at the same time. 

Look into [`enumerate`](https://realpython.com/python-enumerate/).
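
As a brief sketch of what that looks like, `enumerate` yields `(index, element)` pairs that can be unpacked directly in the `for` line. The list of colors is a made-up example.

```python
colors = ["maize", "blue", "white"]  # hypothetical example list

indexed = []
for i, color in enumerate(colors):
    # i is the index, color is the element at that index
    print(i, color)
    indexed.append((i, color))

# The optional start argument changes the first index (handy for 1-based numbering)
numbered = list(enumerate(colors, start=1))
print(numbered)
```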

## 4. Functions

### 4.0 Import & Library

Libraries (aka packages) are code that other people have developed for you to use. Python has tons of cool and interesting libraries.

You can start using them in your notebooks with the `import` keyword.

In [None]:
# There is always a relevant xkcd 

import antigravity

Most libraries are more elaborate and contain much more functionality.
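
As a rough sketch of the common import styles, the example below uses `math` and `statistics`, two modules from Python's standard library. They are chosen here only for illustration. The alias `stats` is an arbitrary name.

```python
# Import a whole library and access its contents with a dot
import math

# Import one specific function so it can be used without the prefix
from math import sqrt

# Import a library under a shorter alias
import statistics as stats

floor_value = math.floor(2.7)
root = sqrt(16)
average = stats.mean([1, 2, 3, 4])

print(floor_value, root, average)
```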