In [None]:
from datascience import *
import numpy as np 
import pandas as pd

import matplotlib.pyplot as plt 
plt.style.use('fivethirtyeight') 
%matplotlib inline 

In [None]:
# Why do we use histograms and density over count/bar charts?
# Let's look at midterm data from a class. 
scores = Table().read_table("scores_by_section.csv")
bins = make_array(0, 30, 60, 80, 90, 95, 100)
bin_cats = ["[0, 30)", "[30, 60)", "[60, 80)", "[80, 90)", "[90, 95)", "[95, 100]"] # notice the unequal bin widths
counts_per_bin = [scores.where("Midterm", are.between(0, 30)).num_rows,
                  scores.where("Midterm", are.between(30, 60)).num_rows,
                  scores.where("Midterm", are.between(60, 80)).num_rows,
                  scores.where("Midterm", are.between(80, 90)).num_rows,
                  scores.where("Midterm", are.between(90, 95)).num_rows,
                  scores.where("Midterm", are.between_or_equal_to(95, 100)).num_rows]

In [None]:
plt.figure(figsize = (8, 5));
plt.bar(bin_cats, counts_per_bin);
plt.ylabel("Count");
plt.xlabel("Score");

In [None]:
# These don't tell the same story!
plt.figure(figsize = (8, 5));
plt.hist(scores.column("Midterm"), bins = bins, density = True); # density = True plots proportion per unit instead of raw counts
plt.ylabel("Density (proportion per score)");
plt.xlabel("Score");
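
To see why the two charts differ, here is a minimal sketch of how density bar heights are computed when the bins have unequal widths. The scores below are made up for illustration (they are not the class data), and `density_heights` is a hypothetical helper, not part of `datascience` or `matplotlib`.

```python
def density_heights(values, bin_edges):
    """Return (counts, heights), where each height is percent per unit of score."""
    counts = []
    heights = []
    n = len(values)
    last = len(bin_edges) - 2  # index of the final bin
    for i in range(len(bin_edges) - 1):
        left, right = bin_edges[i], bin_edges[i + 1]
        if i == last:
            # the final bin includes its right edge
            count = sum(left <= v <= right for v in values)
        else:
            count = sum(left <= v < right for v in values)
        counts.append(count)
        percent = 100 * count / n
        heights.append(percent / (right - left))  # divide by the bin width
    return counts, heights

made_up_scores = [25, 45, 55, 65, 70, 75, 85, 88, 92, 97]  # hypothetical data
edges = [0, 30, 60, 80, 90, 95, 100]
counts, heights = density_heights(made_up_scores, edges)
print(counts)
print(heights)
```

In this made-up data, the [60, 80) bin holds the most students, but its density bar is not the tallest because the bin is wide. A count-based bar chart hides that difference.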

In [None]:
# A better use of bar charts in this dataset: compare across section 
scores_by_section = ...


## Booleans: Making Decisions in Python

You might have noticed two specific values (highlighted in green) in Python: True and False.

These values are different from simple strings ("True" or "False") and have special properties associated with them. Because they are binary (every Boolean is either True or False), we can use them to make complicated decisions or to filter data in our code.
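
As a quick illustration of those properties, the sketch below uses only built-in Python. The variable names are made up for this example.

```python
# True and False are values of type bool, not strings.
bool_type_name = type(True).__name__      # 'bool'
string_type_name = type("True").__name__  # 'str'

# The Boolean True and the string "True" are different values.
same_as_string = (True == "True")

# In arithmetic, True counts as 1 and False counts as 0,
# so summing a list of Booleans counts how many are True.
passing = [True, False, True, True]
num_passing = sum(passing)

print(bool_type_name, string_type_name, same_as_string, num_passing)
```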

In [None]:
# What is a Boolean? + properties
True

In [None]:
# Boolean comparisons
# We can create booleans by comparing values
# >, <, <=, >=, !=, ==
...

In [None]:
# and it works with arrays! Boolean masking
grades = make_array(10, 12, 9, 8, 9, 10, 7, 5, 6, 8, 4, 3, 5, 11, 12, 9, 8)
...
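
One way the masking cell above could be filled in is sketched here. It uses `np.array` as a stand-in for `make_array`, and the cutoff of 10 is an arbitrary choice for illustration.

```python
import numpy as np

grades = np.array([10, 12, 9, 8, 9, 10, 7, 5, 6, 8, 4, 3, 5, 11, 12, 9, 8])

# Comparing an array to a number compares every element,
# producing an array of Booleans (the "mask").
high_mask = grades >= 10

# Indexing with the mask keeps only the positions where the mask is True.
high_grades = grades[high_mask]

# Counting the True values tells us how many grades met the cutoff.
num_high = np.count_nonzero(high_mask)

print(high_grades, num_high)
```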

In [None]:
# Practice question: Find the names of your classmates that are longer than 5 characters
names = make_array("Temma", "Jessica", "Corrine", "Sakshina", "Sachin", "Grace", "Nicole", "Cynthia",
                  "Jack", "Zhenya", "Benny", "Danielle", "Jose", "Megan", "Lena", "Beixuan", "Kendle",
                  "Vicky", "Kelsey", "Kimsa", "Ifejesu", "Joe", "Gerald", "Yang", "Joyce", "Sofia",
                  "Christine", "Hsiang Yun", "Zihong")

name_lengths = np.array([len(name) for name in names]) # a list comprehension; more on iteration next week

...

In [None]:
# Compound statements: and/or

#(3 < 4) and (5 < 2)
#(42 == 42.0) and (3*2 > 5)

#("blue" == "green") or ("red" == "red")
#("blue" == "gold") or ("red" == "white")
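
For reference, here are the four commented expressions evaluated and stored under made-up variable names. `and` is True only when both sides are True; `or` is True when at least one side is True.

```python
first_and = (3 < 4) and (5 < 2)                    # right side is False
second_and = (42 == 42.0) and (3 * 2 > 5)          # an int can equal a float
first_or = ("blue" == "green") or ("red" == "red")  # right side is True
second_or = ("blue" == "gold") or ("red" == "white")  # both sides are False

print(first_and, second_and, first_or, second_or)
```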

In [None]:
# If else statements: We can build this into code to make our computer choose between options

grade = ...

if ...: # A 
    ...
    
elif ...: # B 
    ...
    
elif ...: # C
    ...

else: # not passing
    ...
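
A sketch of how the template above might look once filled in. The cutoffs (90, 80, 70) and the function name `letter_grade` are assumptions for illustration, not the course's actual grading scale.

```python
def letter_grade(score):
    """Map a numeric score to a letter using made-up cutoffs."""
    if score >= 90:      # A
        return "A"
    elif score >= 80:    # B: only checked if the A test failed
        return "B"
    elif score >= 70:    # C: only checked if both tests above failed
        return "C"
    else:                # everything else
        return "not passing"

print(letter_grade(95), letter_grade(85), letter_grade(70), letter_grade(42))
```

Python checks the conditions from top to bottom and runs only the first branch whose condition is True, which is why each `elif` does not need to repeat the upper bound.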

In [None]:
# Truthy + falsey values: Python treats 0, None, and empty values (such as "", [], and {}) as False
# Almost everything else is treated as True

if ...:
    print("This is a truth-y value!")
else:
    print("This is a false-y value!")
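
The built-in `bool()` shows how Python will treat a value inside an `if`. The two lists below are a small sample chosen for illustration.

```python
falsy_values = [0, 0.0, None, "", [], {}]
truthy_values = [1, -1, "False", [0], " "]

# Every value in the first list converts to False.
all_falsy = all(not bool(v) for v in falsy_values)

# Every value in the second list converts to True,
# including the non-empty string "False" and the non-empty list [0].
all_truthy = all(bool(v) for v in truthy_values)

print(all_falsy, all_truthy)
```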


In [None]:
# Practice problem:
# What will the name animal be equal to? Guess before you run it. 

animal = None

if type(7) == float:
    animal = "turtle"

elif False:
    animal = "fish"

elif 0:
    animal = "bear"

elif max:
    animal = "snake"

else:
    animal = "bird"

animal

## Functions: Making generalizable code

Functions are a way to package code so that it is generalizable. In other words, by writing code in terms of a general variable, we can reuse it with many different inputs.

For example, in mathematics:

$ f(x) = x^2 $ 

This means we can plug in any value of x and get its square, such as:

$ f(5) = 5^2 = 25 $

Python functions are similar. They take in an input and create an output, where there is code in the background that works on the input.
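
The math example translates almost directly into Python. This is a minimal sketch; the name `f` simply mirrors the formula.

```python
def f(x):
    """Return the square of x, mirroring f(x) = x^2."""
    return x ** 2

result = f(5)
print(result)  # 25
```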

In [None]:
# On 2/11, at 1:25 PM, the weather in Berkeley was 55 degrees Fahrenheit

# To convert F to C, as we saw in homework, C = (5/9) * (F - 32)
...


In [None]:
todays_temp = ...

...

In [None]:
# Writing functions: a "pure" function
# Like in homework, let's do the F to C conversion:

def name(args):  # general template; name it f_to_c so the call below works
    """Docstring"""
    ...
    return ...

f_to_c(55.0)
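
One possible way to fill in the template, using the formula C = (5/9) * (F - 32) from above. The parameter name `fahrenheit` is my own choice.

```python
def f_to_c(fahrenheit):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (5 / 9) * (fahrenheit - 32)

berkeley_celsius = f_to_c(55.0)
print(round(berkeley_celsius, 1))  # 12.8
```

Because the function returns its result instead of printing it, the output can be stored in a variable and used in later calculations.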

In [None]:
# Non-pure functions
def print_conversion(...):
    ...
    print(...)

...

# Try doing math with the output now
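
A sketch of what goes wrong when a function prints instead of returning. The body of `print_conversion` is one plausible version, not necessarily the one written in discussion.

```python
def print_conversion(fahrenheit):
    """Print the Celsius value instead of returning it."""
    celsius = (5 / 9) * (fahrenheit - 32)
    print(fahrenheit, "F is", round(celsius, 1), "C")

output = print_conversion(55.0)  # the message is printed here
output_is_none = output is None  # but nothing was returned

# Doing math with None raises a TypeError.
try:
    output + 1
    math_worked = True
except TypeError:
    math_worked = False

print(output_is_none, math_worked)
```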


In [None]:
# What if I tried to use a variable created in the body of a function?
# Why? 
...
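
A small sketch of the scoping question above. The names `f_to_c_verbose` and `shifted_temp` are made up for this example.

```python
def f_to_c_verbose(fahrenheit):
    shifted_temp = fahrenheit - 32  # exists only while the function runs
    return (5 / 9) * shifted_temp

f_to_c_verbose(55.0)

# Outside the function, the local name is not defined.
try:
    shifted_temp
    found_local = True
except NameError:
    found_local = False

print(found_local)
```

Names created inside a function body are local to that call, so the only way to get a value out is to return it.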

In [None]:
"a".lower()

In [None]:
# How does this work with tables? If we want to use a function on an entire column,
# we can use tbl.apply(func, "col")

def first_letter(name, lower = False):  # lower is a "flag" 
    """Return the first character of a given string in the specified case"""
    if lower:
        return name[0].lower()
    else:
        return name[0].upper()

names_tbl = Table().with_column("Name", names)
names_tbl

In [None]:
# Try applying the first_letter func to the names_tbl. What data type do you get?
# Then, using a Boolean mask, find all the names in the class that are 
# in the first half of the alphabet (A-M)
# Make sure you get an array of the names in question
...

In [None]:
# Another way of answering the previous question (with table methods)
...

In [None]:
## One last note: functions can take in anything as an input and make any output
# We can plug in tables, functions, strings, ints, etc.!

## Treating Functions as Abstractions

Remember that functions are objects in Python, so we can use existing functions to build more complex or more useful functions for our purposes. If we treat them as black boxes that work (without caring about the implementation), we can use them to build other functions that fit our use cases.

One such example is currying. In currying, we call an existing function inside a new function with some of its arguments fixed (this pattern is also known as partial application). We do this to reduce the total number of arguments and simplify the function for our specific use. See below for an example.
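
As a rough sketch of the idea, assuming a two-argument `power` function whose second parameter I have called `exponent`:

```python
def power(base, exponent):
    """Return base raised to the given exponent."""
    return base ** exponent

def square(x):
    """Reuse power with the exponent fixed at 2."""
    return power(x, 2)

def cube(x):
    """Reuse power with the exponent fixed at 3."""
    return power(x, 3)

print(power(2, 10), square(4), cube(2))
```

`square` and `cube` never need to know how `power` works internally; they only rely on it returning the right answer.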

In [None]:
## Write a function that calculates the value of a power
def power(base, exponent):  # "exponent" avoids reusing the function's own name as a parameter
    ...

In [None]:
# Treating functions as abstractions: currying
def square(...):
    ...

In [None]:
def cube(...):
    ...

Recursion is the process of calling a function within itself, which lets us repeat a computation many times. The key idea is to treat the function as an abstraction: we assume it works correctly on a smaller input and build toward a stopping point. A recursive function has two parts:

- A base case (the "stop point") 
- A recursive call that moves towards the base case (ex. if we have the function f(n), then we would do something like return f(n-1) in the body)

For example: this function finds the factorial of n.

In [None]:
def find_fact(n):
    """Find factorial of n: n * (n-1) * (n-2) * ... * 1 (assumes n is a non-negative integer)"""
    if n <= 1:  # base case; also covers 0! = 1
        return 1
    else:
        return n * find_fact(n-1)
    
find_fact(5) # 5 * 4 * 3 * 2 * 1

The Hailstone sequence from Hofstadter is as follows:
1. Pick a positive integer n as the start value.
2. If n is even, divide it by 2.
3. If n is odd, multiply it by 3 and add 1.
4. Repeat this process. It is conjectured (but not proven) that n will always eventually reach 1.


Let's write a function that uses if/else statements and recursion to do the hailstone sequence. 

In [None]:
def hailstone(n):
    ...
    
hailstone(10)
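
One possible solution sketch, assuming `n` is a positive integer. Printing each value and returning the length of the sequence is my own design choice; the exercise does not specify what the function should return.

```python
def hailstone(n):
    """Print the hailstone sequence starting at n and return its length."""
    print(n)
    if n == 1:                            # base case: the sequence stops at 1
        return 1
    elif n % 2 == 0:                      # even: divide by 2
        return 1 + hailstone(n // 2)
    else:                                 # odd: multiply by 3 and add 1
        return 1 + hailstone(3 * n + 1)

length = hailstone(10)  # prints 10, 5, 16, 8, 4, 2, 1
print(length)
```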

In [None]:
## Not that important for us, but cool application: we can write functions that return functions 
# This is called a "higher order" function
# More info in Prof. DeNero and Prof. Farid's lectures (CS61A and INFO 206A)

def pow_creator(n):
    """Returns a function that calculates x to a certain power n"""
    def n_power(x):
        """Calculates x ** n, where n is defined by pow_creator"""
        return x ** n
    return n_power

squarer = ...
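
A sketch of how `pow_creator` might be used. The name `cuber` is added here for illustration.

```python
def pow_creator(n):
    """Returns a function that calculates x to a certain power n"""
    def n_power(x):
        """Calculates x ** n, where n is defined by pow_creator"""
        return x ** n
    return n_power

squarer = pow_creator(2)  # squarer is itself a function
cuber = pow_creator(3)    # hypothetical second example

print(squarer(5), cuber(2))  # 25 8
```

Each call to `pow_creator` returns a new function that remembers its own value of `n`.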

## Next week's topics and a quick clarification

Note: these discussions may not match up exactly with the lecture content, but I think it's useful to go into these computing topics here, so I've switched the order of a few topics. Module 4 covers functions and table methods, and Module 5 covers booleans and iteration.

Next week, we will cover the advanced table methods (group, join, pivot), which will be in Lab 5 and Homework 4 due 2/25. The videos for Module 4 introduced these concepts, but I think it will be more useful to cover them in depth in discussion next week (2/19). For the sake of asynchronous students (who may not be able to watch the discussion recording by Monday), the vitamin this week will check your understanding of functions and table methods.

We will also talk about iteration, or using Python to run a block of code many times with while and for loops, which will be a key part of how we run empirical simulations for statistical analysis.
