# Quiz 1 Review

## Learning Objectives

### 1.1
- Define integers, strings, tuples, lists, and dictionaries
- Demonstrate arithmetic operations and string operations
- Learn to use basic Jupyter Notebook features

### 1.2
- Explore `Python` control flow and conditional programming.  
- Apply `if, else` conditional statements.
- Combine control flow and conditional statements to solve the classic "FizzBuzz" code challenge.
- Demonstrate error-handling using `try, except` statements.

### 1.3
- Review basic programming concepts (control flow, iteration, datatypes)
- Create functions
- Lambda functions
- Introduce `NumPy`

### 1.4
- While loops
- For loops
- Looping through dictionaries
- Mapping/filtering (lists & dicts)
- List comprehensions

### 1.5
- Define experiment, outcome, event, and sample space.
- Calculate the union and intersection of sets.
- Apply six probability rules.
- Import packages into Python.
- Solve probability problems using simulations.

### 1.6
- Define distribution and random variable.
- Describe the difference between discrete and continuous random variables.
- Understand the difference between probability mass functions and cumulative distribution functions.
- Give examples of the following distributions: Discrete Uniform, Bernoulli, Binomial, and Poisson.

### 1.7
- Give examples of the following distributions: Continuous Uniform, Exponential, Normal.
- Describe why the Normal distribution is seen everywhere.
- State the Central Limit Theorem.

# Data Types


## Numeric

For now, we can think of our two main numeric types as `int` and `float`. In Python 3, the two types can be mixed freely in mathematical operations, but some functions specifically require an `int` or a `float`. The big difference: a `float` always carries a decimal part (the integer `2` becomes `2.0` as a `float`).

In [0]:
type(2)

int

In [0]:
type(2.0)

float

## Numeric operators

| Symbol | Operation |
| --- | --- |
| + | addition |
| - | subtraction |
| * | multiplication |
| / | division |
| ** | exponentiation |
| // | floor division |
| % | modulo (remainder) |


In [0]:
1 ** 2

1

In [0]:
2.0 ** 2

4.0

In [0]:
1/2

0.5

In [0]:
2/2

1.0

In [0]:
2//2

1

In [0]:
1//2

0

In [0]:
12 % 3

0

In [0]:
12 % 5

2

## Strings

Strings are used for text/words in Python (but we can also cast numbers or punctuation as strings for certain operations).

We can use single or double quotes to define strings (`""` or `''`).

The `str()` function will cast other datatypes to string.

Strings are immutable iterables.

In [0]:
name = 'Teacher McTeacherson'

In [0]:
student = "Teacher's Pet"

In [0]:
triple_quotes = """She said, "It's hers." """ 

In [0]:
the_number_two = 2

In [0]:
the_string_two = str(the_number_two)

In [0]:
type(the_string_two)

str

## String operators

Strings have many methods associated with them. Amongst these:

| Method | Effect |
| --- | --- |
| `join` | concatenates passed string iterable using string as separator |
| `find` | returns lowest index of searched string (`-1` if not found) |
| `count` | returns the count of passed string |
| `split` | splits string on whitespace (or passed string arg) |
| `strip` | returns string without leading/trailing whitespace |

In addition, strings will behave like lists in many cases.


In [0]:
the = 'The'
ball = 'ball'
bounces = 'bounces'

In [0]:
sentence = ' '.join([the, ball, bounces])
sentence

'The ball bounces'

In [0]:
'-'.join('concatenate')

'c-o-n-c-a-t-e-n-a-t-e'

In [0]:
sentence.split()

['The', 'ball', 'bounces']

In [0]:
sentence.find('b')

4

In [0]:
sentence[4]

'b'

In [0]:
sentence[4:8]

'ball'

In [0]:
sentence.count('b')

2

In [0]:
dirty_sentence = '   ' + sentence + '    '
dirty_sentence

'   The ball bounces    '

In [0]:
dirty_sentence.strip()

'The ball bounces'

## Iterables

While iterables are *sometimes* interchangeable, it's important to note slight differences between them, as well as default behaviors.

### Lists 

* Mutable
* Indexed with numbers
* Ordered
* Can contain multiple types, as well as redundant values

### Tuples

* Immutable
* Indexed with numbers
* Ordered
* Can contain multiple types, as well as redundant values

### Dictionaries

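* Keep insertion order as of Python 3.7 (unordered in earlier versions)
* `values` accessed by `keys`
* `keys` must be unique
* `values` can be any type, mutable or not, but `keys` must be hashable (immutable types such as strings, numbers, or tuples)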

### Sets
* Mutable
* Cannot be indexed
* Unordered
* All elements are unique
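
Sets also support the union and intersection operations named in the 1.5 objectives. The sketch below uses two made-up sets of die faces (`even_faces` and `low_faces` are illustrative names, not from the lesson) to show the operators.

```python
# Two illustrative events on a 6-sided die, written as sets of faces.
even_faces = {2, 4, 6}
low_faces = {1, 2, 3}

union = even_faces | low_faces          # faces in either set
intersection = even_faces & low_faces   # faces in both sets
difference = even_faces - low_faces     # faces in the first set only

print(union)
print(intersection)
print(difference)
```

The method forms `even_faces.union(low_faces)` and `even_faces.intersection(low_faces)` give the same results.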


In [0]:
reds_list = ['red', 'maroon', 'brick', 'pink']

In [0]:
reds_list[2]

'brick'

In [0]:
reds_list[2] = 'burgandy'

In [0]:
for i, red in enumerate(reds_list):
  print(i, red)

0 red
1 maroon
2 burgandy
3 pink


In [0]:
blue_tuple = ('blue', 'cerulean', 'teal', 'navy')

In [0]:
blue_tuple[2]

'teal'

In [0]:
try:
  blue_tuple[2] = 'sky'
except TypeError:  # item assignment on a tuple raises TypeError
  print('Tuples are immutable')

Tuples are immutable


In [0]:
for i, blue in enumerate(blue_tuple):
  print(i**2, blue)

0 blue
1 cerulean
4 teal
9 navy


In [0]:
green_set = {'green', 'lime', 'forest', 'green', 'safety green'}
green_set

{'forest', 'green', 'lime', 'safety green'}

In [0]:
color_dict = {'red': reds_list, 'blue': blue_tuple, 'green': green_set}
color_dict

{'blue': ('blue', 'cerulean', 'teal', 'navy'),
 'green': {'forest', 'green', 'lime', 'safety green'},
 'red': ['red', 'maroon', 'burgandy', 'pink']}

In [0]:
for key in color_dict:
  color_dict[key] = len(color_dict[key])
  
color_dict

{'blue': 4, 'green': 4, 'red': 4}

## Booleans/Logical operators

| Operator | Purpose |
| --- | --- |
| `==` | equality |
| `!=` | inequality |
| `<` | less than |
| `>` | greater than |
| `<=` | less than or equal |
| `>=` | greater than or equal |
| `and` | boolean and |
| `&` | bitwise and |
| `or` | boolean or |
| <code>&#124;</code> | bitwise or |
| `not` | boolean not |
| `~` | bitwise not |
| `is` | identity (both names refer to the same object) |
| `is not` | non-identity (names refer to different objects) |
| `in` | is in iterable |
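
A brief sketch of how a few of these operators differ in practice. The variable names are illustrative; the key points are that `==` compares values while `is` compares object identity, and that `and`/`or` act on truth values while `&`/`|` act on the bits of integers.

```python
# Two distinct list objects with equal contents, plus an alias.
list_a = [1, 2, 3]
list_b = [1, 2, 3]
list_c = list_a                      # second name for the same object

equal_values = list_a == list_b      # True: contents match
same_object = list_a is list_b       # False: two separate objects
aliased = list_a is list_c           # True: one object, two names

boolean_and = True and False         # False
bitwise_and = 6 & 3                  # 0b110 & 0b011 == 0b010 == 2
bitwise_or = 6 | 3                   # 0b110 | 0b011 == 0b111 == 7

membership = 'ball' in 'The ball bounces'   # True: substring test

print(equal_values, same_object, aliased)
print(boolean_and, bitwise_and, bitwise_or, membership)
```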


## Control flow & Iteration

`if`/`elif`/`else`
* `if` checks to see if a condition is satisfied
* `else` will always run if the condition for a preceding `if` is not satisfied
* `elif` combines `else` and `if`: it checks whether a condition is satisfied only if the preceding `if` (and any earlier `elif`) was not satisfied

`for` loops
* Does something a number of times while looping through an iterable
* Often combined with a `range` call to specify how many times a thing should be done
* `enumerate` allows us to capture a count simultaneously as we iterate through our iterable
* We can use tuple unpacking to capture multiple variables on each pass through an iterable
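
As a sketch of how these pieces combine, here is one common way to write the "FizzBuzz" challenge from the 1.2 objectives, followed by `enumerate` with tuple unpacking. The list `color_counts` is made up for illustration.

```python
results = []
for n in range(1, 16):               # range(1, 16) yields 1 through 15
    if n % 15 == 0:                  # divisible by both 3 and 5
        results.append('FizzBuzz')
    elif n % 3 == 0:
        results.append('Fizz')
    elif n % 5 == 0:
        results.append('Buzz')
    else:
        results.append(str(n))
print(results)

# enumerate yields (index, item); each item here is a tuple we unpack.
color_counts = [('red', 4), ('blue', 4)]
unpacked = []
for i, (color, count) in enumerate(color_counts):
    unpacked.append((i, color, count))
print(unpacked)
```

The `n % 15` check comes first because a number divisible by 15 would otherwise be caught by the `n % 3` branch.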

`while` loops
* Checks to see if a condition is satisfied
* Will continue to repeat contained code as long as the condition is satisfied, stopping once it is no longer satisfied

`try`/`except`
* `try` contains code that we hope to run
* `except` will run if the try code throws an error
* `finally` will always run after either the `try` or `except` completes (used less often) 
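
A minimal sketch of a `while` loop and a `try`/`except`/`finally` block. The names `countdown`, `ticks`, and `log` are illustrative only.

```python
# The loop body repeats while the condition is True.
countdown = 3
ticks = []
while countdown > 0:
    ticks.append(countdown)
    countdown -= 1                   # without this, the loop never ends
print(ticks)

log = []
for text in ['42', 'forty-two']:
    try:
        value = int(text)            # raises ValueError for non-numeric text
        log.append(value)
    except ValueError:
        log.append('could not convert')
    finally:
        log.append('checked')        # runs on success and on failure
print(log)
```

Naming the exception type (`ValueError` here) keeps the `except` block from hiding unrelated errors.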

## Defining functions

We use the `def` keyword to begin a function definition.

This is followed by the name of the function and its arguments in parentheses. The indented body then does something, and `return` hands the result back to the caller.

In [0]:
def sample_function(arg1, arg2='default'):
  sum_of_args =  arg1 + arg2
  return sum_of_args, arg1, arg2

In [0]:
sentence, word1, word2 = sample_function('thank ', 'you')

In [0]:
print(sentence)

thank you


## Quick demo on multiple assignment

In [0]:
a, b = 1, 2

In [0]:
a, b = b, a
print(a, b)

2 1


In [0]:
a = 1
b = 2
temp_a = a
a = b
b = temp_a
print(a, b)

2 1


## Lambda functions

`lambda` functions allow us to define throwaway functions inline with our code. We can assign `lambda` functions to variable names. They will often be useful in list comprehensions or for mapping and filtering.

In [0]:
square_this = lambda x: x**2

In [0]:
square_this(2)

4
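
For mapping and filtering, a `lambda` is typically passed straight to the built-in `map`, `filter`, or `sorted`. A small sketch, with made-up input lists:

```python
numbers = list(range(10))

# map applies the function to every element.
squares = list(map(lambda x: x**2, numbers))

# filter keeps the elements for which the function returns True.
evens = list(filter(lambda x: x % 2 == 0, numbers))

# A lambda can also serve as a sort key.
by_length = sorted(['bear', 'dog', 'horse'], key=lambda word: len(word))

print(squares)
print(evens)
print(by_length)
```

In Python 3, `map` and `filter` return lazy iterators, which is why each call is wrapped in `list`.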

## List comprehensions

List comprehensions let us collapse a `for` loop over an iterable into a single line of code, applying operations as we go through the iterable. They make for tight, neat code.

In [0]:
animals = ['bear', 'dog', 'cat', 'men']

In [0]:
animals_upper = [animal.upper() for animal in animals]
animals_upper

['BEAR', 'DOG', 'CAT', 'MEN']

In [0]:
[letter for animal in animals for letter in animal]

['b', 'e', 'a', 'r', 'd', 'o', 'g', 'c', 'a', 't', 'm', 'e', 'n']

In [0]:
letters = []
for animal in animals:
  for letter in animal:
    letters.append(letter)
    
letters

['b', 'e', 'a', 'r', 'd', 'o', 'g', 'c', 'a', 't', 'm', 'e', 'n']

In [0]:
first_10_numbers_squared = [square_this(i) for i in range(10)]
first_10_numbers_squared

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [0]:
square_evens = [x**2 for x in range(10) if x%2 == 0]
square_evens

[0, 4, 16, 36, 64]

In [0]:
square_odds_dict = {x:x**2 for x in range(10) if x%2 != 0}
square_odds_dict

{1: 1, 3: 9, 5: 25, 7: 49, 9: 81}


**Experiment**  : 
- _A procedure that can be repeated infinitely many times and has a well-defined set of outcomes._

  Anyone with the required resources can repeat it or conduct it in the same manner. A "well-defined set of outcomes" means the possible outcomes of the experiment are known. Random chance decides which outcome occurs on a given run, but the experimenter understands the general results the experiment can produce.

  Example: Rolling a 6-sided die once. The experiment's parameters are known, and it can be conducted by anyone with a (fair) 6-sided die. The experimenter also understands that the outcome can be 1, 2, 3, 4, 5, or 6, and that each of them is equally likely.

**Outcome**  :   _The result of a single run of an experiment. In a coin flip, this is either Heads or Tails. In the die-rolling example, this is any of the numbers $1, 2, 3, 4, 5, 6$._
  



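**Event**  :  Any collection of possible outcomes for an experiment.

Example: If my experiment is to roll a die 3 times in a row, each outcome is an ordered triple of rolls such as (1,1,5), (3,5,2), or (1,6,1). An event is a set of such outcomes, for example "all three rolls match", which is {(1,1,1), (2,2,2), ..., (6,6,6)}. A single outcome on its own, such as {(1,1,5)}, is also an event.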



**Sample Space**  :  The collection of ALL possible outcomes of an experiment.

In the example of rolling a die 3 times, the sample space is every ordered triple of rolls, from (1,1,1) and (1,1,2) through (6,6,6), for $6^3 = 216$ outcomes in total.
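
The three-roll sample space is small enough to enumerate directly. The sketch below assumes a fair die, so every ordered triple is equally likely; `all_same` is an illustrative event (all three rolls match), not one taken from the lesson.

```python
from itertools import product

faces = range(1, 7)

# Every ordered triple of rolls: (1, 1, 1), (1, 1, 2), ..., (6, 6, 6).
sample_space = list(product(faces, repeat=3))

# An event is a subset of the sample space.
all_same = [roll for roll in sample_space if roll[0] == roll[1] == roll[2]]

# With equally likely outcomes, P(event) = |event| / |sample space|.
prob_all_same = len(all_same) / len(sample_space)

print(len(sample_space), len(all_same), prob_all_same)
```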

**Random Variable** : A function of the outcome of an experiment. For example, we could roll two dice and add the results. The resulting single number, ranging from $2$ to $12$, would be a random variable. Typically a random variable is some function, usually arithmetical, applied to the outcome of the experiment.
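
To make the two-dice example concrete, the sketch below tabulates the distribution of the sum by enumerating all 36 outcomes. It assumes fair dice; `Fraction` is used only to keep the probabilities exact.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two dice.
outcomes = list(product(range(1, 7), repeat=2))

# The random variable maps each outcome (a, b) to the sum a + b.
counts = Counter(a + b for a, b in outcomes)

# Probability of each value of the sum.
pmf = {total: Fraction(count, len(outcomes))
       for total, count in sorted(counts.items())}

for total, prob in pmf.items():
    print(total, prob)
```

A sum of 7 is the most likely value because six of the 36 outcomes produce it.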

## Probability

The (_frequentist_) probability of some event $A$ is given by:

$$
P(A) = \lim_{n \rightarrow \infty} \frac{\text{number of times A occurs}}{n}
$$

We can estimate $P(A)$ by running some large number of trials and seeing how frequently $A$ occurs.
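
A minimal simulation sketch of this idea, taking $A$ to be "a fair die shows a 6" (an illustrative choice). The seed and trial count are arbitrary assumptions.

```python
import random

random.seed(0)          # fixed seed so the run is repeatable
n_trials = 100_000

# Count how often the event occurs across many simulated rolls.
hits = sum(1 for _ in range(n_trials) if random.randint(1, 6) == 6)

estimate = hits / n_trials
print(estimate)         # should land near 1/6, about 0.1667
```

Increasing `n_trials` tightens the estimate around the true probability, at the cost of a longer run.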

## Discrete vs. Continuous random variables

### Discrete
* All outcomes of the random variable can be counted (a finite or countably infinite list of values).
* The distribution of probabilities of each **specific** outcome is called the  **probability mass function (pmf)**.

### Continuous
* Have outcomes that are uncountable, such as every real number in an interval.
* The probabilities of **ranges of values** are calculated as areas under the **probability density function (pdf)**.

## PDF vs. CDF

To calculate the **cumulative distribution function (CDF)**, start at the minimum possible outcome and accumulate probability as you move to the right: add the probability of each outcome for a discrete variable, or integrate the density for a continuous one. Thus, each point on a cumulative distribution function represents the probability that a random variable is less than or equal to that value.

The difference in the height of the CDF between any two points on the x axis represents the probability that a random variable is in that range of values.

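For a discrete random variable, the $pmf$ is given as $f(x) = P(X = x)$.

For a continuous random variable, the $pdf$ $f$ gives probabilities as areas: $P(a \leq X \leq b) = \int_a^b f(x)\,dx$.

In both cases, the $cdf$ is given as $F(x) = P(X \leq x)$.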
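
For a discrete variable, the cumulative function is just a running total of the individual probabilities. A small sketch for one fair die, with exact fractions; the range $2 < X \leq 5$ is an arbitrary example.

```python
from fractions import Fraction
from itertools import accumulate

outcomes = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6          # P(X = x) for each face

# Running total: cdf[i] is P(X <= outcomes[i]).
cdf = list(accumulate(pmf))

# Difference in cdf heights gives a range: P(2 < X <= 5) = F(5) - F(2).
prob_3_to_5 = cdf[4] - cdf[1]

for x, F in zip(outcomes, cdf):
    print(x, F)
print(prob_3_to_5)
```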

**For the purposes of this course, there are few distinctions that need to be remembered between PDF vs PMF or CDF vs CMF**. Density functions are for continuous variables, while mass functions are for discrete.

**However**, at any exact point in a continuous distribution there is zero likelihood of observing that exact value. Hence the differing names: one represents the probability of a range of values, while the other represents the probability of specific outcomes.

## Discrete Distributions

### Discrete uniform distribution
Used when we have a **discrete set of outcomes** and **each outcome is equally likely**.
* all of the outcomes have the same probability.
* probability histogram is uniform (flat).
  * Example:
    * sample space $= \{1, 2, \dots, n\}$
    * pmf: $f(x) = 1/n$ for each $x$ in the sample space.
    * cdf: $F(x) = x/n$ for each $x$ in the sample space.

### Bernoulli distribution
- When your outcome is binary (i.e., two outcomes, say, `1 = success` and `0 = failure`)
- When there is a constant probability of success $p$.
  - Example
    - sample space = $\{0, 1\}$
    - pmf: $f(0) = 1-p$ and $f(1) = p$ for some number $0\le p \le 1$. If you're thinking of coin flipping and the coin is fair, then $p=\frac{1}{2}$.
    - cdf:  
   \begin{align*}
      F(x) &= 0 \textrm{ if } x < 0,\\
      F(x) &= 1-p \textrm{ if } 0\le x < 1, \\
      F(x) &= 1 \textrm{ if } x\ge 1
    \end{align*}

### Binomial distribution
- when you have fixed $n$ independent Bernoulli trials.

More explicitly:

- when you have fixed $n$ trials,
- each trial is independent of one another,
- when you have a constant probability of success $p$, and
- when you have a binary outcome.
  - Example
    - sample space: length $n$ sequences of $\{0,1\}$, one entry per Bernoulli trial as in the example above; the random variable $k$ counts the successes, so $k \in \{0, 1, \dots, n\}$
    - pmf : $f(k) = {n \choose k}p^k(1-p)^{n-k}$
    - cdf: $F(k) = \sum\limits_{i=0}^k f(i) =  \sum\limits_{i=0}^k {n \choose i}p^i(1-p)^{n-i}$
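
The binomial formulas translate directly into code. This is a sketch using only the standard library (`math.comb` needs Python 3.8 or later); the function names `binomial_pmf` and `binomial_cdf` are my own, and the coin-flip numbers are an illustrative example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_cdf(k, n, p):
    """P(at most k successes): sum the pmf from 0 through k."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

# Example: 4 flips of a fair coin.
print(binomial_pmf(2, 4, 0.5))   # 6/16 = 0.375
print(binomial_cdf(2, 4, 0.5))   # (1 + 4 + 6)/16 = 0.6875
```

If SciPy is available, `scipy.stats.binom` offers the same quantities through its `pmf` and `cdf` methods.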


### Poisson distribution
- when the number of successes is a non-negative integer,
- when events occur independently,
- when the rate at which events occur is constant,
- when two events cannot occur at exactly the same instant, and
- the probability of an event occurring in an interval is proportional to the length of the interval.  
  - Example, with mean $\lambda$
    - sample space: all non-negative integers $\{0, 1, 2, 3, \dots\}$ 
    - pmf : $f(k) = \frac{\lambda^ke^{-\lambda}}{k!}$
    - cdf: $F(k) =\sum\limits_{i=0}^k f(i) =  \sum\limits_{i=0}^k \frac{\lambda^ie^{-\lambda}}{i!}$
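
A matching sketch for the Poisson formulas, again with the standard library only. The function names and the rate `lam = 2.0` are illustrative assumptions.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(exactly k events) when the mean number of events is lam."""
    return lam**k * exp(-lam) / factorial(k)

def poisson_cdf(k, lam):
    """P(at most k events): sum the pmf from 0 through k."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

lam = 2.0
print(poisson_pmf(0, lam))       # e**-2, about 0.1353
print(poisson_cdf(2, lam))       # 5 * e**-2, about 0.6767
```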



## Continuous Distributions

- Uniform Distribution on $[a,b]$
  - pdf: $f(x) =\frac{1}{b-a}$ for $a\le x \le b$
  - cdf: $F(x) = \frac{x-a}{b-a}$ for $a\le x \le b$
  - Just like in the discrete case, this distribution is used when all outcomes are equally likely. 
  
- Exponential Distribution with mean $\frac{1}{\lambda}$
  - pdf: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0,$ otherwise $f(x) = 0$.
  - cdf: $F(x) = 1-e^{-\lambda x}$ for $x \ge 0$, otherwise $F(x) = 0$.
  - Often used in cases when larger numbers occur with lower probability than smaller ones. This distribution has the memorylessness property, which is why it is used to describe things like the amount of time spent waiting for an event to occur (like waiting for a bus).
  
- Normal Distribution with mean $\mu$ and standard deviation $\sigma$
  - pdf: $f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
  - cdf: $F(x)= \int_{-\infty}^x \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(t-\mu)^2}{2\sigma^2}}\,dt$
  - The famous 'bell curve'. Commonly used to describe measurements of numerical quantities sampled from real-life populations, including height, weight, etc.
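
The three cdfs above can be sketched with the standard library. The function names are my own. The Normal cdf has no closed form, so this version uses the standard identity $F(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]$. The values of `lam`, `s`, and `t` are arbitrary.

```python
from math import erf, exp, sqrt

def uniform_cdf(x, a, b):
    """P(X <= x) for X uniform on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

def exponential_cdf(x, lam):
    """P(X <= x) for X exponential with rate lam (mean 1/lam)."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X normal, written with the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Memorylessness: P(X > s + t | X > s) equals P(X > t).
lam, s, t = 0.5, 2.0, 3.0
conditional = (1 - exponential_cdf(s + t, lam)) / (1 - exponential_cdf(s, lam))
unconditional = 1 - exponential_cdf(t, lam)
print(conditional, unconditional)

# Probability of landing within one standard deviation of the mean.
within_one_sd = normal_cdf(1) - normal_cdf(-1)
print(within_one_sd)             # about 0.6827
```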
  