# Crash Course on Surviving the Math
Week of July 25, 2022

## First, a Minimal Intro to Python

In [1]:
# This is a Python comment, which starts with a '#' sign
# Below is Python Code

myValue = 3.1415
myVector = [1, 1, 2, 3, 5, 8, 13]

print("My scalar value is %s" % myValue)
print("My vector has the list of values: %s" % myVector)

My scalar value is 3.1415
My vector has the list of values: [1, 1, 2, 3, 5, 8, 13]


## Chapter 4 - Linear Algebra

### The Second Most Important Equation in Machine Learning

There are [many different 'distance' measures](https://towardsdatascience.com/9-distance-measures-in-data-science-918109d069fa), but the most common is the Euclidean distance, which we already know under other names and in simpler forms (for example, the Pythagorean theorem):

![Euclidean Distance](https://miro.medium.com/max/844/0*wv6oFAVd0_PQ50mX)

... but in the end, these are all forms of the [Minkowski Distance](https://en.wikipedia.org/wiki/Minkowski_distance): 

![Minkowski Distance](https://miro.medium.com/max/902/0*UbbyH2MUPb5ZBa64)

The value of 'p' transforms the equation from Manhattan distance (p=1, a.k.a. L1-Norm) to Euclidean distance (p=2, a.k.a. L2-Norm), or [Chebyshev distance](https://en.wikipedia.org/wiki/Chebyshev_distance) as p approaches infinity. For fractional values of 'p' where 0 < p < 1, the result is no longer a true metric (the triangle inequality fails), but this 'Agrawal' distance can still be useful for mitigating the [Curse of Dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality).

![Variations of P](https://upload.wikimedia.org/wikipedia/commons/thumb/0/00/2D_unit_balls.svg/967px-2D_unit_balls.svg.png)

### In Python
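
Here is a minimal sketch of the Minkowski distance in plain Python, showing how the choice of 'p' selects the Manhattan, Euclidean, Chebyshev, or fractional variant. The function name `minkowski_distance` and the two sample points are assumptions made up for illustration; they are not from any particular library.

```python
import math

def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Absolute difference along each coordinate
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(p):
        # Limit as p approaches infinity: Chebyshev distance
        return max(diffs)
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(d ** p for d in diffs) ** (1.0 / p)

# Hypothetical sample points (a 3-4-5 right triangle)
point_a = [0, 0]
point_b = [3, 4]

manhattan = minkowski_distance(point_a, point_b, 1)
euclidean = minkowski_distance(point_a, point_b, 2)
chebyshev = minkowski_distance(point_a, point_b, math.inf)
fractional = minkowski_distance(point_a, point_b, 0.5)

print("Manhattan  (p=1):   %s" % manhattan)
print("Euclidean  (p=2):   %s" % euclidean)
print("Chebyshev  (p=inf): %s" % chebyshev)
print("Fractional (p=0.5): %s" % fractional)
```

For the same pair of points the distance shrinks as 'p' grows: the fractional value is the largest, then Manhattan, then Euclidean, with Chebyshev the smallest.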

## Chapter 5 - Statistics


### Continuous Probability Distributions


### Discrete Probability Distributions


### Statistics and Randomness in Python



## Chapter 6 - Probability

### Joint Probability

### Priors

### Bayes Formula (or the Third Most Important Formula in Machine Learning)
Devised by Thomas Bayes in the 1700s, it is now critical for predictive modeling and analysis:

![Bayes Formula](https://wikimedia.org/api/rest_v1/media/math/render/svg/c1a7279a1639d92d751e0f2d3aa54e62a2ddb1e8)

In its most common use, the formula takes how likely an outcome seems at first and reweights it in light of other information:

![BayesUse](https://wikimedia.org/api/rest_v1/media/math/render/svg/b01f679001d8f19c6c6036f1ac66ca3c3f400258)
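
As a small worked example, the sketch below applies Bayes' formula, P(A|B) = P(B|A) P(A) / P(B), with P(B) expanded by the law of total probability. The scenario (a diagnostic test) and all three input numbers are hypothetical, chosen only for illustration, and `bayes_posterior` is an assumed helper name.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A|B) given P(A), P(B|A), and P(B|not A)."""
    # P(B) via the law of total probability
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers, for illustration only
prior = 0.01                # P(condition): 1% of people have it
sensitivity = 0.95          # P(positive test | condition)
false_positive_rate = 0.05  # P(positive test | no condition)

posterior = bayes_posterior(prior, sensitivity, false_positive_rate)
print("P(condition | positive test) = {:.3f}".format(posterior))
```

Even with a seemingly accurate test, the posterior stays well below 50% here because the prior is so small. That gap between intuition and the weighted result is the point of the formula.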

### The Monty Hall Problem
Imagine you are on a game show with three doors. One hides a good prize; the other two hide a bad prize (a goat). You choose one:

![Monty Hall Step 1](https://upload.wikimedia.org/wikipedia/commons/thumb/3/3f/Monty_open_door.svg/220px-Monty_open_door.svg.png)

The host then opens one of the doors, subject to two rules:

1) The Host will NOT reveal the door you chose first
2) The Host will NOT reveal the prize

The host then asks if you want to keep your original door choice, or switch to the remaining door. This is your final choice and you will get the prize (or goat) behind your final decision. What choice should you make to maximize your probability of winning the prize?

![Monty Hall Step 2](https://upload.wikimedia.org/wikipedia/commons/thumb/4/41/Monty_Hall_Problem_-_Standard_probabilities.svg/330px-Monty_Hall_Problem_-_Standard_probabilities.svg.png)
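
The probabilities in the figure can also be derived by brute-force enumeration of every case. The sketch below assumes the prize location and the first pick are each uniform over the three doors, and that the host picks uniformly at random when two goat doors are available to open. The helper name `exact_win_probability` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

DOORS = [1, 2, 3]

def exact_win_probability(switch):
    """Exact chance of winning for a fixed stay/switch strategy."""
    total = Fraction(0)
    # Every (prize, first pick) pair is equally likely
    for prize, pick in product(DOORS, DOORS):
        weight = Fraction(1, len(DOORS) ** 2)
        # The host may open neither my door nor the prize door
        host_options = [d for d in DOORS if d != pick and d != prize]
        for opened in host_options:
            # Assumption: host chooses uniformly among allowed doors
            branch = weight / len(host_options)
            if switch:
                # Switch to the one door that is neither mine nor opened
                final = next(d for d in DOORS if d != pick and d != opened)
            else:
                final = pick
            if final == prize:
                total += branch
    return total

stay_probability = exact_win_probability(switch=False)
switch_probability = exact_win_probability(switch=True)

print("Stay:   %s" % stay_probability)    # Stay:   1/3
print("Switch: %s" % switch_probability)  # Switch: 2/3
```

Using `Fraction` keeps the arithmetic exact, so the result is 1/3 versus 2/3 with no rounding.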

Let's check the Monty Hall solution with a Monte Carlo simulation (no relation):

In [4]:
import random

# Let D = number of doors in the game
D=3

# Run this many trials (a large number for sufficient statistics)
TRIALS=5000

# These will be our two strategies
SWITCH_STRATEGY=True
STAY_STRATEGY=False

# Define a single instance/execution of the game
def monteHallGame(strategy):
    # Create D Doors
    door_list = [x+1 for x in range(D)]
    # Place the Prize
    door_with_prize = random.randint(1, D)
    # Choose first door
    door_I_choose  = random.randint(1, D)
    # Remove all but two doors, keeping with the rules of the game
    remaining_doors_list = [door_I_choose]
    if door_I_choose == door_with_prize:
        # I already hold the prize, so the other door left closed is a
        # randomly chosen goat door (any door that is NOT my own)
        other_doors = [door for door in door_list if door != door_I_choose]
        remaining_doors_list.append(random.choice(other_doors))
    else:
        # Otherwise the other door left closed must be the prize door
        remaining_doors_list.append(door_with_prize)

    if strategy == SWITCH_STRATEGY:
        # If switching, remove my door from choices
        remaining_doors_list.remove(door_I_choose)
        # Then my final choice is the remaining door
        door_I_choose = remaining_doors_list[0]

    # Return True (1) if we chose the prize
    return door_with_prize == door_I_choose

# Manage multiple independent experiments of the game
def monteCarlo(strategy):
    winCount = 0
    for i in range(TRIALS):
        winCount += monteHallGame(strategy)
    return winCount


switch_wins = monteCarlo(SWITCH_STRATEGY)
stay_wins   = monteCarlo(STAY_STRATEGY)

print('Results with %s doors and %s trials' % (D,TRIALS))
print('Proportion of wins without switching: {:.2f}%'.format(100.0*stay_wins/TRIALS))
print('Proportion of wins with switching: {:.2f}%'.format(100.0*switch_wins/TRIALS))

Because the trials are random, the exact figures change from run to run, but with 3 doors staying wins close to 1/3 of the time (about 33%) and switching wins close to 2/3 of the time (about 67%).


... but if you still don't believe it, here are some different perspectives:

* From the movie '21': ['21' Movie depiction of Monty Hall](https://youtu.be/iBdjqtR2iK4)
* My favorite [explanation by Numberphile](https://youtu.be/4Lb-6rxZxx0)
* A less mathematical but very historical view [as explained by Vox](https://youtu.be/ggDQXlinbME)
* An entertaining [explanation by VSauce](https://youtu.be/TVq2ivVpZgQ) ... this guy is weird.