**Adapted from https://gist.github.com/kenjyco/69eeb503125035f21a9d**


## Quick note about Jupyter cells

After editing a cell in a Jupyter notebook, re-run it by pressing **`<Shift> + <Enter>`**. Re-running makes your changes available to other cells.

Use **`<Enter>`** to make new lines inside a cell you are editing.

#### Code cells

Re-running will execute any statements you have written. To edit an existing code cell, click on it.

#### Markdown cells

Re-running will render the markdown text. To edit an existing markdown cell, double-click on it.

<hr>

## Common Jupyter operations

Near the top of the Jupyter notebook window, there is a row of menu options (`File`, `Edit`, `View`, `Insert`, ...) and a row of toolbar icons (disk, plus sign, scissors, 2 files, clipboard and file, up arrow, ...).

#### Inserting and removing cells

- Use the "plus sign" icon to insert a cell below the currently selected cell
- Use "Insert" -> "Insert Cell Above" from the menu to insert above

#### Clear the output of all cells

- Use "Kernel" -> "Restart" from the menu to restart the kernel
    - click on "clear all outputs & restart" to have all the output cleared

#### Save your notebook file locally

- Clear the output of all cells
- Use "File" -> "Download as" -> "IPython Notebook (.ipynb)" to download a notebook file representing your https://mybinder.org session

<hr>

## Tips and tricks for .ipynb and VSCode

* Add a question mark after a command (for example, `np.linspace?`) to bring up the documentation that describes what it does
* To comment or uncomment blocks of code in Jupyter: **`CMD + /`** (`Ctrl` instead of `CMD` on Windows and Linux)
* To indent or unindent blocks of code: **`CMD + ]`** or **`CMD + [`**
* **`ESC + L`** to number each line of code
* To select multiple occurrences of a word and edit them simultaneously in VSCode, highlight the word, press `CMD + D` once for each additional occurrence, then edit

In [None]:
def some_code(x, y):
    """
    Comment the first three solutions out and fix indentation
    """
        solution = x + y
        solution = x - y
        solution = x * y
    solution = np.sqrt(x**2 + y**2)
    return solution

In [None]:
# multiple variable assignments
mean, std = 1, 2
print(mean, std)

## NumPy

Great for simple numerical calculations and manipulating data structures like arrays and vectors.

### Some methods on list objects

- **`.append(item)`** to add a single item to the list
- **`.extend([item1, item2, ...])`** to add multiple items to the list
- **`.remove(item)`** to remove a single item from the list
- **`.pop()`** to remove and return the item at the end of the list
- **`.pop(index)`** to remove and return an item at an index
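The methods above can be tried on a small list. This is a minimal sketch; the list `items` and its values are made up for illustration.

```python
# Hypothetical example list; the values are arbitrary.
items = [1, 2, 3]

items.append(4)          # add one item to the end
items.extend([5, 6])     # add several items to the end
items.remove(2)          # remove the first occurrence of the value 2
last = items.pop()       # remove and return the last item
first = items.pop(0)     # remove and return the item at index 0

print(items, last, first)
```

All of these methods change the list in place. Only `.pop()` returns something useful; the others return `None`.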

In [None]:
# Manipulating lists and arrays is important for working with any type of data

import numpy as np

array1 = np.array([0, 1, 2, 3, 4, 5])
type(array1)
# numpy arrays are really useful for vectorization and matrix operations

In [None]:
# f-strings are very useful for debugging and printing in general
print(f'Array1 Before: {array1}')
array1 = np.append(array1, 15) # np.append is a function: it returns a new array, so reassign the result
print(f'Array1 After: {array1}')

In [None]:
list1 = list(range(5))
print(f'list1 Before: {list1}')
list1.append(15) # .append is a method: it modifies the list in place and returns None
# list1 = list1.append(15) would set list1 to None
print(f'list1 After: {list1}')

In [None]:
# Queues - First in First Out (FIFO)
list1 = list(range(5))
print(f'list1 Before: {list1}')
list1.pop(0)
print(f'list1 After: {list1}')

# append and pop(0) are great for turning your data structure into a Queue
# Check out deques (double-ended queues), a great data structure for LeetCode questions
# https://www.geeksforgeeks.org/deque-in-python/
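
As a rough sketch of the deque idea mentioned above, the same FIFO behavior can be written with `collections.deque` from the standard library. The variable names `queue` and `served` are only illustrative.

```python
from collections import deque

queue = deque(range(5))     # deque([0, 1, 2, 3, 4])
queue.append(15)            # enqueue at the right end
served = queue.popleft()    # dequeue from the left end

print(served, list(queue))
```

`list.pop(0)` has to shift every remaining element, while `deque.popleft()` runs in constant time, which is why a deque is usually preferred for queues.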

In [None]:
ls = np.linspace(0, 100, 51) # note that the dtype is float; pass the dtype parameter to np.linspace to change it
#np.arange()
print(ls)
print(type(ls))
#?np.linspace

In [None]:
list2 = list(range(6))
array2 = np.array([0, 1, 2, 3, 4, 5])

print(f'list2 Before: {list2}')
list2 = 2*list2
print(f'list2 After: {list2}')

print() ###################

print(f'array2 Before: {array2}')
array2 = 2*array2
print(f'array2 After: {array2}')
# this is why arrays are great for linear algebra and playing with data (matrices)

In [None]:
print(np.e, np.pi, np.sin(np.pi))

# notice what np.sin(np.pi) returns, can you think why?

## Matplotlib

Great for making plots in Python. Typical plots include histograms, line plots, and scatter plots.

In [None]:
from matplotlib import pyplot as plt
# OR import matplotlib.pyplot as plt

In [None]:
x = np.linspace(0, 2 * np.pi, 400)
y = np.sin(x) # transform x
print(x[:10])

In [None]:
plt.figure()
plt.plot(x, y)

plt.xlabel("x")
plt.ylabel("y")
plt.title("y vs x")

In [None]:
fig, axs = plt.subplots(2)
fig.suptitle('Vertically stacked subplots')
axs[0].plot(x, y)
axs[1].plot(x, -y)

In [None]:
# Plot, labels, and legends
# you might have to call plt.show() if the plots are not showing

In [None]:
fig, (ax1, ax2) = plt.subplots(2, figsize = (5,5))
fig.suptitle('Vertically stacked subplots')
ax1.plot(x, y)
ax2.plot(x, -y)

In [None]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10,5))
fig.suptitle('Horizontally stacked subplots')
ax1.plot(x, y)
ax2.plot(x, -y)

In [None]:
Z = np.random.normal(0, 1, 100)
#plt.hist(Z, bins = 100)
plt.scatter(Z, Z**2)

## Pandas

Great for reading, manipulating, and preparing data.

In [None]:
import pandas as pd

data = pd.read_csv('SPY.csv')
data

In [None]:
data.head()
# data.tail()

In [None]:
data.plot('Date', 'Close')

In [None]:
# daily open-to-close returns: (Close - Open) / Open
data['returns'] = (data['Close'] - data['Open']) / data['Open']
data
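
The cell above measures the move from open to close within each day. Returns are also often measured from one day's close to the next. The sketch below contrasts the two on a tiny made-up table. The `prices` DataFrame and its numbers are assumptions for illustration, not real SPY data.

```python
import pandas as pd

# Made-up prices for illustration only; not real SPY data.
prices = pd.DataFrame({
    "Open":  [100.0, 102.0, 101.0],
    "Close": [102.0, 101.0, 103.02],
})

# Open-to-close return within each day (same formula as the cell above)
prices["intraday"] = (prices["Close"] - prices["Open"]) / prices["Open"]

# Close-to-close return between consecutive days.
# The first row has no previous close, so its value is NaN.
prices["close_to_close"] = prices["Close"].pct_change()

print(prices)
```

Which definition to use depends on the question being asked; the two columns generally differ because a day's open is rarely equal to the previous day's close.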

In [None]:
# create a histogram of returns

plt.hist?

# create a figure, then do whatever you wish to make the plot pretty

## Resources

1. https://www.learndatasci.com/tutorials/applied-introduction-to-numpy-python-tutorial/
2. https://app.datacamp.com/learn
3. https://queirozf.com/entries/pandas-dataframe-plot-examples-with-matplotlib-pyplot

## Other libraries that are good to know

1. `seaborn` for data visualization
2. `scikit-learn` (imported as `sklearn`) for machine learning
3. `TensorFlow` / `PyTorch` for deep learning
4. Any financial data library to extract data (`yfinance`, `OpenBB`, etc.)

## Practice!

1. Create an array of the first 10 prime numbers. Then write a Python program that returns the differences between neighboring values ($n_{i+1} - n_{i}$).

`Ex. [5, 9, 7] should return [4, -2]`

2. Given a range of numbers (inclusive), return how many odd numbers there are.

3. Create a 2D $n$ by $n$ array where only the border of the matrix is one, and the rest are 0s.

4. Given a list of non-repeating numbers, find the missing number.

**Let's try to use the libraries we just learned!**

In [None]:
def primeDifference():
    """ q1
    [5, 9, 7] should return [4, -2]
    """
    #create array
    
    # hint: you should try to code it first, but there is a function that specifically does this
    # Stackoverflow, Google, ChatGPT are your best friends
    
    return 

primeDifference()

In [None]:
def oddInRange(a, b):
    """ q2
    Given [a, b], return number of odds in that range ([a, b] inclusive)
    Ex. [1, 4] should return 2. [0,4] returns 2. [10, 2003] returns 997. 
    """
    
    return 

In [None]:
def arrayBorder(n):
    """ q3
    Create a 2D n by n array where only the border of the matrix is one, and the rest are 0s.
    
    Hint: Many ways to do it, but for the sake of learning, use np.ones() and figure out what it does!
    """

    return

?np.ones

In [None]:
def missingNumber(miss):
    """ q4
    Given a list of non repeating numbers, find the missing number.
    [1,2,4] returns 3
    [3,2,1,7,5,6] returns 4 
    Hint: there is a really elegant way to do this - try to come up with it!
    """
    

    
    return

5. Two Sum: Classic LeetCode

https://leetcode.com/problems/two-sum/

Top data structures to know:
* Queues
* Stacks
* Hashmap (Dictionary)
* Deques
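
As a quick illustration of two items on this list, a plain Python list can act as a stack and a `dict` as a hashmap. This is only a sketch; the names `stack`, `counts`, and the sample strings are made up.

```python
# Stack: last in, first out (LIFO), using a plain list
stack = []
stack.append("a")    # push
stack.append("b")    # push
top = stack.pop()    # pop returns the most recently pushed item

# Hashmap: a dict mapping each key to a count
counts = {}
for word in ["spy", "qqq", "spy"]:
    counts[word] = counts.get(word, 0) + 1

print(top, stack, counts)
```

Dictionary lookups take constant time on average, which is what makes the hashmap approach attractive for problems like Two Sum.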

**Calculating probabilities**

Suppose the number of people who show up at a bus station follows a Poisson distribution with a mean of 2.5 per hour.
What is the probability that at most three people show up in a four-hour period?
$\mu = 2.5 \times 4 = 10$,
$k = 3$

In [None]:
from scipy.stats import poisson

# Poisson is a discrete distribution, so use the pmf and sum P(X = i) for i = 0..3

p = 0
count = 3

for i in range(count+1):
    p += poisson.pmf(k=i, mu=10)
p
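
As a cross-check on the loop above, the same sum can be written directly from the Poisson formula $P(X = i) = e^{-\mu}\mu^i / i!$ using only the standard library. This is a sketch for verification; the name `p_manual` is made up here.

```python
import math

mu = 10      # mean arrivals over four hours (2.5 per hour * 4)
count = 3    # "at most three"

# P(X <= count) = sum over i of exp(-mu) * mu**i / i!
p_manual = sum(
    math.exp(-mu) * mu**i / math.factorial(i)
    for i in range(count + 1)
)
print(p_manual)
```

The result should agree with `p` from the loop, and with `poisson.cdf(3, mu=10)`, which computes the cumulative probability in one call.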

**Use SPY.csv to create two histograms, one for returns and one for stock prices**

What do you think these distributions are? (Exponential, Normal, etc.)

In [None]:
# plt.hist