Intro to ML and DL

- Machine learning (ML) and deep learning (DL) are both forms of **artificial intelligence (AI)**.
- AI research began in the 1950s and 60s with attempts to simulate human reasoning and conversation.

![ELIZA](img/eliza.jpeg)

AI Overview

- AI is about building programs that can make decisions like a human.
- You'll hear a lot of hype about AI, but it's just code and data!  There's no magic.

In [23]:
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")

prompt = "Explain what AI is. Explanation:"
inputs = tokenizer.encode(prompt, return_tensors="pt")
# max_new_tokens is a generate() argument, not a decode() argument
output_ids = model.generate(inputs, max_new_tokens=128)
output = tokenizer.decode(output_ids[0])
print(output.replace(prompt, ""))

 AI is a computer program that processes data and makes decisions


Human Predictions

- We make predictions all the time
- Let's say we want to predict tomorrow's temperature
- We might look outside, and decide that tomorrow will be the same temperature as today

In [10]:
import pandas as pd
pd.read_csv("observations.csv", index_col=0)

date,tmax,prediction
2022-11-20,61.0,61.0
2022-11-21,60.0,60.0
2022-11-22,62.0,62.0
2022-11-23,67.0,67.0
2022-11-24,66.0,66.0
2022-11-25,70.0,70.0
2022-11-26,62.0,62.0


Testing Predictions

- Now, we want to know how good our predictions are
- We calculate the error of each prediction by taking the absolute difference between the prediction and the actual value

In [15]:
observations = pd.read_csv("observation_error.csv", index_col=0)
observations

date,tmax,prediction,tmax_tomorrow,error
2022-11-20,61.0,61.0,60.0,1.0
2022-11-21,60.0,60.0,62.0,2.0
2022-11-22,62.0,62.0,67.0,5.0
2022-11-23,67.0,67.0,66.0,1.0
2022-11-24,66.0,66.0,70.0,4.0
2022-11-25,70.0,70.0,62.0,8.0
2022-11-26,62.0,62.0,64.0,2.0
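
The `error` column above can be reproduced from the temperatures alone. This is a minimal sketch, not necessarily how `observation_error.csv` was built: it assumes `prediction` is simply today's `tmax`, and it fills the final `tmax_tomorrow` value (64.0) by hand from the table, since the day after the last row is not in the data.

```python
import pandas as pd

# The seven daily highs shown in the table above.
observations = pd.DataFrame(
    {"tmax": [61.0, 60.0, 62.0, 67.0, 66.0, 70.0, 62.0]},
    index=pd.date_range("2022-11-20", periods=7, freq="D"),
)

# Baseline rule: tomorrow's high will equal today's high.
observations["prediction"] = observations["tmax"]

# The actual next-day high. The last value (64.0) is copied from the table,
# because shifting alone cannot know the day after the final row.
observations["tmax_tomorrow"] = observations["tmax"].shift(-1).fillna(64.0)

# Absolute error: how far off each prediction was, ignoring direction.
observations["error"] = (
    observations["tmax_tomorrow"] - observations["prediction"]
).abs()

print(observations)
```

Taking the absolute value matters here: without it, a prediction that is 8 degrees too high and one that is 8 degrees too low would cancel out when averaged.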


Error Metric

- It can be hard to interpret many individual errors
- We usually average the errors to create a single error number
- This metric is called mean absolute error (MAE)

In [16]:
observations["error"].mean()

3.2857142857142856
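
Written out, the averaging step looks like this. The symbols are notation introduced here for illustration: $$y_i$$ is the actual temperature on day $$i$$, $$\hat{y}_i$$ is the prediction, and $$n$$ is the number of days.

$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

Plugging in the seven errors from the table:

$$\text{MAE} = \frac{1 + 2 + 5 + 1 + 4 + 8 + 2}{7} = \frac{23}{7} \approx 3.29$$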

Expert Systems

- Computers run human-generated rules to make predictions
- Temperature rules
    - Tomorrow's temperature will be the average of the last 5 days
    - If today is more than 5 degrees warmer than yesterday, add 2 to tomorrow's temperature

In [24]:
expert = pd.read_csv("expert_error.csv", index_col=0)
expert["error"].mean()

3.057142857142857
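
The two temperature rules can be written directly as code. This is a minimal sketch, not the code that produced `expert_error.csv`: the function name `expert_predict` is hypothetical, and it assumes "the last 5 days" includes today.

```python
def expert_predict(temps):
    """Predict tomorrow's high from a list of daily highs, oldest first.

    Assumes at least 5 values and that "the last 5 days" includes today.
    """
    today, yesterday = temps[-1], temps[-2]
    # Rule 1: start from the average of the last 5 days.
    prediction = sum(temps[-5:]) / 5
    # Rule 2: if today is more than 5 degrees warmer than yesterday, add 2.
    if today - yesterday > 5:
        prediction += 2
    return prediction


# First five highs from the observations table: no jump of more than 5 degrees.
steady = expert_predict([61.0, 60.0, 62.0, 67.0, 66.0])

# Hypothetical values with a 10-degree jump on the last day, so rule 2 fires.
jump = expert_predict([60.0, 60.0, 60.0, 60.0, 70.0])

print(steady, jump)
```

Every threshold in the function (5 days, 5 degrees, add 2) had to be chosen by a person, which is the effort that machine learning tries to remove.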

Machine Learning

- An algorithm automatically makes the rules for predictions
- Much less effort than expert systems

![tree](img/tree.svg)

Linear regression

- One of the simplest forms of a machine learning algorithm
- Learn a linear relationship between inputs and predictions
- The equation is $$y=mx+b$$, where $$x$$ is the input and $$y$$ is the prediction. The model learns the slope $$m$$ and the intercept $$b$$ automatically.
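
To see what "learning $$m$$ and $$b$$" means, here is a minimal sketch of ordinary least squares for a single input, written in plain Python. It is an illustration only, not scikit-learn's implementation; the function name `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (m, b) that minimize squared error for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    m = cov_xy / var_x
    # Intercept: makes the line pass through the point of means.
    b = mean_y - m * mean_x
    return m, b


# Hypothetical points that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
m, b = fit_line(xs, ys)

print(m, b)
```

Nobody told the function that the slope was 2 or the intercept was 1. Both values come out of the data, which is the sense in which the rules are made automatically.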

In [43]:
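from sklearn.linear_model import LinearRegression
weather = pd.read_csv("clean_weather.csv", index_col=0)
lr = LinearRegression()
# Train on every row up to 2022-11-20. Label slices include both endpoints,
# so 2022-11-20 itself also appears in the test set below.
train = weather[:"2022-11-20"]

# Inputs: the tmax and rain columns. Target: tomorrow's high.
lr.fit(train[["tmax", "rain"]], train["tmax_tomorrow"])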

In [44]:
test = weather["2022-11-20":]
preds = lr.predict(test[["tmax", "rain"]])

preds

array([62.02506267, 61.21379293, 62.83633241, 66.89268113, 66.08141139,
       69.32649036, 62.83633241])

In [45]:
(test["tmax_tomorrow"] - preds).abs().mean()

2.896623574188308

Improving accuracy

- We reduce error by giving the model better features to make predictions with
- For example, we can add columns with:
    - The average temperature in the last 5 days
    - The ratio between today's temperature and the average

In [51]:
weather["avg_temp"] = weather["tmax"].rolling(5).mean()
weather["temp_ratio"] = weather["tmax"] / weather["avg_temp"]
weather = weather.dropna()

train = weather[:"2022-11-20"]
lr.fit(train[["tmax", "rain", "avg_temp", "temp_ratio"]], train["tmax_tomorrow"])

In [52]:
test = weather["2022-11-20":]
preds = lr.predict(test[["tmax", "rain", "avg_temp", "temp_ratio"]])
(test["tmax_tomorrow"] - preds).abs().mean()

2.730178610842679
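
To make the two new features concrete, this sketch computes them on the seven daily highs from the observations table instead of the full `clean_weather.csv`. The variable name `features` is introduced here for illustration.

```python
import pandas as pd

# The seven daily highs from the observations table.
tmax = pd.Series(
    [61.0, 60.0, 62.0, 67.0, 66.0, 70.0, 62.0],
    index=pd.date_range("2022-11-20", periods=7, freq="D"),
    name="tmax",
)

features = tmax.to_frame()
# 5-day rolling mean: undefined (NaN) until 5 values are available.
features["avg_temp"] = features["tmax"].rolling(5).mean()
# A ratio above 1 means today is warmer than the recent average.
features["temp_ratio"] = features["tmax"] / features["avg_temp"]

# Drop the first 4 rows, which have no rolling average yet.
features = features.dropna()

print(features)
```

The `dropna()` call is needed because `LinearRegression.fit` does not accept missing values, and a 5-day window leaves the first four rows without an average.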

Deep Learning

- ML uses features to automatically make rules
- If we want to reduce error with ML, we usually have to add more features
- Deep learning automatically makes the features and the rules!