# Introduction to Machine Learning

## Classic Artificial Intelligence

Classic AI attempts to codify the rules that a human would use to make decisions.

*Example*: If you are trying to build a system that finds all the proper nouns in a text document, you might hard-code the following rules:
- If a word is a proper noun, then the first letter of the word is capitalized.
- The first letter of a sentence is always capitalized.
- ...

The system can deduce new rules from existing rules, e.g.,
- It is impossible to tell whether the first word of a sentence is a proper noun just from capitalization.
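
The hard-coded rules listed earlier can be turned into a small hand-written detector. The following is a minimal sketch, not a complete proper-noun finder: the function name `find_proper_nouns` and the naive sentence splitting on `.`, `!`, and `?` are assumptions made for illustration.

```python
import re

def find_proper_nouns(text):
    """Flag capitalized words that do not start a sentence (hand-coded rules)."""
    proper_nouns = []
    # Naive sentence splitting: break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[A-Za-z']+", sentence)
        # Skip the first word: it is capitalized whether or not it is a proper noun.
        for word in words[1:]:
            if word[0].isupper():
                proper_nouns.append(word)
    return proper_nouns

print(find_proper_nouns("Yesterday we flew to Paris. Alice met us at the airport."))
```

Here `Paris` is found but `Alice` is missed, because capitalization alone cannot tell us whether the first word of a sentence is a proper noun.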

**Pros:** The model is super interpretable! <br />
**Cons:** For complex tasks, there are too many rules, and we can't anticipate them all.

## What is Machine Learning?

*Exercise:* Pair up with the person sitting next to you. One of you will be an Earthling, the other a Martian. <br />
*Moral:* We often learn by seeing examples.

Rather than trying to come up with the rules ourselves, we can learn the rules from data. This is the essence of **machine learning**.

**Learning** refers to the act of coming up with a rule for making decisions based on a set of inputs.

Inputs: $\mathrm{x}$ <br />
Decision: $y$

Goal of Machine Learning:
Come up with a rule $f$ that maps inputs $\mathrm{x}$ to decisions $y$, using **training data** $(\mathrm{x}_{i}, y_i)$

The decision $y$ is typically called the **target** or the **label**.
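
To make "coming up with a rule $f$ from training data" concrete, here is a minimal sketch. Everything in it is an assumption chosen for illustration: the toy data, the name `learn_threshold_rule`, and the restriction to rules of the form "predict 1 if $x \geq t$". Real learning algorithms search over much richer families of rules.

```python
def learn_threshold_rule(training_data):
    """Return a rule f(x) learned from (x_i, y_i) pairs with labels 0 or 1."""
    xs = sorted(x for x, _ in training_data)
    best_t, best_errors = None, None
    # Try each observed x value as a threshold and keep the one
    # that makes the fewest mistakes on the training data.
    for t in xs:
        errors = sum((x >= t) != bool(y) for x, y in training_data)
        if best_errors is None or errors < best_errors:
            best_t, best_errors = t, errors

    def f(x):
        return int(x >= best_t)

    return f

# Made-up training data: x is a single number, y is the label.
training_data = [(1.0, 0), (2.0, 0), (3.5, 1), (4.0, 1)]
f = learn_threshold_rule(training_data)
print(f(0.5), f(5.0))
```

The rule `f` was never written by hand; it was chosen because it agrees with the examples.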

## A Controversy in the Wine World

In 1991, Orley Ashenfelter predicted that the 1986 vintage of Bordeaux wines would be disappointing.

He did this without tasting a drop of the wine.

Wine critics were outraged.

Robert Parker had predicted that the 1986 vintage would be "very good and sometimes exceptional" based on tasting an early sample.

## How did Ashenfelter make this prediction?

Ashenfelter collected data on summer temperature and winter rainfall in Bordeaux from 1952 to 1991.

The quality of a wine only becomes apparent after about 10 years. So for vintages up to 1980, he also collected their price, which serves as the measure of quality.

In [None]:
import pandas as pd

df = pd.read_csv('data/bordeaux.csv', index_col="year")
df

# x = summer temp, winter rainfall, harvest rainfall, age
# y = price

| year | price | summer | har | sep | win | age |
|------|-------|--------|-----|------|-----|-----|
| 1952 | 37.0 | 17.1 | 160 | 14.3 | 600 | 40 |
| 1953 | 63.0 | 16.7 | 80 | 17.3 | 690 | 39 |
| 1955 | 45.0 | 17.1 | 130 | 16.8 | 502 | 37 |
| 1957 | 22.0 | 16.1 | 110 | 16.2 | 420 | 35 |
| 1958 | 18.0 | 16.4 | 187 | 19.1 | 582 | 34 |
| 1959 | 66.0 | 17.5 | 187 | 18.7 | 485 | 33 |
| 1960 | 14.0 | 16.4 | 290 | 15.8 | 763 | 32 |
| 1961 | 100.0 | 17.3 | 38 | 20.4 | 830 | 31 |
| 1962 | 33.0 | 16.3 | 52 | 17.2 | 697 | 30 |
| 1963 | 17.0 | 15.7 | 155 | 16.2 | 608 | 29 |
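
The comments in the cell above describe which columns act as inputs and which as the label. Below is a minimal sketch of that split, using only the ten rows displayed above so that it runs without `data/bordeaux.csv`. The variable names `train`, `to_predict`, `X`, and `y` are assumptions for illustration, as is the reading of `win` as winter rainfall and `har` as harvest rainfall.

```python
import pandas as pd

# The ten rows shown above, re-entered by hand.
df = pd.DataFrame(
    {
        "price": [37.0, 63.0, 45.0, 22.0, 18.0, 66.0, 14.0, 100.0, 33.0, 17.0],
        "summer": [17.1, 16.7, 17.1, 16.1, 16.4, 17.5, 16.4, 17.3, 16.3, 15.7],
        "har": [160, 80, 130, 110, 187, 187, 290, 38, 52, 155],
        "sep": [14.3, 17.3, 16.8, 16.2, 19.1, 18.7, 15.8, 20.4, 17.2, 16.2],
        "win": [600, 690, 502, 420, 582, 485, 763, 830, 697, 608],
        "age": [40, 39, 37, 35, 34, 33, 32, 31, 30, 29],
    },
    index=pd.Index(
        [1952, 1953, 1955, 1957, 1958, 1959, 1960, 1961, 1962, 1963], name="year"
    ),
)

# Vintages with a known price can serve as training data;
# vintages whose price is missing are the ones we want to predict.
train = df[df["price"].notnull()]
to_predict = df[df["price"].isnull()]

# Inputs x and label y, following the comments in the cell above.
X = train[["summer", "win", "har", "age"]]
y = train["price"]
print(X.shape, y.shape)
```

All ten rows shown here have a price, so `to_predict` is empty for this excerpt; in the full data set it would hold the more recent vintages.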


## Visualizing the Data

In [7]:
import plotly.express as px

px.scatter(df[~df['price'].isnull()],
           x='win',
           y='summer',
           color='price')

In [6]:
import plotly.graph_objects as go

fig1 = px.scatter(df[~df['price'].isnull()],
           x='win',
           y='summer',
           color='price')
fig2 = px.scatter(df[df['price'].isnull()],
                  x='win', y='summer',
                  symbol_sequence=['circle-open'])

go.Figure(data=fig1.data + fig2.data, layout=fig1.layout)

What would you predict is the quality of the 1986 wine?

<u>Insight:</u> The "closest" wines are low quality, so the 1986 vintage is probably low quality as well.

This is the intuition behind $k$**-nearest neighbors**.
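
A minimal sketch of this idea as $k$-nearest neighbors regression, using only the `win`, `summer`, and `price` values of the ten vintages displayed earlier. Several things here are assumptions for illustration rather than Ashenfelter's method: the function name `knn_predict`, the choice $k = 3$, the standardization of each feature before measuring Euclidean distance, and the query point, which is a made-up vintage and not the actual 1986 weather.

```python
import numpy as np

win = np.array([600, 690, 502, 420, 582, 485, 763, 830, 697, 608], dtype=float)
summer = np.array([17.1, 16.7, 17.1, 16.1, 16.4, 17.5, 16.4, 17.3, 16.3, 15.7])
price = np.array([37.0, 63.0, 45.0, 22.0, 18.0, 66.0, 14.0, 100.0, 33.0, 17.0])

X_train = np.column_stack([win, summer])

def knn_predict(X_train, y_train, x_new, k=3):
    """Predict y for x_new as the mean label of its k nearest training points."""
    # Standardize both features so that rainfall (values in the hundreds)
    # does not dominate temperature (values spanning about two units).
    mean, std = X_train.mean(axis=0), X_train.std(axis=0)
    Z_train = (X_train - mean) / std
    z_new = (np.asarray(x_new, dtype=float) - mean) / std
    # Euclidean distance from the new point to every training point.
    distances = np.sqrt(((Z_train - z_new) ** 2).sum(axis=1))
    nearest = np.argsort(distances)[:k]
    return y_train[nearest].mean()

# Hypothetical query point (not the real 1986 values).
print(knn_predict(X_train, price, x_new=[600, 16.5], k=3))
```

Without the standardization step, distances would be driven almost entirely by `win`, so the "closest" wines would ignore summer temperature.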

## Types of Machine Learning Problems

Machine learning problems are grouped into two types, based on the type of $y$:

**Regression:** The label $y$ is quantitative. <br />
**Classification:** The label $y$ is categorical.
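
The type of $y$ changes how a prediction is formed. As a minimal sketch, assuming a nearest-neighbors style method and made-up neighbor labels, a regression prediction can average the neighbors' labels, while a classification prediction can take a majority vote. The function names are hypothetical.

```python
from collections import Counter
from statistics import mean

def predict_regression(neighbor_labels):
    """Quantitative labels: average them."""
    return mean(neighbor_labels)

def predict_classification(neighbor_labels):
    """Categorical labels: return the most common one."""
    return Counter(neighbor_labels).most_common(1)[0][0]

# Made-up labels of the nearest neighbors in each kind of problem.
print(predict_regression([2.0, 3.0, 7.0]))
print(predict_classification(["spam", "ham", "spam"]))
```

Averaging only makes sense when labels are numbers; for categories such as `"spam"` and `"ham"`, counting votes is the natural analogue.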

Was Ashenfelter's wine problem a regression or a classification problem?

Note that the input features $\mathrm{x}$ may be categorical, quantitative, textual, ..., or any combination of these.