# Introduction to Machine Learning

- Use [this cheat sheet](https://www.craft.do/s/bj8CiGxswdNZl3/b/2CC9B9A0-7484-45FD-9097-52ED788D4925/The-Machine-Learning-System) as a reference while working through the following exercises.

## Load the data

In [1]:
import seaborn as sns

In [17]:
sns.get_dataset_names()

['anagrams',
 'anscombe',
 'attention',
 'brain_networks',
 'car_crashes',
 'diamonds',
 'dots',
 'dowjones',
 'exercise',
 'flights',
 'fmri',
 'geyser',
 'glue',
 'healthexp',
 'iris',
 'mpg',
 'penguins',
 'planets',
 'seaice',
 'taxis',
 'tips',
 'titanic']

Load one of the datasets from the previous list:

In [18]:
df = sns.load_dataset('tips')
df

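| | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 242 | 17.82 | 1.75 | Male | No | Sat | Dinner | 2 |
| 243 | 18.78 | 3.00 | Female | No | Thur | Dinner | 2 |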

## Select the variables for the model

**Both variables must be continuous.**

1. `y`: the target variable you want to predict
2. `X`: the explanatory variable used to compute the prediction of `y`

In [19]:
y = df.tip
X = df[['total_bill']]
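
The two selections above use different bracket styles on purpose: scikit-learn estimators expect `X` to be two-dimensional (rows by features), while `y` can be one-dimensional. A minimal sketch of the difference, using a hypothetical `df_demo` built from four rows of the table above rather than the full dataset:

```python
import pandas as pd

# Four rows copied from the table shown above (illustrative subset only)
df_demo = pd.DataFrame({
    "total_bill": [16.99, 10.34, 17.82, 18.78],
    "tip": [1.01, 1.66, 1.75, 3.00],
})

y_demo = df_demo.tip               # attribute access returns a Series (1-D)
X_demo = df_demo[["total_bill"]]   # double brackets return a DataFrame (2-D)

print(type(y_demo).__name__, y_demo.shape)
print(type(X_demo).__name__, X_demo.shape)
```

Writing `df['total_bill']` with single brackets would give a 1-D Series instead, which `fit` does not accept as `X` without reshaping.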

## The Linear Regression ML Model

### Fit the model with the data

In [20]:
from sklearn.linear_model import LinearRegression

In [21]:
model_lr = LinearRegression()

In [22]:
model_lr.fit(X, y)
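
With a single feature and the default intercept, fitting a linear regression amounts to finding the straight line that minimises the squared error. The sketch below uses the textbook closed-form solution in plain Python to show the idea. It is not scikit-learn's internal implementation, and `fit_simple_ols` and the toy data are hypothetical names and values introduced for illustration.

```python
from statistics import mean

def fit_simple_ols(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    # Covariance-like and variance-like sums (the 1/n factors cancel out)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical toy data lying exactly on the line y = 0.1 * x + 1
x_toy = [10.0, 20.0, 30.0, 40.0]
y_toy = [2.0, 3.0, 4.0, 5.0]

slope, intercept = fit_simple_ols(x_toy, y_toy)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

On a fitted `LinearRegression`, the corresponding values are available as `model_lr.coef_` and `model_lr.intercept_`.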

### Predictions

#### Calculate the predictions

In [23]:
model_lr.predict(X)

array([2.70463616, 2.00622312, 3.12683472, 3.40725019, 3.5028225 ,
       3.57633966, 1.84133463, 3.74332864, 2.49983836, 2.47253198,
       1.99887141, 4.6234341 , 2.53974767, 2.85587147, 2.47778321,
       3.1866987 , 2.00517288, 2.631119  , 2.70253567, 3.0890259 ,
       2.80230897, 3.05121707, 2.57650625, 5.06033609, 3.00185555,
       2.79075627, 2.32444741, 2.25303074, 3.19930164, 2.98400138,
       1.92325375, 2.84746951, 2.50193885, 3.09322688, 2.78760553,
       3.4471595 , 2.63321949, 2.69833469, 2.88317784, 4.20438627,
       2.60486287, 2.75399769, 2.38431139, 1.93690694, 4.11301494,
       2.84116804, 3.25496464, 4.32306398, 3.91871958, 2.81491191,
       2.23727706, 2.0009719 , 4.57617306, 1.96421332, 3.60469628,
       2.96719746, 4.91225152, 3.69396712, 2.10074519, 5.98980307,
       3.05121707, 2.3706582 , 2.0776398 , 2.84116804, 2.76765087,
       3.02916192, 2.64792292, 1.24269488, 3.0449156 , 2.49668762,
       2.18266431, 2.71303813, 3.74122815, 3.57528941, 2.46728

### Model evaluation

#### Calculate the model's score

In [24]:
model_lr.score(X, y)

0.45661658635167657
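
For scikit-learn regressors, `score` returns the coefficient of determination, R². A value of about 0.46 means the model explains roughly 46% of the variance of `tip` in the data it was fitted on. A minimal sketch of the formula, assuming a hypothetical helper `r2_score_manual` and toy values:

```python
def r2_score_manual(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    y_bar = sum(y_true) / len(y_true)
    # Squared error of the model's predictions
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Squared error of always predicting the mean of y
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot

y_true_toy = [1.0, 2.0, 3.0, 4.0]

perfect = r2_score_manual(y_true_toy, y_true_toy)   # predictions match exactly
baseline = r2_score_manual(y_true_toy, [2.5] * 4)   # always predict the mean

print(perfect, baseline)
```

A perfect model scores 1, a model that always predicts the mean scores 0, and a model worse than that baseline scores below 0.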

## The K Nearest Neighbours ML Model

### Fit the model with the data

In [25]:
from sklearn.neighbors import KNeighborsRegressor

In [26]:
model_kn = KNeighborsRegressor()

In [27]:
model_kn.fit(X, y)
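
`KNeighborsRegressor()` with default settings uses the 5 nearest training points (`n_neighbors=5`) and averages their targets with uniform weights. The sketch below shows that idea for one feature in plain Python. It is an illustration only: `knn_predict_1d` and the toy data are hypothetical, and scikit-learn's neighbour search and tie-breaking may differ.

```python
def knn_predict_1d(x_train, y_train, x_new, k=5):
    """Predict y for x_new as the mean target of the k closest training points."""
    # Sort training pairs by absolute distance to x_new and keep the k closest
    nearest = sorted(zip(x_train, y_train), key=lambda pair: abs(pair[0] - x_new))[:k]
    return sum(y_val for _, y_val in nearest) / len(nearest)

# Hypothetical toy data: two clusters with different target levels
x_toy = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y_toy = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]

low = knn_predict_1d(x_toy, y_toy, 2.0, k=3)    # neighbours are x = 1, 2, 3
high = knn_predict_1d(x_toy, y_toy, 11.0, k=3)  # neighbours are x = 10, 11, 12

print(low, high)
```

Because each prediction depends only on nearby points, the fitted curve follows local patterns in the data rather than a single straight line.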

### Predictions

#### Calculate the predictions

In [28]:
model_kn.predict(X)

array([2.916, 1.986, 3.416, 3.422, 3.394, 4.418, 1.63 , 3.334, 2.66 ,
       2.5  , 1.928, 4.744, 2.246, 3.252, 2.5  , 3.52 , 1.986, 2.902,
       2.916, 3.408, 2.65 , 2.76 , 2.378, 4.462, 3.104, 2.688, 2.27 ,
       1.948, 3.92 , 3.176, 2.012, 3.252, 2.66 , 3.408, 2.688, 2.62 ,
       3.002, 2.916, 2.734, 4.056, 2.254, 3.036, 2.512, 2.012, 3.552,
       3.554, 3.372, 3.692, 3.422, 3.132, 1.864, 1.928, 4.52 , 1.674,
       4.642, 3.14 , 4.462, 3.266, 1.948, 6.846, 2.76 , 2.512, 1.842,
       3.554, 2.99 , 2.79 , 2.5  , 2.03 , 2.79 , 2.66 , 1.86 , 2.664,
       3.334, 4.418, 2.49 , 1.604, 2.65 , 3.624, 2.786, 2.998, 2.602,
       3.326, 1.67 , 3.692, 2.254, 4.52 , 1.82 , 3.554, 3.738, 3.416,
       4.006, 3.446, 2.118, 3.002, 2.786, 4.662, 3.624, 1.86 , 3.416,
       1.864, 1.852, 2.246, 3.8  , 3.372, 3.264, 2.246, 3.432, 4.418,
       3.554, 2.5  , 2.512, 2.206, 4.462, 2.876, 4.642, 2.998, 3.9  ,
       1.566, 2.064, 2.62 , 2.166, 2.27 , 2.6  , 2.254, 1.864, 3.9  ,
       1.63 , 2.686,

### Model evaluation

#### Calculate the model's score

In [29]:
model_kn.score(X, y)

0.5603363923139638

## The Support Vector Machine ML Model

### Fit the model with the data

In [30]:
from sklearn.svm import SVR

In [31]:
model_sv = SVR()

In [32]:
model_sv.fit(X, y)

### Predictions

#### Calculate the predictions

In [33]:
model_sv.predict(X)

array([2.73991249, 1.74766152, 3.26767403, 3.51374935, 3.59636995,
       3.66375701, 1.6036565 , 3.83442985, 2.42134284, 2.37848137,
       1.74015205, 4.57040161, 2.48418918, 2.95578074, 2.38670938,
       3.32415604, 1.74658288, 2.6274048 , 2.73675109, 3.23001898,
       2.8822872 , 3.19061638, 2.54204503, 4.50567059, 3.13632624,
       2.86597528, 2.1523542 , 2.04980687, 3.33562243, 3.11585949,
       1.66863651, 2.944494  , 2.42464645, 3.23428632, 2.86149975,
       3.54774719, 2.63066437, 2.73041757, 2.9918126 , 4.35734932,
       2.58649594, 2.81307832, 2.24203279, 1.68075751, 4.26308296,
       2.9359685 , 3.38486254, 4.45799569, 4.03708084, 2.89990149,
       2.02795302, 1.74228784, 4.56623956, 1.70605893, 3.69093283,
       3.09618542, 4.5301756 , 3.78130522, 1.85242629, 5.10765961,
       3.19061638, 2.22132758, 1.82545373, 2.9359685 , 2.8328964 ,
       3.1667693 , 2.65341818, 1.63146893, 3.18386933, 2.41638886,
       1.95461735, 2.75252146, 3.83212569, 3.66276399, 2.37026

### Model evaluation

#### Calculate the model's score

In [34]:
model_sv.score(X, y)

0.44437403471087844
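
The three sections above repeat the same fit, predict, score pattern, so the comparison can be written as one loop. The sketch below assumes synthetic stand-in data (`X_demo`, `y_demo`) generated with NumPy instead of the tips dataset, so its scores will not match the values shown above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

# Synthetic stand-in for (total_bill, tip); NOT the tips dataset
rng = np.random.default_rng(0)
X_demo = rng.uniform(5, 50, size=(100, 1))
y_demo = 1.0 + 0.1 * X_demo[:, 0] + rng.normal(0, 1, size=100)

models = {
    "LinearRegression": LinearRegression(),
    "KNeighborsRegressor": KNeighborsRegressor(),
    "SVR": SVR(),
}

# fit() returns the fitted estimator, so score() can be chained onto it
scores = {
    name: model.fit(X_demo, y_demo).score(X_demo, y_demo)
    for name, model in models.items()
}

# Print the models from highest to lowest R^2
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

As in the notebook, these scores are computed on the same data used for fitting, so they describe how well each model reproduces its training data.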