# Preface

## Objective and Approach

Goal: By the end of the book you should be able to implement programs capable of learning from data!

Python Frameworks to Use:

* Scikit-Learn: easy to use, a good entry point for learning ML
* TensorFlow: a more complex library, well suited to training and running very large neural networks (NNs)
* Keras: a high-level deep learning Application Programming Interface (API) that makes it simple to train and run NNs; it comes bundled with TensorFlow


All code examples are already provided online at https://github.com/ageron/handson-ml3 as Jupyter notebooks. This is just a personal repository to play with the code and summarize the book for myself.

Prerequisites:

* NumPy
* Pandas
* Matplotlib
* Linear Algebra
* Differential Calculus

If you are not familiar with any of these, have a look at https://homl.info/tutorials.

The book is divided into two parts:

1. The Fundamentals of Machine Learning (Scikit-Learn)
2. Neural Networks and Deep Learning (TensorFlow + Keras)

## Other Resources

* Andrew Ng's ML course on Coursera: https://www.coursera.org/learn/machine-learning
* Scikit-Learn's User Guide: https://scikit-learn.org/stable/user_guide.html
* Interactive Tutorials: https://www.dataquest.io
* ML blogs: https://www.quora.com/What-are-the-best-artificial-intelligence-blogs-newsletters
* ML competitions: https://www.kaggle.com

# Chapter 1 - The Machine Learning Landscape

This notebook follows the first chapter and plays with the examples given in the book. I will also try to answer the questions that the book poses to the reader.

## Prerequisites to run our code

Make sure we are running the required Python version (3.7) or above. Note to myself: I am using the system version of Python.

In [2]:
import sys
assert sys.version_info >= (3, 7)

Import Scikit-Learn and check that its version is at least 1.0.1:

In [6]:
from packaging import version
import sklearn

assert version.parse(sklearn.__version__) >= version.parse("1.0.1")

Use the same plot settings as in the book:

In [7]:
import matplotlib.pyplot as plt

plt.rc('font', size=12)
plt.rc('axes', labelsize=14, titlesize=14)
plt.rc('legend', fontsize=12)
plt.rc('xtick', labelsize=10)
plt.rc('ytick', labelsize=10)

To make this notebook's output stable across runs, we choose a specific random seed:

In [9]:
import numpy as np

np.random.seed(42)
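
As a quick check of what the seed buys us, here is a minimal sketch (not from the book) showing that re-seeding with the same value replays the same pseudo-random sequence. The variable names are my own, and the `default_rng` part is just an alternative I am aware of in NumPy, not something the book's notebooks rely on:

```python
import numpy as np

# Seed the legacy global generator, as this notebook does.
np.random.seed(42)
first_run = np.random.rand(3)

# Re-seeding with the same value restarts the same pseudo-random sequence.
np.random.seed(42)
second_run = np.random.rand(3)

print(np.array_equal(first_run, second_run))  # True

# Alternative (assumption: not used in the book's code): local generator
# objects that carry their own state instead of relying on global state.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
same_draws = np.array_equal(rng_a.random(3), rng_b.random(3))
```

Note that the seed only fixes NumPy's global generator; other libraries keep their own random state and need their own seed.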

## Theory

### Machine Learning in a Nutshell

Machine Learning is the science (and art) of programming computers so they can learn from data. It is the "field of study that gives computers the ability to learn without being explicitly programmed" (Arthur Samuel, 1959).

Parts of an ML system:

* Training set/data: the examples that the system learns from; each example is called a training instance (or sample)
* Model: the part of an ML system that learns and makes predictions, e.g. neural networks or random forests
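
To make the two terms concrete, here is a minimal sketch assuming a tiny made-up training set (four instances generated from y = 2x + 1) and Scikit-Learn's `LinearRegression` as the model. The data and variable names are hypothetical and not taken from the book:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training set: four training instances with one feature each.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
# Made-up labels that follow y = 2x + 1 exactly.
y_train = np.array([3.0, 5.0, 7.0, 9.0])

# The model is the part that learns from the training set ...
model = LinearRegression()
model.fit(X_train, y_train)

# ... and then makes predictions for instances it has not seen.
X_new = np.array([[5.0]])
prediction = model.predict(X_new)
print(prediction)  # close to 11, since the data is exactly linear
```

Nothing about the rule "multiply by 2 and add 1" is written in the program; the model recovers it from the training instances alone.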

Usual advantages of ML:

* Replaces solutions that need a lot of fine-tuning or long lists of rules: the program can be shorter, easier to maintain, and often more accurate than a classical rule-based program
* Adapts automatically to changes (fluctuating environments), whereas a classical program has to be updated by hand
* Tackles complex problems that have no known algorithm, e.g. speech recognition
* Can help humans learn by identifying the best predictors for a task and revealing correlations $\to$ discover hidden patterns in big data (data mining)

### Types of Machine Learning Systems