In [None]:
!conda install -c conda-forge rise

# What is machine learning, and how does it work?

![Machine learning](images/01_robot.png)

## Agenda

- What is machine learning?
- What are the two main categories of machine learning?
- What are some examples of machine learning?
- How does machine learning "work"?

## What is machine learning?

Traditionally, software engineering combined human-created rules with data to produce answers to a problem. Machine learning instead uses data and answers to discover the rules behind the problem.

![TP_VS_ML](images/01_TP_vs_ML.png)

In the **traditional software development** approach, the input and the algorithm are known, and you write a function to produce an output:
* Take the input data
* Design an algorithm by applying logic to the problem
* Produce the output

However, in the **Machine Learning approach**, you know the input and the desired output, but you don’t know the algorithm that produces the output:
* Provide a set of input data
* Provide the matching set of desired output data
* Train a machine learning algorithm on both, then use it to predict outputs for new data

To learn the rules governing a phenomenon, machines have to go through a learning process, trying different rules and learning from how well they perform. This is why it is known as machine learning.
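
The contrast between the two approaches, and the idea of "trying different rules and learning from how well they perform", can be sketched in a few lines of plain Python. This is a toy illustration, not a real learning algorithm: the data is made up, and the names `count_mistakes` and `learn_threshold` are hypothetical helpers introduced only for this example.

```python
# Traditional approach: a person writes the rule directly.
def is_adult_handwritten(age):
    return age >= 18

# Machine learning approach: only inputs and desired outputs are given.
# Made-up illustrative data (an assumption, not from a real data set).
ages = [5, 12, 17, 18, 21, 30, 45, 70]
answers = [False, False, False, True, True, True, True, True]

def count_mistakes(threshold, inputs, outputs):
    """Count how often the candidate rule `x >= threshold` disagrees with the answers."""
    return sum((x >= threshold) != y for x, y in zip(inputs, outputs))

def learn_threshold(inputs, outputs, candidates=range(0, 101)):
    """Try every candidate rule and keep the one with the fewest mistakes."""
    return min(candidates, key=lambda t: count_mistakes(t, inputs, outputs))

learned_threshold = learn_threshold(ages, answers)
print("learned rule: age >=", learned_threshold)
print("prediction for age 16:", 16 >= learned_threshold)
```

Here the "learning process" is an exhaustive search over 101 candidate rules. Real algorithms search far larger rule spaces more cleverly, but the principle of scoring candidate rules against known answers is the same.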

## What are the two main categories of machine learning?

![ML_types](images/01_ML_types.png)

## Supervised Learning

In supervised learning, we are given a labeled data set, so we already know what the correct output should look like, and we assume there is some relationship between the input and the output. The goal is to learn the mapping (the rules) between a set of inputs and outputs. Supervised problems are categorized into “Regression” and “Classification” problems.

### Regression

In a “Regression” problem, we try to predict a continuous value. Regression is therefore useful for number-based problems such as predicting stock market prices, the temperature on a given day, or the price of a house.

#### Example
Let's take house price prediction as an example. Your friend owns a 750 square foot house, hopes to sell it, and wants to know how much it could sell for. You know machine learning, so you can help predict the price. Let’s see how to do it.
![HPP](images/01_house_price_prediction.png)
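
One way to approach the house price question is to fit a straight line to past sales and read off the value at 750 square feet. The sketch below assumes a simple linear relationship between size and price and uses made-up sales figures; `fit_line` and `predict_price` are hypothetical helper names, and a library such as scikit-learn would normally do the fitting.

```python
# Made-up illustrative data: house size in square feet and sale price in
# thousands of dollars. These numbers are assumptions, not real market data.
sizes_sqft = [500, 650, 800, 900, 1000, 1200]
prices_k = [110, 140, 165, 190, 205, 250]

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_price(size_sqft, slope, intercept):
    """Apply the learned line to a house size the model has not seen."""
    return slope * size_sqft + intercept

slope, intercept = fit_line(sizes_sqft, prices_k)
estimate = predict_price(750, slope, intercept)
print(f"estimated price for 750 sq ft: about {estimate:.0f} thousand dollars")
```

The output is a continuous number rather than a category, which is what makes this a regression problem.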

![Spam filter](images/01_spam_filter.png)

## Unsupervised Learning

Unsupervised learning extracts structure from data.

- Example: Segment grocery store shoppers into clusters that exhibit similar behaviors
- There is no "right answer"

![Clustering](images/01_clustering.png)
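
As a rough illustration of segmenting shoppers, the sketch below groups made-up weekly spending amounts into two clusters with a bare-bones k-means loop. The source does not name a clustering algorithm, so k-means is an assumption here, as are the data and the helper name `k_means_1d`.

```python
def k_means_1d(values, iterations=10):
    """Split one-dimensional values into two clusters with a minimal k-means (k=2)."""
    centers = [min(values), max(values)]  # simple deterministic starting points
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for value in values:
            # Assign each value to the nearest center.
            nearest = min(range(2), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters

# Made-up weekly grocery spend per shopper, in dollars (an assumption).
weekly_spend = [12, 15, 18, 20, 95, 102, 110, 120]

centers, clusters = k_means_1d(weekly_spend)
print("cluster centers:", centers)
print("low spenders:", clusters[0])
print("high spenders:", clusters[1])
```

No labels are supplied at any point. The grouping comes only from how close the values are to each other, which is why there is no "right answer" to check against.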

## How does machine learning "work"?

High-level steps of supervised learning:

1. First, train a **machine learning model** using **labeled data**

    - "Labeled data" has been labeled with the outcome
    - "Machine learning model" learns the relationship between the attributes of the data and its outcome

2. Then, make **predictions** on **new data** for which the label is unknown

![Supervised learning](images/01_supervised_learning.png)

The primary goal of supervised learning is to build a model that "generalizes": It accurately predicts the **future** rather than the **past**!
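
The two steps, and the idea of checking generalization on data the model has not seen, can be sketched with a tiny 1-nearest-neighbor classifier. Everything here is an assumption for illustration: the spam/ham data is made up, the two features are arbitrary, and `train` and `predict` are hypothetical helper names rather than any library's API.

```python
def train(labeled_examples):
    """'Training' a 1-nearest-neighbor model just stores the labeled data."""
    return list(labeled_examples)

def predict(model, features):
    """Return the label of the stored example closest to `features`."""
    def squared_distance(example):
        stored_features, _ = example
        return sum((a - b) ** 2 for a, b in zip(stored_features, features))
    _, label = min(model, key=squared_distance)
    return label

# Made-up data: (number of links, number of exclamation marks) -> label
training_data = [((0, 0), "ham"), ((1, 0), "ham"), ((0, 1), "ham"),
                 ((5, 4), "spam"), ((6, 7), "spam"), ((4, 6), "spam")]
# Labeled examples kept out of training, used only to estimate generalization.
held_out_data = [((1, 1), "ham"), ((5, 5), "spam"), ((0, 2), "ham"), ((7, 3), "spam")]

# Step 1: train the model on labeled data.
model = train(training_data)

# Estimate how well it generalizes by scoring it on the held-out examples.
correct = sum(predict(model, features) == label for features, label in held_out_data)
accuracy = correct / len(held_out_data)
print("held-out accuracy:", accuracy)

# Step 2: predict the label of new data whose label is unknown.
print("prediction for a new message:", predict(model, (6, 5)))
```

Scoring on held-out examples rather than on the training data matters because a model that merely memorizes its training data would look perfect on the past while telling you nothing about the future.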

## Questions about machine learning

- How do I choose **which attributes** of my data to include in the model?
- How do I choose **which model** to use?
- How do I **optimize** this model for best performance?
- How do I ensure that I'm building a model that will **generalize** to unseen data?
- Can I **estimate** how well my model is likely to perform on unseen data?

## Resources

- Book: [An Introduction to Statistical Learning](http://www-bcf.usc.edu/~gareth/ISL/) (section 2.1, 14 pages)
- Video: [Learning Paradigms](http://work.caltech.edu/library/014.html) (13 minutes)

## Comments or Questions?

- Email: <kevin@dataschool.io>
- Website: http://dataschool.io
- Twitter: [@justmarkham](https://twitter.com/justmarkham)

In [1]:
from IPython.display import HTML

def css_styling():
    # Load the notebook's custom stylesheet and return it as renderable HTML
    with open("styles/custom.css", "r") as f:
        styles = f.read()
    return HTML(styles)

css_styling()