<img src = "https://images2.imgbox.com/60/09/VFwl5LOq_o.jpg" width="400">

# 1. What is Machine Learning?
---

What's behind the machine learning hype? In this non-technical course, you’ll learn everything you’ve been too afraid to ask about machine learning. There’s no coding required. Hands-on exercises will help you get past the jargon and learn how this exciting technology powers everything from self-driving cars to your personal Amazon shopping suggestions. How does machine learning work, when can you use it, and what is the difference between AI and machine learning? They’re all covered. Gain skills in this hugely in-demand and influential field, and discover why machine learning is for everyone!

In this chapter, we'll define machine learning and its relation to data science and artificial intelligence. Then, we'll unpack important machine learning jargon and end with the machine learning workflow for building models.

### Artificial intelligence (AI)

First, let's talk about artificial intelligence, or AI. Today, when people refer to AI, they're most likely referring to machine learning. AI is a huge set of tools for making computers behave intelligently. It comprises several sub-fields, including robotics and machine learning. In recent decades, machine learning has become the most prevalent subset of AI.

### Defining machine learning

Defining machine learning is not simple. Machine learning has many applications and overlaps with several other fields. Combined with the rapid growth of machine learning as a field, the boundaries of machine learning can be blurry. We like to define machine learning as a set of tools for making inferences and predictions from data. Let's compare inference and prediction tasks to better understand what machine learning can do.

### Defining machine learning: what can it do?

**Prediction** is about the outcome of future events. For example, will it rain tomorrow? **Inference** is more vague because it's about drawing insights. We can infer the causes of events and behaviors, for example, why it rains. We may get a combination of factors like the month, humidity, and temperature. We can also **infer patterns**: for instance, what are the different types of weather conditions, such as rain or overcast? Ultimately, these tasks can work together, because inferences help make predictions. However, they require different types of machine learning.

### Defining machine learning: how does it work?

So, how does it all work? Machine learning methods are taken primarily from statistics and computer science. Machine learning is extremely powerful because it gives computers the ability to learn without being explicitly programmed to do so. In other words, the computer can learn without step-by-step instructions. Essentially, machine learning learns patterns from existing data and applies them to new data. For example, it can process archived emails to learn what spam looks like on its own. Then, using what it learned, it can detect spam in new emails. For machine learning to be successful, it needs high-quality data. We'll be learning more about all this as we move through the course!
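
The spam example can be sketched in a few lines of Python. This is only a toy illustration, not how real spam filters are built: the emails are made up, and the "learning" is simply counting which words show up in spam versus normal mail. The names `learn_word_counts` and `looks_like_spam` are hypothetical helpers written for this sketch.

```python
from collections import Counter

# Hypothetical archived emails, already marked as spam (True) or not (False).
archived_emails = [
    ("win a free prize now", True),
    ("free money click now", True),
    ("claim your free prize", True),
    ("meeting moved to monday", False),
    ("lunch on monday", False),
    ("notes from the meeting", False),
]

def learn_word_counts(emails):
    """Count how often each word appears in spam and in non-spam emails."""
    spam_counts, ham_counts = Counter(), Counter()
    for text, is_spam in emails:
        target = spam_counts if is_spam else ham_counts
        target.update(text.split())
    return spam_counts, ham_counts

def looks_like_spam(text, spam_counts, ham_counts):
    """Flag an email as spam if its words were seen more often in spam."""
    words = text.split()
    spam_score = sum(spam_counts[w] for w in words)
    ham_score = sum(ham_counts[w] for w in words)
    return spam_score > ham_score

# "Learning" happens here, from existing data only.
spam_counts, ham_counts = learn_word_counts(archived_emails)

# The learned counts are then applied to emails the program has never seen.
print(looks_like_spam("free prize inside", spam_counts, ham_counts))
print(looks_like_spam("monday meeting agenda", spam_counts, ham_counts))
```

Notice that nobody wrote a rule saying "the word *free* means spam". The program picked that up from the archived emails.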

### Data science

With all this data talk, you may be wondering: where does data science fit? Data science is about discovering and communicating insights from data. Machine learning is often an important tool for data science work, especially for making predictions from data.

<img src = "https://images2.imgbox.com/bf/c2/QOtg79ae_o.png" width="400">


### Machine learning model

We've defined machine learning, but what does it look like in practice? The answer is machine learning models. A machine learning model is a statistical representation of a real-world process, like how we recognize cats or hourly changes in traffic. A process is modeled using data. We can enter new inputs into a model to get an outcome. For example, if we make a model based on historical traffic data, we can enter a future date into the model to predict how heavy traffic will be tomorrow afternoon.
The output can even be the probability of an outcome, for example, the probability that a tweet is fake. In this course, we'll unveil the "black box" that is the model.

<img src = "https://images2.imgbox.com/14/9d/sQtMpM0n_o.png" width="600">

### Three types of machine learning

There are three types of machine learning. The first listed, **reinforcement learning**, is used for deciding sequential actions, like a robot deciding its path or its next move in a chess game. Reinforcement learning is not as common as the others and uses complex mathematics, like game theory. It won't be covered any further. The most common types are **supervised** and **unsupervised learning**. Their main difference lies in their training data.

### Training data

Remember how we said machine learning "learns" patterns from existing data and applies them to new data? We call this existing data "training data". When a model is being built and learning from training data, we call this "training a model". This can take anywhere from nanoseconds to weeks, depending on the size of the data.

### Supervised learning training data

Let's look at training data for a supervised learning model. We'd like to train a model to predict whether a patient has heart disease. We have existing records from patients who've experienced chest pains and been tested for heart disease.

Our target variable is "heart disease", because this is what we want to predict. The values "True" and "False" are labels for the target variable, meaning whether it's true or false that a patient has heart disease. Labels don't have to come in this form - they can be numbers or categories. The rows are the observations or examples that our model will learn from. You should get as many of these as possible. The columns are features. Features are different pieces of information that might help predict the target. Age, cholesterol, and smoking habits are known risk factors for heart disease. The magic of machine learning is that we can analyze many features at once, even the ones we're unsure about, and find relationships between different features. We input labels and features as data to train the model.

<img src = "https://images2.imgbox.com/66/97/VB8aSTGa_o.png" width="800">
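
To make the vocabulary concrete, here is a minimal Python sketch of how such a table might be split into features and labels. The patient records are made up for illustration, and the column names are assumptions rather than the actual dataset's.

```python
# Made-up patient records for illustration only (not real medical data).
records = [
    {"age": 63, "cholesterol": 233, "smoker": True,  "heart_disease": True},
    {"age": 41, "cholesterol": 180, "smoker": False, "heart_disease": False},
    {"age": 56, "cholesterol": 256, "smoker": True,  "heart_disease": True},
    {"age": 37, "cholesterol": 190, "smoker": False, "heart_disease": False},
]

target_name = "heart_disease"
# Every column except the target is treated as a feature.
feature_names = [name for name in records[0] if name != target_name]

# Each row is one observation: its features go in X, its label goes in y.
X = [[row[name] for name in feature_names] for row in records]
y = [row[target_name] for row in records]

print(feature_names)
print(X[0], y[0])
```

`X` and `y` are conventional names for features and labels. Both are handed to the model together during training.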

### After training (supervised learning)

Once training is done, we can give the model new input: in our case, a new patient. The features are inputted, and the model outputs its prediction.

<img src = "https://images2.imgbox.com/64/50/wAk91HU6_o.png" width="600">
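
As a rough sketch of "features in, prediction out", the snippet below uses a very simple stand-in model, 1-nearest neighbor, which predicts the label of the most similar known patient. The course doesn't say which model is used here, so this choice, the made-up data, and the `predict` helper are all assumptions for illustration.

```python
import math

# Made-up training data: (age, cholesterol) features and heart-disease labels.
train_features = [(63, 233), (41, 180), (56, 256), (37, 190)]
train_labels = [True, False, True, False]

def predict(new_features):
    """Return the label of the closest training patient (1-nearest neighbor)."""
    distances = [math.dist(new_features, known) for known in train_features]
    closest = distances.index(min(distances))
    return train_labels[closest]

# A new patient the model has not seen before.
new_patient = (60, 240)
print(predict(new_patient))
```

For this model, "training" is just storing the known patients. Most models do more work during training, but the input and output look the same from the outside.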

### Supervised vs unsupervised learning

In supervised learning, the training data is "labeled", meaning the values of our target are known. For instance, we knew if previous patients had heart disease based on the labels "true" and "false". In unsupervised learning, we don't have labels, only features. What can we do with this? Usually tasks like anomaly detection and clustering, which divides data into groups based on similarity. Let's explore this with our dataset.

### Unsupervised learning training data

There are different treatments for heart disease. Different types of patients respond better or worse to certain treatments. We can use unsupervised learning to understand the different types of patients we have. Let's filter our dataset to only include patients with heart disease. We can pass it into a clustering model and get categories of patients based on feature similarity. For example, one category could be patients in a certain age range with high cholesterol and high blood sugar. Note that we didn't know these categories, or even the number of categories, before running this. With this output, we can group patients and research better treatments for each group.

<img src = "https://images2.imgbox.com/35/aa/VERlg3mi_o.png" width="600">
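
Here is a minimal sketch of clustering, using k-means, one common clustering method chosen purely for illustration. The patient values are made up. One caveat: k-means needs the number of groups `k` to be chosen up front, whereas other clustering methods can work out the number of groups themselves.

```python
import math

def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, cluster index per point)."""
    # Simple deterministic start for this sketch; real tools often randomize.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(vals) / len(members) for vals in zip(*members))
    return centroids, labels

# Made-up (cholesterol, blood sugar) values for six patients. No labels needed.
patients = [(200, 90), (210, 95), (205, 92), (280, 150), (290, 160), (285, 155)]
centroids, labels = k_means(patients, k=2)
print(labels)
print(centroids)
```

The algorithm is never told which patients belong together. It groups them based only on how similar their features are.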

### After training (unsupervised learning)

Now, with a new patient, we can input the features into the model and get which patient type they best fit into.

<img src = "https://images2.imgbox.com/58/e6/JlW4oWCd_o.png" width="600">
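
A sketch of this step, assuming the clustering model summarizes each patient type by a center point: the new patient is matched to whichever center is closest. The type names and center values below are hypothetical.

```python
import math

# Hypothetical cluster centers (cholesterol, blood sugar) from an earlier clustering run.
patient_types = {
    "type A": (205.0, 92.0),
    "type B": (285.0, 155.0),
}

def best_fit(new_patient):
    """Return the name of the patient type whose center is closest."""
    return min(patient_types, key=lambda name: math.dist(new_patient, patient_types[name]))

print(best_fit((275, 148)))
```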

### Unsupervised Learning

In reality, data doesn't always come with labels. Either it's too much manual work to label or we don't even know what the labels are. Think of the effort it would take to label millions of road images for self-driving cars. This is when unsupervised learning shines. In this sense, the model is unsupervised and finds its own patterns. In chapter 2, we'll dig more into how unsupervised and supervised learning work and their use-cases.

### Machine learning workflow

So far, we know that training data is used to let a model learn, and that the model can then be used to make predictions. But what are the steps in between? In this section, we'll introduce the machine learning workflow: the four steps that go into building a model.

### Our scenario

We'll follow the steps with a scenario. New York City releases monthly records of all the apartments sold in the city. The records include information on each sale, like the square feet of the apartment, its neighborhood, the year built, and the price it was sold for, to name a few. We want to predict the price apartments will sell at, making our target the sale price. Since we have this labeled in our dataset, this is a supervised learning problem.

### Step 1: Extract features

The first step is to extract features. Datasets don't typically come naturally with clear features, so there's work to be done in reformatting the dataset. Additionally, you need to decide what features you want to begin with. In our case, we mentioned a few, such as square feet and neighborhood, but there are more that could affect our target, like distance to the nearest subway station!

<img src = "https://images2.imgbox.com/56/ce/NZZdXYUo_o.png" width="200">
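
As a rough sketch of what "extracting features" can involve, the snippet below turns raw sale records into clean numeric features. The records, field names, and the `SALE_YEAR` value are all made up for illustration and do not reflect the real New York City dataset's layout.

```python
# Made-up raw sale records; field names are assumptions for illustration.
raw_sales = [
    {"address": "1 Example St", "neighborhood": "Chelsea",
     "gross_sqft": "850", "year_built": 1990, "sale_price": 900000},
    {"address": "2 Sample Ave", "neighborhood": "Astoria",
     "gross_sqft": "1,100", "year_built": 2005, "sale_price": 750000},
]

SALE_YEAR = 2020  # assumed year of sale, used to derive the building's age

def extract_features(sale):
    """Turn one raw record into numeric features plus the target (sale price)."""
    features = {
        # Clean text such as "1,100" into a number.
        "sqft": float(sale["gross_sqft"].replace(",", "")),
        # Derive a new feature from an existing column.
        "building_age": SALE_YEAR - sale["year_built"],
        # Represent a category as 0 or 1.
        "in_chelsea": 1 if sale["neighborhood"] == "Chelsea" else 0,
    }
    return features, sale["sale_price"]

for sale in raw_sales:
    print(extract_features(sale))
```

The address is left out on purpose: deciding which columns to keep, drop, or transform is part of this step.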

### Step 2: Split dataset

After that, we need to split the dataset into two datasets: the train dataset and the test dataset. The reason for doing this will become clear in the last step. For now, keep in mind that there are two datasets!

<img src = "https://images2.imgbox.com/20/64/7c5HyNbb_o.png" width="600">
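
A minimal sketch of the split, assuming we hold out 20% of the rows for testing (a common choice, but not one the course specifies). `split_dataset` is a hypothetical helper written for this example, and the "sales" are just placeholder numbers.

```python
import random

def split_dataset(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then hold out a fraction as the test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

sales = list(range(10))  # stand-in for ten apartment sale records
train, test = split_dataset(sales)
print(len(train), len(test))
```

Shuffling before splitting helps make sure the test dataset isn't, say, only the most recent sales.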

### Step 3: Train model

The third step is training the model.

<img src = "https://images2.imgbox.com/35/74/sDMpVBDg_o.png" width="600">

To do this, the train dataset is inputted into a chosen machine learning model. There are many different machine learning models to choose from with different use-cases and levels of complexity. You may have heard of some examples of models, from a neural network to a logistic regression.

<img src = "https://images2.imgbox.com/a5/3c/I8c4uZoH_o.png" width="500">
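
To give a feel for what training does, here is a sketch using one of the simplest models, a straight line fitted by least squares (linear regression). This model is chosen only for illustration. The course does not say which model fits our scenario, and the sale data below is made up.

```python
def train_linear_model(sqft, prices):
    """Fit price = slope * sqft + intercept by ordinary least squares."""
    n = len(sqft)
    mean_x = sum(sqft) / n
    mean_y = sum(prices) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(sqft, prices))
    variance = sum((x - mean_x) ** 2 for x in sqft)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up train dataset: one feature (square feet) and the target (sale price).
train_sqft = [500, 750, 1000, 1250]
train_prices = [400_000, 550_000, 700_000, 850_000]

slope, intercept = train_linear_model(train_sqft, train_prices)

# The trained model is just these two numbers; use them to predict a new price.
predicted_price = slope * 900 + intercept
print(slope, intercept, predicted_price)
```

Training boils the data down to a small set of learned values (here, a slope and an intercept) that can be reused for new apartments.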

###  Step 4: Evaluate

Now we have a model and it needs to be evaluated! We can't assume the resulting model is going to be usable. What would be the best way to evaluate the model? In our case, we would want to put the features of known sold apartments into the model and see how accurately it predicts the sale price. We don't want to use any data used to train the model, because the model has already seen that data. Luckily this is exactly what the test dataset is for!

<img src = "https://images2.imgbox.com/e7/a3/XbU73kyZ_o.png" width="600">

We put the test dataset, often called "unseen data", into the model to get the model's predictions. There are many ways we could evaluate the performance of our model. For example, we could calculate the average error of the predictions or the percent of apartment sale prices that were accurately predicted within a 10% margin.

<img src = "https://images2.imgbox.com/e5/fd/yoMM7sOJ_o.png" width="400">
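
The two example metrics can be written out as follows. The actual and predicted prices are made-up numbers standing in for a test dataset and a model's output.

```python
def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors, ignoring their direction."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def share_within_margin(actual, predicted, margin=0.10):
    """Fraction of predictions that land within the margin of the true price."""
    hits = sum(abs(a - p) <= margin * a for a, p in zip(actual, predicted))
    return hits / len(actual)

# Made-up test data: true sale prices and the model's predictions for them.
actual = [500_000, 800_000, 650_000, 1_000_000]
predicted = [520_000, 700_000, 640_000, 1_050_000]

print(mean_absolute_error(actual, predicted))   # average error in dollars
print(share_within_margin(actual, predicted))   # share predicted within 10%
```

Here the average error is 45,000 dollars, and 3 of the 4 predictions (75%) fall within 10% of the true price.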

Whatever metric is chosen, a performance threshold needs to be decided. For example, let's say our model is predicting 80% of the apartments accurately. Is that good enough?

<img src = "https://images2.imgbox.com/27/02/W7bJM1mh_o.png" width="300">

If yes, our model is ready to use!

<img src = "https://images2.imgbox.com/76/4d/semOMZtm_o.png" width="500">

If not, we return to training the model, except this time we "tune" it. Tuning can mean a few different things, for example tweaking the model's options or features - we'll get more into that in chapter 2. Tuning the model can take a while, and if performance isn't improving, it often means you don't have enough data.
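
Below is a sketch of what tuning an option can look like. It assumes a simple stand-in model (k-nearest neighbors, which predicts a price by averaging the `k` most similar known sales), made-up data, and an 80% performance threshold. We try several values of the option `k`, score each on held-out sales, and keep the best. In practice, a separate "validation" portion of the data is often used for this comparison so that the test dataset stays truly unseen.

```python
def knn_predict(train, sqft, k):
    """Predict a price as the average price of the k training sales closest in size."""
    nearest = sorted(train, key=lambda sale: abs(sale[0] - sqft))[:k]
    return sum(price for _, price in nearest) / k

def share_within_margin(train, held_out, k, margin=0.10):
    """Fraction of held-out sales predicted within the margin of the true price."""
    hits = sum(
        abs(knn_predict(train, sqft, k) - price) <= margin * price
        for sqft, price in held_out
    )
    return hits / len(held_out)

# Made-up (square feet, sale price) pairs.
train = [(500, 480_000), (600, 400_000), (800, 640_000),
         (1000, 640_000), (1200, 880_000), (1400, 880_000)]
held_out = [(540, 424_000), (910, 646_000), (1290, 874_000)]

THRESHOLD = 0.8  # assumed performance bar: 80% of prices within a 10% margin

# Try each candidate value of the option k and record its score.
scores = {k: share_within_margin(train, held_out, k) for k in (1, 2, 3)}
best_k = max(scores, key=scores.get)
print(scores)
print(best_k, scores[best_k] >= THRESHOLD)
```

With this toy data, only `k = 2` clears the threshold, which shows how a single option can decide whether a model is good enough to use.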

###  Machine learning workflow

And that's the workflow! Don't worry, you don't have to have all the details memorized. These topics will be revisited throughout the course.

<img src = "https://images2.imgbox.com/a7/e5/2BfTw2HL_o.png" width="700">

###  Summary of steps

In summary, we start by extracting features we want from our data. We then split the dataset into two for training and testing. Next, we train our model using the train dataset and a machine learning model. Finally, we evaluate the model! If the performance isn't good enough, we tune and go back to step 3.