# Course: Introduction to machine learning

This introductory lesson includes three parts. Each part is essential to the machine learning practitioner's workflow.


## The five steps of machine learning

### 2.1 Introduction to the five machine learning steps

**Major steps in the machine learning process**
In the following diagram, you can see an outline of the major steps of the machine learning process. Regardless of the specific model or training algorithm used, machine learning practitioners follow a common workflow to accomplish machine learning tasks.

![image.png](attachment:image.png)

These steps are iterative. In practice, that means that at each step along the process, you review how the process is going. Are things operating as you expected? If not, go back and revisit your current step or previous steps to try and identify the breakdown.
The rest of the course is designed around these very important steps.



### Step 1: Define the problem

This chapter demonstrates how to define a task that can be solved using machine learning.

**Supervised and unsupervised learning**
The presence or absence of labeling in your data is often used to identify a machine learning task.
![image.png](attachment:image.png) 
**Supervised tasks**
A task is supervised if you are using labeled data. We use the term labeled to refer to data that already contains the solutions, called labels.
For example, predicting the number of snow cones sold based on the average temperature outside is an example of supervised learning.
![image-2.png](attachment:image-2.png)
In the preceding graph, the data contains both a temperature and the number of snow cones sold. Both components are used to generate the linear regression shown on the graph. Our goal is to predict the number of snow cones sold, and during training we feed the known values (the labels) into the model. Because we are providing the model with labeled data, we are performing a supervised machine learning task.
**Unsupervised tasks**
A task is considered to be unsupervised if you are using unlabeled data. This means you don't need to provide the model with any kind of label or solution while the model is being trained.
Let's take a look at unlabeled data.
![image-3.png](attachment:image-3.png) 
•	Take a look at the preceding picture. Did you notice the tree in the picture? What you just did, when you noticed the object in the picture and identified it as a tree, is called labeling the picture. Unlike you, a computer just sees that image as a matrix of pixels of varying intensity.

•	Since this image does not have the labeling in its original data, it is considered unlabeled.

How do we further classify tasks when we don’t have a label?
Unsupervised learning involves using data that doesn't have a label. One common task is called clustering. Clustering helps to determine if there are any naturally occurring groupings in the data.

Let's look at an example of how clustering works in unlabeled data.

Example: Identifying book micro-genres with unsupervised learning

Imagine that you work for a company that recommends books to readers.

You are fairly confident that micro-genres exist, and that one of them is called Teen Vampire Romance. However, you don’t know the full set of micro-genres in advance, so you can't use supervised learning techniques.

This is where the unsupervised learning clustering technique might be able to detect some groupings in the data. The words and phrases used in a book's description might provide some guidance on its micro-genre.

Classifying based on label type
Initially, we divided tasks based on the presence or absence of labeled data while training our model. Often, tasks are further defined by the type of label that is present.
![image.png](attachment:image.png) 
In supervised learning, there are two main identifiers that you will see in machine learning:

•	A categorical label has a discrete set of possible values. In a machine learning problem in which you want to identify the type of flower based on a picture, you would train your model using images that have been labeled with the categories of the flower that you want to identify. Furthermore, when you work with categorical labels, you often carry out classification tasks, which are part of the supervised learning family.

•	A continuous (regression) label does not have a discrete set of possible values, which often means you are working with numerical data. In the snow cone sales example, we are trying to predict the number of snow cones sold. Here, our label is a number that could, in theory, be any value.

In unsupervised learning, clustering is just one example. There are many other options, and techniques such as deep learning can also be applied to unlabeled data.

**Wrap-up**
Congratulations on completing this learning module. In this module, you learned about labeled data versus unlabeled data, and how that distinction plays a role in defining the machine learning task that you are going to use. Furthermore, you started to explore which machine learning tasks are used to solve certain kinds of machine learning problems.
Here’s a quick recap of the terms introduced in this lesson:

•	Clustering is an unsupervised learning task that helps to determine if there are any naturally occurring groupings in the data.

•	A categorical label has a discrete set of possible values, such as "is a cat" and "is not a cat".

•	A continuous (regression) label does not have a discrete set of possible values, which means there is potentially an unlimited number of possibilities.

•	Discrete is a term taken from statistics referring to an outcome that takes only a finite number of values (such as days of the week).

•	A label refers to data that already contains the solution.

•	Using unlabeled data means you don't need to provide the model with any kind of label or solution while the model is being trained.


### Step 2: Build the dataset

This chapter introduces you to the key parts required to put together an effective dataset.
The next step in the machine learning process is to build a dataset that can be used to solve your machine learning-based problem. Understanding the data needed helps you select better models and algorithms so you can build more effective solutions.

**The most important step of the machine learning process**
Working with data is perhaps the most overlooked—yet most important—step of the machine learning process. In 2017, an O’Reilly study showed that machine learning practitioners spend 80% of their time working with their data.

**The Four Aspects of Working with Data**
![image.png](attachment:image.png)
You can take an entire class just on working with, understanding, and processing data for machine learning applications. Good, high-quality data is essential for any kind of machine learning project. Let's explore some of the common aspects of working with data.

**Data collection**
Data collection can be as straightforward as running the appropriate SQL queries or as complicated as building custom web scraper applications to collect data for your project. You might even have to run a model over your data to generate needed labels. Here is the fundamental question:

Does the data you've collected match the machine learning task and problem you have defined?

**Data inspection**
The quality of your data will ultimately be the largest factor that affects how well you can expect your model to perform. As you inspect your data, look for:

•	Outliers

•	Missing or incomplete values

•	Data that needs to be transformed or preprocessed so it's in the correct format to be used by your model
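
The checks in the list above can be sketched in a few lines of plain Python. This is only an illustration: the `raw_rows` records, the use of `None` to mark a missing value, and the helper names `count_missing` and `clean` are all assumptions made for this example, and dropping incomplete rows is just one possible way to handle them.

```python
# Hypothetical raw records: (average temperature, snow cones sold).
# None marks a missing value; "88" is a number that was stored as text.
raw_rows = [
    (70, 120),
    (75, None),
    ("88", 210),
    (None, 95),
    (81, 175),
]

def count_missing(rows):
    """Count the rows that contain at least one missing (None) field."""
    return sum(1 for row in rows if any(value is None for value in row))

def clean(rows):
    """Drop incomplete rows and convert every remaining field to float."""
    return [
        tuple(float(value) for value in row)
        for row in rows
        if all(value is not None for value in row)
    ]

missing = count_missing(raw_rows)
cleaned_rows = clean(raw_rows)
print(f"{missing} of {len(raw_rows)} rows have missing values")
print(cleaned_rows)
```

In a real project you might impute the missing values instead of dropping the rows; which choice is better depends on how much data you have and why the values are missing.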

**Summary statistics**
Models can make assumptions about how your data is structured.
Now that you have some data in hand, it is good practice to check that your data is in line with the underlying assumptions of the machine learning model that you chose.

Using statistical tools, you can calculate things like the mean, interquartile range (IQR), and standard deviation. These tools can give you insights into the scope, scale, and shape of a dataset.
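
As a rough sketch, Python's built-in `statistics` module can compute these summary statistics. The `sales` values are made up for illustration, and the "1.5 × IQR" rule used to flag outliers is a common rule of thumb assumed here, not something this lesson prescribes.

```python
import statistics

# Hypothetical daily snow cone sales; 900 is deliberately extreme.
sales = [120, 135, 150, 160, 175, 190, 210, 900]

mean = statistics.mean(sales)
median = statistics.median(sales)
stdev = statistics.stdev(sales)  # sample standard deviation

# quantiles(n=4) returns the three cut points Q1, Q2 (the median), and Q3.
q1, _, q3 = statistics.quantiles(sales, n=4)
iqr = q3 - q1

# Assumed rule of thumb: flag values more than 1.5 * IQR outside [Q1, Q3].
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [value for value in sales if value < low or value > high]

print(f"mean={mean}, median={median}, stdev={stdev:.1f}, IQR={iqr}")
print("possible outliers:", outliers)
```

Notice how far the mean sits from the median here: a single extreme value is enough to pull the mean upward, which is one reason to look at more than one statistic.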


**Data visualization**
You can use data visualization to see outliers and trends in your data and to help stakeholders understand your data.
Look at the following two graphs. In the first graph, some data seems to have clustered into different groups. In the second graph, some data points might be outliers.
![image-2.png](attachment:image-2.png) 
![image-3.png](attachment:image-3.png)

**Wrap-up**
Nice work completing this chapter! There is a lot of information in this section. Let’s review a couple of key parts:


One
You learned that having good data is key to being able to successfully answer the problem you have defined in your machine learning problem.

Two
To build a good dataset, there are four key aspects to be considered when working with your data. First, you need to collect the data. Second, you should inspect your data to check for outliers, missing or incomplete values, and to see if any kind of data reformatting is required. Third, you should use summary statistics to understand the scope, scale, and shape of the dataset. Finally, you should use data visualizations to check for outliers, and to see trends in your data.

**Key terms from this lesson:**
•	Impute is a common term referring to different statistical tools that can be used to estimate replacements for missing values in your dataset.
•	Outliers are data points that are significantly different from other data in the same sample.

**Additional reading**
•	In machine learning, you use several statistical-based tools to better understand your data. The sklearn library has many examples and tutorials, such as this example that demonstrates outlier detection on a real dataset.

**Model training**
Now that you’ve got some data, it is time to start training your first model. Model training is an iterative process. In this lesson, you will learn the different steps required to successfully train a model.


### Step 3: Model training

Model training is a process whereby the model's parameters are iteratively updated to minimize some loss function that has been previously defined.
Now, you are ready to start training your first model.

**Splitting your dataset**
The first step in model training is to randomly split the dataset.
This allows you to keep some data hidden during training, so that the data can be used to evaluate your model before you put it into production. Specifically, you do this to test against the bias-variance trade-off. If you're interested in learning more, see the optional Extended learning section.
Splitting your dataset gives you two sets of data:
•	Training dataset: The data on which the model will be trained. Most of your data will be here. Many developers estimate about 80%.
•	Test dataset: The data withheld from the model during training, which is used to test how well your model will generalize to new data.
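
A minimal sketch of a random 80/20 split, using only the standard library. The function name `train_test_split`, its parameters, and the generated `rows` are assumptions for this example; libraries such as scikit-learn ship their own, more capable helpers for this.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows, then split it into (train, test) lists."""
    shuffled = list(rows)
    # A fixed seed makes the split repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset of 20 (temperature, snow cones sold) rows.
rows = [(temperature, 3 * temperature - 90) for temperature in range(60, 100, 2)]
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), "training rows,", len(test_rows), "test rows")
```

Shuffling before splitting matters when the data is ordered (for example, by date or by temperature); otherwise the test dataset may not resemble the training dataset.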

**Key model training terms**
The model training algorithm iteratively updates a model's parameters to minimize some loss function.
Let's define those two terms:

•	Model parameters: Model parameters are settings or configurations that the training algorithm can update to change how the model behaves. Depending on the context, you’ll also hear other specific terms used to describe model parameters such as weights and biases. Weights, which are values that change as the model learns, are more specific to neural networks.

•	Loss function: A loss function is used to codify the model’s distance from a goal. For example, if you were trying to predict the number of snow cone sales based on the day’s weather, you would care about making predictions that are as accurate as possible. So you might define a loss function to be “the average distance between your model’s predicted number of snow cone sales and the correct number.” In the snow cone example, this is the difference between the two purple dots.

**Putting it all together**

The end-to-end training process is:

•	Feed the training data into the model.

•	Compute the loss function on the results.

•	Update the model parameters in a direction that reduces loss.

You continue to cycle through these steps until you reach a predefined stop condition. This might be based on training time, the number of training cycles, or an even more intelligent or application-aware mechanism.
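
The three steps and the stop condition can be sketched for a one-variable linear model trained with gradient descent. Everything specific here is an assumption made for illustration: the `training_data` values, the choice of mean squared error as the loss, the learning rate, and the two stop conditions (a cycle budget and a target loss). Temperatures are standardized (centered and scaled) so that a plain gradient step converges quickly.

```python
# Hypothetical training data: (standardized temperature, snow cones sold).
training_data = [(-1.5, 75.0), (-0.5, 115.0), (0.5, 155.0), (1.5, 195.0)]

def predict(slope, intercept, x):
    """Linear model; its two parameters are slope and intercept."""
    return slope * x + intercept

def train(data, learning_rate=0.1, max_cycles=1000, target_loss=1e-9):
    slope, intercept = 0.0, 0.0        # initial parameter values
    loss = float("inf")
    for _ in range(max_cycles):        # stop condition 1: cycle budget
        # Step 1: feed the training data into the model.
        errors = [predict(slope, intercept, x) - y for x, y in data]
        # Step 2: compute the loss (mean squared error) on the results.
        loss = sum(e ** 2 for e in errors) / len(data)
        if loss < target_loss:         # stop condition 2: loss is small enough
            break
        # Step 3: move each parameter against its gradient to reduce the loss.
        grad_slope = 2 * sum(e * x for e, (x, _) in zip(errors, data)) / len(data)
        grad_intercept = 2 * sum(errors) / len(data)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept, loss

slope, intercept, final_loss = train(training_data)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

The learning rate and the cycle budget are examples of hyperparameters: the training algorithm never changes them, but they affect how quickly and how reliably training converges.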

**Advice from the experts**
Remember the following advice when training your model.
1.	Practitioners often use machine learning frameworks that already have working implementations of models and model training algorithms. You could implement these from scratch, but you probably won't need to do so unless you’re developing new models or algorithms.
2.	Practitioners use a process called model selection to determine which model or models to use. The list of established models is constantly growing, and even seasoned machine learning practitioners try many different types of models while solving a problem with machine learning.
3.	Hyperparameters are settings on the model that are not changed during training, but can affect how quickly or how reliably the model trains, such as the number of clusters the model should identify.
4.	Be prepared to iterate. Pragmatic problem solving with machine learning is rarely an exact science, and you might have assumptions about your data or problem that turn out to be false. Don’t get discouraged. Instead, foster a habit of trying new things, measuring success, and comparing results across iterations.

**(Optional) Extended learning**
This information wasn't covered in the video from the previous section, but it is provided for the advanced reader.

**Linear models**
One of the most common models covered in introductory coursework, linear models simply describe the relationship between a set of input numbers and a set of output numbers through a linear function (think of y = mx + b or a line on an x vs. y chart). Classification tasks often use a strongly related logistic model, which adds a transformation mapping the output of the linear function to the range [0, 1], interpreted as “probability of being in the target class.” Linear models are fast to train and give you a great baseline against which to compare more complex models. A lot of media buzz is given to more complex models, but for most new problems, consider starting with a simple model.
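
To make the relationship between the two model types concrete, here is a small sketch of a linear function followed by the logistic transformation. The function names and the example weights are hypothetical; a trained model would learn `slope` and `intercept` from data.

```python
import math

def linear(x, slope, intercept):
    """The linear part: y = m * x + b."""
    return slope * x + intercept

def logistic(z):
    """Map a real number into (0, 1). Not guarded against overflow for very negative z."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x, slope, intercept):
    """Logistic model: a linear function followed by the logistic transformation."""
    return logistic(linear(x, slope, intercept))

# Hypothetical weights, chosen only to show the shape of the output.
for x in (-4.0, 0.0, 4.0):
    print(x, round(predict_probability(x, slope=1.0, intercept=0.0), 3))
```

The output rises smoothly from near 0 to near 1 as the linear score increases, which is what allows it to be read as a probability of being in the target class.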

**Tree-based models**
Tree-based models are probably the second most common model type covered in introductory coursework. They learn to categorize or regress by building a potentially very large structure of nested if/else blocks, splitting the world into different regions at each if/else block. Training determines exactly where these splits happen and what value is assigned at each leaf region. For example, if you’re trying to determine if a light sensor is in sunlight or shadow, you might train a tree of depth 1 with the final learned configuration being something like `if (sensor_value > 0.698) return 1; else return 0;`. The tree-based model XGBoost is commonly used as an off-the-shelf implementation for this kind of model and includes enhancements beyond what is discussed here. Try tree-based models to quickly get a baseline before moving on to more complex models.
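
The depth-1 tree from the light sensor example can be sketched as follows. This is a simplified illustration, not how XGBoost or any particular library trains trees: the sensor readings and labels are made up, the tree is assumed to predict 1 for values above the threshold, and the split is chosen simply by counting correct predictions.

```python
def fit_stump(values, labels):
    """Learn the single threshold of a depth-1 tree on one input feature."""
    ordered = sorted(set(values))
    # Candidate split points: midpoints between neighboring distinct values.
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best_threshold, best_correct = None, -1
    for threshold in candidates:
        correct = sum(
            (value > threshold) == bool(label)
            for value, label in zip(values, labels)
        )
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold

def predict_stump(threshold, value):
    """The learned if/else block: 1 means sunlight, 0 means shadow."""
    return 1 if value > threshold else 0

# Hypothetical sensor readings and labels (1 = sunlight, 0 = shadow).
sensor_values = [0.10, 0.35, 0.52, 0.81, 0.90, 0.95]
in_sunlight = [0, 0, 0, 1, 1, 1]

threshold = fit_stump(sensor_values, in_sunlight)
print("learned threshold:", round(threshold, 3))
```

Deeper trees repeat the same idea inside each region, which is how the nested if/else structure grows.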

**Deep learning models**
Extremely popular and powerful, deep learning is a modern approach that is based around a conceptual model of how the human brain functions. The model (also called a neural network) is composed of collections of neurons (very simple computational units) connected together by weights (mathematical representations of how much information is allowed to flow from one neuron to the next). The process of training involves finding values for each weight. Various neural network structures have been determined for modeling different kinds of problems or processing different kinds of data.

A short (but not complete!) list of noteworthy examples includes:

•	FFNN: The most straightforward way of structuring a neural network, the Feed Forward Neural Network (FFNN) structures neurons in a series of layers, with each neuron in a layer containing weights to all neurons in the previous layer.

•	CNN: Convolutional Neural Networks (CNN) represent nested filters over grid-organized data. They are by far the most commonly used type of model when processing images.

•	RNN/LSTM: Recurrent Neural Networks (RNN) and the related Long Short-Term Memory (LSTM) model types are structured to effectively represent for loops in traditional computing, collecting state while iterating over some object. They can be used for processing sequences of data.

•	Transformer: A more modern replacement for RNN/LSTMs, the transformer architecture enables training over larger datasets involving sequences of data.

**Machine learning using Python libraries**

•	For more classical models (linear, tree-based) as well as a set of common ML-related tools, take a look at scikit-learn. The web documentation for this library is also organized for those getting familiar with the space and can be a great place to learn about some extremely useful tools and techniques.
•	For deep learning, mxnet, tensorflow, and pytorch are three commonly used libraries. For the majority of machine learning needs, each of these offers comparable features.

**Wrap-up**
Nice work completing this chapter! Let’s review a couple of key parts from this lesson:

One
The model training algorithm iteratively updates a model's parameters to minimize some loss function.

Two
During model training, the training data is fed into the model, and then the loss function is computed based on the results. The model parameters are then updated in a direction that reduces loss. You will continue to cycle through these steps until you reach a predefined stop condition.

**Key terms from this lesson:**

•	Hyperparameters are settings on the model that are not changed during training but can affect how quickly or how reliably the model trains, such as the number of clusters the model should identify.

•	A loss function is used to codify the model’s distance from a goal.

•	Training dataset: The data on which the model will be trained. Most of your data will be here.

•	Test dataset: The data withheld from the model during training, which is used to test how well your model will generalize to new data.

•	Model parameters are settings or configurations the training algorithm can update to change how the model behaves.


### Step 4: Evaluating a trained model

After you have collected your data and trained a model, you can start to evaluate how well your model is performing. The metrics used for evaluation are likely to be very specific to the problem you defined. As you grow in your understanding of machine learning, you will be able to explore a wide variety of metrics that can enable you to evaluate effectively.

**Model accuracy**
Model accuracy is a fairly common evaluation metric. Accuracy is the fraction of predictions a model gets right.
Here's an example:
![image.png](attachment:image.png)

Imagine that you build a model to identify a flower as one of two common species based on measurable details like petal length. You want to know how often your model predicts the correct species. This would require you to look at your model's accuracy.
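
Accuracy is simple enough to compute by hand. In this sketch, the species names and the two lists of labels are invented for illustration; `accuracy` is a hypothetical helper, not a library function.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Hypothetical test labels and model predictions for two flower species.
actual_species = ["setosa", "versicolor", "setosa", "versicolor", "setosa"]
predicted_species = ["setosa", "versicolor", "versicolor", "versicolor", "setosa"]

score = accuracy(predicted_species, actual_species)
print(f"accuracy: {score:.0%}")  # 4 of 5 predictions are correct
```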

**An iterative process**
![image-2.png](attachment:image-2.png)

Every step we have gone through is highly iterative and can be changed or rescoped during the course of a project. At each step, you might find that you need to go back and reevaluate some assumptions you had in previous steps. Don't worry! This ambiguity is normal.

**(Optional) Extended learning**
This information hasn't been covered in the above video but is provided for the advanced reader.

**Using Log Loss**
![image-3.png](attachment:image-3.png)
Let's say you're trying to predict how likely a customer is to buy either a jacket or t-shirt.
Log loss could be used to understand your model's uncertainty about a given prediction. In a single instance, your model could predict with 5% certainty that a customer is going to buy a t-shirt. In another instance, your model could predict with 80% certainty that a customer is going to buy a t-shirt. Log loss enables you to measure how strongly the model believes that its prediction is accurate.
In both cases, the model predicts that a customer will buy a t-shirt, but the model's certainty about that prediction can change.
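
A minimal sketch of binary log loss, assuming the label 1 means "bought a t-shirt", 0 means "bought a jacket", and the model outputs the probability of a t-shirt. The function name, the clipping constant `eps`, and the example numbers are assumptions for illustration.

```python
import math

def log_loss(true_labels, predicted_probabilities, eps=1e-15):
    """Average binary log loss; probabilities are P(label == 1)."""
    total = 0.0
    for y, p in zip(true_labels, predicted_probabilities):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(true_labels)

# The customer actually bought a t-shirt (label 1) in both cases.
loss_at_80_percent = log_loss([1], [0.80])
loss_at_5_percent = log_loss([1], [0.05])
print(round(loss_at_80_percent, 3), round(loss_at_5_percent, 3))
```

The loss is small when the model assigns a high probability to what actually happened and grows quickly as that probability approaches zero, so it penalizes confident mistakes much more than hesitant ones.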


### Step 5: Use the model

**Model inference**
Congratulations! You're ready to deploy your model. Once you have trained your model, have evaluated its effectiveness, and are satisfied with the results, you're ready to generate predictions on real-world problems using unseen data in the field. In machine learning, this process is often called inference.


**Machine learning is iterative**
 
Even after you deploy your model, you're always monitoring to make sure your model is producing the kinds of results that you expect. There may be times where you reinvestigate the data, modify some of the parameters in your model training algorithm, or even change the model type used for training.

**Wrap-up**
Great job getting through all the steps in the machine learning process. Let’s review some key takeaways from these lessons.
One
Solving problems using machine learning is an evolving and iterative process.
Two
To solve a problem successfully in machine learning, finding high quality data is essential.
Three
To evaluate models, you often use statistical metrics. The metrics you choose are tailored to a specific use case.


## Examples of machine learning

### Case studies that use machine learning

In these remaining chapters, you will go through three case studies that demonstrate how you can use machine learning to solve real-world problems.

### Summary of examples

Through the remainder of the lesson, we will walk through three different case studies. In each example, you will see how machine learning can be used to solve real-world problems.

**Supervised learning**

Using machine learning to predict housing prices in a neighborhood, based on lot size and the number of bedrooms.

**Unsupervised learning**

Using machine learning to isolate micro-genres of books by analyzing the wording on the back cover description.

**Deep neural network**

While this type of task is beyond the scope of this lesson, we wanted to show you the power and versatility of modern machine learning. You will see how it can be used to analyze raw images from lab video footage from security cameras, trying to detect chemical spills.

### Example 1: Predicting home prices

House price prediction is one of the most common examples used to introduce machine learning.

Traditionally, real estate appraisers use many quantifiable details about a home (such as number of rooms, lot size, and year of construction) to help them estimate the value of a house.

You detect this relationship and believe that you could use machine learning to predict home prices.

![houseValue.f062a034.png](attachment:houseValue.f062a034.png)

In the next sections, you will go through the five major steps in machine learning in the context of this example.

#### Step 1: Define the problem

Problem

Can we estimate the price of a house based on lot size or the number of bedrooms?

You access the sale prices for recently sold homes or have them appraised. Since you have this data, this is a supervised learning task. You want to predict a continuous numeric value, so this task is also a regression task.

![labeledData.19ee4ede.png](attachment:labeledData.19ee4ede.png)



#### Step 2: Build the dataset

For this project, you need data about home prices, so you do the following tasks:

**Data collection:** You collect numerous examples of homes sold in your neighborhood within the past year, and pay a real estate appraiser to appraise the homes whose selling price is not known.

**Data exploration:** You confirm that all of your data is numerical because most machine learning models operate on sequences of numbers. If there is textual data, you need to transform it into numbers. You'll see this in the next example.

**Data cleaning:** Look for things such as missing information or outliers, such as the 10-room mansion. You can use several techniques to handle outliers, but you can also just remove them from your dataset.

![datatable.34e7f293.png](attachment:datatable.34e7f293.png)

You also want to look for trends in your data, so you use data visualization to help you find them.

You can plot home values against each of your input variables to look for trends in your data. In the following chart, you see that when lot size increases, house value increases.




![lotSize.b231d1f1.png](attachment:lotSize.b231d1f1.png)

#### Step 3: Model training
Prior to actually training your model, you need to split your data. The standard practice is to put 80% of your dataset into a training dataset and 20% into a test dataset.

***Linear model selection***

As you see in the preceding chart, when lot size increases, home values increase too. This relationship is simple enough that a linear model can be used to represent this relationship.
A linear model across a single input variable can be represented as a line. It becomes a plane for two variables, and then a hyperplane for more than two variables. The intuition, as a line with a constant slope, doesn't change.
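
For a single input variable, the best-fitting line can be computed directly with the ordinary least squares formulas. The sketch below does this in plain Python so the arithmetic is visible; the lot sizes and prices are hypothetical, and `fit_line` is a helper written for this example rather than a library call.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

# Hypothetical training data: lot size in square feet, sale price in dollars.
lot_sizes = [4000, 5000, 6000, 7000, 8000]
prices = [210_000, 250_000, 300_000, 330_000, 385_000]

slope, intercept = fit_line(lot_sizes, prices)
predicted_price = slope * 6500 + intercept
print(f"price = {slope:.1f} * lot_size + {intercept:.0f}")
print(f"predicted price for a 6,500 sq ft lot: {predicted_price:.0f}")
```

With more than one input variable (for example, lot size and number of bedrooms) the same idea applies, but the fit is usually delegated to a library.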

***Using a Python library***

The Python [scikit-learn library](https://scikit-learn.org/stable/)  has tools that can handle the implementation of the model training algorithm for you.

#### Step 4: Model evaluation

One of the most common evaluation metrics in a regression scenario is the root mean square error, abbreviated in this lesson as RMS (you will also see it written as RMSE). The math is beyond the scope of this lesson, but RMS can be thought of roughly as the "average error" across your test dataset, so you want this value to be low.

![rms.7b9e4450.png](attachment:rms.7b9e4450.png)
In the following chart, you can see where the data points are in relation to the blue line. You want the data points to be as close to the "average" line as possible, which would mean less net error.

You compute the root mean square between your model’s prediction for a data point in your test dataset and the true value from your data. This actual calculation is beyond the scope of this lesson, but it's good to understand the process at a high level.

![rmsChart.dffe2281.png](attachment:rmsChart.dffe2281.png)
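
Although the lesson does not go into the math, the calculation itself is short. The following sketch computes the root mean square of the prediction errors on a made-up test dataset; the prices and the function name are assumptions for illustration.

```python
import math

def root_mean_square_error(predictions, actuals):
    """Square each error, average the squares, then take the square root."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical test dataset: true sale prices and the model's predictions.
actual_prices = [250_000, 310_000, 405_000, 198_000]
predicted_prices = [260_000, 300_000, 385_000, 218_000]

rms = root_mean_square_error(predicted_prices, actual_prices)
print(f"RMS error: ${rms:,.0f}")
```

Because the errors are squared before averaging, a few large misses raise the result more than many small ones.
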
***Interpreting Results***

In general, as your model improves, you see a better RMS result. You may still not be confident about whether the specific value you’ve computed is good or bad.

Many machine learning engineers manually count how many predictions were off by a threshold (for example, $50,000 in this house pricing problem) to help determine and verify the model's accuracy.
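
That manual check can be sketched as a simple count. The $50,000 threshold comes from the example above; the price lists and the helper name are hypothetical.

```python
def count_large_errors(predictions, actuals, threshold):
    """Count predictions whose absolute error exceeds the threshold."""
    return sum(1 for p, a in zip(predictions, actuals) if abs(p - a) > threshold)

# Hypothetical test dataset: true sale prices and the model's predictions.
actual_prices = [250_000, 310_000, 405_000, 198_000, 520_000]
predicted_prices = [260_000, 300_000, 340_000, 218_000, 455_000]

misses = count_large_errors(predicted_prices, actual_prices, threshold=50_000)
share = misses / len(actual_prices)
print(f"{misses} of {len(actual_prices)} predictions were off by more than $50,000")
```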

#### Step 5: Model inference

Now you are ready to put your model into action. As you can see in the following image, this means seeing how well it predicts with new data not seen during model training.

![inf1.3f8cb93f.png](attachment:inf1.3f8cb93f.png)

#### Wrap-up

In this example, you saw how you can use machine learning to help predict home prices.

***One***
Solving problems using machine learning is an evolving and iterative process.

***Two***
To solve a problem successfully in machine learning, finding high quality data is essential.

***Three***
To evaluate models, you often use statistical metrics. The metrics you choose are tailored to a specific use case.

***Terminology***

Regression: A common task in supervised machine learning used to understand the relationship between multiple variables from a dataset.

Continuous: Floating-point values with an infinite range of possible values. This is the opposite of categorical or discrete values, which take on a limited number of possible values.
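
Hyperplane: A mathematical term for the generalization of a line or plane to spaces with more dimensions; it is a flat surface with one fewer dimension than the space that contains it.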

Plane: A mathematical term for a flat surface (like a piece of paper) on which two points can be joined by drawing a straight line.

### Example 2: Using ML to predict a book's genre

This case study demonstrates how you can use an unsupervised learning approach, clustering on a bag of words representation of book descriptions, to help identify a book’s genre.


from IPython.display import HTML

HTML('<iframe width="640" height="360" src="https://www.youtube.com/embed/XP4-FOvlxVs" title="ND065 AWSND C1 L02 A12 Example 2 V2" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>')

***Summary***

In this video, you saw how the machine learning process can be applied to an unsupervised machine learning task that uses book description text to identify different micro-genres.

Additionally, this case study uses the same five steps of machine learning that were outlined in the previous case study and lesson. In the following sections, you can review each of the five steps and how they apply to this specific case study.

#### Step 1: Define the problem
Is it possible to find clusters of similar books based on the presence of common words in the book descriptions?

![books.41b6c9f2.png](attachment:books.41b6c9f2.png)
You do editorial work for a book recommendation company, and you want to write an article on the largest book trends of the year. You believe that a trend called "micro-genres" exists, and you have confidence that you can use the book description text to identify these micro-genres.

By using an unsupervised machine learning technique called clustering, you can test your hypothesis that the book description text can be used to identify these "hidden" micro-genres.

***Identify the machine learning task you could use***

Earlier in this lesson, you were introduced to the idea of unsupervised learning. This machine learning task is especially useful when your data is not labeled.

![unsuperclust.c6b5e1c6.png](attachment:unsuperclust.c6b5e1c6.png)

#### Step 2: Build your dataset
To test the hypothesis, you gather book description text for 800 romance books published in the current year. You plan to use this text as your dataset.

***Data exploration, cleaning, and preprocessing***

In the lesson about building your dataset, you learned that it is sometimes necessary to change the format of the data that you want to use. In this case study, you need to use a process called vectorization. Vectorization is a process whereby words are converted into numbers.

***Data cleaning and exploration***

For this project, you believe capitalization and verb tense will not matter, and therefore you remove capitals and convert all verbs to the same tense using a Python library built for processing human language. You also remove punctuation and words you don’t think have useful meaning, like 'a' and 'the'. The machine learning community refers to these words as stop words.
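
A simplified sketch of this cleaning step, using only the standard library. It lowercases the text, strips punctuation, and removes stop words; converting verbs to a common tense is left out because it needs a language-processing library. The `STOP_WORDS` set is a tiny hypothetical list and the sample description is invented.

```python
import string

# A tiny, hypothetical stop word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "and", "of", "in", "to", "is"}

def clean_description(text):
    """Lowercase the text, strip punctuation, and drop stop words."""
    lowered = text.lower()
    no_punctuation = lowered.translate(str.maketrans("", "", string.punctuation))
    return [word for word in no_punctuation.split() if word not in STOP_WORDS]

tokens = clean_description("The Vampire's heart is torn: a love story, in the dark!")
print(tokens)
# ['vampires', 'heart', 'torn', 'love', 'story', 'dark']
```

Note that stripping the apostrophe turns "Vampire's" into "vampires". Whether that is acceptable depends on your data, which is one reason to inspect the cleaned text before moving on.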

***Data preprocessing***

Before you can train the model, you need to do a type of data preprocessing called data vectorization, which is used to convert text into numbers.
As shown in the following image, you transform this book description text into what is called a bag of words representation, so that it is understandable by machine learning models.

How the bag of words representation works is beyond the scope of this lesson. If you are interested in learning more, see the What's next section at the end of this chapter.

![bagOfWords.b044011a.png](attachment:bagOfWords.b044011a.png)
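For readers who want a rough feel for the idea, here is a minimal sketch of a bag of words built with the Python standard library. The function name `bag_of_words` and the two sample descriptions are assumptions made for illustration; in practice you would most likely use a library that does this for you.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, vectors) with one word-count vector per document."""
    tokenized = [doc.lower().split() for doc in documents]
    # The vocabulary is every distinct word seen across all documents.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


docs = ["vampire loves vampire", "duke loves duchess"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)  # prints ['duchess', 'duke', 'loves', 'vampire']
print(vectors)     # prints [[0, 0, 1, 2], [1, 1, 1, 0]]
```

Each row of numbers has the same length, which is what makes the text usable as model input.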

#### Step 3: Train the model

Now you are ready to train your model.

You pick a common cluster-finding model called k-means. In this model, you can change a model parameter, k, to be equal to how many clusters the model will try to find in your dataset.

Your data is unlabeled and you don't know how many micro-genres might exist. So, you train your model multiple times using different values for k each time.

What does this even mean? In the following graphs, you can see examples of when k=2 and when k=3.

![k2b.7f5a0f0c.png](attachment:k2b.7f5a0f0c.png)

During the model evaluation phase, you plan on using a metric to find which value for k is the most appropriate.
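To make the role of k more concrete, the following is a small, simplified k-means sketch in plain Python. It is not the implementation used in this case study. The six two-dimensional points are made up, and the starting centroids are simply taken at evenly spaced positions in the list so that the run is repeatable (real implementations usually start from randomly chosen centroids).

```python
import math


def kmeans(points, k, iterations=20):
    """Very small k-means: returns (centroids, labels)."""
    # Assumption: evenly spaced points as starting centroids, for repeatability.
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return centroids, labels


# Made-up data with three visually obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9), (9.0, 1.0), (9.1, 1.2)]
for k in (2, 3):
    _, labels = kmeans(points, k)
    print(k, labels)
```

With k=2 the model is forced to merge two of the groups, while with k=3 each group gets its own cluster.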

#### Step 4: Model evaluation

In machine learning, numerous statistical metrics or methods are available to evaluate a model. In this use case, the silhouette coefficient is a good choice. This metric describes how well your data was clustered by the model. To find the optimal number of clusters, you plot the silhouette coefficient as shown in the following image. You find the optimal value is when k=19.

![k19b.5c2dead3.png](attachment:k19b.5c2dead3.png)
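As a rough sketch of how such a score can be computed, the function below follows the usual definition of the silhouette coefficient: for each point, compare the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b). The data and the two labelings are made up, and the function assumes at least two clusters and points that are not all identical.

```python
import math


def silhouette_coefficient(points, labels):
    """Mean silhouette value over all points (ranges from -1 to 1)."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for single-point clusters
            continue
        # a: mean distance to the other points in the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the points of the nearest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Made-up data with three obvious groups, labeled with k=2 and with k=3.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9), (9.0, 1.0), (9.1, 1.2)]
score_k2 = silhouette_coefficient(points, [0, 0, 1, 1, 1, 1])
score_k3 = silhouette_coefficient(points, [0, 0, 1, 1, 2, 2])
print(score_k2, score_k3)
```

On this toy data the k=3 labeling scores higher than the k=2 labeling, which is the kind of comparison you would make across candidate values of k.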

Often, machine learning practitioners do a manual evaluation of the model's findings.

You find one cluster that contains a large collection of books that you can categorize as "paranormal teen romance." This trend is known in your industry, and therefore you feel somewhat confident in your machine learning approach. You don’t know if every cluster is going to be as cohesive as this, but you decide to use this model to see if you can find anything interesting about which to write an article.

#### Step 5: Model inference

As you inspect the different clusters found when k=19, you find a surprisingly large cluster of books. Here's an example from fictionalized cluster #7.

![silhou.a8ac69af.png](attachment:silhou.a8ac69af.png)
As you inspect the preceding table, you can see that most of these text snippets indicate that the characters are in some kind of long-distance relationship. You see a few other self-consistent clusters and feel you now have enough useful data to begin writing an article on unexpected modern romance micro-genres.

#### Wrap-up

In this example, you saw how you can use machine learning to help find micro-genres in books by using the text found on the back of the book. Here is a summary of key moments from the lesson you just finished.

**One**

For some applications of machine learning, you need to not only clean and preprocess the data but also convert the data into a format that is machine readable. In this example, the words were converted into numbers through a process called data vectorization.

**Two**

Solving problems in machine learning requires iteration. In this example, you saw how it was necessary to train the model multiple times for different values of k. After training your model over multiple iterations, you saw how the silhouette coefficient could be used to determine the optimal value for k.

**Three**

During model inference, you continued to inspect the clusters for accuracy to ensure that your model was generating useful predictions.

**Terminology**

Bag of words: A technique used to extract features from text. It counts how many times each word appears in each document of a collection (the corpus), and then transforms those counts into a dataset.

Data vectorization: A process that converts non-numeric data into a numerical format so that it can be used by a machine learning model.

Silhouette coefficients: A score from -1 to 1 describing the clusters found during modeling. A score near zero indicates overlapping clusters, and scores less than zero indicate data points assigned to incorrect clusters. A score approaching 1 indicates successful identification of discrete non-overlapping clusters.

Stop words: A list of words removed by natural language processing tools when building your dataset. There is no single universal list of stop words used by all natural language processing tools.

## Example #3 Using ML to detect spills

This case study demonstrates how you can use a neural network to detect spills from a camera feed.

In [2]:
from IPython.display import HTML

HTML('<iframe width="429" height="550" src="https://www.youtube.com/embed/VTmiITFTuEo" title="ND065 AWSND C1 L02 A13 Example 3 V3" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>')



In the previous two examples, we used classical methods like linear models and k-means to solve machine learning tasks. In this example, we’ll use a more modern model type.

Note: This example uses a neural network. The algorithm for how a neural network works is beyond the scope of this lesson. However, there is still value in seeing how machine learning applies in this case.

### Step 1: Defining the problem

Imagine you run a company that offers specialized on-site janitorial services. One client - an industrial chemical plant - requires a fast response for spills and other health hazards. You realize if you could automatically detect spills using the plant's surveillance system, you could mobilize your janitorial team faster.

Machine learning could be a valuable tool to solve this problem.

![spills.b9ee8efe.png](attachment:spills.b9ee8efe.png)

**Choosing a model**

As shown in the image above, your goal will be to predict if each image belongs to one of the following classes:

* Contains spill
* Does not contain spill

![superclass.305479fc.png](attachment:superclass.305479fc.png)

### Step 2: Building a dataset

**Collecting**

Using historical data, as well as safely staged spills, quickly build a collection of images that contain both spills and non-spills in multiple lighting conditions and environments.

**Exploring and cleaning**

Go through all of the photos to ensure that the spill is clearly in the shot. There are Python tools and other techniques available to improve image quality, which you can use later if you determine that you need to iterate.

**Data vectorization (converting to numbers)**

Many models require numerical data, so all of your image data needs to be transformed into a numerical format. Python tools can help you do this automatically.

In the following images, you can see how each pixel in the first image can be represented in the second image by a number between 0 and 1, with 0 being completely black and 1 being completely white.

![download.png](attachment:download.png)

![spillnumbers.0b8e60a8.png](attachment:spillnumbers.0b8e60a8.png)
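A minimal sketch of this conversion, assuming the image has already been loaded as rows of 8-bit grayscale values (0 to 255). The tiny image and the function name `normalize_pixels` are hypothetical.

```python
def normalize_pixels(image):
    """Scale 8-bit grayscale values (0-255) to floats between 0 and 1."""
    return [[pixel / 255 for pixel in row] for row in image]


# A hypothetical 2x3 grayscale image: 0 is black, 255 is white.
image = [
    [0, 128, 255],
    [64, 192, 32],
]
normalized = normalize_pixels(image)
print(normalized)
```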

**Split the data**

Split your image data into a training dataset and a test dataset.
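One simple way to do the split is to shuffle the samples and hold back a fraction for testing. The sketch below assumes a hypothetical list of image file names and an 80/20 split; neither comes from the lesson.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and return (training set, test set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical file names standing in for the collected images.
images = [f"frame_{i:03d}.png" for i in range(10)]
train_images, test_images = train_test_split(images)
print(len(train_images), len(test_images))  # prints 8 2
```

Keeping the test images out of training is what lets you later check the model on data it has not seen.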

### Step 3: Model Training
Traditionally, solving this problem would require hand-engineering features on top of the underlying pixels (for example, locations of prominent edges and corners in the image), and then training a model on these features.
Today, deep neural networks are the most common tool used for solving this kind of problem. Many deep neural network models are structured to learn the features on top of the underlying pixels so you don’t have to engineer them by hand. You’ll have a chance to take a deeper look at this in the next lesson, so we’ll keep things high-level for now.

**CNN (convolutional neural network)**

Neural networks are beyond the scope of this lesson, but you can think of them as a collection of very simple models connected together. These simple models are called neurons, and the connections between these models are trainable model parameters called weights.
Convolutional neural networks are a special type of neural network that is particularly good at processing images.

### Step 4: Model evaluation
As you saw in the last example, there are many different statistical metrics that you can use to evaluate your model. As you gain more experience in machine learning, you will learn how to research which metrics can help you evaluate your model most effectively. Here's a list of common metrics:

* Accuracy
* Confusion matrix
* F1 score
* False positive rate
* False negative rate
* Log loss
* Negative predictive value
* Precision
* Recall
* ROC Curve
* Specificity

In cases such as this, accuracy might not be the best evaluation mechanism.

Why not? The model will see the "does not contain spill" class almost all the time, so any model that just predicts "no spill" most of the time will seem pretty accurate.

What you really care about is a model that rarely misses a real spill, so you need an evaluation metric that reflects this.

After doing some internet sleuthing, you realize this is a common problem and that precision and recall will be effective. Think of precision as answering the question, "Of all predictions of a spill, how many were right?" and recall as answering the question, "Of all actual spills, how many did we detect?"
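The following sketch shows why accuracy can mislead here. The data is made up: 2 real spills in 100 frames, and a model that never predicts a spill. The function name `evaluate` is an assumption for this example.

```python
def evaluate(actual, predicted):
    """Return (accuracy, precision, recall) for boolean spill labels."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    true_neg = sum(1 for a, p in pairs if not a and not p)

    accuracy = (true_pos + true_neg) / len(pairs)
    # Of all predictions of a spill, how many were right?
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Of all actual spills, how many did we detect?
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return accuracy, precision, recall


# Made-up data: 2 real spills in 100 frames, and a model that always says "no spill".
actual = [True] * 2 + [False] * 98
always_no_spill = [False] * 100
print(evaluate(actual, always_no_spill))  # prints (0.98, 0.0, 0.0)
```

The model looks 98% accurate even though it misses every spill, which is exactly what the recall of 0.0 exposes.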

Manual evaluation plays an important role. If you are unsure if your staged spills are sufficiently realistic compared to actual spills, you can get a better sense of how well your model performs with actual spills by finding additional examples from historical records. This allows you to confirm that your model is performing satisfactorily.

### Step 5: Model inference
The model can be deployed on a system that enables you to run machine learning workloads, such as AWS Panorama.

Thankfully, most of the time, the results will be from the class "does not contain spill."

![nospilled.746294f5.png](attachment:nospilled.746294f5.png)

But when the class "contains spill" is detected, a simple paging system could alert the team to respond.

![spilled.4486f127.png](attachment:spilled.4486f127.png)

**Wrap-up**

In this example, you saw how you can use machine learning to help detect spills in a work environment. This example also used a modern machine learning technique called a convolutional neural network (CNN).

Here is a summary of key moments from the lesson that you just finished.

**One**

For some applications of machine learning, you need to use more complicated techniques to solve the problem. While modern neural networks are a powerful tool, don’t forget that this power comes at a cost: their predictions are harder to explain.

**Two**

High-quality data was once again very important to the success of this application, to the point where even staging some fake data was required. Data vectorization was also required again: the images had to be converted into numbers so that they could be used by the neural network.

**Three**

During model inference, you continued to inspect the predictions for accuracy. This is especially important in this case because you staged some fake data to use when training your model.

**Terminology**

Neural networks: a collection of very simple models connected together.

* These simple models are called neurons.
* The connections between these models are trainable model parameters called weights.

Convolutional neural networks (CNN): a special type of neural network particularly good at processing images.

# Course: Introduction to reinforcement learning with AWS DeepRacer

This lesson covers the fundamentals of reinforcement learning with AWS DeepRacer. Each chapter introduces you to concepts in reinforcement learning using AWS DeepRacer as an example.

Using this content can help you prequalify for the AWS AI & ML Scholarship program. To learn more about the program, see the What is the AWS AI & ML Scholarship program topic in Learn more.

##  Machine Learning Refresher
This chapter will introduce the content of this course, and provide a refresher on the key concepts in machine learning.

Throughout this course you are going to learn the fundamental concepts which underpin Reinforcement Learning - an exciting branch of machine learning which has real-world applications from training computers how to play computer games through to training autonomous vehicles.

But before we start our journey let’s have a refresher on the key machine learning concepts.

In [3]:
from IPython.display import HTML

HTML('<iframe width="521" height="500" src="https://www.youtube.com/embed/riYohxyHg-k" title="DeepRacer Student - Chapter 1.1 (Machine learning refresher)" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>')

**Summary**

Machine learning is part of the broader field known as artificial intelligence, which involves simulating human intelligence and decision making by building algorithms that take and process information to inform future decisions. Machine learning involves teaching an algorithm how to learn without being explicitly programmed to do so.

### Supervised learning

In supervised learning, every training sample from the dataset has a corresponding label associated with it, which essentially tells the machine learning algorithm what the training sample is. As a result, the algorithm can then learn from this data to predict labels for unseen data in the future.

![Lesson13SupervisedLearning.951e60b5.jpg](attachment:Lesson13SupervisedLearning.951e60b5.jpg)

In this example, the algorithm is being trained to identify flowers. So it would be given training data with images of flowers along with the kind of flower each image contains (i.e. the label). The algorithm then uses this training data to learn and identify flowers in unseen images it may be provided in the future.

### Unsupervised learning
In unsupervised learning, there are no labels for the training data. The machine learning algorithm tries to learn the underlying patterns or distributions that govern the data.

![Lesson13UnsupervisedLearning.7e2e6610.png](attachment:Lesson13UnsupervisedLearning.7e2e6610.png)

In this example, the training data given to the machine learning algorithm does not contain labels to predict. Instead, the algorithm must identify patterns in the data itself. This can often be a benefit since it allows you to use massive datasets where labels are often not available.

### Reinforcement learning
Reinforcement learning is very different to supervised and unsupervised learning. In reinforcement learning, the algorithm learns from experience and experimentation. Essentially, it learns from trial and error.

![Lesson13ReinforcementLearning.9f215e51.png](attachment:Lesson13ReinforcementLearning.9f215e51.png)

In this example, we are training a dog. The dog will try and do different things in response to commands you may issue, such as “sit” or “stay”, and when it does the right thing you provide a treat, like a doggie biscuit. Over time the dog learns that to get a reward it needs to correctly follow your commands.

### Wrap-Up
In this lesson you have completed a refresher of the three main types of machine learning: supervised learning, unsupervised learning, and reinforcement learning.

The rest of this course will be focusing on reinforcement learning and how you can use this to get started with AWS DeepRacer. See you on the track!

In [4]:
from IPython.display import HTML

HTML('<iframe width="521" height="500" src="https://www.youtube.com/embed/_pNU0hqFTxw" title="DeepRacer Student - Chapter 1.2 (Introduction to reinforcement learning)" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>')

**Summary**
Reinforcement learning consists of several key concepts:

* Agent is the entity being trained. In our example, this is a dog.
* Environment is the “world” in which the agent interacts, such as a park.
* Actions are performed by the agent in the environment, such as running around, sitting, or playing ball.
* Rewards are issued to the agent for performing good actions.

Keep these terms in mind as you continue your journey with reinforcement learning. They will come up frequently and are very important.

Let’s look at some examples of how reinforcement learning can be applied towards real world problems.

### Playing games

![Lesson14Breakout.28f951a8.jpg](attachment:Lesson14Breakout.28f951a8.jpg)

Playing games is a classic example of applied reinforcement learning.

Let’s use the game Breakout as an example. The objective of the game is to control the paddle and direct the ball to hit the bricks and make them disappear. A reinforcement learning model has no idea what the purpose of the game is, but by being rewarded for good behavior (in this case, hitting a brick with the ball) it learns over time that it should do that to maximise reward.

In this situation, the:

* Agent is the paddle;
* Environment is the game scenes with the bricks and boundaries;
* Actions are the movement of the paddle; and
* Rewards are issued to the agent based upon the number of bricks hit with the ball.

### Traffic signaling

![Lesson14TrafficSignals.a677f52a.jpg](attachment:Lesson14TrafficSignals.a677f52a.jpg)

Another use case for reinforcement learning is controlling and coordinating traffic signals to minimize traffic congestion.

How many times have you driven down a road filled with traffic lights and had to stop at every intersection because the lights are not coordinated? Using reinforcement learning, the model aims to maximise its total reward, which it does by changing the traffic signals to keep traffic flowing as freely as possible.

In this use-case, the:

* Agent is the traffic light control system;
* Environment is the road network;
* Actions are changing the traffic light signals (red-yellow-green); and
* Rewards are issued to the agent based upon traffic flow and throughput in the road network.

### Autonomous vehicles

![Lesson14AutonomousCars.877111ff.jpg](attachment:Lesson14AutonomousCars.877111ff.jpg)

A final example of reinforcement learning is self-driving (autonomous) cars.

It's obviously preferable for cars to stay on the road, not run into anything, and travel at a reasonable speed to get the passengers to their destination. A reinforcement learning model can be rewarded for these behaviors and will learn over time that it can maximize rewards by repeating them.

In this case, the:

* Agent is the car (or, more correctly, the self-driving software running on the car);
* Environment is the roads and surroundings in which the car is driving;
* Actions are things such as steering angle and speed; and
* Rewards are issued to the agent based upon how successfully the car stays on the road and drives to the destination.

### Wrap-up
In this lesson you learned more about reinforcement learning and the key steps in developing a reinforcement learning model.

## Reinforcement learning with a mechanical computer
The previous lesson gave you a high-level overview of how reinforcement learning works and some ways that it can be applied in the real world. Now, let's take a deeper dive into how reinforcement learning utilizes rewards and punishments to optimize a strategy.

In [1]:
from IPython.display import HTML

HTML('<iframe width="467" height="500" src="https://www.youtube.com/embed/eSc_Hi9LZeg" title="DeepRacer Student - Chapter 1.3 (Reinforcement learning with a mechanical computer)" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>')



### Conclusion
This chapter uses the game of Hexapawn and a simple mechanical computer to provide insight into how reinforcement learning works. The computer is made out of 24 matchboxes that represent every state in the game and all possible moves from within each state. The matchboxes are filled with a few colored beads that each represent a move possible in that state. When the computer makes a poor decision, it is punished: the bead corresponding to that move is removed so that the move cannot be repeated. Through additional turns and negative feedback, the computer learns to optimize its moves and eventually win games against human players.
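The bead-removal idea can be sketched for a single matchbox as follows. This is an illustration only, not a full Hexapawn player: the class name `Matchbox` and the move names are hypothetical, and the game rules are left out entirely.

```python
import random


class Matchbox:
    """One matchbox: its beads are the moves still allowed in one game state."""

    def __init__(self, moves):
        self.beads = list(moves)

    def pick_move(self, rng):
        # Draw a bead at random; each bead stands for one possible move.
        return rng.choice(self.beads)

    def punish(self, move):
        # Remove the bead for a losing move so that it cannot be chosen again.
        if move in self.beads:
            self.beads.remove(move)


rng = random.Random(0)
box = Matchbox(["advance left", "advance right", "capture"])  # hypothetical moves
losing_move = "advance left"  # assume this move led to a loss
box.punish(losing_move)
print(box.beads)  # prints ['advance right', 'capture']
```

After enough losses, only beads for moves that have not led to a defeat remain, which is how the machine improves without being told the rules of good play.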

To learn more about Hexapawn, you can read the original article by mathematics writer Martin Gardner: “How to build a game-learning machine and then teach it to play and win”, Scientific American: http://cs.williams.edu/~freund/cs136-073/GardnerHexapawn.pdf .

## Reinforcement learning with AWS DeepRacer

In this chapter, you can apply the knowledge you have gained so far about reinforcement learning to AWS DeepRacer.

The previous lessons took us through the fundamental concepts of reinforcement learning, so let’s now dive into how this applies to AWS DeepRacer.

In [1]:
from IPython.display import HTML

HTML('<iframe width="493" height="500" src="https://www.youtube.com/embed/lPo9n_LzYAI" title="DeepRacer Student - Chapter 2.1 (Reinforcement learning with AWS DeepRacer)" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>')



**Summary**

AWS DeepRacer is a 1/18th scale racing car, with the objective being to drive around a track as fast as possible. To achieve this goal, AWS DeepRacer uses reinforcement learning.

* The agent is the AWS DeepRacer car (or, more specifically, the software running on the car).
* The environment is the track, because the agent's goal is to finish laps around the track as fast as possible.
* The agent knows about the environment through the state, which is the portion of the environment known to the agent. In the case of AWS DeepRacer, it is the images being captured by the camera.
* Once the agent knows its state in the environment, it can perform actions in the environment to help it achieve its goal. In the case of DeepRacer, this might be accelerating, braking, turning left, turning right, or going straight.
* The agent then receives feedback in the form of a reward about how well that action contributed towards achieving its goal.
* All of this happens within an episode. This can be thought of as a repeated cycle of the agent performing an action in the environment (based upon the state it has observed) and then receiving feedback in the form of a reward which informs future actions it might take.
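The cycle of state, action, and reward can be sketched with a toy example. Everything here is hypothetical and far simpler than AWS DeepRacer: the "track" is a one-dimensional line, the state is just a position, and the reward rule is made up for illustration.

```python
import random


def run_episode(policy, track_length=5, max_steps=20, seed=0):
    """Run one episode on a toy one-dimensional track and return the total reward."""
    rng = random.Random(seed)
    position = 0  # state: where the agent currently is on the track
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(position, rng)         # agent chooses an action from the state
        position = max(0, position + action)   # environment moves to a new state
        reward = 1.0 if action > 0 else -1.0   # feedback on the action just taken
        total_reward += reward
        if position >= track_length:           # the episode ends at the finish line
            break
    return total_reward


def always_forward(state, rng):
    """Hypothetical policy: always move one step forward."""
    return 1


def random_walk(state, rng):
    """Hypothetical policy: move forward or backward at random."""
    return rng.choice([-1, 1])


print(run_episode(always_forward))  # prints 5.0
print(run_episode(random_walk))
```

The total reward collected over an episode is what the agent tries to increase as it learns.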

## Training your first AWS DeepRacer model

In this chapter, you will be provided with step-by-step guidance on how to train your first AWS DeepRacer model using the AWS DeepRacer Student console.

Well done on making it this far! You should now have a good understanding of the fundamentals of reinforcement learning. This lesson will cover training your very first DeepRacer model.

**Creating a model in AWS DeepRacer Student is a six-step process:**

* Name your model
* Choose track
* Choose algorithm type
* Customize reward function
* Choose duration
* Train your model

The following sections will review these six steps which were covered in the video.

In [2]:
from IPython.display import HTML

HTML('<iframe width="429" height="500" src="https://www.youtube.com/embed/pnc0z76bKzA" title="DeepRacer Student - Chapter 2.2 (Training your first AWS DeepRacer model)" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>')

## Training algorithms

In this chapter, you will be given a deep dive into the PPO and SAC training algorithms, gaining more understanding of the differences between these two algorithms.

In the previous chapters we discussed that an algorithm is used when training a machine learning model. Algorithms are sets of instructions, essentially computer programs. Machine learning algorithms are special programs which learn from data. The algorithm then outputs a model which can be used to make future predictions.

AWS DeepRacer offers two training algorithms:

* Proximal Policy Optimization (PPO)
* Soft Actor Critic (SAC)

This chapter is going to take you through the differences between these two algorithms.

However, before we get started we’ll need to look more closely at how reinforcement learning works.



### Policies

A policy defines the action that the agent should take for a given state. This could conceptually be represented as a table - given a particular state, perform this action.

This is called a deterministic policy, where there is a direct relationship between state and action. This is often used when the agent has a full understanding of the environment and, given a state, always performs the same action.

Consider the classic game of rock, paper, scissors. An example of a deterministic policy is always playing rock. Eventually the other players are going to realize that you are always playing rock and then adapt their strategy to win, most likely by always playing paper. So in this situation it’s not optimal to use a deterministic policy.

So, we can alternatively use a stochastic policy. In a stochastic policy you have a range of possible actions for a state, each with a probability of being selected. When the policy is queried to return an action for a state it selects one of these actions based on the probability distribution.

This would obviously be a much better policy option for our rock, paper, scissors game as our opponents will no longer know exactly which action we will choose each time we play.
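The difference between the two kinds of policy can be sketched for rock, paper, scissors. The function names and the uniform probabilities are assumptions chosen for illustration.

```python
import random

ACTIONS = ["rock", "paper", "scissors"]


def deterministic_policy(state):
    """Always returns the same action for a given state."""
    return "rock"


def stochastic_policy(state, rng):
    """Samples an action from a probability distribution over the actions."""
    probabilities = [1 / 3, 1 / 3, 1 / 3]  # assumed uniform for illustration
    return rng.choices(ACTIONS, weights=probabilities, k=1)[0]


rng = random.Random(42)
print([deterministic_policy(None) for _ in range(5)])
print([stochastic_policy(None, rng) for _ in range(5)])
```

Training a stochastic policy then amounts to adjusting the probabilities so that better actions are selected more often.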

You might now be asking, with a stochastic policy how do you determine the value of being in a particular state and update the probability for the action which got us into this state? This question can also be applied to a deterministic policy; how do we pick the action to be taken for a given state?

Well, we somehow need to determine how much benefit we have derived from that choice of action. We can then update our stochastic policy and either increase or decrease the probability of that chosen action being selected again in the future, or select the specific action with the highest likelihood of future benefit as in our deterministic policy.

If you said that this is based on the reward, you are correct. However, the reward only gives us feedback on the value of the single action we just chose. To truly determine the value of that action (and resulting state) we should not only look at the current reward, but future rewards we could possibly get from being in this state.



### Value function

In the previous section we discussed policies in reinforcement learning, particularly deterministic policies and stochastic policies. That section finished with the question of how we can determine possible future rewards from being in a certain state.


This is done through the value function. Think of this as looking ahead into the future and figuring out how much reward you expect to get given your current policy.

Say the DeepRacer car (agent) is approaching a corner. The algorithm queries the policy about what to do, and it says to accelerate hard. The algorithm then asks the value function how good it thinks that decision was - but unfortunately the results are not too good, as it’s likely the agent will go off-track in the future due to its hard acceleration into a corner. As a result, the value is low and the probabilities of that action can be adjusted to discourage selection of the action and getting into this state.

This is an example of how the value function is used to critique the policy, encouraging desirable actions while discouraging others.

We call this adjustment a policy update, and this regularly happens during training. In fact, you can even define the number of episodes that should occur before a policy update is triggered.

In practice the value function is not a known thing or a proven formula. The reinforcement learning algorithm will estimate the value function from past data and experience.
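One simple way to picture "estimating from past experience" is to average the rewards that followed a state in earlier episodes. The sketch below does that with a discount factor that makes later rewards count slightly less. The discount factor, the function names, and all the reward numbers are assumptions for illustration; this is not how AWS DeepRacer computes its value estimates.

```python
def discounted_return(rewards, discount=0.9):
    """Sum the rewards, counting each later reward a little less than the one before."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + discount * total
    return total


def estimate_value(episodes, discount=0.9):
    """Average the returns observed after a state across past episodes."""
    returns = [discounted_return(rewards, discount) for rewards in episodes]
    return sum(returns) / len(returns)


# Made-up rewards observed after entering a corner in past episodes.
after_hard_acceleration = [[1.0, -5.0, -5.0], [1.0, -5.0, 0.0]]
after_braking = [[0.5, 1.0, 1.0], [0.5, 1.0, 0.5]]

print(estimate_value(after_hard_acceleration))
print(estimate_value(after_braking))
```

In this made-up data, braking has the higher estimated value even though hard acceleration earns the larger immediate reward.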

### PPO and SAC

Now that we have some understanding of how machine learning algorithms work, particularly policies, let’s take a look at the similarities and differences between PPO and SAC in relation to how they learn.

The first thing to point out is that AWS DeepRacer uses both PPO and SAC algorithms to train stochastic policies. So they are similar in that regard. However, there is a key difference between the two algorithms.

PPO uses “on-policy” learning. This means it learns only from observations made by the current policy exploring the environment - using the most recent and relevant data. Say you are learning to drive a car. On-policy learning would be analogous to you reviewing a video of your most recent lesson and taking note of what you did well, and what needs improvement.

In contrast, SAC uses “off-policy” learning. This means it can use observations made from previous policies' exploration of the environment - so it can also use old data. Going back to our learning to drive analogy, this would involve reviewing videos of your driving lessons from the last few weeks. Even though you have probably improved since those lessons, it can still be helpful to watch those videos in order to reinforce good habits and spot bad ones. It could also include reviewing videos of other drivers to get ideas about good and bad things they might be doing.
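The practical difference shows up in how experience is stored. The sketch below is a simplification with hypothetical class names, not the PPO or SAC implementation used by AWS DeepRacer: the on-policy buffer is emptied after every policy update, while the off-policy buffer keeps older experience and samples from it.

```python
import random
from collections import deque


class OnPolicyBuffer:
    """Keeps only experience gathered by the current policy."""

    def __init__(self):
        self.experience = []

    def add(self, transition):
        self.experience.append(transition)

    def training_batch(self):
        batch = list(self.experience)
        self.experience.clear()  # old data is discarded after each policy update
        return batch


class ReplayBuffer:
    """Keeps experience from older policies too, up to a fixed capacity."""

    def __init__(self, capacity=1000):
        self.experience = deque(maxlen=capacity)

    def add(self, transition):
        self.experience.append(transition)

    def training_batch(self, batch_size, rng):
        size = min(batch_size, len(self.experience))
        return rng.sample(list(self.experience), size)


on_policy = OnPolicyBuffer()
off_policy = ReplayBuffer()
for step in range(5):
    transition = ("state", "action", float(step))  # placeholder experience
    on_policy.add(transition)
    off_policy.add(transition)

print(len(on_policy.training_batch()), len(on_policy.experience))  # prints 5 0
print(len(off_policy.training_batch(3, random.Random(0))), len(off_policy.experience))  # prints 3 5
```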

So what are some benefits and drawbacks of each approach?

* PPO generally needs more data as it has a reasonably narrow view of the world, since it does not consider historical data - only the data in front of it during each policy update. In contrast, SAC does consider historical data so it needs less new data for each policy update.

* That said, PPO can produce a more stable model in the short-term as it only considers the most recent, relevant data - compared with SAC which might produce a less stable model in the short-term since it considers less relevant, historical data.

So which should you use? There is no right or wrong answer. SAC and PPO are two algorithms from a field which is constantly evolving and growing. Both have their benefits and either one could work best depending on the circumstance.

As you’ll learn as you continue along your machine learning journey, it involves a lot of experimentation and tuning to see what is going to work best for you.

### Reward functions

In this chapter, you will be given a deep dive into writing and improving reward functions, with practical advice that you can apply towards your own reward functions.

You might recall from when you trained your first model that you can define a reward function. The purpose of the reward function is to issue a reward based upon how good, or not so good, the actions performed are at reaching the ultimate goal. In the case of AWS DeepRacer, that goal is getting around the track as quickly as possible.

So the logical question you might be asking is - "how does the reward get calculated and issued?". Well, this one is over to you - as you have control over the reward function, which is the piece of code that determines and returns the reward value.

In the next section you will learn more about the reward function and how it works.

In [5]:
from IPython.display import HTML

HTML('<iframe width="437" height="500" src="https://www.youtube.com/embed/ANIRYsZZ4XI" title="DeepRacer Student - Chapter 3.2 Part 1 (Reward functions)" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>')



### The reward function

In order to calculate an appropriate reward you need information about the state of the agent and perhaps even the environment. These are provided to you by the AWS DeepRacer system in the form of input parameters - in other words, they are parameters for input into your reward function.

There are over 20 parameters available for use, and the reward function is simply a piece of code which uses the input parameters to do some calculations and then output a number, which is the reward.

The reward function is written in Python as a standard function, but it must be named `reward_function` and take a single parameter: a Python dictionary containing all the input parameters provided by the AWS DeepRacer system.
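To make the required shape concrete, here is a minimal sketch of a reward function that favors staying near the center line. It assumes the input dictionary contains the keys `all_wheels_on_track`, `track_width`, and `distance_from_center`. The scaling formula and the sample values are illustrative choices, not the course's own function.

```python
def reward_function(params):
    """Return a reward that favors staying near the center line."""
    # Read the inputs this sketch relies on from the params dictionary.
    all_wheels_on_track = params["all_wheels_on_track"]
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    # Give a very small reward if the car has left the track.
    if not all_wheels_on_track:
        return 1e-3

    # Scale the reward from 1.0 at the center line down toward 0 at the edge.
    half_width = 0.5 * track_width
    reward = max(1e-3, 1.0 - distance_from_center / half_width)
    return float(reward)


# Hypothetical input values, for a quick local check only.
sample_params = {
    "all_wheels_on_track": True,
    "track_width": 1.0,
    "distance_from_center": 0.1,
}
print(reward_function(sample_params))
```

Calling the function with a hand-built dictionary like this is only a quick way to see the input and output types. During training, the AWS DeepRacer system supplies the dictionary for you.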

In [3]:
from IPython.display import HTML

HTML('<iframe width="437" height="500" src="https://www.youtube.com/embed/pov0afxAvlo" title="DeepRacer Student - Chapter 3.2 Part 2 (Reward functions)" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>')



### Improving the reward function

Just because you have trained a model doesn't mean you cannot change the reward function. You might find that the model is exhibiting behavior you want to de-incentivize, such as zig-zagging along the track. In this case you may include code to penalize the agent for that behavior.
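One way such a penalty might look is sketched here. It assumes the input dictionary provides a `steering_angle` key in degrees, and it treats large steering angles as a rough proxy for zig-zagging. The threshold of 15 degrees and the 0.8 multiplier are illustrative values you would tune yourself.

```python
def reward_function(params):
    """Return a baseline reward, reduced when the car steers sharply."""
    # Steering angle in degrees; only the magnitude matters here.
    abs_steering = abs(params["steering_angle"])

    # Start from a baseline reward (this sketch ignores track position).
    reward = 1.0

    # Assumed threshold; tune it for your own action space.
    ABS_STEERING_THRESHOLD = 15.0

    # Reduce the reward when the car is steering too sharply.
    if abs_steering > ABS_STEERING_THRESHOLD:
        reward *= 0.8

    return float(reward)
```

In practice you would likely combine a penalty like this with other terms, such as a reward for staying on the track, rather than use it alone.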

The reward functions we looked at in this section are just examples, and you should experiment to find one that works well for you. A list of all the input parameters available to the reward function, with full explanations and examples, can be found in the AWS DeepRacer Developer Guide.

The reward function can be as simple, or as complex, as you like - just remember, a more complex reward function doesn't necessarily mean better results.

Well done on completing this deep dive on reward functions. You should now have a good working knowledge of how the sample reward functions work, along with some ideas about how you might be able to craft your very own reward function!

As a final tip, whenever you implement a reward function, make sure you select the “Validate” button. This checks that your reward function doesn’t have any errors that would prevent it from running. Note that this does not provide any assurance about how good your reward function might be in practice. It is simply a check that there are no syntax errors.