# Classification with Azure Machine Learning *designer*

In the previous lab, you used the automated machine learning feature of Azure Machine Learning to train and deploy a machine learning model for *regression*, in which you predicted a numeric value. In this lab, you'll create a model for *classification*, in which the model predicts the category, or *class*, a given entity belongs to.

While automated machine learning makes it easy to try lots of algorithms and find the best-performing model for your data, there may be cases where you want more control over how your data is prepared and used to train a model. So this time you'll use another Azure Machine Learning feature called *designer*, which enables you to define a workflow for model training in a visual, drag-and-drop interface.

## What is Classification?

Classification is a form of supervised machine learning in which a model is trained to fit the *features* of an entity to a *label* that represents a particular class. The label is usually an integer indicator, such as 0, 1, or 2, with each indicator representing a different possible class. As you might recall from the previous lab, we often think of a machine learning model as a function (***f***) that operates on features (**x**) to predict a label (**y**); in this case, **y** is a numeric indicator for a class.

$$y = f(x)$$

Let's look at an example to make things a little clearer.


<p style='text-align:center'><img src='./images/diabetes.jpg' alt="Clinical data for multiple patients, some of whom are diabetic and some of whom aren't"/></p>

Suppose a health clinic offers patients a general health screening, where they can provide their personal details (age, number of children, and so on) and have some medical metrics measured (weight, blood pressure, and so on). The screening might also include a test for diabetes, for which some patients may test negative and others positive. So there's a sense in which there are two *classes* of patient: those with diabetes, and those without. We could assign numeric labels to those classes, with ***0*** meaning that the patient tested negative for diabetes, and ***1*** meaning a positive diagnosis.

Now, let's suppose that the diabetes test is expensive to conduct, and stressful for the patient being tested (important tip: it's not! If you think you're at risk, *please* go and get tested. We're just imagining it is for the purposes of this lab example). The clinic might want to restrict testing to only those patients with a high probability of testing positive. So the challenge is to take all of the other information we have about the patients (age, weight, blood pressure, and so on) and try to find a correlation with the classification of the patient as non-diabetic or diabetic. In other words, we use the patient's medical measurements as *features* to predict a class *label* that indicates whether a positive diagnosis for diabetes is likely.
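
To make the idea of a function that maps features to a class label more concrete, here is a minimal Python sketch. It is not the model you'll build in this lab: the feature order, the weights, the bias, and the two patient records are all made up for illustration, and a logistic function is assumed only because it is one common choice for two-class problems. In a real model, the weights would be learned from labeled patient data during training.

```python
import math

# Hypothetical feature order: age (years), BMI, diastolic blood pressure (mmHg).
# These weights and the bias are invented for illustration; a trained model
# would learn them from labeled patient records.
WEIGHTS = [0.04, 0.09, 0.01]
BIAS = -5.5


def predict_probability(features):
    """Estimate the probability that a patient belongs to class 1 (diabetic)."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) function


def f(features, threshold=0.5):
    """y = f(x): return label 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_probability(features) >= threshold else 0


# Two made-up patients, each described by a feature vector x.
patients = {
    "patient_a": [25, 21.0, 70],
    "patient_b": [61, 38.5, 92],
}

# Apply f to each patient's features to get a predicted class label y.
labels = {name: f(x) for name, x in patients.items()}
print(labels)
```

With these invented numbers, only `patient_b` would be flagged for the diabetes test. Raising or lowering `threshold` is one way a clinic could trade off how many patients get tested against how many positive cases are missed.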

> **A slight technicality (which will be important later!)**
>
> Up to this point, we've thought of a classification model as predicting a numeric class indicator, like 0 or 1. Actually, it's a little more complicated than that. What the model actually calculates is the *probability* of the entity belonging to each possible class - so the result of the function (*f*) is actually a vector (in other words, an array of values) that contains a probability score for each possible class. For our diabetes example, the function might return a vector such as $[0.2, 0.8]$, which indicates that there's a 0.2 (20%) probability that this particular patient belongs to class 0 (non-diabetic), and a 0.8 (80%) probability that they belong to class 1 (diabetic). We classify the entity based on the most probable class, so in this case the patient would be classified as class 1 (diabetic).
> 
> Note that the individual class probabilities always add up to 1 - there's a 100% probability that the patient is either diabetic or non-diabetic!
> 
> In this case, the model we're creating only has two possible classes - we call this *binary* classification. However, the same principles are true for scenarios where there are multiple possible classes. For example, we could create a model to predict non-diabetic, type 1 diabetes, and type 2 diabetes classes. This would result in a vector of three probability values (one for each possible class), which would still add up to 1.
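
The probability vector described in the note can be sketched in a few lines of Python. The `classify` helper below simply picks the index of the largest probability. For the three-class case, the raw scores are made up, and the softmax function is assumed as one common way to turn arbitrary scores into probabilities that sum to 1; the text above does not say which method a particular model uses.

```python
import math


def classify(probabilities):
    """Return the index of the most probable class."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


# Binary case from the note: [P(class 0), P(class 1)].
binary = [0.2, 0.8]
print(classify(binary))  # index of the larger probability

# Multiclass case: hypothetical raw scores for
# class 0 (non-diabetic), class 1 (type 1), and class 2 (type 2).
multi = softmax([1.0, 2.5, 0.3])
print(multi, sum(multi), classify(multi))
```

However many classes there are, the steps are the same: compute one probability per class, check that they sum to 1, and predict the class with the highest probability.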