Merged
50 commits
9a9044c
Initial commit
luisquintanilla Oct 24, 2022
c05b8e3
Fix mdlint errors
luisquintanilla Oct 24, 2022
086a660
Updates to TOC
luisquintanilla Nov 7, 2022
dfd4c23
Initial how-to guide
luisquintanilla Nov 7, 2022
168257f
Fix md lint errors
luisquintanilla Nov 7, 2022
60ab063
More mdlint fixes
luisquintanilla Nov 7, 2022
ab06272
Create text classification tutorial
luisquintanilla Nov 7, 2022
e377385
Fix package issue
luisquintanilla Nov 7, 2022
ae52697
Add conceptual AutoML doc to TOC
luisquintanilla Nov 7, 2022
1dd7ccd
Fix broken links
luisquintanilla Nov 7, 2022
a5dc730
Update AutoML conceptual doc
luisquintanilla Nov 7, 2022
eca0eb8
Fix formatting
luisquintanilla Nov 7, 2022
070ebf3
Add forecasting
luisquintanilla Nov 7, 2022
3d3bc7b
Update ML.NET supported scenarios
luisquintanilla Nov 7, 2022
4220008
Fix column build issues
luisquintanilla Nov 7, 2022
ee27805
Fix build issues
luisquintanilla Nov 7, 2022
5f8a888
Remove extra multiclass scenario
luisquintanilla Nov 7, 2022
490c710
Add sample inputs and outputs to scenarios
luisquintanilla Nov 8, 2022
d23e865
Add xrefs
luisquintanilla Nov 8, 2022
2883e50
Updates scenarios
luisquintanilla Nov 8, 2022
170f8fd
Update scenario screenshot
luisquintanilla Nov 8, 2022
9fdecae
Fix broken link
luisquintanilla Nov 8, 2022
bbf7909
Fix mdlint errors
luisquintanilla Nov 8, 2022
90b8d50
Updates to scenario screenshot
luisquintanilla Nov 8, 2022
b52d830
Add link to training time
luisquintanilla Nov 8, 2022
f6e0af4
Add xrefs
luisquintanilla Nov 9, 2022
60c3324
Added lightbox to all scenarios
luisquintanilla Nov 9, 2022
2b86037
Remove preview note
luisquintanilla Nov 9, 2022
5eebd47
Update code snippets
luisquintanilla Nov 9, 2022
bbef5fa
Updates to sentiment analysis MB tutorial
luisquintanilla Nov 9, 2022
a831559
Updated GPU doc to list hw reqs and troubleshoot
luisquintanilla Nov 9, 2022
e50655f
Addtl updates
luisquintanilla Nov 9, 2022
b874b3e
Remove localization
luisquintanilla Nov 9, 2022
e10a337
Fix md lint errors
luisquintanilla Nov 9, 2022
3fafffc
Update intro automl doc
luisquintanilla Nov 9, 2022
62035ff
Final pass AutoML API how-to
luisquintanilla Nov 9, 2022
8beb331
Fix mdlint errors
luisquintanilla Nov 9, 2022
bcac244
Fix MB doc
luisquintanilla Nov 9, 2022
b887c04
Fix table formatting
luisquintanilla Nov 9, 2022
3f1d082
Promote advanced scenario headings
luisquintanilla Nov 9, 2022
59d2dda
Fix mdlint error
luisquintanilla Nov 9, 2022
c77c7af
Fix lightbox
luisquintanilla Nov 9, 2022
9c9a29d
Fix broken link
luisquintanilla Nov 9, 2022
6737130
Added deep learning doc
luisquintanilla Nov 9, 2022
26a2759
Fix mdlint errors
luisquintanilla Nov 9, 2022
f55d8e6
Fix lint errors
luisquintanilla Nov 9, 2022
8bb7791
Update docs/fundamentals/code-analysis/quality-rules/ca2109.md
luisquintanilla Nov 9, 2022
af0d840
Merge branch 'mlnet-automl' of https://github.com/luisquintanilla/doc…
luisquintanilla Nov 9, 2022
17a872e
More DL
luisquintanilla Nov 9, 2022
7ead54c
Deep learning doc complete
luisquintanilla Nov 10, 2022
4 changes: 2 additions & 2 deletions docs/machine-learning/automate-training-with-cli.md
@@ -1,7 +1,7 @@
---
title: Automate model training with the ML.NET CLI
description: Discover how to use the ML.NET CLI tool to automatically train the best model from the command-line.
ms.date: 06/03/2020
ms.date: 11/10/2022
ms.custom: how-to, mlnet-tooling
#Customer intent: As a developer, I want to use ML.NET CLI to automatically train the "best model" from the command-prompt. I also want to understand the output provided by the tool (metrics and output assets)
---
@@ -71,7 +71,7 @@ The following image displays the classification metrics list for the top five mo

Accuracy is a popular metric for classification problems; however, accuracy isn't always the best metric for selecting the best model, as explained in the following references. There are cases where you need to evaluate the quality of your model with additional metrics.

To explore and understand the metrics that are output by the CLI, see [Evaluation metrics for classification](resources/metrics.md#evaluation-metrics-for-multi-class-classification).
To explore and understand the metrics that are output by the CLI, see [Evaluation metrics for classification](resources/metrics.md#evaluation-metrics-for-multi-class-classification-and-text-classification).

### Metrics for Regression and Recommendation models

223 changes: 192 additions & 31 deletions docs/machine-learning/automate-training-with-model-builder.md
@@ -1,7 +1,7 @@
---
title: What is Model Builder and how does it work?
description: How to use the ML.NET Model Builder to automatically train a machine learning model
ms.date: 10/12/2021
ms.date: 11/10/2022
ms.custom: overview, mlnet-tooling
#Customer intent: As a developer, I want to use Model Builder to automatically train a model using a visual interface.
---
@@ -13,10 +13,7 @@ Model Builder uses automated machine learning (AutoML) to explore different mach

You don't need machine learning expertise to use Model Builder. All you need is some data, and a problem to solve. Model Builder generates the code to add the model to your .NET application.

![Model Builder Scenarios](./media/model-builder-scenarios.png#lightbox)

> [!NOTE]
> Model Builder is currently in Preview.
:::image type="content" source="media/model-builder-scenarios-2-0.png" alt-text="Model Builder scenario screen" lightbox="media/model-builder-scenarios-2-0.png":::

## Creating a Model Builder Project

@@ -45,14 +42,15 @@ A scenario is a description of the type of prediction you want to make using you

Each scenario maps to a machine learning task. The tasks include:

- Binary classification
- Multiclass classification
- Regression
- Clustering
- Anomaly detection
- Ranking
- Recommendation
- Forecasting
| Task | Scenario |
| --- | --- |
| Binary classification | Data classification |
| Multiclass classification | Data classification |
| Image classification | Image classification |
| Text classification | Text classification |
| Regression | Value prediction |
| Recommendation | Recommendation |
| Forecasting | Forecasting |

For example, the scenario of classifying sentiments as positive or negative would fall under the binary classification task.

@@ -62,49 +60,212 @@ For more information about the different ML Tasks supported by ML.NET see [Machi

In Model Builder, you need to select a scenario. The type of scenario depends on what type of prediction you are trying to make.

#### Data classification

Classification is used to categorize data into categories.
#### Tabular

![Diagram showing examples of binary classification including fraud detection, risk mitigation, and application screening](media/binary-classification-examples.png)
##### Data classification

![Examples of multiclass classification including document and product classification, support ticket routing, and customer issue prioritization](media/multiclass-classification-examples.png)
Classification is used to categorize data into categories.

#### Value prediction
<!-- ![Diagram showing examples of binary classification including fraud detection, risk mitigation, and application screening](media/binary-classification-examples.png)

![Examples of multiclass classification including document and product classification, support ticket routing, and customer issue prioritization](media/multiclass-classification-examples.png) -->

:::row:::
:::column:::
**Sample Input**
:::column-end:::
:::column:::
**Sample Output**
:::column-end:::
:::row-end:::
:::row:::
:::column:::
| SepalLength | SepalWidth | Petal Length | Petal Width | Species |
| --- | --- | --- | --- | --- |
| 5.1 | 3.5 | 1.4 | 0.2 | setosa |
:::column-end:::
:::column:::
| Predicted species |
| --- |
| setosa |
:::column-end:::
:::row-end:::

##### Value prediction

Value prediction, which falls under the regression task, is used to predict numbers.

![Diagram showing regression examples such as price prediction, sales forecasting, and predictive maintenance](media/regression-examples.png)

#### Image classification
:::row:::
:::column:::
**Sample Input**
:::column-end:::
:::column:::
**Sample Output**
:::column-end:::
:::row-end:::
:::row:::
:::column:::
| vendor_id | rate_code | passenger_count | trip_time_in_secs | trip_distance | payment_type | fare_amount |
| --- | --- | --- | --- | --- | --- | --- |
| CMT | 1 | 1 | 1271 | 3.8 | CRD | 17.5 |
:::column-end:::
:::column:::
| Predicted Fare |
| --- |
| 4.5 |
:::column-end:::
:::row-end:::

##### Recommendation

The recommendation scenario predicts a list of suggested items for a particular user, based on how similar their likes and dislikes are to other users'.

You can use the recommendation scenario when you have a set of users and a set of "products", such as items to purchase, movies, books, or TV shows, along with a set of users' "ratings" of those products.

:::row:::
:::column:::
**Sample Input**
:::column-end:::
:::column:::
**Sample Output**
:::column-end:::
:::row-end:::
:::row:::
:::column:::
| UserId | ProductId | Rating |
| --- | --- | --- |
| 1 | 2 | 4.2 |
:::column-end:::
:::column:::
| Predicted rating |
| --- |
| 4.5 |
:::column-end:::
:::row-end:::

##### Forecasting

The forecasting scenario uses historical data with a time-series or seasonal component to it.

You can use the forecasting scenario to forecast demand or sales for a product.

:::row:::
:::column:::
**Sample Input**
:::column-end:::
:::column:::
**Sample Output**
:::column-end:::
:::row-end:::
:::row:::
:::column:::
| Date | SaleQty |
| --- | --- |
| 1/1/1970 | 1000 |
:::column-end:::
:::column:::
| 3 Day Forecast |
| --- |
| [1000,1001,1002] |
:::column-end:::
:::row-end:::

#### Computer Vision

##### Image classification

Image classification is used to identify images of different categories. For example, different types of terrain or animals or manufacturing defects.

You can use the image classification scenario if you have a set of images, and you want to classify the images into different categories.

#### Object detection
:::row:::
:::column:::
**Sample Input**
:::column-end:::
:::column:::
**Sample Output**
:::column-end:::
:::row-end:::
:::row:::
:::column:::
:::image type="content" source="media/automate-training-with-model-builder/dog-classification.png" alt-text="Profile view of standing pug":::
:::column-end:::
:::column:::
| Predicted Label |
| --- |
| Dog |
:::column-end:::
:::row-end:::

##### Object detection

Object detection is used to locate and categorize entities within images. For example, locating and identifying cars and people in an image.

You can use object detection when images contain multiple objects of different types.

#### Recommendation

The recommendation scenario predicts a list of suggested items for a particular user, based on how similar their likes and dislikes are to other users'.

You can use the recommendation scenario when you have a set of users and a set of "products", such as items to purchase, movies, books, or TV shows, along with a set of users' "ratings" of those products.
:::row:::
:::column:::
**Sample Input**
:::column-end:::
:::column:::
**Sample Output**
:::column-end:::
:::row-end:::
:::row:::
:::column:::
:::image type="content" source="media/automate-training-with-model-builder/dog-classification.png" alt-text="Profile view of standing pug":::
:::column-end:::
:::column:::
:::image type="content" source="media/automate-training-with-model-builder/dog-object-detection-min.png" alt-text="Profile view of standing pug with bounding box and dog label":::
:::column-end:::
:::row-end:::

#### Natural Language Processing

##### Text classification

Text classification categorizes raw text input.

You can use the text classification scenario if you have a set of documents or comments, and you want to classify them into different categories.

:::row:::
:::column:::
**Example Input**
:::column-end:::
:::column:::
**Example Output**
:::column-end:::
:::row-end:::
:::row:::
:::column:::
| Review |
| --- |
| I really like this steak!|
:::column-end:::
:::column:::
| Sentiment |
| --- |
| Positive |
:::column-end:::
:::row-end:::

## Environment

You can train your machine learning model locally on your machine or in the cloud on Azure, depending on the scenario.

When you train locally, you work within the constraints of your computer resources (CPU, memory, and disk). When you train in the cloud, you can scale up your resources to meet the demands of your scenario, especially for large datasets.

Local CPU training is supported for all scenarios except Object Detection.

Local GPU training is supported for Image Classification.

Azure training is supported for Image Classification and Object Detection.
| Scenario | Local CPU | Local GPU | Azure |
|-----------------------|------------|------------|--------|
| Data classification | ✔️ | ❌ | ❌ |
| Value prediction | ✔️ | ❌ | ❌ |
| Recommendation | ✔️ | ❌ | ❌ |
| Forecasting | ✔️ | ❌ | ❌ |
| Image classification | ✔️ | ✔️ | ✔️ |
| Object detection | ❌ | ❌ | ✔️ |
| Text classification | ✔️ | ✔️ | ❌ |
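
The support matrix above can also be read as a lookup from scenario to training environment. The following Python sketch is only an illustration transcribed from the table; the names `SUPPORTED_ENVIRONMENTS` and `scenarios_for` are hypothetical and are not part of Model Builder or ML.NET.

```python
# Training environments per scenario, transcribed from the table above.
# This is an illustrative lookup, not a Model Builder API.
SUPPORTED_ENVIRONMENTS = {
    "Data classification": {"Local CPU"},
    "Value prediction": {"Local CPU"},
    "Recommendation": {"Local CPU"},
    "Forecasting": {"Local CPU"},
    "Image classification": {"Local CPU", "Local GPU", "Azure"},
    "Object detection": {"Azure"},
    "Text classification": {"Local CPU", "Local GPU"},
}


def scenarios_for(environment):
    """Return the scenarios that can be trained in the given environment."""
    return sorted(
        scenario
        for scenario, environments in SUPPORTED_ENVIRONMENTS.items()
        if environment in environments
    )


print(scenarios_for("Local GPU"))
```

Reading the table this way makes the two outliers easy to spot: object detection is the only scenario without local training, and image classification is the only one supported in all three environments.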

## Data

71 changes: 71 additions & 0 deletions docs/machine-learning/automated-machine-learning-mlnet.md
@@ -0,0 +1,71 @@
---
title: What is Automated Machine Learning (AutoML)?
description: Learn what automated machine learning is and its different components in ML.NET
ms.date: 11/10/2022
ms.topic: overview
ms.custom: mvc
---

# What is Automated Machine Learning (AutoML)?

Automated machine learning (AutoML) automates the process of applying machine learning to data. Given a dataset, you can run AutoML to iterate over different data transformations, machine learning algorithms, and hyperparameters to select the best model.

> [!NOTE]
> This topic refers to the ML.NET AutoML API, which is currently in preview. Material may be subject to change.

## How does AutoML work?

In general, the workflow to train machine learning models is as follows:

- Define a problem
- Collect data
- Preprocess data
- Train a model
- Evaluate the model

:::image type="content" source="media/ml-automl-workflow.png" alt-text="Traditional ML and AutoML training workflow" lightbox="media/ml-automl-workflow.png":::

Preprocessing, training, and evaluation are an experimental and iterative process that requires multiple trials until you achieve satisfactory results. Because these tasks tend to be repetitive, AutoML can help automate these steps. In addition to automation, optimization techniques are used during the training and evaluation process to find and select algorithms and hyperparameters.

## When should I use AutoML?

Whether you're just getting started with machine learning or you're an experienced user, AutoML provides solutions for automating the model development process.

- **Beginners** - If you're new to machine learning, AutoML simplifies the model development process by providing a set of defaults that reduces the number of decisions you have to make when training your model. In doing so, you can focus on your data and the problem you're trying to solve and let AutoML do the rest.
- **Experienced users** - If you have some experience with machine learning, you can customize, configure, and extend the defaults provided by AutoML based on your needs while still leveraging its automation capabilities.

## AutoML in ML.NET

- **Featurizer** - Convenience API to automate data preprocessing.
- **Trial** - A single hyperparameter optimization run.
- **Experiment** - A collection of AutoML trials. ML.NET provides a high-level API for creating experiments which sets defaults for the individual Sweepable Pipeline, Search Space, and Tuner components.
- **Search Space** - The range of available options to choose hyperparameters from.
- **Tuner** - The algorithms used to optimize hyperparameters. ML.NET supports the following tuners:
  - **Cost Frugal Tuner** - Implementation of [Frugal Optimization for Cost-related Hyperparameters](https://arxiv.org/abs/2005.01571) which takes training cost into consideration.
  - **Eci Cost Frugal Tuner** - Implementation of Cost Frugal Tuner for hierarchical search spaces. Default tuner used by AutoML.
  - **SMAC** - Tuner that uses random forests to apply Bayesian optimization.
  - **Grid Search** - Tuner that works best for small search spaces.
  - **Random Search**
- **Sweepable Estimator** - An ML.NET estimator that contains a search space.
- **Sweepable Pipeline** - An ML.NET pipeline that contains one or more Sweepable Estimators.
- **Trial Runner** - AutoML component that uses sweepable pipelines and trial settings to generate trial results from model training and evaluation.
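
To make the relationship between these components concrete, here is a minimal, language-agnostic sketch in Python. It is not the ML.NET API: the names `SEARCH_SPACE`, `toy_loss`, `random_search_tuner`, and `run_experiment` are hypothetical, and the "model" is a toy loss function standing in for real training and evaluation. It shows a random search tuner proposing hyperparameters from a search space, a trial evaluating each proposal, and an experiment collecting the trials and picking the best one.

```python
import random
from dataclasses import dataclass

# Hypothetical search space: each hyperparameter maps to (low, high) bounds.
SEARCH_SPACE = {"learning_rate": (0.01, 1.0), "regularization": (0.0, 0.5)}


@dataclass
class TrialResult:
    params: dict
    loss: float


def toy_loss(params):
    """Stand-in for training and evaluating a model; lower is better."""
    return (params["learning_rate"] - 0.3) ** 2 + (params["regularization"] - 0.1) ** 2


def random_search_tuner(space, rng):
    """Propose one set of hyperparameters by sampling uniformly from the space."""
    return {name: rng.uniform(low, high) for name, (low, high) in space.items()}


def run_experiment(space, n_trials, seed=0):
    """Run a collection of trials and return all results plus the best one."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        params = random_search_tuner(space, rng)  # the tuner proposes
        results.append(TrialResult(params, toy_loss(params)))  # the trial evaluates
    best = min(results, key=lambda result: result.loss)
    return results, best


results, best = run_experiment(SEARCH_SPACE, n_trials=50)
print(best)
```

Swapping `random_search_tuner` for a function that walks a fixed grid would give a grid search over the same search space, which is why the tuner and the search space are described as separate components.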

Beginners should start with the defaults provided by the high-level experiment API. More experienced users looking for customization options can use the sweepable estimator, sweepable pipeline, search space, trial runner, and tuner components.

For more information on getting started with the AutoML API, see the [How to use the ML.NET Automated Machine Learning (AutoML) API](how-to-guides/how-to-use-the-automl-api.md) guide.

## Supported tasks

AutoML provides preconfigured defaults for the following tasks:

- Binary classification
- Multiclass classification
- Regression

For other tasks, you can build your own trial runner to enable those scenarios. For more information, see the [How to use the ML.NET Automated Machine Learning (AutoML) API](how-to-guides/how-to-use-the-automl-api.md) guide.

## Next steps

- [How to use the ML.NET Automated Machine Learning (AutoML) API](how-to-guides/how-to-use-the-automl-api.md)
- [Tutorial: Classify the severity of restaurant health violations with Model Builder](tutorials/health-violation-classification-model-builder.md)
- [Tutorial: Analyze sentiment using the ML.NET CLI](tutorials/sentiment-analysis-cli.md)