# Introduction to ML.NET Concepts

# Under Construction!

ML.NET gives you the ability to add machine learning to .NET applications, in either online or offline scenarios. With this capability, you can make automatic predictions using the data available to your application. Machine learning applications make use of patterns in the data to make predictions rather than needing to be explicitly programmed.

Central to ML.NET is a machine learning model. The model specifies the steps needed to transform your input data into a prediction. With ML.NET, you can train a custom model by specifying an algorithm, or you can import pre-trained TensorFlow and ONNX models.

Once you have a model, you can add it to your application to make the predictions.

ML.NET runs on Windows, Linux, and macOS using .NET Core, or on Windows using .NET Framework. 64-bit is supported on all platforms. 32-bit is supported on Windows, except for TensorFlow, LightGBM, and ONNX-related functionality.

Examples of the type of predictions that you can make with ML.NET:

| Task      | Example |
| ----------- | ----------- |
| Classification/Categorization | Automatically divide customer feedback into positive and negative categories |
| Regression/Predict continuous values | Predict the price of houses based on size and location |
| Anomaly Detection | Detect fraudulent banking transactions |
| Recommendations | Suggest products that online shoppers may want to buy, based on their previous purchases |
| Time series/sequential data  | Forecast the weather/product sales |
| Image classification  | Categorize pathologies in medical images |

# Hello ML.NET World
The code in the following snippet demonstrates the simplest ML.NET application. This example constructs a linear regression model to predict house prices using house size and price data. 

The first step is to reference the [Microsoft.ML](https://www.nuget.org/packages/Microsoft.ML/) package in the project. This can be done in several ways. In Visual Studio, add the package reference using the NuGet Package Manager. With the `dotnet` CLI, run the command `dotnet add package Microsoft.ML`. Alternatively, add the package reference manually by placing the following element in the csproj project file:

```Xml
  <ItemGroup>
    <PackageReference Include="Microsoft.ML" Version="1.7.1" />
  </ItemGroup>
```

In this notebook, we add the package reference as follows:

```csharp
#r "nuget: Microsoft.ML, 1.7.1"
```

The second step is to reference the ML.NET namespaces:

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;
```

Now we are ready to write the code for our machine learning task. Always start by creating the [MLContext](https://docs.microsoft.com/dotnet/api/microsoft.ml.mlcontext?view=ml-dotnet), which is the common context for all ML.NET operations:

```csharp
MLContext mlContext = new MLContext();
```

The next step is to define the data structures for the data we are going to use. This sample is about house price prediction. Start by defining the following data structure, which contains the house size and price:

```csharp
public class HouseData
{
    public float Size { get; set; }
    public float Price { get; set; }
}
```

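Then define the data structure for the house price prediction. The `ColumnName` attribute maps the `Price` property to the model's `Score` output column: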

```csharp
public class Prediction
{
    [ColumnName("Score")]
    public float Price { get; set; }
}
```

Now we are ready to define the pre-collected training data we'll use for the house price prediction scenario:

```csharp
HouseData[] houseData = {
    new HouseData() { Size = 1.1F, Price = 1.2F },
    new HouseData() { Size = 1.9F, Price = 2.3F },
    new HouseData() { Size = 2.8F, Price = 3.0F },
    new HouseData() { Size = 3.4F, Price = 3.7F } };
```

Using the `MLContext` we previously created, load the training data into an ML.NET [IDataView](https://docs.microsoft.com/dotnet/api/microsoft.ml.idataview?view=ml-dotnet), which is the fundamental ML.NET data type:

```csharp
IDataView trainingData = mlContext.Data.LoadFromEnumerable(houseData);
```

Now that the data is ready, we'll create the ML.NET pipeline, specifying the trainer that will build our machine learning model. For house price prediction, we use a regression trainer. ML.NET supports other trainers that can be used for other scenarios as needed. The pipeline creates what is called an [Estimator](https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.iestimator-1?view=ml-dotnet), which defines the operations applied to the data:

```csharp
// Specify the data preparation and model training pipeline:
// copy "Size" into the "Features" column, then train a regression model on "Price".
var pipeline = mlContext.Transforms.Concatenate("Features", new[] { "Size" })
               .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: "Price", maximumNumberOfIterations: 100));
```
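For intuition, the pipeline above has a single feature, `Size`, so a linear regression trainer fits a straight line to the data. The block below is a sketch of that objective, assuming a squared-error loss and leaving out any regularization term the trainer may add. The symbols $w$, $b$, $x_i$, $y_i$ and $n$ are notation introduced here for illustration, not ML.NET names.

```latex
% x_i = house size, y_i = house price, n = number of training rows
\hat{y} = w\,x + b,
\qquad
(w, b) = \arg\min_{w,\,b} \; \frac{1}{n} \sum_{i=1}^{n} \bigl( w\,x_i + b - y_i \bigr)^2
```

Here $\hat{y}$ corresponds to the `Score` column the trained model produces.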

After creating the estimator, we are ready to apply the transformations and trainer defined in the pipeline to the data. To do that, call the [Fit](https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.iestimator-1.fit?view=ml-dotnet) method:

```csharp
var model = pipeline.Fit(trainingData);
```

Now we can evaluate the trained model. To do that, load prepared test data, call the [Evaluate](https://docs.microsoft.com/dotnet/api/microsoft.ml.regressioncatalog.evaluate?view=ml-dotnet) method, and print the [coefficient of determination](https://en.wikipedia.org/wiki/Coefficient_of_determination) to see how well the model fits the test data. The closer the coefficient of determination is to 1, the better the fit. Repeat the training and evaluation steps until the trained model gives a satisfactory result.

```csharp
HouseData[] testData = {
    new HouseData() { Size = 1.1F, Price = 1.2F },
    new HouseData() { Size = 1.2F, Price = 1.5F },
    new HouseData() { Size = 1.4F, Price = 1.7F },
    new HouseData() { Size = 1.6F, Price = 1.9F },
    new HouseData() { Size = 1.9F, Price = 2.3F },
    new HouseData() { Size = 2.8F, Price = 3.0F },
    new HouseData() { Size = 3.2F, Price = 3.5F },
    new HouseData() { Size = 3.3F, Price = 3.6F },
    new HouseData() { Size = 3.5F, Price = 3.9F },
    new HouseData() { Size = 3.7F, Price = 4.3F },
    new HouseData() { Size = 4.0F, Price = 6.1F },
    new HouseData() { Size = 5.0F, Price = 7.2F },
    new HouseData() { Size = 6.0F, Price = 8.5F },
    new HouseData() { Size = 7.0F, Price = 9.8F },
    new HouseData() { Size = 8.0F, Price = 10.3F },
};

// Load the test data
IDataView trainingTestData = mlContext.Data.LoadFromEnumerable(testData);

// Transform the test data using the model (adds the "Score" column)
IDataView transformedTestData = model.Transform(trainingTestData);

// Evaluate the model against the test data; the label is the actual price
RegressionMetrics trainedModelMetrics = mlContext.Regression.Evaluate(transformedTestData, labelColumnName: "Price");

// Print the R-squared value. A value closer to 1 indicates a better fit.
Console.WriteLine($"Coefficient of determination for the trained model: {trainedModelMetrics.RSquared:0.00}");
```
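For reference, the coefficient of determination printed above is conventionally defined as follows. This is the standard textbook definition rather than a statement about ML.NET internals. The symbols $y_i$ (actual price), $\hat{y}_i$ (predicted price), $\bar{y}$ (mean actual price) and $n$ (number of test rows) are notation introduced here.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
```

A model that always predicts the mean price scores 0, a perfect fit scores 1, and a model that does worse than the mean on the test data scores below 0.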

Now the trained model is ready for prediction. Let's use it to predict a sample house price. We do that by creating a prediction engine, [PredictionEngine<TSrc,TDst>](https://docs.microsoft.com/dotnet/api/microsoft.ml.predictionengine-2?view=ml-dotnet). The prediction engine is the class for making single predictions with a previously trained model (and its preceding transform pipeline). The following code creates the prediction engine from the trained model:


```csharp
var predictionEngine = mlContext.Model.CreatePredictionEngine<HouseData, Prediction>(model);
```

Then using the created prediction engine we can predict the house price as follows:

```csharp
var size = new HouseData() { Size = 2.5F };
var price = predictionEngine.Predict(size);
Console.WriteLine($"Predicted price for size: {size.Size*1000} sq ft = {price.Price*100:C}k");
```
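The prediction engine handles one input at a time. To score several houses in one pass, one option is to run the model's `Transform` method over an `IDataView` and read the results back with `CreateEnumerable`. The following is a minimal sketch, not the notebook's own method. It retrains the same model so the cell can run on its own, and it assumes the Microsoft.ML package is already referenced. The fixed `seed`, the `newHouses` sizes, and the output format are choices made here for illustration.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

// Assumption: a fixed seed, used here only to make repeated runs comparable.
var mlContext = new MLContext(seed: 0);

// Same training data and pipeline as in the walkthrough above.
IDataView trainingData = mlContext.Data.LoadFromEnumerable(new[]
{
    new HouseData { Size = 1.1F, Price = 1.2F },
    new HouseData { Size = 1.9F, Price = 2.3F },
    new HouseData { Size = 2.8F, Price = 3.0F },
    new HouseData { Size = 3.4F, Price = 3.7F },
});

var pipeline = mlContext.Transforms.Concatenate("Features", new[] { "Size" })
    .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: "Price", maximumNumberOfIterations: 100));
var model = pipeline.Fit(trainingData);

// Hypothetical batch of houses to score; Price is left at its default value.
HouseData[] newHouses =
{
    new HouseData { Size = 2.5F },
    new HouseData { Size = 4.0F },
};

// Transform adds a "Score" column holding the predicted price for every row.
IDataView scored = model.Transform(mlContext.Data.LoadFromEnumerable(newHouses));

// Read the "Score" column back into Prediction objects, in input order.
int index = 0;
foreach (Prediction p in mlContext.Data.CreateEnumerable<Prediction>(scored, reuseRowObject: false))
{
    Console.WriteLine($"Size {newHouses[index].Size}: predicted price {p.Price:0.00}");
    index++;
}

public class HouseData
{
    public float Size { get; set; }
    public float Price { get; set; }
}

public class Prediction
{
    [ColumnName("Score")]
    public float Price { get; set; }
}
```

The type declarations sit at the end so the same code also compiles as a C# program with top-level statements. In the notebook, the earlier `HouseData` and `Prediction` definitions could be reused instead.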

Congrats! You have successfully trained an ML.NET regression model using your own data, then used this model to predict house prices. The following diagram summarizes the end-to-end process of creating and training an ML.NET model, then using it to make predictions.

![](https://docs.microsoft.com/dotnet/machine-learning/media/mldotnet-annotated-workflow.png)

# Continue learning

> [⏩ Next Module - Data Prep and Feature Engineering](https://raw.githubusercontent.com/dotnet/csharp-notebooks/fa302c12c7494e5f8a5fdbe5d8283d8ff1fb7009/machine-learning/02-Data%20Preparation%20and%20Feature%20Engineering.ipynb)