page_type | name | description | urlFragment | languages | products |
---|---|---|---|---|---|
sample | Train a sentiment analysis deep learning model with ML.NET Model Builder | Train a deep learning text classification model to analyze and classify sentiment using ML.NET Model Builder | mlnet-sentiment-analysis-model-builder | | |
Sentiment Analysis: Razor Pages sample optimized for scalability and performance when running/scoring an ML.NET model built with Model Builder (Using the new Text Classification API)
ML.NET version | Status | App Type | Data type | Scenario | ML Task | Algorithms |
---|---|---|---|---|---|---|
v2.0.0 | Up-to-date | Razor Pages | Single data sample | Text classification | Text Classification | NAS-BERT |
Create a Razor Pages web application that hosts an ML.NET deep learning text classification model trained using Model Builder to analyze the sentiment of comments from a website.
- SentimentRazor: A .NET Core Razor Pages web application that uses a deep learning text classification model to analyze sentiment from comments made on the website.
Each row in the wikipedia-detox-250-line-data.tsv dataset represents a different comment left by a user on Wikipedia. The first column holds the sentiment label (0 is non-toxic, 1 is toxic), and the second column holds the text of the comment. The columns are separated by tabs. The data looks like the following:
Sentiment | SentimentText |
---|---|
1 | ==RUDE== Dude, you are rude upload that carl picture back, or else. |
1 | == OK! == IM GOING TO VANDALIZE WILD ONES WIKI THEN!!! |
0 | I hope this helps. |
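To make the file layout concrete, here is a minimal sketch of reading rows in this shape. It assumes the file starts with a header row (Sentiment, SentimentText) and contains no quoted fields or embedded tabs. The `parseTsv` helper is hypothetical and is not part of the sample.

```javascript
// Hypothetical helper: parse tab-separated text with a header row
// into an array of objects keyed by column name.
// Assumes no quoting and no tabs inside a field.
function parseTsv(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.length > 0);
  const header = lines[0].split("\t");
  return lines.slice(1).map((line) => {
    const cells = line.split("\t");
    const row = {};
    header.forEach((name, i) => {
      row[name] = cells[i];
    });
    return row;
  });
}

// Two of the rows shown in the table above, joined as tab-separated lines.
const sample = [
  "Sentiment\tSentimentText",
  "1\t==RUDE== Dude, you are rude upload that carl picture back, or else.",
  "0\tI hope this helps.",
].join("\n");

const rows = parseTsv(sample);
console.log(rows.length);
console.log(rows[1].Sentiment, rows[1].SentimentText);
```

Values are kept as strings; converting the label to a number or boolean is left to the caller.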
The Text Classification API is powered by TorchSharp. TorchSharp is a .NET library that provides access to libtorch, the library that powers PyTorch. TorchSharp contains the building blocks for training neural networks from scratch in .NET. The TorchSharp components, however, are low-level, and building neural networks from scratch has a steep learning curve. In ML.NET, we’ve abstracted some of that complexity to the scenario level.
In direct collaboration with Microsoft Research, we’ve taken a TorchSharp implementation of NAS-BERT, a variant of BERT obtained with neural architecture search, and added it to ML.NET. Using a pre-trained version of this model, the Text Classification API uses your data to fine-tune the model.
The goal of the application is to predict which of two categories (toxic or non-toxic) a comment's sentiment belongs to. The machine learning task to use in this scenario is text classification. The model in this application was trained using Model Builder.
Model Builder is an intuitive graphical Visual Studio extension to build, train, and deploy custom machine learning models.
You don't need machine learning expertise to use Model Builder. All you need is some data, and a problem to solve. Model Builder generates the code to add the model to your .NET application.
In this solution, both the SentimentAnalysis.training.cs and SentimentAnalysis.consumption.cs classes are autogenerated by Model Builder.
SentimentAnalysis.zip is also autogenerated by Model Builder and is the serialized representation of your model.
Users interact with the application through a Razor Pages website. When a user enters a comment in the text box on the main page, a handler on the page's model passes the input to the trained model to predict the comment's sentiment.
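As a rough illustration of how the page could reach that handler from the browser: in Razor Pages, a named handler such as OnGetAnalyzeSentiment is selected on a GET request through the `handler` query parameter. The sketch below assumes the page is served at `/Index` and that the handler reads a `text` query parameter. `buildSentimentUrl` is a hypothetical helper, and this `getSentiment` is only a guess at the shape of the client call, not the sample's actual code.

```javascript
// Hypothetical helper: build the URL for the AnalyzeSentiment named handler.
// URLSearchParams takes care of escaping the user's text.
function buildSentimentUrl(pagePath, text) {
  const params = new URLSearchParams({ handler: "AnalyzeSentiment", text });
  return `${pagePath}?${params.toString()}`;
}

// Assumed client call: requires a runtime that provides fetch (for example a browser).
// The "/Index" default path is an assumption about where the page is served.
function getSentiment(text, pagePath = "/Index") {
  return fetch(buildSentimentUrl(pagePath, text)).then((response) => response.text());
}

console.log(buildSentimentUrl("/Index", "I hope this helps."));
```

Encoding the text before sending it matters here because comments can contain characters such as `&` or `=` that would otherwise be read as part of the query string.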
If you want to try out the application with a dataset that produces better results, such as the UCI Sentiment Labeled Sentences dataset, you can make the following adjustments.
- Download the UCI Sentiment Labeled Sentences dataset ZIP file to any location on your computer and unzip it.
- Open PowerShell and navigate to the folder you unzipped in the previous step.
- By default, the file does not have column names. To add column names to the training data, use the following PowerShell commands. The first command writes the header row with sc (Set-Content), and the second appends the original data with ac (Add-Content) so that the header is not overwritten:
echo "Comment`tSentiment" | sc yelp_labelled_columns.tsv; cat yelp_labelled.tsv | ac yelp_labelled_columns.tsv
The output generated by the previous commands is a new file called yelp_labelled_columns.tsv containing the original data with the respective column names.
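For readers not using PowerShell, the intended result (a header line followed by the untouched original rows) can be sketched as a small pure function. `addHeader` is a hypothetical name, and the two data rows are taken from the example table in this section; reading and writing the actual files is left out.

```javascript
// Hypothetical helper: prepend a tab-separated header row to headerless TSV content.
function addHeader(content, columns) {
  return columns.join("\t") + "\n" + content;
}

// Two headerless rows in the shape of yelp_labelled.tsv (comment, then label).
const original = "Wow... Loved this place.\t1\nCrust is not good.\t0\n";

const withHeader = addHeader(original, ["Comment", "Sentiment"]);
console.log(withHeader.split("\n")[0]);
```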
Each row in the yelp_labelled_columns.tsv dataset represents a different restaurant review left by a user on Yelp. The first column represents the comment left by the user, and the second column represents the sentiment of the text (0 is negative, 1 is positive). The columns are separated by tabs. The data looks like the following:
Comment | Sentiment |
---|---|
Wow... Loved this place. | 1 |
Crust is not good. | 0 |
Not tasty and the texture was just nasty. | 0 |
- Use Model Builder to train a binary classification model using the new dataset.
- Update the OnGetAnalyzeSentiment handler in the Index.cshtml.cs file.
public IActionResult OnGetAnalyzeSentiment([FromQuery] string text)
{
    if (String.IsNullOrEmpty(text)) return Content("Neutral");
    var input = new ModelInput { Comment = text };
    var prediction = _predictionEnginePool.Predict(input);
    var sentiment = Convert.ToBoolean(prediction.Prediction) ? "Positive" : "Negative";
    return Content(sentiment);
}
- Update the updateSentiment function in the site.js file.
function updateSentiment() {
    var userInput = $("#Message").val();
    getSentiment(userInput)
        .then((sentiment) => {
            switch (sentiment) {
                case "Positive":
                    updateMarker(100.0, sentiment);
                    break;
                case "Negative":
                    updateMarker(0.0, sentiment);
                    break;
                default:
                    updateMarker(45.0, "Neutral");
            }
        });
}
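If you want to check the mapping without a browser or jQuery, the switch above can be pulled out into a pure function. This is only a sketch: `markerFor` is a hypothetical name, and the positions are the same values passed to updateMarker above.

```javascript
// Hypothetical helper: map the handler's response to a marker position and label.
// Any response other than "Positive" or "Negative" falls back to "Neutral".
function markerFor(sentiment) {
  switch (sentiment) {
    case "Positive":
      return { position: 100.0, label: "Positive" };
    case "Negative":
      return { position: 0.0, label: "Negative" };
    default:
      return { position: 45.0, label: "Neutral" };
  }
}

console.log(markerFor("Positive"));
console.log(markerFor("anything else"));
```

With this in place, updateSentiment could call `updateMarker(m.position, m.label)` on the returned object, which keeps the DOM update separate from the decision logic.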