bamurtaugh and CESARDELATORRE Migration/v1.3.1 (#597)
Latest commit 2feb479 Aug 6, 2019

GitHub Labeler

ML.NET version: v1.3.1
API type: Dynamic API
Status: Up-to-date
App Type: Console app
Data sources: .csv file and GitHub issues
Scenario: Issues classification
ML Task: Multi-class classification
Algorithms: SDCA multi-class classifier

This is a simple prototype application that demonstrates how to use the ML.NET APIs. The main focus is on creating, training, and using a machine learning (ML) model, which is implemented in the Predictor.cs class.


GitHubLabeler is a .NET Core console application that:

  • trains an ML model on your labeled GitHub issues so that it learns which label to assign to a new issue. (As an example, you can use the corefx-issues-train.tsv file, which contains issues from the public corefx repository.)
  • labels new issues. The application gets all unlabeled open issues from the GitHub repository specified in the appsettings.json file and labels them using the ML model trained in the previous step.

The ML model uses a multi-class classification algorithm (SdcaMultiClassTrainer) from ML.NET.

Enter your GitHub configuration data

  1. Provide your GitHub data in the appsettings.json file:

    To allow the app to label issues in your GitHub repository, you need to provide the following data in the appsettings.json file.

          "GitHubToken": "YOUR-GUID-GITHUB-TOKEN",
          "GitHubRepoOwner": "YOUR-REPO-USER-OWNER-OR-ORGANIZATION",
          "GitHubRepoName": "YOUR-REPO-SINGLE-NAME"

    The user account that the token belongs to (GitHubToken) must have write access to the repository (GitHubRepoName).

    See GitHub's documentation for how to create a GitHub token.

    GitHubRepoOwner can be a GitHub user ID (e.g. "MyUser") or a GitHub organization (e.g. "dotnet").

  2. Provide training file

    a. You can use the existing corefx-issues-train.tsv data file to experiment with the program. In this case, the predicted labels are chosen from the labels used in the corefx repository. No changes are required.

    b. To work with labels from your own GitHub repository, you need to train the model on your own data. To do so, export the GitHub issues from your repository to a .tsv file with the following columns:

    • ID - issue's ID
    • Area - issue's label (named this way to avoid confusion with the Label concept in ML.NET)
    • Title - issue's title
    • Description - issue's description

    and add the file to the Data folder. Update the dataSetLocation value to match your file's name:

let dataSetLocation = sprintf @"%s/corefx-issues-train.tsv" baseDatasetsLocation
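
Before training on your own export, it can help to check that every row has the four columns listed above. The following is a minimal sketch in plain F# and is not part of the sample: the `IssueRow` record and the `parseIssueLine` helper are hypothetical names, and it assumes tab-separated fields in the order ID, Area, Title, Description with no tabs inside a field.

```fsharp
// Hypothetical record mirroring the four columns described above.
type IssueRow =
    { ID: string
      Area: string          // the issue's label
      Title: string
      Description: string }

// Parse one tab-separated line; returns None when the column count is wrong.
let parseIssueLine (line: string) : IssueRow option =
    match line.Split('\t') with
    | [| id; area; title; description |] ->
        Some { ID = id; Area = area; Title = title; Description = description }
    | _ -> None

// Example: a well-formed row and a row that is missing the Description column.
let good = parseIssueLine "42\tarea-System.Net\tSocket hangs\tRepro steps here"
let bad = parseIssueLine "43\tarea-System.IO\tOnly three columns"
```

Rows that come back as `None` are worth fixing in the export before they reach the training step.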


Training is the process of running an ML model through known examples (in our case, issues with labels) to teach it how to label new issues. In this sample, it is done by calling this method in the console app:

buildAndTrainModel dataSetLocation modelFilePathName MyTrainerStrategy.SdcaMultiClassTrainer

After training is completed, the model is saved as a .zip file in MLModels\.


Once the model is trained, it can be used to predict the label of a new issue.

For a single test/demo without connecting to a real GitHub repo, call this method from the console app:

testSingleLabelPrediction modelFilePathName

To access the real issues of a GitHub repo, call this other method from the console app:

predictLabelsAndUpdateGitHub configuration modelFilePathName
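
The `configuration` value passed above carries the settings from appsettings.json. As a rough sketch of what those settings look like once loaded, the snippet below parses a JSON string with `System.Text.Json` (available in .NET Core 3.0 and later). The `GitHubSettings` record and the `readSettings` function are hypothetical, the sample may load its configuration differently, and the JSON is assumed to be a flat object containing these three keys.

```fsharp
open System.Text.Json

// Hypothetical container for the three settings named in appsettings.json.
type GitHubSettings =
    { Token: string
      RepoOwner: string
      RepoName: string }

// Read the three keys from a flat JSON object; throws if a key is missing.
let readSettings (json: string) : GitHubSettings =
    use doc = JsonDocument.Parse(json)
    let root = doc.RootElement
    { Token = root.GetProperty("GitHubToken").GetString()
      RepoOwner = root.GetProperty("GitHubRepoOwner").GetString()
      RepoName = root.GetProperty("GitHubRepoName").GetString() }

// Example input using the placeholder values from this README.
let sample =
    """{ "GitHubToken": "YOUR-GUID-GITHUB-TOKEN", "GitHubRepoOwner": "MyUser", "GitHubRepoName": "YOUR-REPO-SINGLE-NAME" }"""

let settings = readSettings sample
```

Failing fast on a missing key is a deliberate choice here: a misconfigured token or repository name is easier to diagnose at startup than in the middle of a GitHub API call.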

For testing convenience, when reading issues from your GitHub repo the app loads only unlabeled issues that were created in the past 10 minutes and are therefore candidates for labeling. You can change that setting, though:

Since = Nullable (DateTimeOffset(DateTime.Now.AddMinutes(-10.)))

After predicting the label, the program updates the issue on your GitHub repo with the predicted label.
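
The 10-minute window above is just a cutoff timestamp. As an illustrative sketch, the cutoff could be computed by a small helper so that the window is easy to change and to test. The `sinceMinutesAgo` and `isRecent` functions and the explicit `now` parameter are hypothetical and do not appear in the sample.

```fsharp
open System

// Build the nullable cutoff used as a "since" filter: `minutes` before `now`.
let sinceMinutesAgo (minutes: float) (now: DateTimeOffset) : Nullable<DateTimeOffset> =
    Nullable (now.AddMinutes(-minutes))

// Example: widen the window from 10 minutes to one hour, using a fixed clock.
let now = DateTimeOffset(2019, 8, 6, 12, 0, 0, TimeSpan.Zero)
let cutoff = sinceMinutesAgo 60.0 now

// An issue falls inside the window when it was created at or after the cutoff.
let isRecent (createdAt: DateTimeOffset) = createdAt >= cutoff.Value
```

In the sample itself, the equivalent change is to replace `-10.` in the `Since` assignment with the number of minutes you want.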
