Checking in the samples generated during bug bash for MissingNa, Repl… #2960

Merged
merged 2 commits into from Mar 15, 2019

Conversation

@sfilipi
Member

sfilipi commented Mar 14, 2019

Towards #1209

Gathers the work of PRs #2814, #2779, and #2773.


var samples = new List<DataPoint>()
{
new DataPoint(){ Label = 3, Features = new float[3] {1, 1, 0} },

@wschin

wschin Mar 14, 2019

Member

If Label column is not used in the following transform, we can remove it completely.
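
To make this suggestion concrete, here is a minimal sketch of what the in-memory data could look like with the label dropped. It assumes nothing downstream reads `Label`; the wrapper class `DataPointSketch` and the method `CreateSamples` are hypothetical names used only for illustration, and any ML.NET attributes the real sample's `DataPoint` carries are deliberately left out.

```csharp
using System;
using System.Collections.Generic;

// Input class reduced to the one column the transform reads.
public class DataPoint
{
    public float[] Features { get; set; }
}

public static class DataPointSketch
{
    // Hypothetical helper: builds the in-memory data set without a Label column.
    public static List<DataPoint> CreateSamples()
    {
        return new List<DataPoint>
        {
            // Same feature values as the first row in the hunk above.
            new DataPoint { Features = new float[3] { 1, 1, 0 } },
        };
    }

    public static void Main()
    {
        List<DataPoint> samples = CreateSamples();
        Console.WriteLine($"Rows: {samples.Count}, vector length: {samples[0].Features.Length}");
    }
}
```

If a later step in the sample does print or consume the label, the property has to stay.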

@wschin

wschin approved these changes Mar 14, 2019

Member

wschin left a comment

You have an in-memory class definition, in-memory data set creation, and in-memory prediction. What else can I ask for?

// 'true' where the value in the input column is NaN. This value can be used
// to replace missing values with other values.

IEstimator<ITransformer> pipeline = mlContext.Transforms.IndicateMissingValues("MissingIndicator", "Features");

@rogancarr

rogancarr Mar 14, 2019

Contributor

IEstimat

Blank line above #Resolved
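
For readers of this thread, the semantics described in the code comment above (an output that is 'true' wherever the input is NaN) can be sketched in plain C#. This is only an illustration of the idea and does not call ML.NET; `MissingValueSketch` and `IndicateMissing` are hypothetical names, not part of the library or of this PR.

```csharp
using System;
using System.Linq;

public static class MissingValueSketch
{
    // Hypothetical helper: returns true at each position where the input value is NaN.
    public static bool[] IndicateMissing(float[] features)
    {
        return features.Select(value => float.IsNaN(value)).ToArray();
    }
}

public static class MissingValueSketchDemo
{
    public static void Main()
    {
        // Illustrative input with one missing (NaN) slot.
        var features = new float[] { 1f, float.NaN, 0f };
        bool[] indicator = MissingValueSketch.IndicateMissing(features);

        // Print the indicator vector next to nothing else, one flag per input slot.
        Console.WriteLine(string.Join(", ", indicator));
    }
}
```

The indicator vector has the same length as the input, which is why it can later be used to decide which slots to replace.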


// a small printing utility
Func<object[], string> vectorPrinter = (object[] vector) =>
{

@rogancarr

rogancarr Mar 14, 2019

Contributor

Break out of main code path and into a helper. #Pending

@sfilipi

sfilipi Mar 14, 2019

Author Member

I feel like the main logic is above. Breaking this out would only change what users notice first: the definition of the printing helper or the printing itself.


In reply to: 265789218

// And finally, we can write out the rows of the dataset, looking at the columns of interest.
foreach (var row in rowEnumerable)
{
Console.WriteLine($"Label: {row.Label} Features: {vectorPrinter(row.Features.Cast<object>().ToArray())} MissingIndicator: {vectorPrinter(row.MissingIndicator.Cast<object>().ToArray())}");

@rogancarr

rogancarr Mar 14, 2019

Contributor

.Cast<object>().ToArray()

This is a bit confusing for a sample, IMHO. Maybe better to just have two helper functions? #Pending

@sfilipi

sfilipi Mar 14, 2019

Author Member

It feels self-explanatory, since it casts and then calls ToArray. Adding yet another sample that does the same thing might make the sample look less professional.


In reply to: 265789625
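
One possible middle ground between the two positions in this thread, sketched here as an assumption rather than as what the PR merged: a single generic helper that accepts both the `float` features and the `bool` indicator vector, so the call site needs no `.Cast<object>().ToArray()`. The names `VectorPrinterSketch` and `FormatVector` are hypothetical, and the bracketed output format is a guess, since the body of the sample's `vectorPrinter` is not shown in the hunk.

```csharp
using System;
using System.Collections.Generic;

public static class VectorPrinterSketch
{
    // Hypothetical generic helper: formats any sequence as "[a, b, c]".
    // string.Join<T> calls ToString() on each element, so no cast to object[] is needed.
    public static string FormatVector<T>(IEnumerable<T> vector)
    {
        return "[" + string.Join(", ", vector) + "]";
    }
}

public static class VectorPrinterSketchDemo
{
    public static void Main()
    {
        // Illustrative values standing in for row.Features and row.MissingIndicator.
        var features = new float[] { 1f, float.NaN, 0f };
        var missingIndicator = new bool[] { false, true, false };

        Console.WriteLine(
            $"Features: {VectorPrinterSketch.FormatVector(features)} " +
            $"MissingIndicator: {VectorPrinterSketch.FormatVector(missingIndicator)}");
    }
}
```

The trade-off is the one raised above: the helper moves out of the main code path, which keeps the `foreach` line shorter but puts the definition of printing somewhere other than where it is used.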

{
// Create a new ML context, for ML.NET operations. It can be used for exception tracking and logging,
// as well as the source of randomness.
var ml = new MLContext();

@rogancarr

rogancarr Mar 14, 2019

Contributor

ml

mlContext #Resolved

@rogancarr
Contributor

rogancarr left a comment

Just some nits!
:shipit:

@codecov

codecov bot commented Mar 14, 2019

Codecov Report

Merging #2960 into master will increase coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #2960      +/-   ##
==========================================
+ Coverage   72.29%   72.29%   +<.01%     
==========================================
  Files         796      796              
  Lines      142349   142349              
  Branches    16051    16051              
==========================================
+ Hits       102905   102908       +3     
+ Misses      35063    35062       -1     
+ Partials     4381     4379       -2
| Flag | Coverage Δ |
|---|---|
| #Debug | 72.29% <ø> (ø) ⬆️ |
| #production | 68.01% <ø> (ø) ⬆️ |
| #test | 88.48% <ø> (ø) ⬆️ |

| Impacted Files | Coverage Δ |
|---|---|
| src/Microsoft.ML.Transforms/CategoricalCatalog.cs | 100% <ø> (ø) ⬆️ |
| src/Microsoft.ML.Transforms/ExtensionsCatalog.cs | 75% <ø> (ø) ⬆️ |
| ...ML.Transforms/Text/StopWordsRemovingTransformer.cs | 85.69% <0%> (+0.16%) ⬆️ |
| ...soft.ML.Data/DataLoadSave/Text/TextLoaderCursor.cs | 84.9% <0%> (+0.2%) ⬆️ |
| ...StandardTrainers/Standard/LinearModelParameters.cs | 60.9% <0%> (+0.26%) ⬆️ |

@@ -6,7 +6,7 @@ internal static class Program
{
static void Main(string[] args)
{
CustomMapping.Example();
ReplaceMissingValues.Example();

@shmoradims

shmoradims Mar 15, 2019

Contributor

ReplaceMissingValues

Please don't change this file; it creates unnecessary merge conflicts.

@shmoradims
Contributor

shmoradims left a comment

:shipit:

@sfilipi sfilipi merged commit 9cd9a8c into dotnet:master Mar 15, 2019

3 checks passed

MachineLearning-CI #20190314.31 succeeded
MachineLearning-CodeCoverage #20190314.30 succeeded
license/cla All CLA requirements met.

@sfilipi sfilipi deleted the sfilipi:bugBashSamples branch Mar 15, 2019
