Draft version of ML.NET CLI specs with AutoML capabilities #2693

CESARDELATORRE · 2019-02-22T02:52:06Z

This is a draft version of ML.NET CLI specs to be discussed in the open with the ML.NET community.
Its initial functionality will be based on .NET AutoML (which will also be part of ML.NET).

For further details, read the MLNET-CLI-Specs.md document in the PR.

Related issues:
#2694
#1203

codecov · 2019-02-22T04:09:35Z

Codecov Report

❗ No coverage uploaded for pull request base (master@412e1f9).
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master    #2693   +/-   ##
=========================================
  Coverage          ?    71.7%           
=========================================
  Files             ?      809           
  Lines             ?   142489           
  Branches          ?    16116           
=========================================
  Hits              ?   102174           
  Misses            ?    35885           
  Partials          ?     4430

| Flag | Coverage Δ |
|---|---|
| #Debug | `71.7% <ø> (?)` |
| #production | `67.93% <ø> (?)` |
| #test | `85.9% <ø> (?)` |

eerhardt

This looks like it is going to be an awesome feature that will help many developers use machine learning!


eerhardt · 2019-02-22T15:38:31Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+![alt text](Images/MLNET-AutoML-Positioning.png "ML.NET and AutoML")
+
+As mentioned at the begining of the spec doc, the CLI will be branded as the ML.NET CLI since this CLI will also have additional features where AutoML is not needed.


I'm not sure I follow what this sentence is trying to convey.

the CLI will be branded as the ML.NET CLI since this CLI will also have additional features where AutoML is not needed.

"AutoML" is just a component of ML.NET.... This would be like saying "The .NET CLI will be branded as the .NET CLI since this CLI will also have additional features where Roslyn is not needed".

eerhardt · 2019-02-22T15:41:34Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+- Future versions will be able to run AutoML compute and other compute processes (such as a regular model training) in Azure.
+- The ML.NET CLI  will consume the AutoML API (Microsoft.ML.Auto NuGet package) which will only consume public surface of ML.NET.
+- The CLI proposed here will not provide for “continue sweeping” after sweeping has ended.
+- When running locally with the by default behaviour (no Azure), the CLI will not make any webservice calls and will not require any authentication.


(nit) The wording here is a bit hard to understand: "When running locally with the by default behaviour (no Azure),". Maybe change it to "When training locally, the CLI will not make any ...".

eerhardt · 2019-02-22T15:42:22Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+- Future versions will be able to run AutoML compute and other compute processes (such as a regular model training) in Azure.
+- The ML.NET CLI  will consume the AutoML API (Microsoft.ML.Auto NuGet package) which will only consume public surface of ML.NET.
+- The CLI proposed here will not provide for “continue sweeping” after sweeping has ended.
+- When running locally with the by default behaviour (no Azure), the CLI will not make any webservice calls and will not require any authentication.


the CLI will not make any webservice calls

This begs the question about telemetry. I assume this means this tool will not capture any telemetry, like the .NET CLI does?

We should seriously consider sending opt-in telemetry from CLI.


eerhardt · 2019-02-22T15:44:01Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+- Add additonal commands to do *"machine learning without code"*:
+    - *train*: It will only generate the best model .ZIP file. For example: 
+        - `mlnet train --ml-task Regression --dataset "/MyDataSets/Sales.csv"`


Is `ml` in `--ml-task` redundant? `mlnet train --ml-task`

eerhardt · 2019-02-22T15:44:50Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+The `mlnet new` command provides a CLI oriented way to create projects or solutions such as:
+
+- Create a single project (console app) with:
+    - *Training ML.NET code:* One seggregated method per ranked model, but part of the same console app.


(nit) spelling seggregated

eerhardt · 2019-02-22T15:46:49Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+This argument provides the filepath to either one of the following:
+
+- *A: The whole dataset file:* If using this option and the user is not providing `--test-dataset` and `--validation-dataset`, then cross-validation (k-fold, etc.) or automated data split approaches will be used internally for validating the model. I nthat case, the user will just need to provide the dataset filepath.


(nit) I nthat case

eerhardt · 2019-02-22T15:48:53Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+- `true`
+- `false`
+
+The by default value is `true`. 


Does this mean that if I specify `--has-header` on the command line, it will default to true? Or does it mean that if I DON'T specify `--has-header` it will default to true?

I think in either case it defaults to true.

If so, then the parameter name should be flipped the other way: `--no-header`. It doesn't make sense to have the commands

foo.exe

and

foo.exe --my-option

do the same thing.

Should this rule be applied to all Boolean arguments with default values in general?

Typically, yes.

Look at any other command line. Let's say git.

git-fetch - Download objects and refs from another repository

SYNOPSIS

git fetch [<options>] [<repository> [<refspec>…]]
git fetch [<options>] <group>
git fetch --multiple [<options>] [(<repository> | <group>)…]
git fetch --all [<options>]

By default, git fetch won't fetch from all remotes. But when you specify git fetch --all, it does.

@CESARDELATORRE FYI
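
The flag polarity argued for above can be sketched as follows. This is a minimal illustration using Python's `argparse`, not the actual CLI implementation; the flag name `--no-header` comes from the comment, while the parser itself and the `has_header` helper are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser, for illustration only; not the real mlnet CLI.
    parser = argparse.ArgumentParser(prog="mlnet")
    # With store_false, the value is True when the flag is absent
    # and False when the flag is passed.
    parser.add_argument(
        "--no-header",
        dest="has_header",
        action="store_false",
        help="the dataset file has no header row",
    )
    return parser


def has_header(argv):
    """Return the effective has_header setting for the given arguments."""
    return build_parser().parse_args(argv).has_header
```

With this shape, omitting the flag and passing it never mean the same thing, which is the property the comment asks for.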

eerhardt · 2019-02-22T15:58:10Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+- Will overriden default values just be provided as static values in the CLI commands or also based on a related configuration .JSON file placed along with the CLI executable? See [comparable .JSON files for dotnet CLI templates](https://github.com/dotnet/dotnet-template-samples/blob/master/05-multi-project/.template.config/template.json))
+
+- Support Custom templates for "automlnet new" such as [Custom templates for dotnet new](https://docs.microsoft.com/en-us/dotnet/core/tools/custom-templates)? - That could allow extensibility for other application project types or even for other languages like F# or additional scenarios.


what is "automlnet new"?

eerhardt · 2019-02-22T15:58:15Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+- Stopping criteria  - { Default timeout or timeout provided by the user }
+
+Gleb's (Cesar: Although, ins't this related to AutoML API instead the CLI?):


I don't understand this and the below "priority order" list.

Is this just an author's "TODO Notes" list?

(nit) ins't.

luisquintanilla · 2019-02-22T16:55:29Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+- [Uber Ludwig CLI Blog Post](https://eng.uber.com/introducing-ludwig/)
+- [Uber Ludwig CLI Getting Started](https://uber.github.io/ludwig/getting_started/)
+- [Uber Ludwig CLI syntax](https://uber.github.io/ludwig/user_guide/)
+


Adding this as a potential reference:

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Spark with minimal hand tuning

TransmogrifAI Site
TransmogrifAI Repo

glebuk · 2019-02-22T18:43:46Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+## Context
+
+**Commitments: This specs document is 100% aspirational and will change while it's being discussed and implementation is evolving based on feedback. There are no commitments derived from this document except for the first upcoming minor version at any given time (v0.1, initially).**


v0.1,

If the CLI is bound to a specific version of ML.NET, consider synchronizing the versions in some way, so that it is clear which version of the CLI works with a given version of ML.NET. For example, consider calling the CLI for ML.NET v0.10 "CLI v0.10". Otherwise it would be a version zoo. Then you can simply refer to the CLI for release v0.11 and so forth, and align releases with ML.NET as well.

Coupling like that can't happen unless we want to always tie the versions together. For example, when we ship ML.NET 1.0, will the CLI be 1.0? I assume not... But I agree it does become a version zoo (we on the .NET team know...), but it is necessary until you can sync up the schedules.

See https://github.com/dotnet/designs/blob/master/accepted/sdk-version-scheme.md for how .NET Core tackles this problem.

glebuk · 2019-02-22T18:45:32Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+**Commitments: This specs document is 100% aspirational and will change while it's being discussed and implementation is evolving based on feedback. There are no commitments derived from this document except for the first upcoming minor version at any given time (v0.1, initially).**
+
+The CLI will be branded as the ML.NET CLI since this CLI will also have additional features in addition to AutoML features.


since this CLI will also have additional features in addition to AutoML features.

Remove. This fragment makes no sense in this context, as AutoML has not yet been introduced or mentioned above. Basically, you want to rephrase this paragraph to say that this is an ML.NET CLI that would also use some AutoML features that will be included in the future.

glebuk · 2019-02-22T18:47:35Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+The .NET AutoML API (.NET based) will be part of the [ML.NET](https://github.com/dotnet/machinelearning) API.
+AutoML features will be used for certain important foundational features of the ML.NET CLI.
+
+This specs-doc focuses most of all on the CLI features related to AutoML, but it will also consider (in less detail) the scenarios where AutoML is not needed, so the CLI syntax will be consistent end-to-end for all the possible scenarios in the future. 


specs-doc

What is that? Perhaps we should call it "spec" instead?

glebuk · 2019-02-22T18:47:56Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+# Problem to solve
+
+Customers (.NET developers) have tolds us through many channels that they can get started with [ML.NET](https://github.com/dotnet/machinelearning) and follow the initial simple examples. However, as soon as they have to create their own model to solve their problems, they are blocked because they don't know what learner/algorithms are better for them to pick and use, what hyper-parameters to use or even what data transformations they need to do.


tolds

told


artidoro · 2019-02-22T20:05:08Z

@jwood803, who had an open PR (#1620) on this subject.

glebuk · 2019-02-23T00:07:42Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+We need a way to enable regular .NET developers to easily use [ML.NET](https://github.com/dotnet/machinelearning) to create custom models solving typical ML scenarios in the enterprise. 
+
+If we don't provide a really simple way to use [ML.NET](https://github.com/dotnet/machinelearning) for regular developers (almost no data science knowledge at all), then we won't be able to really "democratize" machine learning for .NET developers. 


If we don't provide a really

Rephrase this as a positive statement: "In order to democratize X, we need to..."

glebuk · 2019-02-23T00:09:06Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+- Regular .NET developers getting started with machine learning while trying to use .NET (C# and F# most of all) for ML.
+
+- Specific developer roles are: enterprise developers, start-up developers, ISV developers and internal MSFT teams developers.


MSFT

internal slang, change to Microsoft

glebuk · 2019-02-23T00:10:55Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+- Specific developer roles are: enterprise developers, start-up developers, ISV developers and internal MSFT teams developers.
+
+
+# Goals


Goals

These should also be goals:

- Ease deployment and productization of models via service templates.
- Teach best practices via generated code templates.

glebuk · 2019-02-23T00:11:57Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+**Foundational features:** 
+
+- Provide an end-to-end **ML.NET CLI** for developers (i.e. *"mlnet new"*) to generate either the final trained model and the pipeline's C#/ML.NET implementation code in a similar fashion to the [.NET Core CLI](https://docs.microsoft.com/en-us/dotnet/core/tools/?tabs=netcore2x). The CLI is also a foundation upon which higher-level tools, such as Integrated Development Environments (IDEs) can rest.


mlnet new

A function should do one thing and do it well, thus:

- `mlnet new` -> generates code.
- `mlnet fit` -> trains a model.
- `mlnet transform` -> scores/inferences a model.

Having simple verbs do simple things would simplify the docs, simplify the API, clarify meaning, and allow each verb to be more powerful.

This is something to be discussed and validated/invalidated by users. My point of view is that "mlnet new" should generate the scoring application code with everything it needs to run (end-to-end), which means it already has the .ZIP file. And since that was already created by AutoML, it can be provided without the user needing to do an additional step (train the model). Training code could be optional for users who want further learning or custom modifications of trainers and hyper-parameters, which is not common for regular .NET developers.

I think we need to think about what .NET developers new to ML.NET would want for their usual workflow more than structure everything in very granular operations ("Function should do one thing and do it well"). The SRP (Single Responsibility Principle) applies to classes and methods, not necessarily to a CLI, which should accommodate the user's workflow most of all.
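
For reference while weighing the two positions, the "one verb, one job" layout from the first comment could look roughly like the sketch below. It uses Python's `argparse` subcommands purely as an illustration; the verbs `new`, `fit` and `transform` are the ones proposed in the comment, `--dataset` appears in the spec, and `--model` is a hypothetical argument added only so that the scoring verb has an input.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical layout, for illustration only; not the real mlnet CLI.
    parser = argparse.ArgumentParser(prog="mlnet")
    verbs = parser.add_subparsers(dest="command", required=True)

    # Each verb does exactly one job and owns only the arguments it needs.
    new = verbs.add_parser("new", help="generate project code")
    new.add_argument("--dataset", required=True)

    fit = verbs.add_parser("fit", help="train a model")
    fit.add_argument("--dataset", required=True)

    transform = verbs.add_parser("transform", help="score data with a trained model")
    transform.add_argument("--model", required=True)  # hypothetical argument
    return parser


def parse(argv):
    """Parse a list of command-line tokens into a namespace."""
    return build_parser().parse_args(argv)
```

The alternative described in the reply, where `new` also trains and ships the model, would fold the `fit` arguments into `new` instead of exposing a separate verb.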

glebuk · 2019-02-23T01:25:52Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+- The ML.NET CLI automation will be able to run locally, on any development environment PC (Windows, Mac or Linux).
+- Future versions will be able to run AutoML compute and other compute processes (such as a regular model training) in Azure.
+- The ML.NET CLI  will consume the AutoML API (Microsoft.ML.Auto NuGet package) which will only consume public surface of ML.NET.
+- The CLI proposed here will not provide for “continue sweeping” after sweeping has ended.


The CLI proposed here will not provide for “continue sweeping” after sweeping has ended.

This should go under future features. This is valuable functionality, but it is not needed for V1.

glebuk · 2019-02-23T01:28:03Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+- The CLI proposed here will not provide for “continue sweeping” after sweeping has ended.
+- When running locally with the by default behaviour (no Azure), the CLI will not make any webservice calls and will not require any authentication.
+- The CLI will provide feedback output (such as % work done or high level details on what's happening under the covers) while working on the long-running tasks.
+- The ML.NET CLI will be aligned and integrated to the [.NET Core CLI](https://docs.microsoft.com/en-us/dotnet/core/tools/?tabs=netcore2x). A good approach is to implement the ML.NET CLI as a [.NET Core Global Tool](https://docs.microsoft.com/en-us/dotnet/core/tools/global-tools) (i.e. named "mlnet" package) on top of the "dotnet CLI". 


integrated

This paragraph basically repeats the one under Foundational features.

glebuk · 2019-02-23T01:30:07Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+### CLI default behaviour and overridability
+
+The CLI will have default behavior for each of these mentioned features – however the CLI by default settings should be able to be overridden by providing new/overriden values in the console command (and optionally the advanced configuration .YAML file and response file .rsp placed along with the CLI executable).


.YAML file and response file .rsp

YAML or RSP, but not both. Having CLI, YAML and RSP is too much. How about CLI + YAML? An RSP file can easily be reproduced with a CMD file.
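
Whichever file format is kept, the override behaviour described in the quoted spec text comes down to a merge order. The sketch below assumes the order defaults < settings file < command line, which is an assumption and not something this thread settles; the function name and the setting keys are hypothetical, and the file is represented as an already-parsed dictionary so that no particular format is implied.

```python
def resolve_settings(defaults, file_settings, cli_settings):
    """Merge settings so that later sources win: defaults < file < command line."""
    resolved = dict(defaults)
    for source in (file_settings, cli_settings):
        # Values that were not supplied (None) must not clobber earlier ones.
        resolved.update({key: value for key, value in source.items() if value is not None})
    return resolved
```

For example, a timeout set in the file is replaced by one given on the command line, while a setting given nowhere keeps its default.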

glebuk · 2019-02-23T01:33:20Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+You can use it with:
+
+```console
+mlnet


add help command: [-h|--help]

glebuk · 2019-02-23T01:35:27Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+mlnet
+```
+
+## Command 'new'


Can we have two commands?

- `new`: only generates the project template. Perhaps it generates featurization and perhaps learning projects and code based on heuristics, similar to the other tool's GUI wizard? The model is obtained by compiling and running the generated project.
- `auto`: does all the AutoML stuff?

glebuk · 2019-02-23T01:37:42Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+(*Release 0.2 examples*)
+
+Simplest command where the tool infers the type of ML taks to perform based on the data:


taks

task

glebuk · 2019-02-23T01:38:37Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+Create and train a model based on parameters specified in the .rsp file plus more advanced model settings in the .yaml file:
+
+` mlnet new @my_cli_config_args.rsp --model-settings-file "./my_model_settings.yaml"  `


Can we have a single format for everything?

glebuk · 2019-02-23T01:40:03Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+--------------- (v0.1) -------------------
+
+--ml-task <value>


value

For each argument:

- specify the default value
- add a short version, such as [--ml-task| -t]
- add the list of supported argument values
- specify whether multiple values can be given

Imagine you have to type each command; it's a pain to have a multi-line CLI. Consider making them as small as possible while maintaining readability.
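
A minimal sketch of what the checklist above asks for, again using Python's `argparse` only as an illustration. The short alias `-t` is taken from the comment; the task list, the default task and the `-d` alias here are assumptions made for the example, not values fixed by the spec.

```python
import argparse

# Hypothetical task list; the supported values are not settled in this thread.
ML_TASKS = ["regression", "binary-classification"]


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser, for illustration only; not the real mlnet CLI.
    parser = argparse.ArgumentParser(prog="mlnet new")
    parser.add_argument(
        "-t", "--ml-task",
        choices=ML_TASKS,        # the list of supported values
        default="regression",    # an assumed default, shown in the help text
        help="ML task to run (default: %(default)s)",
    )
    parser.add_argument(
        "-d", "--dataset",
        required=True,
        help="path to the dataset file",
    )
    return parser


def parse(argv):
    """Parse a list of command-line tokens into a namespace."""
    return build_parser().parse_args(argv)
```

Because the choices and the default live in the argument definition, the generated `--help` output documents them without any extra text.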

glebuk · 2019-02-23T01:41:04Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+  --test-dataset <value>
+]
+
+--label-column-name <value>


Reduce name and index to a single parameter:
[--label-column| -label]. Usually, whether the value is an index or a name can be understood from context.

glebuk · 2019-02-23T01:42:19Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+--label-column-name <value>
+|
+--label-column-index <value>


We must add "feature", groupid, weight, ignore, and rowid columns, at least eventually.
We need to support index syntax for columns, such as `0-4, 5, 10-*`.
Having a feature-columns argument is a P0 feature. Without it, the tool would be impossible to use from the command line for most datasets.
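
The index syntax requested above could be expanded along the following lines. This is a sketch under stated assumptions: indices are taken to be zero-based, `*` is taken to mean "through the last column", and the function name `parse_column_spec` is hypothetical.

```python
from typing import List


def parse_column_spec(spec: str, column_count: int) -> List[int]:
    """Expand a spec such as "0-4,5,10-*" into sorted, unique zero-based indices."""
    indices = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # Assumption: "*" means "through the last column".
            end = column_count - 1 if end_text.strip() == "*" else int(end_text)
        else:
            start = end = int(part)
        if start > end or end >= column_count:
            raise ValueError(f"column range out of bounds: {part!r}")
        indices.update(range(start, end + 1))
    return sorted(indices)
```

Resolving `*` needs the column count, so a parser like this can only run after the dataset header or first row has been read.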

glebuk · 2019-02-23T01:42:38Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+[--has-header <value>]
+
+[--max-exploration-time <value>]


[--timeout | t]

glebuk · 2019-02-23T01:43:02Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+[--verbosity <value>]
+
+[--name <value>]
+[--list-ml-tasks]


this should be part of help.


glebuk · 2019-02-23T01:43:58Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+--ml-task <value>
+
+--dataset <value>


--dataset

Let's be consistent and rename dataset:
[--train-dataset | -data | -d ] -- data used for training or cross-validation.
Consider changing the names for better sortability and readability:
--data-train, --data-test, --data-validation -- that way they will sort nicely and be easy to find.

We had that approach originally (--train-dataset | --dataset), but based on the tests with the CLI it was getting pretty confusing. It is a lot simpler to re-use a single --dataset argument, either for a single data file or for the training dataset in a split approach.
In any case, we'll ask users about this and see what they prefer. I agree that this is a discussion point.

glebuk · 2019-02-23T01:47:31Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+|
+--label-column-index <value>
+
+[--has-header <value>]


header

We need another argument: delimiter.

glebuk · 2019-02-23T01:48:39Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+- Create a single project (console app) with:
+    - *Training ML.NET code:* One seggregated method per ranked model, but part of the same console app.
+    - *Scoring/consuming ML.NET code:* One seggregated method per ranked model, but part of the same console app.


One seggregated method per ranked model

Can we get away with a single method for the various model versions? The signatures for the input/output data of all models will be the same. The only thing that differs is which zip file to load.

glebuk · 2019-02-23T01:50:24Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+### Arguments
+
+Invalid input of arguments should cause it to emit a list of valid inputs and an error message explaining which arg is missing, if that is the case.


Other behaviour questions:
- What happens when we cannot infer the label?
- How do we tell the user which label and features we decided on?
- Should we make certain sections of our CLI interactive? For example, if we infer the label and feature columns, it would be nice for the user to review our choice, optionally edit it, and then accept.

glebuk · 2019-02-23T01:53:03Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+    - *Training project:* Console project with model-training ML.NET code
+    - *Common-code project:* Class library project with common code (Data/Observation class, Prediction class, etc.
+    - *End-user-app project:* End-user application type (depending on template) with ML.NET code scoring the model/s.
+    - *Trained models:* Multiple ranked trained models in the form of several .ZIP files.


Trained models: Multiple ranked trained models in the form of several .ZIP files

That does not make sense to me. If the project has the "train" - why do we need to store models? Every time you run it, you will generate a new zip. End user should not check in models with training code.

jwood803 · 2019-02-23T14:52:03Z

docs/specs/mlnet-cli/MLNET-CLI-Specs.md

+
+- Generate the number of "best models" (project folders and file models) specified by `--best-models-count`
+
+- Simple HTML report with minimum models' metrics.


Random thought, does this have to just be an HTML report? Can it also have an option to output the metrics to the console?

Metrics will be on the console. The HTML report would be an addition for when an extra level of analysis is needed, for instance when comparing multiple models in a chart.
The initial versions of the CLI will only show metrics on the console, though.
So, you're right. 👍

bartczernicki · 2019-02-26T03:23:01Z

I assume we are going to have the public NuGet package be called "Automated ML" (in line with what is in the AML service)? AutoML is a Google product that is different from this functionality.

eerhardt · 2019-05-03T22:19:39Z

@CESARDELATORRE - what's the status of this PR? Can it be either merged or closed?

CESARDELATORRE · 2019-05-04T00:43:36Z

Can you merge it? I’d like to have it as a reference for the upcoming evolution.
We didn’t implement the whole scope planned in there for the first public preview we're releasing.

Keep it as "Draft" in the title, please.
Thanks,

Intial draft version of ML.NET CLI specs

2169ab0

CESARDELATORRE added the enhancement, documentation, and command-line labels Feb 22, 2019

CESARDELATORRE mentioned this pull request Feb 22, 2019

ML.NET CLI using AutoML capabilities (Specs document for discussion) #2694

Closed

Minor updates

097c25f

eerhardt approved these changes Feb 22, 2019

luisquintanilla reviewed Feb 22, 2019

glebuk reviewed Feb 22, 2019

glebuk reviewed Feb 23, 2019

Minor updates

ca7ea85

glebuk reviewed Feb 23, 2019

jwood803 reviewed Feb 23, 2019

CESARDELATORRE added 5 commits February 27, 2019 11:04

Updated remote service design criteria

a1efca3

fix typo

6c2825b

Added --cache-enabled

01adbd9

Updated --cache argument

d9fc1d4

specified version for --cache

4af2bf8

eerhardt merged commit 7b7a2bc into dotnet:master May 6, 2019

kdcllc mentioned this pull request May 6, 2019

Update ML ModelBuilder base class with an interface kdcllc/Bet.AspNetCore#31

Closed

ghost locked as resolved and limited conversation to collaborators Mar 24, 2022


