ToString Featurizer added #4622

Merged
michaelgsharp merged 5 commits into dotnet:master from michaelgsharp:to-string-transformer on Jan 13, 2020

Conversation

@michaelgsharp
Member

michaelgsharp commented Jan 3, 2020

This change adds the ToStringTransformer to the new Featurizers project. It is the fourth in a series of PRs. The ToStringTransformer is implemented in native code, so this is mostly a wrapper around it, with the appropriate entry points for NimbusML as well.

The ToStringTransformer converts values into the appropriate string representations.

This code is auto-generated. The functionality will be migrated into the existing type conversion over the next several weeks and this file will be removed, but it's going in as is to unblock other development work.

@michaelgsharp michaelgsharp requested a review from dotnet/mlnet-core as a code owner Jan 3, 2020
@michaelgsharp michaelgsharp self-assigned this Jan 3, 2020

codecov bot commented Jan 9, 2020

Codecov Report

❗️ No coverage uploaded for pull request base (master@5ed4f0e).
The diff coverage is 82.35%.

@@            Coverage Diff            @@
##             master    #4622   +/-   ##
=========================================
  Coverage          ?   75.75%           
=========================================
  Files             ?      942           
  Lines             ?   170623           
  Branches          ?    18419           
=========================================
  Hits              ?   129255           
  Misses            ?    36256           
  Partials          ?     5112
| Flag | Coverage Δ |
| --- | --- |
| #Debug | 75.75% <82.35%> (?) |
| #production | 71.33% <77.35%> (?) |
| #test | 90.62% <100%> (?) |

| Impacted Files | Coverage Δ |
| --- | --- |
| ....ML.Tests/Transformers/ToStringTransformerTests.cs | 100% <100%> (ø) |
| ...rc/Microsoft.ML.Featurizers/ToStringTransformer.cs | 77.35% <77.35%> (ø) |

@michaelgsharp

Member Author

michaelgsharp commented Jan 10, 2020

This code is auto-generated. The functionality will be migrated into the existing type conversion over the next several weeks and this file will be removed, but it's going in as is to unblock other development work.

using static Microsoft.ML.Featurizers.CommonExtensions;

[assembly: LoadableClass(typeof(ToStringTransformer), null, typeof(SignatureLoadModel),
    ToStringTransformer.UserName, ToStringTransformer.LoaderSignature)]

justinormont Jan 10, 2020

Member

Shouldn't ToStringTransformer be part of the ConvertTransform (info, info, code)?

The Convert transform's purpose is to convert the type of columns, for instance, convert a String to a Float, or an Int32 to Int64.

If I recall, the ConvertTransform didn't allow conversion from numeric types to the String type; I'm unsure of the reasoning, but it could be added.

Historically, the Expression transform has made the ToString() conversion easy: xf=Expression{ col=ColAsString:MyNumericColumn expr={(x) : text(x)} }, and this is available in ML.NET (see: "convert to text").
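
For reference, the same conversion might look roughly like this through the C# API. This is a minimal sketch, not the code in this PR: it assumes the `Expression(outputColumnName, expression, inputColumnNames)` extension on `mlContext.Transforms` from the Microsoft.ML package, writes the lambda with `=>` where the command-line form above uses `:`, and the `Row` class and its data are hypothetical.

```csharp
using System;
using Microsoft.ML;

// Hypothetical input row; the column name mirrors the comment above.
public class Row
{
    public float MyNumericColumn { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);
        var data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new Row { MyNumericColumn = 1.5f },
            new Row { MyNumericColumn = 42f },
        });

        // Assumption: Expression(outputColumnName, expression, inputColumnNames)
        // and the text(x) function named in the comment above.
        var pipeline = mlContext.Transforms.Expression(
            "ColAsString", "(x) => text(x)", "MyNumericColumn");

        var transformed = pipeline.Fit(data).Transform(data);

        // Inspect the output column type rather than assuming a value format.
        Console.WriteLine(transformed.Schema["ColAsString"].Type);
    }
}
```

The sketch prints the output column's type instead of its values because the exact string format produced by `text(x)` is not stated in this thread.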

michaelgsharp Jan 10, 2020

Author Member

You are correct, Justin. This is going in as is for now to unblock the work on ONNX export. Over the next several weeks I will convert this, as well as PRs 4594 and 4597, to live in the correct ML.NET transformer, and then remove these classes.

michaelgsharp Jan 10, 2020

Author Member

Does the Expression transform export to ONNX?

justinormont Jan 11, 2020

Member

There's an issue logged for ONNX for the Expression transform: #4615

Hopefully the ONNX support is added. The particular AutoML ask was for basic math functions { log(a), exp(a), a-b }. With ONNX output for the Expression transform, it would cover those needs. Otherwise the discussion was making independent transforms for each of log/exp/subtract, which would end up with an endless parade of trivial transforms.
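
As a rough illustration of how one transform could cover all three requested functions, here is a minimal sketch. It assumes the `Expression` extension on `mlContext.Transforms` from the Microsoft.ML package and that `log` and `exp` are available in its expression language; the `Pair` class, column names, and data are hypothetical, and nothing here addresses ONNX export.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical two-column input row.
public class Pair
{
    public float A { get; set; }
    public float B { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);
        var data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new Pair { A = 1f, B = 2f },
            new Pair { A = 3f, B = 5f },
        });

        // One transform type, three expressions: log(a), exp(a), a - b.
        var pipeline = mlContext.Transforms.Expression("LogA", "(a) => log(a)", "A")
            .Append(mlContext.Transforms.Expression("ExpA", "(a) => exp(a)", "A"))
            .Append(mlContext.Transforms.Expression("AMinusB", "(a, b) => a - b", "A", "B"));

        var transformed = pipeline.Fit(data).Transform(data);

        // Print the subtraction column to show the pipeline ran end to end.
        foreach (var value in transformed.GetColumn<float>("AMinusB"))
        {
            Console.WriteLine(value);
        }
    }
}
```

Each new function is a new expression string rather than a new transformer class, which is the trade-off against separate log/exp/subtract transforms described above.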

justinormont Jan 14, 2020

Member

/cc @davidbrownellWork : My recommendation for implementing the requested log(a), exp(a), and a-b transforms is to provide ONNX export for the Expression transform.

@michaelgsharp michaelgsharp force-pushed the michaelgsharp:to-string-transformer branch from a811488 to c9d19f6 Jan 11, 2020
@michaelgsharp michaelgsharp merged commit 0a64f9c into dotnet:master Jan 13, 2020
19 checks passed
MachineLearning-CI: Build #20200110.9 had test failures
MachineLearning-CI (Centos_x64_NetCoreApp30 Debug_Build): succeeded
MachineLearning-CI (Centos_x64_NetCoreApp30 Release_Build): succeeded
MachineLearning-CI (MacOS_x64_NetCoreApp21 Debug_Build): succeeded
MachineLearning-CI (MacOS_x64_NetCoreApp21 Release_Build): succeeded
MachineLearning-CI (Ubuntu_x64_NetCoreApp21 Debug_Build): succeeded
MachineLearning-CI (Ubuntu_x64_NetCoreApp21 Release_Build): succeeded
MachineLearning-CI (Windows_x64_NetCoreApp21 Debug_Build): succeeded
MachineLearning-CI (Windows_x64_NetCoreApp21 Release_Build): succeeded
MachineLearning-CI (Windows_x64_NetCoreApp30 Debug_Build): succeeded
MachineLearning-CI (Windows_x64_NetCoreApp30 Release_Build): succeeded
MachineLearning-CI (Windows_x64_NetFx461 Debug_Build): succeeded
MachineLearning-CI (Windows_x64_NetFx461 Release_Build): succeeded
MachineLearning-CI (Windows_x86_NetCoreApp21 Debug_Build): succeeded
MachineLearning-CI (Windows_x86_NetCoreApp21 Release_Build): succeeded
MachineLearning-CodeCoverage: Build #20200110.9 had test failures
MachineLearning-CodeCoverage (Windows_x64 Build_Debug): succeeded
WIP: Ready for review
license/cla: All CLA requirements met.