Porting samples to PowerShell #407

Open
mikeTWC1984 opened this issue May 3, 2019 · 1 comment

mikeTWC1984 commented May 3, 2019

I'm experimenting with translating some samples from C# to PowerShell (Sentiment Analysis for now). PowerShell doesn't work well with some C# features (extension methods, attributes, etc.), but porting doesn't seem impossible. Right now I'm stuck on calling Fit after appending the trainer to the estimator. When I follow the C# sample, I get a "Shuffle input cursor reader" error. If I change the trainer options to disable shuffling, I get a "Splitter/consolidator worker" error instead. Interestingly, if I call Fit directly on the estimator, I get no errors and can run prediction/evaluation, with some dummy results. I get the same error on PowerShell 5.1 and PowerShell Core 6.2. The C# sample works fine on the same machine (with dotnet run).
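To illustrate the extension-method point: PowerShell has no extension-method syntax, so the script further down calls each ML.NET extension as a plain static method and passes the "this" object as the first argument. A minimal sketch of that pattern follows. It uses System.Linq.Enumerable from the base class library as a stand-in so it runs without ML.NET, and the variable names are just for illustration.

```powershell
# Extension methods are ordinary static methods; the object the C# code
# calls them "on" becomes the first argument.
$numbers = [int[]]@(1, 2, 3, 4)

# C#: numbers.Sum()
$sum = [System.Linq.Enumerable]::Sum($numbers)

# C#: numbers.Where(n => n > 2).ToArray()
# The lambda has to be cast to an explicit delegate type.
$isLarge  = [Func[int, bool]] { param($n) $n -gt 2 }
$filtered = [System.Linq.Enumerable]::ToArray([System.Linq.Enumerable]::Where($numbers, $isLarge))

"Sum: $sum"
"Filtered: $($filtered -join ', ')"
```

The ML.NET calls in the script, such as [Microsoft.ML.TextCatalog]::FeaturizeText($mlCOntext.Transforms.Text, ...), have the same shape.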
So at the very least I'd like to clarify whether there is any blocker for PowerShell interacting with ML.NET. I think porting the samples to PowerShell would be extremely useful, so that even non-developers could use them. Below is my code (the commented-out block at the top downloads the data set and libraries if needed).

<# Downloading assemblies and data set

# download nuget if needed
# iwr "https://dist.nuget.org/win-x86-commandline/latest/nuget.exe" -OutFile "nuget.exe"

nuget install Microsoft.ML -version 1.0.0-preview

mkdir bin

gci "*\lib\netstandard*\*.dll" | copy-item -Destination ".\bin"

$url = "https://raw.githubusercontent.com/lucasalexander/mlnet-samples/master/sentiment-analysis/data/yelp_labelled.txt"
Invoke-WebRequest -Uri $url -OutFile "yelp_labelled.txt"

#>


Add-Type -Path "$pwd\bin\*.dll" 


$dataPath = "$pwd\yelp_labelled.txt"

$mlCOntext = [Microsoft.ML.MLContext]::new()

$columns = [System.Collections.Generic.List``1[Microsoft.ML.Data.TextLoader+Column]]::new()

$columns.Add([Microsoft.ML.Data.TextLoader+Column]::new("SentimentText", "String", 0))
$columns.Add([Microsoft.ML.Data.TextLoader+Column]::new("Label", "Boolean", 1))

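# Note: yelp_labelled.txt appears to hold only two tab-separated fields per line (text, label),
# so the three columns below (indices 2-4) have no matching data in the file
$columns.Add([Microsoft.ML.Data.TextLoader+Column]::new("PredictedLabel", "Boolean", 2))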
$columns.Add([Microsoft.ML.Data.TextLoader+Column]::new("Probability", "Single", 3))
$columns.Add([Microsoft.ML.Data.TextLoader+Column]::new("Score", "Single", 4))

$opt = [Microsoft.ML.Data.TextLoader+Options]::new()
$opt.Separators = "`t"
$opt.Columns = $columns
$opt.HasHeader = $false


$dataView = [Microsoft.ML.TextLoaderSaverCatalog]::LoadFromTextFile($mlCOntext.Data, $dataPath, $opt)

# preview data
# [Microsoft.ML.DebuggerExtensions]::Preview($dataView).rowview | foreach { $_.Values.Value -join " | " }


$splitDataView =  $mlCOntext.Data.TrainTestSplit($dataView, 0.2)
$trainSet = $splitDataView.TrainSet
$testSet = $splitDataView.TestSet


$estimator = [Microsoft.ML.TextCatalog]::FeaturizeText($mlCOntext.Transforms.Text, "Features", "SentimentText")


$optTrain = [Microsoft.ML.Trainers.SdcaLogisticRegressionBinaryTrainer+Options]::new()
$optTrain.FeatureColumnName = "Features"
$optTrain.LabelColumnName = "Label"


# this will avoid 'Shuffle input cursor' error, but raise 'Splitter/consolidator' error
#$optTrain.Shuffle = $false  

$trainer = [Microsoft.ML.StandardTrainersCatalog]::SdcaLogisticRegression($mlCOntext.BinaryClassification.Trainers, $optTrain)


$pipe = [Microsoft.ML.LearningPipelineExtensions]::Append($estimator, $trainer, "Everything")


$model = $pipe.Fit($trainSet)  # GETTING ERROR HERE !

# Calling Fit on the estimator alone raises no error, and the predict/evaluate block below runs (with some dummy results)

# $model = $estimator.Fit($splitDataView.TrainSet)

$predict = $model.Transform($TestSet)

$mlCOntext.BinaryClassification.Evaluate($predict, "Label")
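
When a .NET method call fails, PowerShell wraps the original exception, so a message like "Shuffle input cursor reader" may be only one layer of the failure. A helper along these lines can print the whole chain. This is a sketch: Get-ExceptionChain is a hypothetical name, and [int]::Parse stands in for the failing $pipe.Fit($trainSet) call so that the snippet runs without ML.NET.

```powershell
# Hypothetical helper: walk an exception and all of its inner exceptions,
# returning one "Type: message" string per layer.
function Get-ExceptionChain {
    param([System.Exception]$Exception)

    $current = $Exception
    while ($null -ne $current) {
        '{0}: {1}' -f $current.GetType().FullName, $current.Message
        $current = $current.InnerException
    }
}

try {
    # Stand-in for: $model = $pipe.Fit($trainSet)
    [int]::Parse('not a number')
} catch {
    # @() keeps the result an array even if there is only one layer
    $chain = @(Get-ExceptionChain $_.Exception)
    $chain
}
```

Wrapping the Fit call the same way should show whether a more specific exception sits underneath the reported one.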
@CESARDELATORRE (Contributor)

This is more of a 'platform issue' than a purely 'samples issue' (even though you are migrating samples).
Could you open an issue at the ML.NET repo asking for PowerShell support for running the ML.NET API?

Here: https://github.com/dotnet/machinelearning/issues

Btw, I also wanted to highlight our ML.NET CLI, which can be used from PowerShell and CMD (or Bash on macOS and Linux). It might interest you. Check these links:

CLI intro: https://docs.microsoft.com/en-us/dotnet/machine-learning/automate-training-with-cli

CLI Tutorial: https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/mlnet-cli?tabs=windows

Thanks for your interest in ML.NET! 👍
