Add .Net Interactive .ipynb CSV data loading example with C# #23

Open
RandomFractals opened this issue Oct 28, 2022 · 4 comments
Labels: documentation (Improvements or additions to documentation), notebook (Notebooks feature)

Comments

@RandomFractals
Owner

Use the .NET Interactive Notebooks extension: https://marketplace.visualstudio.com/items?itemName=ms-dotnettools.dotnet-interactive-vscode

and the Microsoft.Data.Analysis API: https://learn.microsoft.com/en-us/dotnet/api/microsoft.data.analysis.dataframe?view=ml-dotnet-preview

@RandomFractals added the documentation and notebook labels on Oct 28, 2022
@RandomFractals
Owner Author

Something is off when trying to load the smaller 2022 crimes CSV data file with the Microsoft DataFrame:

[Screenshot: chicago-crimes-dotnet-csv-read]

@RandomFractals
Owner Author

RandomFractals commented Nov 1, 2022

@colombod from the .NET Interactive team suggested trying the latest preview version of the ML.NET libraries using:

#i "nuget:https://pkgs.dev.azure.com/dnceng/public/_packaging/MachineLearning/nuget/v3/index.json"
#r "nuget:Microsoft.Data.Analysis,0.20.0-preview.22514.1"

This uses a daily build of the DataFrame NuGet package, which should be released soon.
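A notebook cell built on these directives might look like the minimal sketch below. The file path `data/crimes-2022.csv` is hypothetical, and passing explicit `string` types for every column through the `dataTypes` parameter of `DataFrame.LoadCsv` is an assumption about what may help, so that the loader does not have to guess column types from the first rows. Whether this avoids the load error reported in this thread has not been verified.

```csharp
#i "nuget:https://pkgs.dev.azure.com/dnceng/public/_packaging/MachineLearning/nuget/v3/index.json"
#r "nuget:Microsoft.Data.Analysis,0.20.0-preview.22514.1"

using System;
using System.IO;
using System.Linq;
using Microsoft.Data.Analysis;

// Hypothetical path: point this at your local copy of the CSV file.
var csvPath = "data/crimes-2022.csv";

// Count the columns from the header row.
// Assumption: the header itself contains no quoted commas.
var columnCount = File.ReadLines(csvPath).First().Split(',').Length;

// Treat every column as a string instead of relying on type guessing.
var allStrings = Enumerable.Repeat(typeof(string), columnCount).ToArray();

var df = DataFrame.LoadCsv(csvPath, dataTypes: allStrings);

// In a notebook, a trailing expression without a semicolon is displayed.
df.Head(5)
```

Loading everything as strings trades typed columns for a more forgiving parse; numeric columns would still need to be converted afterwards.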

Sample ML project notebook:

https://github.com/microsoft/dotnetconf-studentzone/blob/main/Using%20ML.NET%20for%20Machine%20Learning/WaterConsumptionMLproject.ipynb

@RandomFractals
Owner Author

Updated the .NET Interactive notebooks setup to use the new Polyglot Notebooks extension:

https://marketplace.visualstudio.com/items?itemName=ms-dotnettools.dotnet-interactive-vscode

Changed imports to the ML.NET preview NuGet packages listed above.

Still getting a CSV load error, even for the smaller 33 MB file:

[Screenshot: crimes-dotnet-load-csv-error]

@RandomFractals
Owner Author

RandomFractals commented Nov 22, 2022

The ML.NET NuGet package is still very much in beta and can't yet parse CSV files with missing data fields.

The devs suggested trying a third-party Parquet library instead:

https://github.com/G-Research/ParquetSharp.DataFrame
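For reference, reading a Parquet file into a DataFrame with that library could look roughly like the sketch below. It assumes the `ToDataFrame()` extension method on `ParquetFileReader` as described in the ParquetSharp.DataFrame README, an unpinned package version, and a hypothetical file `data/crimes-2022.parquet` that was already converted from the CSV with some other tool. None of this has been tried against the crimes data in this thread.

```csharp
#r "nuget:ParquetSharp.DataFrame"

using Microsoft.Data.Analysis;
using ParquetSharp;

// Hypothetical path: assumes the CSV was already converted to Parquet elsewhere.
var parquetPath = "data/crimes-2022.parquet";

DataFrame df;
using (var reader = new ParquetFileReader(parquetPath))
{
    // ToDataFrame() is an extension method provided by ParquetSharp.DataFrame.
    df = reader.ToDataFrame();
    reader.Close();
}

// In a notebook, a trailing expression without a semicolon is displayed.
df.Head(5)
```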
