Add regression example #19
Comments
Do you want the example to reference the NuGet package, or could it reference the project's libraries directly? Are you using the ReviewPredictor as the Azure example?
I think the best example would be one that references the NuGet package. We could certainly use the ReviewPredictor as an Azure example once we have an alpha version published to nuget.org, but if you are up for creating another example, feel free to go ahead :)
If we were to use a NuGet package: do we need to create a separate NuGet package for each additional storage-related project (i.e. MLOps.NET.Azure, MLOps.NET.SQLite, etc.)? Do we need to recompile the examples whenever we make changes to the NuGet project? That requirement might call for adding a GitHub Action, for example.
I believe we will need a separate package for each new storage-related project. Great question. I don't think the example repo needs to stay in constant sync with our NuGet package updates; since we may have breaking changes, we probably want to update the examples manually. With that said, it may make sense to hold off on this issue for another week or so until the SDK API is a bit more solidified and we have uploaded an alpha version to nuget.org. What we could do in the meantime is update the readme file to include an example.
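In the meantime, a readme example could stay very small, something along the lines of the sketch below. To be clear, the names in it (MLOpsBuilder, UseSQLite, LifeCycle.CreateRunAsync, Evaluation.LogMetricAsync) are only placeholders for whatever shape the API ends up taking, not a confirmed surface.

```csharp
// Purely illustrative readme-style sketch. The types and methods below
// (MLOpsBuilder, UseSQLite, LifeCycle.CreateRunAsync, Evaluation.LogMetricAsync)
// are assumptions about the eventual API shape, not the confirmed SDK surface.
using System;
using System.Threading.Tasks;
// using MLOps.NET;        // assumed namespace
// using MLOps.NET.SQLite; // assumed namespace

public static class ReadmeSketch
{
    public static async Task Main()
    {
        // Configure the SDK against a local SQLite metadata store (assumed extension method).
        var mlOps = new MLOpsBuilder()
            .UseSQLite()
            .Build();

        // Create a run for an experiment and log an evaluation metric against it.
        var runId = await mlOps.LifeCycle.CreateRunAsync("Taxi fare regression");
        await mlOps.Evaluation.LogMetricAsync(runId, "RSquared", 0.85);

        Console.WriteLine($"Logged run {runId}");
    }
}
```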
I have been working through a possible example. I am doing it with SQLite because I don't have an Azure account yet. I have also been viewing your recent streams (super useful) to catch up on everything, and I have been reading the contributing guidelines. I am working through forking and pulling new updates from the source MLOps.NET repository into my fork. It is not as straightforward in GitHub as I expected, so I am planning on having a branch with my example on my fork soon, with all the new updates from this project.

As for my example, the sample dataset is borrowed from the ML.NET samples' regression taxi-fare example. I am finding that the feature engineering pipeline is important to the final model's metrics, and I want to be able to see which data columns and their corresponding transforms were used in a training run, so I can compare the runs in each experiment to see which pipeline produced the best accuracy, for example. Is there a way to serialize and save the ITransformer/pipeline used in training to storage with a run on a specific experiment? The recent issues #49 and #48 brought me to ask this question. I am also looking for confirmation that I am on the right track. Also, did we report or resolve the bug that was suggested in the SQLite library?
I'm going to fix it today.
@sammysemantics great question and insight. I actually don't think we need to store the ITransformers; what I believe we would do in a real-world example is to have different feature branches for each variation of the model training. I do see your point, though, that it may not be practical to have a new feature branch for every small tweak we want to run, so serializing the entire pipeline may be something we actually want to add as a feature in the future, but it may be overkill right now :)
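If we do decide to go down that route later, plain ML.NET can already persist the fitted ITransformer to a zip file, so the SDK side would mostly be about attaching that file to a run. Below is a minimal sketch of the ML.NET half only; how the saved file would be linked to an MLOps.NET run is deliberately left out, since that part of the API doesn't exist yet.

```csharp
using Microsoft.ML;

// Plain ML.NET: persist and reload the fitted ITransformer (the full transformer
// chain, feature engineering included). Attaching the resulting zip file to an
// MLOps.NET run is not shown here, since that part of the SDK is still taking shape.
public static class PipelinePersistence
{
    public static void Save(MLContext mlContext, ITransformer trainedModel, IDataView trainingData)
    {
        // Saves the whole pipeline, so the exact transforms behind a run's metrics
        // can be inspected or reloaded later.
        mlContext.Model.Save(trainedModel, trainingData.Schema, "pipeline.zip");
    }

    public static ITransformer Reload(MLContext mlContext)
    {
        // Reload the pipeline to score new data or compare it against other runs.
        return mlContext.Model.Load("pipeline.zip", out DataViewSchema inputSchema);
    }
}
```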
https://github.com/sammysemantics/MLOps.NET/tree/Issue19_AddingSQLiteExampleProject Here is my branch to start an example for MLOps.NET and MLOps.NET.SQLite. I can't run a build yet because I believe I am hitting the bug in #51, where I get a System.InvalidOperationException when creating an experiment near the beginning. The error also says: "The storage provider has not been properly set up. Please call Rutix, Daniel and usertyuu should you have any questions."

This will be my first pull request. I may not have committed as often as I should have during the process, per what is suggested in the contributing.md page, but I am learning. There is much more work needed, but any advice is welcome. I created another folder called example at the top level to store the projects, and I created a solution folder to separate the example from the actual source code in Visual Studio. I don't know if that is the best approach, so just let me know. Much of this example is inspired by the ReviewPredictor Azure example from @aslotte and the code that is automatically generated by the ML.NET Model Builder. I am also working toward issue #25 on logging the evaluation metrics. I am sure there is much improvement needed; I just can't test it out yet because of the bug.
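For context, the rough shape of the pipeline I am aiming for is sketched below. The column names are assumptions borrowed from the ML.NET taxi-fare sample dataset, and I have left out the MLOps.NET metric-logging calls for issue #25 since that part of the API is still settling.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// Rough sketch of a taxi-fare regression pipeline in the style of the ML.NET samples
// and Model Builder output. Column names and indices are assumptions based on the
// taxi-fare sample dataset; the MLOps.NET metric-logging calls (#25) are omitted.
public class TaxiTrip
{
    [LoadColumn(0)] public string VendorId { get; set; }
    [LoadColumn(4)] public float TripDistance { get; set; }
    [LoadColumn(5)] public string PaymentType { get; set; }
    [LoadColumn(6)] public float FareAmount { get; set; }
}

public static class TaxiFareTraining
{
    public static void Train()
    {
        var mlContext = new MLContext(seed: 0);
        IDataView trainData = mlContext.Data.LoadFromTextFile<TaxiTrip>(
            "taxi-fare-train.csv", hasHeader: true, separatorChar: ',');
        IDataView testData = mlContext.Data.LoadFromTextFile<TaxiTrip>(
            "taxi-fare-test.csv", hasHeader: true, separatorChar: ',');

        // Feature engineering: encode categorical columns, then concatenate the features.
        var pipeline = mlContext.Transforms.Categorical.OneHotEncoding("VendorIdEncoded", "VendorId")
            .Append(mlContext.Transforms.Categorical.OneHotEncoding("PaymentTypeEncoded", "PaymentType"))
            .Append(mlContext.Transforms.Concatenate("Features",
                "VendorIdEncoded", "PaymentTypeEncoded", "TripDistance"))
            .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: "FareAmount"));

        ITransformer model = pipeline.Fit(trainData);

        // Evaluate on the hold-out set; these are the metrics a run would record.
        var metrics = mlContext.Regression.Evaluate(
            model.Transform(testData), labelColumnName: "FareAmount");
        System.Console.WriteLine($"RSquared: {metrics.RSquared}, MAE: {metrics.MeanAbsoluteError}");
    }
}
```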
@sammysemantics first of all, thank you for looking into this and for your willingness to contribute to the repo! I think the issue you are running into is that we don't yet have support for the model repository for SQLite, but this is actually being worked on as we speak by @dcostea in #55. As soon as we merge that, you should be able to rebase from master and it should work better. Regarding the structure, I think naming it […] Let me know if you need any help with creating the PR or if you want to bounce some ideas.
@sammysemantics have a look at #58 as well; it may be a good first step to associate a run with e.g. a comment or a git commit hash. I would welcome your input.
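For example, a run could pick up the current git commit hash with something as simple as the sketch below; this is just one possible approach, not a decided design for #58.

```csharp
using System.Diagnostics;

// One possible way for a run to pick up the current git commit hash.
// Just a sketch of the idea behind #58, not a decided design.
public static class GitInfo
{
    public static string GetCurrentCommitHash()
    {
        var startInfo = new ProcessStartInfo("git", "rev-parse HEAD")
        {
            RedirectStandardOutput = true,
            UseShellExecute = false
        };

        using var process = Process.Start(startInfo);
        string hash = process.StandardOutput.ReadToEnd().Trim();
        process.WaitForExit();
        return hash;
    }
}
```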
@sammysemantics, I have added a binary classification example to the examples project today.
Sure, go ahead. I have a lot of catching up to do in real life. I've been attending the streams, so I'm sure I can contribute later. Thanks!
Ok. Take care.
This issue has been resolved. |
We should add an example solution on how this SDK can be used. It should probably go hand-in-hand with a page for documentation as well.