Extend Elyra tutorial with Neural Magic version. #297

Closed
6 of 7 tasks
pacospace opened this issue Sep 10, 2021 · 19 comments
Assignees
Labels
kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@pacospace

pacospace commented Sep 10, 2021

Is your feature request related to a problem? Please describe.
Extend the Elyra tutorial with a Neural Magic [1] version.

High-level Goals
The Elyra tutorial has one version that focuses on the use of Neural Magic.

Describe the solution you'd like
https://docs.google.com/document/d/1GNICmV_EcI-PPygxvzTupBCIt9HG9MKiEUeVy6YVO2c/edit#heading=h.fbxyxnppymly

Describe alternatives you've considered
n/a

Additional context
This is a key result we would like to achieve in 21Q3.

Related-To: AICoE/aicoe-ci#120
Related-To: AICoE/aicoe-ci#118

Tasks

  • Extend docs for Neural Magic
  • Create new notebooks
  • Add new dataset
  • Add new models
  • Add Elyra pipeline for Neural Magic.

Acceptance Criteria

References

cc @goern @riekrh @TreeinRandomForest

@goern
Member

goern commented Sep 10, 2021

Sounds good to me, thanks for writing this up. Please take it to the next Tech Talk session to circulate it with the team.

@pacospace pacospace added kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Sep 10, 2021
@pacospace
Author

Depends-On: neuralmagic/sparseml#386

@pacospace pacospace self-assigned this Sep 14, 2021
@pacospace
Author

pacospace commented Sep 14, 2021

@pacospace
Author

Related-To: neuralmagic/sparsify#83

@pacospace
Author

pacospace commented Sep 16, 2021

I was able to create an example for MNIST classification using PyTorch + Neural Magic tools. The output is a pruned MNIST ONNX model, optimized with a Neural Magic Sparsify recipe and optimizer.

The next step will be to deploy that ONNX model with the Neural Magic DeepSparse inference engine and maybe compare results against a non-optimized model.

This example can be repeated once we get a GPU, so we can use a bigger model such as ResNet or VGG.
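
For anyone who wants to reproduce this before the notebooks land, the flow is roughly the sketch below: train with a SparseML pruning recipe applied, then export the pruned model to ONNX. This is only a minimal illustration under assumptions, not the tutorial code; recipe.yaml, the tiny classifier, and the MNIST loader are placeholders.

```python
# Minimal sketch (not the tutorial code): prune a small PyTorch MNIST classifier
# with a SparseML recipe and export the result to ONNX for DeepSparse.
# "recipe.yaml" is a placeholder for a Sparsify-generated pruning recipe.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from sparseml.pytorch.optim import ScheduledModifierManager
from sparseml.pytorch.utils import ModuleExporter

model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(28 * 28, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 10),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
train_loader = DataLoader(
    datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor()),
    batch_size=64,
    shuffle=True,
)

# Apply the pruning recipe: the manager wraps the optimizer and sparsifies
# the weights on the schedule defined in the recipe.
manager = ScheduledModifierManager.from_yaml("recipe.yaml")
optimizer = manager.modify(model, optimizer, steps_per_epoch=len(train_loader))

for epoch in range(int(manager.max_epochs)):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(images), labels)
        loss.backward()
        optimizer.step()

manager.finalize(model)

# Export the pruned model to ONNX so it can be served by the DeepSparse engine.
ModuleExporter(model, output_dir="onnx-export").export_onnx(
    sample_batch=torch.randn(1, 1, 28, 28)
)
```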

@pacospace
Author

Nice-to-Have: neuralmagic/sparsify#84

@pacospace
Author

pacospace commented Sep 17, 2021

I was able to deploy DeepSparse locally with the MNIST example and got some very nice performance results. I will now move on to extending the gather-metrics pipeline for the cluster deployment.
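
For reference, the local comparison boils down to something like the sketch below; the ONNX file names are placeholders for a dense baseline and the pruned model exported above, and the numbers will of course depend on your CPU.

```python
# Minimal sketch: time the pruned MNIST ONNX model on the DeepSparse engine
# against a dense baseline. The .onnx paths are placeholders.
import time
import numpy as np
from deepsparse import compile_model

def ms_per_inference(onnx_path: str, iterations: int = 100) -> float:
    engine = compile_model(onnx_path, batch_size=1)
    sample = [np.random.rand(1, 1, 28, 28).astype(np.float32)]
    engine.run(sample)  # warm-up run
    start = time.perf_counter()
    for _ in range(iterations):
        engine.run(sample)
    return (time.perf_counter() - start) / iterations * 1000.0

for name, path in [("dense", "mnist-dense.onnx"), ("pruned", "mnist-pruned.onnx")]:
    print(f"{name}: {ms_per_inference(path):.3f} ms/inference")
```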

@pacospace
Author

pacospace commented Sep 30, 2021

Deployment in the cluster with DeepSparse is currently failing because there are no nodes with the avx2 or avx512 flags, so I'm blocked on that.

@pacospace
Author

pacospace commented Sep 30, 2021

The demo is also blocked until we get KFP in the smaug cluster: https://github.com/operate-first/SRE/issues/382

@pacospace
Author

pacospace commented Oct 1, 2021

Depends-On: operate-first/support#409
Depends-On: operate-first/support#408

@pacospace
Author

pacospace commented Oct 5, 2021

Deployment in the cluster with DeepSparse is currently failing because there are no nodes with the avx2 or avx512 flags, so I'm blocked on that.

The Neural Magic deployment cannot run on the smaug cluster at the moment, but it worked on the rick cluster: neuralmagic/deepsparse#186

I added a note for users:

NOTE: DeepSparse deployment currently works only on CPUs that support the avx2 or avx512 flags (even better if the avx512_vnni flag for VNNI is available: neuralmagic/deepsparse#186). Please check your environment by running cat /proc/cpuinfo to identify the flags and verify that your machine supports the deployment. In the deployment manifests in the nm-inference overlay you need to set the env variable NM_ARCH=avx2|avx512.
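
To make that check a bit more convenient, a small helper like the sketch below works (Linux-only; note that avx512 support shows up in /proc/cpuinfo as avx512f plus extension flags):

```python
# Small helper sketch (Linux-only): report which DeepSparse-relevant instruction
# set flags are present in /proc/cpuinfo on this machine.
def cpu_flags() -> set:
    with open("/proc/cpuinfo") as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
# avx512 support is reported as avx512f (foundation) plus extensions such as avx512_vnni.
for wanted in ("avx2", "avx512f", "avx512_vnni"):
    print(f"{wanted}: {'present' if wanted in flags else 'missing'}")
```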

@pacospace
Author

pacospace commented Oct 15, 2021

The demo is ready to be tested, so we are waiting for https://github.com/operate-first/SRE/issues/382 and operate-first/support#435.

@goern
Member

goern commented Jan 18, 2022

what is the status of this?

@pacospace
Author

what is the status of this?

See the comment above :) we are still waiting for KFP in Operate First.

@sesheta

sesheta commented Apr 18, 2022

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@sesheta sesheta added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 18, 2022
@sesheta

sesheta commented May 18, 2022

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

@sesheta sesheta added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 18, 2022
@sesheta

sesheta commented Jun 17, 2022

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

@sesheta sesheta closed this as completed Jun 17, 2022
@sesheta

sesheta commented Jun 17, 2022

@sesheta: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
