[Modelzoo]Rebuild ESMM to update API and enable DeepRec features #44

Closed
shanzhou2186 opened this issue Apr 1, 2022 · 0 comments
shanzhou2186 commented Apr 1, 2022

Rebuild ESMM to update API and Enable DeepRec Features
Goal
Rebuild ESMM to update API and enable DeepRec Features.

Requirement Details

Features list
Enable the following DeepRec features (feature documentation from Alibaba: https://deeprec.readthedocs.io/zh/latest/index.html):

  • Enabled by default. Test AUC, ACC, and Gsteps; the results need to be close to those measured before rebuilding.

8) Auto Micro Batch same with DeepRec-AI#127
9) FusedEmbedding API, embedding fusion
10) Smart Stage same with DeepRec-AI#122
11) Auto Graph Fusion DeepRec-AI#144
12) CPU Memory Optimization: START_STATISTIC_STEP, STOP_STATISTIC_STEP, jemalloc
14) AdamAsync Optimizer
15) BF16
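
The "close to the result before rebuilding" check for this group could be automated along the following lines. This is only a sketch: the issue does not define how close is close enough, so the tolerance values, the `compare_metrics` helper, and the sample numbers are all assumptions for illustration, not measured results.

```python
import math

# Hypothetical relative tolerances; the issue only says results must be "close".
TOLERANCES = {"auc": 0.01, "acc": 0.01, "gsteps": 0.05}


def compare_metrics(baseline, rebuilt, tolerances=TOLERANCES):
    """Return {metric_name: bool}, True when the rebuilt value is within
    the relative tolerance of the baseline value."""
    report = {}
    for name, rel_tol in tolerances.items():
        report[name] = math.isclose(rebuilt[name], baseline[name], rel_tol=rel_tol)
    return report


# Placeholder numbers for illustration only, not real ESMM results.
baseline = {"auc": 0.7700, "acc": 0.9600, "gsteps": 10.0}
rebuilt = {"auc": 0.7690, "acc": 0.9590, "gsteps": 9.0}

report = compare_metrics(baseline, rebuilt)
for name, ok in report.items():
    print(name, "OK" if ok else "REGRESSION")
```

Note that `math.isclose` is symmetric, so a large speed-up in Gsteps would also be flagged; a one-sided check may fit better if only slowdowns matter.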

  • Disabled by default. Passing the tests is sufficient; matching the performance before rebuilding is not required.

1) Embedding Variable
7) GRPC++ and StarServer
13) Incremental Checkpoint
16) AdagradDecay
2) EmbeddingVariable advanced feature: Embedding Elimination
3) EmbeddingVariable advanced feature: Embedding Filter
4) Dynamic-dimension Embedding Variable
5) Adaptive Embedding
17) WorkQueue

  • Other features: disabled by default. Passing the tests is sufficient; matching the performance before rebuilding is not required. This feature is not yet supported in the feature_column API. We are waiting for Alibaba's update.

6) Multi-Hash Variable

Test

  • All of the features need to be switchable in the code through flags (WDL is the template).
  • The features in the range 8 to 15 that are listed as enabled by default (8 to 12, 14, and 15) need to be enabled by default and pass testing with the same performance as before.
  • Other features need to pass testing; matching performance is not required. Some of these features have known issues that we have already submitted. If a test does not pass, describe the failure clearly.
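
As a rough illustration of the flag requirement above, the defaults could be wired up as below. The flag names and the `build_parser` / `str2bool` helpers are hypothetical and chosen only to mirror the feature list; the WDL template may use different names and types, and the sketch only parses flags without calling any DeepRec API.

```python
import argparse

# Hypothetical flag names derived from the feature list in this issue.
ENABLED_BY_DEFAULT = [
    "auto_micro_batch", "fused_embedding", "smart_stage",
    "auto_graph_fusion", "cpu_memory_optimization", "adam_async", "bf16",
]
DISABLED_BY_DEFAULT = [
    "embedding_variable", "grpcpp_starserver", "incremental_checkpoint",
    "adagrad_decay", "embedding_elimination", "embedding_filter",
    "dynamic_dimension_ev", "adaptive_embedding", "work_queue",
    "multi_hash_variable",
]


def str2bool(value):
    """Parse common true/false spellings given on the command line."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got " + repr(value))


def build_parser():
    """Create one boolean flag per feature with the default from the issue."""
    parser = argparse.ArgumentParser(description="ESMM feature switches (sketch)")
    for name in ENABLED_BY_DEFAULT:
        parser.add_argument("--" + name, type=str2bool, default=True,
                            help="enabled by default")
    for name in DISABLED_BY_DEFAULT:
        parser.add_argument("--" + name, type=str2bool, default=False,
                            help="disabled by default")
    return parser


# Example: turn BF16 off and Embedding Variable on for one test run.
args = build_parser().parse_args(["--bf16", "false", "--embedding_variable", "true"])
print(vars(args))
```

Keeping the two default groups as plain lists makes it easy to check that every feature in the issue has a corresponding switch.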

Other Requirements: Dockerfile and Documents

  • Waiting for Alibaba's requirements

Code Style and commit

  • Python: Keep aligned with DeepRec code.

Maintain

  • All issues and bugs related to this model that come up in the future need to be handled.

Definition of Done

  • The model runs successfully in DeepRec and achieves the same performance as the code before rebuilding.
  • The code is integrated into DeepRec successfully and committed following the DeepRec commit standard.