Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SUBMARINE-1023. Submarine SDK sqlalchemy store (model registry) #752

Closed
wants to merge 8 commits into from

Conversation

KUAN-HSUN-LI
Copy link
Member

What is this PR for?

  • Implement the model registry SQL method in Python SDK
  • Apply sqlalchemy mypy checks
  • Replace submarine tracking_uri with db_uri

What type of PR is it?

[Feature]

Todos

What is the Jira issue?

https://issues.apache.org/jira/browse/SUBMARINE-1023

How should this be tested?

All of the tests are provided in submarine-sdk/pysubmarine/tests/store/model_registry/test_sqlalchemy_store.py

Screenshots (if appropriate)

Questions:

  • Do the license files need updating? No
  • Are there breaking changes for older versions? No
  • Does this need new documentation? No

@KUAN-HSUN-LI
Copy link
Member Author

@pingsutw @jeff-901 Can you help me review the code? Thanks

@cxorm cxorm requested a review from pingsutw September 17, 2021 08:52
submarine-sdk/pysubmarine/submarine/utils/db_utils.py Outdated Show resolved Hide resolved
submarine-sdk/pysubmarine/submarine/utils/db_utils.py Outdated Show resolved Hide resolved
pass

@abstractmethod
def create_registered_model(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

model registry including registered_model, registered_model_tag, model_version, model_version_tag four tables.
tracking including param, metric, experiment three tables.
In my opinion, it is more clear to put these functions in different directories. In this PR, I only worked on the model registry directory maybe I should also refactor the tracking directory and implement the experiment table.

# single object
session.add(objs)

def create_registered_model(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

experiment_id: str,
dataset: str = None,
description: str = None,
tags: List[str] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's different between the tags in model version and registered model?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

registered model tag focuses on what is the goal of this model or the category of this model. model version tag focuses on what this model perform or feature of this model. I think it is a little confused in these two parts. Hopefully, this explanation can help.

@KUAN-HSUN-LI
Copy link
Member Author

@pingsutw @jeff-901 Thanks for your review. I have replaced registered model with model container and replaced model version with model metadata. Additionally, I have fixed serval bugs.

raise SubmarineException("Registered model name cannot be empty.")


def validate_model_version(model_version: int) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

model_version forget to change to model metadata

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function name is fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Contributor

@jeff-901 jeff-901 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@KUAN-HSUN-LI
Copy link
Member Author

@pingsutw @jeff-901 I am so sorry for changing the table name in the database again. I think the previous names are the best. I will explain it as follow:
workflow of model registry:

  1. save models in s3 bucket (unregistered model)
  2. choose the best model and registry it (registered model, version 1)
  3. train the model with different parameters or datasets and save models in s3 bucket (unregistered model)
  4. Again, choose the best model and registry it (registered model, version 2)

Summary:

  • The table is storing registered models so the name registered_model is more proper.
  • The version of the registered model with the same name will define a different version which will be stored in another table. It is also proper to name the table model_version

We also need two tag tables.
example:
image

@asfgit asfgit closed this in 4e4cfd9 Oct 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants