Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create tags table #2036

Merged
merged 37 commits into from
Nov 14, 2023
Merged

Create tags table #2036

merged 37 commits into from
Nov 14, 2023

Conversation

avishniakov
Copy link
Contributor

@avishniakov avishniakov commented Nov 9, 2023

Describe changes

I implemented Tags and TagsResources tables as follows:
image

The connection between Models (as a first entity) and linking table TagResource is soft and controlled by SQLModel, shown only for demo purposes - it is not a real FK on the DB level.

Tags are equipped with the full scope of CLI commands:

zenml tag --help
Usage: zenml tag [OPTIONS] COMMAND [ARGS]...

  Interact with tags.

Options:
  --help  Show this message and exit.

Commands:
  delete    Delete an existing tag.
  list      List tags with filter.
  register  Register a new tag.
  update    Update an existing tag.

How does the connection between entities and tags work?
TagResources

    resource_id: UUID
    resource_type: int
    model: List["ModelSchema"] = Relationship(
        back_populates="tags",
        sa_relationship_kwargs=dict(
            primaryjoin=f"and_(TagResourceSchema.resource_type=={TaggableResourceTypes.MODEL.value}, foreign(TagResourceSchema.resource_id)==ModelSchema.id)",
        ),
    )

Models (same for any other entity)

tags: List["TagResourceSchema"] = Relationship(
        back_populates="model",
        sa_relationship_kwargs=dict(
            primaryjoin=f"and_(TagResourceSchema.resource_type=={TaggableResourceTypes.MODEL.value}, foreign(TagResourceSchema.resource_id)==ModelSchema.id)",
            cascade="delete",
        ),
    )

What are the needed changes for adding other Taggable entities?

  1. Extend zenml.enums.TaggableResourceTypes
  2. Add relationship between tables as shown above
  3. On update handle tags separately inside update in SQL store:
            if model_update.add_tags:
                self._attach_tags_to_resource(
                    tag_names=model_update.add_tags,
                    resource_id=existing_model.id,
                    resource_type=TaggableResourceTypes.MODEL,
                )
            model_update.add_tags = None
            if model_update.remove_tags:
                self._detach_tags_from_resource(
                    tag_names=model_update.remove_tags,
                    resource_id=existing_model.id,
                    resource_type=TaggableResourceTypes.MODEL,
                )
            model_update.remove_tags = None

            existing_model.update(model_update=model_update)
  1. Update object creation codes:
if model.tags:
    attach_tags_to_resource(
        tag_names=model.tags,
        resource_id=model_schema.id,
        resource_type=TaggableResourceTypes.MODEL,
        sql_store=self,
    )
session.commit()

@Cahllagerfeld , this PR changes the Update model for Models by removing tags and introducing add_tags and remove_tags

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • If my change requires a change to docs, I have updated the documentation accordingly.
  • If I have added an integration, I have updated the integrations table and the corresponding website section.
  • I have added tests to cover my changes.
  • I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other (add details above)

@avishniakov avishniakov changed the base branch from main to develop November 9, 2023 15:04
@github-actions github-actions bot added internal To filter out internal PRs and issues enhancement New feature or request labels Nov 9, 2023
Copy link
Contributor

@AlexejPenner AlexejPenner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a general question, how implicit should this whole tagging be. The way I alway imagined it was when I create a resource with a list of tags (strings) the whole db entity creation happens completely abstracted away from me as a user. Even at the API layer I would imagine that we use updates on the resource to set/remove tags. What do you think about this in general.

src/zenml/enums.py Outdated Show resolved Hide resolved
src/zenml/enums.py Outdated Show resolved Hide resolved
pass


def delete_links(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea where to write this - but do we have any way of implementing cascading delete? Like if I delete resource 1 - will all entries for tags on resource 1 be deleted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be cascaded - this is need for removal of tags via update, I'll rename it for clarity

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This folk is cascading (set in Models and other taggable entities going forward)

tags: List["TagResourceSchema"] = Relationship(
        back_populates="model",
        sa_relationship_kwargs=dict(
            primaryjoin=f"and_(TagResourceSchema.resource_type=={TaggableResourceTypes.MODEL.value}, foreign(TagResourceSchema.resource_id)==ModelSchema.id)",
            cascade="delete",
        ),
    )

@@ -267,7 +267,7 @@ def update_model(
trade_offs=tradeoffs,
ethics=ethical,
limitations=limitations,
tags=tag,
add_tags=tag,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect this to take a list of strings not just one string

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a list, just the limitation of click - the name of arg must match the name of arg in the function and the user pass it as -t a -t b ...

@avishniakov
Copy link
Contributor Author

Just a general question, how implicit should this whole tagging be. The way I alway imagined it was when I create a resource with a list of tags (strings) the whole db entity creation happens completely abstracted away from me as a user. Even at the API layer I would imagine that we use updates on the resource to set/remove tags. What do you think about this in general.

It is exactly like you described - the user passes in a list of strings, and creation/retrieval happens under the hood. Next with Update models one can add or remove tags

Copy link
Contributor

@fa9r fa9r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, love the overall design, very clean how tags and resources are connected now, well done!!

Overall, the entire PR looks great and very polished already. High-level the only change I'd request is to handle the link creation inside the update_{resource} ZenStore methods and get rid of the TagResource endpoints.

Also, I noticed the RestZenStore methods are still missing, but I'm sure integration tests would reveal that sooner or later anyways.

tags: List[TagResponseModel] = Field(
title="Tags associated with the model",
)
tagged: List[str]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tagged seems redundant, this is just tag.name for tag in tags right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for CLI only, yes it is a bit redundant

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case I'd suggest to delete that arg and simply call tag.name for tag in tags wherever you need it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

src/zenml/models/tag_models.py Show resolved Hide resolved
src/zenml/enums.py Outdated Show resolved Hide resolved
src/zenml/utils/tag_utils.py Outdated Show resolved Hide resolved
src/zenml/zen_server/routers/tags_endpoints.py Outdated Show resolved Hide resolved
src/zenml/zen_stores/schemas/__init__.py Outdated Show resolved Hide resolved
src/zenml/zen_stores/schemas/__init__.py Outdated Show resolved Hide resolved
src/zenml/zen_stores/zen_store_interface.py Outdated Show resolved Hide resolved
src/zenml/zen_stores/schemas/model_schemas.py Outdated Show resolved Hide resolved
src/zenml/zen_stores/schemas/model_schemas.py Outdated Show resolved Hide resolved
@fa9r
Copy link
Contributor

fa9r commented Nov 10, 2023

Just a general question, how implicit should this whole tagging be. The way I alway imagined it was when I create a resource with a list of tags (strings) the whole db entity creation happens completely abstracted away from me as a user. Even at the API layer I would imagine that we use updates on the resource to set/remove tags. What do you think about this in general.

Yes, that's exactly what I meant as well with links being created in update methods instead 👍

Copy link
Contributor

@fa9r fa9r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful 🙌

@@ -6560,14 +6563,17 @@ def create_tag_resource(
def delete_tag_resource(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this method and the other still used now that the interface no longer exists?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, called inside helpers.

Copy link
Contributor

E2E template updates in examples/e2e have been pushed.

Copy link
Contributor

@stefannica stefannica left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautifully crafted PR ! I only have a couple of suggestions, but nothing that would keep this from being merged.

src/zenml/zen_stores/schemas/tag_schemas.py Show resolved Hide resolved
src/zenml/utils/tag_utils.py Show resolved Hide resolved
@avishniakov avishniakov merged commit d847b31 into develop Nov 14, 2023
30 checks passed
@avishniakov avishniakov deleted the feature/OSS-2610-create-tags-table branch November 14, 2023 05:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request internal To filter out internal PRs and issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants