[feat] Add activeloop deeplake plugin #2594

Merged

drahnreb
Contributor

The hub package has been renamed to deeplake.

I added a DeeplakeDataset that covers the old HubDataset behavior.

Including:

  • tests
  • docstring with example usage
  • deprecation warning for HubDataset
  • UserWarning when a deeplake dataset with uncommitted changes is logged to a run

@CLAassistant

CLAassistant commented Mar 16, 2023

CLA assistant check
All committers have signed the CLA.

@SGevorg
Member

SGevorg commented Mar 16, 2023

Hi @drahnreb, thanks for the contribution, it's exciting. 🙌
Any chance you could also open an issue for this, as per our contribution guide?

@drahnreb
Contributor Author

@SGevorg Done: #2595

Could you remove the second duplicated CLA workflow? I cannot approve twice. Thanks.

@aimhubio aimhubio deleted a comment from CLAassistant Mar 16, 2023
@gorarakelyan
Contributor

@drahnreb just removed the duplicated workflow.

@drahnreb
Contributor Author

> @drahnreb just removed the duplicated workflow.

Thanks @SGevorg

I'm done. Feel free to merge after reviewing.

Member

@alberttorosyan alberttorosyan left a comment

@drahnreb, awesome job! Thank you for taking care of this.
Left a couple of comments/questions.

"""
AIM_NAME = 'deeplake.dataset'

def __init__(self, dataset: deeplake.Dataset, auto_commit: bool = True, auto_save_view: bool = True):
Member

Thinking out loud here: what are reasonable defaults? Could the automatic commit/save have performance implications that the user would not expect?
CC @gorarakelyan

Contributor Author

@drahnreb drahnreb Mar 17, 2023

Yes, saving views could. Committing should be fairly fast.

Without either, you won't have actual traceability for the run, and attaching only the dataset meta information is almost useless.

Defaulting both to False disables the automatic commit/save, emits warnings instead, and leaves the user free to commit and save before the run (as should be done in the first place).
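
To make the trade-off concrete, here is a minimal sketch of the opt-in behavior described above. It is not the plugin's actual code: `FakeDataset`, `has_uncommitted_changes`, and `prepare_for_tracking` are hypothetical names made up for illustration, and the stand-in class does not mirror the deeplake API. The assumption is simply that a dataset can report uncommitted changes and can be committed.

```python
import warnings


class FakeDataset:
    """Hypothetical stand-in for a versioned dataset (not the deeplake API)."""

    def __init__(self, has_uncommitted_changes: bool):
        self.has_uncommitted_changes = has_uncommitted_changes
        self.commits = []

    def commit(self, message: str) -> str:
        # Record the commit and mark the dataset as clean.
        self.commits.append(message)
        self.has_uncommitted_changes = False
        return f"commit-{len(self.commits)}"


def prepare_for_tracking(dataset, auto_commit: bool = False):
    """Return the id of a newly created commit, or None if none was made."""
    if not dataset.has_uncommitted_changes:
        # Nothing to do: the current state is already traceable.
        return None
    if auto_commit:
        # Opt-in: commit on the user's behalf before logging.
        return dataset.commit("auto-commit before logging to run")
    # Default: leave the dataset untouched, but tell the user.
    warnings.warn(
        "Dataset has uncommitted changes; the logged metadata may not "
        "identify the exact data used in this run.",
        UserWarning,
        stacklevel=2,
    )
    return None
```

With `auto_commit=False`, a dataset with pending changes is left as-is and only a `UserWarning` is raised; with `auto_commit=True`, it is committed first. A clean dataset passes through silently in both cases.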

Contributor Author

@drahnreb drahnreb Mar 17, 2023

resolved with 78bdb80

@@ -4,6 +4,11 @@

@CustomObject.alias('hub.dataset')
class HubDataset(CustomObject):
from aim.utils.deprecation import deprecation_warning
deprecation_warning(remove_version='3.17', msg='Using HubDataset is deprecated!\
Member

Is this really a deprecation? What if one continues to use the older version of deeplake?

Contributor Author

That's a valid question. The rename happened only about half a year ago, apparently without backward compatibility or an official deprecation warning. I should have split the deprecation from the deeplake implementation, especially since both implementations work in parallel.

aim/sdk/objects/plugins/deeplake_dataset.py (review thread resolved)
@drahnreb drahnreb changed the title from "[feat] add activeloop deeplake plugin" to "[feat] Add activeloop deeplake plugin" Mar 17, 2023
@drahnreb
Contributor Author

drahnreb commented Mar 17, 2023

Renamed the PR, since the PR naming check failed.

@drahnreb drahnreb force-pushed the drahnreb/fix-activeloop-deeplake-plugin branch from 5f56826 to f68b398 March 17, 2023 09:27
@drahnreb
Contributor Author

drahnreb commented Mar 17, 2023

I rebased and removed the deprecation from this PR, and will open a dedicated PR to discuss it. The hub package is stale and unmaintained, so it is up to you to decide how long to keep supporting it in aim.

@drahnreb drahnreb force-pushed the drahnreb/fix-activeloop-deeplake-plugin branch from f68b398 to 78bdb80 March 17, 2023 20:53
Member

@alberttorosyan alberttorosyan left a comment

Approving the PR, as all comments have been addressed.
Will add a CHANGELOG.md entry and merge the changes.

@alberttorosyan alberttorosyan merged commit bdc109a into aimhubio:main May 4, 2023
mihran113 pushed a commit that referenced this pull request May 31, 2023