Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(core): service add HuggingFace file system #3670

Merged
merged 20 commits into from
Dec 2, 2023

Conversation

morristai
Copy link
Member

Description

close: #3605

Addition Note

  • HuggingFace doesn't host official HTTP API docs. Detailed HTTP request API information can be found on the HuggingFace Hub's source code.
  • For message.rs, I'm unsure whether we should drop unused fields or add #[allow(dead_code)] to disable warnings.

core/src/services/huggingface/backend.rs Show resolved Hide resolved
core/src/services/huggingface/backend.rs Show resolved Hide resolved
core/src/services/huggingface/backend.rs Outdated Show resolved Hide resolved
core/src/services/huggingface/core.rs Show resolved Hide resolved
core/src/services/huggingface/core.rs Outdated Show resolved Hide resolved
core/src/services/huggingface/core.rs Outdated Show resolved Hide resolved
core/src/services/huggingface/message.rs Outdated Show resolved Hide resolved
core/src/services/huggingface/message.rs Outdated Show resolved Hide resolved
core/src/services/huggingface/mod.rs Outdated Show resolved Hide resolved
Copy link

@Wauplin Wauplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there! I'm @Wauplin, maintainer of huggingface_hub -the Python client library for the Hugging Face Hub-. This is a very cool integration! I quickly had a look and added a few high-level comments about the existing huggingface ecosystem (to better align the naming conventions).

I also wanted to point you to https://github.com/huggingface/hf-hub which is the rust client to interact with the Hub. Though it's not as complete as the Python equivalent, it already has support for most file read/write use cases -including compatibility with the HF cache system- . It would be good to check what's there to avoid duplicating logic across libraries. cc @Narsil who worked on this

core/src/services/huggingface/backend.rs Show resolved Hide resolved
core/src/services/huggingface/mod.rs Outdated Show resolved Hide resolved
core/src/types/scheme.rs Outdated Show resolved Hide resolved
core/src/services/huggingface/docs.md Outdated Show resolved Hide resolved
@Xuanwo
Copy link
Member

Xuanwo commented Nov 28, 2023

I also wanted to point you to https://github.com/huggingface/hf-hub which is the rust client to interact with the Hub. Though it's not as complete as the Python equivalent, it already has support for most file read/write use cases -including compatibility with the HF cache system- .

Thank you for the advice, hf-hub is an excellent project! However, OpenDAL prefers to implement all APIs using its own HttpClient to ensure full control, eliminating the need to interact with various SDKs and keep it's deps tree as low as possible.

@Wauplin
Copy link

Wauplin commented Nov 28, 2023

Thank you for the advice, hf-hub is an excellent project! However, OpenDAL prefers to implement all APIs using its own HttpClient to ensure full control, eliminating the need to interact with various SDKs and keep it's deps tree as low as possible.

Fair enough @Xuanwo! I just wanted to make sure you know about it :)

@Xuanwo
Copy link
Member

Xuanwo commented Nov 28, 2023

I just wanted to make sure you know about it :)

You are so nice! And thanks for the review. ❤️

core/src/services/huggingface/backend.rs Show resolved Hide resolved
core/src/services/huggingface/backend.rs Outdated Show resolved Hide resolved
core/src/services/huggingface/backend.rs Outdated Show resolved Hide resolved
core/src/services/huggingface/core.rs Outdated Show resolved Hide resolved
core/src/services/huggingface/core.rs Outdated Show resolved Hide resolved
core/src/services/huggingface/lister.rs Outdated Show resolved Hide resolved
core/src/types/scheme.rs Outdated Show resolved Hide resolved
core/src/types/scheme.rs Outdated Show resolved Hide resolved
@morristai
Copy link
Member Author

Seems not fixed. With ConfigDeserializer, we don't need to use write map.get("abc") by hand.

(It appears that I am unable to reply in that review block, so I'll respond here.)
@Xuanwo, Sorry, it's my negligence. But I want to point out that I'm referring to service Etcd. It seems like it's also miss refactored?
https://github.com/apache/incubator-opendal/blob/0711e6ce97660380628b72a010d092fd29ec9553/core/src/services/etcd/backend.rs#L200

@Xuanwo
Copy link
Member

Xuanwo commented Nov 30, 2023

Sorry, it's my negligence. But I want to point out that I'm referring to service Etcd. It seems like it's also miss refactored?

Yes, there is an on-going effort to refactor them one by one: #3240

@morristai
Copy link
Member Author

Yes, there is an on-going effort to refactor them one by one: #3240

I believe there might be a misunderstanding. After checking the refactoring progress list and the pull request code, it appears that Etcd has already been refactored, but the from_map part seems to have been overlooked. Could you please verify this from your end? I'm not certain if I've missed something here. (By the way, I am currently on a business trip, so my response may be delayed.)

@Xuanwo
Copy link
Member

Xuanwo commented Dec 1, 2023

I believe there might be a misunderstanding. After checking the refactoring progress list and the pull request code, it appears that Etcd has already been refactored, but the from_map part seems to have been overlooked. Could you please verify this from your end?

Seems etcd missed this part, we need to fix it, tracked at #3702

By the way, I am currently on a business trip, so my response may be delayed

Enjoy your trip!

Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@Xuanwo Xuanwo merged commit 547c23d into apache:main Dec 2, 2023
171 checks passed
@Xuanwo
Copy link
Member

Xuanwo commented Dec 2, 2023

I have setup a repo at https://huggingface.co/datasets/opendal/huggingface-testdata

Would you like to add a behaivor test for this?

@morristai
Copy link
Member Author

I have setup a repo at https://huggingface.co/datasets/opendal/huggingface-testdata

Would you like to add a behaivor test for this?

Yes, I'd love to work on that.

@morristai morristai deleted the feat/huggingface_filesystem_support branch December 5, 2023 00:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add hugging face file system support
4 participants