Add support for resumable walk #2786

Open
prabirshrestha opened this issue Aug 6, 2023 · 3 comments

Comments

@prabirshrestha

I would like to implement indexing using OpenDAL and need a mechanism to pause and resume a walk. While I could write code that simply waits, I would prefer native support for this, since it would free up my queue for other jobs.

@Xuanwo
Member

Xuanwo commented Aug 6, 2023

Lister is a stream, which means you can store it in memory and only call next when necessary. If you want to persist the state of the lister on disk and resume the list later, you can use list_with("path").start_after("last_key"). This way, you will need to save the last key returned by Lister, pass it back, and allow listing from that point onwards.

For now, only the s3 service implements this feature. Please let me know which service you want to use.
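The save-the-last-key pattern described above can be sketched without any OpenDAL dependency. This is only an illustration under stated assumptions: `FakeStore`, `list`, and `index_batch` are hypothetical names invented here, the store is an in-memory sorted set standing in for a backend that lists keys in order, and the `start_after` semantics (yield only keys strictly greater than the given key) are assumed to match what the real option does.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a backend that lists keys in sorted order.
/// It is NOT an OpenDAL type; it only mimics the `start_after` behaviour.
struct FakeStore {
    keys: BTreeSet<String>,
}

impl FakeStore {
    fn new(keys: &[&str]) -> Self {
        FakeStore {
            keys: keys.iter().map(|k| k.to_string()).collect(),
        }
    }

    /// Lazily yields keys strictly greater than `start_after`
    /// (or every key when no checkpoint is given).
    fn list<'a>(
        &'a self,
        start_after: Option<&'a str>,
    ) -> impl Iterator<Item = &'a String> + 'a {
        self.keys.iter().filter(move |k| match start_after {
            Some(last) => k.as_str() > last,
            None => true,
        })
    }
}

/// Processes at most `budget` keys, then returns the processed keys together
/// with the checkpoint (the last key seen) that the caller should persist.
fn index_batch(
    store: &FakeStore,
    checkpoint: Option<&str>,
    budget: usize,
) -> (Vec<String>, Option<String>) {
    let batch: Vec<String> = store.list(checkpoint).take(budget).cloned().collect();
    // If nothing new was listed, keep the previous checkpoint unchanged.
    let next = batch
        .last()
        .cloned()
        .or_else(|| checkpoint.map(str::to_string));
    (batch, next)
}
```

A job would call `index_batch` with the checkpoint it loaded from disk, write the returned checkpoint back, and release its queue slot. An empty batch means the walk has finished.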

@prabirshrestha
Author

start_after is what I'm interested in.

Since I'm targeting self-hosted scenarios, I would love to have support for the OS filesystem and WebDAV (which works for Nextcloud as well as Synology NAS) first.

@prabirshrestha
Author

I'm now at the point where I need to walk and index the files. I got MinIO (S3) working with my app, but according to minio/minio#15496, MinIO no longer supports mounting an existing directory.

Do you have an ETA for when this will be available, so I can plan accordingly?

Another option I can think of is a generic start_after. It could be implemented for all backends by always sorting entries alphabetically and walking them in that order, while each backend remains free to provide a custom implementation that performs better for it. I plan to support as many backends as OpenDAL provides, so a generic implementation seems like it would benefit everyone.
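The generic fallback proposed above might look roughly like the following. This is a sketch of the idea rather than anything OpenDAL provides: `generic_start_after` is a hypothetical helper, it assumes the full listing for a directory fits in memory, and it uses Rust's default byte-wise string ordering as the "alphabetical" order.

```rust
/// Hypothetical generic fallback: sort the complete listing, then keep only
/// the entries that come strictly after `start_after`.
fn generic_start_after(mut entries: Vec<String>, start_after: &str) -> Vec<String> {
    // Backends such as a local filesystem may return entries in any order,
    // so impose a deterministic one first.
    entries.sort();
    // Index of the first entry that is strictly greater than the checkpoint.
    let first = entries.partition_point(|e| e.as_str() <= start_after);
    // Drop everything up to and including the checkpoint.
    entries.split_off(first)
}
```

The trade-off is that every resume has to list and sort the whole directory again before skipping ahead, which is why a backend with a native start_after would still want its own implementation.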
