
Unclear if filelocker has to be used with filestore #1110

Closed
mitar opened this issue Apr 14, 2024 · 7 comments
mitar commented Apr 14, 2024

Question

So I am reading the embedding documentation, and the filestore package states for its New function:

> In addition, a locking mechanism is provided.

But looking at its implementation, I do not see any locking.

So should filelocker (or even memorylocker) be used together with filestore or not?

Setup details

I am trying to use tusd embedded in my own Go server.

mitar added the question label Apr 14, 2024
Acconut (Member) commented Apr 17, 2024

Yes, you should always use a locker. Please read https://tus.github.io/tusd/advanced-topics/locks/ for more details.

The locker implementation has been pulled out of filestore into its own package, filelocker, and I will adjust the comments. Thanks for letting us know. We should also consider making the locker interface mandatory in tusd, so that people do not leave it out accidentally.
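The wiring Acconut describes can be sketched in code. A minimal example, assuming the tusd v2 package layout (`pkg/filestore`, `pkg/filelocker`, `pkg/handler`); check the current docs for exact signatures:

```go
package main

import (
	"log"
	"net/http"

	"github.com/tus/tusd/v2/pkg/filelocker"
	"github.com/tus/tusd/v2/pkg/filestore"
	tusd "github.com/tus/tusd/v2/pkg/handler"
)

func main() {
	composer := tusd.NewStoreComposer()

	// filestore persists uploads on disk but does NOT lock them itself.
	store := filestore.New("./uploads")
	store.UseIn(composer)

	// filelocker adds the missing exclusive-access guarantee; without it,
	// concurrent PATCH requests to the same upload can corrupt data.
	locker := filelocker.New("./uploads")
	locker.UseIn(composer)

	h, err := tusd.NewHandler(tusd.Config{
		BasePath:      "/files/",
		StoreComposer: composer,
	})
	if err != nil {
		log.Fatalf("unable to create handler: %v", err)
	}

	http.Handle("/files/", http.StripPrefix("/files/", h))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Both `UseIn` calls matter: registering only the store reproduces the lock-free setup this issue is about.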

mitar (Author) commented Apr 17, 2024

Oh, I missed that section completely because I went straight for the "embedding in Go program" section. Yes, that section nicely explains it.

> We should also consider making the locker interface mandatory in tusd, so that people do not leave it out accidentally.

Yea.

On the other hand, there could also be a different protocol design, where each "upload" would be sent to a temporary file ID and only renamed to the target name at the end. Similar to how you support uploading in parallel to different files and then combining them, we could use the same design always, and then no lock is necessary: every client just starts uploading its chunk. You only have to store metadata about which chunk it is so that you can later combine it into the full file. Duplicate chunks might get uploaded, but that is fine; the extra data just gets discarded when combining.

Acconut added a commit that referenced this issue Apr 23, 2024
This should help users like #1110
Acconut (Member) commented Apr 23, 2024

> Oh, I missed that section completely because I went straight for the "embedding in Go program" section. Yes, that section nicely explains it.

I have pushed a few commits updating the documentation and examples in several places, which will hopefully make the relevance of lockers easier to understand for new users. Feel free to check them out and let me know if we could do better. The commits are linked above my comment.

We will also discuss making the locker mandatory in future releases. Using tusd programmatically without a locker would then produce a meaningful error. See #1119

> On the other hand, there could also be a different protocol design, where each "upload" would be sent to a temporary file ID and only renamed to the target name at the end. Similar to how you support uploading in parallel to different files and then combining them, we could use the same design always, and then no lock is necessary: every client just starts uploading its chunk. You only have to store metadata about which chunk it is so that you can later combine it into the full file. Duplicate chunks might get uploaded, but that is fine; the extra data just gets discarded when combining.

Yes, that would be a different approach with its own advantages and drawbacks. It sounds similar to how S3 handles multipart uploads and works well in some applications. But that is not the design we have with tus :)

Acconut (Member) commented Apr 23, 2024

By the way, I will close this issue for now since I feel that your original question was answered. Let me know if that's not the case and I will reopen this. If you have other topics, feel free to open a new issue.

Acconut closed this as completed Apr 23, 2024
mitar (Author) commented Apr 23, 2024

Awesome, thanks.

> Yes, that would be a different approach with its own advantages and drawbacks.

Do you have a reference or something on the drawbacks of such an approach? Just out of curiosity. (Especially given that you do support this for parallel uploading.)

Acconut (Member) commented Apr 25, 2024

One drawback of an S3-multipart-like upload is that you need an additional request to commit the upload, in which the client tells the server which parts to use in which order.

In addition, it would be harder for clients to resume an upload. With tus' approach, the client retrieves the offset and continues the upload. With a multipart-like upload, the client would have to retrieve a list of uploaded parts and then figure out how to resume from there.

These are not unsolvable issues, but still hurdles.

mitar (Author) commented Apr 25, 2024

Thanks!
