Where is the uploaded file? #60
Comments
I would have to disagree with you, tus is designed to be deployed at massive scale. What if two users upload a file called Should:
The scenario for If your app is not susceptible to the listed concerns and you really must have a folder with the original filenames for your use case, you can still use tusd hooks to rename the file as soon as it has been written to disk. Hooks are separate programs that are handed the metadata and can act on it. They can be written in any language as long as they are executable. But I'd be very careful about the mentioned concerns if you go that route. Does this make sense, or do you think we overlooked something here? For reference, the original ticket mentioned is #44 |
Thanks kvz for the quick answer. That's a very valuable use case indeed; it seems to be essentially an object store like OpenStack Swift, Amazon S3, Caringo Swarm, Go's own Minio, or many KV databases capable of large values, but embeddable and extensible.

In our use case, we are uploading dozens to thousands of medical images, each 100 GB to 1 TB in size, into controlled-access folders. Other applications need to access the images by their original file names and extensions, and there will be no filename conflicts. Some of the other applications are written in ASP.NET, some in Go.

So far, I've embedded Tusd in a Go wrapper that controls the login and target directory. I'm assuming that without rewriting parts of Tusd (I'm not a fan of forking projects, it's the scourge of modern collaborative deb development...), my next step would be to read the .info file, get the original filename, rename the .bin file, and then delete the .info file. Am I on the right track, or might you suggest an easier/better approach? Your mention of hooks above might be obviated since I'm using Go on the backend.

Ideally, perhaps Tusd could have two modes in the upload config: one as now, and one that preserves filenames?

Also, unrelatedly, does Tusd support HTTP/2, and does it use multiple HTTP streams to speed up chunk delivery by using more of the available bandwidth? WebSockets for a maintained connection? Go's gobs for encoding/decoding speed? Just thoughts if not yet implemented. Thanks! |
It's not an object store itself, but we do offer adapters for S3, Google Cloud Storage, etc. tus is really only about the transfer, not the storage per se.
Wow that's super interesting. We'd love to cover that in a case study if you're comfortable with that.
If you can, I would avoid running a fork as well. I think hooks are the way to fly. That way you can run a release binary, which will prove helpful if you ever run into an issue. It is harder for the community to replicate failures in custom builds, and easier to dismiss issues from them too (not cool, but this is a human trait that all open source projects have to endure). With hooks, you'll get your metadata over STDIN as JSON like so: https://github.com/tus/tusd/blob/master/.hooks/post-finish You can use any language there to parse the JSON and move the file to a different location, preserving the original filename, without having to run a fork. For authentication etc. I'd probably run tusd on localhost and put HAProxy or some other kind of proxy in front of it. This also solves the problem of having to run tusd as root if you want it listening on a port <1024.
I'm afraid there is little chance of this happening, since a collision of filenames is in most cases so likely that it is almost a certainty. That means we'd have to support behavior that serves a very small use case, and people not aware of the issues around it might pick this more convenient option and then have files destroyed because of it.
It is compatible. For chunks I refer to our concat extension. Websockets aren't needed as we'll just open many more connections. We'll likely not support Gob as the protocol is intended to be spoken in an interoperable way across many platforms and languages. |
Sadly, @kvz, the answer is not that easy :) First of all, the tus protocol on its own absolutely supports HTTP/2, however for tusd, the story is a bit different. Go 1.6 introduced transparent and seamless support for HTTP/2 (see https://golang.org/pkg/net/http/):
The tusd binary (the one inside cmd/tusd/) currently has no functionality to use TLS and therefore does not support HTTP/2. The tusd package, however, can be mounted on either HTTP or HTTPS listeners and can therefore speak HTTP/2 when configured correctly. |
Ah I see, sorry for getting that part wrong, thanks for correcting! Sent from mobile, pardon the brevity.
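
To illustrate the point about HTTPS listeners, here is a small sketch using only Go's standard library. The thread does not show how the tusd handler is constructed, so a stand-in handler (`standIn`, a hypothetical name) takes its place; `newMux`, `negotiatedProto`, and the `/files/` mount path are likewise assumptions for illustration. The sketch starts a TLS test server with HTTP/2 enabled and reports which protocol the server saw.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// standIn replaces the real upload handler (which would come from the tusd
// package). It simply echoes the protocol the server saw for the request.
func standIn() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, r.Proto)
	})
}

// newMux mounts the upload handler under /files/ (an assumed path).
func newMux(upload http.Handler) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/files/", http.StripPrefix("/files/", upload))
	return mux
}

// negotiatedProto serves h over TLS with HTTP/2 enabled and returns the
// protocol the handler reported for one request.
func negotiatedProto(h http.Handler) (string, error) {
	ts := httptest.NewUnstartedServer(h)
	ts.EnableHTTP2 = true
	ts.StartTLS()
	defer ts.Close()

	resp, err := ts.Client().Get(ts.URL + "/files/")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return strings.TrimSpace(string(body)), err
}

func main() {
	proto, err := negotiatedProto(newMux(standIn()))
	fmt.Println(proto, err)
}
```

Outside of a test, the same mux could be passed to `http.ListenAndServeTLS(addr, certFile, keyFile, mux)`, which negotiates HTTP/2 with capable clients without extra configuration in Go 1.6 and later.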
|
@kvz How can I get the file extension? |
@ReverseFlash28 If the uploader supplies the filename in the metadata, you may be able to extract the extension from there, but that value requires strict validation and cannot be trusted in general. Therefore you may want to detect the file's type by looking for magic numbers (e.g. see the unix file(1) command) and then choose the corresponding extension based on the result. |
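
As a rough illustration of the magic-number approach, Go's standard library offers `http.DetectContentType`, which sniffs the leading bytes of a file (it considers at most the first 512). The mapping table and the function name `extensionFor` below are hypothetical and would need to be extended to the file types an application actually accepts.

```go
package main

import (
	"fmt"
	"net/http"
)

// extByType is a deliberately small, hand-picked table for this sketch.
var extByType = map[string]string{
	"image/png":       ".png",
	"image/jpeg":      ".jpg",
	"image/gif":       ".gif",
	"application/pdf": ".pdf",
}

// extensionFor sniffs the content type from the first bytes of a file and
// returns a matching extension, or "" when the type is not in the table.
func extensionFor(head []byte) string {
	return extByType[http.DetectContentType(head)]
}

func main() {
	pngHeader := []byte("\x89PNG\r\n\x1a\n")
	fmt.Println(extensionFor(pngHeader)) // prints ".png"
}
```

In practice the bytes would be read from the start of the uploaded .bin file, and the sniffed extension could be compared against the one in the client-supplied filename before trusting either.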
Hi @Acconut , could you provide example wrapper code on how to enable HTTP/2 on tusd over TLS? |
@heri16 What does your setup look like? Do you use the tusd package in a custom Go application or run the tusd binary behind a proxy (such as nginx or Apache)? |
Closing this issue due to a lack of information. Feel free to leave a comment if you want to continue the discussion :) |
Another issue (closed) says, regarding uploaded files being renamed to *.bin and *.info: "That's a good use case but a very specialized one. In general, I think, people will need more information than just the original file name and therefore have to read the additional data anyway. So I am sorry, but your suggestion won't make it into tusd."
I believe this is a major shortcoming for a "file uploader" - the uploaded file is essentially not there at all; this is a huge use case, not specialized in the least.
Is there a procedure for transferring "x.jpg" for example, to the server as "x.jpg"? Or, perhaps, a config hook to rename the file upon upload completion?