Skip to content
This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

Add a user-specifiable root to upload link request #577

Merged
merged 5 commits into from
Jun 22, 2023

Conversation

wild-endeavor
Copy link
Contributor

@wild-endeavor wild-endeavor commented Jun 15, 2023

TL;DR

This is useful for uploading folders.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

How did you fix the bug, make the feature etc. Link to any design docs etc

Tracking Issue

Remove the 'fixes' keyword if there will be multiple PRs to fix the linked issue

fixes https://github.com/flyteorg/flyte/issues/

Follow-up issue

NA
OR
https://github.com/flyteorg/flyte/issues/

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
@codecov
Copy link

codecov bot commented Jun 15, 2023

Codecov Report

Merging #577 (f181987) into master (949f5a2) will increase coverage by 1.47%.
The diff coverage is 0.00%.

❗ Current head f181987 differs from pull request most recent head c18a0ee. Consider uploading reports for the commit c18a0ee to get more accurate results

@@            Coverage Diff             @@
##           master     #577      +/-   ##
==========================================
+ Coverage   58.60%   60.08%   +1.47%     
==========================================
  Files         168      168              
  Lines       16256    13293    -2963     
==========================================
- Hits         9527     7987    -1540     
+ Misses       5887     4464    -1423     
  Partials      842      842              
Flag Coverage Δ
unittests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
dataproxy/service.go 59.83% <0.00%> (+1.72%) ⬆️

... and 154 files with indirect coverage changes

@wild-endeavor wild-endeavor changed the title add prefix Add a user-specifiable root to upload link request Jun 15, 2023
@wild-endeavor wild-endeavor marked this pull request as ready for review June 19, 2023 16:29
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
dataproxy/service.go Outdated Show resolved Hide resolved
var prefix string
if req.FilenameRoot != "" {
prefix = req.FilenameRoot
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One reason the previous path was "safe" without RBAC is that even if two people try to upload to the same path, unless they are uploading the same file, they won't end up with conflicting content in the same location...

I understand the motivation behind the change and maintaining a consistent structure...
Would it make sense to maintain the safety here?
Perhaps it's a client-side driven logic to include the SHA of the ordered SHAs of all files of a directory as part of the FilenameRoot?
so you end up with:
s3:////dir/name/path/filename.foo

Perhaps the SHAofSHAs logic is something like:
build a json of [MD5] = filename

"dir": {
  "filename": <md5>,
...
}

Then computing the SHA of the json string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so do this computation on the client side? or are you saying change the idl to take a json string, and admin hashes the json string?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

client side yes... I think it'll need a whole lot more changes to incorporate that behavior on the server side...
On top of my head, a new API that's built for uploading directories:

rpc UploadDirectory(stream UploadDirectoryRequest) UploadDirectoryResponse

message UploadDirectoryRequest {
    string directory_md5 = 1;
    string filename = 2;
    bytes file_contents = 3;
}

or something like the Multipart upload semantics

rpc StartUploadDirectory(StartUploadDirectoryRequest) StartUploadDirectoryResponse
rpc UploadFile(UploadFileRequest) UploadFileResponse
rpc TerminateUploadDirectory(TerminateUploadDirectoryRequest) TerminateUploadDirectoryResponse

And then the server can continue to generate/enforce SHAs of directories... and fail the call if SHA doesn't match or if the client fails to complete the upload within a specified time...

Co-authored-by: Haytham Abuelfutuh <haytham@afutuh.com>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
EngHabu
EngHabu previously approved these changes Jun 20, 2023
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
…nto consistent-uploads

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
@wild-endeavor wild-endeavor merged commit 9b22c0e into master Jun 22, 2023
12 of 14 checks passed
@wild-endeavor wild-endeavor deleted the consistent-uploads branch June 22, 2023 23:00
eapolinario pushed a commit that referenced this pull request Sep 6, 2023
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
3 participants