✨ Use S3 links in internal computational backend cluster, prepare for temporary tokens (⚠️ devops) #3006
Conversation
Codecov Report

```diff
@@           Coverage Diff            @@
##           master    #3006    +/-  ##
========================================
  Coverage    79.7%    79.7%
========================================
  Files         688      690      +2
  Lines       28748    28872    +124
  Branches     3707     3719     +12
========================================
+ Hits        22917    23038    +121
+ Misses       5015     5010      -5
- Partials      816      824      +8
```

Flags with carried forward coverage won't be shown.
🎉 Nice! Just minor comments
packages/dask-task-models-library/src/dask_task_models_library/container_tasks/io.py
Very nice. Please see my comments regarding the settings.
```python
mocked_log_publishing_cb.assert_called()
mocked_log_publishing_cb.reset_mock()

# USE-CASE 3: if destination is a zip, but we pass a target mime type that is not, then we decompress
```
Is this description correct?
I would expect not to decompress.
It is correct: it uses the target mime type, which is what the service expects. Example: if my service has an input port defined as

type: data:*/*

then it means it wants the data unzipped. If the service port is defined as

type: data:application/zip

then it means it wants a zip file.
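For illustration, a minimal sketch of that rule (function and parameter names are hypothetical, not the actual sidecar implementation):

```python
import zipfile
from pathlib import Path


def deliver_downloaded_file(archive: Path, target_mime_type: str, dst_dir: Path) -> None:
    """Hypothetical sketch: decompress when the input port expects raw data
    (e.g. type 'data:*/*'); keep the archive when it explicitly asks for
    'data:application/zip'."""
    if archive.suffix == ".zip" and target_mime_type != "application/zip":
        # the port wants the content, not the archive -> decompress
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dst_dir)
    else:
        # the port expects the file as-is (possibly the zip itself)
        archive.rename(dst_dir / archive.name)
```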
```python
s3_settings: S3Settings = await sts.get_or_create_temporary_token_for_user(
    request.app, user_id
)
return {"data": s3_settings.dict()}
```
I see the default values. Regarding my PR #2993, I would still not add this. Will enable it once we decide on how to use STS.
Yes, do not add it. My gut feeling is that the STS stuff is probably not the way to go.
Kudos, SonarCloud Quality Gate passed! 0 Bugs. No Coverage information.
👍
```python
@cached_property
def endpoint(self) -> str:
    return AnyHttpUrl.build(
        scheme="http",
        host=self.STORAGE_HOST,
        port=f"{self.STORAGE_PORT}",
        path=f"/{self.STORAGE_VTAG}",
    )

@cached_property
def storage_endpoint(self) -> str:
    """used to re-create STORAGE_ENDPOINT: used by node_ports and must be
    in style host:port without scheme or version tag"""
    return f"{self.STORAGE_HOST}:{self.STORAGE_PORT}"
```
You might already be aware of this one, just wanted to point it out.
osparc-simcore/packages/settings-library/src/settings_library/utils_service.py, lines 23 to 38 in 122703f:
```python
class MixinServiceSettings:
    """Mixin with common helpers based on validated fields with canonical name

    Example:
        - Subclass should define host, port and vtag fields as

        class MyServiceSettings(BaseCustomSettings, MixinServiceSettings):
            {prefix}_HOST: str
            {prefix}_PORT: PortInt
            {prefix}_VTAG: VersionTag [Optional]

            # optional
            {prefix}_SCHEME: str (urls default to http)
            {prefix}_USER: str
            {prefix}_PASSWORD: SecretStr
    """
```
Nope, but good to know. Anyway, this one is a different case, and I expect it to disappear soonish.
What do these changes do?
Until this PR, data upload/download through the computational services was done via presigned links. This PR switches the internal computational cluster to direct S3 links.
New env variables (⚠️ devops):

- COMPUTATIONAL_BACKEND_DEFAULT_CLUSTER_FILE_LINK_TYPE: defines the file link type to use with the internal cluster (defaults to s3, available=[s3, presigned])
- COMPUTATIONAL_BACKEND_DEFAULT_FILE_LINK_TYPE: defines the file link type for the other clusters (defaults to presigned)

When using S3 links, direct access to the underlying S3 backend is necessary (presigned links, by contrast, embed this access in their encoding). Therefore director-v2 now also transmits the S3 access credentials to the dask-sidecar when the s3 file link type is enabled.
In this PR the default credentials are passed. In a next iteration, temporary S3 access shall be computed using the AWS STS interface, which should allow the usage of external clusters at the price of some security. Nevertheless, this solution cannot scale very far, as STS caps at about 1000 of these temporary credentials; see the sketch after this paragraph.
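A hedged sketch of how such temporary credentials could be issued with boto3's STS client (bucket name, user name, policy and duration are illustrative; the PR itself does not ship this yet):

```python
import json

import boto3

sts_client = boto3.client("sts")

# Illustrative only: scope a federated token to one bucket for one hour
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::my-bucket/*",  # assumed bucket
        }
    ],
}

response = sts_client.get_federation_token(
    Name="osparc-user-42",  # hypothetical federated user name
    Policy=json.dumps(policy),
    DurationSeconds=3600,
)

# Temporary credentials that could be handed to the dask-sidecar
# instead of the default ones:
credentials = response["Credentials"]
# -> AccessKeyId, SecretAccessKey, SessionToken, Expiration
```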
storage new API:

- POST v0/simcore-s3:access: returns the S3 credentials (currently returns the default ones); a client-side usage sketch follows below

NOTE: maybe another solution would be to use the multipart upload options instead, at least for the frontend, and probably also for any external dask cluster.
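A minimal sketch of a client call to the new endpoint (the base URL and the user_id query parameter are assumptions; the response shape matches the handler snippet above, which returns {"data": s3_settings.dict()}):

```python
import httpx


async def fetch_s3_access(storage_base_url: str, user_id: int) -> dict:
    """Hypothetical client for POST v0/simcore-s3:access."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            f"{storage_base_url}/v0/simcore-s3:access",
            params={"user_id": user_id},  # query parameter is an assumption
        )
        resp.raise_for_status()
        return resp.json()["data"]  # the S3Settings fields
```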
Bonus:

- removed all the pytest-docker infrastructure in storage and use pytest-simcore for all unit tests (moved to ♻️ Maintenance/remove pytest docker storage #3011)

Related issue/s
How to test

- make build up-prod or make build-devel up-devel
- create a pipeline with inputs, run the pipeline
Checklist