Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨♻️ (⚠️ devops) Remove 5gb limit when uploading data via nodeports #2993

Merged
merged 91 commits into from
May 5, 2022

Conversation

GitHK
Copy link
Contributor

@GitHK GitHK commented Apr 20, 2022

⚠️ devops

@mrnicegyu11 @Surfict
Environment variables changes:

  • renamed R_CLONE_S3_PROVIDER to R_CLONE_PROVIDER
  • Below env vars became mandatory for director-v2(make sure they exist on the deployment):
    • S3_ENDPOINT
    • S3_ACCESS_KEY
    • S3_SECRET_KEY
    • S3_BUCKET_NAME
    • S3_SECURE

What do these changes do?

To avoid hitting a wall when uploading more than 5GB via AWS S3 rclone is used to upload data to ports.

  • if nodeports_v2 is provided with an r_clone_settings instance it will use rclone to upload data to s3
  • the s3_link used by r_clone is generated by storage
  • when the upload is complete a handler to update the metadata in storage is called (using the same code as the previous method)
  • in case the rclone upload fails, metadata entries will be removed (trying to reduce trash)
  • dynamic-sidecar now has a mandatory r_clone_settings fields.

BONUS:

  • when dynamic-sidecar start outputs directories no longer get uploaded (disabled file system events while running docker-compose up)

Related issue/s

How to test

  • start locally and run a jupyter-math:2.0.5 service.
  • add files to one of the outputs
  • looking at the logs of the associated dynamic-sidecar you will see r_clone being used
  • also in the logs of the storage service calls to the metadata endpoint will be issued to: GET v0/locations/0/files/{s3_object}/s3/link and PATCH v0/locations/0/files/_S3_OBJECT_/metadata

Checklist

  • Openapi changes? make openapi-specs, git commit ... and then make version-*)
  • Unit tests for the changes exist
  • Runs in the swarm

@codecov
Copy link

codecov bot commented Apr 20, 2022

Codecov Report

Merging #2993 (32fda96) into master (2a3913b) will increase coverage by 0.0%.
The diff coverage is 77.3%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #2993    +/-   ##
=======================================
  Coverage    79.7%   79.7%            
=======================================
  Files         693     698     +5     
  Lines       29049   29215   +166     
  Branches     3744    3755    +11     
=======================================
+ Hits        23156   23294   +138     
- Misses       5058    5082    +24     
- Partials      835     839     +4     
Flag Coverage Δ
integrationtests 65.9% <89.2%> (+0.1%) ⬆️
unittests 75.4% <65.0%> (-0.1%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...library/src/servicelib/aiohttp/dev_error_logger.py 0.0% <0.0%> (ø)
.../simcore-sdk/src/simcore_sdk/node_ports_v2/port.py 80.6% <ø> (ø)
...car/src/simcore_service_dask_sidecar/file_utils.py 96.2% <ø> (ø)
...re_service_dynamic_sidecar/modules/data_manager.py 46.8% <ø> (ø)
...mcore_service_dynamic_sidecar/modules/nodeports.py 22.6% <ø> (ø)
...torage/src/simcore_service_storage/access_layer.py 72.7% <ø> (ø)
...storage/src/simcore_service_storage/rest_models.py 0.0% <ø> (ø)
...storage/src/simcore_service_storage/application.py 67.5% <33.3%> (-3.1%) ⬇️
...es/storage/src/simcore_service_storage/handlers.py 72.6% <33.3%> (+<0.1%) ⬆️
...tor_v2/modules/dynamic_sidecar/volumes_resolver.py 65.0% <40.0%> (+0.8%) ⬆️
... and 30 more

@GitHK GitHK changed the title ✨♻️ Remove 5gb limit when uploading data via nodeports ✨♻️ (⚠️ devops) Remove 5gb limit when uploading data via nodeports Apr 22, 2022
Copy link
Member

@sanderegg sanderegg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

storage does not use RClone right? why is it now installed in the CI? or I missed something?
Also I think you still have async-cache installed in the director-v2/

@@ -44,7 +44,7 @@ def _ensure_remove_bucket(client: Minio, bucket_name: str):
@pytest.fixture(scope="module")
def minio_config(
docker_stack: Dict, testing_environ_vars: Dict, monkeypatch_module: MonkeyPatch
) -> Dict[str, str]:
) -> Dict[str, Any]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very minor: python 3.9 allows to use dict instead of Dict

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice to know!
@colinRawlings you were wondering about None in typing. This is similar.

formatted_error = "".join(
[f"\n{_SEP}{k}{_SEP}\n{v}" for k, v in fields.items()]
)
logger.debug("Error serialized to client:%s", formatted_error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: as far as I see, this creates a text error. Could we not have a JSON formatted one like everywhere else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have it formatted in an easy readable way. This is intended to be used for development/debugging. I'm not sure that having it formatted as json will help us.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is just a matter of consistency actually. it will also prevent us from having to manage both text/json cases when parsing the responses

class RCloneSettings(BaseCustomSettings):
R_CLONE_S3: S3Settings = Field(auto_default_from_env=True)
R_CLONE_PROVIDER: S3Provider
R_CLONE_REGION: str = Field("us-east-1", description="S3 region to use")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while I agree it's currently only used by RClone, it is still a S3 setting. and it will be used in AWS deployments cause currently we probably use some default. Therefore I would put it there. Can you please check how storage connects to AWS S3 then?

services/director-v2/requirements/_base.txt Outdated Show resolved Hide resolved
.github/workflows/ci-testing-deploy.yml Outdated Show resolved Hide resolved
@GitHK GitHK requested a review from sanderegg May 3, 2022 08:33
Copy link
Member

@sanderegg sanderegg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! thanks and kudos!

ci/github/helpers/install_rclone.bash Outdated Show resolved Hide resolved
ci/github/helpers/install_rclone_docker_volume_plugin.bash Outdated Show resolved Hide resolved
formatted_error = "".join(
[f"\n{_SEP}{k}{_SEP}\n{v}" for k, v in fields.items()]
)
logger.debug("Error serialized to client:%s", formatted_error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is just a matter of consistency actually. it will also prevent us from having to manage both text/json cases when parsing the responses

class RCloneSettings(BaseCustomSettings):
R_CLONE_S3: S3Settings = Field(auto_default_from_env=True)
R_CLONE_PROVIDER: S3Provider
R_CLONE_REGION: str = Field("us-east-1", description="S3 region to use")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whether the S3_REGION is used or not is up to the client code. so I do not think this should create any problem.

S3_SECURE: bool = False

@cached_property
def endpoint(self) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pair review comment: check if still required

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved this to a validator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spent too much time on this. Requires much more time to refactor and make working with validator. Will leave as is.

Copy link
Member

@mrnicegyu11 mrnicegyu11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only checked devops-changes, not code. Devops changes are fine and acknowledged.

@sonarcloud
Copy link

sonarcloud bot commented May 5, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

No Coverage information No Coverage information
0.5% 0.5% Duplication

@GitHK GitHK merged commit 46d91c6 into ITISFoundation:master May 5, 2022
@GitHK GitHK deleted the remove-5gb-limit-nodeports branch May 5, 2022 08:21
mrnicegyu11 pushed a commit to mrnicegyu11/osparc-ops-environments that referenced this pull request Mar 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants