[BUG] log_artifacts from an s3 location #7547

Closed
2 of 22 tasks
svijayaku1-chwy opened this issue Dec 16, 2022 · 2 comments
Labels
area/tracking Tracking service, tracking client APIs, autologging bug Something isn't working

Comments

@svijayaku1-chwy

svijayaku1-chwy commented Dec 16, 2022

Issues Policy acknowledgement

  • I have read and agree to submit bug reports in accordance with the issues policy

Willingness to contribute

Yes. I would be willing to contribute a fix for this bug with guidance from the MLflow community.

MLflow version

  • Client: 1.27.0

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Mac OS
  • Python version: 3.10.8

Describe the problem

MLflow does not allow logging artifacts that are stored in an S3 bucket. I can log artifacts TO an S3 tracking URI, but not FROM an S3 URI.
The official documentation says "Log a local file or directory as an artifact" and does not mention S3 locations. Is this currently possible?

Tracking information

No response

Code to reproduce issue

import mlflow

input_uri = "s3-path-of-artifacts"

def logging(run_id):
    print("Logging artifacts for run:", run_id)
    # Fails: log_artifact treats its argument as a local path, not an S3 URI
    mlflow.log_artifact(input_uri)

def main():
    with mlflow.start_run() as run:
        run_id = run.info.run_id
        print("Current tracking uri: {}".format(mlflow.get_tracking_uri()))
        logging(run_id)

if __name__ == "__main__":
    main()

Stack trace

Traceback (most recent call last):
  File "/Users//Desktop/mlflow-test/mlflow-test.py", line 61, in <module>
    main()
  File "/Users//Desktop/mlflow-test/mlflow-test.py", line 56, in main
    logging(run_id)
  File "/Users//Desktop/mlflow-test/mlflow-test.py", line 44, in logging
    mlflow.log_artifact(input_uri+"model/")
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/mlflow/tracking/fluent.py", line 752, in log_artifact
    MlflowClient().log_artifact(run_id, local_path, artifact_path)
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/mlflow/tracking/client.py", line 955, in log_artifact
    self._tracking_client.log_artifact(run_id, local_path, artifact_path)
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/mlflow/tracking/_tracking_service/client.py", line 365, in log_artifact
    artifact_repo.log_artifact(local_path, artifact_path)
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/mlflow/store/artifact/s3_artifact_repo.py", line 124, in log_artifact
    self._upload_file(
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/mlflow/store/artifact/s3_artifact_repo.py", line 117, in _upload_file
    s3_client.upload_file(Filename=local_file, Bucket=bucket, Key=key, ExtraArgs=extra_args)
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/boto3/s3/inject.py", line 143, in upload_file
    return transfer.upload_file(
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/boto3/s3/transfer.py", line 288, in upload_file
    future.result()
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/s3transfer/futures.py", line 103, in result
    return self._coordinator.result()
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/s3transfer/futures.py", line 266, in result
    raise self._exception
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/s3transfer/tasks.py", line 269, in _main
    self._submit(transfer_future=transfer_future, **kwargs)
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/s3transfer/upload.py", line 585, in _submit
    upload_input_manager.provide_transfer_size(transfer_future)
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/s3transfer/upload.py", line 244, in provide_transfer_size
    self._osutil.get_file_size(transfer_future.meta.call_args.fileobj)
  File "/Users//Desktop/mlflow-test/venv/lib/python3.10/site-packages/s3transfer/utils.py", line 247, in get_file_size
    return os.path.getsize(filename)
  File "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/genericpath.py", line 50, in getsize
    return os.stat(filename).st_size
FileNotFoundError: [Errno 2] No such file or directory: 's3://bucket-name/user-name/mfllow-inputs/file-name'

Other info / logs

No response

What component(s) does this bug affect?

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

What language(s) does this bug affect?

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

svijayaku1-chwy added the bug label Dec 16, 2022
github-actions bot added the area/tracking label Dec 16, 2022

@mlflow-automation
Collaborator

@BenWilson2 @dbczumar @harupy @WeichenXu123 Please assign a maintainer and start triaging this issue.

@dbczumar
Collaborator

dbczumar commented Jan 6, 2023

Hi @idomic, thank you for raising this issue. MLflow does not currently support this capability through the log_artifact() API. However, you can call mlflow.artifacts.download_artifacts() followed by mlflow.log_artifact() to achieve the desired result. Thank you for using MLflow!
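
For anyone landing here, the two-step workaround described above might look roughly like the sketch below. It assumes an MLflow version that provides mlflow.artifacts.download_artifacts(), an active run, and working S3 credentials. The helper name relog_remote_artifact and its download / log_file / log_dir parameters are hypothetical and not part of MLflow; they exist only so the helper can be exercised without a real bucket.

```python
import os
import tempfile


def relog_remote_artifact(source_uri, artifact_path=None,
                          download=None, log_file=None, log_dir=None):
    """Copy a remote artifact into the active run via a temporary local download.

    `download`, `log_file`, and `log_dir` are hypothetical injection points.
    Left as None, they default to mlflow.artifacts.download_artifacts,
    mlflow.log_artifact, and mlflow.log_artifacts respectively.
    """
    if download is None or log_file is None or log_dir is None:
        import mlflow.artifacts  # also binds the top-level `mlflow` name

        download = download or mlflow.artifacts.download_artifacts
        log_file = log_file or mlflow.log_artifact
        log_dir = log_dir or mlflow.log_artifacts

    with tempfile.TemporaryDirectory() as tmp_dir:
        # Step 1: fetch the remote file or directory to local disk.
        local_path = download(artifact_uri=source_uri, dst_path=tmp_dir)
        # Step 2: log the local copy; a directory goes through log_artifacts.
        if os.path.isdir(local_path):
            log_dir(local_path, artifact_path=artifact_path)
            return "directory"
        log_file(local_path, artifact_path=artifact_path)
        return "file"
```

Inside a `with mlflow.start_run():` block, a call such as relog_remote_artifact("s3://bucket-name/user-name/mfllow-inputs/file-name") would download the object to a temporary directory and then log it to the run. The temporary copy is deleted once logging returns, so nothing accumulates on local disk.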

@dbczumar dbczumar closed this as completed Jan 6, 2023