Add datasets namespace to MinIO #4311

@aicam

Description

Task Summary

Background & Goal
We are introducing a namespace concept for MinIO and LakeFS. The primary goal is to cleanly separate our storage paths for different types of assets. By implementing namespaces now, we can isolate our current datasets and pave the way for adding models (and other assets) in the future.

Current State
Currently, the namespace is defined at the root bucket level without any specific asset prefix:

storageNamespaceURI = s"${StorageConfig.lakefsBlockStorageType}://${StorageConfig.lakefsBucketName}"

Proposed Change
We need to append a datasets path segment to the end of the existing storage namespace URI. The resulting namespace will be used exclusively for dataset storage.

  • Target URI Structure: [block-storage-type]://[bucket-name]/datasets
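
As a rough illustration of the proposed change, the sketch below builds the namespace URI from the same two config values plus an asset prefix. The names StorageConfigSketch, StorageNamespaces, DatasetsPrefix, and assetNamespaceURI are hypothetical and are not part of the existing codebase; the config values are placeholders. Taking the asset type as a parameter is one possible way to leave room for models and other assets later.

```scala
// Hypothetical stand-in for the project's StorageConfig; values are placeholders.
object StorageConfigSketch {
  val lakefsBlockStorageType: String = "s3"
  val lakefsBucketName: String = "example-bucket"
}

object StorageNamespaces {
  // Asset prefix proposed in this issue.
  val DatasetsPrefix: String = "datasets"

  // Bucket root, equivalent to the current storageNamespaceURI definition.
  def rootURI(blockStorageType: String, bucketName: String): String =
    s"$blockStorageType://$bucketName"

  // Namespace for one asset type: [block-storage-type]://[bucket-name]/[asset]
  def assetNamespaceURI(blockStorageType: String, bucketName: String, asset: String): String =
    s"${rootURI(blockStorageType, bucketName)}/$asset"

  // Namespace that would be used exclusively for dataset storage.
  val datasetsNamespaceURI: String = assetNamespaceURI(
    StorageConfigSketch.lakefsBlockStorageType,
    StorageConfigSketch.lakefsBucketName,
    DatasetsPrefix
  )
}
```

With the placeholder values above, datasetsNamespaceURI follows the target structure, and a future asset type would only need another prefix constant.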

Impact & Migration (Breaking Change)

  • ⚠️ Compatibility: Once this change is merged, all existing datasets pointing to the old root namespace will become incompatible.
  • Resolution: A data migration will be required to move existing datasets from the old namespace to the new /datasets namespace.

Out of Scope

  • The development and execution of the migration script will not be part of the Pull Request for this specific issue. It will be handled as a separate task.

Priority

P2 – Medium

Task Type

  • Code Implementation
  • Documentation
  • Refactor / Cleanup
  • Testing / QA
  • DevOps / Deployment
