Enhancement/disk utilization #642

Merged
merged 81 commits into from May 13, 2022

Contributor

@lpoli lpoli commented Apr 20, 2022

Changes

This PR reorganizes how the blobber stores files so that millions of files no longer crowd into the same directory.
Capping each directory at a fixed number of files is not feasible, because an allocation can change its size.

As in the previous method, I rely on the random distribution of the 256-bit allocation hash and content hash. The allocation hash (ID) is used to create the parent directories for all files belonging to that allocation, and the content hash is used to create the directories and the file name for an individual file.

The previous method overused directories, and the number of directories an allocation could have was hard-coded.
It used the first 9 characters of content_hash to create the directory for a file.
So there could be up to 16^9 = 68,719,476,736 (about 68.7 billion) directories.
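
For a quick check of that figure, here is a minimal sketch (not code from this PR) that counts how many directory names a hex prefix of a given length can produce. The function name `dirCount` is made up for illustration.

```go
package main

import "fmt"

// dirCount returns how many distinct directory names a hex prefix of the
// given length can produce, i.e. 16^prefixLen. (Hypothetical helper.)
func dirCount(prefixLen int) uint64 {
	n := uint64(1)
	for i := 0; i < prefixLen; i++ {
		n *= 16
	}
	return n
}

func main() {
	// Compare a few shorter prefixes with the 9-character prefix used before.
	for _, l := range []int{2, 3, 9} {
		fmt.Printf("prefix length %d -> %d possible directories\n", l, dirCount(l))
	}
}
```

Shorter prefixes shrink the directory space quickly, which is why making the number adjustable matters.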

Now the blobber can change this number according to its capacity.
Also, the previous method put all allocations in the same directory, i.e. /data/blobber/files. A single directory could therefore end up holding millions of allocations, which makes lookup inefficient. This has also been fixed.
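
The general idea can be sketched as follows. This is only an illustration under assumptions, not the layout implemented in this PR: the function names, the level sizes, and the example values are all hypothetical. Leading characters of each hash become nested directory names, and the remainder becomes the last path element.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// splitByLevels cuts leading characters of hash into directory names whose
// lengths are given by levels, and returns those names plus the remainder.
func splitByLevels(hash string, levels []int) ([]string, string, error) {
	parts := make([]string, 0, len(levels))
	rest := hash
	for _, n := range levels {
		// Require a non-empty remainder after every cut.
		if n <= 0 || len(rest) <= n {
			return nil, "", errors.New("hash too short for the requested levels")
		}
		parts = append(parts, rest[:n])
		rest = rest[n:]
	}
	return parts, rest, nil
}

// filePath builds
// <mount>/<alloc dirs...>/<alloc rest>/<content dirs...>/<content rest>.
// The level slices are the adjustable part (hypothetical parameters).
func filePath(mount, allocID, contentHash string, allocLevels, contentLevels []int) (string, error) {
	aDirs, aRest, err := splitByLevels(allocID, allocLevels)
	if err != nil {
		return "", err
	}
	cDirs, cRest, err := splitByLevels(contentHash, contentLevels)
	if err != nil {
		return "", err
	}
	elems := []string{mount}
	elems = append(elems, aDirs...)
	elems = append(elems, aRest)
	elems = append(elems, cDirs...)
	elems = append(elems, cRest)
	return filepath.Join(elems...), nil
}

func main() {
	// Short stand-ins for real hex hashes, chosen only for readability.
	p, err := filePath("/mnt/blobber", "a1b2c3d4", "9f8e7d6c", []int{2}, []int{1, 2})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p)
}
```

Because the hashes are uniformly distributed, files spread evenly across the directories at each level, and changing the level sizes changes how many directories can exist.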

Fixes

  1. Overcrowding of millions of files in the same directory.
  2. Use of a lock to delete files, since two files can have the same content hash.
  3. Data inconsistency caused by moving files to MinIO. When a file is moved, one ref gets its on_cloud field updated. If other refs also referred to the same content hash, there was no way to get the data for those refs.
  4. Partial separation of integration tests and unit tests. Later on, they need to be made completely independent of each other.
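
Items 2 and 3 both come from several refs being able to share one content hash. As a rough illustration of that hazard (a sketch with hypothetical names, not the approach taken in this PR), the bookkeeping below counts refs per content hash and only reports a file as safe to delete once the last ref is gone.

```go
package main

import (
	"fmt"
	"sync"
)

// refCounter tracks how many refs point at each content hash.
// (Hypothetical type, used only to illustrate the shared-hash problem.)
type refCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newRefCounter() *refCounter {
	return &refCounter{counts: make(map[string]int)}
}

// Add records one more ref for contentHash.
func (r *refCounter) Add(contentHash string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.counts[contentHash]++
}

// Release drops one ref and reports whether the stored file may now be
// deleted, i.e. whether no other ref still points at the same content hash.
func (r *refCounter) Release(contentHash string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	n, ok := r.counts[contentHash]
	if !ok {
		return false // unknown hash: nothing to delete
	}
	if n <= 1 {
		delete(r.counts, contentHash)
		return true
	}
	r.counts[contentHash] = n - 1
	return false
}

func main() {
	rc := newRefCounter()
	rc.Add("hash-1") // first ref
	rc.Add("hash-1") // second ref with the same content
	fmt.Println("delete after first release:", rc.Release("hash-1"))
	fmt.Println("delete after second release:", rc.Release("hash-1"))
}
```

The same reasoning applies to a per-ref flag such as on_cloud: a change made through one ref affects every other ref that shares the content hash.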

@lpoli lpoli requested a review from peterlimg April 20, 2022 01:40
.github/workflows/tests.yml (outdated, resolved)
Makefile Outdated
@@ -56,7 +56,7 @@ local-run:
 	--hostname 127.0.0.1 \
 	--deployment_mode 0 \
 	--keys_file ../docker.local/keys_config/b0bnode1_keys.txt \
-	--files_dir ./data/blobber/files \
+	--mount_point ./data/blobber/files \
Contributor

Please keep it as --files_dir; otherwise it will break all Helm charts.

Contributor

@cnlangzi cnlangzi Apr 21, 2022

Or please update it in the Dockerfile, CLI, and Helm charts too.

Contributor Author

I think it's better to update it in the CLI and Helm charts. File storage must be a mount point, so the flag name should reflect that as well.
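
One way to reconcile both concerns during a transition, sketched here as an assumption and not as what was merged: register both flag names on the same variable with Go's standard `flag` package, so existing charts that pass --files_dir keep working while --mount_point becomes the preferred name. The function name and flag set name below are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
)

// parseStorageDir accepts both the old and the new flag name and stores
// whichever is given in the same variable. If both are passed, the one that
// appears later on the command line wins. (Hypothetical helper.)
func parseStorageDir(args []string) (string, error) {
	fs := flag.NewFlagSet("blobber", flag.ContinueOnError)
	var dir string
	fs.StringVar(&dir, "mount_point", "", "mount point used for file storage")
	fs.StringVar(&dir, "files_dir", "", "older name for --mount_point")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return dir, nil
}

func main() {
	dir, err := parseStorageDir([]string{"--files_dir", "./data/blobber/files"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("storage directory:", dir)
}
```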


codecov-commenter commented Apr 24, 2022

Codecov Report

Merging #642 (eabffdf) into staging (c666d22) will increase coverage by 2.28%.
The diff coverage is 31.81%.

@@             Coverage Diff             @@
##           staging     #642      +/-   ##
===========================================
+ Coverage    19.70%   21.99%   +2.28%     
===========================================
  Files           66       69       +3     
  Lines         7530     7844     +314     
===========================================
+ Hits          1484     1725     +241     
- Misses        5819     5857      +38     
- Partials       227      262      +35     
| Flag | Coverage Δ |
|---|---|
| Unit-Tests | 21.99% <31.81%> (+2.28%) ⬆️ |

| Impacted Files | Coverage Δ |
|---|---|
| ...0chain.net/blobbercore/allocation/newfilechange.go | 0.00% <0.00%> (ø) |
| ...e/go/0chain.net/blobbercore/allocation/protocol.go | 0.00% <0.00%> (ø) |
| code/go/0chain.net/blobbercore/filestore/minio.go | 0.00% <0.00%> (ø) |
| .../go/0chain.net/blobbercore/filestore/mock_store.go | 0.00% <0.00%> (ø) |
| code/go/0chain.net/blobbercore/filestore/store.go | 0.00% <0.00%> (ø) |
| ...net/blobbercore/handler/download_request_header.go | 54.28% <0.00%> (ø) |
| ...ain.net/blobbercore/handler/file_command_update.go | 0.00% <0.00%> (ø) |
| code/go/0chain.net/blobbercore/handler/handler.go | 57.48% <ø> (ø) |
| ...et/blobbercore/handler/object_operation_handler.go | 32.42% <0.00%> (-0.12%) ⬇️ |
| code/go/0chain.net/blobbercore/handler/protocol.go | 0.00% <0.00%> (ø) |
| ... and 11 more | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@lpoli lpoli requested a review from peterlimg May 10, 2022 14:53
code/go/0chain.net/blobbercore/filestore/minio.go (outdated, resolved)
code/go/0chain.net/blobbercore/filestore/minio.go (outdated, resolved)
code/go/0chain.net/blobbercore/filestore/storage.go (outdated, resolved)
code/go/0chain.net/blobbercore/handler/tests_common.go (outdated, resolved)
code/go/0chain.net/blobbercore/handler/worker.go (outdated, resolved)
code/go/0chain.net/blobbercore/filestore/storage.go (outdated, resolved)
code/go/0chain.net/blobbercore/filestore/storage.go (outdated, resolved)
@lpoli lpoli requested a review from peterlimg May 11, 2022 11:54
@lpoli lpoli self-assigned this May 11, 2022
@lpoli lpoli added the mainnet label May 11, 2022
Member

@peterlimg peterlimg left a comment

lgtm

@lpoli lpoli dismissed cnlangzi’s stale review May 13, 2022 13:03

Requested changes have been resolved.

@lpoli lpoli merged commit cb46c97 into staging May 13, 2022
@lpoli lpoli deleted the enhancement/disk-utilization branch May 13, 2022 13:04
@cnlangzi cnlangzi mentioned this pull request May 20, 2022
3 tasks
Development

Successfully merging this pull request may close these issues.

Track mount/unmount of disk on fly
5 participants