Skip to content

Deduplication test scripts for Docker image layers

Notifications You must be signed in to change notification settings

flx42/layer-dedup-test

Repository files navigation

layer-dedup-test

This repository contains a few simple scripts to test different deduplication strategies on docker layers. These scripts are meant to be executed on a AWS i3.16xlarge instance with 8 NVMe SSDs and running Ubuntu 18.04.

  • Verify if the machine is an AWS instance of type i3.16xlarge (I don't want to overwrite your NVMe disks).
  • Setup a RAID 0 array with the 8 NVMe SSDs using mdadm.
  • Create a ext4 filesystem on the array (btrfs had performance problems and xfs crashed).
  • Mount the created filesystem on /mnt/docker.
  • Install Docker CE and configure the daemon with --storage-driver=overlay2 --data-root=/mnt/docker.
  • Assemble a large list of image tags in tags.list by querying each DockerHub repository listed in repos.list.
  • Download all image tags in parallel using GNU Parallel.
  • This script might be heavy on the Docker Hub infrastructure, be nice.
+ du -sh /mnt/docker/overlay2
822G    /mnt/docker/overlay2
  • File-level deduplication test.
  • Uses rmlint with hardlinks to eliminate duplicate files.
+ rmlint --threads=64 -v -T df --config=sh:handler=hardlink /mnt/docker/overlay2
+ ./rmlint.sh -d
+ sync
+ du -sh /mnt/docker/overlay2
301G	/mnt/docker/overlay2
+ restic backup /mnt/docker/overlay2
scan [/mnt/docker/overlay2]
[4:47] 3254412 directories, 26534922 files, 828.901 GiB
scanned 3254412 directories, 26534922 files in 4:47
[37:53] 100.00%  828.901 GiB / 828.901 GiB  29789334 / 29789334 items  0 errors  ETA 0:00 

duration: 37:53
snapshot 4c536e09 saved
+ du -sh /tmp/restic
244G    /tmp/restic

About

Deduplication test scripts for Docker image layers

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages