-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove the layer's link by garbage-collect #2288
Remove the layer's link by garbage-collect #2288
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2288 +/- ##
==========================================
- Coverage 60.25% 50.58% -9.67%
==========================================
Files 125 125
Lines 14353 14389 +36
==========================================
- Hits 8648 7279 -1369
- Misses 4821 6361 +1540
+ Partials 884 749 -135
Continue to review full report at Codecov.
|
manifests.go
Outdated
@@ -67,6 +67,7 @@ type ManifestService interface { | |||
type ManifestEnumerator interface { | |||
// Enumerate calls ingester for each manifest. | |||
Enumerate(ctx context.Context, ingester func(digest.Digest) error) error | |||
LayersEnumerate(ctx context.Context, repoName string, ingester func(digest.Digest, string) error) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't just add this method to an interface called ManifestEnumerator
.
registry/storage/garbagecollect.go
Outdated
@@ -65,6 +66,11 @@ func MarkAndSweep(ctx context.Context, storageDriver driver.StorageDriver, regis | |||
return nil | |||
}) | |||
|
|||
err = manifestEnumerator.LayersEnumerate(ctx, repoName, func(dgst digest.Digest, linkpath string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All you need to do here is find the layers that aren't referenced in the repository and remove. It should look similar to the global blob collection, but can be local to a repository.
In the course of reviewing #2302, it looks like this needs to do a collection round inside each repo, then the global blobs directory. This should just belong in the |
I am thinking that is it so simple that if blob digest called X is deleted, then just scan layer folder is there file called X and if yes delete it? Not sure is that perfect way, but at least what I checked my layers, those are named similar way than some blobs. some test data:
|
That would work in the wrong direction. You really need to scan the blobs that are referenced via manifests and then delete the unreferenced ones in |
but that is how currently garbage collector is working right? It is first scanning all manifest blobs, and if there is blobs which are not referenced in manifests those are deleted? |
@zetaab Yes, but there is a set of layer links in each repository that isn't getting cleaned up by the current garbage collection (unless I missed it). These links provide access control for the blob. If the link is present, the repository has access to that blob. These link out to the main blobs store, which is shared by everyone. For this to work correctly, you must ensure that no reachable manifests reference these linked blobs before removing them. |
Sorry for my late reply.
|
@m-masataka Unfortunately, exposing that on the manifest interface doesn't make a whole of sense. It exposes internal details of implementation to the external model. This should be done internally, inside of the garbage collection function. There may already be a function that lists the blobs that are already part of a repository. |
registry/storage/linkedblobstore.go
Outdated
@@ -231,7 +231,47 @@ func (lbs *linkedBlobStore) Delete(ctx context.Context, dgst digest.Digest) erro | |||
return nil | |||
} | |||
|
|||
func (lbs *linkedBlobStore) Enumerate(ctx context.Context, ingestor func(digest.Digest) error) error { | |||
// LayersEnumerate find layer's digest in the Repository. | |||
func LayersEnumerate(ctx context.Context, storageDriver driver.StorageDriver, repoName string, ingester func(digest.Digest, string) error) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you add linkDirectoryPathSpec to linkedBlobStore in repository.Blobs, you will be able to use linkedBlobStore.Enumerate instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmage Thank you for your review.
I deleted LayerEnumerate(), and use linkedBlogStore.Enumerate.
Please sign your commits following these rules: $ git clone -b "garbagecollection-remove-layers" git@github.com:m-masataka/distribution.git somewhere
$ cd somewhere
$ git rebase -i HEAD~842354643304
editor opens
change each 'pick' to 'edit'
save the file and quit
$ git commit --amend -s --no-edit
$ git rebase --continue # and repeat the amend for each commit
$ git push -f Amending updates the existing PR. You DO NOT need to open a new one. |
00536ef
to
a5b3b4c
Compare
@stevvooe |
@m-masataka There are several commits that are not signed. I'd recommend sqaushing them all into a single commit. |
9b4e921
to
3128dd3
Compare
@stevvooe |
@m-masataka No, please make this PR into a single commit. |
3128dd3
to
815a0bc
Compare
Signed-off-by: Masataka Mizukoshi <m.mizukoshi.wakuwaku@gmail.com> Remove ther layer's link by garbage-collect Signed-off-by: Masataka Mizukoshi <m.mizukoshi.wakuwaku@gmail.com> fix typo Signed-off-by: Masataka Mizukoshi <m.mizukoshi.wakuwaku@gmail.com> modify code format Signed-off-by: Masataka Mizukoshi <m.mizukoshi.wakuwaku@gmail.com> Delete LayerEnumerator from ManifestEnumerator interface. Signed-off-by: Masataka Mizukoshi <m.mizukoshi.wakuwaku@gmail.com> add comment. Signed-off-by: Masataka Mizukoshi <m.mizukoshi.wakuwaku@gmail.com> just find the layers that aren't referenced and remove. Signed-off-by: Masataka Mizukoshi <m.mizukoshi.wakuwaku@gmail.com> Delete LayerEnumrate func Signed-off-by: Masataka Mizukoshi <m.mizukoshi.wakuwaku@gmail.com> Fix signalling Wait in regulator.enter In some conditions, regulator.exit may not send a signal to blocked regulator.enter. Let's assume we are in the critical section of regulator.exit and r.available is equal to 0. And there are three more gorotines. One goroutine also executes regulator.exit and waits for the lock. Rest run regulator.enter and wait for the signal. We send the signal, and after releasing the lock, there will be lock contention: 1. Wait from regulator.enter 2. Lock from regulator.exit If the winner is Lock from regulator.exit, we will not send another signal to unlock the second Wait. Signed-off-by: Oleg Bulatov <obulatov@redhat.com> fix error Signed-off-by: Masataka Mizukoshi <m.mizukoshi.wakuwaku@gmail.com> check sign Signed-off-by: Masataka Mizukoshi <m.mizukoshi.wakuwaku@gmail.com>
815a0bc
to
9016620
Compare
Can this PR go forward now? |
@@ -38,11 +38,7 @@ func (r *regulator) enter() { | |||
|
|||
func (r *regulator) exit() { | |||
r.L.Lock() | |||
// We only need to signal to a waiting FS operation if we're already at the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure this can be removed?
"time" | ||
) | ||
|
||
func TestRegulatorEnterExit(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unrelated. What is this testing?
@@ -367,6 +369,12 @@ type layerLinkPathSpec struct { | |||
|
|||
func (layerLinkPathSpec) pathSpec() {} | |||
|
|||
type layerDirectoryPathSpec struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document this type. Notice the other comments around here and please try to follow that style.
@@ -289,6 +289,8 @@ func (repo *repository) Blobs(ctx context.Context) distribution.BlobStore { | |||
statter = repo.registry.blobDescriptorServiceFactory.BlobAccessController(statter) | |||
} | |||
|
|||
LayerDirectoryPathSpec := layerDirectoryPathSpec{name: repo.name.Name()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please choose a better variable name.
@@ -300,6 +302,7 @@ func (repo *repository) Blobs(ctx context.Context) distribution.BlobStore { | |||
// TODO(stevvooe): linkPath limits this blob store to only layers. | |||
// This instance cannot be used for manifest checks. | |||
linkPathFns: []linkPathFunc{blobLinkPath}, | |||
linkDirectoryPathSpec: LayerDirectoryPathSpec, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this unset in all cases? Is this the actual fix here?
@AndreaGiardini Looking at it closely. The biggest blocker is testing. Have you tried out these changes? |
hi @m-masataka, is it possible to move this PR forward? It's great if we can merge it before v2.8.0. |
@m-masataka can you please rebase? I'd like to pick this up and properly look into this. |
/cc |
any update? |
This needs a rebase. |
hi @m-masataka, any updates? |
Signed-off-by: David Karlsson <35727626+dvdksn@users.noreply.github.com>
hi @milosgajdos @corhere, @m-masataka, I've noticed that this PR has been inactive for a while. If you don't mind, I can take over this PR. However, if the original author wishes to continue, I am also willing to step back. Thanks. :) |
@microyahoo go for it! |
The garbage-collect should remove unsed layer link file P.S. This was originally contributed by @m-masataka, now I would like to take over it. Thanks @m-masataka efforts with PR distribution#2288 Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
The garbage-collect should remove unsed layer link file P.S. This was originally contributed by @m-masataka, now I would like to take over it. Thanks @m-masataka efforts with PR distribution#2288 Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
The garbage-collect should remove unsed layer link file P.S. This was originally contributed by @m-masataka, now I would like to take over it. Thanks @m-masataka efforts with PR distribution#2288 Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
Closing in favour of #4344 |
Signed-off-by: Masataka Mizukoshi m.mizukoshi.wakuwaku@gmail.com
The gabage-collect command should removes layer's link files (
_layers/<algorithm>/<digest>/link
).These link files don't have blobs to link with.