Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add database interface for blob paths + DynamoDB integration + refactor BoltDB to use interface #588

Closed
wants to merge 4 commits into from

Conversation

chofnar
Copy link
Contributor

@chofnar chofnar commented Jun 14, 2022

Closed as I split this PR into two: #667 and chofnar#18

What type of PR is this?

enhancement

Which issue does this PR fix:

#564

What does this PR do / Why do we need it:

This PR adds a database interface that allows the usage of vendor-specific database services by implementing their drivers.
The usage of BoltDB was refactored to use this new interface.
Added AWS DynamoDB driver.

Will this break upgrades or downgrades?

If a client uses DynamoDB and decides to downgrade, they must make sure that the database is migrated to BoltDB to avoid data loss.

Does this PR introduce any user-facing change?:

Adds the possibility to use AWS DynamoDB instead of BoltDB


Problems
Right now, there are several problems with this PR:

  • database_factory.go uses global variable driverFactories to keep track of the available drivers (gochecknoglobals)
  • new entries to that variable are added with the Drivers' init() functions (gochecknoinits)
  • need to specify protocol in front of dynamodb endpoint (known issue)
  • checking for db interface value to be nil perhaps less than ideal way: fmt.Sprintf("%v", is.cache) == fmt.Sprintf("%v", nil)

Other comments
Inspiration from https://github.com/distribution/distribution

To do

  • check that the example config actually works
  • add coverage tests
  • rewrite some (if not all) cache tests for dynamodb

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@chofnar chofnar force-pushed the dynamodb branch 5 times, most recently from af1d119 to f0e76e7 Compare June 14, 2022 15:11
@rchincha
Copy link
Contributor

why so?

@@ -29,7 +29,7 @@ jobs:
s3mock:
image: ghcr.io/project-zot/localstack/localstack:0.13.2
env:
SERVICES: s3
SERVICES: s3,dynamodb
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes us a user of another aws service? :\

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes..I propose to keep backing up boltdb in s3, and locally we just download it to a temp dir, use it, and keep backing it up?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will work for one zot but not for multiple, for multiple we should do the same for all GETs but that will add a lot of overhead as GETs are very frequent....

@chofnar chofnar force-pushed the dynamodb branch 10 times, most recently from 0016f01 to 3aaa919 Compare June 15, 2022 13:11
@codecov
Copy link

codecov bot commented Jun 15, 2022

Codecov Report

Merging #588 (42f322a) into main (4a3c007) will decrease coverage by 0.87%.
The diff coverage is 67.41%.

❗ Current head 42f322a differs from pull request most recent head 8886f6f. Consider uploading reports for the commit 8886f6f to get more accurate results

@@            Coverage Diff             @@
##             main     #588      +/-   ##
==========================================
- Coverage   88.05%   87.18%   -0.88%     
==========================================
  Files          61       62       +1     
  Lines       11630    11712      +82     
==========================================
- Hits        10241    10211      -30     
- Misses       1090     1190     +100     
- Partials      299      311      +12     
Impacted Files Coverage Δ
pkg/api/config/config.go 85.41% <ø> (ø)
pkg/api/controller.go 80.54% <24.70%> (-13.37%) ⬇️
pkg/storage/dynamodatabase/dynamodatabase.go 74.28% <74.28%> (ø)
pkg/storage/cache.go 67.22% <86.53%> (+5.28%) ⬆️
pkg/storage/database/database_factory.go 100.00% <100.00%> (ø)
pkg/storage/local.go 82.67% <100.00%> (-0.15%) ⬇️
pkg/storage/s3/s3.go 85.66% <100.00%> (+0.09%) ⬆️
pkg/extensions/search/resolver.go 78.58% <0.00%> (-7.08%) ⬇️
pkg/extensions/search/common/oci_layout.go 90.56% <0.00%> (-4.86%) ⬇️
... and 4 more

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@chofnar chofnar linked an issue Jun 15, 2022 that may be closed by this pull request
2 tasks
@chofnar chofnar force-pushed the dynamodb branch 2 times, most recently from 0d8300d to c575305 Compare June 17, 2022 12:49
@rchincha
Copy link
Contributor

What happens when we interface with GCP instead? is there a dynamoDB equivalent for GCP?
Would prefer to have a IO driver per cloud provider instead.

@chofnar chofnar force-pushed the dynamodb branch 4 times, most recently from 5f4432a to 30eacda Compare June 27, 2022 15:19
pkg/storage/dynamodb/dynamodb.go Outdated Show resolved Hide resolved
pkg/storage/cache.go Show resolved Hide resolved
docs/aws_databases.md Outdated Show resolved Hide resolved
pkg/storage/s3/s3.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@peusebiu peusebiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved by mistake

@chofnar chofnar force-pushed the dynamodb branch 6 times, most recently from cde8aaa to 2c338a8 Compare July 8, 2022 12:10
@@ -1032,7 +1041,7 @@ func (is *ImageStoreLocal) FinishBlobUpload(repo, uuid string, body io.Reader, d

dst := is.BlobPath(repo, dstDigest)

if is.dedupe && is.cache != nil {
if is.dedupe && fmt.Sprintf("%v", is.cache) != fmt.Sprintf("%v", nil) {
Copy link
Collaborator

@peusebiu peusebiu Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we use a switch where we use type check for each implementation we know we have, now we only have dynamodatabase and boltdb, if it's not one of these types assume cache is nil

later when adding more databases we add checks for them.

Thoughts?

pkg/storage/s3/s3.go Outdated Show resolved Hide resolved
Signed-off-by: Catalin Hofnar <catalin.hofnar@gmail.com>
Added proper endpoint handling, logging, tests
Implemented more functions, added more tests, updated errors
Added aws databases documentation
Added DB driver and factories, need to implement
Implemented Factory and Driver interfaces for dynamodb
Refactor cache.go and changed some stuff
Rewrote some stuff, TODO: fix PutBlob
Fixed dynamoDB functions
Refactor boltDB to use new interface
Fixed gotcha
Lint
Point S3 to use DynamoDB database instead of boltdb
Reverted s3 use only boltdb
Added more dynamodb tests, fixed errors lint
Added database factory tests, removed Scan and DescribeTable functions
Added loading of db parameters from either config or env
Renamed packages and moved files around to avoid cyclic import
Added more coverage tests
Removed env var usage for config
Added benchmark tests

Signed-off-by: Catalin Hofnar <catalin.hofnar@gmail.com>
Signed-off-by: Catalin Hofnar <catalin.hofnar@gmail.com>
Signed-off-by: Catalin Hofnar <catalin.hofnar@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
zot-core
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

3 participants