Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for storing the "local" pool on Azure #1074

Merged
merged 13 commits into from
Jun 17, 2024

Conversation

refi64
Copy link
Contributor

@refi64 refi64 commented May 19, 2022

Requirements

All new code should be covered with tests, documentation should be updated. CI should pass.

Description of the Change

This adds support for storing the "local" package pool on Azure. It includes:

  • Misc. test fixes from the Python 3 migration when passing --capture.
  • Adding some functional tests for Azure publishing based on the S3 ones, as a foundation for these changes.
  • Changes to the Azure endpoint syntax to match the official Azure connection string format.
  • Some refactoring of Azure publishing to move any code reused by the package pool to a shared file.
  • Tangentially related change to clean up temporary files when mirroring.

I described some of these in further detail in the individual commits.

This does not include changing all references to the "local" package pool, because I'm honestly not sure if that's worthwhile...

Checklist

  • unit-test added (if change is algorithm)
  • functional test added/updated (if change is functional)
  • man page updated (if applicable)
  • bash completion updated (if applicable) (CLI was not touched)
  • documentation updated (I don't believe the changes here would result in changes to the website content)
  • author name in AUTHORS

Copy link
Contributor

@chuan chuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Great to see the new functionality to store the package pool in the remote storage. I will take a more detailed pass of the change later today.

context/context.go Outdated Show resolved Hide resolved
azure/package_pool.go Outdated Show resolved Hide resolved
@randombenj randombenj self-requested a review May 25, 2022 08:24
@refi64
Copy link
Contributor Author

refi64 commented May 25, 2022

Hmm, the Azure logs seem to have some 500 internal server errors from Azurite, I'll look into it locally.

@refi64
Copy link
Contributor Author

refi64 commented May 25, 2022

I can't reproduce the CI failure at all locally, so I added a step to upload Azurite's logs if the tests fail, which should hopefully help clear things up a bit...

@refi64
Copy link
Contributor Author

refi64 commented Jun 2, 2022

Can't use 'tar -xzf' extract archive file: /home/runner/work/_actions/_temp_4c5950e6-f316-4a7c-aa09-7b03239b3a59/3b979214-6f9f-46a7-976a-ed55121cf5e1.tar.gz. return code: 2.

...GitHub Actions issue maybe?

@chuan
Copy link
Contributor

chuan commented Jun 2, 2022

Can't use 'tar -xzf' extract archive file: /home/runner/work/_actions/_temp_4c5950e6-f316-4a7c-aa09-7b03239b3a59/3b979214-6f9f-46a7-976a-ed55121cf5e1.tar.gz. return code: 2.

...GitHub Actions issue maybe?

Where did you see this error?

Since in the CI #137 run only the long test for v1.17 failed, I suspect it is an intermittent issue with Azurite or the runtime environment.

@refi64
Copy link
Contributor Author

refi64 commented Jun 2, 2022

@chuan apologies for the confusion, that was from an earlier run, but I think @randombenj restarted it since then. There was a bug in the way config files were written that I missed, just pushed up a fix.

@chuan
Copy link
Contributor

chuan commented Jun 7, 2022

@refi64 I figured out that the Azurite 500 errors are due to its use of /tmp. The Azure tests in your change should pass the CI if you pull the change here: #1078.

@refi64
Copy link
Contributor Author

refi64 commented Jun 8, 2022

@chuan thanks! I just pushed a new version including that commit to see how it goes

.github/workflows/ci.yml Outdated Show resolved Hide resolved
aptly/interfaces.go Outdated Show resolved Hide resolved
@refi64 refi64 force-pushed the cloud-storage branch 2 times, most recently from 405bf8f to f158b3f Compare June 8, 2022 21:31
@chuan
Copy link
Contributor

chuan commented Jun 9, 2022

The latest change looks good to me and glad to see all CI jobs passed!

@refi64
Copy link
Contributor Author

refi64 commented Jun 13, 2022

Just rebased on top of the merged master incl. the Azurite fixes PR.

@randombenj
Copy link
Member

As this is quite a big change and I am not too familiar with the source yet, this will probably take some time to review ...

@randombenj
Copy link
Member

I am not sure if I understand all the changes yet ...
However, I am wondering a bit why the changes are necessary in the first place?
Couldn't you just mount the azure blob storage to the local disk and then run aptly as is to this directory?
This would be much simpler and aptly wouldn't have to implement all the read write logic?

@chuan
Copy link
Contributor

chuan commented Jun 23, 2022

I am not sure if I understand all the changes yet ... However, I am wondering a bit why the changes are necessary in the first place? Couldn't you just mount the azure blob storage to the local disk and then run aptly as is to this directory? This would be much simpler and aptly wouldn't have to implement all the read write logic?

Mounting Azure blob storage to local disk can usually be done via either Azure Blob NFS support or via FUSE driver, both have their own limitations.

Azure blob NFS protocol only supports limited features of blob storage. For example, users can only use it with a storage account that enables the hierarchical namespace and cannot enable NFS on an existing storage account. More limitations can be found here: https://docs.microsoft.com/en-us/azure/storage/blobs/network-file-system-protocol-known-issues

FUSE implementation (https://github.com/Azure/azure-storage-fuse) uses the same underlying REST APIs as the implementation here. It is targeting more generic use cases, has different requirements, and its implementation is much more complicated because it needs to support the complete set of filesystem APIs. Having our own storage and publish implementation directly based on REST APIs cuts through the middle layer and makes it possible to optimize for our own use cases. For example, the file moving operation is made atomic in the Azure publish support, which is not supported in FUSE.

Similar arguments also apply as to why we need to support publishing to S3 and OpenStack Swift.

@neolynx neolynx added PR superseded and removed needs rebase The PR needs to be rebased on master labels Apr 21, 2024
@neolynx
Copy link
Member

neolynx commented Apr 21, 2024

@refi64 could you check if this still works for you ?

especially since a prefix was introduced in azure/public.go ...

@neolynx neolynx added needs review Ready for review & merge increase coverage The PR lacks test coverage and removed PR superseded labels Jun 15, 2024
@neolynx neolynx requested review from mfolusiak, szakalboss and a team June 15, 2024 13:49
@neolynx neolynx removed the increase coverage The PR lacks test coverage label Jun 15, 2024
refi64 and others added 13 commits June 16, 2024 01:04
read_path() can read in binary, which the S3 tests don't support (simply
because they don't need it)...but it needs to be able to take the `mode`
argument anyway.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
Before, a "partial" URL (either "localhost:port" or an endpoint URL
*without* the account name as the subdomain) would be specified, and the
full one would automatically be inferred. Although this is somewhat
nice, it means that the endpoint string doesn't match the official Azure
syntax:

https://docs.microsoft.com/en-us/azure/storage/common/storage-configure-connection-string

This also raises issues for the creation of functional tests for Azure,
as the code to determine the endpoint string needs to be duplicated
there as well.

Instead, it's just easiest to follow Azure's own standard, and then
sidestep the need for any custom logic in the functional tests.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
The contents of `os.Stat` are rather fitted towards local package pools,
but the method is in the generic PackagePool interface. This moves it to
LocalPackagePool, and the use case of simply finding a file's size is
delegated to a new, more generic PackagePool.Size() method.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
Several sections of the code *required* a LocalPackagePool, but they
could still perform their operations with a standard PackagePool.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
This adds support for storing packages directly on Azure, with no truly
"local" (on-disk) repo used. The existing Azure PublishedStorage
implementation was refactored to move the shared code to a separate
context struct, which can then be re-used by the new PackagePool. In
addition, the files package's mockChecksumStorage was made public so
that it could be used in the Azure PackagePool tests as well.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
fixes golangci-lint errors
None of the commands' output is ever treated as binary, so we can just
always decode it as text.

Signed-off-by: Ryan Gonzalez <ryan.gonzalez@collabora.com>
Copy link
Member

@neolynx neolynx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tested also manually

@neolynx neolynx merged commit b5bf2cb into aptly-dev:master Jun 17, 2024
15 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs review Ready for review & merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants