Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alphabetical directory structure by default #1657

Merged
merged 1 commit into from
Apr 8, 2020

Conversation

pavelpicka
Copy link
Contributor

@pavelpicka pavelpicka commented Mar 25, 2020

Publish repository with directory structure sorter by first letter under
'Packages' directory.

Adjust docs' scripts and tests to reflect the change.

/Packages/[a-z]/*.rpm
/repodata

closes: #4445
closes: #6399

https://pulp.plan.io/issues/4445
https://pulp.plan.io/issues/6399

@pavelpicka pavelpicka changed the title Alphabetical directory structure by default [WIP] Alphabetical directory structure by default Mar 25, 2020
@pulpbot
Copy link
Member

pulpbot commented Mar 25, 2020

Attached issue: https://pulp.plan.io/issues/4445

Attached issue: https://pulp.plan.io/issues/6399

@pavelpicka pavelpicka force-pushed the 1545-alpha-sctruct branch 3 times, most recently from 8a2632f to a4101e7 Compare March 25, 2020 18:08
@pavelpicka pavelpicka changed the title [WIP] Alphabetical directory structure by default Alphabetical directory structure by default Mar 25, 2020
@pavelpicka pavelpicka requested a review from a team March 26, 2020 11:32
CHANGES/6399.misc Outdated Show resolved Hide resolved
pulp_rpm/app/tasks/publishing.py Outdated Show resolved Hide resolved
pulp_rpm/app/tasks/publishing.py Outdated Show resolved Hide resolved
pulp_rpm/app/tasks/publishing.py Outdated Show resolved Hide resolved
pulp_rpm/app/tasks/publishing.py Outdated Show resolved Hide resolved
pulp_rpm/tests/functional/utils.py Outdated Show resolved Hide resolved
Comment on lines 271 to 276
pkg_filename = os.path.basename(package.location_href)
pkg.location_href = os.path.join(
"Packages", pkg_filename[0].lower(),
pkg_filename
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we are publishing using a filename and sorting it into directories bu their first letter.
With the recent suggestion to allow in one repo version packages with unique location_href, we might run into issues.
If one package is in a/foo.rpm and another is in b/foo.rpm, they will be allowed in a repo version because their location_hrefs differ but at publish time the original dir structure is ignored, so I guess only one of those files will be published. Any suggestions?
Should we check not the whole location_href but the filename during repo version creation?

@ppicka, @ipanova , @dralley , @daviddavis , @dkliban

Copy link
Member

@ipanova ipanova Mar 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we should use the pkg.location_href when publishing after my "divine" realization :D i don't think we can publish 2 rpms with different checksums and having packages/x publish layout

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@goosemania i did some experiment and added location_href to the repo_key fields. We still cannot have 2 packages with a/foo.rpm and another b/foo.rpm location_href because the check for validate_version_paths fails. It checks if 2 CA have overlaped relative_path, which in our case does overlap because we do not take full location_href but just os.path.basename

@pavelpicka pavelpicka force-pushed the 1545-alpha-sctruct branch 6 times, most recently from b15d47a to c4a300c Compare April 1, 2020 11:01
CHANGES/6399.bugfix Outdated Show resolved Hide resolved
Copy link
Member

@ipanova ipanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in favour of merging this now, since our publish is broken. Once we agree whether to add location_href or filename to the repo_key fields we can re-work this part if need be.

@@ -257,6 +267,14 @@ def create_repomd_xml(content, publication, extra_repomdrecords,
# Process all packages
for package in packages.iterator():
pkg = package.to_createrepo_c()
pkg_filename = os.path.basename(package.location_href)
Copy link
Member

@ipanova ipanova Apr 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason we are not using package.filename directly here? you are doing a correct thing here. We need to look at location_href specifically since filename is a property that is composed out ofpackage nevra, that potentially can be different from what is written in the location_href

@@ -257,6 +267,14 @@ def create_repomd_xml(content, publication, extra_repomdrecords,
# Process all packages
for package in packages.iterator():
pkg = package.to_createrepo_c()
pkg_filename = os.path.basename(package.location_href)
# this can cause an issue when two same RPM package names appears
# a/name1.rpm b/name1.rpm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i do not think this is an issue yet, since we have not added location_href among repo_key fields.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or let' say - this is an issue that would not be managed during sync, so sync would fail if we would have 2 same rpms what would differ only in the location_href.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so i did some more testing where:

  • uploaded artifact A
  • created an rpm out out it and used relative_path X
  • created an rpm out of artifact A and used relative_path Y - this task failed with an error
        "description": "{'non_field_errors': [ErrorDetail(string='There is already a package with: arch=noarch, checksum_type=sha256, epoch=0, name=fox, pkgId=eff95a379774c7622037bcb7ff347271dc727ae4ddce9b16017477b58488aa3e, release=2, version=1.1.', code='invalid')]}",

Then i tried:

  • uploaded artifact A
  • created an rpm out out it and used relative_path X
  • uploaded artifact B
  • created an rpm out out it and used relative_path X.
  • 2 rpms got uploaded into pulp, however then i tried to add those 2 rpms to the repo , I got an error
"description": "Cannot create repository version. Path is duplicated: fox-1.1-2.noarch.rpm.",

This testing has shown that:
a) within a repo version there cannot be 2 rpms with the same relative_path [aka our location_href from rpm during sync os.path.basename(package.location_href) or the relative_path we provide during upload and eventually treat it as location_href]
b) within pulp we cannot have 2 same rpms with different relative_path

Publish repository with directory structure sorter by first letter under
'Packages' directory.

Adjust docs scripts and tests to reflect the change.
Added 'django' as test requirement.

/Packages/[a-z]/*.rpm
/repodata

closes: #4445
closes: #6399

https://pulp.plan.io/issues/4445
https://pulp.plan.io/issues/6399
@ipanova ipanova merged commit 0423b0c into pulp:master Apr 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants