More fault tolerant metadata parsing #2181

Closed
wants to merge 1 commit into from


HolgerHees
Contributor

This fixes metadata parsing for repositories whose filelists metadata does not contain package entries, as is the case for GitLab repos.

Check https://packages.gitlab.com/gitlab/gitlab-ce/scientific/7/x86_64/repodata/repomd.xml

closes: #9567

https://pulp.plan.io/issues/9567
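To illustrate the failure mode described above, here is a minimal sketch (not the code in this patch, and not Pulp's parser) of merging primary and filelists metadata so that a package with no filelists entry still gets parsed. The sample XML, `files_by_pkgid`, and `merge` are hypothetical names made up for this example; only the XML namespaces come from the repodata format.

```python
import xml.etree.ElementTree as ET

COMMON_NS = "http://linux.duke.edu/metadata/common"
FILELISTS_NS = "http://linux.duke.edu/metadata/filelists"

# Hypothetical sample data: primary.xml lists one package,
# while filelists.xml contains no <package> entries at all.
PRIMARY_XML = f"""<metadata xmlns="{COMMON_NS}" packages="1">
  <package type="rpm">
    <name>example</name>
    <checksum type="sha256" pkgid="YES">abc123</checksum>
  </package>
</metadata>"""

EMPTY_FILELISTS_XML = f'<filelists xmlns="{FILELISTS_NS}" packages="0"/>'


def files_by_pkgid(filelists_xml):
    """Map each pkgid found in filelists metadata to its list of file paths."""
    root = ET.fromstring(filelists_xml)
    result = {}
    for pkg in root.findall(f"{{{FILELISTS_NS}}}package"):
        paths = [f.text for f in pkg.findall(f"{{{FILELISTS_NS}}}file")]
        result[pkg.get("pkgid")] = paths
    return result


def merge(primary_xml, filelists_xml):
    """Combine primary and filelists data, tolerating missing filelists entries."""
    files = files_by_pkgid(filelists_xml)
    root = ET.fromstring(primary_xml)
    merged = []
    for pkg in root.findall(f"{{{COMMON_NS}}}package"):
        name = pkg.findtext(f"{{{COMMON_NS}}}name")
        pkgid = pkg.findtext(f"{{{COMMON_NS}}}checksum")
        # Fall back to an empty file list instead of failing on a missing entry.
        merged.append({"name": name, "pkgid": pkgid, "files": files.get(pkgid, [])})
    return merged


print(merge(PRIMARY_XML, EMPTY_FILELISTS_XML))
```

A parser that assumes every package in primary has a matching filelists entry would fail on this input; the `files.get(pkgid, [])` fallback is the fault tolerance being sketched.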

@pulpbot
Member

pulpbot commented Nov 16, 2021

Attached issue: https://pulp.plan.io/issues/9567

@dralley
Contributor

dralley commented Nov 24, 2021

@HolgerHees I'd like to keep your issue open (since it is valid) but I don't think we will merge this patch. Our goal is to rip out all of this code and make it irrelevant within the next month or two - and we've been burned in the past by making changes to this code, so I'm hesitant to take on that risk for the sake of a very short-term solution to an uncommon problem.

But once we do replace the code (shortly), this will be a case that we try to ensure works properly.

My suggestion in the interim is to either keep using the patch or edit the Pulp settings so that RPM_ITERATIVE_PARSING=False. The downside is that it consumes much more memory, but it entirely avoids our reimplementation of the metadata parsing (i.e., the code that is causing problems here).
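For anyone taking the settings route, a sketch of what that change could look like, assuming the deployment uses a Python settings file (commonly /etc/pulp/settings.py, though the path depends on how Pulp was installed):

```python
# Fragment of a Pulp settings file. The file location is an assumption;
# only the setting name and value come from the comment above.
# Trades higher memory use during sync for bypassing the iterative parser.
RPM_ITERATIVE_PARSING = False
```

The Pulp services would most likely need a restart before the new value takes effect.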

One last note: I strongly suspect the metadata synced here is going to be incorrect without changing this line so that do_files=True. But I'm also not sure whether enabling it and parsing the filelists metadata has consequences, such as file entries being duplicated. If you're already using it and it works for you, there's no immediate need to be concerned, but be forewarned that if you do certain things with the repos, like manually regenerating metadata (instead of mirroring it) or moving the packages to other repositories, you might encounter issues.

@dralley dralley closed this Nov 24, 2021