Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deserialize treeinfo files in a scpecific order #1639

Merged
merged 1 commit into from Mar 27, 2020
Merged

Conversation

fao89
Copy link
Member

@fao89 fao89 commented Mar 12, 2020

Copy link
Member Author

@fao89 fao89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All comments are related to sync openSUSE Leap 15.1 from here:
http://download.opensuse.org/distribution/leap/15.1/repo/oss/
Summarizing:

  • By our side, the issue was the disorderly deserialization
  • By productmd, the validation on short for release section
  • By repo, for not having timestamp

Comment on lines 98 to 72
deserialize_order = "header release tree variants checksums images stage2 media".split()
sections = parser._sections.keys()

for section in sections:
for section in deserialize_order:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be helpful to have it in the comments so anyone who looks later at the code understood why there is some special handling present.

Comment on lines 85 to 86
if "release" in sections:
release = parser._sections.get("release")
release["short"] = release.get("short", release["name"].split()[0])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was breaking because the treeinfo file didn't have a short name

@@ -55,6 +55,37 @@ def load(self, f):
except MissingSectionHeaderError:
raise TypeError(_("Treeinfo file should have INI format"))

def sections_from_general(self, parser):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

general is formed by data from other sections:
https://productmd.readthedocs.io/en/latest/treeinfo-1.0.html
for this case, it had data on general, but not on the "original sections", so I built these sections from the general, using the link above as reference

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fabricio-aguiar , @daviddavis , does it mean that during publication we'll get a different treeinfo file?
Or do we keep treeinfo file untouched?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

treeinfo file will be untouched, it just changes the way we read it


if general and "tree" not in sections:
tree = {}
fake_timestamp = "".join([str(ord(a)) for a in general["name"].replace(" ", "")])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is super dirty!
the treeinfo file didn't have a timestamp
http://download.opensuse.org/distribution/leap/15.1/repo/oss/.treeinfo

[header]
version = 1.0

[release]
name = openSUSE Leap
version = 15.1

[general]
arch = x86_64
family = openSUSE Leap
name = openSUSE Leap 15.1
version = 15.1
platforms = x86_64,xen

[images-x86_64]
kernel = boot/x86_64/loader/linux
initrd = boot/x86_64/loader/initrd

[images-xen]
kernel = boot/x86_64/loader/linux
initrd = boot/x86_64/loader/initrd

depending on the header version, productmd validates if timestamp is integer or float.
I needed something unique by treeinfo file, and it should be a number, so I took the general.name (in this case is: openSUSE Leap 15.1), and replaced the letter by numbers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fabricio-aguiar , naive question. Can we just set timestamp to 0 if it's not available?
Why do we need something unique?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is not naive, actually it is much better than what I did. I thought on uniqueness because of unique_together, but putting a 0 would solve it

@fao89
Copy link
Member Author

fao89 commented Mar 12, 2020

@goosemania @daviddavis this PR unblocks @pavelpicka
but we have to define what will be the strategy to address it

@fao89 fao89 changed the title Parse OpenSUSE treeinfo files Deserialize treeinfo files in a scpecific order Mar 23, 2020
@fao89
Copy link
Member Author

fao89 commented Mar 23, 2020

Changing this PR since the main issue was addressed on release-engineering/productmd#141

@fao89 fao89 requested a review from goosemania March 23, 2020 17:34
for section in sections:
# ProductMD follows a specific order to deserialize:
# https://github.com/release-engineering/productmd/blob/master/productmd/treeinfo.py#L120
deserialize_order = "header release tree variants checksums images stage2 media".split()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an interesting approach, it wouldn't cross my mind to create a list this way!

I'm on the fence.
Having a list (or maybe a tuple in this case is better) explicitly seems to me the obvious choice because the goal is to have a sequence. At the same time it would mean many quotation marks which can be seen as bringing noise.
Having it as a string and then split is a bit unintuitive way to introduce a sequence but there is no noise with quotes.

@pulpbot
Copy link
Member

pulpbot commented Mar 27, 2020

Attached issue: https://pulp.plan.io/issues/6322

@fao89 fao89 merged commit 8f5df0a into pulp:master Mar 27, 2020
@pulpbot pulpbot mentioned this pull request Mar 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants