Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add model fields for Package type #1105

Merged
merged 1 commit into from
May 18, 2018
Merged

Conversation

dralley
Copy link
Contributor

@dralley dralley commented Apr 27, 2018

@dralley dralley force-pushed the 3199-models branch 16 times, most recently from aa5a64b to 4dc2beb Compare May 3, 2018 15:27
@dralley dralley force-pushed the 3199-models branch 5 times, most recently from 155bf41 to 73f07cc Compare May 3, 2018 15:57
@dralley dralley requested a review from goosemania May 4, 2018 12:25
# version (str): version
# release (str): release
# pre (bool): preinstall
requires = models.TextField(default='[]', blank=True)
Copy link
Contributor Author

@dralley dralley May 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do 'rich dependencies' fit in here? I couldn't find any information with respect to createrepo_c + 'rich' or 'boolean' dependencies.

Is it something we don't really have to be concerned with at the modelling level?

Copy link
Member

@goosemania goosemania May 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I believe we should not be concerned about that for the modeling part.
Any dependency solving should be handled by an external library and it's its business to figure out all the relationships and types of dependencies the package has by repository and package metadata.

And for createrepo_c I'd guess that any rich dependency would be just a string which is not parsed.

)


class SrpmContent(RpmContent):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a separate content type, or an is_srpm=True boolean flag on the RPM model?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm personally +1 to have a separate type. It will be a separate table in DB, right? I think it will make all the type related operations easier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of how it's modeled. it would be great if when filtering packages that you have 1 endpoint that has both rpms and srpms with a flag like is_srpm=True. With that in mind modeling it as the same may be the easiest. I'm ok with either approach.

Copy link
Member

@goosemania goosemania left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a general question.
Do we really want to have all those fields on the model?
Are we going to use fields like rpm_buildhost or time_build and other similar ones?
I agree not to create/name fields which will be different from what createrepo_c has but I'm not sure we need all the attributes from createrepo_c objects until we have a use case for them.
From a migration point of view, it's ok until we have lazy because we can take this data from the rpm itself. Thoughts?

recommends = models.TextField(default='[]', blank=True)
supplements = models.TextField(default='[]', blank=True)

location_base = models.TextField(blank=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be null, not just empty string. FWIW, I'm testing on the packages from EPEL repo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for at least url and description

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Django heavily recommends that you not use null=True with TextField or CharField because it just creates extra "empty" states.

We can do it, but it would be preferable to just store None as an empty string.


class Meta:
unique_together = (
'name', 'epoch', 'version', 'release', 'arch', 'checksum_type', 'pkgId'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can someone remind me, why we use checksum_type and checksum(pkgId) as a part of the unique index?

Why can't we leave just checksum_type? The only reason to support two packages with the same nevra and checksum_type but different checksum(pkgId) is if one package is wrong and another one is a correct one which was fixed/updated, thus checksum(pkgId) is different. What do you all think? Is it a valid use case?

Why can't we just leave checksum(pkgId) in the unique index? It will cover the use case ^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good to me

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The checksum value gives an easy way to know if the same package has been signed by different gpg keys (or no key at all). That's the primary reason checksum is included in the unit key. It could be better to use information about the signature directly in the unit key, but that's a lot harder to access since it isn't included in any of the XML metadata. You need to inspect the RPM file directly to get at it.

Assuming you keep the checksum value in the unit key, the checksum type isn't very helpful for establishing uniqueness. But it's such a fundamental part of the checksum value, that it would be weird to have a value without knowing the type. So that's why both have been included in the unit key.


(same as the "rpm" content type)
"""
TYPE = 'rpm'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean srpm?

@dralley
Copy link
Contributor Author

dralley commented May 17, 2018

Do we really want to have all those fields on the model?
Are we going to use fields like rpm_buildhost or time_build and other similar ones?
I agree not to create/name fields which will be different from what createrepo_c has but I'm not sure we need all the attributes from createrepo_c objects until we have a use case for them.

Good question. Pulp 2 does model them, but does it actually use them?

https://github.com/pulp/pulp_rpm/blob/pulp-rpm-2.14.3-1/plugins/pulp_rpm/plugins/db/models.py#L751

@dralley dralley force-pushed the 3199-models branch 2 times, most recently from 64d66ba to 416cde3 Compare May 17, 2018 17:20
@dralley
Copy link
Contributor Author

dralley commented May 17, 2018

Added pkgId (checksum) back for now

location_base = models.TextField(blank=True)
location_href = models.TextField(blank=True)

rpm_buildhost = models.TextField(blank=True)
Copy link
Contributor

@jortel jortel May 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are these particular fields prefixed with rpm_?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because that's what createrepo_c does, and we're matching what they do precisely


(same as the "rpm" content type)
"""
TYPE = 'source-rpm'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to name it "srpm", this name will appear in the API or in the filters,

Copy link
Member

@goosemania goosemania left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @dralley !

@dralley dralley merged commit 7570604 into pulp:3.0-dev May 18, 2018
@dralley dralley deleted the 3199-models branch May 18, 2018 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants