Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't fail on missing metadata #50

Merged
merged 2 commits into from
Feb 28, 2024
Merged

Conversation

schaefi
Copy link
Contributor

@schaefi schaefi commented Feb 24, 2024

The method get_image_packages_metadata looks up either a .report or a .packages file from the given download URI. The information if present is used in mash to check on package conditions and for the build time of the image. In case no metadata information is present the returned package dict is empty and the object instance stored build time stays at its initial 'unknown' value. Not raising an exception but sending a log warning has two effects on mash:

  1. An unknown buildtime causes a download of the image in any case
  2. An empty packages information causes the optionally given package based conditions in the mash job description to be not verifiable

On the positive side any URL just providing an image can then be consumed by mash with the mentioned restrictions. At the moment the scope of mash is limited to OBS build images and projects which has the "withreports" flags configured. Any other source location is not consumable by mash and this commit would allow to increase the scope for mash.

@schaefi
Copy link
Contributor Author

schaefi commented Feb 24, 2024

This is related to SUSE-Enceladus/mash#898

I don't think the failed tests for newer python versions depends on my change. Some of the tests are implemented with deprecated test concepts compared to py >= 3.8

@schaefi schaefi self-assigned this Feb 24, 2024
@rjschwei
Copy link
Collaborator

While I agree in principal that we should be able to download images from OBS that do not have the metadata, this change introduces another problem, IMHO. As implemented one could use obs-img-utils, specify a license to exclude but because there is no metadata the image would be downloaded without error anyway leaving the user to believe that everything is just fine, when in fact it may not be. I think if any filtering conditions are applied and no metadata is available the code should exit with an error.

@schaefi
Copy link
Contributor Author

schaefi commented Feb 26, 2024

I think if any filtering conditions are applied and no metadata is available the code should exit with an error.

yes I fully agree, good point. This condition check needs to find it's implementation in mash if I understood you correctly. So going forward it means a pull request to mash that check for the above condition must be done first prior this change to eventually be accepted correct ?

@rjschwei
Copy link
Collaborator

I think if any filtering conditions are applied and no metadata is available the code should exit with an error.

yes I fully agree, good point. This condition check needs to find it's implementation in mash if I understood you correctly. So going forward it means a pull request to mash that check for the above condition must be done first prior this change to eventually be accepted correct ?

I think the condition should be in this code base. I have not tested this but I think with this change I could us the code with the --disallow-license command line argument and the image download would succeed if there is no metadata against which to check this condition. Somewhere there has to be a check for this condition in this code base. At the moment the download will simply fail if there is none of the image metadata providing this feature with a very big hammer. If we want to be more fine grained which is a good direction IMHO we still have to have these types of conditions produce an error.

@schaefi
Copy link
Contributor Author

schaefi commented Feb 26, 2024

I think the condition should be in this code base

yes after reading the code I saw that conditions are checked here and only the status is checked in mash. Will update the PR from here. To be honest I find this utils connection into mash quite confusing

@schaefi
Copy link
Contributor Author

schaefi commented Feb 26, 2024

how about this direction now ? This also needs to be tested first

@schaefi
Copy link
Contributor Author

schaefi commented Feb 26, 2024

I did not look into fine grain regarding which conditions are there. So there is still the hammer which does not allow to continue if there is any sort of condition specified. However if you don't set any condition we allow to continue, which would be ok from my perspective

@smarlowucf
Copy link
Collaborator

There might be some more changes necessary here. As far as I remember the packages/reports file is used as a way to find the version/build of the image. Without that or a version/build condition the regex might fail to find the image. Need to dig in to be sure.

To be honest I find this utils connection into mash quite confusing

The code was split out since it's used by many tools aside from just Mash. Even though the design is still Mash specific. This is something I wanted to clean up to at some point.

@schaefi
Copy link
Contributor Author

schaefi commented Feb 27, 2024

There might be some more changes necessary here.

Uh yes you are right. There is

def check_all_conditions(self):
        self.check_image_conditions()
        self.check_license_conditions()
        self.check_invalid_packages()

and check_image_conditions has

for condition in self.conditions:
            if 'package_name' in condition:
               ...
            else:
                if self._check_version_and_build_condition(

So at least this part seems mandatory

@schaefi
Copy link
Contributor Author

schaefi commented Feb 27, 2024

The other condition check methods are not mandatory to my understanding

@schaefi
Copy link
Contributor Author

schaefi commented Feb 27, 2024

The code was split out since it's used by many tools aside from just Mash. Even though the design is still Mash specific. This is something I wanted to clean up to at some point.

yes there is a coupling to mash in a way that conditions are part of a mash job doc and handed over to instances of OBSImageUtil. For me this already has caused some headaches as I though it's decoupled from mash. In fact a change in OBSImageUtil can easily impact mash's functionality unnoticed. So only yours and Robert's brain prevents regressions :-)

@schaefi
Copy link
Contributor Author

schaefi commented Feb 27, 2024

@smarlowucf ok I need to understand this better. Hope you can help me. I see two properties in OBSImageUtil which are

  • image_release
  • image_version

In the mash code there is one instance of OBSImageUtil called downloader. This instance is not accessing the property. Thus I assume the property is only used inside of OBSImageUtil. The code path where it is been used is this one

if self._check_version_and_build_condition(
    condition,
    self.image_release,
    self.image_version,
    self.image_name
):
    ...

access to the property can cause an exception e.g OBSImageVersionException

apart from running this check I don't see what could cause the download to fail ? In fact the download would just work because it uses the name of the image file found by web_content.py and the regex passed to e.g fetch_to_dir.
So I was wondering what is the purpose of _check_version_and_build_condition ?

Thanks

@schaefi
Copy link
Contributor Author

schaefi commented Feb 27, 2024

Hmm, and I think there is already a check that prevents calling the check_ methods

if self.has_conditions:
    self._wait_on_image_conditions()

without conditions no call and no check_all_conditions() and everything that follows after it.

ok so long story short. With the patch from here I did the following test:

from obs_img_utils.api import OBSImageUtil

downloader = OBSImageUtil(
    'https://download.opensuse.org/repositories/Virtualization:/Appliances:/Images:/Testing_x86:/fedora/images/',
    'kiwi-test-image-live-disk',
    profile='Disk'
)

print(f'_______ {downloader.packages}')

image_source = downloader.get_image()

print(image_source)

This is one of my integration testing projects, no .package no .reports file in the repos. And the the downloader does its job ( I added some debug data )

(no_fail_on_missing_metadata*)> python bob.py 
---------------- kiwi-test-image-live-disk
################ kiwi-test-image-live-disk.x86_64-2.0.0-Disk-Build106.9.raw.xz
################ kiwi-test-image-live-disk.x86_64-2.0.0-Disk-Build106.9.raw.xz.mirrorlist
################ kiwi-test-image-live-disk.x86_64-2.0.0-Disk-Build106.9.raw.xz.sha256
################ kiwi-test-image-live-disk.x86_64-2.0.0-Disk-Build106.9.raw.xz.sha256.asc
################ kiwi-test-image-live-disk.x86_64-2.0.0-Disk-Build106.9.raw.xz.sha256.asc.mirrorlist
################ kiwi-test-image-live-disk.x86_64-2.0.0-Disk-Build106.9.raw.xz.sha256.mirrorlist
################ kiwi-test-image-live-disk.x86_64-2.0.0-Virtual-Build106.8.qcow2
################ kiwi-test-image-live-disk.x86_64-2.0.0-Virtual-Build106.8.qcow2.mirrorlist
################ kiwi-test-image-live-disk.x86_64-2.0.0-Virtual-Build106.8.qcow2.sha256
################ kiwi-test-image-live-disk.x86_64-2.0.0-Virtual-Build106.8.qcow2.sha256.asc
################ kiwi-test-image-live-disk.x86_64-2.0.0-Virtual-Build106.8.qcow2.sha256.asc.mirrorlist
################ kiwi-test-image-live-disk.x86_64-2.0.0-Virtual-Build106.8.qcow2.sha256.mirrorlist
################ kiwi-test-image-live-disk.x86_64-Disk.raw.xz
################ kiwi-test-image-live-disk.x86_64-Disk.raw.xz.mirrorlist
################ kiwi-test-image-live-disk.x86_64-Disk.raw.xz.sha256
################ kiwi-test-image-live-disk.x86_64-Disk.raw.xz.sha256.asc
################ kiwi-test-image-live-disk.x86_64-Disk.raw.xz.sha256.asc.mirrorlist
################ kiwi-test-image-live-disk.x86_64-Disk.raw.xz.sha256.mirrorlist
################ kiwi-test-image-live-disk.x86_64-Virtual.qcow2
################ kiwi-test-image-live-disk.x86_64-Virtual.qcow2.mirrorlist
################ kiwi-test-image-live-disk.x86_64-Virtual.qcow2.sha256
################ kiwi-test-image-live-disk.x86_64-Virtual.qcow2.sha256.asc
################ kiwi-test-image-live-disk.x86_64-Virtual.qcow2.sha256.asc.mirrorlist
################ kiwi-test-image-live-disk.x86_64-Virtual.qcow2.sha256.mirrorlist
_______ {}
**** https://download.opensuse.org/repositories/Virtualization:/Appliances:/Images:/Testing_x86:/fedora/images//kiwi-test-image-live-disk.x86_64-2.0.0-Disk-Build106.9.raw.xz
**** https://download.opensuse.org/repositories/Virtualization:/Appliances:/Images:/Testing_x86:/fedora/images//kiwi-test-image-live-disk.x86_64-2.0.0-Disk-Build106.9.raw.xz.sha256
**** https://download.opensuse.org/repositories/Virtualization:/Appliances:/Images:/Testing_x86:/fedora/images//kiwi-test-image-live-disk.x86_64-2.0.0-Disk-Build106.9.raw.xz.sha256.asc
/home/ms/obs_img_utils/images/kiwi-test-image-live-disk.x86_64-2.0.0-Disk-Build106.9.raw.xz

@smarlowucf
Copy link
Collaborator

As long as in your testing the download still works it should be good. The area I was concerned is here: https://github.com/SUSE-Enceladus/obs-img-utils/blob/main/obs_img_utils/api.py#L388. But if you dig down the base file name is no longer the packages file. Instead it's the image file as long as it matches one of the known extensions.

@schaefi
Copy link
Contributor Author

schaefi commented Feb 28, 2024

As long as in your testing the download still works it should be good. The area I was concerned is here: https://github.com/SUSE-Enceladus/obs-img-utils/blob/main/obs_img_utils/api.py#L388. But if you dig down the base file name is no longer the packages file. Instead it's the image file as long as it matches one of the known extensions.

yes the base_file_name property resolves to the image file name for a given extension

self._base_file_name, self.image_ext = self.remote.fetch_file_name(
    self.image_name,
    self.base_regex,
    self.extensions
 )

If I don't provide an extension to the constructor of OBSImageUtil the default extension list is

extensions = [
    'vhdfixed.xz',
    'vhdfixed',
    'raw.xz',
    'tar.gz',
    'qcow2',
    'vmdk.xz'
]

and that matches images not anything else.

I could imagine that I can confuse OBSImageUtil at several places by providing an extension at construction time. But also that I think is subject to a refactoring

The method get_image_packages_metadata looks up either a .report
or a .packages file from the given download URI. The information
if present is used in mash to check on package conditions and for
the build time of the image. In case no metadata information is
present the returned package dict is empty and the object instance
stored build time stays at its initial 'unknown' value. Not
raising an exception but sending a log warning has two effects
on mash:

1. An unknown buildtime causes a download of the image in any case
2. An empty packages information causes the optionally given
   package based conditions in the mash job description to be
   not verifiable

On the positive side any URL just providing an image can then
be consumed by mash with the mentioned restrictions. At the
moment the scope of mash is limited to OBS build images and
projects which has the "withreports" flags configured. Any other
source location is not consumable by mash and this commit
would allow to increase the scope for mash.
@smarlowucf
Copy link
Collaborator

re-based against main, will merge when checks pass

@smarlowucf smarlowucf merged commit a55f8be into main Feb 28, 2024
4 checks passed
@smarlowucf smarlowucf deleted the no_fail_on_missing_metadata branch February 28, 2024 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants