Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check manifest to check index report #1131

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

iainduncani
Copy link
Contributor

@iainduncani iainduncani commented Oct 31, 2023

This is a proposed solution to #1124. This is just a draft as I would like feedback as to whether this is the right approach and will add tests if this is along the right lines. The PR does two things:

  1. Removes the retry logic from the Indexer's controller. The retry logic never actually triggered a retry, any request that may timeout also returned the Terminal next state which is processed after the if retry section to break out of the for loop and exit. Having the retry in there meant that empty index reports could be persisted if a query timed out in checkManifest (see the issue description). There was then two options

    1. Fix the retry logic
    2. Delete the retry logic

    I opted for deleting the retry logic as it simplified the code. If you would prefer the retry logic was fixed then I think it would just need a continue adding to the loop and fixing the persisting of an empty manifest when this state occurs (probably with a retry check when calling SetIndexReport). I don't know if consideration would also want to be given to how many times a retry happened to not get stuck in this state. All these thoughts made me think simpler code was better code so deleted it!

  2. If all Scanners complete for a manifest then the index report should be persisted in the IndexFinished state with all the required information in it for querying. If it does not end up in this state it will not be possible to query the manifest but the current implementation of checkManifest would also not allow for a rescan. I have updated checkManifest to make sure that the manifest is in IndexFinished state and if not it will reprocess the whole image from scratch. My main question with this part of the change is whether FetchLayers is the right state to go into. By the time we have check the index report we know all the Scanners have database information for them so would we be better skipping to Coalesce so that we do not need to pull the image again? I opted for FetchLayers for two reasons, both of which I suspect someone more familiar with the code may disagree with:

    1. We don't necessarily know when the index report did not end up in the right state, I felt FetchLayers was the safest option for handling an unexpected state but this only holds true if there was any situation where ManifestScanned returns true but not everything is in the database.
    2. I was worried there may be information required in Coalesce that is only set in previous states rather than being in the database.
    3. I also worry that this could cause excessive amounts of work for when there is a genuine reason a manifest could not be scanned, for instance if it has invalid contents that cannot be scanned by Clair. This PR does not do anything to differentiate between genuine errors and accidental errors so could trigger lots of work.

Would love the thoughts of someone familiar with this code!

The retry logic in the indexer's controller is faulty in that every action that may return a timeout error (the entry path to set retry to true) will also return a Terminal state along with the timeout error.  Therefore after pausing briefly in the retry statement it will then break due to the next state being Terminal. I think the correct way to fix this is to just remove the rety logic as it can also cause an empty index report to be written to the database and simplifying the code seems to be the best approach.

Signed-off-by: Iain Duncan <iain.duncan@uk.ibm.com>
If a manifest has scanned with all it's scanners it _should_ end up with a completed index report in the IndexFinished state.  If it does not then it will not be possible to query this manifest.  To fix an incomplete index report it will need rescanning so trigger a rescan.

Signed-off-by: Iain Duncan <iain.duncan@uk.ibm.com>
@iainduncani iainduncani force-pushed the checkManifestToCheckIndexReport branch from 6e80b61 to 8af9f43 Compare October 31, 2023 11:48
@paulaldridge
Copy link
Contributor

Would really love to see this improvement. Today after an upgrade problem we have a load of manifests with a committed index report with State: IndexError, and success: false. But resubmitting the manifest for another scan attempt just goes down the Manifest already scanned path, so they are stuck with bad index reports.

@hdonnay @crozzy any chance of getting this prioritised at some point? Thanks :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants