Support reading critical path information from local JSON files #4755
Conversation
(force-pushed from cd06e3e to c521ea7)
Test failure looks unrelated (the F35 integration tests ran into an authentication prompt in unrelated code).
(force-pushed from f0d9192 to 8a12d75)
Looks good, just two pedantic comments from me:
I'm also thinking about dropping
Thanks, I'll look at the first note in a bit. My intent for how this could be used is that we could run a cron job (or systemd timer, whatever) on the Bodhi backend server that runs the critpath generation script every 24 hours and drops the output in the appropriate location. I guess the script could also restart Bodhi, but is it a great idea to do that every 24 hours? It might cause unexpected effects for anyone who happens to be using Bodhi at that moment, after all.

So I did kinda intend to drop the caching: it makes the code simpler, it avoids potential weird caching issues (caching always seems to have weird issues), and since opening and reading a local JSON file is a fairly trivial operation that AFAICS only happens when an update is created or edited, caching doesn't really seem worthwhile, at least to me.

I guess ideally, in future, I'd like to drop the other two critpath 'backends', which would allow this code to be much simpler. While the PDC backend is still around you can say "well, it's certainly sensible to cache that, and if we're caching that one we may as well cache the others", but since we're trying to drop PDC anyway, I'm kinda expecting we'll drop that code in future.
and yeah, if there's a stdlib implementation of the caching, I certainly agree with switching to that as long as we actually are doing caching, but that could probably be done as a separate PR.
I almost forgot: can you add a note to be added to the release notes?
Technical note on the caching: after just the first commit in this series, the JSON path is cached, because at that point, caching is still applied to the umbrella |
This adds another valid setting for critpath.type, 'json', which means to read the critpath information from JSON files. These are expected to be produced by the `critpath.py` script from the releng repo (which I recently updated to output this data). An additional setting, `critpath.jsonpath`, tells Bodhi where to find the files. The files must be named for each Release's 'branch' value, which is the name of the dist-git branch for that release.

Ultimately this has two benefits: it lets us stop using PDC for this (we can just have a cron job run `critpath.py` nightly or whatever), which is one step towards not needing PDC any more and avoids unnecessary network traffic and server load, and it allows us to do critpath stuff by group. The JSON data as output by `critpath.py` is grouped. This commit doesn't take advantage of that yet - it flattens the data into a single big list when reading it - but follow-up commits will add additional code paths to determine which groups a given update is critpath for.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
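The reading-and-flattening step described in this commit message could look roughly like the sketch below. This is not the actual Bodhi code: the function name, and the assumed schema (a top-level mapping of group name to a list of component names), are illustrative assumptions; only the per-branch file naming and the flatten-to-one-list behaviour come from the commit message.

```python
import json
from pathlib import Path


def get_critpath_components_from_json(branch, jsonpath):
    """Read the grouped critpath file for a release and flatten it.

    Sketch only: the exact schema critpath.py emits is an assumption
    here (a top-level mapping of group name -> list of components).
    """
    # Files are named for each Release's dist-git branch, e.g. "f38.json"
    fpath = Path(jsonpath) / f"{branch}.json"
    with open(fpath, encoding="utf-8") as fh:
        groups = json.load(fh)
    # Flatten the grouped data into a single de-duplicated list, as this
    # first commit does before the per-group follow-up commits
    components = set()
    for members in groups.values():
        components.update(members)
    return sorted(components)
```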
This moves the `@memoized` decorator to the function for getting critpath from PDC. The effect is that we will not cache the critpath when retrieving it from a local source (JSON files or configuration). There's no real point in re-reading it from the configuration on every run of the function, but it also doesn't hurt much - it's probably no more resource-intensive than running through the `@memoized` decorator anyway.

There is a benefit to re-reading the JSON file on every run: it allows us to update the critical path regularly without restarting Bodhi. This function is called only when creating or editing updates, so I don't think we're going to overwhelm servers with disk reads by doing this.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
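The cache-only-the-PDC-path shape this commit describes can be sketched as follows. All names here are hypothetical stand-ins, and `functools.lru_cache` stands in for Bodhi's `@memoized` decorator; the call counters exist only to make the caching behaviour observable.

```python
from functools import lru_cache

# Call counters so the caching behaviour is observable in this sketch
CALLS = {"pdc": 0, "json": 0}


@lru_cache(maxsize=None)  # stdlib stand-in for Bodhi's @memoized (assumption)
def _components_from_pdc(branch):
    # In the real code this is a network round-trip to PDC, so caching
    # the result per-process is worthwhile
    CALLS["pdc"] += 1
    return ("glibc", "systemd")


def _components_from_json(branch):
    # No caching: re-read on every call, so a JSON file refreshed by a
    # cron job takes effect without restarting Bodhi
    CALLS["json"] += 1
    return ("glibc", "systemd")


def get_critpath_components(branch, critpath_type):
    # Only the PDC backend goes through the cached helper
    if critpath_type == "pdc":
        return _components_from_pdc(branch)
    return _components_from_json(branch)
```

Repeated calls hit PDC once but re-read the local source every time, which is exactly the trade-off the commit message argues for.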
(force-pushed from 8a12d75 to f7bd086)
Rebased, dropped the unnecessary
So if you really really want to keep the caching, we just drop the second commit.
I was just thinking that having the JSON file opened and parsed for every update creation is not really efficient, since the critpath list is rarely updated. Maybe the cron job used to update the critpath files could also fire `cache_clear()`.
However, this PR is fine as it is; I'll try to look at porting the cache from memoize to functools in another PR.
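The stdlib approach floated here could look something like the sketch below. The function name and file handling are assumptions, not Bodhi's actual code; the point is that `functools.lru_cache` replaces memoize, and whatever process rewrites the JSON files (the cron job, or something it triggers) calls `cache_clear()` so the next update creation re-reads from disk.

```python
import json
from functools import lru_cache


@lru_cache(maxsize=None)  # hypothetical stdlib replacement for memoize
def read_critpath(fpath):
    # Parsed once per path, then served from the in-process cache
    with open(fpath, encoding="utf-8") as fh:
        return json.load(fh)


# After regenerating the JSON files, the refresh job would invalidate
# the cache so stale data is not served until the next restart:
#     read_critpath.cache_clear()
```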
I agree it's not super efficient, but practically speaking, I don't think we create / edit updates often enough that the trivial amount of resources needed to open and parse a JSON file each time is really worth worrying about :) It's just intuition, though. I guess if we actually do notice increased resource usage in production after deploying the change, we can always revisit it.
I have follow-ups to this PR to actually do interesting things with the grouped data, but that will be a rather bigger change (including db schema change) and this is useful in itself, so I figured I'd submit it first.
Doing this would give us two significant things: we could potentially stop using PDC for critical path information (which is one of the things blocking PDC's retirement), and we wouldn't have to restart Bodhi for changes to the critical path to take effect. To implement this, I'm imagining we'd write a trivial cron job that runs the `critpath.py` script and outputs the data with the right filenames every day.

I did write this to theoretically support non-RPM component types, like the existing code does, but I don't think we actually generate critical path data for anything besides RPMs in the main release collections right now, unless someone knows otherwise.
@nirik @smooge for info.