New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dependency generator optimization and cleanup #698
Dependency generator optimization and cleanup #698
Conversation
No functional changes, but needed for next steps.
This doesn't change the amount of work we have to do in itself but is prerequisite for pretty much any optimization we could do.
Compile the exclude-regexes once per dependency type instead of per every file. Saves a huge amount of huffing and puffing about, but doesn't really show on wall clock as the time goes to even bigger stupidities.
Use a hash to store the attribute -> file mapping, run in dependency type, file attribute and file order.
const char *nsdep, const char *depname, | ||
rpmsenseFlags dsContext, rpmTagVal tagN) | ||
rpmsenseFlags dsContext, rpmTagVal tagN, | ||
const char *namespace, const char *cmd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new params are not added to the doc string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Urgh, yeah. Mind if I just nuke the doc string, it's just an internal helper after all...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine with me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, axed them all in a separate commit.
I was first concerned that all this "putting stuff in data structures" might create overhead. After having a deeper look this is not a problem. Storing a few more integers while we have the whole set of file names in memory will probably not kill anyone. |
Looks good. Please squash the doc string fix and push upstream! |
Oh, I left the doc string fix as a separate commit because it eliminates all the useless docstrings from rpmfc so its not really related to these. |
And yeah there is going to be some overhead from using a hash, but otoh this also eliminates a whole lot of overhead that is currently running on each file, and if we ever want to run more than one file at a time through a generator there's not a whole lot of choice. |
We need to expand these things once per file attribute, not per file. It saves a huge amount of huffing and puffing about, but doesn't show up on wall clock due to the generator script interaction being so dumb and slow. However, now we have the operation arranged in a way that would make smarter generator script running actually possible.
Out-of-date and even empty doc strings are so not helping anybody at all...
Anyway, dropped the docstring change from the "expand all macros" commit, its all in the cleanup commit instead. |
This basically turns the dependency generation order inside out: previously we ran through the file list one by one, running dependency generators for all types the file was found to match, ie in file:attr:deptype order. Now we do the exact opposite: deptype:attr:file, which allows us to optimize to only expand all those macros once per deptype / attr instead of once per file.
Sadly all that is lost in the noise of actually forking + executing those dependency generators, but this is basically a prerequisite for the far more important optimizations, such as (some day) teaching the generators to work on multiple files at once and ultimately, multiple generators in parallel.