Dependency generator optimization and cleanup #698

pmatilai · 2019-05-09T13:39:49Z

This basically turns the dependency generation order inside out. Previously we ran through the file list one by one, running the dependency generators for every type the file was found to match, i.e. in file:attr:deptype order. Now we do the exact opposite, deptype:attr:file, which lets us expand all those macros once per deptype/attr pair instead of once per file.

Sadly, all of that is lost in the noise of actually forking and executing the dependency generators, but it is a prerequisite for the far more important optimizations, such as (some day) teaching the generators to work on multiple files at once and, ultimately, running multiple generators in parallel.
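A minimal sketch of the two loop orders, to make the saving concrete. Everything here is hypothetical: `matches`, `expand_macros` and `run_generator` are made-up stand-ins that only bump counters, not rpm's actual types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes and data, not taken from rpm. */
#define NFILES 4
#define NATTRS 2
#define NDEPTYPES 3

/* matches[f][a] is nonzero when file f was classified with attribute a. */
static const int matches[NFILES][NATTRS] = {
    {1, 0}, {1, 1}, {0, 1}, {1, 0},
};

static int expansions;  /* simulated macro expansions */
static int generated;   /* simulated generator runs */

/* Pretend to expand the generator macros for one (deptype, attr) pair. */
static void expand_macros(int deptype, int attr)
{
    assert(deptype >= 0 && deptype < NDEPTYPES);
    assert(attr >= 0 && attr < NATTRS);
    expansions++;
}

/* Pretend to fork and execute a generator for one file. */
static void run_generator(int deptype, int attr, int file)
{
    (void)deptype; (void)attr; (void)file;
    generated++;
}

/* Old order: file, attr, deptype. Macros get expanded again for every file. */
int file_major(void)
{
    expansions = generated = 0;
    for (int f = 0; f < NFILES; f++)
        for (int a = 0; a < NATTRS; a++) {
            if (!matches[f][a])
                continue;
            for (int d = 0; d < NDEPTYPES; d++) {
                expand_macros(d, a);
                run_generator(d, a, f);
            }
        }
    return expansions;
}

/* New order: deptype, attr, file. Macros are expanded once per pair. */
int deptype_major(void)
{
    expansions = generated = 0;
    for (int d = 0; d < NDEPTYPES; d++)
        for (int a = 0; a < NATTRS; a++) {
            expand_macros(d, a);
            for (int f = 0; f < NFILES; f++)
                if (matches[f][a])
                    run_generator(d, a, f);
        }
    return expansions;
}

/* Number of generator runs performed by the most recent call above. */
int generator_runs(void)
{
    return generated;
}
```

Both orders run the same number of generators; only the number of expansions changes, which is consistent with the saving not showing up on the wall clock while fork + exec dominates.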

ffesti · 2019-05-14T08:24:26Z

build/rpmfc.c

-		       const char *nsdep, const char *depname,
-		       rpmsenseFlags dsContext, rpmTagVal tagN)
+		       rpmsenseFlags dsContext, rpmTagVal tagN,
+		       const char *namespace, const char *cmd)


ffesti: The new parameters are not documented in the doc string.

pmatilai: Urgh, yeah. Mind if I just nuke the doc string? It's just an internal helper after all.

ffesti: Fine with me.

pmatilai: Okay, axed them all in a separate commit.

ffesti · 2019-05-14T08:46:00Z

I was at first concerned that all this "putting stuff in data structures" might create overhead. After a deeper look, this is not a problem: storing a few more integers while we have the whole set of file names in memory will probably not kill anyone.

ffesti · 2019-05-14T08:48:54Z

Looks good. Please squash the doc string fix and push upstream!

pmatilai · 2019-05-14T08:59:45Z

Oh, I left the doc string fix as a separate commit because it eliminates all the useless docstrings from rpmfc, so it's not really related to these.

pmatilai · 2019-05-14T09:01:56Z

And yeah, there is going to be some overhead from using a hash, but on the other hand this also eliminates a whole lot of overhead that currently runs for each file, and if we ever want to run more than one file at a time through a generator, there's not a whole lot of choice.

We need to expand these things once per file attribute, not per file. It saves a huge amount of huffing and puffing about, but doesn't show up on wall clock due to the generator script interaction being so dumb and slow. However, now we have the operation arranged in a way that would make smarter generator script running actually possible.

pmatilai · 2019-05-14T09:05:57Z

Anyway, dropped the docstring change from the "expand all macros" commit; it's all in the cleanup commit instead.

pmatilai added 4 commits May 9, 2019 11:53

Refactor dependency generation logic from code to data structure

318edfb

No functional changes, but needed for next steps.
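As a rough illustration of what moving logic "from code to data structure" can look like, the sketch below keeps one table row per dependency type and drives a loop from the table instead of spelling out one call per type. The struct layout, the `tag` values and all function names are assumptions made for this example, not rpm's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row describing one dependency type; not rpm's real struct. */
struct dep_type {
    const char *name;   /* name of the dependency type */
    int tag;            /* placeholder id, not a real rpm tag value */
};

/* The per-type "logic" lives in this table rather than in repeated calls. */
static const struct dep_type dep_types[] = {
    { "provides",  1 },
    { "requires",  2 },
    { "conflicts", 3 },
    { "obsoletes", 4 },
};

size_t dep_type_count(void)
{
    return sizeof(dep_types) / sizeof(dep_types[0]);
}

/* Visit every dependency type in table order; returns the number visited. */
size_t for_each_dep_type(void (*visit)(const struct dep_type *, void *),
                         void *data)
{
    assert(visit != NULL);
    size_t n = dep_type_count();
    for (size_t i = 0; i < n; i++)
        visit(&dep_types[i], data);
    return n;
}

/* Example visitor: adds up the placeholder ids. */
void sum_tags(const struct dep_type *dt, void *data)
{
    *(int *)data += dt->tag;
}
```

Once the types are data, reordering the outer loop (as the next commit does) only means changing what iterates over the table.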

Reorder dependency generation to run per type, not by file

271ce99

This doesn't change the amount of work we have to do in itself but is prerequisite for pretty much any optimization we could do.

Optimize exclude handling in dependency generation

4bc9758

Compile the exclude-regexes once per dependency type instead of per every file. Saves a huge amount of huffing and puffing about, but doesn't really show on wall clock as the time goes to even bigger stupidities.
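The idea of compiling once and matching many times can be sketched with the POSIX regex API. This is an illustration only: `count_not_excluded` is a hypothetical helper, and the sketch does not claim to mirror how rpmfc stores or applies its exclude patterns.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Count the paths that survive an exclude pattern. The pattern is compiled
 * once, before the loop, rather than once per path.
 * Returns -1 if the pattern does not compile.
 */
int count_not_excluded(const char *exclude_pattern,
                       const char *const paths[], size_t npaths)
{
    regex_t re;
    int kept = 0;

    assert(exclude_pattern != NULL && paths != NULL);

    if (regcomp(&re, exclude_pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    for (size_t i = 0; i < npaths; i++) {
        /* regexec() returns 0 on a match, i.e. the path is excluded. */
        if (regexec(&re, paths[i], 0, NULL, 0) != 0)
            kept++;
    }

    regfree(&re);
    return kept;
}
```

With the deptype-major loop order, the `regcomp()` call naturally sits outside the per-file loop, which is what makes this hoisting straightforward.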

Reorder dependency generator to run by type, attr, file

96a2484

Use a hash to store the attribute -> file mapping, run in dependency type, file attribute and file order.
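A minimal sketch of an attribute -> file mapping. To stay short it uses a plain array of growable lists indexed by attribute id in place of a real hash table, so it shows the grouping idea rather than the data structure the commit actually uses. `NATTRS`, `struct attr_map` and the `attr_map_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

#define NATTRS 3   /* hypothetical number of file attributes */

/* Growable list of file indexes that share one attribute. */
struct file_list {
    int *files;
    size_t count;
    size_t cap;
};

/* attr -> files mapping, indexed directly by attribute id. */
struct attr_map {
    struct file_list by_attr[NATTRS];
};

/* Record that file index fx carries attribute attr. Returns 0 on success. */
int attr_map_add(struct attr_map *m, int attr, int fx)
{
    assert(m != NULL && attr >= 0 && attr < NATTRS);
    struct file_list *l = &m->by_attr[attr];

    if (l->count == l->cap) {
        size_t ncap = l->cap ? l->cap * 2 : 4;
        int *nf = realloc(l->files, ncap * sizeof(*nf));
        if (nf == NULL)
            return -1;
        l->files = nf;
        l->cap = ncap;
    }
    l->files[l->count++] = fx;
    return 0;
}

/* Number of files recorded for one attribute. */
size_t attr_map_count(const struct attr_map *m, int attr)
{
    assert(m != NULL && attr >= 0 && attr < NATTRS);
    return m->by_attr[attr].count;
}

/* Release all per-attribute lists. */
void attr_map_free(struct attr_map *m)
{
    for (int a = 0; a < NATTRS; a++)
        free(m->by_attr[a].files);
}
```

Files are appended in the order they are seen, so iterating one attribute's list preserves file order, matching the "dependency type, file attribute and file order" described in the commit message.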

pmatilai added the cleanup label May 9, 2019

ffesti reviewed May 14, 2019

pmatilai added 2 commits May 14, 2019 12:04

Nuke useless doxygen strings from rpmfc internals

a5da6ca

Out-of-date and even empty doc strings are so not helping anybody at all...

pmatilai force-pushed the rpmfc-sanity-pr branch from ce97d76 to a5da6ca May 14, 2019 09:05

pmatilai merged commit 8deb9bb into rpm-software-management:master May 14, 2019

pmatilai deleted the rpmfc-sanity-pr branch May 14, 2019 09:28
