DM-17611: performance optimizations for data ID manipulations #126

TallJimbo · 2019-02-02T02:43:25Z

No description provided.

Callers of toNameSet didn't actually care that they got a DimensionNameSet back; they just wanted something with .names, and frequently the object they passed in would have already qualified. In that case, we now just return that object directly, and toNameSet has been renamed to conformSet to reflect that new behavior.
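For readers who want to see the pass-through idea in isolation, here is a minimal sketch. It is not the code from this PR: `NameSet` and `conform_set` are hypothetical stand-ins for `DimensionNameSet` and `conformSet`, and the only behavior assumed is what the message above states (anything that already has `.names` is returned as-is).

```python
class NameSet:
    """Hypothetical stand-in for DimensionNameSet: wraps a frozenset of names."""

    def __init__(self, names):
        self.names = frozenset(names)


def conform_set(obj):
    """Return something with a ``.names`` attribute, reusing ``obj`` when possible."""
    if hasattr(obj, "names"):
        # Already qualifies: hand it back unchanged, with no copy or conversion.
        return obj
    # Otherwise assume obj is an iterable of names and wrap it.
    return NameSet(obj)


existing = NameSet(["visit", "detector"])
assert conform_set(existing) is existing
assert conform_set(["visit", "detector"]).names == existing.names
```

The saving comes from the first branch: callers that already hold a conforming object pay for one attribute check instead of building a new set.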

Computing the full set of DimensionElement dependencies repeatedly for every data ID with a certain set of keys was wasting a lot of time, so it is now done up-front.

Making universe DimensionGraphs their own subclass separates the operations unique to DimensionGraph construction and makes them a bit easier to follow and control. This will be useful for later commits that try to limit the time spent in DimensionGraph construction via caching.

We only need the per-DatasetType data IDs expanded, because no one ever looks at the extra metadata associated with the row-wide one. In the future we should investigate whether we can get rid of the row-wide data ID entirely, but it does currently appear to be used in pipe.base.GraphBuilder.

TallJimbo · 2019-02-02T02:44:25Z

This change was reviewed on DM-17496; I'm just moving it here to separate it from unrelated ongoing work on that ticket.

TallJimbo added 7 commits February 1, 2019 21:42

Compute the full set of DimensionElement dependencies up-front.

14c5828

Doing this repeatedly for every data ID with a certain set of keys was wasting a lot of time.
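As a rough illustration of computing dependencies once rather than per data ID, the sketch below expands a table of direct dependencies into full (transitive) sets in a single pass. The function name, the dict-of-lists input, and the toy element names are all assumptions made for this example, and it assumes the dependency relation has no cycles.

```python
def expand_all_dependencies(direct):
    """Map each element name to the full (transitive) set of names it depends on.

    ``direct`` maps a name to the names it depends on directly.
    Assumes there are no dependency cycles.
    """
    full = {}

    def resolve(name):
        if name in full:
            # Already expanded earlier in this pass; reuse the result.
            return full[name]
        result = set()
        for dep in direct.get(name, ()):
            result.add(dep)
            result |= resolve(dep)
        full[name] = frozenset(result)
        return full[name]

    for name in direct:
        resolve(name)
    return full


# Toy input: "c" needs "b", and "b" needs "a".
full = expand_all_dependencies({"c": ["b"], "b": ["a"], "a": []})
assert full["c"] == frozenset({"a", "b"})
```

After this runs once, looking up the dependencies for any key is a plain dictionary access.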

Cache DimensionSet.links.

2799326
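One common way to cache a derived attribute like this is to compute it on first access and store it on the instance. The sketch below shows that pattern with `functools.cached_property`. `ElementSet` is a hypothetical class, and it assumes `links` is derived purely from the set's members and never changes after construction. How the commit itself stores the cached value is not shown here.

```python
from functools import cached_property


class ElementSet:
    """Hypothetical immutable set of elements, each mapped to its link names."""

    def __init__(self, links_by_element):
        self._links_by_element = dict(links_by_element)
        self.compute_count = 0  # present only to make the caching visible

    @cached_property
    def links(self):
        # Runs on first access; the result is then stored on the instance.
        self.compute_count += 1
        return frozenset(
            link
            for links in self._links_by_element.values()
            for link in links
        )


s = ElementSet({"x": ["p", "q"], "y": ["q"]})
first = s.links
second = s.links
assert first is second
assert s.compute_count == 1
```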

Move assertion after short-cut in recursion.

7454fb2

Make universe DimensionGraphs their own subclass.

aae512a

This separates the operations unique to DimensionGraph construction and makes them a bit easier to follow and control. This will be useful for later commits that try to limit the time spent in DimensionGraph construction via caching.

Add cache to DimensionGraph construction.

5187f0e
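A construction cache of this kind can be sketched as a dictionary keyed on the set of requested names, so that asking for the same combination twice returns the same object. Everything below (`Universe`, `Graph`, `extract`, and the `builds` counter) is hypothetical and only illustrates the general technique, not the implementation in this commit.

```python
class Graph:
    """Hypothetical stand-in for a constructed graph of dimensions."""

    def __init__(self, names):
        self.names = frozenset(names)


class Universe:
    """Hypothetical owner of the cache; stands in for a universe graph."""

    def __init__(self):
        self._cache = {}
        self.builds = 0  # counts real constructions, for demonstration

    def extract(self, names):
        # Order and duplicates in ``names`` do not matter for the key.
        key = frozenset(names)
        graph = self._cache.get(key)
        if graph is None:
            self.builds += 1
            graph = Graph(key)
            self._cache[key] = graph
        return graph


u = Universe()
g1 = u.extract(["a", "b"])
g2 = u.extract(("b", "a"))
assert g1 is g2
assert u.builds == 1
```

Keying on a `frozenset` means the cache hit rate depends only on which names are requested, which fits the case of many data IDs sharing the same set of keys.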

TallJimbo merged commit 5d5c9dd into master Feb 2, 2019

TallJimbo deleted the tickets/DM-17611 branch February 2, 2019 02:46
