[WIP] RelationshipMixin: Improve performance of Relationships parent #10160

Closed
wants to merge 1 commit into from

@kbrock kbrock commented Jul 31, 2016

Extracted #10299 (now MERGED); was waiting on that merge to decide what would be best here.

Theme:

Speed up Capture.perf_capture_timer (142s)
by speeding up MiqAlert.target_needs_realtime_capture? (105s)
by speeding up "detection if a resource has an alert" (105s)
by speeding up "get tags for a resource and its parents (including blue folder)" (100s)
by speeding up "get parents for a resource" (~50% time, 80% queries)

Goal of this PR:

Speed up the time it takes to look up a VM's parents by making it quicker to query ancestors/relationships. This speeds up parent_blue_folder and parent_resource_pool.

https://bugzilla.redhat.com/show_bug.cgi?id=1346988
https://bugzilla.redhat.com/show_bug.cgi?id=1346999

TL;DR

19% faster, 26% fewer objects, 30% fewer queries, 17% less query time, 30% fewer rows returned.

before

| ms | bytes | objects | queries | query (ms) | rows | comments |
| ---: | ---: | ---: | ---: | ---: | ---: | :--- |
| 139,250.5 | 439,519,495* | 146,637,247 | 82,206 | 44,212.9 | 101,046 | before |
| 1,773.5 | | | 11 | 863.0 | 11,045 | ...Metric::Targets.capture_infra_targets |
| 444.8 | | | 202 | 134.9 | | ..Metric::Capture.calc_targets_by_rollup_parent |
| 106.5 | | | 80 | 21.0 | | ..Metric::Capture.calc_tasks_by_rollup_parent |
| 101,671.1 | | | 60,223 | 29,461.1 | 90,000 | ..Metric::Capture.filter_perf_capture_now |
| 7,350.9 | | | 5,221 | 5,908.7 | 10,000 | ...SELECT "taggings"."id" AS t0_r0, "taggings"."taggable_id" AS t0_r1, "taggings"."tag_id" AS t0_r2, "ta |
| 4,896.8 | | | 10,000 | 3,701.4 | 25,000 | ...SELECT "relationships".* |
| 10,203.7 | | | 30,000 | 6,794.4 | 30,000 | ...SELECT "relationships".* |
| 4,701.8 | | | 10,000 | 3,002.7 | 20,000 | ...SELECT "ems_folders".* |
| 2,307.0 | | | 5,000 | 1,666.4 | 5,000 | ...SELECT "resource_pools".* |
| 35,133.3 | | | 21,684 | 13,716.2 | | ..Metric::Capture.queue_captures |

\* Memory usage does not reflect 71,630,466 freed objects.

after

| ms | bytes | objects | queries | query (ms) | rows | comments |
| ---: | ---: | ---: | ---: | ---: | ---: | :--- |
| 112,703.9 | 245,611,833* | 108,651,445 | 57,206 | 36,877.4 | 71,046 | after |
| 1,828.7 | | | 11 | 877.2 | 11,045 | ...Metric::Targets.capture_infra_targets |
| 441.7 | | | 202 | 132.4 | | ..Metric::Capture.calc_targets_by_rollup_parent |
| 112.1 | | | 80 | 24.1 | | ..Metric::Capture.calc_tasks_by_rollup_parent |
| 74,396.0 | | | 35,223 | 22,228.2 | 60,000 | ..Metric::Capture.filter_perf_capture_now |
| 6,055.6 | | | 5,221 | 3,984.9 | 10,000 | ...SELECT "taggings"."id" AS t0_r0, "taggings"."taggable_id" AS t0_r1, "taggings"."tag_id" AS t0_r2, "ta |
| 3,879.3 | | | 10,000 | 2,511.9 | 20,000 | ...SELECT "relationships"."ancestry" |
| 6,282.9 | | | 10,000 | 5,148.0 | 10,000 | ...SELECT "relationships".* |
| 3,627.2 | | | 5,000 | 2,496.6 | 15,000 | ...SELECT "ems_folders".* |
| 2,382.3 | | | 5,000 | 1,719.6 | 5,000 | ...SELECT "resource_pools".* |
| 35,785.4 | | | 21,684 | 13,598.4 | | ..Metric::Capture.queue_captures |

\* Memory usage does not reflect 52,462,392 freed objects.
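
For anyone who wants to re-derive the TL;DR percentages, here is a minimal sketch that computes the relative reduction for each top-level total in the before and after rows. The helper name `pct_reduction` and the `TOTALS` constant are illustrative only and are not part of the PR.

```ruby
# Percentage reduction from before to after, rounded to one decimal place.
def pct_reduction(before, after)
  ((before - after) / before.to_f * 100).round(1)
end

# Top-level totals copied from the "before" and "after" rows: [before, after].
TOTALS = {
  "ms"         => [139_250.5, 112_703.9],
  "objects"    => [146_637_247, 108_651_445],
  "queries"    => [82_206, 57_206],
  "query (ms)" => [44_212.9, 36_877.4],
  "rows"       => [101_046, 71_046],
}.freeze

REDUCTIONS = TOTALS.transform_values { |(before, after)| pct_reduction(before, after) }

REDUCTIONS.each { |name, pct| puts format("%-10s %5.1f%% lower", name, pct) }
```

Run against these totals, wall time drops by about 19%, objects by about 26%, query count and rows by about 30% each, and query time by about 17%.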

I tried to cut the cruft out of the tables. Let me know if you need more info.

  relationships.reject { |r| r.filtered?(of_type, except_type) }
else
  relationships = relationships.where(:resource_type => of_type) if of_type.present?
  relationships = relationships.where.not(:resource_type => except_type) if except_type.present?
Member

Minor, but since we have a separate method for filtered?, it feels like we should have a separate method for this scoping...perhaps a .filtered class method that does these two lines.

Member Author

The purpose of this method filter_by_resource_type is to filter on these 2 fields.

You're proposing a method that just does the in memory part?

Of note, relationships.reject { |r| r.filtered?(of_type, except_type) } is only used in this one spot.
Also of note, :except_type is only ever used in one spot in our code.

Member

I'm proposing a method that is just the scopes part... This method is essentially just a switch between scopes vs Array.

Member

Since it's a method that's just scopes, that in itself would make it a scope-method (i.e. a method that returns an AR::Relation)
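
To make the "switch between scopes vs Array" point concrete, here is a minimal plain-Ruby sketch. It is not the PR's code: `Rel` and `StubScope` are hypothetical stand-ins for a Relationship row and an ActiveRecord relation, the `filtered` method on `StubScope` only imitates what a real `.filtered` scope (chaining `where` / `where.not`) would return, and the body of `filtered?` is an assumption about its behavior.

```ruby
# Hypothetical stand-in for a Relationship row; only resource_type matters here.
Rel = Struct.new(:resource_type) do
  # True when this row should be dropped by the of_type / except_type filters.
  # (Assumed behavior, not copied from the real model.)
  def filtered?(of_type, except_type)
    (!of_type.empty? && !of_type.include?(resource_type)) ||
      except_type.include?(resource_type)
  end
end

# Hypothetical stand-in for an ActiveRecord relation that exposes a `filtered` scope.
class StubScope
  def initialize(rows)
    @rows = rows
  end

  # A real scope would chain where / where.not and stay lazy; this one filters eagerly.
  def filtered(of_type, except_type)
    StubScope.new(@rows.reject { |r| r.filtered?(of_type, except_type) })
  end

  def to_a
    @rows.dup
  end
end

# The method under discussion: pick the in-memory path for Arrays,
# and delegate to the scope for everything else.
def filter_by_resource_type(relationships, of_type: nil, except_type: nil)
  of_type = Array(of_type)
  except_type = Array(except_type)
  return relationships if of_type.empty? && except_type.empty?

  if relationships.kind_of?(Array)
    relationships.reject { |r| r.filtered?(of_type, except_type) }
  else
    relationships.filtered(of_type, except_type)
  end
end

ROWS = [Rel.new("EmsFolder"), Rel.new("ResourcePool"), Rel.new("Host")].freeze
```

With the scope logic behind one name, the caller only decides which path to take, which is the separation being asked for in this thread.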

Member Author

@kbrock kbrock Aug 5, 2016

Moved. Not sure why it isn't showing up in this commit; it is showing up correctly in the main PR. See commit "relationships: use scope to simplify filtered".

Fryguy commented Aug 1, 2016

Can you expand on the OP description a bit? Those numbers are meaningless to me without context (e.g., is that 142s total time, or 142s faster, and what percentage is that?).

Do you have some benchmarks for "Speed[ing] up the time it takes to lookup a vm's parents"?

Fryguy commented Aug 1, 2016

Overall code-wise looks good, but would like some benchmarks to justify it better.

Fryguy commented Aug 1, 2016

Thanks for the numbers @kbrock . Sounds good. Only thing left is my comment on separate method and getting the tests to go green

kbrock commented Aug 2, 2016

@Fryguy Added scopes and cleared up a dumb bug I introduced while fixing up cops.

@kbrock kbrock added the bug label Aug 2, 2016

Fryguy commented Aug 2, 2016

@kbrock I see you made a lot more changes after I gave a 👍, and started moving more stuff around...the code was good as is, but now I have to re-review in light of all the new changes...were they necessary to move for this PR?

kbrock commented Aug 2, 2016

@Fryguy I'm sorry. The only changes that I think I made are related to your comments:

  1. added a filtered scope per your suggestion
  2. removed filtered helper method from specs per your suggestion
  3. removed assignment_mixin.rb flat_map refactor that was unrelated, unnecessary, and was breaking the tests. => cap&u assignment: use flat_map #10211

I didn't mean to introduce any other changes.

UPDATE:
aah: The last commit with parent_blue_folders is new. Without that, the performance numbers looked paltry.

kbrock commented Aug 5, 2016

testing:

I ran both the old method and the new method for >10,000 VMs, and they returned the same value (not conclusive, but a little reassuring).
Though the VMs probably had the same parent lineage anyway.
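
A check like this can be sketched as a small comparison harness. The version below is an assumption about how such a run might look, not the script that was actually used: `compare_implementations`, the `VMS` data, and both lambdas are hypothetical. It also counts distinct results, which speaks to the concern that many VMs share one lineage.

```ruby
# Compare two implementations over the same records.
# Returns the records where they disagree and how many distinct answers were seen.
def compare_implementations(records, old_impl, new_impl)
  mismatches = []
  seen = {}
  records.each do |record|
    expected = old_impl.call(record)
    actual = new_impl.call(record)
    seen[expected] = true
    mismatches << record unless expected == actual
  end
  { :mismatches => mismatches, :distinct_results => seen.size }
end

# Hypothetical stand-ins: each "VM" is just a hash carrying its folder lineage.
VMS = [
  { :name => "vm1", :lineage => %w(dc1 vm prod) },
  { :name => "vm2", :lineage => %w(dc1 vm prod) },
  { :name => "vm3", :lineage => %w(dc1 vm test) },
].freeze

OLD_PARENT = ->(vm) { vm[:lineage].last }
NEW_PARENT = ->(vm) { vm[:lineage][-1] }

REPORT = compare_implementations(VMS, OLD_PARENT, NEW_PARENT)
puts "mismatches: #{REPORT[:mismatches].size}, distinct results: #{REPORT[:distinct_results]}"
```

A low distinct-result count relative to the number of records would suggest the run exercised only a few lineages, so agreement is weaker evidence than the record count implies.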

@@ -166,6 +166,23 @@ def folder_path_objs(*args)
folders
end

# similar to folder_path_objs, but takes in a scope
# @param folder [Relation] relation pointin to ems folder of interest
Member

Typo: pointin => pointing. Also, prefer EmsFolder over ems folder.

Fryguy commented Aug 5, 2016

@kbrock If you remove the parent_blue_folder commit, then this would be fine to merge...that particular commit has issues that could probably be done in a separate PR.

I assume the numbers in the OP are without that commit? If so, they seemed good to me without that parent_blue_folder commit.

# from this id, relationship records can be brought back and mapped to the resource of interest
# NOTE: parent_id is read from ancestry field, so it is not a db hit, but parent is an N+1 db hit.
def relationship_parent_ids
relationships.where.not(:ancestry => [nil, ""]).select(:ancestry).collect(&:parent_id)
Member

I just realized...does this handle relationship_type filtering?
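
As background for the NOTE in the snippet above, here is a minimal sketch of why reading `parent_id` needs no extra query. It assumes the materialized-path format used by the ancestry gem, where the column holds ancestor ids joined by "/", root first. The helper names are hypothetical, and the sketch does not address the relationship_type question.

```ruby
# Assumption: ancestry is stored as ancestor ids joined by "/", root first,
# e.g. "1/4/9" means root 1, then 4, then direct parent 9.
def ancestor_ids_from(ancestry)
  ancestry.to_s.split("/").map { |id| Integer(id) }
end

# The direct parent is the last id in the path; roots have no parent.
def parent_id_from(ancestry)
  ancestor_ids_from(ancestry).last
end

ANCESTRIES = ["1/4/9", "1", nil, ""].freeze

# Mirrors where.not(:ancestry => [nil, ""]): rows without a parent are skipped.
PARENT_IDS = ANCESTRIES.reject { |a| a.nil? || a.empty? }.map { |a| parent_id_from(a) }
```

Because the id is parsed out of a string that is already loaded, only fetching the parent record itself costs a database hit.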

kbrock commented Aug 9, 2016

The majority of the meat has been extracted.
What remains is moving parent_blue_folder over to using scopes. This reduces 8 queries (12 rows) to 3 queries (6 rows) and shows most of the benefit.

(This would be great in darga, but it is not as pressing a need now that C&U is running well. If you do want this performance boost, we'll also want #10299.)

kbrock commented Aug 17, 2016

@Fryguy is there any legwork I can do on my side to make this easier to merge?
Or anything to extract to a separate PR?

thnx

@kbrock kbrock changed the title Improve performance of Relationships parent RelationshipMixin: Improve performance of Relationships parent Aug 17, 2016
@@ -135,6 +135,13 @@ def parent_ids(*args)
Relationship.resource_pairs(parent_rels(*args))
end

# return a scope of all of the parents of this record (must :of_type with a single parameter)
Member

If of_type is required or single parameter, can you check and raise an ArgumentError (instead of the comment?)
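
One way to read this request is a guard clause at the top of the method. The sketch below is only an illustration; `single_of_type!` is a hypothetical helper name and the error message wording is made up.

```ruby
# Hypothetical helper: returns the single requested type, or raises if
# :of_type is missing or names more than one type.
def single_of_type!(options)
  types = Array(options[:of_type])
  unless types.size == 1
    raise ArgumentError, "expected :of_type with exactly one type, got #{options[:of_type].inspect}"
  end
  types.first
end
```

For example, `single_of_type!(:of_type => "EmsFolder")` returns `"EmsFolder"`, while omitting `:of_type` or passing two types raises, so the constraint is enforced rather than only documented.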

@kbrock kbrock force-pushed the relationships_prune branch 2 times, most recently from 395d824 to 72a3b19 Compare August 31, 2016 03:23
folders = folders[1..-1] if options[:exclude_root_folder]
folders = folders.reject(&:hidden?) if options[:exclude_non_display_folders]
folders
self.class.folder_options path(:of_type => "EmsFolder"), options
Member

I'm surprised the bot is not picking up on the lack of parens on the method call (i.e. please use parens).
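
For reference, a sketch of what the extracted helper and a parenthesized call might look like. This is a reconstruction from the three removed lines in the diff, not the PR's actual code: `Folder` is a hypothetical stand-in, and `drop(1)` is swapped in for `[1..-1]` so that an empty path stays an Array instead of becoming nil.

```ruby
# Hypothetical stand-in for an EmsFolder with a hidden flag.
Folder = Struct.new(:name, :hidden) do
  def hidden?
    hidden
  end
end

# Apply the display options to an ordered path of folders (root first).
def folder_options(folders, options = {})
  folders = folders.drop(1) if options[:exclude_root_folder]
  folders = folders.reject(&:hidden?) if options[:exclude_non_display_folders]
  folders
end

PATH = [
  Folder.new("Datacenters", true),
  Folder.new("dc1", false),
  Folder.new("vm", true),
  Folder.new("prod", false),
].freeze

# Parenthesized call, as requested in the review.
NAMES = folder_options(PATH, :exclude_root_folder => true, :exclude_non_display_folders => true).map(&:name)
```

In the mixin itself, the equivalent parenthesized form of the added line would be `self.class.folder_options(path(:of_type => "EmsFolder"), options)`.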

Fryguy commented Sep 9, 2016

@kbrock I would like to sit down with you to understand this one, like we did last time. I liked that last time we came to a nice, elegant answer that covered all use cases, and I have a feeling the same thing will happen here.

kbrock commented Nov 30, 2016

WIP until I rebase and produce numbers again

@kbrock kbrock added the wip label Nov 30, 2016
@kbrock kbrock changed the title RelationshipMixin: Improve performance of Relationships parent [WIP]RelationshipMixin: Improve performance of Relationships parent Nov 30, 2016
@chessbyte chessbyte changed the title [WIP]RelationshipMixin: Improve performance of Relationships parent [WIP] RelationshipMixin: Improve performance of Relationships parent Feb 3, 2017
@chessbyte

@kbrock @Fryguy any progress on this one?

Using scopes allows us to cut down on the number of queries necessary to
get the parent_blue_folders. (8 -> 3)

miq-bot commented Feb 17, 2017

Checked commit kbrock@38337a1 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
3 files checked, 0 offenses detected
Everything looks good. 🍰

@miq-bot miq-bot closed this Oct 14, 2017

miq-bot commented Oct 14, 2017

This pull request has been automatically closed because it has not been updated for at least 6 months.

Feel free to reopen this pull request if these changes are still valid.

Thank you for all your contributions!
