Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #30053 - Audit page render slow when organization is set #7728

Merged
merged 1 commit into from Mar 31, 2021

Conversation

hao-yu
Copy link
Member

@hao-yu hao-yu commented Jun 8, 2020

The audit page taking long time to load when there are many audit logs and organization is set. This page fixes the issue

@theforeman-bot
Copy link
Member

Issues: #30053

Copy link
Member

@kgaikwad kgaikwad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code wise looks good to me.
I haven't tested it but added small inline comment.

app/models/concerns/taxonomix.rb Outdated Show resolved Hide resolved
@hao-yu
Copy link
Member Author

hao-yu commented Jul 6, 2020

any progress on this?

@ezr-ondrej ezr-ondrej self-assigned this Jul 7, 2020
Copy link
Member

@ezr-ondrej ezr-ondrej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @hao-yu I think this is really great change, I'd like to be able to do taxonomies through arel.
That being said, I think we are setting up an API for that not just for Audits, I've left comments what I think could be done bit differently, but I might be wrong and I am open to have a discussion about this.
Wouldn't you mind to have a brief meeting with me and @kgaikwad about this to get us understand it better and thus get it in faster?

clauses << loc_join_arel[:taxonomy_id].in(user_locations)
end

nodes = clauses.any? ? clauses.reduce(:and) : nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's bit weird naming for it, could we come up with something more descriptive?

Suggested change
nodes = clauses.any? ? clauses.reduce(:and) : nil
statement = clauses.any? ? clauses.reduce(:and) : nil

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure. naming is very hard sometimes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

end

def join_taxonomies(taxonomies, join_name)
joins("LEFT JOIN #{TAXONOMY_JOIN_TABLE} AS #{join_name} "\
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we do this through arel as well? I'd rather not mix arel and pure sql if possible and this joins should not be as hard, right? We could pass in the arel table (org_join_arel or loc_join_arel)
I think if we do it through arel, the Rails have it easier to figure wheter the join is already used in a scope or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup. can do. shouldn't be an issue

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

app/models/concerns/taxonomix.rb Outdated Show resolved Hide resolved
end

def taxed_and_untaxed
in_taxonomy_scope do |nodes|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very unclear method it sounds it's doing the same as with_taxonomy_scope, the purpose of the block is not very clear on the first sight (without actually reading the in_taxonomy_scope body).
If we could get rid of the unless branches from in_taxonomy_scope, the method would not be needed and we could just do:

statements = # code in the block body
taxonomy_join_scope.where(statements)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the purpose of in_taxonomy_scope is to replace the with_taxonomy_scope for more efficient query. I think it might be good to replace with_taxonomy_scope in the future

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think block is still needed so that the arel query can be customized better and more flexible, such as the grouping.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ezr-ondrej as per discussed, I removed the "in_taxonomy_scope" and will make it reusable for other places in a separate commit.

@hao-yu
Copy link
Member Author

hao-yu commented Jul 14, 2020

@ezr-ondrej please hold the merge. I would like to double check the code. thanks

@hao-yu hao-yu force-pushed the fixed_30053_audit_page_slow branch from 76a23d5 to 22308e7 Compare July 14, 2020 13:14
@hao-yu
Copy link
Member Author

hao-yu commented Jul 14, 2020

Done!

@@ -132,6 +132,44 @@ def scope_by_taxable_ids(scope)
scope.where("#{table_name}.id IN (#{cached_ids.join(',')})")
end
end

def join_organizations
join_taxonomies(Organization.all, org_join_arel)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
join_taxonomies(Organization.all, org_join_arel)
join_taxonomies(Organization.unscoped, org_join_arel)

I think, unscoped is required here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inside join_taxonomies method it requires only taxonomy{location/organization}_ids. So, would it be good to pass directly ids?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or probably just taxonomy class ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think doing "unscoped" is optional or trivial since the only default scope is order(:title). Referring to https://github.com/theforeman/foreman/blob/develop/app/models/taxonomies/organization.rb#L22 unscoped is not used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I will prefer passing the scope to the method as i don't have to call pluck on every call manually.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm for keeping the scope, as we could decide in the future not to use ids, but subselect, so this is more robust.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kgaikwad @ezr-ondrej Done! make them use "unscoped" as suggested.

end

def join_locations
join_taxonomies(Location.all, loc_join_arel)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@hao-yu hao-yu force-pushed the fixed_30053_audit_page_slow branch from 22308e7 to 5ef5ce2 Compare July 15, 2020 12:56
@hao-yu
Copy link
Member Author

hao-yu commented Jul 20, 2020

can we merge this if nothing is pending?

Copy link
Member

@ezr-ondrej ezr-ondrej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-wise LGTM 👍
@kgaikwad are you ok with the code? Shall I test it?

kgaikwad
kgaikwad previously approved these changes Jul 22, 2020
Copy link
Member

@kgaikwad kgaikwad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kgaikwad are you ok with the code? Shall I test it?

Sorry for the delay. Changes works well without any failure.
Tried page loading testing using Rack Mini Profiler but not sure why it isn't showing me sql queries logs. Found little improvement.

@hao-yu
Copy link
Member Author

hao-yu commented Jul 22, 2020

@kgaikwad: Thanks for the test. Just curious about the little performance improvement. Can you tell me what was your test result before and after the patch?

Just test in a reproducer with 135K audit entries:

Before:
[I|app|d93f0477] Started GET "/audits?page=1&per_page=20&search="
[I|app|d93f0477] Parameters: {"page"=>"1", "per_page"=>"20", "search"=>""}
[I|app|d93f0477] Completed 200 OK in 77780ms (Views: 28.1ms | ActiveRecord: 76828.5ms)

After:
[I|app|cc1e6f17] Started GET "/audits?page=1&per_page=20&search="
[I|app|cc1e6f17] Parameters: {"page"=>"1", "per_page"=>"20", "search"=>""}
[I|app|cc1e6f17] Completed 200 OK in 3601ms (Views: 15.6ms | ActiveRecord: 2839.3ms)

@kgaikwad
Copy link
Member

@kgaikwad: Thanks for the test. Just curious about the little performance improvement. Can you tell me what was your test result before and after the patch?

@hao-yu, thank you for sharing results.

  1. On develop branch (With just 214 records)
  • WEB
    [I|app|6c3ea52b] Started GET "/audits?page=1&search=" for 127.0.0.1 at 2020-07-22 11:27:51 +0000
    [I|app|0c78ebb3] Started GET "/notification_recipients" for 127.0.0.1 at 2020-07-22 11:27:51 +0000
    [I|app|6c3ea52b] Processing by AuditsController#index as JSON
    [I|app|6c3ea52b] Parameters: {"page"=>"1", "search"=>""}
    ................
    [D|sql|6c3ea52b] ↳ app/controllers/audits_controller.rb:10:in `index'
    [I|app|6c3ea52b] Completed 200 OK in 1636ms (Views: 16.8ms | ActiveRecord: 200.1ms | Allocations: 617139)

  • API
    [I|app|038c4f14] Started GET "/api/audits" for 127.0.0.1 at 2020-07-22 11:07:14 +0000
    [I|app|038c4f14] Processing by Api::V2::AuditsController#index as JSON
    [I|app|038c4f14] Parameters: {"apiv"=>"v2"}
    ............
    [I|app|038c4f14] Completed 200 OK in 878ms (Views: 827.9ms | ActiveRecord: 27.1ms | Allocations: 261240)

  1. With pull-request changes (With just 214 records)
  • WEB
    [I|app|86cf2281] Processing by AuditsController#index as JSON
    ...........
    [I|app|86cf2281] Completed 200 OK in 1690ms (Views: 30.8ms | ActiveRecord: 201.4ms | Allocations: 583445)

  • API
    [I|app|6d3ff47b] Processing by Api::V2::AuditsController#index as JSON
    [I|app|6d3ff47b] Parameters: {"apiv"=>"v2"}
    ...........
    [I|app|6d3ff47b] Completed 200 OK in 759ms (Views: 516.1ms | ActiveRecord: 84.0ms | Allocations: 401884)

I also noticed that new count query takes more time than previous.

  1. With develop branch, example count query -

[D|sql|6c3ea52b] (1.4ms) SELECT COUNT(*) FROM "audits" WHERE ((("audits"."auditable_type" IN ($1, $2, $3, $4, $5, $6) OR "audits"."auditable_type" IN ($7, $8, $9, $10, $11, $12, $13, $14, $15, $16) AND (audits.id IN (184,116,87,71,68,51,80,70,52,190,162,84,170,176,169,92,237,69,180,97,238,236,108,156,59,197,65,98,173,149,311,200,189,239,235,161,117,88,82,188,119,240,153,147,202,196,79,210,187,85,72,95,81,244,160,232,77,302,172,198,212,242,204,249,165,151,91,209,74,179,225,138,181,245,205,96,83,67,201,63,90,105,216,174,213,107,233,86,168,163,223,93,89,241,221,167,14,231,66,158,155,118,195,207,199,75,194,243,99,152,206,157,53,230,301,183,136,150,139,193,12,164,248,137,208,78,159,94,218,186,154,234,211,177,76,106,185,64,178,148,203,58))) OR 1=0 AND "audits"."id" IN (SELECT "taxable_taxonomies"."taxable_id" FROM "taxable_taxonomies" WHERE "taxable_taxonomies"."taxonomy_id" = $17 AND "taxable_taxonomies"."taxable_type" = $18)) OR "audits"."auditable_type" = $19 AND "audits"."id" IN (SELECT "taxable_taxonomies"."taxable_id" FROM "taxable_taxonomies" WHERE "taxable_taxonomies"."taxonomy_id" = $20 AND "taxable_taxonomies"."taxable_type" = $21)) [["auditable_type", "Image"], ["auditable_type", "Operatingsystem"], ["auditable_type", "ComputeAttribute"], ["auditable_type", "Setting"], ["auditable_type", "KeyPair"], ["auditable_type", "Nic::Managed"], ["auditable_type", "Host::Base"], ["auditable_type", "ProvisioningTemplate"], ["auditable_type", "SmartProxy"], ["auditable_type", "Hostgroup"], ["auditable_type", "Filter"], ["auditable_type", "ComputeResource"], ["auditable_type", "HttpProxy"], ["auditable_type", "User"], ["auditable_type", "Parameter"], ["auditable_type", "Domain"], ["taxonomy_id", 1], ["taxable_type", "Audited::Audit"], ["auditable_type", "Organization"], ["taxonomy_id", 2], ["taxable_type", "Audited::Audit"]]

  1. with pull-request change:

[D|sql|86cf2281] (1.5ms) SELECT COUNT(DISTINCT "audits"."id") FROM "audits" LEFT JOIN taxable_taxonomies AS tto ON audits.id = tto.taxable_id AND tto.taxable_type = 'Audited::Audit' AND tto.taxonomy_id IN (1) LEFT JOIN taxable_taxonomies AS ttl ON audits.id = ttl.taxable_id AND ttl.taxable_type = 'Audited::Audit' AND ttl.taxonomy_id IN (2) WHERE ((((("tto"."taxonomy_id" IN (1) AND "ttl"."taxonomy_id" IN (2) AND "audits"."auditable_type" IN ('Host::Base', 'ProvisioningTemplate', 'SmartProxy', 'Hostgroup', 'Filter', 'ComputeResource', 'HttpProxy', 'User', 'Parameter', 'Domain')) OR "audits"."auditable_type" IN ('Image', 'Operatingsystem', 'ComputeAttribute', 'Setting', 'KeyPair', 'Nic::Managed')) OR (1=0 AND "tto"."taxonomy_id" IN (1))) OR ("audits"."auditable_type" IN ('Organization') AND "ttl"."taxonomy_id" IN (2))))

That's why, I requested @ezr-ondrej to test and confirm.

@hao-yu
Copy link
Member Author

hao-yu commented Jul 22, 2020

@kgaikwad @ezr-ondrej Yeah. The new query is slightly slower than the old query when the records count is small (a few hundreds to a few thousands) but it is very insignificant, like < 50ms. so I think it is definitely worth.

Test for 2.4k audit entries

Before:
[I|app|50d9321b] Started GET "/audits?page=1&per_page=20&search="
[I|app|50d9321b] Parameters: {"page"=>"1", "per_page"=>"20", "search"=>""}
[I|app|50d9321b] Completed 200 OK in 783ms (Views: 47.0ms | ActiveRecord: 92.6ms)

After:
[I|app|a399934e] Started GET "/audits?page=1&per_page=20&search="
[I|app|a399934e] Parameters: {"page"=>"1", "per_page"=>"20", "search"=>""}
[I|app|a399934e] Completed 200 OK in 796ms (Views: 44.6ms | ActiveRecord: 121.7ms)

@kgaikwad
Copy link
Member

@kgaikwad @ezr-ondrej Yeah. The new query is slightly slower than the old query when the records count is small (a few hundreds to a few thousands) but it is very insignificant, like < 50ms. so I think it is definitely worth.

@hao-yu - yes, definitely it is! 👍
I apologize for mentioning little based on less records.
Let's wait for inputs from @ezr-ondrej otherwise good to go!

Copy link
Member

@ezr-ondrej ezr-ondrej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen some weird times myself, so I've took time to test this properly.
I'm sorry it took a while I've got to it.

Measurements

I've used a following script, please do point out any issues you'd find in the script itself
https://gist.github.com/ezr-ondrej/81c8d2a12fda3df1b23f19d76eae9374

Results

My findings are, that the new query is MUCH slower, so we save time at preparation of the query, but the query itself is very slow in comparison to the old one.

Results: https://pastebin.com/f7aBpdkk
Notice the real time (real - total =(approx) db time)

@hao-yu
Copy link
Member Author

hao-yu commented Aug 6, 2020

@ezr-ondrej Thanks for the test script. I am not exactly sure why the patched version would perform so badly in the test but there is an issue to the script. It is so memory hunger. I am unable to create large records for testing. It will eat all my 8GB + 1.5G swap by just creating 5 - 6K of records.

So I wrote a new rake task to run the benchmark test outside of the test framework.
Note I have factorybot issue when running the rake so i just comment the following out to stop the issue:
test/factories/host_related.rb
#trait :with_build do
# build { true }
#end

https://gist.github.com/hao-yu/68a4852e5541b6fb3f8d6ee5837e312b

The tests indicate that the patched version performs faster as the record count grows, especially in pagination. See the results in:

https://pastebin.com/vqggAkuF

@ezr-ondrej
Copy link
Member

I don't see arel as something that would be harder to maintain, but quite the opposite. I think we should use it more especially for more complex SQL. It plays with Rails much better than plain SQL strings.

To the results, depends witch are we taking about, I've had to make few adjustments to the testing script, but overall it works as expected.
But I agree some tests we got the right results would help.

@hao-yu
Copy link
Member Author

hao-yu commented Aug 19, 2020

Would it be possible to get at least the majority of the performance gain using regular active record methods and only fall back to arel if and where we have no choice?

Before submitting patch, I was doing this with regular active record methods but I think it is very untidy, hard to read and also hard to maintain and reuse due to the complexity of the statement and decided to use arel instead.

The numbers of included results also seem off to me, how come almost all of the audits are included when only half of the hosts should be in the user's location?

If you are referring to the result of test #6 and test #7. It is because I modified the script to set the user.location to nil so that I can save some time to seed thousands of data for the tests. Also worth to note that, count of both Develop and Patched versions are matching in all tests.

Running multiple times might lead to skewed results due to rails or DB caching of the query responses

I ran same query a few times during the test, 2nd and subsequent times are just a bit quicker but not significant. Feel free to test.

Also did we check the impact this has on memory consumption? benchmark-memory gem could be useful in this regard.

I did monitor the memory usage while the query is running and see no significant memory consumption. But happy to prove that with the gem.

@ezr-ondrej ezr-ondrej added Reached an impasse Needs broader discussion and removed Waiting on contributor labels Aug 21, 2020
Copy link
Member

@tbrisker tbrisker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that arel is sometimes better than writing sql directly, but very few developers are familiar with it, while all developers are familiar with both active record and sql. I'm still worried that adding ~70 lines of arel code would make this area much less accessible to most developers and would rather minimize it to only where it significantly improves readability.
I'm also a bit concerned about the performance of a double outer join on the tax_tax table in a real db- this table can easily get to the millions of records, and we might be hitting a wall with the db once it gets to doing the joins on disk.
Also added some comments inline.


def taxed_and_untaxed
clauses = []
unless !Organization.current && User.current.admin?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think flipping around the conditional will make it easier to understand (also for location below):

Suggested change
unless !Organization.current && User.current.admin?
if Organization.current || !User.current.admin?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 166 to 182
def user_organizations
Array.wrap(Organization.current || Organization.my_organizations.reorder(nil))
end

def user_locations
Array.wrap(Location.current || Location.my_locations.reorder(nil))
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are very similar to Taxonomy.expand. I would suggest considering using the existing method instead of adding a new one. Also, what about the child taxonomies in the case of the current?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the child taxonomies issue for the current context but I haven't use the "Taxonomy.expand" as it doesn't quite make sense to me. It will cause duplicate code run for non-admin user with any context. E.g. run Taxonomy.expand -> then run get_taxonomy_ids. This will call "Organization.my_organizations" to get all children and then call the "get_taxonomy_ids" to get all children again.


def join_taxonomies(taxonomies, tax_arel)
class_arel = base_class.arel_table
statement = class_arel.join(tax_arel, Arel::Nodes::OuterJoin).on(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is specific to the audits case, where you need the outer join so you can also see the partially taxed and untaxed resources. in other cases the outer join doesn't make sense.
Also, not all models actually include taxonomix - specifically those that are not fully taxed.
Should these methods all live inside the audits extension and not included in all models using taxonomix?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the outer joins make sense for all models that include taxonomix and even partially taxed but it is only efficient if the models records can grow large like Audit. It is because outer joins is a little bit slower for smaller tables comparing to the current way. For example, ForemanOpenscap::ArfReport might be the candidate that can apply the outer join..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Tomer is comparing OUTER vs INNER joins. And I'd agree that if we use INNER join, we can use the join like the only mean to filter the records, we wouldn't need further conditions on the joined table.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTOH, outer join is more flexible. BUT then it doesn't have much sense to use the taxonomy_ids in the join itself, because we need another condition using the ids.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I am wrong. I had thought about INNER JOIN earlier but I didn't think it will work because I need to include the audit rows even it is only partially taxable, such as taxable only by organization.

Yes, I think taxanomy_ids in the OUTER JOIN is optional. It is just there for me to feel safer. I am happy to remove it if everyone can confirm it is good to remove.

@hao-yu
Copy link
Member Author

hao-yu commented Sep 3, 2020

I'm also a bit concerned about the performance of a double outer join on the tax_tax table in a real db- this table can easily get to the millions of records, and we might be hitting a wall with the db once it gets to doing the joins on disk.

I re-trigger the test script. It is now printing out the EXPLAIN for all tests Please check the links below.

Based on the tests, I notice the following points:

  • Postgres uses hash joins when the total records are not large. It starts to use merge join when the total records grow to 90K. Index scan is also being use while doing the merge join (see Test 4 from the link below). Thus, I don't see how it will write to disk and it is up to the Optimizer to make the best decision whether to do hash or merge joins.
  • Comparing the total costs between the Develop branch and the Patched branch. The Patched branch has a very large improvement. Comparing Test 4, you will see that the total costs are 32608.69 (Patched) VS 1527923.31(Develop). Thus, I don't see a reason that this patch will perform worse than the Develop branch in the case of millions of records.

If you are still concerning about this patch will give negative impact in the case of million records, I think we will have to ask for real data from the user who is hitting this issue.

Benchmark for Develop: https://pastebin.com/cP02RyPh
Benchmark for Patched: https://pastebin.com/0vas8Bbf

Also did we check the impact this has on memory consumption? benchmark-memory gem could be useful in this regard.

According to the below test results, there is a huge memory improvement, especially in count, and pagination. The number of created objects are also significantly less than the Develop branch.

Please read the link below for details:
https://pastebin.com/EaXpr9St

@hao-yu hao-yu force-pushed the fixed_30053_audit_page_slow branch 2 times, most recently from 8667aba to 894509a Compare September 3, 2020 06:41
@theforeman-bot
Copy link
Member

Thank you for your contribution, @hao-yu! This PR has reached an impasse with no new activity for 1 months, closing for now.
Feel free to reopen if you feel an agreement can be reached. This is an automated process.

taxonomy = taxonomy_class.current
if any_context?(taxonomy)
taxonomy_relation = taxonomy_class.to_s.underscore.pluralize
taxonomy_class.send("my_#{taxonomy_relation}").reorder(nil).pluck(:id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use User.taxonomies_and_child_ids(taxonomy_class.to_s.pluralize) to get the ids directly (and in most cases from the cache) rather than trigger another query here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is because I would like to return all taxonomies if user is an admin.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then how about

Suggested change
taxonomy_class.send("my_#{taxonomy_relation}").reorder(nil).pluck(:id)
User.current.admin? ? taxonomy_class.unscoped.all : User.taxonomies_and_child_ids(taxonomy_class.to_s.pluralize)

taxonomy_relation = taxonomy_class.to_s.underscore.pluralize
taxonomy_class.send("my_#{taxonomy_relation}").reorder(nil).pluck(:id)
else
get_taxonomy_ids(taxonomy, :subtree_ids)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also get all ancestors if the current taxonomy is a nested one, is that on purpose?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should get all descendants of the current taxonomy including itself. This should be the same as inner_select method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit confusing but that isn't what get_taxonomy_ids does:
https://github.com/theforeman/foreman/blob/develop/app/models/concerns/taxonomix.rb#L60-L62

to get all descendants you should use

Suggested change
get_taxonomy_ids(taxonomy, :subtree_ids)
taxonomy.subtree_ids

directly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code here is based on the below logic. The below logic is getting all the ancestors too so I am a quite confuse how this taxonomy hierarchy should work. If this code is wrong then maybe the below logic is wrong too.

https://github.com/theforeman/foreman/blob/develop/app/models/concerns/taxonomix.rb#L87

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not causing a regression, so I'm all for merging it now.

Copy link
Member Author

@hao-yu hao-yu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi. How is this going? Not why this is taking so long.


def join_taxonomies(taxonomies, tax_arel)
class_arel = base_class.arel_table
statement = class_arel.join(tax_arel, Arel::Nodes::OuterJoin).on(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I am wrong. I had thought about INNER JOIN earlier but I didn't think it will work because I need to include the audit rows even it is only partially taxable, such as taxable only by organization.

Yes, I think taxanomy_ids in the OUTER JOIN is optional. It is just there for me to feel safer. I am happy to remove it if everyone can confirm it is good to remove.

taxonomy_relation = taxonomy_class.to_s.underscore.pluralize
taxonomy_class.send("my_#{taxonomy_relation}").reorder(nil).pluck(:id)
else
get_taxonomy_ids(taxonomy, :subtree_ids)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should get all descendants of the current taxonomy including itself. This should be the same as inner_select method.

taxonomy = taxonomy_class.current
if any_context?(taxonomy)
taxonomy_relation = taxonomy_class.to_s.underscore.pluralize
taxonomy_class.send("my_#{taxonomy_relation}").reorder(nil).pluck(:id)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is because I would like to return all taxonomies if user is an admin.

@theforeman-bot
Copy link
Member

Thank you for your contribution, @hao-yu! This PR has reached an impasse with no new activity for 1 months, closing for now.
Feel free to reopen if you feel an agreement can be reached. This is an automated process.

@ezr-ondrej
Copy link
Member

I've rebased on top of #8370 to run the tests :)

@ezr-ondrej
Copy link
Member

[test foreman] the integration tests were terribly slow, lets try again if that was an anomaly

@ezr-ondrej
Copy link
Member

[test foreman] ok it is not terribly slow, but still seems to be slower than the others.

@hao-yu
Copy link
Member Author

hao-yu commented Mar 31, 2021

The test with this patch seems faster? test with this patch spent 11secs and test without this patch spent 18 secs?

@ezr-ondrej
Copy link
Member

The test with this patch seems faster? test with this patch spent 11secs and test without this patch spent 18 secs?

It is not consistent though (take a look into history of tests) so it is hard to base any evidence on that, but it is working and I'd like to start using arel in taxonomies, because I belive it'll make them more readable. We have doubled the test cases for this so I believe this works as it did and it is faster.

I'm merging this.

@ezr-ondrej ezr-ondrej dismissed tbrisker’s stale review March 31, 2021 11:27

Thanks for the review, I believe all blocking comments were resolved.

Copy link
Member

@ezr-ondrej ezr-ondrej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hao-yu ! Let see it in production :)

@ezr-ondrej ezr-ondrej merged commit 9b62448 into theforeman:develop Mar 31, 2021
@ezr-ondrej ezr-ondrej removed the Reached an impasse Needs broader discussion label Apr 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants