Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace pluck with select in permissions to avoid extra queries and extract Permissions::Orders from Permissions #4523

Merged
merged 13 commits into from
Dec 2, 2019

Conversation

luisramos0
Copy link
Contributor

@luisramos0 luisramos0 commented Nov 28, 2019

What? Why?

Closes #4521

I thought of improving performance of Report::LineItems.seach_orders but ended up clean up Permissions (this class is overloaded with responsabilities) by extracting Permissions::Order from it.

Going forward I think we can break Permissions and OrderCyclePermissions in smaller parts and move them to app/services/permissions like Permissions::Order. I wonder if there are any objections to move them to this place.

In addition to extracting the code to a separate class, this PR also replaces a few plucks with selects. This comes from a lesson learned in #3709, if we are going to chain relation methods or use relations in IN clauses of other relations we should use select so that we reduce the number of queries to the DB. So, pluck should only be used on the final step so we take advantage of its use: to avoid mapping the full models and instead just map out the field we want.

Additionally, this PR converts Report::LineItems to class and memoizes orders so it's only executed once (this takes 3secs to the load time of the report I am using to test).

Finally (see last commit), this PR closes 4521 by changing the logic of hidden line items in Report::LineItems. In one particular case of 4521 for the 25th and 26th Nov with live data, page load time moves from 8mins to 77secs 🎉

What should we test?

There's no change to the logic.
It can be a lot of work to validate all these permissions, I think we can trust the auto tests here and run a few manual validations.
One example test would be to test the the order_and_fullfilment report and make sure permissions are still working correctly: a user can see orders of hubs he is a manager of as well as orders in OCs they coordinate as well as orders of hubs selling products of the user's producer enterprises. This last case is the main case changed in this PR: when a manager of a producer of a product that is sold by hub (where the user is not manager of) through an OC (where the user is not manager of the OC coordinator), this user should see the orders with hidden customer details: order addresses and order email.

Other pages that we can test:

  • the bulk orders management page - list of orders should be correct (EDITED: orders of managed enterprises plus orders of coordinated OCs)
  • bulk_coop_report , order_and_distributor_report, /orders_and_fulfillments_report - a simple test should be enough, all these reports will run faster now
  • And finally, the packing report

Release notes

Changelog Category: Changed
Simplified and optimized permissions calculation.

@luisramos0 luisramos0 self-assigned this Nov 28, 2019
@luisramos0 luisramos0 force-pushed the permissions_improve branch 3 times, most recently from 5c18eab to 66376e8 Compare November 29, 2019 10:35
@luisramos0 luisramos0 changed the title WIP Replace pluck with select in permissions to avoid extra queries WIP Replace pluck with select in permissions to avoid extra queries and extract Permissions::Orders from Permissions Nov 29, 2019
@luisramos0 luisramos0 changed the title WIP Replace pluck with select in permissions to avoid extra queries and extract Permissions::Orders from Permissions Replace pluck with select in permissions to avoid extra queries and extract Permissions::Orders from Permissions Nov 29, 2019
@luisramos0 luisramos0 removed the pr-wip label Nov 29, 2019
Copy link
Member

@mkllnk mkllnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great investigation, @luisramos0! That last trick is so important. The old logic was looking in an ever increasing space of hidden line items. No wonder the situation got worse and worse.


Spree::Order.where(id: editable | produced)
Spree::Order.where(id:
managed_orders.select(:id) |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this avoid extra queries. Our Rails version doesn't have an or operator in ActiveRecord. I get this on the console:

Spree::Order.where(id: Spree::Order.order(:id).limit(3).select(:id) | Spree::Order.order("id desc").limit(3).select(:id))
#  Spree::Order Load (0.9ms)  SELECT id FROM "spree_orders" ORDER BY id LIMIT 3
#  Spree::Order Load (0.7ms)  SELECT id FROM "spree_orders" ORDER BY id desc LIMIT 3
#=>   Spree::Order Load (1.0ms)  SELECT "spree_orders".* FROM "spree_orders" WHERE "spree_orders"."id" IN (1, 2, 4, 577267, 577266, 577265)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's something that's repeated throughout the current permissions logic. The single pipe runs both queries instead of combining them. I think we should be able to make big improvements there with Rails 4...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could add a temporary gem to backport some of that functionality until we get to Rails 4?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or only comes in rails 5.
I think we should use arel for these cases as suggested in this post.
https://stackoverflow.com/questions/3639656/activerecord-or-query

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding dependencies makes the upgrade harder. The fewer gems, the better.

lib/open_food_network/reports/line_items.rb Outdated Show resolved Hide resolved
Copy link
Contributor

@Matt-Yorkley Matt-Yorkley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I think breaking up OpenFoodNetwork::Permissions and starting to improve it bit by bit is a massive step in the right direction. ❤️

@Matt-Yorkley
Copy link
Contributor

I think we should trust the auto tests here

There's various things that have been touched here, like reports and the BulkLineItemsController. I think it would make sense to test them...

@luisramos0
Copy link
Contributor Author

I have improved the "what to test" part in the description.

@sigmundpetersen
Copy link
Contributor

Something wrong with the build. Same error in 3 places

…ing for line_items that are hidden, it looks for line items that are not editable using a merge statement that performs much better

Also, remove unnecessary if clause, merge will return an empty relation if no items are found, no need to test for empty.

The test report runs in a little over one minute instead of 8minutes
@luisramos0
Copy link
Contributor Author

luisramos0 commented Nov 30, 2019

yeah, i needed to test this better. good that I broke it with this last commit because I dont think the change with merge was correct.

there's something funny about how merge merges the conditions in two queries, it looks like it overwrites conditions on the same field, in this case spree_line_items.order_id

See below for some very interesting

(byebug) editable_line_items_ids.to_sql "SELECT id FROM \"spree_line_items\" WHERE \"spree_line_items\".\"order_id\" IN (SELECT \"spree_orders\".\"id\" FROM \"spree_orders\" WHERE 1=0)"

(byebug) line_items.to_sql
"SELECT "spree_line_items".* FROM "spree_line_items" WHERE "spree_line_items"."id" IN (174) AND "spree_line_items"."order_id" IN (SELECT "spree_orders"."id" FROM "spree_orders" WHERE "spree_orders"."id" IN (200) AND (completed_at IS NOT NULL) AND (state != 'canceled'))"

(byebug) editable_line_items_ids.merge(line_items).to_sql
"SELECT id FROM "spree_line_items" WHERE "spree_line_items"."id" IN (174) AND "spree_line_items"."order_id" IN (SELECT "spree_orders"."id" FROM "spree_orders" WHERE "spree_orders"."id" IN (200) AND (completed_at IS NOT NULL) AND (state != 'canceled'))"

(byebug) line_items.merge(editable_line_items_ids).to_sql
"SELECT id FROM "spree_line_items" WHERE "spree_line_items"."id" IN (174) AND "spree_line_items"."order_id" IN (SELECT "spree_orders"."id" FROM "spree_orders" WHERE 1=0)"

ok, I pushed a different version of the last commit where I use merge with array and not merge with relation to avoid the problem of overwriting conditions.
It still looks really fast compared to master.

Copy link
Member

@mkllnk mkllnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work Luis!

# In this case: the IN clause on spree_line_items.order_id from line_items
# overwrites the IN clause on spree_line_items.order_id on editable_line_items_ids
# We convert to array the relation with less elements: line_items
editable_line_items_ids.merge(line_items.to_a)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I came across this recently when fixing the Lettuceshare report. I solved it by using SQL:
https://github.com/openfoodfoundation/openfoodnetwork/pull/4168/files#diff-db07cecd37d0749b2162463301567e23L68-R72

I think that conditions in hashes get overridden and it's a bug. If you add the condition as a string as in where("id in (?)") it does not override the condition. In this case I'm wondering if you could do:

editable_line_items_ids.where("spree_line_items.id IN (?)", line_items.select(:id))

But looking at the code again, line_items is loaded into an array anyway. As long as the list is reasonably short, it's better to re-use that in-memory list instead of asking the database to compile the ids again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, thanks for sharing. It's really good to know this.
I will leave it as is. I think this will be ok even if there are a few thousand line items in the report.

@lin-d-hop lin-d-hop added the pr-staged-uk staging.openfoodnetwork.org.uk label Dec 2, 2019
@lin-d-hop
Copy link
Contributor

Sorry my testing took so long.
It was quite a curly change and I wanted to put it through its paces.

Testing notes here:
https://docs.google.com/document/d/1QrbafB1BsDhveliknjOKWPZtN4G8UGfeTvWe-YGQHlk/edit#

Brilliant work on this. Thanks so much for all of the fast-paced efforts and patience!!

@luisramos0 luisramos0 merged commit c8395a4 into openfoodfoundation:master Dec 2, 2019
@luisramos0 luisramos0 deleted the permissions_improve branch December 2, 2019 16:31
@sauloperez
Copy link
Contributor

congrats for this awesome work @luisramos0 !

@Matt-Yorkley
Copy link
Contributor

Awesome work @luisramos0!

@sauloperez I mentioned briefly earlier that I don't think background jobs are the way to go (at the moment). If we move reports to background jobs we will still have the same horrible queries hammering the database, just in separate processes and with less visibility. Breaking up and improving OpenFoodNetwork::Permissions here improved the loading of this report from 8 minutes to 77 seconds. In terms of improving reports, I think this is what needs to be done first.

@RachL RachL removed the pr-staged-uk staging.openfoodnetwork.org.uk label Dec 4, 2019
@sauloperez
Copy link
Contributor

sauloperez commented Dec 5, 2019

Sure, let's just see how long we can keep things up without reasonable timeouts. I hope the infrastructure holds up 🙈

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tamar Valley Food Hub - Pack By Supplier report
7 participants