Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Promote Order#all_adjustments to association. #466

Merged
merged 1 commit into from
Oct 29, 2015
Merged

Promote Order#all_adjustments to association. #466

merged 1 commit into from
Oct 29, 2015

Conversation

mbj
Copy link
Contributor

@mbj mbj commented Oct 28, 2015

Promotes Order#all_adjustments to association.

Benefits:

  • Kill redundant code. The predicate used before is the more complex version of a join by order_id
  • Its possible to specify includes
  • Enables other patches of ours to be applied
  • Reduces the amount of handwritten SQL for an AR buildin (code reduction)
  • Reduces a special case to test for the custom redundant code.
  • Fix invariant violating call sides (mainly in specs).

The MySQL migration was verified, but is slower than the postgresql one.

I'd be happy to apply the check constraint for postgresql by default (under a postgresql adapter guard) feel free to get me feedback on that one.

@mbj mbj changed the title [WIP] Promote Order#all_adjustments to association. Promote Order#all_adjustments to association. Oct 28, 2015
@mbj
Copy link
Contributor Author

mbj commented Oct 28, 2015

To all reviewers this is ready for review.


# Submitter of change does not care about MySQL, as it is not officially supported.
# Still spree officials decided to provide a working code path for MySQL users, hence
# submitter made a AR code path he could validate on PostgreSQL.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar: a => an

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole comment section does not apply to Solidus as MySQL is officially supported. I'll kill it.

@jordan-brough
Copy link
Contributor

Nice, this is good stuff.

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

@jordan-brough How do you feel adding the FK / check constraint under postgresql by default?

order_id =
(SELECT order_id FROM spree_line_items WHERE spree_line_items.id = spree_adjustments.adjustable_id)
WHERE
adjustable_type = 'Spree::LineItem'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a and order_id is null here also? For me that reduces the time from > 4 min (I got impatient and killed the query before it finished) to < 1s.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Will do. I just checked our internal migration and it also had this clause. It was dropped for spree upstreaming for some reason

@jordan-brough
Copy link
Contributor

@mbj should we add an update for spree_shipments like you have for spree_line_items?

@jordan-brough
Copy link
Contributor

How do you feel adding the FK / check constraint under postgresql by default?

I'm very 👍 on adding the FK constraint. We have that FK in our own DB. We can use the new Rails 4.2 add_foreign_key support for that and get it on postgres and mysql right?

The check constraint seems cool also, though I'm hesitant to start introducing too much db-specific stuff. Though this is just safety-check stuff and not behavioral stuff. Googling around it looks like if we wanted it for mysql also we'd have to add that via a trigger? Which I think has other implications we might not want to get into.

@gmacdougall any thoughts? Or anyone else?

@BenMorganIO
Copy link
Contributor

👍 on another fk! Not too much else I can comment on...

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

@mbj should we add an update for spree_shipments like you have for spree_line_items?

Lets keep this iteration as small as possible. Conceptually yes.

We only optimized /order/populate so far, all our optimizations can in theory also be applied to the shipments / inventory stuff. It suffers from the same design "smells". But for now lets focus on upstreaming what was proved to work in our case.

I'm very on adding the FK constraint. We have that FK in our own DB. We can use the new Rails 4.2 add_foreign_key support for that and get it on postgres and mysql right?

Yes you can. We have full FKs in our app active already and it catches tons of bugs (we have fixes for in our fork).

I'd recommend to add FKs for everything you touch right now. Or even do a big pass adding FKs in postgresql specific syntax. Mostly because it catches so much. I'll look into upstreaming our FK generation / validation code. But not this iteration.

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

The check constraint seems cool also, though I'm hesitant to start introducing too much db-specific stuff. Though this is just safety-check stuff and not behavioral stuff. Googling around it looks like if we wanted it for mysql also we'd have to add that via a trigger? Which I think has other implications we might not want to get into.

I've no idea for MySQL. The problem with polymorphic associations is the low degree of DB side validation, this check constraint puts a bit of back-pressure in. For now I think the PostgreSQL specific check constraint is worth it. Especially as code bugs exposed in PostgreSQL will have a positive effect on MySQL installations even when the constraint there does not exist.

Longterm I think we can likely untangle the polymorphic associations, but thats a future point to discuss.

@jordan-brough
Copy link
Contributor

Lets keep this iteration as small as possible. Conceptually yes.

Hm, so I'm saying let's add this:

UPDATE spree_adjustments
SET order_id = (SELECT order_id FROM spree_shipments WHERE spree_shipments.id = spree_adjustments.adjustable_id)
WHERE adjustable_type = 'Spree::Shipment'
AND order_id IS NULL

If we're repairing the order id I'm not sure why we wouldn't do that at the same time? (Even if you didn't do that originally.) Any missing data there is going to cause the null: false constraint we're trying to add here to fail. That's the only other type of adjustment in stock spree I believe.

Yes you can. We have full FKs in our app active already and it catches tons of bugs

Great! We likewise have full FKs and have found it very helpful as well. Can you move the foreign key addition outside of the if postgres condition and use the rails standard add_foreign_key for that?

For now I think the PostgreSQL specific check constraint is worth it. Especially as code bugs exposed in PostgreSQL will have a positive effect on MySQL installations even when the constraint there does not exist

That sounds great to me.

@dkubb
Copy link
Contributor

dkubb commented Oct 29, 2015

We have full FKs in our app active already and it catches tons of bugs (we have fixes for in our fork).

How we did this is we temporarily added immigrant as a gem dependency and used it to generate a migration that added the foreign keys. This allowed us to backfill missing FK's for all declared associations. We could create a future PR that uses the rails 4.2 add_foreign_key method that basically does the same thing.

@jordan-brough
Copy link
Contributor

How we did this is we temporarily added immigrant as a gem dependency and used it to generate a migration that added the foreign keys.

That's actually exactly what we have done as well. :) Since it's relatively easy I think a PR would be a great way to kick off that discussion if you don't mind making one.

@jordan-brough
Copy link
Contributor

Longterm I think we can likely untangle the polymorphic associations

+10000 I'd be really happy to see that happen

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

If we're repairing the order id I'm not sure why we wouldn't do that at the same time? (Even if you didn't do that originally.) Any missing data there is going to cause the null: false constraint we're trying to add here to fail. That's the only other type of adjustment in stock spree I believe.

Because I like to do one thing at a time, at this point. The patch as is was tested against our production system. Before we add more constraints (for example the FK on shipments) we probably want to check all data generation call sides for possible violations.

I prefer 2 PRs over one. And I prefer even more to upstream what I have right now instead of delaying one (this) PR to add stuff that was not done yet.

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

I just activated the postgresql check constraint on the polymorphic association keys. Seems solidus has call sides that violate it. Investigating.

@jordan-brough
Copy link
Contributor

Because I like to do one thing at a time, at this point. The patch as is was tested against our production system. Before we add more constraints (for example the FK on shipments) we probably want to check all data generation call sides for possible violations.

@mbj Totally agree with you in general on doing one thing at a time. But are we on the same page here? We're adding null: false to spree_adjustments.order_id in this migration and I'm talking about repairing any spree_adjustments.order_id nulls for spree_adjustments.adjustable_type == 'Spree::Shipment' (just as we've done for line items). If we don't do that then the null: false in this migration could fail, which would not be a good experience for people trying to upgrade.

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

@mbj Totally agree with you in general on doing one thing at a time. But are we on the same page here? We're adding null: false to spree_adjustments.order_id in this migration and I'm talking about repairing any spree_adjustments.order_id nulls for spree_adjustments.adjustable_type == 'Spree::Shipment' (just as we've done for line items). If we don't do that then the null: false in this migration could fail, which would not be a good experience for people trying to upgrade.

I think I missed your point. In our case the order_id was fully reconstructible from the line items already. If you have bad data that is only reconstructible by shipments we need to add your statement too.

Sorry for the noise. BTW the postgresql check constraint on the adjustable_id / order_id found other violations. We likely have a patch for these. I hope this is a perfect example on why we want even postgresql specific checks in the code base. That will expose bugs that when fixed even will benefit MySQL users.

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

OT: The amount of entropy our databases show compared cross spree / solidus installations shows the big problems the code base has / had in the past / currently.

I really hope we can primarily backfill missing validations and reduce the entropy and do not think about large features. Stabilizing the heck out of this thing before even doing "equivalent sematics but cleaner code" refactoring. Its much easier to improve the implementation of a consistent system.

@gmacdougall
Copy link
Member

Though I would prefer checks that worked in both MySQL and PostgreSQL, I'm with @mbj that a Postgres specific check is better than nothing due to the reasons he noted.

I would want to put some comments around to make sure that we don't intentionally write logic around that relying on the DB check to catch it. However, in the case of a mistake, I like my DB to complain, and tell me what happened so I can fix it rather than put bad data in the DB for years.

I'm also hugely on board with cleaning up our data historically to correct mistakes and/or bad decisions that were made in the past, rather than leaving hacks in our code to deal with the different way things used to be.

I like this change to all_adjustments, and echo @jordan-brough with sentiments that there is much more to improve here, but this is a great first step.

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

Longterm I think we can likely untangle the polymorphic associations
+10000 I'd be really happy to see that happen

We did some experiments with promising results a while back on performance optimization passes. But it turns out the memory scope thing we developed was the better mid-term choice that also creates the infrastructure we need to do the reduction of polymorphic associations.

When this PR got accepted, I can look into the memory scope one that fixes a lot of N * (N+1) instances in the order populator.

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

I like this change to all_adjustments, and echo @jordan-brough with sentiments that there is much more to improve here, but this is a great first step.

Yes this is only a first step.

@jordan-brough
Copy link
Contributor

I hope this is a perfect example on why we want even postgresql specific checks in the code base

To be clear: I think that's great and am not opposed to it at all. Like I said before, these are just additional safety checks, not behavioral differences, so that seems good to me. I'm mostly saying "we haven't done this before so let's make sure everyone is on board" and "maybe we should also do it for mysql". I don't think we should for mysql since it would involve triggers. @gmacdougall or @jhawthorn any thoughts on that stuff?

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

To be clear: I think that's great and am not opposed to it at all. Like I said before, these are just additional safety checks, not behavioral differences, so that seems good to me. I'm mostly saying "we haven't done this before so let's make sure everyone is on board" and "maybe we should also do it for mysql". I don't think we should for mysql since it would involve triggers. @gmacdougall or @jhawthorn any thoughts on that stuff?

I'm fine with that approach, but I'd love to not delay this PR to wait on a MySQL solution for weeks.

@jordan-brough
Copy link
Contributor

Just saw @gmacdougall's response. Cool. @gmacdougall are you likewise thinking we shouldn't venture there for now with mysql?

WHERE
adjustable_type = 'Spree::LineItem'
AND order_id IS NULL

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you missing a semi-colon here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. BTW I'd ping you in case this is ready for a next pass. No need to waste your time that does not pass CI yet.

@gmacdougall
Copy link
Member

I'm fine with leaving out MySQL advanced constraints knowing how terrible the implementation would be.

@jordan-brough
Copy link
Contributor

I'd love to not delay this PR to wait on a MySQL solution for weeks

Haha, I doubt it would be weeks. :) This PR has been open for 15 hours now, I don't think we need to start talking about "not delaying this any further" quite yet. :) (We've had a lot of discussion on this PR but I think a lot of that is discussion on the good patterns you're contributing, which merit discussion and consensus.)

I'm fine with leaving out MySQL advanced constraints knowing how terrible the implementation would be.

Cool.

@mbj This is looking great to me! If you can move the foreign key addition up out of the postgres if and use add_foreign_key for it that would be great.

Do you want to move the polymorphic constraint into a separate PR or tackle it now, given that there seem to be issues with it? (Either way sounds great to me.)

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

@mbj This is looking great to me! If you can move the foreign key addition up out of the postgres if and use add_foreign_key for it that would be great.

AFAIK add_foreign_key is not supported by the rails version we are running right now. I'l ldouble check this. This info is from our spree fork, not solidus one. I'm still building up the "mental map" for the solidus state of the project. This will take some time, but when I'm done I'll proactively use the least powerful primitive that does the job. (add_forkeign_key is less powerful than a generic SQL DML string).

Do you want to move the polymorphic constraint into a separate PR or tackle it now? (Either way sounds great to me.)

Checking our fork in a sec for fixes for the violating call sides. If they exist I'll include them into this PR, if not I'll extract the bug exposing check in a separate PR.

@jordan-brough
Copy link
Contributor

AFAIK add_foreign_key is not supported by the rails version we are running right now. I'l ldouble check this. This info is from our spree fork, not solidus one.

Yeah, Solidus is on Rails 4.2, which supports it. We've just recently begun making use of it, actually, thanks to @BenMorganIO: 8394c3c#diff-61cdaf6ae78f46af1b88b4cbb3614770R3

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

Yeah, Solidus is on Rails 4.2, which supports it.

K, thx on pointing me on this. If I can keep working on Solidus (clients decision, not mine) its likely the number of those pointers will reduce to zero as I internalize the state of this project over time.

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

Yeah, Solidus is on Rails 4.2, which supports it. We've just recently begun making use of it, actually, thanks to @BenMorganIO: 8394c3c#diff-61cdaf6ae78f46af1b88b4cbb3614770R3

Actually there are reasons I cannot use this commit as a reference:

8394c3c#diff-61cdaf6ae78f46af1b88b4cbb3614770R3

  • Rails generates indeterministic FK identifiers. I highly recommend to add name: $owner_table_$colum to the options. The default is something useles like fk_rails_$randomhex that makes comparing databases cross installations complex.
  • We know the AR layer handles destroys of adjacent records. Unless we remove this responsibility the actions set for on_update and on_delete are to permissive and do not mirror a case where we actually intentionally want the DB to do these cascading actions.

I did not follow this commit for the reasons above. Should I add an issue to fix the above concerns I have with the reference commit?

This is not ready for review yet, as there are still invariant violating call sides poping up.

@jordan-brough
Copy link
Contributor

@mbj Sorry, I didn't mean for that commit to be a reference. It was just an FYI that we've begun adding foreign keys to Solidus.

What you have now is exactly what I was hoping for.

the DB to do these cascading actions.

Yes -- I do not want us to use cascade here. See our discussion here about cascade. We've decided that we don't want cascade anywhere since rails+sqlite doesn't support foreign keys at all yet (and we say we support sqlite).

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

Yes -- I do not want us to use cascade here. See our discussion here about cascade. We've decided that we don't want cascade anywhere since rails+sqlite doesn't support foreign keys at all yet (and we say we support sqlite).

I went with explicit :restrict for now. This is most intention revealing.

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

@jordan-brough Ready for next pass.

I needed to do some more cycles to catch all constraint violating call sides. All of them where hidden in specs and only manifest on specific sequences of execution.

@jordan-brough
Copy link
Contributor

👍 great stuff! Thanks for sticking with this long discussion.

Awaiting a +1 from @jhawthorn or @gmacdougall or @cbrunsdon to merge.

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

Thanks for sticking with this long discussion.

This is expected 1rst PR cost to pay when syncing with another team. I'd be more worried / concerned when this discussion had NOT happen.

@cbrunsdon
Copy link
Contributor

I'm super 👍 but will leave @jhawthorn to officially call it on our end

WHERE
adjustable_type = 'Spree::Shipment'
AND order_id IS NULL
SQL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is failing for me under mysql (which our CircleCI unfortunately doesn't test). Can we split it into three separate execute statements?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is failing for me under mysql

Can you show me how it failed?

(which our CircleCI unfortunately doesn't test)

Lets fix it. We already use multiple containers, I could easily change the build-ci.rb script to run each projects also against a different DB environment variable.

This would double the CI times, or in case you double containers would be at free (time) cost.

I know this is not ideal, but an improvement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also we could add per subproject / DB combination status reports to the github PR via its API. I did that once for a client. Dunno if we have the time resources to do it here but still an option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if you double CI containers (or state a twice as long build is okay), I'd be fine to run the build-ci.rb improvement to include MySQL as a followup on this PR (assumption my client wants me contributing here, just trying to build up nice TODOs).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we could use travis 😉. https://travis-ci.org/jhawthorn/solidus

You are right. I've created #469 to track this.

Here's the error I get

Mysql2::Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'UPDATE spree_adjustments SET order_id = (SELECT order_id FROM spree_line_items W' at line 1: UPDATE spree_adjustments SET order_id = adjustable_id WHERE adjustable_type = 'Spree::Order' AND order_id IS NULL ; UPDATE spree_adjustments SET order_id = (SELECT order_id FROM spree_line_items WHERE spree_line_items.id = spree_adjustments.adjustable_id) WHERE adjustable_type = 'Spree::LineItem' AND order_id IS NULL ; UPDATE spree_adjustments SET order_id = (SELECT order_id FROM spree_shipments WHERE spree_shipments.id = spree_adjustments.adjustable_id) WHERE adjustable_type = 'Spree::Shipment' AND order_id IS NULL/home/jhawthorn/src/solidus/vendor/bundle/ruby/2.2.0/gems/activerecord-4.2.4/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:305:in `query'

Splitting the query into three statements fixes the issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting the query into three statements fixes the issue.

MySQL still amazes me. Will do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the adapter doesn't split by ; into separate statements. Which I think is semi-sensible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think the adapter should not. As he'd had to implement a full SQL parser to determine the role of a ; character.

Likely its a protocol level limitation by MySQL.

@mbj mbj mentioned this pull request Oct 29, 2015
* Fix invariant violating call sides
* Add migrations to fix bad data already in the system
@jhawthorn
Copy link
Contributor

👍 Looks great. Thanks for working through this.

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

You need more CI resources: Queued 16:14 waiting for builds to finish

@mbj
Copy link
Contributor Author

mbj commented Oct 29, 2015

I see this constantly. When both @dkubb and me are working at full speed at the current resource level we'd cause much huger build queues that slow down everyone.

Is someone paying for CircleCI right now?

jhawthorn added a commit that referenced this pull request Oct 29, 2015
Promote Order#all_adjustments to association.
@jhawthorn jhawthorn merged commit 15b1fe4 into solidusio:master Oct 29, 2015
@cbrunsdon
Copy link
Contributor

@mbj nobody is paying for it right now but I'll talk with hawthorn next time he is in the office and we can look at what kind of bullet we'd need to bite. Don't really want this slowing people down.

@mbj mbj deleted the fix/order-adjustment-association branch May 4, 2017 22:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants