Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[4.x]: Upgrade script from Commerce 3 to Commerce 4 - Migrating address data extremely slow #3286

Closed
ragnarfrosti opened this issue Oct 2, 2023 · 27 comments
Assignees
Labels
bug commerce4 Issues related to Commerce v4

Comments

@ragnarfrosti
Copy link

What happened?

We are preparing one of our client webs for the Craft/Commerce 4 upgrade.

The final thing we are doing before launching the update is running the upgrade, locally in DDEV, using a copy of the production database with all customer/order data.

That database has ~200 000 orders and ~400 000 addresses and running the commerce/upgrade command on that database is extremely slow and specially the Migrating address data part that takes more than 8 hours to finalize.

How did your test-upgrade with a large amount of order-data work performance wise? I guess we must have missed something in our commerce setup because this can't be the expected behaviour of the upgrade script (or is something wrong with the script performance?)

Any help appreciated since we are planning to push out the update to production later this week and having to close the web for over 8 hours is an option that is hard to present to the client.

Craft CMS version

4.5.5

Craft Commerce version

4.3.0

PHP version

8.0.30

Operating system and version

No response

Database type and version

No response

Image driver and version

No response

Installed plugins and versions

@ragnarfrosti ragnarfrosti added commerce4 Issues related to Commerce v4 bug labels Oct 2, 2023
@codyjames
Copy link

Oof, yeah, I'm concerned about this for our upcoming migration as well. Hoping there is a way to speed this up!

@Anubarak
Copy link

Anubarak commented Oct 5, 2023

There is nearly no way to increase the performance for this.
There would be if it was the other way around (Element to Record) but since the system needs to create around 400.000 elements it will just take that much time because elements are split between so many different tables.

While there could be minor improvements in the code they won't change too much and the impact is not too much.
You could ask for more power / more RAM / more CPU on the target system to reduce the time it takes but a long migration process is one of the disadvantages to make addresses elements.

@lukeholder
Copy link
Member

lukeholder commented Oct 5, 2023

We are investigating any improvements we can make and will update if we find anything, but please keep in mind that once you have migrated, and run the upgrade the command on production your site, the site will continue to work, but customers will not have all their addresses available for selection until the upgrade command has finished. New orders can still be made etc. We will document this on the upgrade guide.

@ragnarfrosti
Copy link
Author

but please keep in mind that once you have migrated and run the upgrade the command on production your site will continue to work but customer will not have all their addresses available for selection until the upgrade command has finished

That does not seems to be entirely true, I can view some part of the web yes, but I can't write to the database when the update script is running since the transaction locks the database. So when I try to login or add products in the cart when running the update script the requests times out and returns "General error: 1205 Lock wait timeout exceeded;"

We have a test environment on the same server as the production environment and have tried to run the update script there with increased resources (54GB RAM, 32GB innodb buffer pool size settings etc) and the performance on that server was even worse than locally and would take ~360 hours!

tmux-server

The 8 hours estimate I mention in the first message is the best case scenario since then we usually see ETA around 1-2 weeks now locally. Once in a while we get the ETA down to ~8 hours but not every time. We have disabled all plugins (except for Craft + Commerce), we have tried all types of different DDEV setups with improved server/PHP values (10 cores, 32GB memory, improved buffers, etc) in both MySQL/MariaDB environments without any reliable results.

@isodude
Copy link

isodude commented Oct 5, 2023

I think craft\commerce\services\Orders->eagerLoadAddressesForOrders is partly to blame here. The SQL query takes about 3 seconds to complete.

EDIT: when testing the same query before the upgrade it's quite fast (0.1s). Is it not possible to make the same query but for a whole bunch matches at the same time?
EDIT2: Could the problem arise when a schema migration is done?
EDIT3: It seems that (sourceBillingAddressId|sourceShippingAddressId) is not indexed even though it's used in a WHERE.

@ragnarfrosti
Copy link
Author

ragnarfrosti commented Oct 5, 2023

@isodude I think you are onto something here.

I looked at the Order service and just below the function you referred to above is an address-event-handler:
public function afterSaveAddressHandler(ModelEvent $event): void that makes a query that uses sourceBillingAddressId and sourceShippingAddressId in the where condition.

If i skip the code from this event handler I get an ETA of less than 1 hour instead of 146 hours.
The sourceBillingAddressId and sourceShippingAddressId didn't exist in Craft 3 if I remember it correctly so this code should be ok to skip completely during the commerce/upgrade command?

Since we are going to use these fields in Commerce 4, then surely the system will need indexes on these two fields to avoid this performance bottleneck when using these two fields in a query?

Is this something that can help you with the performance improvements @lukeholder ?

Screenshot 2023-10-05 at 23 44 37

EDIT: Added index on these two columns in our test environment and the Address migrations of the commerce/upgrade script ETA is just under 1 hour even when running the code from afterSaveAddressHandler.

@lukeholder
Copy link
Member

lukeholder commented Oct 6, 2023

@ragnarfrosti

Fantastic finds!

I have made the various improvements. Could you all please test:

To get the fix early, change your craftcms/commerce requirement in composer.json to:

"require": {
  "craftcms/commerce": "dev-develop#e71e76686e60bcb39c0b70b029c45e1016fbfe8a as 4.3.0",
  "...": "..."
}

Then run composer update.

cc @isodude @Anubarak @codyjames

@lukeholder lukeholder self-assigned this Oct 6, 2023
@Anubarak
Copy link

Anubarak commented Oct 6, 2023

Great 👍
Will test them next week and give feedback.
One thing that might be a possible overall performance increase for general element saves would be a way to save multiple elements in batches via batchInsert. Of course this might be highly risky in case those elements are going to reference each other but I guess in this case it would be a performance increase.

That could be useful for other purposes as well - we would miss some events of course - but I had to write that once too (just very specific for one element type)
https://www.yiiframework.com/doc/api/2.0/yii-db-command#batchInsert()-detail

@ragnarfrosti
Copy link
Author

@lukeholder yes I ran the upgrade script using the branch above and that works well and I get the same ETAs as with my manually fixes yesterday 👍

@lukeholder
Copy link
Member

@Anubarak yeah we are looking into it but it's more complex than active record inserts.

@ragnarfrosti excellent. Will be in a release soon.

@christofersandin
Copy link

@lukeholder

We are investigating any improvements we can make and will update if we find anything,

Regarding issues like this, how large data sets is an upgrade script like this tested with on your end? I want to think that there are quite a few Craft Commerce systems out there with both 500,000, 1,000,000, and 1,500,000 addresses.

please keep in mind that once you have migrated and run the upgrade the command on production your site will continue to work but customer will not have all their addresses available for selection until the upgrade command has finished. New orders can still be made etc. We will document this on the upgrade guide.

Is this actually true? From our experience the system locks database writes and essential data like countries for shipping and tax rules, etc., are populated during the transaction or am I missing something here?

@lukeholder
Copy link
Member

lukeholder commented Oct 6, 2023

@christofersandin the upgrade slowness was caused by new code added in 4.2.x and the performance regression was not caught until we had received a number of complaints well after 4.0 had come out. We tested the original 4.0 upgrade in development with a couple of large real databases as well as synthetic and it seemed reasonable when 4.0 came out. We will continue to monitor and see if we can add performance regressions tests in future releases. Thanks.

@lukeholder
Copy link
Member

Will be included in a tagged version release in a few days.

@isodude
Copy link

isodude commented Oct 6, 2023 via email

@Anubarak
Copy link

Anubarak commented Oct 11, 2023

image
I need to wait about ~47 hours until I can say how long it takes now
(Edit: all User events are deaktivated)

@ragnarfrosti
Copy link
Author

We did the migration in production couple of days ago and the whole process took about 2 hours on the production server (it took about ~45 minutes locally).

But now we just noticed that the script "Creating a inactive user for each guest email" not creates a inactive user for each guest email, but for each guest order.

So if there are more than one guest order for a single e-mail address then the script will create multiple inactive guest accounts, all with the same e-mail address. Now we have 25.000 email addresses with more than 2 inactive user accounts.

@lukeholder
Copy link
Member

lukeholder commented Oct 13, 2023

@ragnarfrosti sorry I am a little unclear what has happened. Are you saying that multiple inactive users have been created with the same email?

@ragnarfrosti
Copy link
Author

Hi @lukeholder

Yes,

I ran the following query after the upgrade script and also on a backup from Craft 3 we had from last week.

SELECT email, COUNT(email) as total
FROM craft_users
GROUP BY email
ORDER BY total DESC

No email-address appeared more than once before the upgrade but after the upgrade script it we had ~25000 of duplicates, 2 or more in the table.

I've just looked at couple of those and it seems the number of email addresses has to do with the number of orders the customers had purchased with each e-mail as a guest.

So

  • if a guest e-mail had 1 order before the upgrade script then 1 guest user is created
  • if a guest e-mail had 2 orders before the upgrade script then 2 guest users are created
  • if a guest e-mail had 3 orders before the upgrade script then 3 guest User are created
  • ... etc

@nfourtythree
Copy link
Contributor

Hi @ragnarfrosti

I was able to replicate the issue you highlighted. We have just pushed up a fix for duplicating inactive users for guest customer issue. This will be included in the next release of Commerce.

To get this early, change your craftcms/commerce requirement in your project's composer.json to:

"require": {
  "craftcms/commerce": "dev-develop#fb9d51bf9bac5d26033b56215f14dc33f35e8f14 as 4.3.0",
  "...": "..."
}

Then run composer update.

We will continue to look into other improvements regarding upgrade speed.

Thanks!

@joshuapease
Copy link
Contributor

joshuapease commented Oct 13, 2023

I had the same "multiple inactive users" issue as @ragnarfrosti on a project.

That fix seems to have resolved it.

@Anubarak
Copy link

Anubarak commented Oct 17, 2023

I just increased the performance a "little bit" for me as well by changing another query
https://github.com/craftcms/commerce/blob/4.3.0/src/console/controllers/UpgradeController.php#L1168

$guestEmails = (new Query())
    ->select(['[[o.email]]'])
    ->from(['o' => $ordersTable])
    ->leftJoin(['u' => $usersTable], 'o.email = u.email')
    ->where(['u.email' => null])
    ->andWhere(['not', ['o.email' => null]])
    ->column();

->andWhere(['not', ['o.email' => null]]) will not cover empty strings. This query fetched 160.000 empty emails and tried to create many users with an empty mail address...

after changing it to

->andWhere([
    'and',
    ['not', ['o.email' => null]],
    ['not', ['o.email' => '']]
])

or just

->andWhere(Db::parseParam('o.email', ':notempty:'))

it will ignore orders with an email as empty string.
Now it only takes 8h instead of 80h Creating a inactive user for each guest email

@nfourtythree
Copy link
Contributor

Hi @Anubarak

Thanks for that, as a quick proof of concept we pushed up this fixed yesterday which does the same thing but just in code. This fixed the issue and the plan was to update it today to be in the query as you have pointed in your message.

Thanks!

@ragnarfrosti
Copy link
Author

Great that you were able to solve the multiple customers in the database @nfourtythree and @lukeholder.

But since we already had run the script on the production web we have ~23000 guest users in the database with 2 accounts or more. These guest accounts are about 71.000 so we need to remove ~58000 of those from the database since we've already received complaints from couple of these users that they aren't able to register on the web (they aren't able to enable the account because of the duplicates).

Do you have any suggestions how we can do this in the best way without messing up any other order/commerce related data?

@smockensturm
Copy link

This is crazy. Should a large store go offline while the upgrade script runs?

@lukeholder
Copy link
Member

lukeholder commented Oct 19, 2023

@ragnarfrosti Can you please send your DB dump to support@craftcms.com and we will write a script that will fix the data ASAP.

@smockensturm All Craft sites (Not just Commerce) go offline while they are upgraded (migrations are run). The Commerce upgrade command should be run immediately after that and the site should be operational while the command completes its process, users might not see all of their addresses until the command completes. The previous reports of this not being possible are due to the bug that has been fixed. This is a major version update so some downtime is required. We don't plan to have future versions of Commerce needing to migrate so many elements again. We have been making changes to Commerce 5 to ease and speed up the upgrade process. Thanks.

@lukeholder
Copy link
Member

Closing due to the original issue being fixed in 4.3.1, please make a new issue if you have any other problems and we will look into it immediately.

@smockensturm
Copy link

@lukeholder thank you. one quick followup if i may. it appears that if a pre Commerce 4 user has some addresses 'saved' but has never completed an order (and thus never becoming a customer) those addresses are not migrated to the users account. is that true?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug commerce4 Issues related to Commerce v4
Projects
None yet
Development

No branches or pull requests

9 participants