
Adding a column to the filecache is super time consuming (depending on db) #22747

Closed
butonic opened this issue Mar 1, 2016 · 57 comments

@butonic
Member

butonic commented Mar 1, 2016

The resulting ALTER TABLE for the checksum column added in OC9 will create a temp table with the same columns as the filecache, add the column, COPY the table content, and finally rename it and delete the original table. Big installations will experience a very long downtime, as we have seen today with a huge activity table change when updating to oc 8.2.2.

In this case MySQL 5.5 was used. In its documentation I see no way around copying the contents: http://dev.mysql.com/doc/refman/5.5/en/alter-table.html#idm140603778401088 It seems this has still not changed with MySQL 5.7: http://dev.mysql.com/doc/refman/5.7/en/innodb-create-index-overview.html#idm140023654764880 although it should be less of a performance hog.

cc @MorrisJobke @DeepDiver1975 @karlitschek

@butonic butonic added this to the 9.0-current milestone Mar 1, 2016
@butonic
Member Author

butonic commented Mar 1, 2016

Postgres may omit rebuilding the table if no default value is specified: http://www.postgresql.org/docs/9.5/static/sql-altertable.html#AEN72358
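To illustrate the PostgreSQL behavior mentioned above, a sketch using the oc_filecache example from this thread; whether the rewrite is actually skipped depends on the PostgreSQL version in use:

```sql
-- Adding a nullable column without a DEFAULT is a catalog-only change
-- in PostgreSQL, so it is effectively instant regardless of table size:
ALTER TABLE oc_filecache ADD COLUMN checksum varchar(255);

-- Adding a column WITH a default (on the 9.x versions discussed here)
-- forces a full table rewrite, because every existing row is filled in:
ALTER TABLE oc_filecache ADD COLUMN checksum varchar(255) DEFAULT '';
```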

@karlitschek
Contributor

Yes, this is actually the wanted behavior. We want to simulate the upgrade before we actually touch the real table. The reason is that we had a ton of bugs before, triggered by combinations of specific db types and servers with specific limitations: sometimes broken entries in the table, size limitations, or the combination of all of that.
So we destroyed user data in the past.
The fix is to copy the table first and test it. It is accepted that this takes some time, but it is better than destroying the database.
Experienced admins can skip this if they have tested the upgrade manually before.

@MorrisJobke
Contributor

The fix is to copy the table first and test it. It is accepted that this takes some time, but it is better than destroying the database.

It's not our code that copies the table, it's MySQL. We run the upgrade simulation steps: that itself triggers a copy of the table in MySQL (1), then we apply the schema update (ALTER TABLE), which itself copies the table internally (2). Then we run the same on the actual table (3+4).

This means we copy the whole table 4 times instead of 1 time. This totally kills bigger installations on MySQL.

I guess we need a way to figure out why MySQL copies the table sometimes when a column is added (I guess this is the case when the whole table doesn't fit in memory).

@MorrisJobke
Contributor

@karlitschek @DeepDiver1975 You asked me about this. It seems that I was wrong here. Today I noticed that the complete activity table was copied while running the alter table statement to add two columns (owncloud/activity@0470887).

@MorrisJobke
Contributor

We should re-evaluate adding the column to the filecache, because this could cause a serious problem.

Ref #21997

@karlitschek @DeepDiver1975 Sorry - I made a mistake by stating that this will not be a problem. 😢

@MorrisJobke
Contributor

@PVince81 @nickvergessen I recall that Frank said you tested this and it worked fine. What exactly was the test scenario?

@PVince81
Contributor

PVince81 commented Mar 1, 2016

I think @karlitschek tested it himself and it was fast.

@DeepDiver1975
Member

adding sev1 - this needs to be sorted out

@MorrisJobke
Contributor

I updated my description above - it's a ratio of 1 to 4.

@PVince81
Contributor

PVince81 commented Mar 1, 2016

To clarify: this report is NOT about simulated update.
It's about the ALTER TABLE that happens afterwards.

@DeepDiver1975
Member

http://dev.mysql.com/doc/refman/5.7/en/alter-table.html - see 'Storage, Performance, and Concurrency Considerations'

@MorrisJobke
Contributor

I think @karlitschek tested it himself and it was fast.

The question is how big was the table and how much memory was available. This happened on an instance with 177M entries in the activity table.

@MorrisJobke
Contributor

To clarify: this report is NOT about simulated update.
It's about the ALTER TABLE that happens afterwards.

Correct.

@DeepDiver1975
Member

Reading the docs, the behavior depends on the MySQL version

@MorrisJobke
Contributor

This was a mariadb 5.5.48 on RHEL 7.

@DeepDiver1975
Member

Looks like the behavior can be influenced by setting ALGORITHM=INPLACE in the statement.
InnoDB supports this
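As a sketch of what that would look like for the column discussed in this thread (note: the ALGORITHM/LOCK clauses require MySQL 5.6 or later, and even an in-place ADD COLUMN may still rebuild the table internally, just without blocking concurrent DML):

```sql
-- Request an in-place change explicitly; MySQL then raises an error
-- instead of silently falling back to a full table copy if the
-- storage engine cannot comply:
ALTER TABLE oc_filecache
  ADD COLUMN checksum varchar(255) DEFAULT NULL,
  ALGORITHM=INPLACE, LOCK=NONE;
```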

@MorrisJobke
Contributor

Looks like the behavior can be influenced by setting ALGORITHM=INPLACE in the statement.

@butonic Didn't you say that this does not work in InnoDB?

@DeepDiver1975
Member

http://dev.mysql.com/doc/refman/5.7/en/innodb-online-ddl.html

'The online DDL feature enhances many types of ALTER TABLE operations to avoid table copying, blocking of DML operations while DDL is in progress, or both. '

@DeepDiver1975
Member

online ddl is inplace since 5.5

@karlitschek
Contributor

Sorry for the misunderstanding. I see now that this is about the alter table command itself and not our code.

@karlitschek
Contributor

I indeed did a test on my local machine with some automatically generated data, obviously not a real production environment. And in this case adding a column to an existing huge table was surprisingly fast, basically instant. I used a new MariaDB, the latest that comes with Ubuntu 15.10.

@karlitschek
Contributor

I don't see an alternative to the current approach. It was always clear that we would have long db migrations again once we touch the filecache table. This was expected. The only alternative would be to put the checksum into a new table and join them. This would mean a fast migration but slower operation afterwards. We would then have two heavy-load tables instead of one, plus the join overhead. -> Not an option.
Let's explore if online DDL can help in new mysql/mariadb versions. Otherwise we have to live with long-running db migrations.
By the way, it was the plan of the new updater to make it possible to run the ALTER TABLEs asynchronously before the real update. Unfortunately this wasn't done for 9.0.

@nickvergessen
Contributor

What we could also do is give the ALTER TABLE command to the customers before the update.
So they can modify the table in the night before the update, when usage is low.

Our updater then does not need to modify the table anymore and is therefore faster.

@karlitschek
Contributor

@nickvergessen Yes. This is what I meant earlier. :-)

@nickvergessen
Contributor

@nickvergessen Yes. This is what I meant earlier. :-)

And what I meant is that although the updater is not capable of doing that, we can still achieve it manually, so the update is faster and we don't need to make big changes now.

@PVince81
Contributor

PVince81 commented Mar 9, 2016

So if I understand correctly, we need to adjust the updater to give estimates?
Assigned @VicDeo

@VicDeo
Member

VicDeo commented Mar 9, 2016

er, I wouldn't touch that with a 10 foot pole because it might kill kittens...
But there IS a way to speed up ALTER TABLE for MySQL in particular.
Something like this:

CREATE TABLE new_table LIKE old_table;
ALTER TABLE new_table ADD new_column _datatype_;
INSERT INTO new_table SELECT *, NULL FROM old_table;
DROP TABLE old_table;
RENAME TABLE new_table TO old_table;
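One caveat with the recipe above: between DROP TABLE and RENAME TABLE there is a window in which the table does not exist at all, and a crash there loses the data for good. A safer variant (still only a sketch, and relying on the new column being appended last so that `SELECT *, NULL` lines up with the new schema) swaps the tables in a single atomic RENAME and keeps the old data around until it has been verified:

```sql
CREATE TABLE new_table LIKE old_table;
ALTER TABLE new_table ADD new_column _datatype_;
-- SELECT *, NULL relies on new_column being the last column of new_table
INSERT INTO new_table SELECT *, NULL FROM old_table;
-- atomic swap: there is never a moment when old_table is missing
RENAME TABLE old_table TO old_table_backup, new_table TO old_table;
DROP TABLE old_table_backup;  -- only after verifying the copy
```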

@MorrisJobke
Contributor

But there IS a way to speed up ALTER TABLE for MySQL in particular.

Why should this be faster?

@VicDeo
Member

VicDeo commented Mar 14, 2016

@MorrisJobke It's not my own recipe, but something I found online. This should be measured to say for sure. The main reason is the lack of a temporary table, I guess.

@VicDeo
Member

VicDeo commented Mar 14, 2016

A planned workaround is asking the user to run the ALTER TABLE query on their own in case the database platform is MySQL/Postgres and the number of entries in the filecache table exceeds a certain threshold.

@MorrisJobke
Contributor

A planned workaround is asking the user to run the ALTER TABLE query on their own in case the database platform is MySQL/Postgres and the number of entries in the filecache table exceeds a certain threshold.

How does this speed up the process? What is this threshold?

@VicDeo
Member

VicDeo commented Mar 15, 2016

@MorrisJobke

How does this speed up the process?

This is a workaround and not the solution.

What is this threshold?

I think it's something like ~300k (and more) filecache entries. What's your opinion?

@VicDeo
Member

VicDeo commented Mar 15, 2016

Some benchmarks that I've done on a solidgear docker instance, by bumping the version and adding a checksum field to the filecache (and dropping the column on the next iteration).
MySQL 5.1 (chosen intentionally because 5.5 is faster):

num of filecache entries | time, seconds
1 000 | 3-6
10 000 | ~8
100 000 | 170-180
300 000 | ~1300

@butonic
Member Author

butonic commented Mar 18, 2016

There is official documentation on how to improve bulk insert performance: http://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb-bulk-data-loading.html

Basically it means disabling index updates and uniqueness checks. Be aware that re-enabling them after the insert operation causes a rebuild of the index; it will still be faster than updating the index on every insert, however.
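Applied to the manual copy-based recipe discussed earlier in this thread, the settings from that documentation page would look roughly like this (session-scoped variables, so nothing has to be reset globally; `new_table`/`old_table` stand in for the real names):

```sql
-- Speed up the bulk INSERT phase of a manual table copy
SET unique_checks = 0;
SET foreign_key_checks = 0;
SET autocommit = 0;

INSERT INTO new_table SELECT *, NULL FROM old_table;
COMMIT;

-- Re-enable the checks; secondary index entries are validated here
SET unique_checks = 1;
SET foreign_key_checks = 1;
SET autocommit = 1;
```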

@VicDeo
Member

VicDeo commented Mar 18, 2016

@butonic it's not possible to completely disable indexes for the InnoDB storage engine, and ALTER TABLE is not the same as a bulk insert.

The only benefit I see is a DB copying speedup: #23395

Here are four test results on the same container (the schema change was nothing but adding/dropping the checksum column on a filecache table with 300 000 entries in it):
Time of occ upgrade completion (including schema check) in seconds

operation | 8.2.3 | 8.2.3 + #23395
add column | 1417 | 1131
drop column | 1360 | 1085

This doesn't speed up the migration of the original DB; it's a performance tweak for the test migration.

@guruz
Contributor

guruz commented Mar 18, 2016

Why not have a new table for the checksums?
With a nice index from fileid -> (checksumType, checksum)
That way you also save space for directories or files that don't have a checksum
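Sketching the proposed schema (table and column names are made up for illustration, this is not what ownCloud ships); the price is the LEFT JOIN on every metadata read:

```sql
CREATE TABLE oc_checksums (
  fileid BIGINT NOT NULL PRIMARY KEY,  -- references oc_filecache.fileid
  checksum_type VARCHAR(32) NOT NULL,
  checksum VARCHAR(255) NOT NULL
);

-- Every read of a file's metadata then needs this join; files and
-- directories without a checksum simply have no row here:
SELECT f.*, c.checksum_type, c.checksum
  FROM oc_filecache f
  LEFT JOIN oc_checksums c ON c.fileid = f.fileid
 WHERE f.fileid = 42;
```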

@nickvergessen
Contributor

But we would need to LEFT JOIN the table all the time, which is even worse for huge installations

@MorrisJobke
Contributor

Why not have a new table for the checksum?

Because it's too late.

@VicDeo
Member

VicDeo commented Mar 21, 2016

@PVince81 @DeepDiver1975 So, what finally needs to be done?

Add the following to the update notification in the Web UI for stable8.2 when MySQL is used and oc_filecache has more than 200k entries:

Depending on your filecache table size, the upgrade might take a long time. You can execute the following query on your database to speed up the upgrade process: ALTER TABLE oc_filecache ADD COLUMN checksum varchar(255) DEFAULT NULL AFTER permissions;

@PVince81
Contributor

Sounds good to me. @karlitschek @DeepDiver1975 @butonic @MorrisJobke what do you think ?

@karlitschek
Contributor

We discussed this and we don't want to have a new table for the checksums. We optimize for production speed instead of upgrade speed.

@karlitschek
Contributor

I think we should add a warning to the web UI and the CLI if you have more than 200k files. Please check the original updater 2.0 issue for the right place in the flow where we have the hint about the length of the upgrade.
We should use this text instead:

You have an ownCloud installation with over 200,000 files so the upgrade might take a while. The recommendation is to use the command line instead of the web interface for big ownCloud servers. Hint: You can speed up the upgrade by executing this SQL command manually: ALTER TABLE oc_filecache ADD COLUMN checksum varchar(255) DEFAULT NULL AFTER permissions;

Note: please show the hint only if using MySQL.
Please show the CLI hint only in the web UI.

Does this make sense?

@VicDeo
Member

VicDeo commented Mar 23, 2016

@karlitschek shouldn't this notice be added to 8.2 in order to be displayed before any action takes place?

@karlitschek
Contributor

I think 9.0.x is fine

@nickvergessen
Contributor

But the update to 9.0.x is the one causing the troubles?! 😕

@MorrisJobke
Contributor

But the update to 9.0.x is the one causing the troubles?!

And then already the new 9.0.x code is available before the upgrade step has run. So this should be fine ;)

@VicDeo VicDeo mentioned this issue Mar 24, 2016
2 tasks
@ghost ghost modified the milestones: 9.0.2-current-maintenance, 9.0.1 Apr 11, 2016
@PVince81
Contributor

So if I understand correctly, the warning has been added in 9.0.1, hence closing.

Please reopen if you think there is something left to do.

@lock

lock bot commented Aug 4, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Aug 4, 2019