Also deduplicate schema cache data when using the init_with interface #36529

Merged
merged 1 commit into from Jun 21, 2019

@casperisfine

commented Jun 21, 2019

Ref: #35891

I forgot to also apply the deduplication in init_with, which is actually the most likely one to be used.

Another small improvement is that @columns_hash is now eagerly indexed, so the indexing no longer happens on the fly during the first request that accesses it.
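
For illustration, here is a minimal sketch of the idea, not the actual diff. `SketchColumn` and `SketchSchemaCache` are hypothetical stand-ins for the real ActiveRecord classes, and a plain Hash stands in for the Psych coder. It assumes two columns count as identical when all of their attributes match.

```ruby
# Hypothetical stand-ins for illustration; these are not the ActiveRecord classes.
SketchColumn = Struct.new(:name, :sql_type, :null)

class SketchSchemaCache
  attr_reader :columns, :columns_hash

  # Psych calls init_with(coder) instead of initialize when loading from YAML.
  def init_with(coder)
    @columns = coder["columns"]
    deduplicate_and_index
  end

  private

  def deduplicate_and_index
    registry = {}
    # Replace every column with the first equal instance seen, so identical
    # columns across tables share a single object.
    @columns = @columns.transform_values do |cols|
      cols.map { |col| registry[col] ||= col }
    end
    # Build the name => column index eagerly rather than on first access.
    @columns_hash = @columns.transform_values do |cols|
      cols.to_h { |col| [col.name, col] }
    end
  end
end

# A plain Hash stands in for the Psych coder here (assumption for the sketch).
cache = SketchSchemaCache.allocate
cache.init_with(
  "columns" => {
    "users" => [SketchColumn.new("id", "bigint", false), SketchColumn.new("email", "varchar", true)],
    "posts" => [SketchColumn.new("id", "bigint", false)],
  }
)
```

After loading, `cache.columns["users"][0]` and `cache.columns["posts"][0]` are the same object, and `cache.columns_hash` is already populated.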

Sorry for overlooking this in the previous PRs.

@kaspth @Edouard-chin @rafaelfranca

@rails-bot rails-bot bot added the activerecord label Jun 21, 2019

@kaspth kaspth merged commit 75a0cf4 into rails:master Jun 21, 2019

2 checks passed

buildkite/rails Build #61857 passed (9 minutes, 4 seconds)
codeclimate All good!
@kaspth

Member

commented Jun 21, 2019

Curious to hear what the results are of all these optimizations now! 😄

@casperisfine

Author

commented Jun 21, 2019

We already know since I've been running this change as a monkey patch for a while now: #35860 (comment)

114 MB of RAM saved for us. But we're a bit special, as our database is heavily sharded, meaning each table's information exists ~50 times in memory.

However, even non-sharded applications should see a significant memory reduction, because many columns are similar from one table to the next. For instance, for a single shard we have ~5.5k columns, but only 2k unique ones (e.g. all tables have the same id, updated_at, created_at columns).
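
To make the total-versus-unique distinction concrete, here is a rough sketch on a toy schema. `ColumnDef`, `column_stats`, and the two tables are made up for the example, and it assumes columns are identical when every attribute matches.

```ruby
# Hypothetical column definition; real column objects carry more metadata.
ColumnDef = Struct.new(:name, :sql_type, :null)

# Counts all column objects versus the distinct ones.
def column_stats(columns_by_table)
  all = columns_by_table.values.flatten(1)
  { total: all.size, unique: all.uniq.size }
end

# Columns that every table in this toy schema shares.
def standard_columns
  [
    ColumnDef.new("id", "bigint", false),
    ColumnDef.new("created_at", "datetime", false),
    ColumnDef.new("updated_at", "datetime", false),
  ]
end

schema = {
  "users" => standard_columns + [ColumnDef.new("email", "varchar", true)],
  "posts" => standard_columns + [ColumnDef.new("title", "varchar", true)],
}

stats = column_stats(schema)
# 8 column objects in total, of which 5 are distinct.
```

With deduplication only the distinct columns need to stay in memory, and every additional table or shard that repeats them adds references rather than new objects.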

@kaspth

Member

commented Jun 21, 2019

Nice! What about parsing speed times from not storing columns_hash? Or were you using your own parsing mechanism?

@casperisfine

Author

commented Jun 21, 2019

Yeah, that I can't tell because we have a custom serialization format. That was a purely altruistic PR 😉

But since columns are the vast majority of the schema cache payload, I wouldn't be surprised if it made a low double digit percentage difference (20-30% wet finger estimate).

I'd need to find an app using both Rails 6 and the standard format to benchmark the difference. Unfortunately, I have none of those.

@kaspth

Member

commented Jun 21, 2019

Haha, your altruism is appreciated! 😄 I think we'll know the results soon enough.
