Also deduplicate schema cache data when using the init_with interface #36529

Merged
merged 1 commit into from Jun 21, 2019

Conversation

@casperisfine
Contributor

@casperisfine casperisfine commented Jun 21, 2019

Ref: #35891

I forgot to also apply the deduplication in init_with, which is actually the code path most likely to be used.

Another small improvement is that @columns_hash is now eagerly indexed, so the indexing is no longer done on the fly by the first request that accesses it.
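
For readers who want a picture of the two ideas, here is a minimal sketch. It is not the code from this PR: `Column`, `TinySchemaCache`, and `REGISTRY` are hypothetical names, and `init_with` is called by hand with a plain Hash instead of the coder object that YAML loading would pass in.

```ruby
# Hypothetical stand-in for a column object; Struct gives value-based #hash and #eql?.
Column = Struct.new(:name, :sql_type)

class TinySchemaCache
  # One canonical, frozen instance per distinct column value (illustrative only).
  REGISTRY = {}

  attr_reader :columns, :columns_hash

  # Same name as the hook used on YAML load; here `coder` is just a Hash.
  def init_with(coder)
    # Deduplicate: swap each loaded column for the canonical equal instance.
    @columns = coder["columns"].transform_values do |cols|
      cols.map { |col| REGISTRY[col] ||= col.freeze }
    end

    # Eager indexing: build the name => column lookup now, not on first access.
    @columns_hash = @columns.transform_values do |cols|
      cols.to_h { |col| [col.name, col] }
    end
  end
end

# Made-up payload: each table arrives with its own copy of the "id" column.
payload = {
  "columns" => {
    "users" => [Column.new("id", "bigint"), Column.new("created_at", "datetime")],
    "posts" => [Column.new("id", "bigint"), Column.new("title", "varchar")],
  },
}

cache = TinySchemaCache.new
cache.init_with(payload)

# Both tables now point at the very same "id" object.
puts cache.columns["users"][0].equal?(cache.columns["posts"][0])
```

The registry is keyed by value, so equal columns collapse to one object no matter which table they were loaded for.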

Sorry for overlooking this in the previous PRs.

@kaspth @Edouard-chin @rafaelfranca

@kaspth kaspth merged commit 75a0cf4 into rails:master Jun 21, 2019
2 checks passed
@kaspth
Member

@kaspth kaspth commented Jun 21, 2019

Curious to hear what the results are of all these optimizations now! 😄

@casperisfine
Contributor Author

@casperisfine casperisfine commented Jun 21, 2019

We already know, since I've been running this change as a monkey patch for a while now: #35860 (comment)

114 MB of RAM saved for us. But we're a bit special as our database is heavily sharded, meaning each table's information exists ~50 times in memory.

However, even non-sharded applications should see a significant memory reduction because many columns are similar from one table to the other. For instance, for a single shard we have ~5.5k columns, but only 2k unique ones (e.g. all tables have the same id, updated_at, created_at columns).
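
To get a rough feel for how much duplication a schema has, something like the following sketch could be used. The tables and columns are made up for illustration; only the shard count of 50 echoes the approximate figure above.

```ruby
# Made-up schema: [name, sql_type] pairs per table (not real application data).
tables = {
  "users"    => [%w[id bigint], %w[created_at datetime], %w[updated_at datetime], %w[email varchar]],
  "posts"    => [%w[id bigint], %w[created_at datetime], %w[updated_at datetime], %w[title varchar]],
  "comments" => [%w[id bigint], %w[created_at datetime], %w[updated_at datetime], %w[body text]],
}

all_columns = tables.values.flatten(1) # one entry per column, per table
total  = all_columns.size
unique = all_columns.uniq.size         # columns that are equal by value count once

shards = 50 # assumption: every shard loads the same schema

puts "#{total} columns per shard, #{unique} unique"
puts "#{total * shards} column objects across shards without deduplication"
```

With this toy schema, half of the columns in a single shard are duplicates, and with deduplication the number of distinct objects stays at the unique count regardless of how many shards load the schema.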

@kaspth
Member

@kaspth kaspth commented Jun 21, 2019

Nice! What about parsing times now that columns_hash isn't stored? Or were you using your own parsing mechanism?

@casperisfine
Contributor Author

@casperisfine casperisfine commented Jun 21, 2019

Yeah, that I can't tell because we have a custom serialization format. That was a purely altruistic PR 😉

But since columns are the vast majority of the schema cache payload, I wouldn't be surprised if it made a low double-digit percentage difference (20-30% as a wet-finger estimate).

I'd need to find an app using both Rails 6 and the standard format to benchmark the difference; unfortunately, I have none of those.

@kaspth
Member

@kaspth kaspth commented Jun 21, 2019

Haha, your altruism is appreciated! 😄 I think we'll know the results soon enough.
