Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop checking schema cache version before loading it #36925

Closed

Conversation

casperisfine
Copy link
Contributor

There isn't much value in this check, but it has several downsides.

The only point of it, is to save users from loading a severely outdated
cache, which arguably is a bug in the user deploy and migration process.

On the other side, if the application does implement 0-downtime deployments,
all migrations already need to be fully backward compatible, hence loading
a cache that is a couple migrations behind is much better than invalidating
it which will cause lots of schema queries to be issued.

Another issue is that queries during the boot process are inherently
problematic, because it create a hard dependency on a functional and healthy
database to allow web workers to boot, which in a outage recovery scenario
can makes things much harder.

e.g. you deploy a very slow query that overloads the database, you start
a rollback but the web workers with the sane version crash on boot because
they can't query the schema cache version.

cc @rafaelfranca @Edouard-chin

@casperisfine casperisfine force-pushed the stop-checking-schema-cache-version branch 2 times, most recently from 9e8274d to c213dc8 Compare August 13, 2019 14:30
There isn't much value in this check, but it has several downsides.

The only point of it, is to save users from loading a severely outdated
cache, which arguably is a bug in the user deploy and migration process.

On the other side, if the application does implement 0-downtime deployments,
all migrations already need to be fully backward compatible, hence loading
a cache that is a couple migrations behind is much better than invalidating
it which will cause lots of schema queries to be issued.

Another issue is that queries during the boot process are inherently
problematic, because it create a hard dependency on a functional and healthy
database to allow web workers to boot, which in a outage recovery scenario
can makes things much harder.

e.g. you deploy a very slow query that overloads the database, you start
a rollback but the web workers with the sane version crash on boot because
they can't query the schema cache version.
@casperisfine casperisfine force-pushed the stop-checking-schema-cache-version branch from c213dc8 to 1e1588d Compare August 13, 2019 14:38
@kaspth
Copy link
Contributor

kaspth commented Aug 13, 2019

I wonder if we should even bother with a deprecation cycle. Schema caches have never been one of the most used features, so people may not be aware of this particular toe-stub with it.

@casperisfine
Copy link
Contributor Author

casperisfine commented Aug 13, 2019

I agree. I initially simply stripped that code but @rafaelfranca said he preferred to go though deprecation. I don't mind either way, but it's true that not deprecating it would make my life easier (this is in preparation of further PRs).

Schema caches have never been one of the most used features

Yeah, I wonder how much they are used. It's kind of a shame because they are really essential for not tanking performance post deploy/restart.

@jeremy
Copy link
Member

jeremy commented Aug 13, 2019

The only point of it, is to save users from loading a severely outdated cache, which arguably is a bug in the user deploy and migration process.

This downplays the point of it. Can running an app with an outdated schema break the app in ways it doesn't expect?

loading a cache that is a couple migrations behind is much better than invalidating it which will cause lots of schema queries

I think we need a stronger evaluation of the costs of running an app with a schema that doesn't match the actual database.

Running with an out-of-sync schema may be reasonable for some apps and deployment strategies. Is it reasonable as a default for all apps and deployment strategies?

@casperisfine
Copy link
Contributor Author

Can running an app with an outdated schema break the app in ways it doesn't expect?

As much as running a migration while the app is still running. Effectively you are changing the schema after it was loaded by the app.

So unless you are putting your app in some kind of maintenance mode before running your migrations, you are routinely running with an outdated schema cache (regardless of wether you loaded it from a dump or queried the database).

Is it reasonable as a default for all apps and deployment strategies?

As mentioned by @kaspth, the schema cache dump feature is mostly used by experimented users who know Rails well enough, and have an application big enough to have noticed the queries spikes post deploy.

@matthewd
Copy link
Member

the schema cache dump feature is mostly used by experimented users who know Rails well enough

We've recent-ish-ly discussed the idea of encouraging its use by default for everyone, as part of a standard deploy process right next to asset generation. To me this change would be a hard veto on that idea, and I'd rather not give up on it yet.

As @jeremy suggests, I think this is a reasonable "go fast, I'll make sure things don't break" option, but not a well-balanced choice for most users.

(I'd also prefer to give those users a better experience too, by deferring the version check until we connect for some other reason [so, validate the schema cache the first time we try to use it instead of immediately], and/or somehow explicitly accomodating the idea that these two schema versions in particular are compatible despite being different numbers. But for now I'll settle for not leaving people with hard-to-diagnose production problems.)

@casperisfine
Copy link
Contributor Author

encouraging its use by default for everyone

That would be a good thing, indeed.

To me this change would be a hard veto on that idea, and I'd rather not give up on it yet.

I don't understand why. If anything this makes it more usable, as your cache wouldn't be (almost) silently ignored. Also if the consensus end up being that this version check is actually a good thing (which I totally don't agree with but wtv), then I understand a veto about the deprecation, but not about the second flag.

by deferring the version check until we connect for some other reason

Yes, that's what I planned as a followup, but it requires some relatively important refactoring first, and is kinda related to sharding support, but I'm not quite ready to submit all that.

not leaving people with hard-to-diagnose production problems

I'm curious what you have in mind here, because I'm trying to think about deploy strategies where it would cause a problem, but can't think of anything.

How do you usually integrate it in your deploy workflow?

@rails-bot
Copy link

rails-bot bot commented Dec 17, 2019

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants