New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eagerly define attribute methods in production #33959

Merged
merged 1 commit into from Sep 24, 2018

Conversation

Projects
None yet
5 participants
@eugeneius
Member

eugeneius commented Sep 24, 2018

Similar to #29559.

The attribute methods for a model are currently defined lazily the first time that model is instantiated, even when config.eager_load is true. This means the first request to use each model incurs the cost, which usually involves a database round trip to fetch the schema definition.

By defining the attribute methods for all models while the application is booting, we move that work out of any individual request. When using a forking web server, this also reduces the number of times the schema definition is queried by doing it once in the parent process instead of from each forked process during their first request.

Eagerly define attribute methods in production
The attribute methods for a model are currently defined lazily the first
time that model is instantiated, even when `config.eager_load` is true.
This means the first request to use each model incurs the cost, which
usually involves a database round trip to fetch the schema definition.

By defining the attribute methods for all models while the application
is booting, we move that work out of any individual request. When using
a forking web server, this also reduces the number of times the schema
definition is queried by doing it once in the parent process instead of
from each forked process during their first request.
@rails-bot

This comment has been minimized.

Show comment
Hide comment
@rails-bot

rails-bot Sep 24, 2018

r? @kamipo

(@rails-bot has picked a reviewer for you, use r? to override)

rails-bot commented Sep 24, 2018

r? @kamipo

(@rails-bot has picked a reviewer for you, use r? to override)

@kamipo kamipo merged commit 0627442 into rails:master Sep 24, 2018

2 checks passed

codeclimate All good!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@rafaelfranca

This comment has been minimized.

Show comment
Hide comment
@rafaelfranca

rafaelfranca Sep 24, 2018

Member

Thanks! We had this initializer in our application for a while and we forgot to send a PR. I'll improve it later since this now is going to bombard your database with many queries on boot and depending on the number of process you are booting at the same time it can take your database down. I think this should only run if the schema information is on the cache already.

Member

rafaelfranca commented Sep 24, 2018

Thanks! We had this initializer in our application for a while and we forgot to send a PR. I'll improve it later since this now is going to bombard your database with many queries on boot and depending on the number of process you are booting at the same time it can take your database down. I think this should only run if the schema information is on the cache already.

@matthewd

This comment has been minimized.

Show comment
Hide comment
@matthewd

matthewd Sep 25, 2018

Member

Absent a schema cache, I feel uneasy about deliberately hitting the database during boot (even with #31221 to clean up afterwards).

The only concrete behavioural change I can point to is what happens if the application boots without the database available: previously we'd boot, then fail on each request until it comes back (and thus recover as soon as the database is there); we'll now presumably fail to boot, the process will exit, and we rely on the system process manager to start us again from scratch.

Assuming said process manager is on a tight restart loop, that makes "the database is MIA" a CPU-burning failure. I feel like we had a bug where a service was depended on during boot a couple of years ago, and it was noticed because it amplified an Actual Issue in someone's production environment. (My memory says Redis & @arthurnn​-era Shopify, but it's also possible I've just imagined the whole thing.)

I think this should only run if the schema information is on the cache already.

Yeah, that doesn't give the run-queries-before-fork benefit, but feels like it could be a safer balance to me. Run-queries-at-deploy is even less queries, after all. 😊

Member

matthewd commented Sep 25, 2018

Absent a schema cache, I feel uneasy about deliberately hitting the database during boot (even with #31221 to clean up afterwards).

The only concrete behavioural change I can point to is what happens if the application boots without the database available: previously we'd boot, then fail on each request until it comes back (and thus recover as soon as the database is there); we'll now presumably fail to boot, the process will exit, and we rely on the system process manager to start us again from scratch.

Assuming said process manager is on a tight restart loop, that makes "the database is MIA" a CPU-burning failure. I feel like we had a bug where a service was depended on during boot a couple of years ago, and it was noticed because it amplified an Actual Issue in someone's production environment. (My memory says Redis & @arthurnn​-era Shopify, but it's also possible I've just imagined the whole thing.)

I think this should only run if the schema information is on the cache already.

Yeah, that doesn't give the run-queries-before-fork benefit, but feels like it could be a safer balance to me. Run-queries-at-deploy is even less queries, after all. 😊

@eugeneius

This comment has been minimized.

Show comment
Hide comment
@eugeneius

eugeneius Sep 25, 2018

Member

I was hoping that this would provide most of the benefit of the schema cache dump without requiring any knowledge of how the application is deployed (and so could be enabled by default), but if talking to the database during boot is a no-go then that's not going to work. For my own use case, I'm fine with it being dependent on the presence of a schema cache dump.

@rafaelfranca, I'm happy to write another patch to address the concerns raised here - but if you'd prefer to upstream the Shopify implementation directly instead, that's fine too.

Member

eugeneius commented Sep 25, 2018

I was hoping that this would provide most of the benefit of the schema cache dump without requiring any knowledge of how the application is deployed (and so could be enabled by default), but if talking to the database during boot is a no-go then that's not going to work. For my own use case, I'm fine with it being dependent on the presence of a schema cache dump.

@rafaelfranca, I'm happy to write another patch to address the concerns raised here - but if you'd prefer to upstream the Shopify implementation directly instead, that's fine too.

@matthewd

This comment has been minimized.

Show comment
Hide comment
@matthewd

matthewd Sep 25, 2018

Member

Maybe we should consider promoting the schema cache as the default path -- just another thing you run along with asset precompile as part of a standard production deployment process.

Member

matthewd commented Sep 25, 2018

Maybe we should consider promoting the schema cache as the default path -- just another thing you run along with asset precompile as part of a standard production deployment process.

@rafaelfranca

This comment has been minimized.

Show comment
Hide comment
@rafaelfranca

rafaelfranca Sep 25, 2018

Member

Feel free to open the PR. I can give you a gist with our implementation too.

Member

rafaelfranca commented Sep 25, 2018

Feel free to open the PR. I can give you a gist with our implementation too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment