Add support of read slaves. #104

Closed
wants to merge 1 commit into
from

Conversation

Projects
None yet
7 participants
@philiptzou

No description provided.

@rdegges

This comment has been minimized.

Show comment Hide comment
@rdegges

rdegges Feb 3, 2013

I know this isn't documented and doesn't come with unit tests, but I'd like to +1 this as I think having read slave support in Flask-SQLAlchemy is critical.

I've got several really large scale applications that need this, and making use of hacks is kinda lame =/

rdegges commented Feb 3, 2013

I know this isn't documented and doesn't come with unit tests, but I'd like to +1 this as I think having read slave support in Flask-SQLAlchemy is critical.

I've got several really large scale applications that need this, and making use of hacks is kinda lame =/

@mitsuhiko

This comment has been minimized.

Show comment Hide comment
@mitsuhiko

mitsuhiko May 16, 2013

Owner

I like the idea but not the implementation. Wonder though what a better way would be. I suppose a Foo.query.from_slave().get(...) makes more sense.

Owner

mitsuhiko commented May 16, 2013

I like the idea but not the implementation. Wonder though what a better way would be. I suppose a Foo.query.from_slave().get(...) makes more sense.

@eriktaubeneck

This comment has been minimized.

Show comment Hide comment
@eriktaubeneck

eriktaubeneck Dec 18, 2013

This is quite an old pull request, but I am planning on figuring out some sort of solution for my app soon. This post by zzzeek seems to be an interesting implementation. The basic idea is that the call to master or slave doesn't need to be made explicitly, but instead the switch is made on the context of the request. The switch to master happens once a write starts and isn't switched back until a commit or flush happens. There is also a use_bind() function that allows the user to explicitly select a bind for a session (I assume this would prevent queries that try to write to a slave session somehow, maybe raise an exception).

I would also probably add something like a @use_bind() decorator that could be used to decorate a routing function and would have the effect of setting the session to that bind for duration of that function.

Won't be able to actually sit down and work on this for a few weeks at least, but would be curious from the people who've looked at this issue, and if, given a clean PR with tests and documentation, @mitsuhiko would pull something like this in.

This is quite an old pull request, but I am planning on figuring out some sort of solution for my app soon. This post by zzzeek seems to be an interesting implementation. The basic idea is that the call to master or slave doesn't need to be made explicitly, but instead the switch is made on the context of the request. The switch to master happens once a write starts and isn't switched back until a commit or flush happens. There is also a use_bind() function that allows the user to explicitly select a bind for a session (I assume this would prevent queries that try to write to a slave session somehow, maybe raise an exception).

I would also probably add something like a @use_bind() decorator that could be used to decorate a routing function and would have the effect of setting the session to that bind for duration of that function.

Won't be able to actually sit down and work on this for a few weeks at least, but would be curious from the people who've looked at this issue, and if, given a clean PR with tests and documentation, @mitsuhiko would pull something like this in.

@immunda

This comment has been minimized.

Show comment Hide comment
@immunda

immunda Jul 3, 2014

Collaborator

@eriktaubeneck Did you ever come to a solution on this?

Collaborator

immunda commented Jul 3, 2014

@eriktaubeneck Did you ever come to a solution on this?

@eriktaubeneck

This comment has been minimized.

Show comment Hide comment
@eriktaubeneck

eriktaubeneck Jul 3, 2014

No never was able to spend some time on it, but now that the project is being maintained again, I will look into it. Thanks for bringing it back to life.

No never was able to spend some time on it, but now that the project is being maintained again, I will look into it. Thanks for bringing it back to life.

@philiptzou

This comment has been minimized.

Show comment Hide comment
@philiptzou

philiptzou Jul 3, 2014

We have introduced this patch to guokr.com production since February 2013. It is useful and works well. We are still using it.

But we discovered some problem related to transaction consistence. Automatic connection picker breaks transaction since the slave connection does not share the same transaction with master. For example:

parent = ParentModel(name='parent')
db.session.add(parent)
db.session.commit()
parent.children.append(ChildModel(name='child'))
db.session.flush()  # Insert children though the master conn
parent.children_count = parent.children.count()  # but query count from a slave conn
db.session.commit()
assert parent.children_count == 0
# NO AssertionError raised

So we introduced a magic variable named "diable_slaves". It disables slave connections even in SELECT clause. This fixed our problem. But in fact, most of our programmers forget to set the variable at very first time. Although it will be fixed easily when we met the problem or unittest simply not passed.

Now I think we should turn slaves off and always use master by default. Only enable them when we needed. It can be a decorator like use_slaves, or like @mitsuhiko mentioned before - query.from_slaves().

@use_slaves
def fetch_something(foo):
    """Use slaves here at your own risk

    do some read works, but Exception will be raised when try to write
    """
    pass

We have introduced this patch to guokr.com production since February 2013. It is useful and works well. We are still using it.

But we discovered some problem related to transaction consistence. Automatic connection picker breaks transaction since the slave connection does not share the same transaction with master. For example:

parent = ParentModel(name='parent')
db.session.add(parent)
db.session.commit()
parent.children.append(ChildModel(name='child'))
db.session.flush()  # Insert children though the master conn
parent.children_count = parent.children.count()  # but query count from a slave conn
db.session.commit()
assert parent.children_count == 0
# NO AssertionError raised

So we introduced a magic variable named "diable_slaves". It disables slave connections even in SELECT clause. This fixed our problem. But in fact, most of our programmers forget to set the variable at very first time. Although it will be fixed easily when we met the problem or unittest simply not passed.

Now I think we should turn slaves off and always use master by default. Only enable them when we needed. It can be a decorator like use_slaves, or like @mitsuhiko mentioned before - query.from_slaves().

@use_slaves
def fetch_something(foo):
    """Use slaves here at your own risk

    do some read works, but Exception will be raised when try to write
    """
    pass
@brunsgaard

This comment has been minimized.

Show comment Hide comment
@brunsgaard

brunsgaard Apr 10, 2015

@mitsuhiko, would you accept a PR taking this approach? It is in accordance with this post http://techspot.zzzeek.org/2012/01/11/django-style-database-routers-in-sqlalchemy/ .

Of cause the code would be rewritten to be more reuseable and flexable, but do you support the approach, to the slave/master problem?

from .ext.flask_ses import SES
from .ext.flask_ocmg_stateservice import StateService
from flask_ocid import Ocid, OcidException
from flask.ext.sqlalchemy import SQLAlchemy as SQLAlchemyBase
from flask.ext.sqlalchemy import get_state
from flask_jsonschema import JsonSchema
import sqlalchemy.orm as orm
from functools import partial


ocid = Ocid()
ses = SES()
jsonschema = JsonSchema()
stateservice = StateService()


class RoutingSession(orm.Session):
    _name = None

    def __init__(self, db, autocommit=False, autoflush=False, **options):
        self.db = db
        self.app = db.get_app()
        self._model_changes = {}
        orm.Session.__init__(
            self,
            autocommit=autocommit,
            autoflush=autoflush,
            bind=db.engine,
            binds=db.get_binds(self.app),
            **options
        )

    def get_bind(self, mapper=None, clause=None):

        try:
            state = get_state(self.app)
        except (AssertionError, AttributeError, TypeError) as err:
            self.app.logger.error(
                "Unable to get Flask-SQLAlchemy configuration."
                " Outputting default bind. Error:" + err)
            return orm.Session.get_bind(self, mapper, clause)

        # If there are no binds configured
        # connect using the default SQLALCHEMY_DATABASE_URI
        if state is None or not self.app.config['SQLALCHEMY_BINDS']:
            self.app.logger.error(
                "Connecting -> DEFAULT. Unable to get Flask-SQLAlchemy"
                " bind configuration. Outputting default bind.")
            return orm.Session.get_bind(self, mapper, clause)

        elif self._name:
            self.app.logger.debug("Connecting -> {}".format(self._name))
            return state.db.get_engine(self.app, bind=self._name)

        elif self._flushing:  # we who are about to write, salute you
            self.app.logger.debug("Connecting -> MASTER")
            return state.db.get_engine(self.app, bind='master')

        else:
            self.app.logger.debug("Connecting -> slave")
            return state.db.get_engine(self.app, bind='slave')

    def using_bind(self, name):
        s = RoutingSession(self.db)
        vars(s).update(vars(self))
        s._name = name
        return s


class SQLAlchemy(SQLAlchemyBase):

    def __init__(self, *args, **kwargs):
        SQLAlchemyBase.__init__(self, *args, **kwargs)
        self.session.using_bind = lambda s: self.session().using_bind(s)

    def create_scoped_session(self, options=None):
        if options is None:
            options = {}
        scopefunc = options.pop('scopefunc', None)
        return orm.scoped_session(
            partial(RoutingSession, self, **options), scopefunc=scopefunc
        )

db = SQLAlchemy(session_options={'autocommit': True})

@mitsuhiko, would you accept a PR taking this approach? It is in accordance with this post http://techspot.zzzeek.org/2012/01/11/django-style-database-routers-in-sqlalchemy/ .

Of cause the code would be rewritten to be more reuseable and flexable, but do you support the approach, to the slave/master problem?

from .ext.flask_ses import SES
from .ext.flask_ocmg_stateservice import StateService
from flask_ocid import Ocid, OcidException
from flask.ext.sqlalchemy import SQLAlchemy as SQLAlchemyBase
from flask.ext.sqlalchemy import get_state
from flask_jsonschema import JsonSchema
import sqlalchemy.orm as orm
from functools import partial


ocid = Ocid()
ses = SES()
jsonschema = JsonSchema()
stateservice = StateService()


class RoutingSession(orm.Session):
    _name = None

    def __init__(self, db, autocommit=False, autoflush=False, **options):
        self.db = db
        self.app = db.get_app()
        self._model_changes = {}
        orm.Session.__init__(
            self,
            autocommit=autocommit,
            autoflush=autoflush,
            bind=db.engine,
            binds=db.get_binds(self.app),
            **options
        )

    def get_bind(self, mapper=None, clause=None):

        try:
            state = get_state(self.app)
        except (AssertionError, AttributeError, TypeError) as err:
            self.app.logger.error(
                "Unable to get Flask-SQLAlchemy configuration."
                " Outputting default bind. Error:" + err)
            return orm.Session.get_bind(self, mapper, clause)

        # If there are no binds configured
        # connect using the default SQLALCHEMY_DATABASE_URI
        if state is None or not self.app.config['SQLALCHEMY_BINDS']:
            self.app.logger.error(
                "Connecting -> DEFAULT. Unable to get Flask-SQLAlchemy"
                " bind configuration. Outputting default bind.")
            return orm.Session.get_bind(self, mapper, clause)

        elif self._name:
            self.app.logger.debug("Connecting -> {}".format(self._name))
            return state.db.get_engine(self.app, bind=self._name)

        elif self._flushing:  # we who are about to write, salute you
            self.app.logger.debug("Connecting -> MASTER")
            return state.db.get_engine(self.app, bind='master')

        else:
            self.app.logger.debug("Connecting -> slave")
            return state.db.get_engine(self.app, bind='slave')

    def using_bind(self, name):
        s = RoutingSession(self.db)
        vars(s).update(vars(self))
        s._name = name
        return s


class SQLAlchemy(SQLAlchemyBase):

    def __init__(self, *args, **kwargs):
        SQLAlchemyBase.__init__(self, *args, **kwargs)
        self.session.using_bind = lambda s: self.session().using_bind(s)

    def create_scoped_session(self, options=None):
        if options is None:
            options = {}
        scopefunc = options.pop('scopefunc', None)
        return orm.scoped_session(
            partial(RoutingSession, self, **options), scopefunc=scopefunc
        )

db = SQLAlchemy(session_options={'autocommit': True})
@davidism

This comment has been minimized.

Show comment Hide comment
@davidism

davidism Apr 9, 2016

Collaborator

I'd rather not directly include and support the various recipes from SQLAlchemy. Instead, Flask-SQLAlchemy should be extensible enough to allow them to be implemented when needed. As @brunsgaard shows, it is.

Collaborator

davidism commented Apr 9, 2016

I'd rather not directly include and support the various recipes from SQLAlchemy. Instead, Flask-SQLAlchemy should be extensible enough to allow them to be implemented when needed. As @brunsgaard shows, it is.

@davidism davidism closed this Apr 9, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment