Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

with_for_update() interacts badly with lazy='joined' #4100

Closed
sqlalchemy-bot opened this issue Sep 29, 2017 · 23 comments
Closed

with_for_update() interacts badly with lazy='joined' #4100

sqlalchemy-bot opened this issue Sep 29, 2017 · 23 comments
Assignees
Milestone

Comments

@sqlalchemy-bot
Copy link
Collaborator

Migrated issue, originally created by Diggsey (@Diggsey)

The FOR UPDATE clause seems to lock both the table selected from, plus any tables which are part of a join.

This means that when a model has a relationship which is eagerly loaded, using for_update() on a query of that model will unexpectedly lock all of the joined tables!

I think this is so surprising and subtle, and the performance implications so terrible, that it should be considered a bug.

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

I can understand "surprising", but not "unexpected". If you ask for FOR UPDATE and joined eager loading at the same time, I'd expect all the joined tables to be locked. What behavior would you expect ?

@sqlalchemy-bot
Copy link
Collaborator Author

Diggsey (@Diggsey) wrote:

To me "surprising" and "unexpected" are synonyms.

Either way, the point is that the eager loading is not explicit as part of the query - it's implicit, and thought of more as an optimisation.

Put it this way: if two programmers are working on a project, and one of them uses with_for_update(), and then the other changes a lazy='subquery' to a lazy='joined', I think it's not realistic to expect the second programmer to realise the implications of that change (namely that all access to the database will be serialised!)

The behaviour I'd expect would be for locking to be unaffected by the more implicit features such as eager loading.

If sqlalchemy is unable to generate reasonable SQL to achieve that, it should error.

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

Either way, the point is that the eager loading is not explicit as part of the query - it's implicit, and thought of more as an optimisation. and then the other changes a lazy='subquery' to a lazy='joined',

this is not true in most cases, usually the joinedload() option is used which is explicit.

I think it's not realistic to expect the second programmer to realise the implications of that change

Someone changing lazy to joined at the mapping level needs to be aware of all queries in which this entity is used. joined eager loading makes for a much more complex query which can easily fall over if not used sparingly. This flag should never be set at the global mapper level without someone testing the application while logging queries. I will gladly add warnings to the documentation to this effect.

The behaviour I'd expect would be for locking to be unaffected by the more implicit features such as eager loading.

Eager loading is usually explicit, so this would be asking for a difference in behavior between the explicit form of eager loading vs. the mapper level form of eager loading. This may be reasonable but I'd be concerned about this still being an implicit guess being made.

If sqlalchemy is unable to generate reasonable SQL to achieve that, it should error.

again if someone asks for joined eager loading explicitly with with_for_update(), it should do what's given.

It seems like you got bit by something here but the root cause seems to be that "lazy='joined'" does not have enough warnings in the documentation. The "lazy" options should not be casually flipped around in production applications.

@sqlalchemy-bot
Copy link
Collaborator Author

Diggsey (@Diggsey) wrote:

In that case, consider this a feature request for a higher level operation (similar to with_for_update()) but which only locks rows from tables explicitly requested regardless of what other features are in use.

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

What you want to do here is to disable eager loads when you use with_for_update():

query(Entity).enable_eagerloads(False).with_for_update()

Those eagerloads can be disabled individually as well. There's not quite a generalized way to turn off just "joined' eager loads without naming them explicitly, but I wouldn't oppose a feature like that either.

As far as generalizing that the joined eager load still takes place but specific tables are named, this would not be worth it because SQL does not cleanly support this. Postgresql can use OF against a table, which the method supports explicitly, but when on Oracle, OF requires a column expression, not a table expression, so this might be hard to generalize. Then on MySQL (I don't know what database you're using) you don't have this option at all. On SQL Server, there's no "SELECT FOR UPDATE", there's "WITH (UPDLOCK)" which you can do with SQLAlchemy but not through the "FOR UPDATE" mechanism, and there's not much point to trying to pretend "SELECT FOR UPDATE" acts like SQL Server's "WITH (UPDLOCK)" because it is quite different.

Short story is SELECT FOR UPDATE is kind of a non-declarative part of SQL in practice (e.g. needs to know the "how" a little too much) so it is explicitly mapped to SQL. with_for_update() supercedes the previous "with_lockmode()" method which tried to "abstract" FOR UPDATE, and failed to be flexible or transparent enough for people's needs.

so if SQL had that in a generic way, that should be part of how with_for_update works. However, SQL does not have this. Postgresql has "FOR UPDATE OF", which is an option supported by with_for_update(), Oracle supports "FOR UPDATE OF" as well, but

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

there's also even the notion of putting in warnings when joined + for update is present and encouraging enable_eagerloads(False), (or a hypothetical enable_eagerloads(joined=False)), but then folks who want to actually do things that way will be annoyed by it.

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

definitely though, new warning boxes in the docs for with_for_update() and "lazy='joined'".

@sqlalchemy-bot
Copy link
Collaborator Author

Diggsey (@Diggsey) wrote:

What about a global option to "turn off" with_for_update() on queries with joins, such that using it in such cases is a hard error? Maybe that's too specific to add to sqlalchemy, but is there a way I could implement that in my application?

Anyway, a warning on "with_for_update()" and a suggestion to use "enabled_eagerloads(False)" seem like a good idea.

@sqlalchemy-bot
Copy link
Collaborator Author

Changes by Michael Bayer (@zzzeek):

  • added labels: orm

@sqlalchemy-bot
Copy link
Collaborator Author

Changes by Michael Bayer (@zzzeek):

  • set milestone to "1.3"

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

What about a global option to "turn off" with_for_update() on queries with joins, such that using it in such cases is a hard error? Maybe that's too specific to add to sqlalchemy, but is there a way I could implement that in my application?

detecting that the query has joins is possible but a little involved. you can subclass Query and have with_for_update() imply enable_eagerloads(False), that would be easy.

@sqlalchemy-bot
Copy link
Collaborator Author

Changes by Michael Bayer (@zzzeek):

  • removed labels: orm
  • added labels: documentation

@sqlalchemy-bot
Copy link
Collaborator Author

Changes by Michael Bayer (@zzzeek):

  • changed milestone from "1.3" to "1.2.x"

@sqlalchemy-bot
Copy link
Collaborator Author

Julien Nakache wrote:

Hello guys,

I noticed a different issue when using joinedload and with_for_update in Mysql with a REPEATABLE READ isolation level. In my case, it seems that the with_for_update only locked the joinedtable because the FOR UPDATE statement is only applied to the joinedtable - the root table is not locked because it is queried via the subselect.

That created issues on our end because we assumed that the source table was locked first. Is that a known behavior? I'm happy to share a repro if that's helpful.

Cheers,
J

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

@juliennakache just show me the SQL output on that

@sqlalchemy-bot
Copy link
Collaborator Author

Julien Nakache wrote:

@zzzeek Sure! Here is a simplified version of the model definition:

class Consultation(models_base.Base):
    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True, autoincrement=True, nullable=False)
    consultation_uuid = sqlalchemy.Column(sqlalchemy.String(32), primary_key=True, index=True, nullable=False)
    patient_member_id = sqlalchemy.Column(sqlalchemy.String(8))

    consultation_actions = sqlalchemy.orm.relationship(
        lambda: ConsultationAction,
        primaryjoin='Consultation.consultation_uuid == ConsultationAction.consultation_uuid',
        foreign_keys='ConsultationAction.consultation_uuid',
        uselist=True,
        order_by='desc(ConsultationAction.id)'
    )


class ConsultationAction(models_base.Base):
    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True, autoincrement=True, nullable=False)
    consultation_uuid = sqlalchemy.Column(sqlalchemy.String(32), sqlalchemy.ForeignKey(Consultation.consultation_uuid),
                                          nullable=False)
    action_type = sqlalchemy.Column(sqlalchemy.Enum(*ActionType.ALL), nullable=False)

Here is the sqlalchemy query:

db_session\
        .query(models.Consultation)\
        .filter(models.Consultation.consultation_uuid == consultation_uuid)\
        .options(joinedload(models.Consultation.consultation_actions))\
        .with_for_update()\
        .first()

Here is the generated SQL:

SELECT anon_1.telemedicine_consultation_id AS anon_1_telemedicine_consultation_id, anon_1.telemedicine_consultation_consultation_uuid AS anon_1_telemedicine_consultation_consultation_uuid, anon_1.telemedicine_consultation_patient_member_id AS anon_1_telemedicine_consultation_patient_member_id, consultation_action_1_id, consultation_action_1.consultation_uuid AS consultation_action_1_consultation_uuid, consultation_action_1.action_type AS consultation_action_1_action_type
FROM (

	SELECT telemedicine.consultation.id AS telemedicine_consultation_id, telemedicine.consultation.consultation_uuid AS telemedicine_consultation_consultation_uuid, telemedicine.consultation.patient_member_id AS telemedicine_consultation_patient_member_id
	FROM telemedicine.consultation 
	WHERE telemedicine.consultation.consultation_uuid = %s
 	LIMIT %s

 ) AS anon_1 LEFT OUTER JOIN telemedicine.consultation_action AS consultation_action_1 ON anon_1.telemedicine_consultation_consultation_uuid = consultation_action_1.consultation_uuid ORDER BY consultation_action_1.id DESC FOR UPDATE;

Thanks

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

@juliennakache and.....the FOR UPDATE should be embedded inside the subquery? can you confirm this syntax is accepted by MySQL ?

@sqlalchemy-bot
Copy link
Collaborator Author

Julien Nakache wrote:

@zzzeek I confirmed that it works on mysql 5.6.27 and that it locks the root table / index. I'm not expert on the SQL spec but it seems to make sense that it works given how subqueries are usually executed by SQL engines.

I also confirmed that it works by having a FOR UPDATE on both the subselect and the outer select statement. In this case, the subselect is locked before the outer statement. Which is why I was expecting.

In the short term, it seems more correct to me to emit the FOR UPDATE on the subselect. Also it might make sense longer-term to make with_for_update parameterizable so we have finer-grained control over locking.

Thank you!

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

isnt this the opposite claim of what this ticket says?

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

oh never mind, that applies to when there is not a subquery...

@sqlalchemy-bot
Copy link
Collaborator Author

Michael Bayer (@zzzeek) wrote:

@juliennakache please continue at #4246

@sqlalchemy-bot sqlalchemy-bot added documentation bug Something isn't working labels Nov 27, 2018
@sqlalchemy-bot sqlalchemy-bot added this to the 1.2.x milestone Nov 27, 2018
@zzzeek zzzeek removed bug Something isn't working orm labels Nov 29, 2018
@zzzeek zzzeek added the orm label Jan 13, 2019
@zzzeek zzzeek modified the milestones: 1.2.x, 1.3.xx May 18, 2019
@zzzeek zzzeek modified the milestones: 1.3.xx, 1.3.x Dec 18, 2019
@zzzeek zzzeek modified the milestones: 1.3.x, 1.4.x May 26, 2021
@jvanasco jvanasco self-assigned this Sep 9, 2021
@sqla-tester
Copy link
Collaborator

jonathan vanasco has proposed a fix for this issue in the master branch:

Fixes: #4100 https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/3100

@sqla-tester
Copy link
Collaborator

jonathan vanasco has proposed a fix for this issue in the main branch:

Fixes: #4100 https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/3221

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants