Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy='raise' seems overly aggressive and complains about the wrong field #7176

Closed
RubeRuby opened this issue Oct 11, 2021 · 8 comments
Closed
Labels
expected behavior that's how it's meant to work. consider the "documentation" label in addition orm

Comments

@RubeRuby
Copy link

RubeRuby commented Oct 11, 2021

Describe the bug

Let's say we have three models: a User, UserGroup and an Organization.
A user can belong to a UserGroup and needs to belong to an Organization. A UserGroup needs to belong to the same Organization as the user.
I've specified lazy='raise' for all of the relationships because we use the async engine and so lazy loading is not possible.
Now let's say we access some_user_group.organization through a joinedload but at the same time we access some_user.user_group through a joinedload as well. This will result in the following error:
sqlalchemy.exc.InvalidRequestError: 'UserGroup.organization' is not available due to lazy='raise'
The interesting part is that adding a joinedload with the organization for the user as well (so user->user_group->organization) fixes the problem. Even though I am not accessing that field. So it seems to complain without reason. I hope it becomes clear in the code sample

To Reproduce

import unittest

from sqlalchemy import Column, Integer, ForeignKey, String, PrimaryKeyConstraint
from sqlalchemy.orm import joinedload, sessionmaker, relationship
from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
from sqlalchemy.ext.declarative import declarative_base


Base = declarative_base()


class Organization(Base):
    __tablename__ = 'organizations'
    id = Column(Integer, primary_key=True)
    name = Column(String(255), nullable=False)
    user_groups = relationship('UserGroup', back_populates='organization', lazy='raise')


class UserGroupHasUser(Base):
    __tablename__ = 'user_groups_have_users'
    user_group_id = Column(Integer, ForeignKey('user_groups.id'))
    user_id = Column(Integer, ForeignKey('users.id'))
    __table_args__ = (PrimaryKeyConstraint('user_group_id', 'user_id'),)


class UserGroup(Base):
    __tablename__ = 'user_groups'
    id = Column(Integer, primary_key=True)
    name = Column(String(255), nullable=False)
    organization_id = Column(Integer, ForeignKey('organizations.id'), nullable=False)
    organization = relationship('Organization', back_populates='user_groups', lazy='raise')
    users = relationship('User', secondary='user_groups_have_users', back_populates='user_groups', lazy='raise')


class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String(255), nullable=False)
    user_groups = relationship('UserGroup', secondary='user_groups_have_users', back_populates='users', lazy='raise')
    organization_id = Column(Integer, ForeignKey('organizations.id'), nullable=False)
    organization = relationship('Organization', lazy='raise')


class LazyRaise(unittest.IsolatedAsyncioTestCase):
    async def asyncSetUp(self):
        self.engine = create_async_engine('sqlite+aiosqlite:///:memory:')
        self.session = sessionmaker(bind=self.engine, class_=AsyncSession)
        self.db = self.session()

        self.conn = await self.engine.begin()
        await self.conn.run_sync(Base.metadata.create_all)
        
        organization = Organization(name='Test org.')
        user = User(name='Test user', organization=organization)
        user_group = UserGroup(name='Test user group', organization=organization)

        user.user_groups.append(user_group)
        self.db.add_all([user_group, organization, user])

        await (self.db.commit())


    async def asyncTearDown(self):
        await (self.conn.close())


    async def test_case(self):
        # Comment line below to fix the error and run the unittest successfully
        user = await self.db.get(User, 1, [joinedload(User.user_groups)])
        user = await self.db.get(User, 1, [joinedload(User.user_groups).joinedload(UserGroup.organization)])
        user_group = await self.db.get(UserGroup, 1, [joinedload(UserGroup.organization)])
        print(user_group.organization)
        return True


if __name__ == '__main__':
    unittest.main()

Error

root@0246b5b6963b:/src/backend-core/backend_core/tests# python3 test.py 
E
======================================================================
ERROR: test_case (__main__.LazyRaise)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.9/unittest/async_case.py", line 65, in _callTestMethod
    self._callMaybeAsync(method)
  File "/usr/lib/python3.9/unittest/async_case.py", line 88, in _callMaybeAsync
    return self._asyncioTestLoop.run_until_complete(fut)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/usr/lib/python3.9/unittest/async_case.py", line 102, in _asyncioLoopRunner
    ret = await awaitable
  File "/src/backend-core/backend_core/tests/test.py", line 72, in test_case
    print(user_group.organization)
  File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/attributes.py", line 481, in __get__
    return self.impl.get(state, dict_)
  File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/attributes.py", line 926, in get
    value = self._fire_loader_callables(state, key, passive)
  File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/attributes.py", line 962, in _fire_loader_callables
    return self.callable_(state, passive)
  File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/strategies.py", line 836, in _load_for_state
    self._invoke_raise_load(state, passive, "raise")
  File "/usr/local/lib/python3.9/dist-packages/sqlalchemy/orm/strategies.py", line 795, in _invoke_raise_load
    raise sa_exc.InvalidRequestError(
sqlalchemy.exc.InvalidRequestError: 'UserGroup.organization' is not available due to lazy='raise'

----------------------------------------------------------------------
Ran 1 test in 0.033s

FAILED (errors=1)

Versions

  • OS: MacOS Big Sur 11.5.1
  • Python: 3.9.2
  • SQLAlchemy: 1.4.25
  • Database: SQLite (in memory)
  • DBAPI (eg: psycopg, cx_oracle, mysqlclient): I don't know

Additional context

No response

@RubeRuby RubeRuby added the requires triage New issue that requires categorization label Oct 11, 2021
@CaselIT
Copy link
Member

CaselIT commented Oct 11, 2021

Hi,

That's the expected documented behaviour. Session will return locally existing element if they are in the identity map.
To force a reload you can specify populate_existing

@CaselIT CaselIT added expected behavior that's how it's meant to work. consider the "documentation" label in addition loader options ORM options like joinedload(), load_only(), these are complicated and have a lot of issues orm and removed requires triage New issue that requires categorization loader options ORM options like joinedload(), load_only(), these are complicated and have a lot of issues labels Oct 11, 2021
@RubeRuby
Copy link
Author

I see, thanks!

@CaselIT
Copy link
Member

CaselIT commented Oct 11, 2021

Another alternative may be to expire User.user_groups (I think not tried it) before the self.db.get(UserGroup, 1, [joinedload(UserGroup.organization)])

@zzzeek we have had some other issues similar to this one, maybe we need some update to the docs? @RubeRuby do you have suggestions that would have helped in this case?

@RubeRuby
Copy link
Author

For me it just wasn't clear that it would cache objects like this, since I am explicitly overwriting it with a new query. I don't know what the best place is to tell people though

@zzzeek
Copy link
Member

zzzeek commented Oct 11, 2021

Another alternative may be to expire User.user_groups (I think not tried it) before the self.db.get(UserGroup, 1, [joinedload(UserGroup.organization)])

@zzzeek we have had some other issues similar to this one, maybe we need some update to the docs? @RubeRuby do you have suggestions that would have helped in this case?

as mentioned elsewhere I want to get the Session to include a parameter so you can set up default execution options, then i want to start telling people maybe they want to turn on "populate_existing" for sessions right up front. there's not too much reason to not have this, with the caveat that we dont have a lot of experience having everyone do this such that we'd see what "reasons" there might be that we aren't aware of.

@CaselIT
Copy link
Member

CaselIT commented Oct 11, 2021

@zzzeek ok. maybe a new issue makes sense as reminder

@zzzeek
Copy link
Member

zzzeek commented Oct 11, 2021

revising my comment, in the example given they are using session.get(), and in that case i would hope the docs are pretty clear that this method does not emit SQL even for the primary entity if it's already present. if this were using session.execute(select()), that's where we tell folks to use the populate_existing execution option, but that doesnt apply to this method, which instead has a direct populate_existing pararameter.

https://docs.sqlalchemy.org/en/14/orm/session_api.html?highlight=session%20get#sqlalchemy.orm.Session.get

If the given primary key identifier is present in the local identity map, the object is returned directly from this collection and no SQL is emitted, unless the object has been marked fully expired. If not present, a SELECT is performed in order to locate the object.

options – optional sequence of loader options which will be applied to the query, if one is emitted.

@RubeRuby had you checked that documentation when you had this problem? does it explain why your test does what it does or is there something we can do to improve?

@RubeRuby
Copy link
Author

RubeRuby commented Oct 12, 2021

I did read the documentation earlier but not in a targeted way for this issue, as the title of my (non) issue suggests I hadn't attributed it to get() but to lazy='raise'. I could have expected caching through a get only but the fact that this carries over to a joinedload confused me a little I think. Reading the get docs in hindsight explains my issue, but I think I didn't fully understand the terminology reading it the first time and since I didn't have this issue then I didn't take the time to read it more closely. Looking at it now I mainly find 'local identity map' confusing, cache sounds a lot clearer to me. Though you can probably attribute that to personal experience.
A side note is that I posted a question about this on Stackoverflow 6 days ago which still has no answer but maybe just not the right people have read it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
expected behavior that's how it's meant to work. consider the "documentation" label in addition orm
Projects
None yet
Development

No branches or pull requests

3 participants