feat: relationships loading #105

Closed
wants to merge 12 commits into from


gazorby
Contributor

@gazorby gazorby commented Nov 30, 2023

Pull Request Checklist

  • New code has 100% test coverage
  • (If applicable) The prose documentation has been updated to reflect the changes introduced by this PR
  • (If applicable) The reference documentation has been updated to reflect the changes introduced by this PR
  • Pre-Commit Checks were run and passed
  • Tests were run and passed

Description

Expose SQLAlchemy relationship loading techniques through the repository interface.

The idea is to have SQLAlchemy loading styles more integrated into the repository API, via a single .load() repository method that sets the relationships to load on the repository model.

Here is the proposed API:

# Get an author and load all their books
author = await AuthorRepo(session=session).load(books=True).get_one(name="J.R.R Tolkien")
assert all(isinstance(book, BookModel) for book in author.books)

# Calling .load() only affects the next query, so no relationship will be explicitly loaded here
author = await AuthorRepo(session=session).get_one(name="J.R.R Tolkien")
print(author.books[0].title) # May throw an error if books is not configured to be lazy loaded on the mapper side

# Go one step deeper by loading publisher too
# Child relations are chained with the '__' separator
author = await AuthorRepo(session=session).load(books__publisher=True).get_one(name="J.R.R Tolkien")
assert all(isinstance(book, BookModel) for book in author.books)
assert all(isinstance(book.publisher, PublisherModel) for book in author.books)

# Ellipsis (...) loads all nested relationships under (and including) the specified one
author = await AuthorRepo(session=session).load(books=...).get_one(name="J.R.R Tolkien")
assert all(isinstance(book, BookModel) for book in author.books)
assert all(isinstance(book.publisher, PublisherModel) for book in author.books)
assert all(isinstance(book.publisher.company, CompanyModel) for book in author.books)

# We can customize how a relationship is loaded by passing a SQLAlchemy relationship loading style
# https://docs.sqlalchemy.org/en/20/orm/queryguide/relationships.html#summary-of-relationship-loading-styles
author = await AuthorRepo(session=session).load(books="subqueryload").get_one(name="J.R.R Tolkien")
assert all(isinstance(book, BookModel) for book in author.books)

# Exclude all relationships from loading (overriding mapper configuration), and only load books
load_config = SQLAlchemyLoadConfig(default_strategy="raiseload")
author = await AuthorRepo(session=session).load(load_config, books=True).get_one(name="J.R.R Tolkien")
assert all(isinstance(book, BookModel) for book in author.books)

# You can pass a SQLAlchemyLoad object if you want
# to set a default set of relationships to be loaded on a repository
repo = AuthorRepo(session=session, load=SQLAlchemyLoad(books=True))

author = await repo.get_one(name="J.R.R Tolkien")
assert all(isinstance(book, BookModel) for book in author.books)

# expunge_all() is a synchronous method, even on an AsyncSession
repo.session.expunge_all()

authors = await repo.list()
for author in authors:
    assert all(isinstance(book, BookModel) for book in author.books)

It also makes relationship loading more "composable", since SQLAlchemyLoad instances are standalone objects that can be reused.
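To illustrate the '__' separator convention described above, here is a minimal sketch of how such keyword arguments could be split into relationship paths. The function name `parse_load_kwargs` and the `(path, strategy)` tuple shape are assumptions made for illustration only; they are not taken from this PR's implementation.

```python
from typing import Any, List, Tuple

SEPARATOR = "__"  # assumed to match the separator used in the proposed API


def parse_load_kwargs(**kwargs: Any) -> List[Tuple[Tuple[str, ...], Any]]:
    """Split 'books__publisher=True' style keys into (path, strategy) pairs."""
    parsed = []
    for key, strategy in kwargs.items():
        path = tuple(key.split(SEPARATOR))
        # An empty segment means a malformed key such as 'books____publisher'.
        if not all(path):
            raise ValueError(f"Invalid relationship path: {key!r}")
        parsed.append((path, strategy))
    return parsed


nested = parse_load_kwargs(books__publisher=True)
everything = parse_load_kwargs(books=...)
```

Each path could then be walked attribute by attribute on the model to build the matching SQLAlchemy loader option, with the strategy value (True, a loader name, or ...) deciding which loader to use.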

Close Issue(s)

@cofin
Member

cofin commented Nov 30, 2023

This is a cool idea! You are on a roll this week!

Is it possible to get the tests working for 3.8?

from advanced_alchemy.repository._util import get_instrumented_attr, wrap_sqlalchemy_exception
from advanced_alchemy.repository.typing import ModelT
from advanced_alchemy.utils.deprecation import deprecated

if TYPE_CHECKING:
    from collections import abc
    from datetime import datetime
    from typing import Self

Member

This may need to be typing_extensions for us to get 3.8 support?

Contributor Author

Yes, just fixed it, but EllipsisType does not seem to be available before 3.10
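One possible workaround for the missing EllipsisType, sketched here as an assumption rather than what the PR ended up doing: `types.EllipsisType` only exists from Python 3.10 onward, but the same type can be derived from the `Ellipsis` singleton on older versions. The helper `loads_everything` is a hypothetical name used only to show the check.

```python
import sys

if sys.version_info >= (3, 10):
    from types import EllipsisType
else:
    # Before Python 3.10 the type is not exported, but it can still be derived.
    EllipsisType = type(Ellipsis)


def loads_everything(strategy: object) -> bool:
    """Return True when the caller passed `...` as the loading strategy."""
    return isinstance(strategy, EllipsisType)
```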

@@ -169,6 +193,9 @@ async def add(
        """
        with wrap_sqlalchemy_exception():
            instance = await self._attach_to_session(data)
            if self._load:
                await self._flush_or_commit(auto_commit=True)

Member

Should the auto_commit follow what was sent in from the method or actually be True?

Contributor Author

I think it should stay True since the following self._refresh_with_load() emits a select to get back the newly inserted rows with loaded relationships.

Member

Just thinking through this a bit more, and it's still not totally clear.

We definitely need a flush so that the inserted row is loaded into the relationship. As long as it isn't a new session, though, we don't have to commit to make that happen. However, I'm not quite following why the commit is necessary.

Is there a simple use case that I can walk through to visualize why the commit over a flush is needed?
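As a rough way to visualize the flush-versus-commit distinction raised here, the sketch below uses the standard library's sqlite3 module instead of SQLAlchemy (an assumption made to keep it dependency-free). It only illustrates the general point that a row inserted inside an open transaction is already visible to a SELECT on the same connection; it says nothing about how `_refresh_with_load()` behaves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
conn.commit()

# Comparable to a flush: the INSERT is sent inside the still-open transaction.
conn.execute("INSERT INTO author (name) VALUES (?)", ("J.R.R Tolkien",))
still_in_transaction = conn.in_transaction

# A SELECT on the same connection already sees the uncommitted row.
rows_before_commit = conn.execute("SELECT name FROM author").fetchall()

# Nothing was committed, so the caller can still roll back.
conn.rollback()
rows_after_rollback = conn.execute("SELECT name FROM author").fetchall()
```

If the same holds for the session used by the repository, a flush followed by the reloading SELECT would leave the decision to commit with the caller.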

@gazorby
Contributor Author

gazorby commented Nov 30, 2023

Actually, it comes from topsport-com-au/starlite-saqlalchemy#304, after some iterations ;)

@provinzkraut
Member

provinzkraut commented Nov 30, 2023

This is a great addition @gazorby!

However, I have concerns about the interface. The pipeline style doesn't really fit in with the rest of the repositories, as it's the only place this would be used. Furthermore, the SQLAlchemyLoad, in combination with the pipeline pattern on the repository, comes quite close to just remodeling SQLAlchemy's query building, if we think a few iterations and feature additions ahead here.

Another concern is the "blanket keyword argument" style, which isn't great for type checking (and testing), and is something that, in this case, SQLAlchemy would do better than we would here.

I would propose that for now, the interface for this stays functional, which could maybe look something like this:

# Get an author and load all their books
author = await AuthorRepo(session=session).get_one(name="J.R.R Tolkien", load=Author.books)
assert all(isinstance(book, BookModel) for book in author.books)

author = await AuthorRepo(session=session).get_one(
  name="J.R.R Tolkien", 
  load=[Author.books, BookModel.publisher]
)
assert all(isinstance(book, BookModel) for book in author.books)
assert all(isinstance(book.publisher, PublisherModel) for book in author.books)

# use loaders from sqla
author = await AuthorRepo(session=session).get_one(
  name="J.R.R Tolkien",
  load=subqueryload(Author.books)
)
assert all(isinstance(book, BookModel) for book in author.books)
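The type-checking concern about blanket keyword arguments can be shown with a small, SQLAlchemy-free sketch. The class and both functions below are hypothetical stand-ins, not part of this PR or of SQLAlchemy; the point is only that a misspelled keyword is accepted at call time, while a misspelled attribute reference fails before the call is made.

```python
class Author:
    # Stand-in for a mapped relationship attribute; a real model would use relationship().
    books = "Author.books"


def load_by_keyword(**relationships):
    """Blanket keyword style: any name is accepted at call time."""
    return list(relationships)


def load_by_attribute(*attributes):
    """Attribute style: arguments are evaluated before the call."""
    return list(attributes)


# The typo passes silently here and would only surface later, if at all.
accepted = load_by_keyword(boks=True)

# The same typo fails immediately, and a type checker can flag it statically.
try:
    load_by_attribute(Author.boks)
    typo_caught = False
except AttributeError:
    typo_caught = True
```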

@gazorby gazorby marked this pull request as draft February 7, 2024 14:37
@LonelyVikingMichael

Hi @gazorby

I wasn't sure if you'd be continuing with this, so I'd made my own, admittedly lazier POC here #130 - I've just now noticed your recent activity.
Aside from agreeing with @provinzkraut's comments above, what I really like about SQLAlchemy's loader strategies is the fine-grained control over deeply nested data. To be more specific, the load_only method comes to mind.

If we take for example the relationship of Author -> Books -> Chapters, we can do the following with pure SQLAlchemy:

result = session.execute(
    select(Author).options(
        selectinload(Author.books).selectinload(Book.chapters).load_only(Chapter.name)
    )
)

This gives us the opportunity to emit a more lightweight query in the context of the book hierarchy. For example, the Chapter table might have multiple other columns of metadata, but in this particular context we're only interested in the name.

I also often make use of has() and any() to refine results, e.g.

# find authors having a book with "foo" in the title, and load only those books
select(Author).where(Author.books.any(Book.title.ilike("%foo%"))).options(
    selectinload(Author.books.and_(Book.title.ilike("%foo%")))
)

The native approach makes a lot of sense to me here, let me know your thoughts?

@gazorby
Contributor Author

gazorby commented Feb 14, 2024

Hi @LonelyVikingMichael!

I refactored my codebase with a new implementation exposing the API @provinzkraut suggested, which I also agree with and which brings several benefits over my previous iteration, not least being fully typed and much lighter.

So now I can use plain SQLAlchemy loaders or pass a nested list of relationships and let the repository generate the loaders:

author_repo.get(id, load=Author.books)
author_repo.get(id, load=[(Author.books, Book.chapters), Author.publisher])
author_repo.get(id, load=selectinload(Author.books).selectinload(Book.chapters).load_only(Chapter.name))

Will update the PR when I have some time

@gazorby
Contributor Author

gazorby commented Apr 7, 2024

supplanted by #157

@gazorby gazorby closed this Apr 7, 2024