Merge pull request #118 from allmonday/dev
Dev
allmonday committed Apr 20, 2024
2 parents a41db0d + 80e0891 commit 03fb0e4
Showing 5 changed files with 242 additions and 30 deletions.
2 changes: 2 additions & 0 deletions docs/about.md
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,8 @@ And then add some `resolve` & `post` methods, for example `resolve_comments` wil

```python linenums="1" hl_lines="7-8 11-13 20-21 24-25"
from __future__ import annotations
from pydantic_resolve import Resolver
from pydantic import BaseModel

class MySite(BaseModel):
    name: str
177 changes: 177 additions & 0 deletions docs/philosophy.md
@@ -0,0 +1,177 @@
## What's the most suitable solution for UI integration?

Let's talk about a fetching and post-processing approach for building view data, from simple cases to complicated ones.

## First, Start from GraphQL
The main GraphQL flow looks like this:

```
GraphQL flow =
Query
-> Parsed Schema
-> BFR (Breadth-first resolve)
-> data
```

For most integration scenarios, it has some shortcomings:

- The query is too flexible and needs maintenance during iteration.
- The potential of the dataloader is limited.
- No post-processing: BFR runs from top to bottom and misses the bottom-to-top stage.

We'll go through them in detail and discuss how to improve or fix them.

Demos are written in Python with the libraries `pydantic` and `pydantic-resolve`.

### Query is too flexible

`Query` can be harmful.

`Query` is friendly for the client but something of a nightmare for the server, especially for integrations.

Exposing the Query interface to clients without limits is usually a bad choice: the server side loses control of the overall business information and flow, and you are forced to take the client into account whenever you debug.

The server side also loses the ability to refactor, because queries are dynamic and their real-world usage is hard to predict, and a client can request so much data that performance suffers.

As a query language, GraphQL works nicely if you are not performance sensitive.

As an API, it is terrible if the server does not have full control over the query.

So it is a reasonable choice to maintain the `Query` on the server side (e.g. WunderGraph).


### Parsed Schema

The parsed schema is the intersection of the overall schema and the query target.

Compared to keeping queries on the server side, keeping the `specific` schemas directly (code first) is more reasonable, and then we no longer need to care about the `GraphQL` spec.

- Define a new schema by inheriting from domain schemas.
- Extend it with new fields and resolve methods.

```python
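import blog_service as bs
import comment_service as cs
from pydantic_resolve import LoaderDepend

class MySiteBlog(bs.Blog):
    # resolve_* runs on the way down and fetches descendants through a dataloader
    comments: list[cs.Comment] = []
    def resolve_comments(self, loader=LoaderDepend(cs.blog_to_comments_loader)):
        return loader.load(self.id)

    # post_* runs after the descendants of this node have been resolved
    comment_count: int = 0
    def post_comment_count(self):
        return len(self.comments)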
```


### BFR and dataloader
BFR (breadth-first resolve) is the core process for fetching descendants while avoiding N+1 queries with a dataloader.

It resolves level by level, concurrently.

We can run this process manually to resolve the descendants of some data.

```python
data = await Resolver().resolve(data)
```
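
To make "level by level, concurrently" concrete, here is a minimal sketch in plain `asyncio`. It is not how pydantic-resolve is implemented; `CHILDREN`, `fetch_children` and `resolve_breadth_first` are hypothetical names used only for illustration, and nodes are plain dicts.

```python
import asyncio

# Hypothetical in-memory "database": parent id -> child ids.
CHILDREN = {1: [2, 3], 2: [4], 3: [5], 4: [], 5: []}

async def fetch_children(node_id):
    """Stand-in for an async query that returns the children of one node."""
    await asyncio.sleep(0)
    return [{"id": cid, "children": []} for cid in CHILDREN[node_id]]

async def resolve_breadth_first(root):
    level = [root]
    visited_levels = []
    while level:
        visited_levels.append([node["id"] for node in level])
        # Resolve every node of the current level concurrently.
        results = await asyncio.gather(*(fetch_children(n["id"]) for n in level))
        next_level = []
        for node, children in zip(level, results):
            node["children"] = children
            next_level.extend(children)
        level = next_level  # descend one level
    return root, visited_levels

tree, levels = asyncio.run(resolve_breadth_first({"id": 1, "children": []}))
print(levels)
```

Because all nodes of one level are awaited together, a batching loader gets the chance to see every key of that level before it issues a query.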

`Dataloader` here is just a simple key collector that plays the role of `join` in SQL.
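
The "key collector" idea can be sketched without any library. The `MiniLoader` class below is a hypothetical, simplified stand-in (not the real `DataLoader` API): it gathers the keys requested in one event-loop iteration and answers them with a single batch call. `COMMENTS` and `comments_by_blog_ids` are made-up sample data and a made-up batch function.

```python
import asyncio

class MiniLoader:
    """Hypothetical, simplified loader: collect keys, then run one batch query."""

    def __init__(self, batch_fn):
        self.batch_fn = batch_fn
        self.pending = {}   # key -> Future waiting for the batch result
        self.task = None    # the scheduled dispatch task, if any

    def load(self, key):
        loop = asyncio.get_running_loop()
        if key not in self.pending:
            self.pending[key] = loop.create_future()
        if self.task is None:
            # The dispatch task starts on a later loop iteration, after the
            # sibling load() calls of this iteration have registered their keys.
            self.task = loop.create_task(self._dispatch())
        return self.pending[key]

    async def _dispatch(self):
        pending, self.pending, self.task = self.pending, {}, None
        keys = list(pending)
        values = await self.batch_fn(keys)  # one query for all collected keys
        for key, value in zip(keys, values):
            pending[key].set_result(value)

COMMENTS = [(1, "nice"), (1, "thanks"), (2, "hmm")]
calls = []  # records the keys of every batch call

async def comments_by_blog_ids(blog_ids):
    calls.append(list(blog_ids))
    return [[text for bid, text in COMMENTS if bid == blog_id] for blog_id in blog_ids]

async def main():
    loader = MiniLoader(comments_by_blog_ids)
    return await asyncio.gather(loader.load(1), loader.load(2), loader.load(1))

results = asyncio.run(main())
print(calls, results)
```

Three `load` calls end up as one batch call with the distinct keys, which is the same effect a `join` would have in SQL.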

If we want additional `where` conditions, the interface does not support them.

It would be better if we could define a loader like:

```python
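from aiodataloader import DataLoader
from pydantic_resolve import build_list
from sqlalchemy import select

# `async_session` and `Feedback` are the application's own session factory and ORM model
class FeedbackLoader(DataLoader):
    private: bool  # extra `where` condition, supplied through loader_params

    async def batch_load_fn(self, comment_ids):
        async with async_session() as session:
            res = await session.execute(select(Feedback)
                .where(Feedback.private==self.private)
                .where(Feedback.comment_id.in_(comment_ids)))
            rows = res.scalars().all()
            # group rows by comment_id, in the same order as comment_ids
            return build_list(rows, comment_ids, lambda x: x.comment_id)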
```

and provide the `where` fields with:

```python
data = await Resolver(
    loader_params={
        FeedbackLoader: {
            'private': private_comment}}).resolve(data)
```

If we can revisit nodes after their descendants are resolved and make additional modifications, it would be very powerful for constructing UI-specific view data.

From simply calculating counts:
```python
class Blog(BaseModel):
    id: int
    title: str

    comments: list[Comment] = []
    async def resolve_comments(self):
        return await query_comments(self.id)

    comment_count: int = 0
    def post_comment_count(self):
        return len(self.comments)
```

to collecting fields (`b_name`) from descendants:

```python
class A(BaseModel):
    b_list: List[B] = []
    async def resolve_b_list(self):
        return [dict(name='b1'), dict(name='b2')]

    names: List[str] = []
    def post_names(self, collector=SubCollector('b_name')):
        return collector.values()

class B(BaseModel):
    __pydantic_resolve_collect__ = {
        'name': 'b_name',
        'items': 'b_items'
    }
    name: str
    items: List[str] = ['x', 'y']
```
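
The bottom-to-top stage can be pictured with a plain recursive sketch, independent of pydantic-resolve. `Node` and `post_process` are hypothetical names; the point is only that a node's derived fields are computed after its children have been handled.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    children: list["Node"] = field(default_factory=list)
    child_count: int = 0  # filled in on the way back up
    descendant_names: list[str] = field(default_factory=list)

def post_process(node):
    """Visit children first, then compute the fields that depend on them."""
    collected = []
    for child in node.children:
        post_process(child)
        collected.append(child.name)
        collected.extend(child.descendant_names)
    node.child_count = len(node.children)
    node.descendant_names = collected

root = Node("a", [Node("b", [Node("c")]), Node("d")])
post_process(root)
print(root.child_count, root.descendant_names)
```

Counting children and collecting names from descendants are both instances of the same pattern: the parent reads values that only exist once the subtree is complete.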


## Then, Start from RESTful
Developers often complain about the additional queries a RESTful API requires, such as querying all comments after the blogs are ready.
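
As a rough illustration of that complaint, the sketch below counts the comment queries needed for a list of blogs, once with one query per blog and once with a single batched query. All names (`BLOGS`, `COMMENTS`, `comments_for_blog`, `comments_for_blogs`) are hypothetical and the "queries" are plain list scans.

```python
BLOGS = [{"id": 1, "title": "first"}, {"id": 2, "title": "second"}]
COMMENTS = [
    {"blog_id": 1, "text": "nice"},
    {"blog_id": 1, "text": "thanks"},
    {"blog_id": 2, "text": "hmm"},
]
query_count = {"naive": 0, "batched": 0}

def comments_for_blog(blog_id):
    """One query per blog: N comment queries for N blogs."""
    query_count["naive"] += 1
    return [c for c in COMMENTS if c["blog_id"] == blog_id]

def comments_for_blogs(blog_ids):
    """One query for all blogs, grouped by blog id afterwards."""
    query_count["batched"] += 1
    grouped = {blog_id: [] for blog_id in blog_ids}
    for c in COMMENTS:
        if c["blog_id"] in grouped:
            grouped[c["blog_id"]].append(c)
    return grouped

naive = [{**b, "comments": comments_for_blog(b["id"])} for b in BLOGS]
grouped = comments_for_blogs([b["id"] for b in BLOGS])
batched = [{**b, "comments": grouped[b["id"]]} for b in BLOGS]
print(query_count)
```

Both approaches build the same view data; only the number of round trips differs, and that gap grows with the number of blogs.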

Everything gets simple once we define Blog as:

```python
class MySiteBlog(bs.Blog):
    comments: list[cs.Comment] = []
    def resolve_comments(self, loader=LoaderDepend(cs.blog_to_comments_loader)):
        return loader.load(self.id)

    comment_count: int = 0
    def post_comment_count(self):
        return len(self.comments)
```

and trigger it manually by:

```python
blogs = await get_blogs()
blogs = await Resolver().resolve(blogs)
```

We can inherit the blog schema and extend it with comments for each blog, using a dataloader for resolving. It looks much like GraphQL, but you still enjoy all the benefits of RESTful (caching, auth...).


## Conclusion

Compared to GraphQL, the major difference is that each of these specific schemas is used by a single entry, not reused between entries. So developers can `refactor` one entry without worrying about breaking others.

Another pain point in constructing view data is the structural gap between the persistence layer and a specific view. Resolving alone can't bridge that gap; it takes additional tools that adjust the fetched data during the backward (bottom-to-top) stage of BFR. This can be very powerful.

I implemented these concepts in a Python library named Pydantic-resolve. Using it with FastAPI, I immediately gained the benefits of both GraphQL and RESTful, plus a seamless integration experience with openapi-typescript-codegen (hey-api/openapi-ts).
81 changes: 53 additions & 28 deletions docs/tree.md
@@ -1,6 +1,6 @@
# Tree
# Tree, self-reference structure

## Build a tree
## Build a tree with dataloader

```python
from __future__ import annotations
Expand Down Expand Up @@ -73,36 +73,61 @@ asyncio.run(main())

## Construct the path with parent

```python hl_lines="6"
If we want to visit the parent node to build a full path field, use the `parent` parameter.

```python hl_lines="10"
class Tree(BaseModel):
name: str
id: int
content: str
children: List[Tree] = []

def resolve_children(self, loader=LoaderDepend(Loader)):
return loader.load(self.id)

path: str = ''
def resolve_path(self, parent):
if parent is not None:
return f'{parent.path}/{self.name}'
return self.name

@pytest.mark.asyncio
async def test_tree():
data = dict(name="a", children=[
dict(name="b", children=[
dict(name="c")
]),
dict(name="d", children=[
dict(name="c")
])
])
data = await Resolver().resolve(Tree(**data))
assert data.dict() == dict(name="a", path="a", children=[
dict(name="b", path="a/b", children=[
dict(name="c", path="a/b/c", children=[])
]),
dict(name="d", path="a/d", children=[
dict(name="c", path="a/d/c", children=[])
])
])
if parent:
return f'{parent.path}/{self.content}'
else:
return self.content
```

then it's done

```json
{
    "id": 1,
    "content": "root",
    "path": "root",
    "children": [
        {
            "id": 2,
            "content": "2",
            "path": "root/2",
            "children": [
                {
                    "id": 4,
                    "content": "4",
                    "path": "root/2/4",
                    "children": []
                }
            ]
        },
        {
            "id": 3,
            "content": "3",
            "path": "root/3",
            "children": [
                {
                    "id": 5,
                    "content": "5",
                    "path": "root/3/5",
                    "children": []
                }
            ]
        }
    ]
}
```

## Sum up from bottom to top
Expand Down Expand Up @@ -140,7 +165,7 @@ asyncio.run(main())
```


```json hl_lines="18 21"
```json hl_lines="7 15 18 21"
{
"count": 10,
"children": [
11 changes: 9 additions & 2 deletions examples/11_tree.py
Expand Up @@ -23,11 +23,18 @@ async def batch_load_fn(self, keys):
class Tree(BaseModel):
    id: int
    content: str

    path: str = ''
    def resolve_path(self, parent):
        if parent:
            return f'{parent.path}/{self.content}'
        else:
            return self.content

    children: List[Tree] = []

    def resolve_children(self, loader=LoaderDepend(Loader)):
        return loader.load(self.id)

async def main():
    tree = Tree(id=1, content='root')
    tree = await Resolver(loader_params={Loader: {'records': records}}).resolve(tree)
1 change: 1 addition & 0 deletions mkdocs.yml
Expand Up @@ -18,6 +18,7 @@ nav:
- Dataloader: dataloader.md
- Inheritance: inherit.md
- Tree: tree.md
- Philosophy: philosophy.md
- Reference:
- API: reference_api.md
- Change log: