Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backreferences in defining colums #83

Closed
pankus opened this issue Feb 8, 2017 · 5 comments
Closed

Backreferences in defining colums #83

pankus opened this issue Feb 8, 2017 · 5 comments
Labels

Comments

@pankus
Copy link

pankus commented Feb 8, 2017

First, thank you so much for your work. It is great!
I'm trying to manage a table having m:m relationship mapped with a backref. Something similar to the Parent:Child scheme as described in the sqlalchemy tutorial (http://docs.sqlalchemy.org/en/latest/orm/basic_relationships.html#many-to-many).
By defining the columns for the Parent class, for instance, if I use the backreference Parent.children this message appears: " DataTables warning: table id=maindata - Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with Parent.children has an attribute 'cast' ".
In sum, this code does not work as I expect (but I'm a newbie):

columns = [
...
ColumnDT(Parent.id, search_method='yadcf_range_number_slider'),
ColumnDT(Parent.children, search_method='yadcf_text'),
...
]

Is there any workaround to solve such a case?

@tdamsma
Copy link
Collaborator

tdamsma commented Feb 9, 2017

Can you also post your query? It should be something like this

query().select_from(Parent).join(Child)

Then you can add columns like so:

columns = [
...
ColumnDT(Parent.id, search_method='yadcf_range_number_slider'),
ColumnDT(Child.name, search_method='yadcf_text'),
...
]

Of course this will result in a table row for each combination of Parent<->Child you have. If you want to have one row for every Parent, then you have to aggregate the Children into one string, e.g. a comma separated list. You should be able to do that with string_agg or something, not 100% sure.

query().select_from(Parent).outerjoin(Child).group_by(Parent.id)

And a column like:

ColumnDT(func.string_agg(Child.name, ','), search_method='yadcf_text')

@pankus
Copy link
Author

pankus commented Feb 9, 2017

dear tdamsma thank you very much for your quick answer.
Well, following your suggestion, the main query now is:

query = db.session.query().select_from(Parent).outerjoin(Child, Parent.children)

and the columns:

columns = [
...
ColumnDT(Parent.id, search_method='yadcf_range_number_slider'),
ColumnDT(Child.name, search_method='yadcf_text'),
...
]

In this case everything works well
However, as you pointed out, this gives a table with each possible combination of the relationship Parent<->Child, whereas I need a row for every Parent and an array of children grouped together. But if the main query becomes:

query = db.session.query().select_from(Parent).outerjoin(Child, Parent.children).group_by(Parent.id)

with the relevant column:

ColumnDT(func.string_agg(Child.name, ', '), search_method='yadcf_text')

This error message appears:

DataTables warning: table id=maindata - Multiple rows were found for one()

As far as I can understand, something happens when the group_by clause is added to the query.
On the other hand I checked the query in my database as well as in python shell and it is correct. It gives back the exactly the number of rows I would expect.

p.s. In one of my tries I forgot to append .group_by(Parent.id) to the query. I that case the error message was:

DataTables warning: table id=maindata - (psycopg2.ProgrammingError) aggregate functions are not allowed in WHERE

I don't know if this is relevant here or is a different error.

@tdamsma
Copy link
Collaborator

tdamsma commented Feb 10, 2017

@pankus, glad to help. Forgot about that, but indeed aggregate functions are not allowed in WHERE, so you can not search for the child column now. I guess you do want to be able to do this, in which case you can use a CTE / WITH clause to achieve this.

Luckily, using SQLAlchemy this is not too difficult. From the top of my head (so might contain some mistakes):

parent_with_children = dbsession.query(
    Parent.id, 
    func.string_agg(Child.name, ', ').label('children')
).\
    outerjoin(Child).\
    group_by(Parent.id).\
    cte()

query = db.session.query().select_from(parent_with_children )

columns = [
ColumnDT(parent_with_children .c.id, search_method='yadcf_range_number_slider'),
ColumnDT(parent_with_children .c.children, search_method='yadcf_text'),
]

Note that if your data set is large, this might be a bit slow, as I believe the join between parent and children is always run over the entire parent and children table. The result of this join is put in a CTE, which is then filtered and paginated. Also, this CTE is not indexed.

To overcome this, you could define a separate Materialized view for the parent_with_children table, which you could index/limit/filter as any other table, with no performance impact. Of course, you then need to put in triggers or other logic to keep the Materialized view up to date.

Another alternative would be to partially denormalize you schema, by storing the children in the parent in e.g. a JSONB column. Of course this has other implications

@pankus
Copy link
Author

pankus commented Feb 10, 2017

Congratulations @tdamsma. I'm really impressed by your knowledge of this subject. I'm new to sqlalchemy and above all I'm not a real programmer, thus all this appears to me really as a true alchemy.
My application now works like a charm. As a minor drawback, I noticed just a little delay in the responses; however this is not a real problem, especially if one takes into consideration the many advantages of using db relationships in such a way. Thanks a lot.

Just the last question (only to learn something more). What about the error message
DataTables warning: table id=maindata - Multiple rows were found for one()

Did you understand what does it means? Does it depend on .group_by() clause?

@tdamsma
Copy link
Collaborator

tdamsma commented Feb 17, 2017

In the function _set_yadcf_data min/max values are determined for the 'yadcf_range_number_slider' columns. Selecting the min/max of a query with a group_by clause returns more than one row. Am not entirely sure if using Group_by is actually a good idea and should be supoorted (in which case thsi is a bug), or if it is better to disallow use of group_by in the query altogether.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants