Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GroupConcat deserialize to lists or sets #50

Open
adamchainz opened this issue Mar 27, 2015 · 12 comments
Open

GroupConcat deserialize to lists or sets #50

adamchainz opened this issue Mar 27, 2015 · 12 comments

Comments

@adamchainz
Copy link
Owner

It's kinda annoying that GroupConcat doesn't automatically split back into lists (or maybe sets when distinct=True) on the python side - 90% of the use cases require the user to do this manually with str.split on the result before being able to use this code.

I've looked into using the list and set fields from #10 and #11 but it's a bit complicated - they don't support the custom separators, the datatypes aren't so neat to convert, it requires a bit of trickery. It might still be possible, otherwise we might have to just write custom code or some dynamic field type to achieve this - so that it ties in with the rest of the ORM.

@adamchainz adamchainz changed the title GroupConcat deserialize with SetTextField or ListTextField GroupConcat deserialize to lists or sets Apr 6, 2015
@tuky
Copy link
Contributor

tuky commented May 20, 2019

This would be a really nice feature. currently i am working around this by doing

Subquery(Coalesce(qs.annotate(annotation=GroupConcat('related_pk')).values('annotation'), Value(''), output_field=ListCharField(models.CharField())))

but this is buggy, when you want to do __contains on the subquery. With the Coalesce above it searches with the Value('') in the subquery expression and if you remove the Coalesce and replace it with an ExpressionWrapper to do the conversion to ListCharField, then you end up with

  File "/var/env/lib/python3.6/site-packages/django_mysql/models/lookups.py", line 175, in as_sql
    params = lhs_params + rhs_params
TypeError: can only concatenate tuple (not "list") to tuple

edit:
Using django 2.2

edit 2:
currently i am monkey patching SetContains.as_sql to feature

    params = list(lhs_params) + list(rhs_params)

@adamchainz
Copy link
Owner Author

Re: edit 2, that seems like it shouldn't be necessary as both vars should be lists. Can you give the full traceback? Where did the tuple come from? Transforms inside Django itself expect to receive lists for both params variables, e.g. https://github.com/django/django/blob/master/django/contrib/postgres/lookups.py#L12

@tuky
Copy link
Contributor

tuky commented May 28, 2019

Here is the full traceback:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
    City.objects.annotate_region_slugs().filter(region_slugs__contains='foo-slug')
  File "/var/env/lib/python3.6/site-packages/django/db/models/query.py", line 250, in __repr__
    data = list(self[:REPR_OUTPUT_SIZE + 1])
  File "/var/env/lib/python3.6/site-packages/django/db/models/query.py", line 274, in __iter__
    self._fetch_all()
  File "/var/env/lib/python3.6/site-packages/django/db/models/query.py", line 1242, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/var/env/lib/python3.6/site-packages/django/db/models/query.py", line 55, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/var/env/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1087, in execute_sql
    sql, params = self.as_sql()
  File "/var/env/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 489, in as_sql
    where, w_params = self.compile(self.where) if self.where is not None else ("", [])
  File "/var/env/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 405, in compile
    sql, params = node.as_sql(self, self.connection)
  File "/var/env/lib/python3.6/site-packages/django/db/models/sql/where.py", line 81, in as_sql
    sql, params = compiler.compile(child)
  File "/var/env/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 405, in compile
    sql, params = node.as_sql(self, self.connection)
  File "/code/src/core/fields.py", line 65, in as_sql
    params = lhs_params + rhs_params
TypeError: can only concatenate tuple (not "list") to tuple

i monkey patched the method like so to print some outputs:

def as_sql(self, qn, connection):
    lhs, lhs_params = self.process_lhs(qn, connection)
    rhs, rhs_params = self.process_rhs(qn, connection)

    print(repr(lhs))
    print(repr(lhs_params))
    print(repr(rhs))
    print(repr(rhs_params))

    params = lhs_params + rhs_params
    # Put rhs on the left since that's the order FIND_IN_SET uses
    return 'FIND_IN_SET(%s, %s)' % (rhs, lhs), params

it prints this:

'(SELECT GROUP_CONCAT(U0.`slug`) AS `slugs` FROM `core_region` U0 WHERE ST_Contains(U0.`polygon`, (`core_city`.`point`)))'
()
'%s'
['foo-slug']

@adamchainz
Copy link
Owner Author

Huh. It looks like this might be a bug in django itself, the subquery on the left hand side is what's returning its params in a tuple. And indeed it seems that the SQLCompiler class forces tuples: https://github.com/django/django/blob/master/django/db/models/sql/compiler.py#L614 . Can you raise this on https://code.djangoproject.com/ , CC me, and even look into a patch? 😇 https://docs.djangoproject.com/en/dev/internals/contributing/

@tuky
Copy link
Contributor

tuky commented May 29, 2019

Tuples seem the way to go, so this might require fixing in here, i guess.

adamchainz added a commit that referenced this issue Jun 12, 2019
adamchainz added a commit that referenced this issue Jun 14, 2019
@adamchainz
Copy link
Owner Author

Version 3.2.0 treats everything as tuples, try that.

@tuky
Copy link
Contributor

tuky commented Jun 14, 2019

Thank you very much, don't have to monkey patch it anymore :-)

@javiplx
Copy link

javiplx commented Apr 26, 2020

Not completelly sure whehther is related to this particular issue, but with 1.11.17/3.2.0 for django/django-mysql, I've need to patch following https://jira.mongodb.org/browse/PYMODM-77.
With that change, I can successfully run

[...].annotate(members=GroupConcat('user_id', output_field=SetCharField(CharField))).[...]

I also needed to raise value of group_concat_max_len mysql variable, but that's just because my output lists are very long and got truncated in the mysql side

I'm still in the process to review this, because I get a discrepancy in the number of entries reported by count() and group_concat() with the same group_by/annotation

EDIT: Forget about the mismatch. It just happened that my sample record seemed to be completed with the extended length (not clearly ending with , as with original sizes). Setting max_len to x100 (instead of x10 as originally), produced the right result.

@adamchainz
Copy link
Owner Author

@javiplx I am not sure how that MongoDB ORM issue aligns with here. Can yuo make a PR for your patch, including a test that fails before and passes after?

@javiplx
Copy link

javiplx commented Apr 26, 2020

I wasn't even aware that it was mongodb for django. Just finding matches for the exception I got.
I'll fork and submit the changes.

@javiplx
Copy link

javiplx commented Apr 26, 2020

The PR is there (#646). It cannot be merged because i started from tag v3.2.0 but you can see changes there.

And it is even more funnier.
With the original 3.2.0 from PyPI, I got

TypeError: to_python() missing 1 required positional argument: 'value'

which is the text what lead me to the "solution" (which is the first commit on the PR), that enabled me to give explicit output_field argument. Then, my laziness suggested me to supply default fields on code (as already suggested there), and I found that when output_field is not explicitly given I get a similar exception , but complaining about too many positional arguments (our python is 3.7.4 in case that matters).

So, I'm somewhat happy because second commit is what I actually wanted, but also a bit disliked. Anyway, I'll make shortly a second PR with that second commit

@adamchainz
Copy link
Owner Author

I wasn't even aware that it was mongodb for django.
It's not even for Django. They just happened to copy the Django ORM for their MongoDB ORM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants