Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicated table rows after sorting by column header #361

Closed
wtfrank opened this issue Aug 3, 2016 · 36 comments
Closed

Duplicated table rows after sorting by column header #361

wtfrank opened this issue Aug 3, 2016 · 36 comments
Labels

Comments

@wtfrank
Copy link

wtfrank commented Aug 3, 2016

I'm experiencing an issue where some rows are rendered multiple times in place of other rows. The table in question used to render perfectly well on an earlier versions of django and django-tables2. I upgraded both of them at the same time to the latest version of each (django 1.10, django-tables2 1.2.4) then this issue started happening.

See this image here for a demonstration: http://imgur.com/a/pPRT8

When I load a table without choosing a sort order by the column, I don't see this issue. Once I click on a column header to change the sort order then rows are duplicated.

So far I've identified that the data passed into the BoundRows object is correct, with no duplicated items. However iterating through the bound rows returns duplicated entries. So it looks like something may be changing the data object while the iteration is taking place.

This issue is happening on tables of length ~150, and I haven't (yet) been able to reproduce this issue with a smaller test dataset.

This snippet shows the test code I have used to narrow down the cause of the issue.

--- a/django_tables2/rows.py
+++ b/django_tables2/rows.py
@@ -8,6 +8,7 @@ from django.utils import six
 from .columns.linkcolumn import BaseLinkColumn
 from .utils import A, AttributeDict, call_with_appropriate, computed_values

+import sys

 class BoundRow(object):
     """
@@ -187,9 +188,17 @@ class BoundRows(object):
     def __init__(self, data, table):
         self.data = data
         self.table = table
+        for d in data:
+            print >>sys.stderr, d
+
+        print >>sys.stderr, "end data"
+

     def __iter__(self):
         for record in self.data:
+            print >>sys.stderr, "__iter__", record
             yield BoundRow(record, table=self.table)
@wtfrank
Copy link
Author

wtfrank commented Aug 3, 2016

Crystallising the queryset via list() stops the issue happening:

+++ b/django_tables2/rows.py
@@ -8,6 +8,7 @@ from django.utils import six
 from .columns.linkcolumn import BaseLinkColumn
 from .utils import A, AttributeDict, call_with_appropriate, computed_values

+import sys

 class BoundRow(object):
     """
@@ -187,9 +193,26 @@ class BoundRows(object):
     def __init__(self, data, table):
         self.data = data
         self.table = table


     def __iter__(self):
+        print >>sys.stderr, type(list(self.data.data))

@jieter
Copy link
Owner

jieter commented Aug 17, 2016

Thanks for reporting, casting to list is not acceptable as a solution since that won't work with bigger datasets.

Can you try to downgrade your django-tables2 version to make sure it's not a problem with updating your Django version?

@wtfrank
Copy link
Author

wtfrank commented Aug 17, 2016

Yeah definitely not a solution - but it may be indicative of the cause. The other reason casting to list isn't a suitable solution is because it doesn't always seem to be a queryset holding the data anyway, in which case self.data.data won't exist.

@jieter
Copy link
Owner

jieter commented Aug 22, 2016

@wtfrank, I see, but it would really help if you could provide a reproducible test case. Did you try downgrading django-tables2 to the previous version you used?

@fliphess
Copy link

fliphess commented Sep 5, 2016

I think i'm running into the same issue: In my table, one row is missing, and another one is a duplicate... For me this started with version 1.2.2.
Downgrading to 1.2.1 "fixxes" the problem for me.... Is there any information i can provide to make the search easier?

@jieter
Copy link
Owner

jieter commented Sep 6, 2016

A minimal test case would be really nice to have. We can then run git bisect to find the bad commit.

@fliphess
Copy link

fliphess commented Sep 6, 2016

I'm looking into a test case at the moment, but strangely enough i cannot reproduce the issue on my test environment ...

Locally (using ./manage runserver using sqlite) all is well, while in production (running uwsgi and mysql) the problem of dupes exists... (both environments run on equal versions)

The queryset does not contain any dupes.... But at the time the table.html is rendered, the queryset is missing one object and another one is duplicate...

Getting back to you soon with a test case hopefully :)

@felixxm
Copy link
Contributor

felixxm commented Sep 6, 2016

The issue occurs also in 1.2.6 version. I think it is sth with #329 or #330 but so far I can not find way to fix it.

@jieter
Copy link
Owner

jieter commented Sep 7, 2016

Those are only applicable to 1.2.6, so I don't expect them to be the source of this bug.

@drdaeman
Copy link

drdaeman commented Oct 8, 2016

Sorry, I'm not able to provide a proper reproducible case, but I've just got this issue (or, at least, I believe it's this issue) and can tell that it certainly had happened between 1.2.1 and 1.2.2.

Whenever I downgrade to 1.2.1, everything looks just fine. When I upgrade to 1.2.2 or above (I've started with latest 1.2.6, of course), I consistently lose the first row (out of three I have in DB) and get a copy of another one instead of it. E.g., when on 1.2.1 I have:

date hits ...
2016-10-08 123 ...
2016-10-07 321 ...
2016-10-06 0 ...

On 1.2.2-1.2.6 I consistently get this instead:

date hits ...
2016-10-06 0 ...
2016-10-07 321 ...
2016-10-06 0 ...

I'm using Django 1.9.9 on Python 2.7.12, project was started using pydanny/cookiecutter-django (although heavily modified after that), and my code looks like this:

class DailySummaryTable(tables.Table):
    class Meta:
        model = DailySummary
        attrs = {"class": "paleblue"}
        empty_text = _("No stats yet")
        fields = ("date", "hits", ...long boring list...)
        # order_by = ("-date",)
        # orderable = False

Model's Meta only has verbose_name{,_plural} stuff, and view is a simple DetailView that looks like this:

class SourceDetailView(LoginRequiredMixin, DetailView):
    model = Source
    ...
    def get_context_data(self, **kwargs):
        ctx = super(...)
        ctx["stats"] = DailySummaryTable(DailySummary.objects.filter(source=self.object))
        return ctx

And there are three rows in DB (PostgreSQL 9.6), all having the same source_id (I have only one), so they all match the query. Strangely, If I replace .filter(source=...) with .all() the issue seems to disappear.

That's all I was able to figure out. Hope it helps. I guess, I'll just stick with 1.2.1 for now :)

@jieter
Copy link
Owner

jieter commented Oct 9, 2016

Thanks for showing your wanderings, I'll look into this issue later.

@op-alex-reid
Copy link

I just ran into the same issue, crystallising the queryset via list() is not a viable solution on our side neither.

@jieter
Copy link
Owner

jieter commented Oct 19, 2016

@op-alex-reid can mlyou try to turn your use case into a minimal reproducible test case?

@lukasklein
Copy link

I'm having the same problem (Django==1.9, django-tables==1.2.3) with the following code:

class UserPlayerTable(tables.Table):
    actions = tables.LinkColumn('player:my_players_detail', args=[A('slug')],
                                text=_('View / Edit'),
                                verbose_name=_('View / Edit'), empty_values=())
    embed = tables.LinkColumn('player:my_players_embed', args=[A('slug')],
                                text=_('View / Embed now'),
                                verbose_name=_('View / Embed now'), empty_values=())

    class Meta:
        template = 'tables/table.html'
        model = Player
        fields = ('created', 'slug', 'actions', 'embed')
        attrs = {"class": "changeset"}
        order_by = ['-created']
        orderable = False

and

UserPlayerTable(Player.objects.filter(user=context['impersonate_user']))

Interestingly, when I set orderable = True, the problem does not occur.

In my case, crystallising the queryset via list() is an option (and the fix for now), since there are only at most 5 rows in the table.

@jieter
Copy link
Owner

jieter commented Oct 27, 2016

Thanks for the example!

@jieter jieter added the bug label Nov 14, 2016
@jieter
Copy link
Owner

jieter commented Dec 10, 2016

I tried reproducing this using this code:

class Player(models.Model):
    person = models.ForeignKey(Person)
    created = models.DateTimeField(auto_now_add=True)
    score = models.PositiveIntegerField()

def test_issue_361(per_page=5):

    bob = Person.objects.create(first_name='Bob', last_name='Builder')
    eve = Person.objects.create(first_name='Eve', last_name='Dropper')

    for i in range(10):
        Player.objects.create(person=bob, score=randint(0, 200))
        Player.objects.create(person=eve, score=randint(200, 400))
        Player.objects.create(person=bob, score=5)
        Player.objects.create(person=eve, score=5)

    class UserPlayerTable(tables.Table):
        class Meta:
            model = Player
            fields = ('person.name', 'score',)
            order_by = ['-score']
            orderable = False

    queryset = Player.objects.filter(score=5)

    table = UserPlayerTable(queryset)
    RequestConfig(request, paginate={'per_page': per_page}).configure(table)
    html = table.as_html(request)

    # count the number of rows, subtract one for the header
    assert (html.count('<tr') - 1) == per_page

but this doesn't yield duplicated rows...

@n0ctua
Copy link

n0ctua commented Jan 20, 2017

I encountered the same issue today. I'm using django-tables2==1.2.9 in a Django==1.10 project, where I also utilize django-filters==1.0.1.

Interestingly enough the doubling of rows in my table only occurs if

  1. there are many entries to display
    (so if I enable pagination with 25 records per page everything is fine, but if I display the full dataset I get multiple identical entries)

  2. the table is sorted by a column that contains non-strings
    (everything is fine if I sort by a column that contains e.g. names, but sorting by columns with dates, ints or bools results in trouble)

Downgrading to 1.2.1 solves the problem, but that is not really an option.
Furthermore the dupes seem to be shown instead of and not in addition to real entries, because the count {{filter.qs.count}} is as it should be.

I hope this helps a little bit with getting to the bottom of the problem, because other than that I really enjoy working with django-tables2.

Thanks for all your work!

@jieter
Copy link
Owner

jieter commented Jan 20, 2017

@n0ctua thanks!

Can you have a look at the exact queries executed and maybe share them here?

@intiocean
Copy link
Contributor

I've just had this problem too and understand why it's happening for me so thought I'd post as I suspect others have it for the same reason. (Although, I haven't tried with other versions of django-tables2, so I might be wrong as I don't understand why that would stop this happening... Although... actually thinking about it I would hypothesise that this is because in older versions non-paginated tables ran a single query against the db instead of a query for each row – more details on this below...)

It's because SQL order by on non-unique fields is not deterministic so when combined by top n you get the same top n rows again and again. (Also, for some reason when I create a Table with no pagination it generates a query for every row. I.E. top 1).

I'm not sure what the best solution is here? I think the only way to guarantee this will work is to always have the final order by be on a unique field. Ideally the primary key? Also, might be worth adding some notes to the docs warning about this?

As a separate problem but one that will really reduce how often this happens (and speed up queries for non-paginated tables) is if instead of running a top 1 query for every row in the table would be just running a single query that fetches all the results.

@jieter
Copy link
Owner

jieter commented Feb 27, 2017

@intiocean this number of queries for non-paginated tables is quite strange, not sure why this happens, and it should not happen. This test case shows the problem:

def test_single_query_for_non_paginated_table(settings):
    '''
    A non-paginated table should not generate a query for each row, but only
    one query to count the rows and one to fetch the rows.
    '''
    from django.db import connection
    settings.DEBUG = True

    for i in range(10):
        Person.objects.create(first_name='Bob %d' % randint(0, 200), last_name='Builder')

    num_queries_before = len(connection.queries)

    class PersonTable(tables.Table):
        class Meta:
            model = Person
            fields = ('first_name', 'last_name')

    table = PersonTable(Person.objects.all())
    request = build_request('/')
    RequestConfig(request, paginate=False).configure(table)
    table.as_html(request)

    # print '\n'.join(q['sql'] for q in connection.queries)
    assert len(connection.queries) - num_queries_before == 2

@intiocean
Copy link
Contributor

intiocean commented Feb 27, 2017

I have tried, unsuccessfully, to create a test for the paginated case that fails. I don't understand why but everything appears to work as normal in the paginated case... I thought I'd get rows that appear as if duplicated on multiple pages if sorting by a non-unique field.

jieter added a commit that referenced this issue Feb 27, 2017
jieter added a commit that referenced this issue Feb 27, 2017
jieter added a commit that referenced this issue Feb 27, 2017
@jieter
Copy link
Owner

jieter commented Feb 27, 2017

@wtfrank, @intiocean @n0ctua @op-alex-reid @fliphess I just pushed some commits to master which might fix this. Can you verify?

@jieter
Copy link
Owner

jieter commented Feb 27, 2017

@intiocean If that is the source of this issue, I think we can only fix this issue partially by documenting this behaviour.

intiocean pushed a commit to intiocean/django-tables2 that referenced this issue Feb 27, 2017
intiocean pushed a commit to intiocean/django-tables2 that referenced this issue Feb 27, 2017
@intiocean
Copy link
Contributor

intiocean commented Feb 27, 2017

I can confirm that this fixes non-paginated tables @jieter. This also improves performance for non-paginated tables as there is now only one query instead of one for each row.

Check:
With 962f502 and sorting by the 'Notes' column -> duplication
image

With f853078 and sorting by the 'Notes' column -> no duplication 🎉
image

@intiocean
Copy link
Contributor

Any chance you could do a release @jieter?

@jieter
Copy link
Owner

jieter commented Feb 27, 2017

wohoo, finally fixed!

Yes, I'll release within an hour.

@jieter jieter closed this as completed Feb 27, 2017
@jieter
Copy link
Owner

jieter commented Feb 27, 2017

released 1.4.0

@intiocean
Copy link
Contributor

Awesome, thanks @jieter!

@n0ctua
Copy link

n0ctua commented Mar 1, 2017

@jieter and @intiocean thank you both so much for fixing this! Works perfectly now.

@sibiryoff
Copy link

sibiryoff commented Jan 20, 2019

Hi, I have just experienced this issue on 1.21.2 version and Django==2.0.6. Thanks to intiocean 's comment, fixed it with setting additional ordering by 'pk'.

@silvio-liqi
Copy link

We are working with Django==4.1.2 and django-tables2 = "^2.5.3" and are having this issue.
Adding list to the get_table_data fixed the duplicates while ordering for a header column, but it's a workaround

@jieter
Copy link
Owner

jieter commented Oct 6, 2023

Does this comment help? #361 (comment)

@silvio-liqi
Copy link

silvio-liqi commented Feb 1, 2024

No it doesn't. We filter the queryset using both a date field and the id (primary key field), but records are still duplicated between pages.

The problem appears when using a header column to sort (the date column) and the results are paginated.

Current workaround is to wrap the queryset in a list/iter: in this case, the uniqueness of records is preserved

@silvio-liqi
Copy link

I will report some additional info during debugging.
I think the "problem" may lay within the RequestConfig
image

Problem is that table.order_by is overridden by current request sort parameters. So it is ignored any previous order by done by database

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests