Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement natural sorting for the sites list #303

Closed
nicko170 opened this issue Jul 15, 2016 · 12 comments
Closed

Implement natural sorting for the sites list #303

nicko170 opened this issue Jul 15, 2016 · 12 comments
Labels
type: feature Introduction of new functionality to the application

Comments

@nicko170
Copy link

Orders 10, 11, 12, 13, 1, 2, 3

@nicko170
Copy link
Author

image

@jeremystretch
Copy link
Member

Looks like another candidate for natsort. Should be simple to implement.

Just out of curiosity, why do you name sites like this?

@jeremystretch jeremystretch added the type: feature Introduction of new functionality to the application label Jul 15, 2016
@nicko170
Copy link
Author

That would be awesome, I had a read but django isn't really my thing - was hoping to just be able to fix and send back :-)

Sites of ours are assigned a number which everything is labelled against - so switches are sw-6-1 sw-6-2 etc, routers bdr-1-1, bdr-1-2, bdr-6-1 - Was done before I started and I just kept it going for new sites.

@jeremystretch jeremystretch changed the title Sites Page - Order numerical numbers first Implement natural sorting for the sites list Jul 15, 2016
@Zanthras
Copy link
Contributor

My coworkers on the enterprise side name sites(aka offices) like that as well because it matches their ip space assigned. I find it odd, but to each their own.

@jeremystretch
Copy link
Member

I know I suggested natsort earlier, but thinking further on this I'm not sure it's a road we want to go down. Calling natsorted on a queryset forces its evaluation prior to pagination (which is handled by the table's RequestConfig). This means that all rows matching the given filter set are pulled from the database to be sorted, even when only a subset will be displayed.

A more efficient approach would be to effect sorting the database by CASTing integers, but it's impossible to achieve a foolproof sorting scheme since the format of a site name is arbitrary. Maybe it would be enough to order sites (and potentially other objects) first by a leading integer, if one exists, and then fall back to the default ordering. Something like this:

queryset.extra(select={
    '_number': "CAST(SUBSTRING({} FROM '^([0-9]+)') AS integer)".format(sql_col),
}).order_by('_number')

@Zanthras
Copy link
Contributor

I'll do some research, try and see if i can find a more general solution, possibly chunk the string into up to 3 or 4 parts to sort each chunk normally, which should result in better? returned ordering, and still pre data read on the db.

@Zanthras
Copy link
Contributor

Zanthras commented Jul 19, 2016

I saw this option http://stackoverflow.com/questions/12965463/humanized-or-natural-number-sorting-of-mixed-word-and-number-strings/20667107#20667107 actually implementing something like that seemed to require falling back to a querysetraw which seems to not be worth it at all. Some of the other options seemed to require custom types or tables which could be doable, but not directly obvious from django's ORM. I have here the best i could come up with in those constraints.

Given some data from the provided example at the top of the issue here is the sorted results
image

I'll submit a pull request for the complete thing if this is a decent alternative, the bulk of the code is this function.

def natural_ordering(queryset, sql_col, fallback_ordering):

    ordering = ('_strchunk1', '_intchunk1', '_strchunk2', '_intchunk2', '_strchunk3', '_intchunk3', '_strchunk4',
                '_intchunk4') + fallback_ordering

    return queryset.extra(select={
        '_strchunk1': "SUBSTRING({} FROM '^(\D+)')".format(sql_col),
        '_intchunk1': "CAST(SUBSTRING({} FROM '^(\d+)') AS integer)".format(sql_col),
        '_strchunk2': "SUBSTRING({} FROM '^\d+(\D+)')".format(sql_col),
        '_intchunk2': "CAST(SUBSTRING({} FROM '^\D+(\d+)') AS integer)".format(sql_col),
        '_strchunk3': "SUBSTRING({} FROM '^\D+\d+(\D+)')".format(sql_col),
        '_intchunk3': "CAST(SUBSTRING({} FROM '^\d+\D+(\d+)') AS integer)".format(sql_col),
        '_strchunk4': "SUBSTRING({} FROM '^\d+\D+\d+(\D+)')".format(sql_col),
        '_intchunk4': "CAST(SUBSTRING({} FROM '^\D+\d+\D+(\d+)') AS integer)".format(sql_col),
    }).order_by(*ordering)`

@jeremystretch
Copy link
Member

I think we can simplify that a lot. I only see the need for three chunks:

  1. Leading digits, if any (treated as a single integer)
  2. Middle text
  3. Trailing digits, if any (treated as a single integer)

This should cover 99% of cases, I would think.

@jeremystretch
Copy link
Member

I think I have the ordering pretty much nailed down:

queryset = super(NaturalOrderByManager, self).get_queryset().extra(select={
            id1: "CAST(SUBSTRING({}.{} FROM '^(\d+)') AS integer)".format(db_table, primary_field),
            id2: "SUBSTRING({}.{} FROM '^\d*(.*?)\d*$')".format(db_table, primary_field),
            id3: "CAST(SUBSTRING({}.{} FROM '(\d+)$') AS integer)".format(db_table, primary_field),
        })

Just deciding now how best to integrate it via managers.

@jeremystretch
Copy link
Member

jeremystretch commented Jul 20, 2016

I've implemented a simple (and possibly naïve) form of natural ordering for sites, racks, and devices. Objects are ordered appropriately based on leading and trailing integers. Please try it out.

With the new ordering applied:
sites_list

@joachimtingvold
Copy link
Contributor

Out of curiosity; why do the sort server-side? This requires extra queries every time you resort a column by clicking on it. As an example; Datatables for jQuery has natural sort.

@jeremystretch
Copy link
Member

Sorting on the client side requires pulling down the entire table. For some objects, this is potentially thousands of rows. In these cases it's usually much faster to cap the database query and return only the subset of rows which will be displayed.

if-fi pushed a commit to if-fi/netbox that referenced this issue Oct 1, 2016
@lock lock bot locked as resolved and limited conversation to collaborators Jan 19, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
type: feature Introduction of new functionality to the application
Projects
None yet
4 participants