Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with ORDER BY on multiple columns #132

Closed
ajrcarey opened this issue Mar 23, 2019 · 6 comments

Comments

Projects
None yet
2 participants
@ajrcarey
Copy link

commented Mar 23, 2019

First, thank you for creating nanoSQL!

I'm having a problem with ORDER BY and I can't figure out if I'm doing something wrong or if there's a bug somewhere in nanoSQL. It's hard to believe the latter, given how obvious the problem is, but I really can't see what I'm doing wrong. Hopefully it's something obvious.

This is using the pre-built nano-sql.min.js version 2.2.1, which is the latest I could find. The distribution URL is https://cdn.jsdelivr.net/npm/@nano-sql/core@2.2.1/dist/ . If there's a later pre-built version, please let me know where it's located and I'll happily upgrade.

Consider the following html / javascript file, which loads five identically-shaped CSV files containing random user data into a table and then runs a query:

<!DOCTYPE html>
<html lang="en">
<head>
    <title>NanoSQL ORDER BY test</title>
    <meta charset="utf-8">
    <script src="https://cdn.jsdelivr.net/npm/@nano-sql/core@2.2.1/dist/nano-sql.min.js"></script>
</head>
<body>
</body>
<script>
    nSQL().connect({
        id: 'test',
        mode: 'TEMP',
        tables: [
            {
              name : 'users',
              model : {
                'id:uuid' : {'pk': true},
                'forename:string' : {},
                'surname:string' : {},
                'gender:string' : {},
                'age:int' : {},
                'education:string' : {},
                'occupation:string' : {},
                'salary:int' : {}
              },
              indexes: {
                'forename:string' : {},
                'surname:string' : {},
                'age:int' : {},
                'gender:string' : {},
                'education:string' : {},
                'occupation:string' : {},
                'salary:int' : {}
              }
            }
        ]
    }).then(() => {
        var promises = [];

        for (var i = 1; i <= 5; i ++) {
            promises[i - 1] = fetch(i + '.csv')
                .then(response => response.text())
                .then(csv => nSQL('users').loadCSV(csv));
        }

        Promise.all(promises).then(() => {
            nSQL('users')
                .query('select')
                .where(['age', '<=', 21])
                .orderBy(['surname ASC', 'forename ASC'])
                .stream((row) => console.log(row.forename + ' ' + row.surname));
        });
    });
</script>
</html>

The query should output all users aged 21 or under, ordered by surname then forename. There are 29 results. Here is the results list:

Tiana Brown
Maddie Campbell
Freddie Chapman
Daryl Chapman
Freddie Clark
Ryan Cooper
Alexia Craig
Haris Davis
Madaline Ellis
Abigail Evans
Adison Grant
Connie Hamilton
Michael Hamilton
Amy Holmes
Kristian Howard
Grace Johnston
Kellan Jones
Jenna Mason
Honey Perkins
Paul Reed
Vincent Richards
Frederick Riley
Edith Rogers
Lucy Rogers
Kate Taylor
Isabella Thompson
Briony Thompson
Madaline Thompson 

At a glance it looks fine, but if you look closely you will notice that Freddie Chapman and Daryl Chapman are out of order, as are Isabella Thompson, Briony Thompson, and Madaline Thompson. Additionally, the results are inconsistent from run to run; on this particular run it was the Chapmans and the Thompsons out of order, but on other runs it might be the Rogers or the Hamiltons. The results are entirely unpredictable and, I suspect, are based on the order on which data is loaded into the database. I think the ORDER BY forename ASC is being ignored entirely.

The html file and sample data are attached.

order-by.zip

The expected result is for all results to ordered correctly by surname then forename. At the moment, the ordering by surname is secure and consistent between runs, but the ordering by forename is not.

@ajrcarey

This comment has been minimized.

Copy link
Author

commented Mar 23, 2019

Further experimentation indicates the ordering is only a problem when retrieving the query results directly via .exec(). If I retrieve the results one at a time by .stream(), or through setting up a cache via .cache(), the ordering is perfect every time.

@ClickSimply

This comment has been minimized.

Copy link
Owner

commented Mar 24, 2019

Thanks so much for reporting this! 2.2.2 has just been released and should resolve everything!

There shouldn't be any difference between .exec() and .stream(), when you call .exec() it just puts the result of .stream() into an array and returns that, here's the source code:

    public exec(returnEvents?: boolean): Promise<any[]> {

        return new Promise((res, rej) => {
            let buffer: any[] = [];
            this.stream((row) => {
                if (row) {
                    buffer.push(row);
                }
            }, () => {
                res(buffer);
            }, rej, returnEvents);
        });
    }
@ajrcarey

This comment has been minimized.

Copy link
Author

commented Mar 25, 2019

Many thanks. I've tested with 2.2.2 and it's much improved.

Occasionally, though - I don't have hard figures but I'd say roughly every five or so runs - the two data points of Freddie Chapman and Freddie Clark are mixed up. The order output should be:

..
Daryl Chapman
Freddie Chapman
Freddie Clark
...

Every so often, however, it comes out like this:

...
Daryl Chapman
Freddie Clark
Freddie Chapman
...

I believe the expected result, given the first field in the ORDER BY clause is 'surname', should be for both the Chapmans to be sorted before Freddie Clark.

I can confirm that the results are now consistent across .exec(), .stream(), and .cache(); I'm not sure if I was doing something wrong before, but .exec() showed different behaviour from .stream() and .cache() in 2.2.1 for me.

@ajrcarey

This comment has been minimized.

Copy link
Author

commented Mar 25, 2019

(Additionally, while .exec(), .stream(), and .cache() are all behaving identically, I've noticed that .toCSV() does not - that is, on some test runs where .exec() et al return the ordered results correctly, the same query, with the same ORDER BY clause, dumped via .toCSV() can still return rows out of order.)

@ClickSimply

This comment has been minimized.

Copy link
Owner

commented Mar 25, 2019

Using .toCSV() shouldn't have different results for the same reason stream and exec don't, here's the source code:

    public toCSV(headers?: boolean): Promise<string> {
        return this.exec().then((json: any[]) => Promise.resolve(this._db.JSONtoCSV(json, headers)));
    }

As you can see it's just using .exec() and converting it to a CSV.

Regardless, just kicked out 2.2.3 which should actually resolve this issue.

@ajrcarey

This comment has been minimized.

Copy link
Author

commented Mar 26, 2019

Yes, after your previous explanation I was surprised to see different behaviour from .toCSV(). This has made me question whether the problem was something on my end.

That said, initial testing with 2.2.3 suggests that sort order across all retrieval functions is now stable and consistent from run to run. I will let you know if I spot any further problems here, but for now I think it's fine to mark this as resolved in 2.2.3. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.