New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InkArray.groupBy grouping incorrectly #436

Closed
Asuza opened this Issue Mar 25, 2015 · 6 comments

Comments

Projects
None yet
3 participants
@Asuza

Asuza commented Mar 25, 2015

For a live example, you can see one here: https://jsfiddle.net/Asuza/mx0e49fq/

The groupBy method isn't grouping my browser data by version. This was the simplest case I could come up with to reproduce the issue consistently.

Given the following data:

var result,
    browserData;

browserData = [
  {
    "platform": "VISTA",
    "browser": "ie",
    "version": "9"
  },
  {
    "platform": "VISTA",
    "browser": "ie",
    "version": "10"
  },
  {
    "platform": "WIN8",
    "browser": "ie",
    "version": "10"
  },
  {
    "platform": "VISTA",
    "browser": "ie",
    "version": "9"
  },
  {
    "platform": "VISTA",
    "browser": "ie",
    "version": "10"
  }
];

You would expect two groups to be the output, when grouping them by version.

However, running the following code:

Ink.requireModules(['Ink.Util.Array_1'], function (InkArray) {
    result = InkArray.groupBy(browserData, {
        key: function (item) {
            return item.version;
        }
    });
});

Results in 4 groups of data. It has the correct number of items total, but they're definitely not grouped properly. The same issue occurs when grouping by platform, but it seems to work fine when grouping by browser.

If my usage is incorrect, please let me know.

Thanks.

@suskind

This comment has been minimized.

Show comment
Hide comment
@suskind

suskind Mar 26, 2015

Contributor

Hi @Asuza
you are totally right

The algorithm is wrong, it will be fixed soon and it will be available in the next release

For now you can use the method sorting the array first

result = InkArray.groupBy(InkArray.sortMulti(browserData, 'version'), {
        key: function (item) {
            return item.version;
        }
    });

Or if you are using a simple array:

result = InkArray.groupBy([1, 1, 2, 3, 1, 2, 1, 3].sort()); 
Contributor

suskind commented Mar 26, 2015

Hi @Asuza
you are totally right

The algorithm is wrong, it will be fixed soon and it will be available in the next release

For now you can use the method sorting the array first

result = InkArray.groupBy(InkArray.sortMulti(browserData, 'version'), {
        key: function (item) {
            return item.version;
        }
    });

Or if you are using a simple array:

result = InkArray.groupBy([1, 1, 2, 3, 1, 2, 1, 3].sort()); 
@fabiosantoscode

This comment has been minimized.

Show comment
Hide comment
@fabiosantoscode

fabiosantoscode Mar 26, 2015

Contributor

This method does the same as the python groupby() method, which works exactly like that, and also generates the same kind of confusion. It's meant to be simple and do one thing, which is why it doesn't sort your data.

If you look closely at the first example, you'll see there's a lone [1] in the end.

I talked to @suskind personally just now, and he suggested I add an option to sort before grouping, which just calls sortMulti on the data before doing the real groupBy().

Also, I'll add a note to the docs to clarify that it doesn't sort anything.

Contributor

fabiosantoscode commented Mar 26, 2015

This method does the same as the python groupby() method, which works exactly like that, and also generates the same kind of confusion. It's meant to be simple and do one thing, which is why it doesn't sort your data.

If you look closely at the first example, you'll see there's a lone [1] in the end.

I talked to @suskind personally just now, and he suggested I add an option to sort before grouping, which just calls sortMulti on the data before doing the real groupBy().

Also, I'll add a note to the docs to clarify that it doesn't sort anything.

@Asuza

This comment has been minimized.

Show comment
Hide comment
@Asuza

Asuza Mar 26, 2015

I'm new to using Ink, and I'm not aware of how closely it's supposed to emulate Python, but I would argue that any method that generally creates confusion shouldn't have the more confusing behavior be the default.

In that regard, I would opt for groupBy to sort by default, and have an additional option to turn sorting off.

My argument also assumes that more developers would want their data to be grouped by the key than to be grouped in chunks where the keys match and are also next to each other.

Asuza commented Mar 26, 2015

I'm new to using Ink, and I'm not aware of how closely it's supposed to emulate Python, but I would argue that any method that generally creates confusion shouldn't have the more confusing behavior be the default.

In that regard, I would opt for groupBy to sort by default, and have an additional option to turn sorting off.

My argument also assumes that more developers would want their data to be grouped by the key than to be grouped in chunks where the keys match and are also next to each other.

@fabiosantoscode

This comment has been minimized.

Show comment
Hide comment
@fabiosantoscode

fabiosantoscode Mar 26, 2015

Contributor

I see.

I agree that this default behavior is against the principle of least astonishment, because of the common use case and SQL's similarly named GROUP BY clause.

However, turning the default around breaks the API for existing code and shouldn't be done.

Contributor

fabiosantoscode commented Mar 26, 2015

I see.

I agree that this default behavior is against the principle of least astonishment, because of the common use case and SQL's similarly named GROUP BY clause.

However, turning the default around breaks the API for existing code and shouldn't be done.

@fabiosantoscode

This comment has been minimized.

Show comment
Hide comment
@fabiosantoscode

fabiosantoscode Mar 30, 2015

Contributor

Closed in 2cf8e7e, @suskind let's do a release :)

Contributor

fabiosantoscode commented Mar 30, 2015

Closed in 2cf8e7e, @suskind let's do a release :)

@Asuza Asuza closed this Mar 31, 2015

@fabiosantoscode

This comment has been minimized.

Show comment
Hide comment
@fabiosantoscode

fabiosantoscode Mar 31, 2015

Contributor

In the end, I ended up changing the algorithm to a non-sort-destructing one. So 2212 would group like 2221.

Out in 3.1.6

Contributor

fabiosantoscode commented Mar 31, 2015

In the end, I ended up changing the algorithm to a non-sort-destructing one. So 2212 would group like 2221.

Out in 3.1.6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment