New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Array.slice for `from` > Array.length and `to` > Array.length #399

Closed
wants to merge 2 commits into
base: master
from

Conversation

Projects
None yet
3 participants
@eeue56
Contributor

eeue56 commented Sep 9, 2015

Adds checks for from and to being great or equal to the length of the array, fixing #349 by preventing a.height issues.

@eeue56 eeue56 changed the title from Add past-the-end checks to Array.slice to Fix Array.slice for `from` > Array.length and `to` > Array.length Sep 9, 2015

@evancz

This comment has been minimized.

Show comment
Hide comment
@evancz

evancz Sep 13, 2015

Member

Can you explain a bit more? It seems like the function has certain behavior as is, and you are changing that behavior. Some folks have suggested negative indexes for various functions. I think it'd be helpful to have a more complete explanation of what exactly is changing here and how it fits into other things.

Member

evancz commented Sep 13, 2015

Can you explain a bit more? It seems like the function has certain behavior as is, and you are changing that behavior. Some folks have suggested negative indexes for various functions. I think it'd be helpful to have a more complete explanation of what exactly is changing here and how it fits into other things.

@eeue56

This comment has been minimized.

Show comment
Hide comment
@eeue56

eeue56 Sep 13, 2015

Contributor

Sure -

This only involves adding checks for when the start and end (called from and to) indices are greater than the length of the array - so negative indexes still work as they do already.
The existing behaviour is maintained. The only change here is that when Array.slice is given arguments that goes over the length of the Array passed in, they default to the length of the array instead.

Consider the following -

Array.slice 1 (Array.length array + 1) array 

where Array.length array is > 32.

Because the array has a length greater than M, which is 32 - the default table size, the array height is greater than 0, meaning it skips the tail-case for recursion in Array.sliceRight and Array.sliceLeft -

// Handle leaf level.
        if (a.height === 0)
        {
            var newA = { ctor:'_Array', height:0 };
            newA.table = a.table.slice(0, to);
            return newA;
        }

When the sliceRight function carries on, it uses

        // Slice the right recursively.
        var right = getSlot(to, a);
        var sliced = sliceRight(to - (right > 0 ? a.lengths[right - 1] : 0), a.table[right]);

to get the next slice. Here is where the problem is - when to is bigger than the length of the array, getSlot will return a table index that's greater than the number of tables, causing the next call to sliceRight to take an undefined object as a parameter, cause the object undefined doesn't have property a.height error seen in #349.

The changes I've made make it so that when Array.slice is called, if to is too big then it'll just default to the length of the Array - restricting the indexing done by getSlot and therefore getting the correct table for passing down to the next recursion level.
The same applies for from here too - if from begins from past-the-end of the Array, then it just returns an empty array. As far as I could figure from the documentation, this is consistent with the already existing behaviour, and prevents errors with sliceLeft.

               // if the start of the slice goes past the end of the array, 
        // then just return empty
        if (from >= length(a)){
            return empty;
        }

        // if the end of the slice goes past the end of the, 
        // just make `to` the length of the array
        // -1 since array is 0 based
        if (to >= length(a) - 1){
            to = length(a);
        }

These changes make it so that:

  • Any start index past the length returns an empty Array - this is consistent with current behaviour.
  • Any end index past the length replaces the end index with the length - this is also consistent with current behaviour

So, overall, consistent behaviour with what Array.slice already does (including for negative indexing).
Another fix could be to make getSlot error out or return a default index when something is over-the-length, but I feel that is far more messy.

Contributor

eeue56 commented Sep 13, 2015

Sure -

This only involves adding checks for when the start and end (called from and to) indices are greater than the length of the array - so negative indexes still work as they do already.
The existing behaviour is maintained. The only change here is that when Array.slice is given arguments that goes over the length of the Array passed in, they default to the length of the array instead.

Consider the following -

Array.slice 1 (Array.length array + 1) array 

where Array.length array is > 32.

Because the array has a length greater than M, which is 32 - the default table size, the array height is greater than 0, meaning it skips the tail-case for recursion in Array.sliceRight and Array.sliceLeft -

// Handle leaf level.
        if (a.height === 0)
        {
            var newA = { ctor:'_Array', height:0 };
            newA.table = a.table.slice(0, to);
            return newA;
        }

When the sliceRight function carries on, it uses

        // Slice the right recursively.
        var right = getSlot(to, a);
        var sliced = sliceRight(to - (right > 0 ? a.lengths[right - 1] : 0), a.table[right]);

to get the next slice. Here is where the problem is - when to is bigger than the length of the array, getSlot will return a table index that's greater than the number of tables, causing the next call to sliceRight to take an undefined object as a parameter, cause the object undefined doesn't have property a.height error seen in #349.

The changes I've made make it so that when Array.slice is called, if to is too big then it'll just default to the length of the Array - restricting the indexing done by getSlot and therefore getting the correct table for passing down to the next recursion level.
The same applies for from here too - if from begins from past-the-end of the Array, then it just returns an empty array. As far as I could figure from the documentation, this is consistent with the already existing behaviour, and prevents errors with sliceLeft.

               // if the start of the slice goes past the end of the array, 
        // then just return empty
        if (from >= length(a)){
            return empty;
        }

        // if the end of the slice goes past the end of the, 
        // just make `to` the length of the array
        // -1 since array is 0 based
        if (to >= length(a) - 1){
            to = length(a);
        }

These changes make it so that:

  • Any start index past the length returns an empty Array - this is consistent with current behaviour.
  • Any end index past the length replaces the end index with the length - this is also consistent with current behaviour

So, overall, consistent behaviour with what Array.slice already does (including for negative indexing).
Another fix could be to make getSlot error out or return a default index when something is over-the-length, but I feel that is far more messy.

@mgold

This comment has been minimized.

Show comment
Hide comment
@mgold

mgold Dec 27, 2015

Contributor

Old thread, but I'd really like to see this fixed - Array is unreliable enough to discourage its use, and this is a part of that.

The other part is things like negative indices and making the API nicer. One change I'd like to see is this line, < can become >, so a to of 0 means "slice until the end of the array" -- assuming that won't break anything, which it might.

Contributor

mgold commented Dec 27, 2015

Old thread, but I'd really like to see this fixed - Array is unreliable enough to discourage its use, and this is a part of that.

The other part is things like negative indices and making the API nicer. One change I'd like to see is this line, < can become >, so a to of 0 means "slice until the end of the array" -- assuming that won't break anything, which it might.

@eeue56

This comment has been minimized.

Show comment
Hide comment
@eeue56

eeue56 Dec 27, 2015

Contributor

@mgold - I have plans to rewrite Array with the NRI team from scratch. There's too many bugs in the current implementation to track them all down safely, and the code isn't documented enough to make a sensible approach on it.

Contributor

eeue56 commented Dec 27, 2015

@mgold - I have plans to rewrite Array with the NRI team from scratch. There's too many bugs in the current implementation to track them all down safely, and the code isn't documented enough to make a sensible approach on it.

@mgold

This comment has been minimized.

Show comment
Hide comment
@mgold

mgold Dec 28, 2015

Contributor

👏 👍 Would this be a native module or would you try to do this in Elm?

Also I'm working on something similar for Random, with the goal of obtaining better statistical properties and splittability.

Contributor

mgold commented Dec 28, 2015

👏 👍 Would this be a native module or would you try to do this in Elm?

Also I'm working on something similar for Random, with the goal of obtaining better statistical properties and splittability.

@eeue56

This comment has been minimized.

Show comment
Hide comment
@eeue56

eeue56 Dec 28, 2015

Contributor

It'll be more pure Elm than the current Array implementation, but a lot of it will have to be Native. It's possible to implement it fully in just Elm, but the performance and size overhead would take a hefty toll on any large array. We'll be sure to benchmark things as we go along, though

👍 For alternative Random implementations too

Contributor

eeue56 commented Dec 28, 2015

It'll be more pure Elm than the current Array implementation, but a lot of it will have to be Native. It's possible to implement it fully in just Elm, but the performance and size overhead would take a hefty toll on any large array. We'll be sure to benchmark things as we go along, though

👍 For alternative Random implementations too

@mgold

This comment has been minimized.

Show comment
Hide comment
@mgold

mgold Dec 28, 2015

Contributor

Sounds good. Can I request singleton and reverse functions, and support for (++)? 🙏

Contributor

mgold commented Dec 28, 2015

Sounds good. Can I request singleton and reverse functions, and support for (++)? 🙏

@evancz

This comment has been minimized.

Show comment
Hide comment
@evancz

evancz May 20, 2016

Member

I think it makes sense to focus on the rewrite in Elm.

Member

evancz commented May 20, 2016

I think it makes sense to focus on the rewrite in Elm.

@evancz evancz closed this May 20, 2016

@eeue56 eeue56 referenced this pull request Dec 3, 2016

Closed

Array slice FIX + TEST #765

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment