Arithmetic overflow possible in IterableOnceOps.copyToArray #12795
Comments
Would it be worth building an inventory of where things are in terms of out-of-bounds handling (relevant operations, what is accepted, which exceptions for rejections)? And agree on a doc / spec that we can follow? @julienrf do you know if there is anything existing?
nice, TIL 🖼️
I don’t think we have such a spec, except the
Not to be confused with soap opera. I don't think a bug in arithmetic warrants an existential crisis in the API. It's perfectly natural to minimize bounds checking and only fail "on demand". Permissive inputs are well-established for

I'm more sympathetic to the argument about
Then the definition of

The case is stronger for

It's mutating operations that throw instead of silently truncating, such as

Perhaps I'm wrong that

It would be nice if these behaviors were summarized in the collections overview, of course.
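The permissive/strict split the thread keeps circling is easy to check concretely. This is standard-library behavior (example values mine): slicing-style operations clamp out-of-range arguments, while element indexing throws.

```scala
val xs = List(1, 2, 3)

// Slicing operations are permissive: out-of-range arguments are clamped.
assert(xs.take(10) == List(1, 2, 3))      // longer than the list: whole list
assert(xs.drop(-5) == List(1, 2, 3))      // negative: nothing dropped
assert(xs.slice(-2, 99) == List(1, 2, 3)) // both bounds clamped

// Indexing is strict: an out-of-range index throws.
val threw =
  try { xs(5); false }
  catch { case _: IndexOutOfBoundsException => true }
assert(threw)
```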
@scala/collections
I think Julien is correct that in general we don't have specifications for what happens when various indices are out of bounds. Treating everything as undefined behavior is unwise, because people will test for the behavior, then assume that the experimentally-determined behavior is the intended behavior, and then be surprised when some other collection of the same type behaves differently. I tried to put some of what I thought were sensible invariants into collections-laws. I don't think I was comprehensive, and anyway, that just helps people working on the standard library avoid unexpected behavior; it still doesn't help users understand what to expect.

The most important principle is, I think, to make all the collections have the same behavior, with an occasional exception if absolutely critical for array performance.

Logically, I think the copy should proceed like this:

```scala
val it = iterator
var l = len
var s = start
while (l > 0 && it.hasNext) {
  xs(s) = it.next()
  s += 1
  l -= 1
}
len - l
```

That is, I agree that the asymmetry between start and end when you're specifying the max length is a bad thing. However, the docs say that the copy ends when the array does, and it's behavior that people may count on, and the return value should report how many elements were actually copied.

So I'd rewrite it like this:

```scala
def copyToArray[B >: A](xs: Array[B], start: Int, len: Int): Int =
  if (len <= 0) 0
  else {
    var i = start
    val it = iterator
    val end = if (xs.length - len <= start) xs.length else start + len
    while (i < end && it.hasNext) {
      xs(i) = it.next()
      i += 1
    }
    i - start
  }
```

This works because `xs.length - len` cannot underflow for positive `len`, and `start + len` is only computed when it is already known to be smaller than `xs.length`, so neither expression can overflow. If `start` is negative and anything would be copied, the first write to `xs(i)` throws `ArrayIndexOutOfBoundsException`, since array indexing itself enforces the bound.

As a bonus, in the case where len is zero or negative, and getting the iterator is not a no-op, the code is a touch faster. (Note: code untested.)
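Since the comment flags the code as untested, here is the same logic as a standalone helper that can be exercised outside the library (the name `copyToArrayFixed` and the `Seq` stand-in for `this` are mine, not the library's):

```scala
// Standalone version of the proposed rewrite; `elems.iterator` stands in
// for the receiver's `iterator`.
def copyToArrayFixed[A](elems: Seq[A], xs: Array[A], start: Int, len: Int): Int =
  if (len <= 0) 0
  else {
    var i = start
    val it = elems.iterator
    // xs.length - len cannot underflow for len > 0, and start + len is
    // only computed when it is known to be below xs.length.
    val end = if (xs.length - len <= start) xs.length else start + len
    while (i < end && it.hasNext) {
      xs(i) = it.next()
      i += 1
    }
    i - start
  }

val xs = new Array[Int](4)
assert(copyToArrayFixed(Seq(1, 2, 3), xs, 0, 3) == 3)          // plain copy
assert(copyToArrayFixed(Seq(1, 2, 3), xs, 3, 10) == 1)         // clipped by array end
assert(copyToArrayFixed(Seq(1, 2, 3), xs, 0, -1) == 0)         // non-positive len
assert(copyToArrayFixed(Seq(1, 2), xs, 2, Int.MaxValue) == 2)  // huge len, no overflow
```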
Specifically regarding `patch`, I think the behavior was chosen to be identical to, but more efficient than, that which you would get from the most obvious collections operations.

In particular, `xs.patch(n, ys, m)` is equivalent to `xs.take(n) ++ ys ++ xs.drop(n).drop(m)`. The double-drop is weird until you notice that the natural way to split `xs` is the `splitAt` method: `val (xsL, xsR) = xs.splitAt(n); xsL ++ ys ++ xsR.drop(m)`.

So in terms of coherence with the "natural" way to do things with higher-order methods, I think `patch` does the right thing. `patchInPlace` should match its behavior. If it doesn't, I would consider that a bug.
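The claimed equivalence is easy to check concretely (the double-drop being the important detail when `n` is out of range; example values mine):

```scala
// The "obvious collections operations" formulation of patch.
def patchViaTakeDrop[A](xs: List[A], n: Int, ys: List[A], m: Int): List[A] =
  xs.take(n) ++ ys ++ xs.drop(n).drop(m)

val xs = List(1, 2, 3, 4, 5)

// In-range patch: replace two elements starting at index 1.
assert(xs.patch(1, List(10, 20), 2) == List(1, 10, 20, 4, 5))
assert(xs.patch(1, List(10, 20), 2) == patchViaTakeDrop(xs, 1, List(10, 20), 2))

// Out-of-range start: both formulations clip the same way, because
// drop(n).drop(m) is not the same as drop(n + m) when n is negative.
assert(xs.patch(-3, List(0), 2) == patchViaTakeDrop(xs, -3, List(0), 2))
assert(xs.patch(99, List(0), 2) == patchViaTakeDrop(xs, 99, List(0), 2))
```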
I don't want to sound like a broken record, but to confound the matter further, we have `takeInPlace` and `dropInPlace`, which are, like their immutable namesakes, permissive. And `updated`, just like `update`, is not. I think this is the exact occasion on which Poles use the phrase English speakers allegedly find so funny: 'not my circus, not my monkeys'. Ultimately, all collection methods can be implemented by a call to the equivalent method on an iterator, followed by a conversion, and maybe it would be a good frame of reference. I am not terribly convinced about the 'existing index distinction', but I agree that a simple arithmetic overflow issue is certainly not a reason to start a discussion about a huge `scala.collection` revamp. Actually, I think I have the biggest issue with `Buffer.remove` as the stand-out. But I also have a set of extension methods for collections, for example `update`/`updated` which take a collection of elements rather than a single one, and I had real trouble deciding what would be the most consistent behaviour with the existing methods. On one hand, it's a job that `patch` can easily do - but `patch` requires you to know the `size` of the argument collection if you want to exactly splice the recipient of the call, which we won't in general know for an `IterableOnce`, so I found a use for it besides `patch`.

Now, I am easily irked by such small inconsistencies, and perfectly aware that I am not normal in that, but this talk is for me just the tip of an iceberg: there is no rhyme or reason to how parameters are named in equivalent methods - which would not be quite as important if Scala didn't allow specifying *any* parameter by name - and there is no single convention for how equivalent mutable/immutable methods are named.

BTW, another soapbox, maybe it will sow a seed of doubt in someone of influence on how Scala evolves: I think it would be much more prudent if only method parameters explicitly annotated for the purpose (with a dedicated annotation, or some such) were legal candidates for named arguments, rather than all positional ones. People don't put a huge lot of thought most of the time into how parameters are named, and it is unfortunate to be bound by early choices for compatibility. I don't think we often use named arguments for arbitrary calls; it seems like their place is with `copy`-like methods, with many parameters, with signatures defined specifically with named arguments in mind, so I wouldn't find it limiting myself. Returning to an API with a fresh eye, or with feedback on how something may be ambiguous, limits such an easy improvement. Then, there is the case that in an overriding method parameters can be renamed freely, and while I'd consider it bad practice, it can easily happen if you write the overridden method manually, rather than asking an IDE to copy the signature exactly for you. Together it seems like a bad combination.
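The override-renaming hazard is easy to reproduce: named arguments are resolved against the parameter names of the statically known method, so a rename in an override changes which call sites compile (a minimal sketch with hypothetical classes):

```scala
class Animal {
  def speak(times: Int): Int = times
}
class Dog extends Animal {
  // Legal: an override may rename its parameter freely.
  override def speak(woofs: Int): Int = woofs * 2
}

val d = new Dog
val a: Animal = d

assert(d.speak(woofs = 3) == 6) // resolved against Dog's parameter name
assert(a.speak(times = 3) == 6) // resolved against Animal's name, dispatches to Dog
// d.speak(times = 3)           // does not compile: Dog's parameter is `woofs`
```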
Marcin Mościcki
Parameters have been renamed and annotated with

If I write

At the moment, after a couple of months, I don't have a sense of what is the priority issue, in general and in your view. Recently, I was looking at "value discard" warnings and noticed that

On parameters,
Reproduction steps

Scala version: 2.13.10

In `IterableOnceOps`:

Problem

There are actually two overflows here, but `xs.length - start` is arguably not an issue, as a negative `start` causes a negative `end` and nothing will be copied (instead of throwing an exception if an overflow had not happened). There is a second one in `start + math.min(len, ...)`, however, and on a valid path.

The 'new' collection framework is much better in terms of avoiding overflows than the old one, where almost everything was overflowing, but if there was some competition I'm pretty sure I could find a good dozen more.
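The failure mode is ordinary two's-complement wrap-around: a sum of two in-range `Int` values can silently land past `Int.MaxValue`. A minimal, library-independent sketch of the unsafe sum versus an overflow-free bound computation (the helper name and values are mine, for illustration only):

```scala
// The naive sum wraps past Int.MaxValue and goes negative.
val start = Int.MaxValue - 5
val len   = 10
assert(start + len < 0) // wrapped around

// Overflow-free bound: compare against the array length by subtraction.
// Safe because xsLength >= 0 and len > 0 mean xsLength - len cannot underflow,
// and start + len is only computed when it is known to be below xsLength.
def safeEnd(xsLength: Int, start: Int, len: Int): Int =
  if (xsLength - len <= start) xsLength else start + len

assert(safeEnd(100, start, len) == 100) // clamps instead of wrapping
assert(safeEnd(100, 20, 30) == 50)      // ordinary case unaffected
```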
`Iterator.drop` in complex iterator implementations in particular is very vulnerable, and that's a bit of a bummer, because iterators actually can easily be used for collections with more than `Int.MaxValue` elements.

I'd like to use this opportunity to climb my soap box and say again that I don't like the exact semantics of indices out of range in `copyToArray`. Essentially, they are closely tied to this particular implementation, and troublesome to reproduce to a tee for all possible data sets. A negative start is allowed only if the total number of elements which would be copied, as influenced by `len`, `this.size`, and `xs.length`, would be zero or less. I think it would be cleaner if it was either always rejected with an exception, or always accepted.

While I can see the value of permissive indices in slicing operations, where we are dealing with bounds on collection sizes, using them in the context of specific indices in a sequence (like `indexOfSlice` and some other issues I reported, `remove`, `insertAll`, etc.) is another matter. If `copyToArray` always accepted a negative `start` and simply ignored the first `-start` elements in the collection, it would actually be quite useful.

The worst offender, by far, though, is, in my eyes, `patch`. The policy of clipping the index of the start of the patch can easily lead to quiet bugs, without any beneficial use cases I can see. If, as I suggested above, instead the indices were not modified, but the part of the `elems` (patch) collection falling outside of the patched (`this`) collection was ignored, it would be both more logical and useful.

Ehm, sorry.
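The clipping objected to above looks like this in practice: an out-of-range `from` is silently adjusted rather than rejected, so an index bug produces wrong data instead of an exception (example values mine):

```scala
val xs = List(1, 2, 3, 4)

// A negative index is clipped: the patch lands at the front,
// and `replaced` elements are still consumed from the start.
assert(xs.patch(-3, List(0), 2) == List(0, 3, 4))

// An index past the end is clipped to the length: the patch is appended,
// and nothing is replaced at all.
assert(xs.patch(99, List(0), 2) == List(1, 2, 3, 4, 0))
```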