New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

design of array constructors #24595

Open
Sacha0 opened this Issue Nov 13, 2017 · 55 comments

Comments

Projects
None yet
@Sacha0
Member

Sacha0 commented Nov 13, 2017

Intro

This post is an attempt to consolidate/review/analyze the several ongoing, disjoint conversations on array construction. The common heart of these discussions is that the existing patchwork of array construction methods leaves something to be desired. A more coherent model that is pleasant to use, adequately general and flexible, largely consistent both within and across array types, and consists of orthogonal, composable parts would be fantastic.

Some particular design objectives / razors might include:

  • Given an unfamiliar array type, you should have a reasonable sense of how to construct an instance with desired contents without manual/method/code sleuthing.

  • Reading an incantation that constructs an array of unfamiliar type, you should be able to largely deduce the array's type and contents without manual/method/code sleuthing.

  • The general tools for array construction should be discoverable, and writing *common operations via those general tools should be sufficiently pleasant and concise that: (1) pressure to write ad hoc convenience methods does not escalate to the point where such methods proliferate; and (2) antipatterns for array construction do not emerge to avoid (or in ignorance of) the general tools.

(*Common operations include constructing: (1) an array uniformly initialized from a value; (2) an array filled from an iterable, or from a similar object defining the array's contents such as I; (3) one array from another; and (4) an uninitialized array.)

Via a tour of the relevant issues and with the above in mind, let's explore the design space.

Tour of the issues

Let's start with...

  • #16029, "Merge collect() into Vector()". The crux: Some array types have constructor methods that, given (solely) a tuple or series of integer arguments specifying shape, produce an uninitialized instance of the given shape. Due to method signature collisions, those constructor methods prevent such array types from supporting construction from certain iterables (particularly tuples and integers). Illustration with Vector: Vector(x) should be able to construct a Vector from an arbitrary HasLength iterable x (as with e.g. Vector(1:4), which intuitively yields [1, 2, 3, 4]). But this cannot work for tuples now, as e.g. Vector{Float64}((2,)) instead constructs an uninitialized Vector{Float64} of length two.

The prevailing idea for fixing the preceding issue is to: (1) deprecate uninitialized-array constructors that accept (solely) a tuple or series of integer arguments as shape, removing the method signature collision; and (2) replace those uninitialized-array constructors with something else. Two broad replacement proposals exist:

  1. Introduce a new generic function, say (*modulo spelling) blah(T, shape...) for type T and tuple or series of integers shape, that returns an uninitialized Array with element type T and shape shape. This approach is an extension of the existing collection of Array convenience constructors inherited from other languages including ones, zeros, eye, rand, and randn.

(* Please note that blah is merely a short placeholder for whatever name comes out of the relevant ongoing bikeshed. The eventual name is not important here :).)

  1. Introduce Array{T}(blah, shape...) constructors where blah signals that the caller does not care what the return's contents are. These constructors would be specific instances of a more general model that extends and unifies the existing constructor model. That more general model is discussed further below.

The first proposal

The first proposal leads us to...

  • #11557, "Functions that return arrays with eltype as input should use container type instead?". The crux: ones, zeros, eye, rand, randn, and the proposed blah(T, shape...) all produce Arrays. How do we generalize these functions to array types broadly? Two approaches exist:

The de facto approach is introduction of ad hoc perturbations on these function names for each new array type: Devise an obscure prefix associated with your array type, and introduce *ones, *zeros, *eye, *rand, *randn, and hypothetically *blah functions with * your prefix. This approach fails all three razors above: Failing the first razor, to construct an instance of an array type that follows this approach, you have to discover that the array type takes this approach, figure out the associated prefix, and then hope the methods you find do what you expect. Failing the second razor, when you encounter the unfamiliar bones function in code, you might guess that function either carries out spooky divination rituals, or constructs a b full of ones (whatever b refers to). Along similar lines, does spones populate all entries in a sparse matrix with ones, or only some set of stored/nonzero entries (and if so which)? Failing the third razor, the very nature of this approach is proliferation of ad hoc convenience functions and is itself an antipattern. On the other hand, this approach's upside is that it sometimes involves a bit less typing (though often also not, see below). Nonetheless, this approach is fraught.

So what's the other approach? #11557 started off by discussing that other approach: ones, zeros, eye, rand, and randn typically accept a result element type as either first or second argument, for example ones(Int, (3, 3)) and rand(MersenneTwister(), Int, (3, 3)). That argument could instead be an array type, for example ones(MyArray{Int}, (3, 3)) and rand(MersenneTwister(), MyArray{Int}, (3, 3)). This approach is enormously better than the last: It could mostly pass the first and second razors above. But it nonetheless fails the third razor, and exhibits other shortcomings (mostly inherited from the existing convenience constructors). Let's look at some of those shortcomings:

  • Default element type ambiguity: When element type isn't specified, for example as in eye(MyArray, (3, 3)) or ones(MyArray, (3, 3)), what should the returned array's element type be? Should that default element type be consistent across array types, or allowed to vary? At present these functions yield Float64 by default, which is a reasonable (useful) choice when running on modern CPUs. But other defaults may be more appropriate for array types associated with other hardware or applications, for example Float16 or Float32 for array types / contexts associated with GPUs. And one could also argue that Int is a more canonical type independent of context, or that Bool usually provides better promotion behavior, and so on. (This shortcoming to some degree violates the second razor.)

  • ones (and, to lesser degree, eye) element type ambiguity: As #24444 highlights, whether ones(MyArray{T}, shape...) returns element type T's multiplicative identity (one(T)) or additive generator (oneunit(T)) is ambiguous. Of course one or the other can be chosen and documented. But choosing one, ones(MyArray{T}, shape...) can no longer consistently return a MyArray{T}, as for some types typeof(one(T)) does not coincide with T (e.g. one(1meter) == 1 != 1meter). And as demonstrated in #24444, with either choice some subset of users's expectations will be violated and use cases unsatisfied, creating pressure for ad hoc solutions or additional value-names. eye(MyArray{T}, shape...)'s element type should less ambiguously be one(T), which mitigates the latter issue but runs into the former. (This shortcoming to some degree violates both the first and second razors.)

  • zeros element type ambiguity: Prior to #16116, whether one(T) returned a multiplicative identity or additive generator for T was ambiguous. #20268 resolved this ambiguity by introducing oneunit(T) as the additive generator for T and affirming one(T) as a multiplicative identity. zero suffers from a similar issue, though likely less important in practice: Is zero(T) the additive identity or a sort of multiplicative zero for T? To illustrate, is 3meters * zero(1meters) 0meters^2 or 0meters? Consequently, zeros suffers from an ambiguity analogous to that described above for ones.

  • Handling values without an associated function: To construct a MyArray of 1s, you call ones(MyArray{Int}, (3, 3)). To construct a MyArray of 0s, you call zeros(MyArray{Int}, (3, 3)). To construct a MyArray containing the identity matrix, you call eye(MyArray{Int}, (3, 3)). Great so far. But how do you construct a MyArray of 2s, or -1s, or containing I/2? If you are used to these convenience constructors, perhaps you respectively call 2*ones(MyArray{Int}, (3, 3)), -ones(MyArray{Int}, (3, 3)), and eye(MyArray{Int}, (3, 3))/2. Or in the first two cases perhaps you call fill!(blah(MyArray{Int}, (3, 3)), [2|-1]) for mutable and fill!-supporting MyArray, limiting your code's scope. If you want to avoid generating a temporary, you probably use the fill! incantation. But these incantations are less pleasant than ones or zeros, so perhaps you give your common values names: twos(MyArray{Int}, (3, 3)). And to avoid the temporary in the eye call, perhaps you roll a halfeye(MyArray{Int}, (3, 3)) function to avoid allocating the temporary. Overall, antipatterns emerge and ad hoc functions proliferate. And as demonstrated in #24444 (comment) and discussed elsewhere, this issue bears out in practice and is widespread. (This shortcoming violates the third razor.)

  • Two disjoint, incongruous, and overlapping models are necessary: To construct an array from another array, or from an iterable or similar content specifier, you have to switch from these functions to constructors. So users must be familiar with two disjoint, incongruous, and non-orthogonal models.

  • Minor type argument position inconsistency: The position of these functions' type argument varies, requiring method sleuthing to figure out the correct signature. Examples: ones(MyArray{Int}, (3, 3)) versus rand(RNG, MyArray{Int}, (3, 3)).

Each of these shortcomings is perhaps acceptable considered in isolation. But considering these shortcomings simultaneously, this approach becomes a shaky foundation on which to build a significant component of the language.

In part motivated by these and other considerations, #11557 and concurrent discussion turned to...

The second proposal

... which is to introduce (modulo spelling of blah, please see above) Array{T}(blah, shape...) constructors, where blah indicates the caller does not care what the return's contents are. These constructors immediately generalize to arbitrary array types as in MyArray{T}(blah, shape_etc...), and would be a specific instance of a more general model that extends the existing constructor model:

The existing constructor model allows you to write, for example, Vector(x) for x any of 1:4, Base.OneTo(4), or [1, 2, 3, 4] (to construct the Vector{Int} [1, 2, 3, 4]), or similarly SparseVector(x) (to build the equivalent SparseVector). To the limited degree this presently works broadly, the model is MyArray[{...}](contentspec) where contentspec, for example some other array, iterable, or similar object, defines the resulting array's contents.

The more general extension of this model is MyArray[{...}](contentspec[, modifierspec...]). Roughly, contentspec defines the result's contents, while modifierspec... (if given) provides qualifications, e.g. shape.

What does this look like in practice?

For the most part you would use constructors as you do now, with few exceptions. Let's go through the common construction operations mentioned above:

  1. (Constructing uninitialized arrays.) To build an uninitialized MyArray{T}, where now you write e.g. MyArray{T}(shape...), instead you would write MyArray{T}(blah, shape...). (#24400 explored this possibity for Arrays, and inevitably became a bikeshed of the spelling of blah :).)

  2. (Constructing one array from another.) Constructing one array from another, as in e.g. Vector(x) or SparseVector(x) for x being [1, 2, 3, 4], would work just as before.

  3. (Constructing an array filled from an iterable, or from a similar object defining the array's contents such as I.) What is possible now, for example Vector(x) for x either 1:4 or Base.One(4), would work as before. But where e.g. Array[{T,N}](tuple) now fails or produces an uninitialized array depending on T, N, and tuple, such signatures could work as for any other iterable. And additional possibilities become natural: Constructing Arrays from HasShape generators is one nice example. Another, already on master (#24372), is Matrix[{T}](I, m, n) (alternatively Matrix[{T}](I, (m, n))), which constructs a Matrix[{T}] of shape (m, n) containing the identity, and is equivalent to eye([T, ]m[, n]) with fewer ambiguities.

Great so far. Now what about perhaps the most common operation, i.e. constructing an array uniformly initialized from a value? Under the general model above, this operation should of course roughly be MyArray[{T}](it, shape...) where it is an iterable repeating the desired value. But this incantation should: (a) be fairly short and pleasant to type, lest ad hoc constructors for particular array types and values proliferate to avoid using the general model; and ideally (b) mesh naturally with convenience constructors for Arrays.

Triage came up with two broad spelling possibilities. The first spelling possibility led to...

  • #24389, "constructors for Array from zeros/ones". The crux: Make ones and zeros iterable, allowing e.g. MyArray([ones|zeros], shape...). At first blush this spelling seems reasonable: It's fairly short/pleasant, satisfying (a). And it ties to the ones/zeros convenience constructors, somewhat satisfying (b) (caveat being the slightly unnatural reversed identifier ordering as in e.g. ones(T, shape...) vs MyArr{T}(ones, shape...)). But further consideration reveals that this spelling foists most shortcomings of the first design proposal (that is, the e.g. ones(Int, ...) -> ones(MyArray{Int}, ...) proposal described above) onto this second design proposal. Specifically, the "Default element type ambiguity", "ones/eye/zeros element type ambiguity", and "handling values without an associated function" shortcomings described above all apply here as well. Sad razors.

The second spelling possibility is MyArray(Rep(v), shape...) modulo spelling of Rep(v), where Rep(v) is some convenient alias for Iterators.Repeated(v) with v any desired value. (Another possible spelling of Rep(v) discussed in triage is Fill(v), which dovetails beautifully with the fill convenience constructor for the same purpose specific to Arrays. Independent of the iterator's name, this spelling is a clean generalization of fill from Arrays to arrays generally.) In practice this would look like MyArray(Rep(1), shape...) (instead of MyArray{Int}(ones, shape...)). This spelling possesses some distinct advantages:

  • By nature of requiring a value, this spelling suffers from neither the "default element type ambiguity" nor the "ones/eye/zeros elementy type ambiguity" described above.

  • By nature of accommodating any value, this spelling avoids the "handling values without an associated function" issue and the consequent antipatterns and ad hoc method proliferation.

  • By nature of requiring and accepting a value, this spelling is frequently more compact and efficient than equivalents with the other spelling: Consider MyArray(Rep(1.0im), shape...) versus im*MyArray{Complex{Float64}}(ones, shape...), or MyArray(Rep(1f0/ℯ), shape...) versus MyArray{Float32}(ones, shape...)/ℯ.

  • This spelling is a composition of well-defined, fundamental tools that, once learned, can be deployed to good effect elsewhere. In contrast, the other spelling is ad hoc and a bit of a pun.

Great. With this latter spelling, overall this second proposal appears to satisfy both the broad design objectives and three razors at the top, and avoids the shortcomings of the first proposal.

What else? Convenience constructors

Convenience constructor are an important part of this discussion and about which there is much to consider. But that topic I will leave for another post. Thanks for reading! :)

@rfourquet

This comment has been minimized.

Show comment
Hide comment
@rfourquet

rfourquet Nov 16, 2017

Contributor

One point concerning rand to take into account is the "value specification", which is often encoded with a value (e.g. 1:10) instead of a type. FWIW, independantly of your effort and other's to solve this problem, I have been thinking on how to solve the more narrow rand's problem (and get rid of sprand!)... which is at the same time not-so-simple on its own as we (at least I) want allow also generating Sets or Dicts (which require specifying the value specification for a Pair...). I didn't plan to speak about it before PR'ing it, but thought I should mention it here.

Contributor

rfourquet commented Nov 16, 2017

One point concerning rand to take into account is the "value specification", which is often encoded with a value (e.g. 1:10) instead of a type. FWIW, independantly of your effort and other's to solve this problem, I have been thinking on how to solve the more narrow rand's problem (and get rid of sprand!)... which is at the same time not-so-simple on its own as we (at least I) want allow also generating Sets or Dicts (which require specifying the value specification for a Pair...). I didn't plan to speak about it before PR'ing it, but thought I should mention it here.

@Sacha0

This comment has been minimized.

Show comment
Hide comment
@Sacha0

Sacha0 Nov 16, 2017

Member

I much look forward to your thoughts on rand, @rfourquet! :)

Member

Sacha0 commented Nov 16, 2017

I much look forward to your thoughts on rand, @rfourquet! :)

@Sacha0

This comment has been minimized.

Show comment
Hide comment
@Sacha0

Sacha0 Nov 16, 2017

Member

Convenience constructors: eye

A brief review of triage's discussions of eye.

Shortcomings

  1. A Matrix{T} is usually a poor representation of the identity operator/matrix: eye(T, m, n) requires O(mn) storage (where in principle constant storage should do). eye(T, m, n) + ... usually involves O(mn) operations (where in principle O(min(m, n)) operations should do). eye(T, m, n) * ... usually involves O(mnk) operations (where in principle either constant or O(nk) operations should do). Base includes better representations of the identity operator/matrix for a variety of purposes, including UniformScalings, Diagonals, and SparseMatrixCSCs; these representations should be preferred, but eye often boxes these representations out in practice.

  2. Where a Matrix{T} is a reasonable representation of the identity operator/matrix, eye provides no way to efficiently construct a scaled identity operator/matrix: One must write, e.g., 2*eye(T, m, n), which involves a temporary and O(mn) unnecessary operations. (Promotion in such expressions can also require a bit of care.)

  3. Whether eye(T, m, n)'s element type should be T's multiplicative identity one(T) or additive generator oneunit(T) depends on context: Interpreting eye(T, m, n) as a concrete representation of the identity operator/matrix for use in multiplication, T's multiplicative identity one(T) seems the right choice. But where eye(T, m, n) appears (frequently) in addition, T's additive generator might instead be the right choice. Standardizing on one or the other is of course possible, but: (1) each choice fails to satisfy some use cases; and (2) choosing one(T), eye(T, m, n)'s result element type cannot be T where T's multiplicative identity one(T) is not of type T.

  4. As in the OP, eye does not generalize well to non-Array array types.

  5. eye is a terrible pun.

Resolution

  1. Following the second proposal in the OP, introduce array constructors that accept a UniformScaling first argument and shape-specifying trailing arguments, for example (Array|Matrix)[{T}](S::UniformScaling, shape...) for Arrays. Then Matrix(2I, m, n), for example, serves as a direct replacement for 2*eye(Int, m, n). Such constructors avoid shortcomings 2-5.

  2. Deprecate eye in favor of UniformScalings and UniformScaling-accepting constructors (hopefully addressing shortcoming 1 by encouraging better representation choices / making the usually poor Matrix{T} choice less attractive).

Best!

Member

Sacha0 commented Nov 16, 2017

Convenience constructors: eye

A brief review of triage's discussions of eye.

Shortcomings

  1. A Matrix{T} is usually a poor representation of the identity operator/matrix: eye(T, m, n) requires O(mn) storage (where in principle constant storage should do). eye(T, m, n) + ... usually involves O(mn) operations (where in principle O(min(m, n)) operations should do). eye(T, m, n) * ... usually involves O(mnk) operations (where in principle either constant or O(nk) operations should do). Base includes better representations of the identity operator/matrix for a variety of purposes, including UniformScalings, Diagonals, and SparseMatrixCSCs; these representations should be preferred, but eye often boxes these representations out in practice.

  2. Where a Matrix{T} is a reasonable representation of the identity operator/matrix, eye provides no way to efficiently construct a scaled identity operator/matrix: One must write, e.g., 2*eye(T, m, n), which involves a temporary and O(mn) unnecessary operations. (Promotion in such expressions can also require a bit of care.)

  3. Whether eye(T, m, n)'s element type should be T's multiplicative identity one(T) or additive generator oneunit(T) depends on context: Interpreting eye(T, m, n) as a concrete representation of the identity operator/matrix for use in multiplication, T's multiplicative identity one(T) seems the right choice. But where eye(T, m, n) appears (frequently) in addition, T's additive generator might instead be the right choice. Standardizing on one or the other is of course possible, but: (1) each choice fails to satisfy some use cases; and (2) choosing one(T), eye(T, m, n)'s result element type cannot be T where T's multiplicative identity one(T) is not of type T.

  4. As in the OP, eye does not generalize well to non-Array array types.

  5. eye is a terrible pun.

Resolution

  1. Following the second proposal in the OP, introduce array constructors that accept a UniformScaling first argument and shape-specifying trailing arguments, for example (Array|Matrix)[{T}](S::UniformScaling, shape...) for Arrays. Then Matrix(2I, m, n), for example, serves as a direct replacement for 2*eye(Int, m, n). Such constructors avoid shortcomings 2-5.

  2. Deprecate eye in favor of UniformScalings and UniformScaling-accepting constructors (hopefully addressing shortcoming 1 by encouraging better representation choices / making the usually poor Matrix{T} choice less attractive).

Best!

@Sacha0

This comment has been minimized.

Show comment
Hide comment
@Sacha0

Sacha0 Nov 18, 2017

Member

Convenience constructors: ones, zeros, and fill

ones, zeros, and fill each construct Arrays filled with some value. This post consolidates/reviews/analyzes discussion of this group of convenience constructors from github, slack, and triage.

Shortcomings of ones and zeros

  1. ones element type ambiguity: As #24444 highlights, whether ones(T, shape...) returns an Array filled with T's multiplicative identity (one(T)) or additive generator (oneunit(T)) is ambiguous. Standardizing on one or the other is of course possible, but: (1) as #24444 demonstrates, each choice violates some subset of users' expectations and fails to satisfy some use cases; and (2) choosing one(T), one(T, shape...)'s element type cannot be T where T's multiplicative identity one(T) is not of type T. Introducing a oneunits(T, shape...) function (that constructs Arrays of oneunit(T)s) and standardizing ones(T, shape...) to construct Arrays of one(T)s addresses point (1) but not (2), and the proliferation of (value-)names strongly indicates a missing abstraction.

  2. zeros element type ambiguity: Prior to #16116, whether one(T) returned a multiplicative identity or additive generator for T was ambiguous. #20268 resolved this ambiguity by introducing oneunit(T) as the additive generator for T and affirming one(T) as a multiplicative identity. zero suffers from a similar issue, though likely less important in practice: Is zero(T) the additive identity or a sort of multiplicative zero for T? To illustrate, is 3meters * zero(1meters) 0meters^2 or 0meters? Consequently, even with disambiguation of zero, zeros suffers from an ambiguity analogous to that described above for ones: Does zeros(T, shape...) return an Array filled with T's additive identity or instead some sort of multiplicative zero? As with ones, standardizing on one or the other is of course possible, but: (1) each choice will violate some subset of users' expectations and fail to satisfy some use cases; and (2) choosing a multiplicative zero for T, zeros(T, shape...)'s element type cannot be T where that multiplicative zero for T is not of type T. Bifurcating zeros into two functions, one for each choice, addresses point (1) but not (2), and the proliferation of (value-)names strongly indicates a missing abstraction.

  3. Handling values without an associated function: How do you construct an Array filled with 2s, or 1.0 + ims, or in general any value other than a zero or one? If you are used to ones/zeros, perhaps you respectively call 2ones(Int, (m, n)) and complex.(ones(m, n), ones(m, n)). (These examples are taken from base, as noted in #24444 (comment).) Or to avoid the temporary, perhaps you respectively call fill!([ones|zeros](Int, (m, n)), 2) and fill!([ones|zeros](Complex{Float64}, (m, n)), 1.0 + im). But such incantations are less pleasant than ones and zeros, so perhaps you give your common values names: twos(Int, (m, n)) etc. Overall, antipatterns emerge and ad hoc functions proliferate. As demonstrated in #24444 (comment) and discussed elsewhere, this issue bears out in practice and is widespread: more than half of ones calls in the wild are ad hoc fills.

  4. Simultaneous generalization pressure and poor generalizability: As the OP lays out, ones/zeros do not generalize well to other array types. But ones/zeros's considerable mindshare from Arrays creates pressure for generalization to other array types, in practice yielding the poor de facto generalization described at length in the OP (bones/bzeros, dones/dzeros, spones/spzeros, etc).

  5. Other questionable use of ones: Likely by nature of its considerable mindshare and being the shortest Array constructor, ones sees use where it shouldn't (beyond simple ad hoc fills). A few examples from base:
    dc = d + im*convert(Vector{elty}, ones(n)) instead of e.g. dc = d .+ elty(1)im or dc = d .+ Complex{elty}(im) (ad hoc convert and broadcast).

    dc = d + im*convert(Vector{elty}, ones(n))

    (1:size(A,1)).*ones(Int,size(A,2))' instead of repmat(1:size(A,1), 1, size(A,2)) (ad hoc repmat/repeat).
    iall = (1:size(A,1)).*ones(Int,size(A,2))'

    isequal(Array(sparse(complex.(ones(5,5), ones(5,5)))), complex.(ones(5,5), ones(5,5))) where introducing a name is better as e.g. in A = fill(1.0+im, 5, 5); isequal(Array(sparse(A)), A).
    @test isequal(Array(sparse(complex.(ones(5,5), ones(5,5)))), complex.(ones(5,5), ones(5,5)))

    ones(2,3) * ones(2,3)' instead of fill(3., 2, 2) (perhaps an "unusual ad hoc fill").
    @test isequal(ones(2,3) * ones(2,3)', [3. 3.; 3. 3.])

  6. Questionable use of zeros: While zeros certainly has legitimate uses, as @StefanKarpinski argues in #24444 (comment), due to its mindshare and brevity zeros sees use where something else would serve better.

  • (An observation related to 3, 5, and 6 from the analysis in #24444 (comment): Of the ~30% of ones calls in the wild that are reasonably semantically ones, an appreciable fraction of such calls appear in concatenations, where some lazy equivalent would be better. A similar statement holds for zeros.)
  1. Non-orthogonality / overlap with ...

fill

Subsuming ones and zeros, and obviating the antipatterns and ad hoc methods described in shortcoming 3, fill is the missing abstraction mentioned explicitly in shortcomings 1-2 and implicitly in shortcoming 3. Moreover, fill possesses some distinct advantages over ones and zeros:

  • By nature of requiring a value, fill avoids the element type ambiguities of ones/zeros (shortcomings 1-2).

  • By nature of accommodating any value, fill avoids the "handling values without an associated function" quandary and the consequent antipatterns and ad hoc method proliferation (shortcoming 3).

  • fill generalizes to other array types better than ones/zeros: As described in the OP, generalizations of ones/zeros exacerbate those functions' element type ambiguities and introduce additional default element type ambiguities. Generalizations of fill avoid those ambiguities. Moreover, fill dovetails nicely with the OP's second proposal for addressing the broader array construction design issue. Consequently, fill may produce less generalization pressure than ones/zeros.

  • For constructing Arrays filled with a value other than some zero or one, fill is (always?) more efficient and (usually? almost always?) more compact than equivalents using ones/zeros. For example, consider fill(1im, shape...) versus im*ones(Int, shape...) or fill(1f0/ℯ, shape...) versus ones(Float32, shape...)/ℯ.

fill's primary perceived downside is length. Though for constructing Arrays filled with Float64(0) or Float64(1) fill is marginally longer than ones/zeros (e.g. fill(1., shape...) versus ones(shape...)), for other types admitting literals fill is usually shorter (e.g. fill(1f0, shape...) versus ones(Float32, shape...)). For types not admitting literals, fill and zeros/ones are comparable in length (e.g. fill(Float16(0), shape...) versus zeros(Float16, shape...)), with a slight edge to zeros/ones. But the edge of course goes to fill for values other than a zero or one.

At present ones/zeros seem to box fill out of mindshare in the wild.

Resolution Options

  1. Better specify ones/zeros. See challenges/pitfalls described above in shortcomings 1-2. Fails to address any of shortcomings 3-7.

  2. Move ones/zeros to MATLAB compat. Addresses shortcomings 1-4 and 7. Addresses shortcomings 5-6 insofar as those shortcomings do not or only partially shift to fill.

  3. Deprecate ones/zeros methods other than ones(shape...) and zeros(shape...). Reasoning: For Float64, the additive generator and multiplicative identity coincide, and the additive identity and multiplicative zero also coincide. So these methods do not suffer from shortcomings 1-2. Though shortcomings 3-7 remain, by reducing ones/zeros's scope: (a) awareness and use of fill may improve, perhaps mitigating shortcoming 3; (b) ones/zeros generalization pressure may decrease, perhaps mitigating shortcoming 4; and (c) ones/zeros may become less attractive for questionable use, perhaps mitigating shortcomings 5-6.

What about methods of ones/zeros accepting an array (instance, not type) first argument?

Triage (uniformly?) found these methods questionable. And the analysis in #24444 (comment) and similar analysis of base suggests these methods rarely appear in the wild. So independent of what happens with the more common ones/zeros methods, triage favors deprecating these methods.

Thanks for reading! :)

Member

Sacha0 commented Nov 18, 2017

Convenience constructors: ones, zeros, and fill

ones, zeros, and fill each construct Arrays filled with some value. This post consolidates/reviews/analyzes discussion of this group of convenience constructors from github, slack, and triage.

Shortcomings of ones and zeros

  1. ones element type ambiguity: As #24444 highlights, whether ones(T, shape...) returns an Array filled with T's multiplicative identity (one(T)) or additive generator (oneunit(T)) is ambiguous. Standardizing on one or the other is of course possible, but: (1) as #24444 demonstrates, each choice violates some subset of users' expectations and fails to satisfy some use cases; and (2) choosing one(T), one(T, shape...)'s element type cannot be T where T's multiplicative identity one(T) is not of type T. Introducing a oneunits(T, shape...) function (that constructs Arrays of oneunit(T)s) and standardizing ones(T, shape...) to construct Arrays of one(T)s addresses point (1) but not (2), and the proliferation of (value-)names strongly indicates a missing abstraction.

  2. zeros element type ambiguity: Prior to #16116, whether one(T) returned a multiplicative identity or additive generator for T was ambiguous. #20268 resolved this ambiguity by introducing oneunit(T) as the additive generator for T and affirming one(T) as a multiplicative identity. zero suffers from a similar issue, though likely less important in practice: Is zero(T) the additive identity or a sort of multiplicative zero for T? To illustrate, is 3meters * zero(1meters) 0meters^2 or 0meters? Consequently, even with disambiguation of zero, zeros suffers from an ambiguity analogous to that described above for ones: Does zeros(T, shape...) return an Array filled with T's additive identity or instead some sort of multiplicative zero? As with ones, standardizing on one or the other is of course possible, but: (1) each choice will violate some subset of users' expectations and fail to satisfy some use cases; and (2) choosing a multiplicative zero for T, zeros(T, shape...)'s element type cannot be T where that multiplicative zero for T is not of type T. Bifurcating zeros into two functions, one for each choice, addresses point (1) but not (2), and the proliferation of (value-)names strongly indicates a missing abstraction.

  3. Handling values without an associated function: How do you construct an Array filled with 2s, or 1.0 + ims, or in general any value other than a zero or one? If you are used to ones/zeros, perhaps you respectively call 2ones(Int, (m, n)) and complex.(ones(m, n), ones(m, n)). (These examples are taken from base, as noted in #24444 (comment).) Or to avoid the temporary, perhaps you respectively call fill!([ones|zeros](Int, (m, n)), 2) and fill!([ones|zeros](Complex{Float64}, (m, n)), 1.0 + im). But such incantations are less pleasant than ones and zeros, so perhaps you give your common values names: twos(Int, (m, n)) etc. Overall, antipatterns emerge and ad hoc functions proliferate. As demonstrated in #24444 (comment) and discussed elsewhere, this issue bears out in practice and is widespread: more than half of ones calls in the wild are ad hoc fills.

  4. Simultaneous generalization pressure and poor generalizability: As the OP lays out, ones/zeros do not generalize well to other array types. But ones/zeros's considerable mindshare from Arrays creates pressure for generalization to other array types, in practice yielding the poor de facto generalization described at length in the OP (bones/bzeros, dones/dzeros, spones/spzeros, etc).

  5. Other questionable use of ones: Likely by nature of its considerable mindshare and being the shortest Array constructor, ones sees use where it shouldn't (beyond simple ad hoc fills). A few examples from base:
    dc = d + im*convert(Vector{elty}, ones(n)) instead of e.g. dc = d .+ elty(1)im or dc = d .+ Complex{elty}(im) (ad hoc convert and broadcast).

    dc = d + im*convert(Vector{elty}, ones(n))

    (1:size(A,1)).*ones(Int,size(A,2))' instead of repmat(1:size(A,1), 1, size(A,2)) (ad hoc repmat/repeat).
    iall = (1:size(A,1)).*ones(Int,size(A,2))'

    isequal(Array(sparse(complex.(ones(5,5), ones(5,5)))), complex.(ones(5,5), ones(5,5))) where introducing a name is better as e.g. in A = fill(1.0+im, 5, 5); isequal(Array(sparse(A)), A).
    @test isequal(Array(sparse(complex.(ones(5,5), ones(5,5)))), complex.(ones(5,5), ones(5,5)))

    ones(2,3) * ones(2,3)' instead of fill(3., 2, 2) (perhaps an "unusual ad hoc fill").
    @test isequal(ones(2,3) * ones(2,3)', [3. 3.; 3. 3.])

  6. Questionable use of zeros: While zeros certainly has legitimate uses, as @StefanKarpinski argues in #24444 (comment), due to its mindshare and brevity zeros sees use where something else would serve better.

  • (An observation related to 3, 5, and 6 from the analysis in #24444 (comment): Of the ~30% of ones calls in the wild that are reasonably semantically ones, an appreciable fraction of such calls appear in concatenations, where some lazy equivalent would be better. A similar statement holds for zeros.)
  1. Non-orthogonality / overlap with ...

fill

Subsuming ones and zeros, and obviating the antipatterns and ad hoc methods described in shortcoming 3, fill is the missing abstraction mentioned explicitly in shortcomings 1-2 and implicitly in shortcoming 3. Moreover, fill possesses some distinct advantages over ones and zeros:

  • By nature of requiring a value, fill avoids the element type ambiguities of ones/zeros (shortcomings 1-2).

  • By nature of accommodating any value, fill avoids the "handling values without an associated function" quandary and the consequent antipatterns and ad hoc method proliferation (shortcoming 3).

  • fill generalizes to other array types better than ones/zeros: As described in the OP, generalizations of ones/zeros exacerbate those functions' element type ambiguities and introduce additional default element type ambiguities. Generalizations of fill avoid those ambiguities. Moreover, fill dovetails nicely with the OP's second proposal for addressing the broader array construction design issue. Consequently, fill may produce less generalization pressure than ones/zeros.

  • For constructing Arrays filled with a value other than some zero or one, fill is (always?) more efficient and (usually? almost always?) more compact than equivalents using ones/zeros. For example, consider fill(1im, shape...) versus im*ones(Int, shape...) or fill(1f0/ℯ, shape...) versus ones(Float32, shape...)/ℯ.

fill's primary perceived downside is length. Though for constructing Arrays filled with Float64(0) or Float64(1) fill is marginally longer than ones/zeros (e.g. fill(1., shape...) versus ones(shape...)), for other types admitting literals fill is usually shorter (e.g. fill(1f0, shape...) versus ones(Float32, shape...)). For types not admitting literals, fill and zeros/ones are comparable in length (e.g. fill(Float16(0), shape...) versus zeros(Float16, shape...)), with a slight edge to zeros/ones. But the edge of course goes to fill for values other than a zero or one.

At present ones/zeros seem to box fill out of mindshare in the wild.

Resolution Options

  1. Better specify ones/zeros. See challenges/pitfalls described above in shortcomings 1-2. Fails to address any of shortcomings 3-7.

  2. Move ones/zeros to MATLAB compat. Addresses shortcomings 1-4 and 7. Addresses shortcomings 5-6 insofar as those shortcomings do not or only partially shift to fill.

  3. Deprecate ones/zeros methods other than ones(shape...) and zeros(shape...). Reasoning: For Float64, the additive generator and multiplicative identity coincide, and the additive identity and multiplicative zero also coincide. So these methods do not suffer from shortcomings 1-2. Though shortcomings 3-7 remain, by reducing ones/zeros's scope: (a) awareness and use of fill may improve, perhaps mitigating shortcoming 3; (b) ones/zeros generalization pressure may decrease, perhaps mitigating shortcoming 4; and (c) ones/zeros may become less attractive for questionable use, perhaps mitigating shortcomings 5-6.

What about methods of ones/zeros accepting an array (instance, not type) first argument?

Triage (uniformly?) found these methods questionable. And the analysis in #24444 (comment) and similar analysis of base suggests these methods rarely appear in the wild. So independent of what happens with the more common ones/zeros methods, triage favors deprecating these methods.

Thanks for reading! :)

@andyferris

This comment has been minimized.

Show comment
Hide comment
@andyferris

andyferris Nov 18, 2017

Contributor

First, thank you very much @Sacha0 for writing up such complete and well considered notes here - extremely useful, as it is warranted by the complexity.

Overall, I agree with the thrust and the razors and so-on (very handy set-up in the OP). I have a couple of orthogonal thoughts so I'll split them up into separate posts:

First, regarding convenience functions like ones and zeros and rand and fill:

  • I see the ambiguity of additive one/zero and multiplicative one/zero as a distraction here. one and zero were designed for fields - types which are monads under + and * and have additive and multiplicate identities. For things which aren't monads under * we need oneunit (and potentially zerounit or zeromult or something). To me it is clear that we would want to have oneunits (and zerounits or whatever) if ones and zeros are to continue existing (it just seems completely inconsistent otherwise).

  • How come we never had rands and randns and randexps, to be consistent with zero vs zeros, etc?

  • Random thought (no pun intended): Probably already thought of but what about rand(1:10, size...) -> fill(Rand(1:10), size...) where Rand(1:10) constructs an infinite iterable of random numbers drawn from 1:10. Similarly for Randn etc. EDIT: Sorry, this applies to constructors rand(1:10, size...) -> Array(Rand(1:10), size...) not fill.

  • Sometimes I'd like to be able to fill an associative with a value, like initializing a Dict with given keys and initial values of 0 or Vector{T}() or whatever. In general it would be great if the decisions made here are compatible with Associative (I'll talk about that more below).

Contributor

andyferris commented Nov 18, 2017

First, thank you very much @Sacha0 for writing up such complete and well considered notes here - extremely useful, as it is warranted by the complexity.

Overall, I agree with the thrust and the razors and so-on (very handy set-up in the OP). I have a couple of orthogonal thoughts so I'll split them up into separate posts:

First, regarding convenience functions like ones and zeros and rand and fill:

  • I see the ambiguity of additive one/zero and multiplicative one/zero as a distraction here. one and zero were designed for fields - types which are monads under + and * and have additive and multiplicate identities. For things which aren't monads under * we need oneunit (and potentially zerounit or zeromult or something). To me it is clear that we would want to have oneunits (and zerounits or whatever) if ones and zeros are to continue existing (it just seems completely inconsistent otherwise).

  • How come we never had rands and randns and randexps, to be consistent with zero vs zeros, etc?

  • Random thought (no pun intended): Probably already thought of but what about rand(1:10, size...) -> fill(Rand(1:10), size...) where Rand(1:10) constructs an infinite iterable of random numbers drawn from 1:10. Similarly for Randn etc. EDIT: Sorry, this applies to constructors rand(1:10, size...) -> Array(Rand(1:10), size...) not fill.

  • Sometimes I'd like to be able to fill an associative with a value, like initializing a Dict with given keys and initial values of 0 or Vector{T}() or whatever. In general it would be great if the decisions made here are compatible with Associative (I'll talk about that more below).

@rfourquet

This comment has been minimized.

Show comment
Hide comment
@rfourquet

rfourquet Nov 18, 2017

Contributor

what about rand(1:10, size...) -> fill(Rand(1:10), size...)

But this would create an array containing (aliases of) only one Rand(1:10) object, unless fill handles Rand specially, which I would dislike. I have often wanted an equivalent of C++'s std::generate, which fills a container by invoking a function, and started implementing a similar functionality in julia, until I realized that a mutating version of that can be achieved by broadcast (a=zeros(10); a .= rand.()), and a non-mutating version via Generator or list comprehensions: [rand(1:10) for x=..., y=...]. Unfortunately the latter is less performant (than rand(1:10, size...)), because it doesn't factor out a costly computation which happens within rand(1:10), so having an Iterators.randstream([rgn], [dims]) would make sense. But it's not immediately trivial to supress the rand(1:10, size...) API while retaining performance.

Contributor

rfourquet commented Nov 18, 2017

what about rand(1:10, size...) -> fill(Rand(1:10), size...)

But this would create an array containing (aliases of) only one Rand(1:10) object, unless fill handles Rand specially, which I would dislike. I have often wanted an equivalent of C++'s std::generate, which fills a container by invoking a function, and started implementing a similar functionality in julia, until I realized that a mutating version of that can be achieved by broadcast (a=zeros(10); a .= rand.()), and a non-mutating version via Generator or list comprehensions: [rand(1:10) for x=..., y=...]. Unfortunately the latter is less performant (than rand(1:10, size...)), because it doesn't factor out a costly computation which happens within rand(1:10), so having an Iterators.randstream([rgn], [dims]) would make sense. But it's not immediately trivial to supress the rand(1:10, size...) API while retaining performance.

@andyferris

This comment has been minimized.

Show comment
Hide comment
@andyferris

andyferris Nov 18, 2017

Contributor

It seems to me that some of the issues facing array constructors are also shared by similar, which I have been thinking about an awful lot lately. They share an ambiguity where the size might be confused for the keys or values of the output container.

I'll start with the signature similar(array, T, size...). It seems to me that size... is actually a placeholder for the indices and keys of the output array. For an offset vector, you would probably be best to specify the indices rather than just the size - such as similar(array, T, 0:(n-1)) for a zero-based vector of n elements. Thus I would assert that similar(array, T, 3) is just shorthand for similar(array, T, Base.OneTo(3)). For multidimensional arrays perhaps the third argument is just the keys of the array and similar(array, T, indices...) is shorthand for similar(array, T, cartesian_product(indices...)) where cartesian_product returns an AbstractCartesianRange of the indices.... (Going even further than this, I had been suggesting a version of similar for dictionaries such as similar(olddict, ValueType, new_keys) that returns a container with junk values and given keys).

I would also argue that a generic interface for array constructors should be accepting their keys - with indices... and size... simply as shorthand. (Similarly for fill and zeros and rand...) The advantages are that the expected constructor pattern will follow through to offset arrays (and potentially even to associatives). In fact, I'd probably make it one of your razors that the design should naturally work for offset arrays.

Contributor

andyferris commented Nov 18, 2017

It seems to me that some of the issues facing array constructors are also shared by similar, which I have been thinking about an awful lot lately. They share an ambiguity where the size might be confused for the keys or values of the output container.

I'll start with the signature similar(array, T, size...). It seems to me that size... is actually a placeholder for the indices and keys of the output array. For an offset vector, you would probably be best to specify the indices rather than just the size - such as similar(array, T, 0:(n-1)) for a zero-based vector of n elements. Thus I would assert that similar(array, T, 3) is just shorthand for similar(array, T, Base.OneTo(3)). For multidimensional arrays perhaps the third argument is just the keys of the array and similar(array, T, indices...) is shorthand for similar(array, T, cartesian_product(indices...)) where cartesian_product returns an AbstractCartesianRange of the indices.... (Going even further than this, I had been suggesting a version of similar for dictionaries such as similar(olddict, ValueType, new_keys) that returns a container with junk values and given keys).

I would also argue that a generic interface for array constructors should be accepting their keys - with indices... and size... simply as shorthand. (Similarly for fill and zeros and rand...) The advantages are that the expected constructor pattern will follow through to offset arrays (and potentially even to associatives). In fact, I'd probably make it one of your razors that the design should naturally work for offset arrays.

@andyferris

This comment has been minimized.

Show comment
Hide comment
@andyferris

andyferris Nov 18, 2017

Contributor

I'm not sure if this is the best forum for brainstorming, but to me when a function argument is ambiguous then I would consider giving it a keyword argument (which hopefully will be type stable soon!). Assuming we want to deal with offset arrays and indices and keys here somewhere, compare the signatures in this gist.

Contributor

andyferris commented Nov 18, 2017

I'm not sure if this is the best forum for brainstorming, but to me when a function argument is ambiguous then I would consider giving it a keyword argument (which hopefully will be type stable soon!). Assuming we want to deal with offset arrays and indices and keys here somewhere, compare the signatures in this gist.

@Sacha0

This comment has been minimized.

Show comment
Hide comment
@Sacha0

Sacha0 Nov 18, 2017

Member

Much thanks for reading and contributing your thoughts! :)

Regarding generalization of rand/randn/randexp

The working plan is to follow the OP's second proposal, i.e. MyArray(randstream, otherspecs...) for randstream any random stream. (The above independent convergence on this idea is happily confidence inspiring :).)

Regarding keyword arguments

Usually being clearer, more explicit, and less ambiguous, keyword arguments are fantastic. I would be delighted to see the OP's second proposal support keyword argument signatures, particularly where positional arguments do not scale well or are ambiguous.

In those simple cases where positional arguments are clear / unambiguous, positional arguments' brevity makes them pleasant to work with (particularly when, for example, hacking at the REPL). The OP's third razor strongly suggests their inclusion in those cases.

Wonderfully, positional and keyword arguments need not be in tension! We can have the best of both worlds by supporting one, the other, or both as appropriate and to best effect. And fortunately we can expand the set of supported signatures at any time, for example in the 1.x series. So working through the details of when/where to support positional and/or keyword arguments is not pressing.

Regarding the nature of trailing arguments to array constructors

Concerning what precisely the trailing arguments to array constructors should be (that is, what otherspecs... in the OP's second proposal's MyArray[{...}](contentspec[, otherspecs...]) model should be), I intentionally left otherspecs... somewhat ambiguous and thus flexible. Working out what otherspecs... might be in general will require substantial time and experimentation I wager. But fortunately, for reasons analogous to those given at the end of the preceding subsection, working out the details on this front is not pressing.

Regarding similar, allocate, storage traits, and related topics

Overhauling similar, introducing allocate, exploring storage traits, and related proposals are important topics warranting extensive consideration and discussion. Such proposals have a long and broad design and implementation horizon. And as much as I look forward to that discussion and its potential fruits, I would like to keep this thread tightly focused on array constructors in the narrow sense, and particularly on associated near-term considerations. Discussion of these other topics perhaps would be best consolidated in #18161.

This thread's focus

With time before 1.0 being scarce, I advocate we keep this thread tightly focused on what must happen by 1.0 to improve / enable future improvement of the design of array constructors in a narrow sense. So far, the relevant consensus action items seem to be:

  • Introduce Array[{T}](blah, shape...) constructors.
  • Deprecate Array[{T}](shape...) constructors in favor of those Array[{T}](blah, shape...) constructors.
  • Deprecate eye in favor of UniformScaling and UniformScaling-accepting array constructors.
  • Deprecate {ones|zeros}(A::AbstractArray, ...) methods.

Edit: More tasks:

  • Migrate BitArray to uninitialized-accepting constructors.
  • Migrate RowVector to uninitialized-accepting constructors.
  • Migrate GenericArray to uninitialized-accepting constructors.
  • Migrate OffsetArray to uninitialized-accepting constructors (base's implementation).

Other consensus items? Thanks and best!

Member

Sacha0 commented Nov 18, 2017

Much thanks for reading and contributing your thoughts! :)

Regarding generalization of rand/randn/randexp

The working plan is to follow the OP's second proposal, i.e. MyArray(randstream, otherspecs...) for randstream any random stream. (The above independent convergence on this idea is happily confidence inspiring :).)

Regarding keyword arguments

Usually being clearer, more explicit, and less ambiguous, keyword arguments are fantastic. I would be delighted to see the OP's second proposal support keyword argument signatures, particularly where positional arguments do not scale well or are ambiguous.

In those simple cases where positional arguments are clear / unambiguous, positional arguments' brevity makes them pleasant to work with (particularly when, for example, hacking at the REPL). The OP's third razor strongly suggests their inclusion in those cases.

Wonderfully, positional and keyword arguments need not be in tension! We can have the best of both worlds by supporting one, the other, or both as appropriate and to best effect. And fortunately we can expand the set of supported signatures at any time, for example in the 1.x series. So working through the details of when/where to support positional and/or keyword arguments is not pressing.

Regarding the nature of trailing arguments to array constructors

Concerning what precisely the trailing arguments to array constructors should be (that is, what otherspecs... in the OP's second proposal's MyArray[{...}](contentspec[, otherspecs...]) model should be), I intentionally left otherspecs... somewhat ambiguous and thus flexible. Working out what otherspecs... might be in general will require substantial time and experimentation I wager. But fortunately, for reasons analogous to those given at the end of the preceding subsection, working out the details on this front is not pressing.

Regarding similar, allocate, storage traits, and related topics

Overhauling similar, introducing allocate, exploring storage traits, and related proposals are important topics warranting extensive consideration and discussion. Such proposals have a long and broad design and implementation horizon. And as much as I look forward to that discussion and its potential fruits, I would like to keep this thread tightly focused on array constructors in the narrow sense, and particularly on associated near-term considerations. Discussion of these other topics perhaps would be best consolidated in #18161.

This thread's focus

With time before 1.0 being scarce, I advocate we keep this thread tightly focused on what must happen by 1.0 to improve / enable future improvement of the design of array constructors in a narrow sense. So far, the relevant consensus action items seem to be:

  • Introduce Array[{T}](blah, shape...) constructors.
  • Deprecate Array[{T}](shape...) constructors in favor of those Array[{T}](blah, shape...) constructors.
  • Deprecate eye in favor of UniformScaling and UniformScaling-accepting array constructors.
  • Deprecate {ones|zeros}(A::AbstractArray, ...) methods.

Edit: More tasks:

  • Migrate BitArray to uninitialized-accepting constructors.
  • Migrate RowVector to uninitialized-accepting constructors.
  • Migrate GenericArray to uninitialized-accepting constructors.
  • Migrate OffsetArray to uninitialized-accepting constructors (base's implementation).

Other consensus items? Thanks and best!

@dlfivefifty

This comment has been minimized.

Show comment
Hide comment
@dlfivefifty

dlfivefifty Nov 30, 2017

Contributor

Hmm, I think I agree. If keywords become fast, how about

Array(n; fill=x) # returns a length `n` `Vector{typeof(x)}` with entries `x`
Contributor

dlfivefifty commented Nov 30, 2017

Hmm, I think I agree. If keywords become fast, how about

Array(n; fill=x) # returns a length `n` `Vector{typeof(x)}` with entries `x`

@StefanKarpinski StefanKarpinski removed the triage label Nov 30, 2017

@dlfivefifty

This comment has been minimized.

Show comment
Hide comment
@dlfivefifty

dlfivefifty Dec 2, 2017

Contributor

I've noticed the following deprecation:

zeros(a::AbstractArray) is deprecated, use fill!(similar(a), 0) instead

I don't think the replacement for zeros(a) is compatible with working with other array types. In an ApproxFun.jl branch, I went with Array(Zeros, A) and BandedMatrix(Zeros, A), with the latter automatically getting its size and bandwidths from A whenever A implements the banded matrix interface. But I'm not sure that's a particularly good solution.

Perhaps if A::AbstractArray we could deprecate zeros(A) with

Array(A; fill=0)

Then BandedMatrix(A; fill=0) would get both its size and bandwidths from A.

Contributor

dlfivefifty commented Dec 2, 2017

I've noticed the following deprecation:

zeros(a::AbstractArray) is deprecated, use fill!(similar(a), 0) instead

I don't think the replacement for zeros(a) is compatible with working with other array types. In an ApproxFun.jl branch, I went with Array(Zeros, A) and BandedMatrix(Zeros, A), with the latter automatically getting its size and bandwidths from A whenever A implements the banded matrix interface. But I'm not sure that's a particularly good solution.

Perhaps if A::AbstractArray we could deprecate zeros(A) with

Array(A; fill=0)

Then BandedMatrix(A; fill=0) would get both its size and bandwidths from A.

@Sacha0

This comment has been minimized.

Show comment
Hide comment
@Sacha0

Sacha0 Dec 2, 2017

Member

@dlfivefifty, I would expect similar(A::BandedMatrix) to return a BandedMatrix similar to A in all respects but for data, including shape and bandwidths. The banded matrix types in Base (namely Diagonal, Bidiagonal, Tridiagonal, and SymTridiagonal) behave in this manner, and with that behavior fill!(similar(A), 0) serves as a direct replacement for zeros(A::AbstractArray) on these types:

julia> fill!(similar(Tridiagonal([1,2,3], [1,2,3,4], [1,2,3])), 0)
4×4 Tridiagonal{Int64,Array{Int64,1}}:
 0  0  ⋅  ⋅
 0  0  0  ⋅
 ⋅  0  0  0
 ⋅  ⋅  0  0

How does similar(A::BandedMatrix) behave?

Alternative rewrites for zeros(A::AbstractArray) exist and may serve better in various situations, for example zero(A), fill!(copy(A), 0), or (forthcoming) fillstored!([copy|similar](A), 0). zeros(A::AbstractArray) rewrite questions having popped up a couple times now (#24444 (comment), #24656), so I have made a note to expand the deprecation warnings with additional rewrites and explanation.

Note that this discussion highlights the ambiguity in zeros(A::AbstractArray), and that zeros(A::AbstractArray, ...) = fill!(similar(A, ...), 0) was how this function was defined in Base :). Thanks and best!

Member

Sacha0 commented Dec 2, 2017

@dlfivefifty, I would expect similar(A::BandedMatrix) to return a BandedMatrix similar to A in all respects but for data, including shape and bandwidths. The banded matrix types in Base (namely Diagonal, Bidiagonal, Tridiagonal, and SymTridiagonal) behave in this manner, and with that behavior fill!(similar(A), 0) serves as a direct replacement for zeros(A::AbstractArray) on these types:

julia> fill!(similar(Tridiagonal([1,2,3], [1,2,3,4], [1,2,3])), 0)
4×4 Tridiagonal{Int64,Array{Int64,1}}:
 0  0  ⋅  ⋅
 0  0  0  ⋅
 ⋅  0  0  0
 ⋅  ⋅  0  0

How does similar(A::BandedMatrix) behave?

Alternative rewrites for zeros(A::AbstractArray) exist and may serve better in various situations, for example zero(A), fill!(copy(A), 0), or (forthcoming) fillstored!([copy|similar](A), 0). zeros(A::AbstractArray) rewrite questions having popped up a couple times now (#24444 (comment), #24656), so I have made a note to expand the deprecation warnings with additional rewrites and explanation.

Note that this discussion highlights the ambiguity in zeros(A::AbstractArray), and that zeros(A::AbstractArray, ...) = fill!(similar(A, ...), 0) was how this function was defined in Base :). Thanks and best!

@dlfivefifty

This comment has been minimized.

Show comment
Hide comment
@dlfivefifty

dlfivefifty Dec 2, 2017

Contributor

Sorry, let me clarify: sometimes when I call zeros(A::BandedMatrix) I’m actually after a Matrix of the same dimensions as A. But sometimes I want a BandedMatrix with the same dimensions and bandwidths as A. fill!(similar(A), 0) only gives me the latter.

Contributor

dlfivefifty commented Dec 2, 2017

Sorry, let me clarify: sometimes when I call zeros(A::BandedMatrix) I’m actually after a Matrix of the same dimensions as A. But sometimes I want a BandedMatrix with the same dimensions and bandwidths as A. fill!(similar(A), 0) only gives me the latter.

@Sacha0

This comment has been minimized.

Show comment
Hide comment
@Sacha0

Sacha0 Dec 2, 2017

Member

Ah, I see :). fill!(Matrix(A), 0) for the latter operation? Best!

Member

Sacha0 commented Dec 2, 2017

Ah, I see :). fill!(Matrix(A), 0) for the latter operation? Best!

@dlfivefifty

This comment has been minimized.

Show comment
Hide comment
@dlfivefifty

dlfivefifty Dec 2, 2017

Contributor

No because that copies all the entries first

Contributor

dlfivefifty commented Dec 2, 2017

No because that copies all the entries first

@Sacha0

This comment has been minimized.

Show comment
Hide comment
@Sacha0

Sacha0 Dec 2, 2017

Member

You mean fill!(Matrix{eltype(A)}(size(A)), 0) (on stable, and with uninitialized in the Matrix constructor on master)?

Member

Sacha0 commented Dec 2, 2017

You mean fill!(Matrix{eltype(A)}(size(A)), 0) (on stable, and with uninitialized in the Matrix constructor on master)?

@dlfivefifty

This comment has been minimized.

Show comment
Hide comment
@dlfivefifty

dlfivefifty Dec 2, 2017

Contributor

That's fine for Matrix, though very long, but not for BandedMatrix where one would need to test if A satisfies the banded matrix interface (isbanded(A)) to decide whether to call `bandwidth or not.

And in ApproxFun I'm doing code like

for TYP in (:Matrix, :BandedMatrix, :RaggedMatrix, :BlockMatrix, :BandedBlockBandedMatrix, :BlockBandedMatrix, :AlmostBandedMatrix)
    @eval function convert(::Type{$TYP}, A::YetAnotherType)
          ret = $TYP(Zeros, A)
          # populate ret
          ret
    end
end

I see no way to do this general pattern where each matrix type has different parameters using something like fill!(Matrix{eltype(A)}(size(A)), 0).

Contributor

dlfivefifty commented Dec 2, 2017

That's fine for Matrix, though very long, but not for BandedMatrix where one would need to test if A satisfies the banded matrix interface (isbanded(A)) to decide whether to call `bandwidth or not.

And in ApproxFun I'm doing code like

for TYP in (:Matrix, :BandedMatrix, :RaggedMatrix, :BlockMatrix, :BandedBlockBandedMatrix, :BlockBandedMatrix, :AlmostBandedMatrix)
    @eval function convert(::Type{$TYP}, A::YetAnotherType)
          ret = $TYP(Zeros, A)
          # populate ret
          ret
    end
end

I see no way to do this general pattern where each matrix type has different parameters using something like fill!(Matrix{eltype(A)}(size(A)), 0).

@Sacha0

This comment has been minimized.

Show comment
Hide comment
@Sacha0

Sacha0 Dec 2, 2017

Member

Thanks for the clarifying code! I think I see what you are after now; I touched on an extension to the OP's second proposal in #11557 (comment) that is quite similar to what you are looking for and I agree could be lovely and useful. These ideas also head into #18161 territory. Note that the operations you are after were not covered by zeros(A::AbstractArray)'s contract. Best!

Member

Sacha0 commented Dec 2, 2017

Thanks for the clarifying code! I think I see what you are after now; I touched on an extension to the OP's second proposal in #11557 (comment) that is quite similar to what you are looking for and I agree could be lovely and useful. These ideas also head into #18161 territory. Note that the operations you are after were not covered by zeros(A::AbstractArray)'s contract. Best!

@mschauer

This comment has been minimized.

Show comment
Hide comment
@mschauer

mschauer Dec 12, 2017

Contributor

In #16029 (comment), @nalimilan proposed to capture the cases where collect(X) could possibly return something else than an Array depending on X by AbstractArray(X), which is understood to collect to the most appropriate mutable container type (with Array as a fallback).

There is a similar subproposal above for e.g. SomeArray(Rep(NaN), A). The difference is that SomeArray(...) collects to SomeArrays and AbstractArray(...) collects to whatever it can infer from the argument or Iterator-traits. A possible use case is

AbstractArray(NaN for a in A)

which can return a MVector filled with NaNs (e.g.) when A is a static vector and this is known via a trait.

The introduction of AbstractArray(...) with this meaning allows to recover the generality which is lost by replacing collect with Vector as non-breaking change even later on, so one can proceed with #16029 .

Contributor

mschauer commented Dec 12, 2017

In #16029 (comment), @nalimilan proposed to capture the cases where collect(X) could possibly return something else than an Array depending on X by AbstractArray(X), which is understood to collect to the most appropriate mutable container type (with Array as a fallback).

There is a similar subproposal above for e.g. SomeArray(Rep(NaN), A). The difference is that SomeArray(...) collects to SomeArrays and AbstractArray(...) collects to whatever it can infer from the argument or Iterator-traits. A possible use case is

AbstractArray(NaN for a in A)

which can return a MVector filled with NaNs (e.g.) when A is a static vector and this is known via a trait.

The introduction of AbstractArray(...) with this meaning allows to recover the generality which is lost by replacing collect with Vector as non-breaking change even later on, so one can proceed with #16029 .

@Sacha0

This comment has been minimized.

Show comment
Hide comment
@Sacha0

Sacha0 Dec 12, 2017

Member

There is a similar subproposal above for e.g. SomeArray(Rep(NaN), A). The difference is that SomeArray(...) collects to SomeArrays and AbstractArray(...) collects to whatever it can infer from the argument or Iterator-traits.

Please note that AbstractArray(Rep(NaN), A) in that subproposal provides precisely the behavior you seek in AbstractArray(NaN for a in A) :).

Member

Sacha0 commented Dec 12, 2017

There is a similar subproposal above for e.g. SomeArray(Rep(NaN), A). The difference is that SomeArray(...) collects to SomeArrays and AbstractArray(...) collects to whatever it can infer from the argument or Iterator-traits.

Please note that AbstractArray(Rep(NaN), A) in that subproposal provides precisely the behavior you seek in AbstractArray(NaN for a in A) :).

@mschauer

This comment has been minimized.

Show comment
Hide comment
@mschauer

mschauer Dec 12, 2017

Contributor

They go together, like Array(it, shape...) and Array(it) have both their place.

Contributor

mschauer commented Dec 12, 2017

They go together, like Array(it, shape...) and Array(it) have both their place.

@RalphAS

This comment has been minimized.

Show comment
Hide comment
@RalphAS

RalphAS Mar 4, 2018

A recent discussion on discourse shows that many in the user community want a convenience constructor for the uninitialized case. This is qualitatively different from ones() etc., and the third razor in the OP seems to imply that it should be done, properly, in Base. Is there some reason it couldn't simply be a callable instance?

(uninitialized)(T::DataType,dims...)=Array{T}(uninitialized,dims)

RalphAS commented Mar 4, 2018

A recent discussion on discourse shows that many in the user community want a convenience constructor for the uninitialized case. This is qualitatively different from ones() etc., and the third razor in the OP seems to imply that it should be done, properly, in Base. Is there some reason it couldn't simply be a callable instance?

(uninitialized)(T::DataType,dims...)=Array{T}(uninitialized,dims)
@JeffBezanson

This comment has been minimized.

Show comment
Hide comment
@JeffBezanson

JeffBezanson Mar 4, 2018

Member

I would also accept renaming it to uninit. That is one long word.

Member

JeffBezanson commented Mar 4, 2018

I would also accept renaming it to uninit. That is one long word.

@andyferris

This comment has been minimized.

Show comment
Hide comment
@andyferris

andyferris Mar 4, 2018

Contributor

c.f. #undef - if we're looking for a new name for these things, it could be nice to normalize terminology.

Contributor

andyferris commented Mar 4, 2018

c.f. #undef - if we're looking for a new name for these things, it could be nice to normalize terminology.

@crstnbr

This comment has been minimized.

Show comment
Hide comment
@crstnbr

crstnbr Mar 4, 2018

Contributor

But uninit and #undef mean different things, right? AFAIU uninit means I don't care about the initialization whereas #undef represents not filled at all (truly empty array).

Contributor

crstnbr commented Mar 4, 2018

But uninit and #undef mean different things, right? AFAIU uninit means I don't care about the initialization whereas #undef represents not filled at all (truly empty array).

@andyferris

This comment has been minimized.

Show comment
Hide comment
@andyferris

andyferris Mar 4, 2018

Contributor

#undefs can only occur as uninitialized values of Arrays (or fields of structs), which AFAICT are set to zero pointers by the compiler (zero pointers could also come from structs and arrays returned by foreign calls, I suppose). The main difference is for isbits types whose uninitialised value is whatever RAM happens to be there, since no pointer is involved. It may well be that a large enough difference exists such that two terminologies is better than one, but maybe it’s worth thinking about.

I was just thinking it might be nice to be able to use something like uninitialized in struct inner constructors (i.e. Expr(:new, ...) expressions where I can indicate exactly which fields to leave uninitialized - but then maybe that would need to be a language keyword or something since if it is an object I should be able to throw it inside a struct?).

Contributor

andyferris commented Mar 4, 2018

#undefs can only occur as uninitialized values of Arrays (or fields of structs), which AFAICT are set to zero pointers by the compiler (zero pointers could also come from structs and arrays returned by foreign calls, I suppose). The main difference is for isbits types whose uninitialised value is whatever RAM happens to be there, since no pointer is involved. It may well be that a large enough difference exists such that two terminologies is better than one, but maybe it’s worth thinking about.

I was just thinking it might be nice to be able to use something like uninitialized in struct inner constructors (i.e. Expr(:new, ...) expressions where I can indicate exactly which fields to leave uninitialized - but then maybe that would need to be a language keyword or something since if it is an object I should be able to throw it inside a struct?).

@ChrisRackauckas

This comment has been minimized.

Show comment
Hide comment
@ChrisRackauckas

ChrisRackauckas Mar 7, 2018

Contributor

Bringing up #25107 since it seems a lot of this proposal mentions shape, when shape isn't enough information to fully characterize the structure of many abstract arrays.

Contributor

ChrisRackauckas commented Mar 7, 2018

Bringing up #25107 since it seems a lot of this proposal mentions shape, when shape isn't enough information to fully characterize the structure of many abstract arrays.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment