Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Make [a, b] non-concatenating #8599

Merged
merged 15 commits into from
Feb 9, 2015
Merged

WIP: Make [a, b] non-concatenating #8599

merged 15 commits into from
Feb 9, 2015

Conversation

jakebolewski
Copy link
Member

Brings @nolta's mn/sanecatbranch up to date. Addresses #3737 #2488. Part of #7941.

@ivarne ivarne added breaking This change will break code needs decision A decision on this change is needed labels Oct 6, 2014
Conflicts:
	base/LineEdit.jl
	base/show.jl
	src/julia-parser.scm
	test/collections.jl
@JeffBezanson
Copy link
Member

Awesome! Fixed my conflicts.

@johnmyleswhite
Copy link
Member

+1

@JeffBezanson
Copy link
Member

Looks like this also fixes #8541?

@JeffBezanson
Copy link
Member

This parses x[i,j;k] as (typed_vcat x (parameters k) i j). It should probably be (ref x (parameters k) i j).

@JeffBezanson
Copy link
Member

Although really the problem is that x[i;j] is ambiguous. It could be a vcat, or a ref with 1 argument and 1 parameter.

@JeffBezanson
Copy link
Member

Maybe we should get rid of "typed cat" syntax as well. I wonder how much it is used?

@jakebolewski
Copy link
Member Author

Almost never (scanning over all registered packages):

/Users/jacobbolewski/.julia/v0.3/MathProgBase/test/mixintprog.jl => 1
/Users/jacobbolewski/.julia/v0.3/TimeData/test/dataframe_extensions.jl => 2
/Users/jacobbolewski/.julia/v0.3/DataArrays/test/literals.jl => 4
/Users/jacobbolewski/Julia/julia/test/reduce.jl => 3
/Users/jacobbolewski/.julia/v0.3/MAT/test/read.jl => 1
/Users/jacobbolewski/Julia/julia/test/arrayops.jl => 1

@joehuchette
Copy link
Member

In the MathProgBase test, we use "typed cat" to make a Float64 matrix from Int literals:

Float64[1 2 3 4 5]

So it's kind of a corner case (and not strictly necessary), but the syntax is nice for it.

@mlubin
Copy link
Member

mlubin commented Oct 8, 2014

It wouldn't be much of a loss to have to change that particular case to [1.0 2.0 3.0 4.0 5.0]

@JeffBezanson
Copy link
Member

That syntax just doesn't quite seem general enough to be worth it.

@joehuchette
Copy link
Member

I think I agree, but it's not clear to me why it would potentially be removed either. Just to avoid the ambiguity in parsing x[i;j]? It's also consistent with the behavior for vectors: Float64[1,2,3]

@JeffBezanson
Copy link
Member

While we're making breaking changes here anyway, I think we really should
remove all the ambiguities and weirdness from this syntax. Not totally sure
what exactly to do though.

@jakebolewski
Copy link
Member Author

My hatred of this array syntax does grow.

Some non-obvious behaviors:

x = 1:10
....hundred lines later....
y = [x;]  # what the hell does this do?
# lets play spot the semicolon....
for Mi7647 in {Symmetric(diagm(1.0:3.0)), Hermitian(diagm(1.0:3.0)),
          Hermitian(diagm(complex(1.0:3.0))), SymTridiagonal([1.0:3.0;], zeros(2))}
...
# unrelated but another unreadable example
[3 3 + 4im 4 + 5im 6im 7 + 8im 4 + 3im 9 4im]

There seems to be a tension between wanting concise syntax, wanting syntax that allows you to sanely overload getindex typ[x,y; args...] -> getindex(typ, x,y args..) (although this gets abused currently because we are not allowed to overload .), and what is sane to read.

@StefanKarpinski
Copy link
Member

We should consider getting rid of the space-separated syntax in side of array literals while we're at it.

@toivoh
Copy link
Contributor

toivoh commented Oct 8, 2014

Yeah, if we could get rid of the space sensitive syntax that would be great. Not sure how it could be done though?

@tkelman
Copy link
Contributor

tkelman commented Oct 8, 2014

We should consider getting rid of the space-separated syntax in side of array literals while we're at it.

Yes please, I'd be hugely in favor of that. I feel like semicolons are a little bit underutilized inside array literals. Would making [1, 2, 3;] the syntax for a row array be too similar to a 1-d vector, or too ambiguous with the new getindex with kwargs syntax? (edit: also #8615 means something's screwy with the way that works right now...)

@pao
Copy link
Member

pao commented Oct 8, 2014

Dissenting opinion on removal of space-separation: DCMs everywhere will weep softly at needing to have visible syntax separating their elements.

[ 0 1 0
 -1 0 0
  0 0 1]

vs.

[ 0, 1, 0;
 -1, 0, 0;
  0, 0, 1]

(This has been a message brought to you by Julia Users for the Protection of Small, Dense Matrices.)

@Jutho
Copy link
Contributor

Jutho commented Oct 8, 2014

While I am typically inclined towards more consistent behavior, even if this means stepping away from familiar (from other software) conventions, in this case I tend to agree with @pao. In particular, if [1,2,3] is a vector (i.e. a column / vertical object), then I would start doubting whether

[ 0, 1, 0;
 -1, 0, 0;
  0, 0, 1]

is a matrix with 0,1,0 as it first row (as it visually looks), or 0,1,0 as its first column. That confusion is removed by just using spaces as separators within a row, despite the problems it brings along.

I am no longer convinced by my own argument ...

@ivarne
Copy link
Member

ivarne commented Oct 8, 2014

The pretty matrix input format seems like it can be solved in other ways, that doesn't make [] so heavily overloaded. A string macro would be pretty easy

a = Matrix"""
 0 1 0
-1 0 0
 0 0 1
"""

And it could also be made to support picking variables from the environment (with or without the $ prefix)

@johnmyleswhite
Copy link
Member

Another +1 for removing space sensitivity, despite the concerns about conflating rows and columns.

I think dealing with tensions between whether inputs are rows or columns is going to be big problem for generalizing to n-dimensional inputs.

@JeffBezanson
Copy link
Member

Interestingly this change actually makes the space-sensitive syntax worse, since there is no 1-element version of it: [x;] is 1-element vcat, but [x] is a 1-element array, and obviously so is [x ].

Here's a weird idea: we could require tab as a separator here, making TSV nearly valid input syntax.

@pao
Copy link
Member

pao commented Oct 8, 2014

Here's a weird idea: we could require tab as a separator here, making TSV nearly valid input syntax.

Worse than invisible is invisible but indistinguishable. After wrestling with YAML files and Makefiles, I don't want any of that.

The string input syntax proposed by @ivarne is interesting, though.

@JeffBezanson
Copy link
Member

I should raise the issue that this is a breaking change, and not just a deprecation. In the current state this branch has no behavior changes, but prints warnings. After merging, we should leave it like that for a little while, then flip the switch. For this to be worth it 0.4 needs the new behavior enabled.

@mlubin
Copy link
Member

mlubin commented Feb 9, 2015

+1 to merge. Better now than later.

@mbauman
Copy link
Member

mbauman commented Feb 9, 2015

This is awesome. +1 to merging and taking off the training wheels once packages have adjusted. 🍰 for everyone involved here.

use promote_typeof in mixed-type cat functions
@johansigfrids
Copy link

Why can't [itr] be a shorthand for collect(itr)? Truth is, I always read [a:b] as a shorthand for [i for i in a:b].

@johnmyleswhite
Copy link
Member

Because you can index by iterators.

@Jutho
Copy link
Contributor

Jutho commented Feb 9, 2015

julia/test/perf/cat/perf.jl certainly needs to be patched when/before this gets merged.

@kmsquire
Copy link
Member

kmsquire commented Feb 9, 2015

Because you can index by iterators.

I think John meant to say that you can index by ranges.

@StefanKarpinski
Copy link
Member

@johansigfrids: [x] is a corner case where the syntaxes for vector construction ([a, b, c]) vertical concatenation ([a; b; c]) horizontal concatenation ([a b c]) and horizontal/vertical concatenation ([a b; c d]) all intersect since they are distinguished by what punctuation is used between values. For vector construction, the result should always be the vector containing x; for concatenation syntax, if x is some kind of collection, the result should be a vector containing the elements of x. My argument was that since concatenating a single thing is not really sensible, we should use collect(x) and consider [x] to belong to the vector construction syntax.

JeffBezanson added a commit that referenced this pull request Feb 9, 2015
WIP: Make [a, b] non-concatenating
@JeffBezanson JeffBezanson merged commit 075f84d into master Feb 9, 2015
@IainNZ
Copy link
Member

IainNZ commented Feb 9, 2015

I should raise the issue that this is a breaking change, and not just a deprecation. In the current state this branch has no behavior changes, but prints warnings. After merging, we should leave it like that for a little while, then flip the switch. For this to be worth it 0.4 needs the new behavior enabled.

Maybe for 0.4-rc1, or would that be too late?

@garborg
Copy link
Contributor

garborg commented Feb 12, 2015

Re: collect(r) being idiomatic and making more sense than [r;], I agree, but collect(r) is 40-100% slower than [r;] (for UnitRanges) at the moment. collect already seems more useful for generic code as, e.g. [filter;] returns Array{Filter,1}.

@JeffBezanson
Copy link
Member

JeffBezanson commented Feb 13, 2015 via email

@lostanlen
Copy link

FWIW, this change causes the "Don't overuse splicing" style advice (https://julia.readthedocs.org/en/latest/manual/style-guide/#don-t-overuse) to be outdated.

@pao
Copy link
Member

pao commented Apr 28, 2015

Thanks for the work you've been doing so far on the documentation, @lostanlen!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking This change will break code needs decision A decision on this change is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.