length, numel, count, strlen #1939

Closed
StefanKarpinski opened this Issue Jan 8, 2013 · 16 comments

@StefanKarpinski
Member

See discussion here: #1916. There are a couple of things going on here.

  1. we should probably get rid of numel.
  2. count should be a reducer for giving the number of elements in a collection.
  3. for types providing length give count that way.
  4. maybe use count in place of strlen?

There may be other things to consider here.

@stevengj
Member
stevengj commented Jan 9, 2013

I don't understand why there should be a separate count function that is identical to length for arrays but also handles other container types. Why not just call it length?

@StefanKarpinski
Member

The idea is that if your argument is something that needs to be consumed in order to count the number of elements (say a task or a stream), then if you call length and afterwards have no more elements left, you're going to be rather unhappy.
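
A minimal sketch of the concern, written in Python as an analogy rather than in Julia. The helper name `count` and the stand-in `stream` are assumptions made for illustration; nothing here is the actual Julia API under discussion.

```python
def count(iterable):
    """Count elements by iterating; this consumes a one-shot iterator."""
    return sum(1 for _ in iterable)

# A one-shot iterator stands in for a task or a stream (hypothetical example).
stream = iter(["a", "b", "c"])
n = count(stream)          # walks the whole stream to count it
leftover = list(stream)    # the stream is now exhausted

# A vector-like container can report its size without being consumed.
items = ["a", "b", "c"]
m = len(items)
still_there = list(items)  # all elements are still available
```

After the call to `count`, the stream has nothing left to yield, which is the surprise a caller would not expect from a function named like length.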

@johnmyleswhite
Member

I think the case of a non-rewindable iterable argues for calling the function count!.

@StefanKarpinski
Member

Hmm. That feels like a serious rathole.

@JeffBezanson
Member

Decision: length will give the number of elements you get when iterating over something; endof or similar will give the last index (for use by end in indexing).

@StefanKarpinski
Member

We seem to have resolved that length should mean "the number of things in a collection", implying that length(str) should give the number of characters and therefore replace the current meaning of strlen(str), so strlen should be deprecated. The other function that is needed is endof(x), which gives the last valid key into x. In the case of vectors this will simply be length(x), but for strings it should return the last valid byte index into the string. An interesting example is sorted dictionaries, where endof(sd) would return the last key in sorted order so that sd[end] gives the last value in sorted order; there is, however, no corresponding syntax for sd[start(sd)], which would give the first value in sorted order.
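
The difference between the two quantities for strings can be sketched as follows. This is a Python analogy, not Julia code: the function names `char_count` and `last_byte_index` are hypothetical, and the sketch assumes UTF-8 encoding, 1-based byte indices, and 0 for the empty string.

```python
def char_count(s: str) -> int:
    """Number of characters, the proposed meaning of length(str)."""
    return len(s)

def last_byte_index(s: str) -> int:
    """1-based byte index at which the last character starts in UTF-8.

    Plays the role of endof(str) in this sketch; returns 0 for an
    empty string (an assumed convention).
    """
    if not s:
        return 0
    total_bytes = len(s.encode("utf-8"))
    last_char_bytes = len(s[-1].encode("utf-8"))
    return total_bytes - last_char_bytes + 1

ascii_word = "hello"   # every character is one byte
accented = "héllo"     # "é" occupies two bytes in UTF-8
```

For a pure ASCII string both functions agree, while for `accented` the character count is 5 and the last valid byte index is 6, which is why one function cannot serve both purposes.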

@StefanKarpinski
Member

numel should also be deprecated in favor of length.

@StefanKarpinski
Member

Related to #1454.

@ViralBShah
Member

length now has a meaning that will completely confound Matlab users. For that reason, numel was an unambiguous name.

@StefanKarpinski
Member

Saying that this will confound Matlab users is a bit of a stretch. What's a use case where someone from Matlab would actually use length and get confused? This doesn't change length's behavior at all, by the way – this is what length has meant for months. The only thing I did was deprecate numel.

@ViralBShah
Member

Ok, since we have had this behaviour for a long time, we have been confounding Matlab users for a long time. :-)

@JeffBezanson
Member

max(size(A)) is a pretty strange function and calling it length makes very little sense.
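
To make the comparison concrete, here is a small Python sketch contrasting the Matlab-style definition with an element count. The function names are hypothetical, arrays are represented only by their shape tuples, and the shape is assumed to have at least one dimension.

```python
from math import prod

def matlab_style_length(shape):
    """Largest dimension, or 0 if any dimension is empty (max(size(A)) style)."""
    return 0 if 0 in shape else max(shape)

def element_count(shape):
    """Total number of elements, the meaning chosen for length in this thread."""
    return prod(shape)

matrix_shape = (3, 4)
tensor_shape = (2, 5, 3)
```

For `matrix_shape` the Matlab-style value is 4 while the element count is 12; the two only coincide for vectors.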

@GlenHertz
Contributor

I agree unless A is of type rectangle. Then that length makes sense. "width" would be the min(size(A)).

@johnmyleswhite
Member

I agree that makes sense, but it seems too quirky to allow in a language with arbitrary rank tensors.

@GlenHertz
Contributor

Totally agree it is a bad idea in general. I should have been more clear. My minor point was that length may not be the best name. I think 'count' is less confusing but at this point I don't really care which way it goes. I expected 'length' to return nrow(array) since length of a shape is usually the size of one dimension.

@StefanKarpinski
Member

I would be fine with count returning the number of elements in an iterable – possibly by consuming them – and have length be the number of elements in non-consumed, vector-like things (including vectors and strings). However, I suspect that might be a little too particular.

@JeffBezanson added a commit that closed this issue Jan 12, 2013
add endof(), giving the last index (used by A[end]), now crucial for strings

make length(String) give character count. string[end] now also works.
closes #1939, closes #1454
51410ae