string length() does not give iterable item count #1454

Closed
carlobaldassi opened this issue Oct 26, 2012 · 12 comments
Labels
bug Indicates an unexpected problem or unintended behavior

@carlobaldassi
Member

This works:

julia> ["xyz"...]
3-element Char Array:
 'x'
 'y'
 'z'

This segfaults:

julia> ["é"...]
Segmentation fault (core dumped)

My guess is that [x...] should use strlen instead of length in this case.
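
For reference, the mismatch behind the crash can be seen by comparing the byte count of the string with its character count. This is only a sketch, and it assumes a current Julia release (1.x), where `length` on a string counts characters and `ncodeunits` counts UTF-8 code units; at the time of this issue the roles were different (`length` gave the index range and `strlen` the character count, as discussed in this thread).

```julia
# Assumes Julia 1.x semantics, not the 2012 behavior reported above.
s = "\u00e9"              # "é": one character, two bytes in UTF-8

nbytes = ncodeunits(s)    # number of UTF-8 code units (bytes)
nchars = length(s)        # number of characters
chars  = collect(s)       # the items that iteration actually yields

println((nbytes, nchars, chars))
```

Splatting or collecting the string yields `nchars` items, so any code that preallocates `nbytes` slots ends up with unfilled entries.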

@staticfloat
Sponsor Member

gdb backtrace:

julia> ["é"...]

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000
0x00000001000498a0 in jl_method_table_assoc_exact ()
(gdb) bt
#0  0x00000001000498a0 in jl_method_table_assoc_exact ()
#1  0x0000000100049b9a in jl_apply_generic ()
#2  0x000000010004e66e in jl_f_apply ()
#3  0x000000010007fb14 in do_call ()
#4  0x000000010007eaa9 in eval ()
#5  0x000000010008a004 in jl_toplevel_eval_flex ()
#6  0x000000010004e8f7 in jl_f_top_eval ()
#7  0x0000000101d5aded in ?? ()
#8  0x0000000100049c65 in jl_apply_generic ()
#9  0x0000000101d13d92 in ?? ()
#10 0x0000000101d13a2c in ?? ()
#11 0x0000000100049c65 in jl_apply_generic ()
#12 0x00000001000017be in true_main ()
#13 0x00000001000839c3 in julia_trampoline ()
#14 0x0000000100001b7b in main ()
(gdb) 

@carlobaldassi
Member Author

I'll add that comprehensions get the length wrong but don't segfault:

julia> [x for x in "ẍÿƶ"]
7-element Any Array:
 'ẍ'   
 'ÿ'   
 'ƶ'   
 #undef
 #undef
 #undef
 #undef

I wonder if it's reasonable to define an iteration_length or iterlen function which defaults to length in general, strlen for strings, and possibly x->nothing (or something similar) for iterators for which the number of iterations can't be known in advance (so that things like comprehensions and [x...] can use dynamically growing memory).
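
The proposal above could be sketched roughly as follows. The names `iterlen` and `collect_items` are hypothetical and are not Base functions; the sketch assumes Julia 1.x, where `length` on a string already counts characters, and it uses `Iterators.filter` as a stand-in for an iterator whose item count is not known in advance.

```julia
# Hypothetical helper named after the proposal; not part of Base.
iterlen(x) = length(x)                      # default: item count is length
iterlen(s::AbstractString) = length(s)      # characters (Julia 1.x semantics)
iterlen(::Base.Iterators.Filter) = nothing  # count unknown before iterating

# Hypothetical consumer: preallocate when the count is known, grow otherwise.
function collect_items(itr)
    n = iterlen(itr)
    if n === nothing
        out = Any[]
        for x in itr
            push!(out, x)                   # dynamically growing storage
        end
        return out
    end
    out = Vector{Any}(undef, n)
    i = 0
    for x in itr
        i += 1
        out[i] = x
    end
    return out
end

a = collect_items("\u1e8d\u00ff\u01b6")              # "ẍÿƶ": 3 chars, 7 bytes
b = collect_items(Iterators.filter(isodd, 1:6))
```

With the count taken from the number of iterated items rather than the byte length, the string case produces three elements and no unfilled slots. Current Julia releases expose a related idea through the `Base.IteratorSize` trait.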

@JeffBezanson
Sponsor Member

This is probably one reason we used to have a separate numel. @StefanKarpinski what do you think?

@JeffBezanson
Sponsor Member

BTW, the segfault is also due only to length giving the wrong value for this case.

@StefanKarpinski
Sponsor Member

Ho-hum. I'm not really sure what to do here. Didn't you want to rename strlen anyway?

@JeffBezanson
Sponsor Member

The only use of length for strings seems to be in done for strings. If one always called done instead of doing i > length(s), perhaps length could return the number of characters.

@StefanKarpinski
Sponsor Member

I have to think about this because it radically changes what was a pretty well-thought-through, coherent view of strings – i.e. that 1:length(s) is the range of indices but there's not a character for every index. Also, keep in mind that any use of end to index into a string is implicitly also a call to length(s). However, unless you wrap it in thisind, chances are that usage is broken. I think this may need a bit of a revamp...
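
The distinction between string indices and characters described here can be illustrated with a small sketch. It assumes Julia 1.x function names (`eachindex`, `lastindex`, `isvalid`, `thisind`) rather than the 2012 API, so treat it as an illustration of the idea and not of the code under discussion.

```julia
# Assumes Julia 1.x; indices are byte offsets, not character positions.
s = "a\u00e9z"                      # 'a', 'é', 'z'; 'é' occupies bytes 2 and 3

valid_idx = collect(eachindex(s))   # indices at which a character starts
last_idx  = lastindex(s)            # what `end` means when indexing s
mid_ok    = isvalid(s, 3)           # byte 3 is in the middle of 'é'
start_idx = thisind(s, 3)           # start of the character covering byte 3

println((valid_idx, last_idx, mid_ok, start_idx, s[start_idx]))
```

Here the index range runs to 4 while the string holds only 3 characters, which is the gap between "range of indices" and "item count" that this thread is about.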

@JeffBezanson
Sponsor Member

If we could fix both those things it would be fantastic. Otherwise, we could just bring back numel as the same as length for everything but strings.

@StefanKarpinski
Sponsor Member

Hmm. This is a tricky problem.

JeffBezanson added a commit that referenced this issue Oct 28, 2012
@vtjnash
Sponsor Member

vtjnash commented Nov 8, 2012

@JeffBezanson isn't this fixed? (with extensions to comprehensions covered by open issue #1457 referenced above)

@JeffBezanson
Sponsor Member

Not entirely, since the title of this issue is still true afaik. Either length or some other function should reliably give an item count for collections where it makes sense. Or there could be a separate function to give the last valid index.

@JeffBezanson
Sponsor Member

See #1939.
