# findfirst(A) returns index of first non-zero element, or 0 #925

Merged
merged 5 commits into from Jun 19, 2012

## Conversation

Member

### HarlanH commented Jun 12, 2012

 Per mailing list discussion.
 Harlan Harris `findfirst(A) gives index of first non-zero element or 0 if none found. Per dev-list discussion.` `7c38668`
Member

### pao commented Jun 12, 2012

 Better MATLAB syntactic compatibility would be to have a two-argument `find()`, where the second argument is the number of elements to find. Whether we want that or not is up for discussion, of course, though I personally favor the two-argument approach since we have multiple dispatch. http://www.mathworks.com/help/techdoc/ref/find.html
Member

### pao commented Jun 12, 2012

 Oh, just saw your post on -dev. Those are good points. Got to think about that.
Member

### HarlanH commented Jun 13, 2012

 Yeah, I see the appeal of having findfirst-like behavior in find(), but am not currently convinced it's a good idea. On a related note, findfirst() in this pull request always finds the first non-zero element. That always works, as you can do something like `findfirst([1,2,3,4,5] .== 3)`, returning the index 3, but it'd be better to have a two-argument version of findfirst, like `findfirst([1,2,3,4,5], 3)` that does a true short-circuit for speed. I'm going to update this pull request to include that form as well...
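 The difference between the two call styles can be sketched as follows. This is an illustration in current Julia syntax, not the code from this pull request; `first_nonzero` and `first_equal` are hypothetical names, and the return value 0 for "not found" follows the convention in the PR title.

```julia
# Hypothetical helper names; not the code from this pull request.

# One-argument style: index of the first non-zero element, or 0 if none.
function first_nonzero(A)
    for i in eachindex(A)
        if A[i] != 0
            return i
        end
    end
    return 0
end

# Two-argument style: compare against v while scanning and stop at the
# first match, so no temporary array is built.
function first_equal(A, v)
    for i in eachindex(A)
        if A[i] == v
            return i
        end
    end
    return 0
end

A = [1, 2, 3, 4, 5]
first_nonzero(A .== 3)  # builds a full 5-element mask first, then scans it
first_equal(A, 3)       # scans A directly and stops at index 3
```

 Both calls return 3. The first pays for a complete comparison pass and a temporary mask before the search starts, which is the cost the two-argument form avoids.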
 Harlan Harris `findfirst(A, 3) and findfirst(A, isodd) forms` `19caba5`
Member

### HarlanH commented Jun 13, 2012

 Also added a `findfirst(A, function)` form, which tests elements until the function returns true. Note that this is syntactically the opposite of the `map()` and `filter()` functions, which might suggest that `findfirst(function, A)` would be better. If so, then perhaps `findfirst(3, A)` would be better than the current form. Consistency is an unobtainable virtue.
Owner

### JeffBezanson commented Jun 14, 2012

 Hate to rain on the parade, and admittedly this is an obscure case, but this is ambiguous in the case of an array of functions. Considering both that and the usual convention for higher-order functions, it's probably better to use `findfirst(function, A)`. Then we can also get rid of all the `T`s and `{T}`s.
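 A small sketch of the array-of-functions case, written in current Julia syntax. The names `find_first` and `always_false` are hypothetical and this is not the code from the pull request; it only assumes a value form and a predicate form that both take the array first.

```julia
# Hypothetical names for illustration; not the code from this pull request.

# Value form: index of the first element equal to v, or 0 if none.
function find_first(A::AbstractVector, v)
    for i in eachindex(A)
        A[i] == v && return i
    end
    return 0
end

# Predicate form, with the function as the second argument.
function find_first(A::AbstractVector, pred::Function)
    for i in eachindex(A)
        pred(A[i]) && return i
    end
    return 0
end

always_false(x) = false
fs = Function[sin, always_false]

# Reads like "find the value always_false in fs" (which sits at index 2),
# but dispatch picks the more specific ::Function method and applies it
# as a predicate to every element instead.
find_first(fs, always_false)  # → 0
```

 With the function in the first position, as in `map()` and `filter()`, the argument order alone says which argument is the predicate.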
Member

### HarlanH commented Jun 14, 2012

 Oh, so it is! OK, so you're suggesting the signatures be:

 - `findfirst(testf::function, A::StridedArray)`
 - `findfirst{T}(v::T, A::StridedArray{T})`
 - `findfirst(A::StridedArray)`

 And maybe also expanding find likewise, as:

 - `find(testf::function, A::StridedArray)`
 - `find{T}(v::T, A::StridedArray{T})`
 - `find(A::StridedArray)`

 (And findn too, but like I said, I don't understand the metaprogramming for that, so someone smarter than I will have to do the work...)
Owner

### JeffBezanson commented Jun 16, 2012

 If you flip them both, the ambiguity is still there. At the same time, having `find(A, v)` do something different than the same call in matlab is confusing. I would just keep the function-argument version for now.
Owner

### ViralBShah commented Jun 17, 2012

 I prefer using the 2 argument version of find as in matlab, and making the second argument be either a number or a function. The 3 argument form that matlab supports is a bit ugly, and perhaps we can do better.
Owner

### Harlan Harris added some commits Jun 19, 2012

 Harlan Harris `Merge branch 'master' of git://github.com/JuliaLang/julia` `Conflicts: base/array.jl` `b585dbc`
 Harlan Harris `fix merge issue (again); 2-arg find()` `Conflicts: base/array.jl` `a670754`
Member

### HarlanH commented Jun 19, 2012

 I dealt with the merge issue. Jeff, looks like you replaced the zero(T) with 0 in your earlier changes to find? I also created two-argument forms of find(), to match findfirst(). I'm not sure if the algorithm I used is the most efficient or not. It makes only one pass over the source array, instead of two as the one-argument cases do, but it grows a target array from scratch before copying it (to get rid of the padding). I should probably do some timings to find out. I also probably didn't do the copy in quite the right way.
Owner

### JeffBezanson commented Jun 19, 2012

 In the timings I've done of this sort of thing, it's generally better to determine the result size first if doing so is reasonably cheap. An extra constant-space pass over the array is better than allocating extra space. I replaced `zero(T)` with `0` just because it is nicer and probably not really different performance-wise.
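 The two strategies being compared can be sketched roughly as below, in current Julia syntax. The function names are hypothetical, and the one-pass version uses `push!` as a stand-in for the grow-then-copy approach described earlier in the thread, so it is an approximation rather than the PR's implementation.

```julia
# Hypothetical names; an approximation of the two strategies discussed.

# Two passes: count the matches, allocate exactly that many slots, then fill.
function find_two_pass(A)
    n = 0
    for x in A
        x != 0 && (n += 1)
    end
    out = Vector{Int}(undef, n)
    k = 0
    for i in eachindex(A)
        if A[i] != 0
            k += 1
            out[k] = i
        end
    end
    return out
end

# One pass: grow the result as matches are found.
function find_one_pass(A)
    out = Int[]
    for i in eachindex(A)
        A[i] != 0 && push!(out, i)
    end
    return out
end

find_two_pass([0, 3, 0, 5])  # → [2, 4]
find_one_pass([0, 3, 0, 5])  # → [2, 4]
```

 Both return the same indices; which one is faster depends on how cheap the counting pass is and how many elements match, so it is worth timing on representative data.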
Member

### HarlanH commented Jun 19, 2012

 I did some simple performance testing. If the array you're iterating over is sparse (with respect to the test item), the one-pass method with a growing array is twice as fast. The functional method doesn't get inlined, so it's an order of magnitude slower. `findfirst(x)` is vastly faster than `find(x)[1]`, of course.

```
# build a 10M array, then time various find operations on it
x = [zeros(10000), 1]
x = [x, x] # 20K
x = [x, x] # 40K
x = [x, x] # 80K
x = [x, x] # 160K
x = [x, x] # 320K
x = [x, x] # 640K
x = [x, x] # 1.2M
x = [x, x] # 2.4M
x = [x, x] # 4.8M
x = [x, x] # 9.6M

function f1(x)
    for i = 1:10
        y = find(x)
    end
end

function f2(x)
    for i = 1:10
        y = find(x, 1.0)
    end
end

function f3(x)
    for i = 1:10
        y = find(x, y->y==1)
    end
end

function ff1(x)
    for i = 1:10
        y = findfirst(x)
    end
end

function ff2(x)
    for i = 1:10
        y = findfirst(x,1.0)
    end
end

function ff3(x)
    for i = 1:10
        y = findfirst(x,y->y==1)
    end
end

@time f1(x)
@time f1(x)
@time f2(x)
@time f2(x)
@time f3(x)
@time f3(x)
@time ff1(x)
@time ff1(x)
@time ff2(x)
@time ff2(x)
@time ff3(x)
@time ff3(x)
```

```
julia> @time f1(x)
elapsed time: 0.4995710849761963 seconds
julia> @time f1(x)
elapsed time: 0.49610280990600586 seconds
julia> @time f2(x)
elapsed time: 0.27991819381713867 seconds
julia> @time f2(x)
elapsed time: 0.2588460445404053 seconds
julia> @time f3(x)
elapsed time: 7.236483097076416 seconds
julia> @time f3(x)
elapsed time: 7.2504119873046875 seconds
julia> @time ff1(x)
elapsed time: 0.00019311904907226562 seconds
julia> @time ff1(x)
elapsed time: 0.000308990478515625 seconds
julia> @time ff2(x)
elapsed time: 0.008929967880249023 seconds
julia> @time ff2(x)
elapsed time: 0.00028395652770996094 seconds
julia> @time ff3(x)
elapsed time: 0.006864070892333984 seconds
julia> @time ff3(x)
elapsed time: 0.005472898483276367 seconds
```

 Harlan Harris `test cases for two-arg find()` `b020893`

### JeffBezanson added a commit that referenced this pull request Jun 19, 2012

 JeffBezanson `Merge pull request #925 from HarlanH/master` `findfirst(A) returns index of first non-zero element, or 0` `7816b4a`

Closed