trim and winsor now maintain original order #546

Merged: 8 commits, Feb 17, 2020
27 changes: 14 additions & 13 deletions src/robust.jl
@@ -16,11 +16,11 @@ and lowest elements removed. To compute the trimmed mean of `x` use

# Example
```julia
-julia> trim([1,2,3,4,5], prop=0.2)
+julia> trim([5,2,4,3,1], prop=0.2)
3-element Array{Int64,1}:
2
-3
4
+3
```
"""
function trim(x::AbstractVector; prop::Real=0.0, count::Integer=0)
@@ -44,10 +44,10 @@ function trim!(x::AbstractVector; prop::Real=0.0, count::Integer=0)
0 <= count < n/2 || throw(ArgumentError("count must satisfy 0 ≤ count < length(x)/2."))
end

-partialsort!(x, (n-count+1):n)
-partialsort!(x, 1:count)
-deleteat!(x, (n-count+1):n)
-deleteat!(x, 1:count)
+ix = vcat(partialsortperm(x, 1:count), partialsortperm(x, (n-count+1):n))
AsafManela marked this conversation as resolved.
+sort!(ix)
+ix = unique(ix) # can be replaced by unique! starting julia 1.1
Member

Since you know indices are sorted, a simple loop comparing values to previous ones and skipping them will probably be faster.
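
A minimal sketch of the loop suggested here, assuming `ix` is a plain `Vector` of integer indices that is already sorted. The helper name `dedup_sorted!` is hypothetical and is not part of the PR:

```julia
# Hypothetical helper, not part of the PR: remove repeated values from an
# already sorted index vector in a single pass, in place.
function dedup_sorted!(ix::Vector{<:Integer})
    isempty(ix) && return ix
    w = 1                        # position of the last value kept
    for r in 2:length(ix)
        if ix[r] != ix[w]        # new value: move it next to the kept ones
            w += 1
            ix[w] = ix[r]
        end
    end
    return resize!(ix, w)        # drop the leftover tail
end

dedup_sorted!([1, 1, 2, 5, 5, 5, 8])  # returns [1, 2, 5, 8]
```

Because the input is sorted, equal values are always adjacent, so comparing each element with the last kept one is enough and no hashing is needed.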

Member

Have you tried this?

+deleteat!(x, ix)

return x
end
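
For readers following the hunk above, the order-preserving idea can be sketched as a standalone function. This is an illustration only: the name `trim_keep_order` and its positional `count` argument are assumptions, and it works on a copy rather than in place. It also assumes the two index sets never overlap, which the `count < n/2` check is meant to ensure.

```julia
# Hypothetical standalone sketch (name and signature are assumptions,
# not the function defined in the PR).
function trim_keep_order(x::Vector, count::Integer)
    n = length(x)
    0 <= count < n/2 || throw(ArgumentError("count must satisfy 0 ≤ count < length(x)/2."))
    y = copy(x)                             # leave the input untouched
    count == 0 && return y
    lo = partialsortperm(y, 1:count)        # positions of the `count` smallest values
    hi = partialsortperm(y, (n-count+1):n)  # positions of the `count` largest values
    drop = sort!(vcat(lo, hi))              # deleteat! needs sorted, unique indices
    return deleteat!(y, drop)
end

trim_keep_order([8, 2, 3, 4, 5, 6, 7, 1], 1)  # returns [2, 3, 4, 5, 6, 7]
```

Collecting positions instead of sorting the data is what lets the surviving elements keep their original order.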
@@ -62,13 +62,13 @@ mean of `x` use `mean(winsor(x))`.

# Example
```julia
-julia> winsor([1,2,3,4,5], prop=0.2)
+julia> winsor([5,2,3,4,1], prop=0.2)
5-element Array{Int64,1}:
-2
+4
2
3
4
-4
+2
```
"""
function winsor(x::AbstractVector; prop::Real=0.0, count::Integer=0)
@@ -92,10 +92,11 @@ function winsor!(x::AbstractVector; prop::Real=0.0, count::Integer=0)
0 <= count < n/2 || throw(ArgumentError("count must satisfy 0 ≤ count < length(x)/2."))
end

-partialsort!(x, (n-count+1):n)
-partialsort!(x, 1:count)
-x[1:count] .= x[count+1]
-x[n-count+1:end] .= x[n-count]
+ix = partialsortperm(x, 1:(count+1))
AsafManela marked this conversation as resolved.
+x[ix[1:count]] .= x[ix[count+1]]
Member

Suggested change
-x[ix[1:count]] .= x[ix[count+1]]
+x[view(ix, 1:count)] .= x[ix[count+1]]

Contributor Author

Perhaps because this would now be a view of a view, Julia doesn't like this and it kills the process.

Member

Well that's certainly not supposed to happen. Can you try starting Julia with --check-bounds=yes? Otherwise it would be worth reporting.


+ix = partialsortperm(x, (n-count):n)
+x[ix[2:end]] .= x[ix[1]]
Member

Suggested change
-x[ix[2:end]] .= x[ix[1]]
+x[view(ix, 2:end)] .= x[ix[1]]

Contributor Author

same as above


return x
end
23 changes: 12 additions & 11 deletions test/robust.jl
@@ -3,38 +3,39 @@ using Test

### Trimming outliers

@test trim([1,2,3,4,5,6,7,8], prop=0.1) == [1,2,3,4,5,6,7,8]
@test trim([1,2,3,4,5,6,7,8], prop=0.2) == [2,3,4,5,6,7]
@test trim([8,2,3,4,5,6,7,1], prop=0.1) == [8,2,3,4,5,6,7,1]
@test trim([8,2,3,4,5,6,7,1], prop=0.2) == [2,3,4,5,6,7]
@test trim([1,2,3,4,5,6,7,8,9], prop=0.4) == [4,5,6]
@test trim([1,2,3,4,5,6,7,8], count=1) == [2,3,4,5,6,7]
@test trim([8,7,6,5,4,3,2,1], count=1) == [7,6,5,4,3,2]
@test trim([1,2,3,4,5,6,7,8,9], count=3) == [4,5,6]


@test_throws ArgumentError trim([])
@test_throws ArgumentError trim([1,2,3,4,5], prop=0.5)

@test trim!([1,2,3,4,5,6,7,8], prop=0.1) == [1,2,3,4,5,6,7,8]
@test trim!([1,2,3,4,5,6,7,8], prop=0.2) == [2,3,4,5,6,7]
@test trim!([8,2,3,4,5,6,7,1], prop=0.1) == [8,2,3,4,5,6,7,1]
@test trim!([8,2,3,4,5,6,7,1], prop=0.2) == [2,3,4,5,6,7]
@test trim!([1,2,3,4,5,6,7,8,9], prop=0.4) == [4,5,6]
@test trim!([1,2,3,4,5,6,7,8], count=1) == [2,3,4,5,6,7]
@test trim!([8,7,6,5,4,3,2,1], count=1) == [7,6,5,4,3,2]
@test trim!([1,2,3,4,5,6,7,8,9], count=3) == [4,5,6]

@test_throws ArgumentError trim!([])
@test_throws ArgumentError trim!([1,2,3,4,5], prop=0.5)

@test winsor([1,2,3,4,5,6,7,8], prop=0.1) == [1,2,3,4,5,6,7,8]
@test winsor([1,2,3,4,5,6,7,8], prop=0.2) == [2,2,3,4,5,6,7,7]
@test winsor([8,2,3,4,5,6,7,1], prop=0.1) == [8,2,3,4,5,6,7,1]
@test winsor([8,2,3,4,5,6,7,1], prop=0.2) == [7,2,3,4,5,6,7,2]
@test winsor([1,2,3,4,5,6,7,8,9], prop=0.4) == [4,4,4,4,5,6,6,6,6]
@test winsor([1,2,3,4,5,6,7,8], count=1) == [2,2,3,4,5,6,7,7]
@test winsor([8,7,6,5,4,3,2,1], count=1) == [7,7,6,5,4,3,2,2]
@test winsor([1,2,3,4,5,6,7,8,9], count=3) == [4,4,4,4,5,6,6,6,6]

@test_throws ArgumentError winsor([])
@test_throws ArgumentError winsor([1,2,3,4,5], prop=0.5)

@test winsor!([1,2,3,4,5,6,7,8], prop=0.1) == [1,2,3,4,5,6,7,8]
@test winsor!([1,2,3,4,5,6,7,8], prop=0.2) == [2,2,3,4,5,6,7,7]
@test winsor!([8,2,3,4,5,6,7,1], prop=0.1) == [8,2,3,4,5,6,7,1]
@test winsor!([8,2,3,4,5,6,7,1], prop=0.2) == [7,2,3,4,5,6,7,2]
@test winsor!([1,2,3,4,5,6,7,8,9], prop=0.4) == [4,4,4,4,5,6,6,6,6]
@test winsor!([1,2,3,4,5,6,7,8], count=1) == [2,2,3,4,5,6,7,7]
@test winsor!([8,7,6,5,4,3,2,1], count=1) == [7,7,6,5,4,3,2,2]
@test winsor!([1,2,3,4,5,6,7,8,9], count=3) == [4,4,4,4,5,6,6,6,6]

@test_throws ArgumentError winsor!([])
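
The expected vectors in the `winsor` tests above can be cross-checked with a different, simpler technique: clamp every element to the (count+1)-th smallest and (count+1)-th largest values. This is a sketch under that assumption, not the method used in the PR, and `winsor_by_clamp` is a hypothetical name:

```julia
# Hypothetical cross-check, not the PR's implementation: winsorize by clamping
# to order statistics taken from a sorted copy, so element order is untouched.
function winsor_by_clamp(x::Vector{<:Real}, count::Integer)
    n = length(x)
    0 <= count < n/2 || throw(ArgumentError("count must satisfy 0 ≤ count < length(x)/2."))
    s = sort(x)                      # sorted copy; x itself is not modified
    lo, hi = s[count+1], s[n-count]  # bounds that replace the extreme values
    return clamp.(x, lo, hi)
end

winsor_by_clamp([8, 2, 3, 4, 5, 6, 7, 1], 1)  # returns [7, 2, 3, 4, 5, 6, 7, 2]
```

This does a full sort, so it trades the partial sort used in the diff for brevity. It is meant only as an independent check of the expected values.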