This repository has been archived by the owner on May 21, 2022. It is now read-only.

rescale! and center! along obsdim #30

Merged
merged 12 commits into from
Apr 16, 2017

Conversation

abieler (Contributor) commented Apr 4, 2017

It might be useful to also have the feature rescaling functions working on both dimensions.
The tests are currently failing, but I figured I'd check with you first whether you like the general idea.

Evizero (Member) commented Apr 4, 2017

Hi! Really cool that you are doing this. I agree that we should allow for the obsdim to be specified. In fact we already have a "system" for doing so which we use at MLDataPatterns.jl (that package will be the new back-end for MLDataUtils for all data subsetting, k-folds etc).

Would be cool if you could adapt the code to that "system". I describe the general way of doing it here: joshday/OnlineStats.jl#40 (comment). For most code we allow arrays of any order, but it would already be a big improvement to just have code for vectors and matrices.

edit: ObsDim is defined in LearnBase.jl here: https://github.com/JuliaML/LearnBase.jl/blob/master/src/LearnBase.jl#L318-L395
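
For readers unfamiliar with the pattern, here is a minimal sketch of dispatching on an observation-dimension type. The `Constant` type below is a hypothetical stand-in written for this example, not the LearnBase definition, and the `center!` methods are illustrative rather than the code from this PR.

```julia
# Hypothetical stand-in for ObsDim.Constant; the real type lives in LearnBase.jl.
struct Constant{N} end

# Observations along dimension 1: each column of X is a feature.
function center!(X::AbstractMatrix, μ::AbstractVector, ::Constant{1})
    for j in axes(X, 2), i in axes(X, 1)
        X[i, j] -= μ[j]
    end
    μ
end

# Observations along dimension 2: each row of X is a feature.
function center!(X::AbstractMatrix, μ::AbstractVector, ::Constant{2})
    for i in axes(X, 2), j in axes(X, 1)
        X[j, i] -= μ[j]
    end
    μ
end

X = [1.0 2.0; 3.0 4.0; 5.0 6.0]   # 3 observations × 2 features
center!(X, [3.0, 4.0], Constant{1}())

Y = [1.0 3.0 5.0; 2.0 4.0 6.0]    # 2 features × 3 observations
center!(Y, [3.0, 4.0], Constant{2}())
```

Both calls subtract the same per-feature means; only the layout of the data differs, and the compiler picks the loop from the type of the last argument.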

abieler (Contributor, Author) commented Apr 4, 2017

Cool. I'll definitely try to adapt to that scheme!

abieler (Contributor, Author) commented Apr 10, 2017

Something like this?

Evizero (Member) left a comment

Looking good! Thanks for this. I made a few comments. Other than that, the main thing missing is a few tests.

@@ -1,55 +1,130 @@
"""
`μ = center!(X[, μ])`
`μ = center!(X, obsdim[, μ])`
Member

obsdim should be depicted as the last parameter and also optional

center!(X, mu, obsdim)
end

function center!{T}(X::AbstractVector{T}, bosdim::ObsDim.Constant{1})
Member

typo in bosdim, although you could drop the argument name and just write ::ObsDim.Constant{1}

X[j, i] = X[j, i] - μ[j]
end
function center!(X, mu; obsdim=LearnBase.default_obsdim(X))
center!(X, mu, LearnBase.obs_dim(obsdim))
Member

The default indentation for Julia is 4 spaces, so each nesting level should be indented by 4 more spaces.

"""
`μ, σ = rescale!(X[, μ, σ])`
`μ, σ = rescale!(X, obsdim[, μ, σ])`
Member

obsdim should be listed as the last parameter and also as optional

abieler (Contributor, Author) commented Apr 11, 2017

Thanks for the comments, I'll add some tests later on.

Evizero (Member) left a comment

lgtm! Nice work. One last request

for i = 1:n
X[j, i] = X[j, i] - μ[j]
function center!(X, μ; obsdim=LearnBase.default_obsdim(X))
center!(X, μ, LearnBase.obs_dim(obsdim))
Member

can you change

LearnBase.obs_dim(obsdim)

everywhere to

convert(LearnBase.ObsDimension, obsdim)

(I deprecated the former just a few days ago)
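
As an illustration of the requested shape, the sketch below shows a keyword front-end that normalizes `obsdim` with `convert` and forwards to a positional method. All types here are hypothetical stand-ins defined for the example; the `convert` method on `Int` and the `Last()` default are assumptions about how such a conversion could behave, not a copy of LearnBase.

```julia
# Hypothetical stand-ins; the real ObsDimension hierarchy is defined in LearnBase.jl.
abstract type ObsDimension end
struct Constant{N} <: ObsDimension end
struct Last <: ObsDimension end

# Assumed conversion rule for this sketch: an integer names a fixed dimension.
Base.convert(::Type{ObsDimension}, dim::Int) = Constant{dim}()

# Positional worker: does the actual centering.
function center!(x::AbstractVector, μ::Number, ::Constant{1})
    x .-= μ
    μ
end

# For a vector, the last dimension is dimension 1.
center!(x::AbstractVector, μ::Number, ::Last) = center!(x, μ, Constant{1}())

# Keyword front-end: obsdim is optional and is normalized through convert.
center!(x::AbstractVector, μ::Number; obsdim = Last()) =
    center!(x, μ, convert(ObsDimension, obsdim))

a = [1.0, 2.0, 3.0]
center!(a, 2.0)               # default obsdim

b = [1.0, 2.0, 3.0]
center!(b, 2.0, obsdim = 1)   # integer is converted to Constant{1}()
```

Values that are already an `ObsDimension` pass through `convert` unchanged, so callers can give either an integer or a dimension object.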

@@ -127,14 +127,14 @@ function rescale!(X::AbstractMatrix, μ::AbstractVector, σ::AbstractVector, ::O
μ, σ
end

function rescale!(X::AbstractVector, μ::AbstractVector, σ::AbstractVector, ::ObsDim.Constant{1})
Member

why did you remove this?

Contributor Author

I figured this would include ObsDim.First() and .Last(). No?

Member

Well, since X is a vector, the fallback method you implemented transforms ObsDim.Last to ObsDim.Constant{1} anyway.

I think here ::ObsDim.Constant{1} was cleaner than ::ObsDim.Constant (which would also allow ObsDim.Constant{2}()), because if the data X is a vector, then 1 is the only sensible dimension to choose. i.e. I'd expect a method error if I use ObsDim.Constant{2}().
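
A small sketch of that behavior, using hypothetical stand-in types rather than LearnBase, and scalar μ and σ for brevity (the method in the diff takes vectors):

```julia
# Hypothetical stand-ins for the ObsDim types discussed above.
abstract type ObsDimension end
struct Constant{N} <: ObsDimension end
struct Last <: ObsDimension end

# Only dimension 1 is accepted for vector data.
function rescale!(x::AbstractVector, μ::Number, σ::Number, ::Constant{1})
    for i in eachindex(x)
        x[i] = (x[i] - μ) / σ
    end
    μ, σ
end

# Fallback: for a vector, Last is rewritten to Constant{1}.
rescale!(x::AbstractVector, μ, σ, ::Last) = rescale!(x, μ, σ, Constant{1}())

x = [2.0, 4.0, 6.0]
rescale!(x, 4.0, 2.0, Last())

# No method matches Constant{2} for a vector, so the call fails at dispatch.
err = try
    rescale!(x, 4.0, 2.0, Constant{2}())
catch e
    e
end
```

With the narrower signature the mistake surfaces as a `MethodError` at the call site instead of being silently accepted.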

Contributor Author

Makes sense. I'll undo my improvements :)

Evizero (Member) commented Apr 15, 2017

This is optional, but if you feel up to it, it would be cool to update the corresponding section in the documentation: https://raw.githubusercontent.com/JuliaML/MLDataUtils.jl/master/docs/data/feature.rst

Evizero (Member) commented Apr 16, 2017

lgtm. Ready to merge when you are

@abieler abieler closed this Apr 16, 2017
@Evizero Evizero reopened this Apr 16, 2017
@Evizero Evizero merged commit 4cd7a1c into JuliaML:master Apr 16, 2017
Evizero (Member) commented Apr 16, 2017

thanks!

abieler (Contributor, Author) commented Apr 17, 2017

Thanks for all the comments! I also learned about singleton types in the process. :)
How do you feel about support for DataFrames and DataTables? Worth looking at, or do you want to keep this for arrays only?

Evizero (Member) commented Apr 17, 2017

> how do you feel about support for dataframes and datatables

I will add a DataFrames dependency in the next update (see dev branch), so I am open to the idea.
