New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add step_relu #76

Merged
merged 12 commits into from Jul 27, 2017

Conversation

Projects
None yet
4 participants
@alexpghayes
Contributor

alexpghayes commented Jul 15, 2017

Saw #74 and realized it wasn't all that different from step_intercept so gave it a go.

Just saw the consistency tweaks to step_intercept and tried to use the same style here. Also tweaked the step_intercept tests following the changes to warnings/error messages.

Realizing I'm just submitting a PR out of the blue and not sure if this kind of thing helps or is a pain to manage. In general, are you open to PRs to address open issues? Is this the kind of thing where it's good to comment on the issue first to claim it, or something along those lines?

@codecov-io

This comment has been minimized.

codecov-io commented Jul 15, 2017

Codecov Report

Merging #76 into master will increase coverage by 0.12%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #76      +/-   ##
==========================================
+ Coverage   92.61%   92.73%   +0.12%     
==========================================
  Files          44       45       +1     
  Lines        2923     2975      +52     
==========================================
+ Hits         2707     2759      +52     
  Misses        216      216
Impacted Files Coverage Δ
R/relu.R 100% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3edceec...74d70ae. Read the comment docs.

@topepo

This comment has been minimized.

Collaborator

topepo commented Jul 15, 2017

This is great; it's pretty much what I was going to put in. Good MARS writeup. I'm glad someone understood my rambling.

Can you make a few changes:

  • sync to the master branch; I changed a few minor things for the intercept code, including the tests that you have in this PR.
  • don't replace the existing values (and thus default the role to "predictor")
  • add a prefix option that defaults to "left_relu_" or "right_relu_" depending on the value of reverse? This is similar to step_pca to automatically naming the new columns.
  • add notes to contributors.md for your help

We need a print method too but I can take care of that.

@alexpghayes

This comment has been minimized.

Contributor

alexpghayes commented Jul 15, 2017

Okay, think that does it. Still having trouble running R CMD Check locally, so some of those commits are single line changes or the like. I read about rebase but couldn't tell if it would make a mess so decided against for the time being.

@alexpghayes

This comment has been minimized.

Contributor

alexpghayes commented Jul 15, 2017

We can avoid some unnecessary copying of data by using tidyeval to call mutate once, rather than select followed by mutate_all followed by assignment. However, my best current shot at this gives a name ignored because a LHS was supplied warning, which I presume isn't ideal.

#' @importFrom dplyr select_vars tbl_vars mutate
#' @importFrom purrr map
#' @importFrom rlang expr sym
bake.step_relu <- function(object, newdata, ...) {
  col_names <- dplyr::select_vars(dplyr::tbl_vars(newdata), !!!object$terms)

  mut_exprs <- purrr::map(col_names, function(col) {
    new_col <- paste0(object$prefix, col)
    rlang::expr(
      !!rlang::sym(new_col) :=
        relu(!!rlang::sym(col), object$shift, object$reverse, object$smooth)
    )
  })

  suppressWarnings(dplyr::mutate(newdata, !!!mut_exprs))
}

For easy comparison, bake.step_relu without tidyeval:

#' @importFrom dplyr select mutate_all select_vars tbl_vars funs
bake.step_relu <- function(object, newdata, ...) {
  col_names <- dplyr::select_vars(dplyr::tbl_vars(newdata), !!!object$terms)
  new_cols <- paste0(object$prefix, col_names)

  newdata[, new_cols] <- dplyr::mutate_all(
    dplyr::select(newdata, col_names),
    dplyr::funs(relu),
    object$shift,
    object$reverse,
    object$smooth)
  as_tibble(newdata)
}
@topepo

This comment has been minimized.

Collaborator

topepo commented Jul 23, 2017

@lionel-, @hadley Do you have any thoughts on @alexpghayes's versions of bake.step_relu?

@lionel-

thanks for your contribution @alexpghayes

@topepo No particular opinion on the inclusion of relu. Is the package recipes going to include all transformation functions needed for any modelling task?

R/relu.R Outdated
#' The rectified linear transformation is used in the Multivariate Adaptive
#' Regression Splines as basis function to fit piecewise linear functions to
#' data in a strategy similar to that employeed in tree based models. The
#' transformation is a population choice as an activation function in many

This comment has been minimized.

@lionel-

lionel- Jul 24, 2017

Collaborator

popular

R/relu.R Outdated
if (!is.numeric(x))
stop("step_relu can only be applied to numeric data.", call. = FALSE)
if (reverse) shifted <- shift - x

This comment has been minimized.

@lionel-

lionel- Jul 24, 2017

Collaborator

The branches should at least be placed on their own line and ideally wrapped in curly brackets.

R/relu.R Outdated
x
}
#' @importFrom dplyr select mutate_all select_vars tbl_vars funs

This comment has been minimized.

@lionel-

lionel- Jul 24, 2017

Collaborator

No need to import these functions if you are calling them with a namespace-qualifier as you do below

@topepo

This comment has been minimized.

Collaborator

topepo commented Jul 24, 2017

@lionel- I was hoping to get feedback on this part:

We can avoid some unnecessary copying of data by using tidyeval to call mutate once, rather than select followed by mutate_all followed by assignment. However, my best current shot at this gives a name ignored because a LHS was supplied warning, which I presume isn't ideal.

@lionel-

This comment has been minimized.

Collaborator

lionel- commented Jul 24, 2017

oops sorry.

The data in question is just a list of pointers (a data frame) so the current implementation should be fine in this regard. The issue is that mutate_all() is notoriously slow for wide data frames. Cf this performance comparison with a simple map: https://github.com/hadley/dplyr/files/867620/colwise-dplyr-purrr-nogroups.pdf. So if you have more than 50 terms in the design df, that might become an issue.

@alexpghayes this expression is not valid:

rlang::expr(
  !! rlang::sym(new_col) :=
    relu(!! rlang::sym(col), object$shift, object$reverse, object$smooth)
)

That's because you're trying to assign a parameter name outside of a call. E.g. it's like you were trying to write nm = expr at the REPL. This would actually work but only because = is also an alias for the assignment operator <- when called at top-level as opposed to within a call. Since := is not treated as an alias to <- you cannot call it at top-level.

I would just assign the names the usual way:

make_relu_call <- function(col) {
  rlang::expr(relu(!! rlang::sym(col), object$shift, object$reverse, object$smooth))
})

names(col_names) <- paste0(object$prefix, col_names)
exprs <- purrr::map(col_names, make_relu_call)

The expression builder could also be:

make_relu_call <- function(col) {
  rlang::lang("relu", rlang::sym(col), object$shift, object$reverse, object$smooth)
})

In the latter case all parameters are inlined within the call without having to !! everything.

I suggest using #' @import rlang to avoid these rlang:: prefixes.

@lionel-

This comment has been minimized.

Collaborator

lionel- commented Jul 24, 2017

In the latter case all parameters are inlined within the call without having to !! everything.

This should also produce a more informative call stack when an error is raised. The relevant calls will include the actual parameters rather than object$param.

R/relu.R Outdated
#' @section Connection to MARS:
#'
#' The rectified linear transformation is used in the Multivariate Adaptive
#' Regression Splines as basis function to fit piecewise linear functions to

This comment has been minimized.

@alexpghayes

alexpghayes Jul 24, 2017

Contributor

as a

This comment has been minimized.

@topepo

topepo Jul 26, 2017

Collaborator

@alexpghayes We didn't get all of that =]

@alexpghayes

This comment has been minimized.

Contributor

alexpghayes commented Jul 27, 2017

Think this should do it.

Thanks @lionel- for the feedback - didn't know about passing named lists to mutate but glad I do now. Also appreciate the comment about expressions, that makes sense. I'm greatly enjoying learning more about tidyeval/rlang.

@topepo

This comment has been minimized.

Collaborator

topepo commented Jul 27, 2017

Thanks!

@topepo topepo merged commit d70d46a into tidymodels:master Jul 27, 2017

3 checks passed

codecov/patch 100% of diff hit (target 92.61%)
Details
codecov/project 92.73% (+0.12%) compared to 3edceec
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment