Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add step_relu #76

Merged
merged 12 commits into from Jul 27, 2017
Merged

add step_relu #76

merged 12 commits into from Jul 27, 2017

Conversation

@alexpghayes
Copy link
Contributor

@alexpghayes alexpghayes commented Jul 15, 2017

Saw #74 and realized it wasn't all that different from step_intercept so gave it a go.

Just saw the consistency tweaks to step_intercept and tried to use the same style here. Also tweaked the step_intercept tests following the changes to warnings/error messages.

Realizing I'm just submitting a PR out of the blue and not sure if this kind of thing helps or is a pain to manage. In general, are you open to PRs to address open issues? Is this the kind of thing where it's good to comment on the issue first to claim it, or something along those lines?

@codecov-io
Copy link

@codecov-io codecov-io commented Jul 15, 2017

Codecov Report

Merging #76 into master will increase coverage by 0.12%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #76      +/-   ##
==========================================
+ Coverage   92.61%   92.73%   +0.12%     
==========================================
  Files          44       45       +1     
  Lines        2923     2975      +52     
==========================================
+ Hits         2707     2759      +52     
  Misses        216      216
Impacted Files Coverage Δ
R/relu.R 100% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3edceec...74d70ae. Read the comment docs.

@topepo
Copy link
Member

@topepo topepo commented Jul 15, 2017

This is great; it's pretty much what I was going to put in. Good MARS writeup. I'm glad someone understood my rambling.

Can you make a few changes:

  • sync to the master branch; I changed a few minor things for the intercept code, including the tests that you have in this PR.
  • don't replace the existing values (and thus default the role to "predictor")
  • add a prefix option that defaults to "left_relu_" or "right_relu_" depending on the value of reverse? This is similar to step_pca to automatically naming the new columns.
  • add notes to contributors.md for your help

We need a print method too but I can take care of that.

@alexpghayes
Copy link
Contributor Author

@alexpghayes alexpghayes commented Jul 15, 2017

Okay, think that does it. Still having trouble running R CMD Check locally, so some of those commits are single line changes or the like. I read about rebase but couldn't tell if it would make a mess so decided against for the time being.

@alexpghayes
Copy link
Contributor Author

@alexpghayes alexpghayes commented Jul 15, 2017

We can avoid some unnecessary copying of data by using tidyeval to call mutate once, rather than select followed by mutate_all followed by assignment. However, my best current shot at this gives a name ignored because a LHS was supplied warning, which I presume isn't ideal.

#' @importFrom dplyr select_vars tbl_vars mutate
#' @importFrom purrr map
#' @importFrom rlang expr sym
bake.step_relu <- function(object, newdata, ...) {
  col_names <- dplyr::select_vars(dplyr::tbl_vars(newdata), !!!object$terms)

  mut_exprs <- purrr::map(col_names, function(col) {
    new_col <- paste0(object$prefix, col)
    rlang::expr(
      !!rlang::sym(new_col) :=
        relu(!!rlang::sym(col), object$shift, object$reverse, object$smooth)
    )
  })

  suppressWarnings(dplyr::mutate(newdata, !!!mut_exprs))
}

For easy comparison, bake.step_relu without tidyeval:

#' @importFrom dplyr select mutate_all select_vars tbl_vars funs
bake.step_relu <- function(object, newdata, ...) {
  col_names <- dplyr::select_vars(dplyr::tbl_vars(newdata), !!!object$terms)
  new_cols <- paste0(object$prefix, col_names)

  newdata[, new_cols] <- dplyr::mutate_all(
    dplyr::select(newdata, col_names),
    dplyr::funs(relu),
    object$shift,
    object$reverse,
    object$smooth)
  as_tibble(newdata)
}

@topepo
Copy link
Member

@topepo topepo commented Jul 23, 2017

@lionel-, @hadley Do you have any thoughts on @alexpghayes's versions of bake.step_relu?

Copy link
Collaborator

@lionel- lionel- left a comment

thanks for your contribution @alexpghayes

@topepo No particular opinion on the inclusion of relu. Is the package recipes going to include all transformation functions needed for any modelling task?

R/relu.R Outdated
#' The rectified linear transformation is used in the Multivariate Adaptive
#' Regression Splines as basis function to fit piecewise linear functions to
#' data in a strategy similar to that employeed in tree based models. The
#' transformation is a population choice as an activation function in many
Copy link
Collaborator

@lionel- lionel- Jul 24, 2017

popular

R/relu.R Outdated
if (!is.numeric(x))
stop("step_relu can only be applied to numeric data.", call. = FALSE)

if (reverse) shifted <- shift - x
Copy link
Collaborator

@lionel- lionel- Jul 24, 2017

The branches should at least be placed on their own line and ideally wrapped in curly brackets.

R/relu.R Outdated
x
}

#' @importFrom dplyr select mutate_all select_vars tbl_vars funs
Copy link
Collaborator

@lionel- lionel- Jul 24, 2017

No need to import these functions if you are calling them with a namespace-qualifier as you do below

@topepo
Copy link
Member

@topepo topepo commented Jul 24, 2017

@lionel- I was hoping to get feedback on this part:

We can avoid some unnecessary copying of data by using tidyeval to call mutate once, rather than select followed by mutate_all followed by assignment. However, my best current shot at this gives a name ignored because a LHS was supplied warning, which I presume isn't ideal.

@lionel-
Copy link
Collaborator

@lionel- lionel- commented Jul 24, 2017

oops sorry.

The data in question is just a list of pointers (a data frame) so the current implementation should be fine in this regard. The issue is that mutate_all() is notoriously slow for wide data frames. Cf this performance comparison with a simple map: https://github.com/hadley/dplyr/files/867620/colwise-dplyr-purrr-nogroups.pdf. So if you have more than 50 terms in the design df, that might become an issue.

@alexpghayes this expression is not valid:

rlang::expr(
  !! rlang::sym(new_col) :=
    relu(!! rlang::sym(col), object$shift, object$reverse, object$smooth)
)

That's because you're trying to assign a parameter name outside of a call. E.g. it's like you were trying to write nm = expr at the REPL. This would actually work but only because = is also an alias for the assignment operator <- when called at top-level as opposed to within a call. Since := is not treated as an alias to <- you cannot call it at top-level.

I would just assign the names the usual way:

make_relu_call <- function(col) {
  rlang::expr(relu(!! rlang::sym(col), object$shift, object$reverse, object$smooth))
})

names(col_names) <- paste0(object$prefix, col_names)
exprs <- purrr::map(col_names, make_relu_call)

The expression builder could also be:

make_relu_call <- function(col) {
  rlang::lang("relu", rlang::sym(col), object$shift, object$reverse, object$smooth)
})

In the latter case all parameters are inlined within the call without having to !! everything.

I suggest using #' @import rlang to avoid these rlang:: prefixes.

@lionel-
Copy link
Collaborator

@lionel- lionel- commented Jul 24, 2017

In the latter case all parameters are inlined within the call without having to !! everything.

This should also produce a more informative call stack when an error is raised. The relevant calls will include the actual parameters rather than object$param.

R/relu.R Outdated
#' @section Connection to MARS:
#'
#' The rectified linear transformation is used in the Multivariate Adaptive
#' Regression Splines as basis function to fit piecewise linear functions to
Copy link
Contributor Author

@alexpghayes alexpghayes Jul 24, 2017

as a

Copy link
Member

@topepo topepo Jul 26, 2017

@alexpghayes We didn't get all of that =]

@alexpghayes
Copy link
Contributor Author

@alexpghayes alexpghayes commented Jul 27, 2017

Think this should do it.

Thanks @lionel- for the feedback - didn't know about passing named lists to mutate but glad I do now. Also appreciate the comment about expressions, that makes sense. I'm greatly enjoying learning more about tidyeval/rlang.

@topepo
Copy link
Member

@topepo topepo commented Jul 27, 2017

Thanks!

@topepo topepo merged commit d70d46a into tidymodels:master Jul 27, 2017
3 checks passed
@github-actions
Copy link

@github-actions github-actions bot commented Feb 24, 2021

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Feb 24, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants