Add blog post on for vs apply #125
@Bisaloo I agree with the arguments, which not only apply for The only inconvenience I have found in other languages is that debugger support tended to be poorer than for other constructs of the language. For instance, debugging functional code in Java used to be a little harder, since the stack traces weren't that informative and the debugger sometimes had trouble diving into the lambdas/closures of the code, although those problems have been solved in recent years. I wonder if something equivalent happens in R. If so, it would be important to mention it in the article.
posts/for-vs-apply/index.qmd
title: "Why use `apply()` instead of `for` loops?"
subtitle: "Going beyond the debunked performance argument."
This was a really concise read and conveyed the message quite quickly.
My only comment is that the current title might undersell the article, as many people might have read articles with a similar title but different content. I would title it "Beyond the need for speed: Lesser-known reasons to prefer `apply()` functions over `for` loops" or something along those lines to draw out the content of the article. I was trying to play on words with "need for speed".
@Bisaloo - Would it be useful to touch upon where users should be cautious with
rincewind <- function(x) match.call()
rincewind(1L)
#> rincewind(x = 1L)
lapply(1L, rincewind)
#> [[1]]
#> FUN(x = X[[i]])
lapply(1L, function(x) rincewind(i))
#> [[1]]
#> rincewind(x = i)
This means
The other thing I wondered is whether it would be worth combining points (1) and (2) as they are both to do with "grokability" of the code. I could be persuaded either way here.
Small comment specific to the example in (1) - is it worth doing a different example where there's not a vectorised alternative (
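The `match.call()` output above suggests that anything relying on the unevaluated call can behave differently inside `lapply()`. A minimal sketch of one such case, assuming a hypothetical helper `arg_label()` (not part of the post) that labels its argument with `deparse(substitute())`:

```r
# Hypothetical helper: returns the expression its argument was supplied as
arg_label <- function(x) deparse(substitute(x))

value <- 1L

# Called directly, the label is the name used at the call site
direct <- arg_label(value)

# Called through lapply(), the helper only sees the call that lapply()
# builds internally (the FUN(X[[i]], ...) form shown above), so the
# original name is lost
via_lapply <- lapply(list(value), arg_label)[[1]]

print(direct)
print(via_lapply)
```

Functions that build labels, messages, or formulas from their call are the ones most likely to be affected.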
Thanks @Bisaloo for writing this, it's a comprehensive but concise read.
Perhaps some points that could be added since we are on the topic in this post already:
- Iterating over multiple list-like objects, as I find `mapply(x_list, y_list, FUN = fun)` easier to read with less visual noise than the equivalent for loop. Caveats about return types of course.
- Related to above, the case of nested for loops and nested functionals - I find the latter easier to parse as well.
- A bit more about using functional programming to replace difficult to read, error-prone or inefficient for loops; e.g. replacing loop based data merging with `Reduce()`, especially when intermediate products are also required. I find it's easy to make mistakes in indexing in such cases.
- A caveat on not getting carried away looking for a 'clever' functional-based solution where a for loop would suffice - I find myself spending extra time thinking about these and recommending them to others even when the case for them is weak.
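To make the first and third points concrete, here is a minimal sketch. The inputs (`x_list`, `y_list`, `fun`, `tables`) are made-up placeholders rather than examples from the post, and `SIMPLIFY = FALSE` is added to sidestep the return-type caveat mentioned above.

```r
# Hypothetical inputs: two lists of equal length and a two-argument function
x_list <- list(1, 2, 3)
y_list <- list(10, 20, 30)
fun <- function(x, y) x + y

# for loop: manual pre-allocation and indexing into both lists
res_loop <- vector("list", length(x_list))
for (i in seq_along(x_list)) {
  res_loop[[i]] <- fun(x_list[[i]], y_list[[i]])
}

# mapply(): same result; SIMPLIFY = FALSE always returns a list
res_mapply <- mapply(x_list, y_list, FUN = fun, SIMPLIFY = FALSE)

# Hypothetical tables to merge one after the other on a shared key
tables <- list(
  data.frame(id = 1:3, a = c(1, 2, 3)),
  data.frame(id = 1:3, b = c(4, 5, 6)),
  data.frame(id = 1:3, d = c(7, 8, 9))
)

# Reduce() with accumulate = TRUE keeps every intermediate merge,
# with no index bookkeeping
merged_steps <- Reduce(
  function(left, right) merge(left, right, by = "id"),
  tables,
  accumulate = TRUE
)
final <- merged_steps[[length(merged_steps)]]
```

Here `merged_steps[[2]]` holds the merge of the first two tables, which is the kind of intermediate product that needs careful indexing in a loop.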
Thanks all for the great comments!
@jpavlich
I considered adding a note about other languages but opted not to because I'm not familiar enough with them to make claims with assurance (except for Python). Would you like to propose some text for Java?
Yes, I think this is true in R, for reasons related to what @TimTaylor mentioned. This is already visible in error messages, even before using debugging features:
f <- function(x) {
  if (x == 10) {
    stop("x cannot be 10")
  }
  return(TRUE)
}
lapply(1:20, f)
#> Error in FUN(X[[i]], ...): x cannot be 10
Created on 2023-10-26 with reprex v2.0.2
@jamesmbaazam
Thanks, I like that idea! I am proposing to split it into title + subtitle to keep it short. I see two options:
title: "Beyond the need for speed: `apply()` vs for loops"
subtitle: "Lesser-known reasons to prefer apply() functions over for loops"
or
title: "Lesser-known reasons to prefer `apply()` over for loops"
subtitle: "Beyond the need for speed 🏎️"
Which one is better in your opinion?
@TimTaylor
This is a good point but I wonder if that wouldn't be too much info to add for a quite niche & advanced case. I would propose to keep it either as a footnote, or as a comment on a post after publication. How does this sound?
Yes, they both relate to grokability but I believe they are conceptually quite different. The first relates to the comprehension of programming concepts, while the second has to do with the practical aspect of mental load. I really want to drive the point home for both and believe they both get more importance as distinct categories.
I have split it into two examples in 46a4b53: one more realistic without a vectorized alternative, and just a quick note that lintr can suggest vectorized alternatives for some
@pratikunterwegs
Thanks for the note. I have added examples in 563b4ed to better illustrate the benefit of apply() in complex cases.
This is a good suggestion but outside the scope of this post IMO but
I disagree with this one, especially in the case of package development. While it may indeed take slightly longer sometimes (especially if one is not used to using these tools), the improvement in maintainability over the years quickly compensates for this initial investment.
A footnote sounds good and could refer to the One other thing I was thinking about was functions that may (or are more likely to) error. This can be important for longer running functions with lapply() being all or nothing. Things like
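The all-or-nothing behaviour can be sketched with the `f()` from the reprex earlier in the thread. The tools meant in the comment above are not visible here, so this uses base R `tryCatch()` as one possible option, with a hypothetical wrapper `safe_f()`:

```r
# Same function as in the reprex earlier in the thread
f <- function(x) {
  if (x == 10) {
    stop("x cannot be 10")
  }
  return(TRUE)
}

# Hypothetical wrapper: return the error object instead of aborting the run
safe_f <- function(x) tryCatch(f(x), error = function(e) e)

# All 20 elements are processed; a failure no longer discards the rest
results <- lapply(1:20, safe_f)

# Indices of the elements that failed, and their messages
failed <- which(vapply(results, inherits, logical(1), what = "error"))
messages <- vapply(results[failed], conditionMessage, character(1))
```

Keeping the error objects in the result list also recovers which input failed, something the `Error in FUN(X[[i]], ...)` message does not show.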
Hmmm I'm torn. I really like the first but it seems too long when you combine the title and subtitle. The second is more concise and makes me wonder if we need the subtitle.
and use an example without vectorized alternative
Co-authored-by: James Azam <jamesmbaazam@users.noreply.github.com>
This sounds like a great topic for a follow-up post 😉
Thanks for the suggestions! Following your feedback, I have removed the subtitle. Puns and references are nice but in this specific case, I was afraid it would give the impression that speed is indeed a component in the equation.
Fix #23
R
Right before merging:
- `date` field has been updated
- blueprints to link to this post
- `_freeze/` folder is up-to-date