For loops and conditions vs. apply and logical subsetting #250

Open
dmi3kno opened this issue Nov 23, 2016 · 10 comments

@dmi3kno dmi3kno commented Nov 23, 2016

I am of the strong opinion that introducing for loops and if/else statements early in the curriculum does more harm than good to R education. I understand that this is done out of a desire to keep the way Python and R are taught consistent, but I would argue that the approach to teaching (and using) the two languages should be different.

I would argue that for loops and if/else statements should be shelved under "Advanced" topics towards the end of the lesson, and that sections on the *apply family of functions and logical subsetting should be introduced instead. R positions itself as a vectorized language (even though under the hood it may be running highly optimized for loops in C), and the R programmer is encouraged to think in terms of vectors and data frames. Introducing non-vectorized constructs breaks that frame and sets R up for failure, because the argument on speed and efficiency is lost a priori.

If you agree, I volunteer to handpick material on the *apply functions from another lesson (I will not introduce any new concepts but rather repackage what is already in the very good Software Carpentry curriculum) and to rework logical subsetting (and, perhaps, mention the vectorized ifelse() function) to cover the need to teach conditional logic (branching) in R.

As I said, the loop and branching sections are good, but only as an advanced topic towards the end of the lesson material.
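
As a minimal sketch of the contrast being proposed here, the following compares a loop with if/else against the vectorized ifelse() and logical subsetting. The `temps` vector and the threshold of 30 are made up for illustration and do not come from the lesson material.

```r
# Hypothetical data: five temperature readings (values invented for illustration).
temps <- c(12.5, 31.0, 18.2, 35.4, 27.9)

# Loop-and-branch style: visit each element and decide with if/else.
labels_loop <- character(length(temps))
for (i in seq_along(temps)) {
  if (temps[i] > 30) {
    labels_loop[i] <- "hot"
  } else {
    labels_loop[i] <- "ok"
  }
}

# Vectorized style: ifelse() applies the condition to the whole vector at once.
labels_vec <- ifelse(temps > 30, "hot", "ok")

# Logical subsetting: keep only the elements where the condition is TRUE.
hot_temps <- temps[temps > 30]

# Both styles should produce the same labels.
identical(labels_loop, labels_vec)
```

Both versions compute the same labels; the vectorized form simply leaves the iteration to R.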

@RMHogervorst RMHogervorst commented Nov 25, 2016

I agree.

@chendaniely chendaniely commented Mar 27, 2017

I agree that Python and R are different languages with different domains and should be treated accordingly.

My only concern is that if we push for loops to the very end of the lesson, the entire lesson will need to be restructured.
The current lesson gets users to load data, create a plot, then automatically create multiple plots from multiple datasets. This requires a for loop (unless I am mistaken).

The rationale behind the current lesson order is that for loops and conditionals are fundamental in programming (in other languages). Since this is the "Programming with R" lesson and not a "Data Analysis in R" lesson, the entire lesson is focused more on programming concepts than on data analysis concepts.

I am more than happy to continue this discussion, as it does set a foundation for our students.

@RMHogervorst RMHogervorst commented Mar 28, 2017

I guess you could use an apply call instead of a for loop, but that is just a for loop in hiding. It does not really make a difference, yet it would introduce new functions, so it doesn't help in the Software Carpentry lessons.
I agree with the "programming with R" vs. "data analysis in R" rationale.
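
To make the "for loop in hiding" point concrete, here is a small sketch, assuming a hypothetical named list `datasets` of numeric vectors (invented for illustration). The explicit loop and sapply() perform the same iteration and return the same values.

```r
# Hypothetical input: three small numeric vectors standing in for datasets.
datasets <- list(a = c(1, 2, 3), b = c(10, 20), c = c(5, 5, 5, 5))

# Explicit for loop: pre-allocate the result, then fill it element by element.
means_loop <- numeric(length(datasets))
names(means_loop) <- names(datasets)
for (i in seq_along(datasets)) {
  means_loop[i] <- mean(datasets[[i]])
}

# sapply(): the same iteration, with the bookkeeping handled by the function.
means_apply <- sapply(datasets, mean)

# The two results should agree.
all.equal(means_loop, means_apply)
```

The difference is in what the learner has to write and read, not in what gets computed.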

@dmi3kno dmi3kno commented Mar 28, 2017

My argument, then, is that there's no such thing as a "beginner programmer in R". There's only a "beginner analyst in R". It is very rare that for loops need to be written, and they should be reserved for non-rectangular data types. For everything else R has an awesome functional programming toolbox in the base::*apply and purrr::map_* families, which (although they rely on for loops in C) emphasize the functional aspect and hide away the implementation details (which do more harm than good to beginners). This is a highly philosophical discussion, and I am ready to give in on changing the lesson if you guys confirm that you have taught R with for loops, tried introducing apply instead, and liked the former better.

@katrinleinweber katrinleinweber commented Feb 1, 2018

Hello! Having myself been enchanted by purrr::map() last year, how about a …-suppl-….Rmd that compares and contrasts it with base::apply?

@katrinleinweber katrinleinweber commented Mar 27, 2018

Related to #276, because both readr & purrr are in the tidyverse.

@dmi3kno dmi3kno commented Mar 27, 2018

R is evolving so fast that I no longer want to stand by base::apply(). It should be purrr::map() all the way. We tried teaching it in SWC Oslo and it works like a charm. Highly recommend watching Hadley's cupcake rant video: https://www.youtube.com/watch?v=GyNqlOjhPCQ

Also, there are plenty of resources for teaching purrr, not least by Jenny Bryan.

fmichonneau pushed a commit that referenced this issue Jun 19, 2018
Favicons and logos

@katrinleinweber katrinleinweber commented Jun 30, 2018

I got another comment offline about this and am absolutely convinced we should rewrite 03-loops-R.Rmd (and drop 15-supp-loops-in-depth.Rmd, or merge it into the former).

Contributions welcome! Some inspiration thanks to @jennybc: Thinking inside the box (45min webinar).

@CodeRThane CodeRThane commented Feb 2, 2021

I've only been coding for about 6 months now, but here are my 2 cents:
I think for loops (or the apply/map functions) should be taught early in R. My first R project after that involved working with over 100 CSV files. From the data carpentry lesson I had taken, I knew how to work with one CSV at a time, but I had to spend a lot of time on the internet to figure out how to use the apply function before I could make much progress on the project.

I guess what I'm trying to say is that for loops are an important functional tool that programmers need to have at their fingertips, and I think it should be taught early on.
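
For the many-CSV-files situation described above, a minimal base R sketch could look like the following. The directory name `csv_demo`, the file names, and the columns `id` and `value` are all hypothetical; the three small files are written to a temporary directory only so the example is self-contained.

```r
# Create a few small CSV files in a temporary directory (stand-ins for real data).
data_dir <- file.path(tempdir(), "csv_demo")
dir.create(data_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(id = i, value = i * 10),
            file.path(data_dir, paste0("file", i, ".csv")),
            row.names = FALSE)
}

# List every CSV file in the directory, then read each one into a list.
csv_files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(csv_files, read.csv)

# Stack the per-file data frames into a single data frame.
combined <- do.call(rbind, tables)
nrow(combined)
```

The same pattern scales from three files to a hundred, since only the contents of the directory change.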

@HaoZeke HaoZeke commented Feb 2, 2021

The argument made by @dmi3kno makes sense to me as well. @CodeRThane, the idiomatic R method is to use purrr::map() or apply instead of for loops as you noticed. I'd be happy to see some concrete PRs for this issue.
