This repository has been archived by the owner. It is now read-only.

Scared straight! The attach() version #94

Closed
jennybc opened this issue Sep 17, 2015 · 9 comments

Comments

@jennybc jennybc commented Sep 17, 2015

I tried to explain why I don't like to attach() data.frames in class today and I don't think I was very compelling. Yet I feel strongly that this practice, while tempting, ultimately causes more grief than good.

Seeking good links or scary stories to make this case more persuasive. Comment below!

@minisciencegirl minisciencegirl commented Sep 17, 2015

Google's R Style Guide strongly advises against attach().
http://google-styleguide.googlecode.com/svn/trunk/Rguide.xml#attach

Nicholas Horton actually wrote a blog post on how attaching can lead to mayhem:
http://sas-and-r.blogspot.ca/2011/05/to-attach-or-not-attach-that-is.html

Note from Jenny: the blog post above is from 2011, which speaks to the timeless quality of this debate, but also know that other attach()-avoidance strategies are available to us now.

@mikoontz mikoontz commented Sep 17, 2015

This is a brilliant idea!

My story isn't really about a particularly disastrous incident, but rather about the first ~3 months of learning R and how attach() made that process more difficult.

Like many people, I started to learn R by tinkering with code copied from internet tutorials and StackOverflow answers. Even when people 'properly' used attach() by pairing it with a detach(), that critical step was too easily lost in transcription to my code. My attach() calls would build up into a formidable attach() monster in my environment and, both in reading my code and executing it, I would find it impossible to keep up with which objects I was actually referencing.

I remember being really frustrated by bright red messages about 'masked objects' when working with data.frames that shared column names, even though the code was perfectly happy to run without error. So while there wasn't a disaster per se, I was definitely set back in learning a general R workflow by having to relearn how to appropriately access components of R objects such that I could understand the code as-written and the code as-executed simultaneously.

@cathyxijuan cathyxijuan commented Sep 17, 2015

attach() is especially annoying if you are dealing with two different datasets. Then you must keep track of which one you attached.

@dchiu911 dchiu911 commented Sep 17, 2015

attach() can mask objects and functions from packages, and its columns can in turn be masked by objects in the global environment, if you aren't careful with how you name things. A useful alternative is the with(object, ...) wrapper.
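A minimal sketch of the with() alternative, using the built-in mtcars data; the name cv_mpg is made up for illustration:

```r
# Evaluate an expression with the data frame's columns in scope,
# without adding anything to the search path.
cv_mpg <- with(mtcars, sd(mpg) / mean(mpg))

# Nothing was attached, so there is nothing to detach afterwards.
still_clean <- !("mtcars" %in% search())
```

The columns are visible only inside the with() call, so name lookups elsewhere in the script are unaffected.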

@mdscheuerell mdscheuerell commented Sep 17, 2015

I commonly hear from students that they don't like typing long names from data frames, and attach() lets them off the hook. But, given a data frame like

> names(plantGrowthExpt)
[1] "plantHeight"    "phosphorusConc" "nitrogenConc"   "wateringRegime"

those same students can never explain why something like

> attach(plantGrowthExpt)
> CV <- sqrt(var(plantHeight))/mean(plantHeight)
...
> detach(plantGrowthExpt)

is better than

> pGE <- plantGrowthExpt
> names(pGE) <- c("Ht","P","N","H2O")
> CV <- sqrt(var(pGE$Ht))/mean(pGE$Ht)
...

except to say that the new names are not as informative to the naive code reader. Sure, but a few commented lines of definitions at the top can cure that. And I point out that they could always do a global find-and-replace with more meaningful variable names once they are satisfied the code is ready.

Also imagine the undesirable case where, for example in time series analysis, someone has a data frame with the variable T to indicate "time" and then

> names(fishCounts)
[1] "T"     "site1" "site2"
> attach(fishCounts)
> times <- T # yikes! T now refers to the column, not TRUE
...
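To make the hazard concrete, here is a small sketch with made-up counts (only the column names come from the example above). While the data frame is attached, T no longer evaluates to TRUE:

```r
# Hypothetical data; only the column names follow the example above.
fishCounts <- data.frame(T = 1:3, site1 = c(5, 2, 7), site2 = c(0, 4, 1))

attach(fishCounts)                # R reports that T is masked from package:base
t_while_attached <- isTRUE(T)     # T is now the time column, not TRUE
detach(fishCounts)

t_after_detach <- isTRUE(T)       # base::T is visible again
```

Any code that uses T as shorthand for TRUE while fishCounts is attached would silently receive the time column instead.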

@jennybc jennybc commented Sep 17, 2015

Here's a Stack Overflow thread which I haven't read in its entirety:

http://stackoverflow.com/questions/10067680/why-is-it-not-advisable-to-use-attach-in-r-and-what-should-i-use-instead

@jennybc jennybc commented Sep 17, 2015

@kevinushey kevinushey commented Sep 17, 2015

It might lead people to think that just because an object is attached, it should be modifiable:

attach(mtcars)
mpg <- mpg + 1 # creates new mpg in GlobalEnv -- does not modify mtcars!
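A sketch that checks this behaviour and shows one attach-free way to get a modified copy. The names original_mpg, global_copy and cars_plus_one are made up, and transform() is just one of several base R options:

```r
original_mpg <- mtcars$mpg

attach(mtcars)
mpg <- mpg + 1                    # assigns a new object; mtcars is untouched
global_copy <- mpg
frame_unchanged <- identical(mtcars$mpg, original_mpg)
detach(mtcars)
rm(mpg)                           # tidy up the stray copy

# Explicit alternative: build a modified copy of the data frame.
cars_plus_one <- transform(mtcars, mpg = mpg + 1)
```

With transform() the change lives in a named data frame, so it is clear which object was modified.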

It 'leaks' out of the scope it's called in -- normally we try to ensure functions have no (non-explicit) side effects, e.g.

foo <- function() {
    attach(mtcars)
}
foo()
cyl ## still attached -- yuck

And errors prevent a detach from getting called (this could be mitigated by using on.exit but most students don't know about that):

foo <- function() {
    attach(mtcars)
    if (runif(1) < 0.01) stop() ## ouch, we 'leak' mtcars
    detach(mtcars)
}
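For reference, a minimal sketch of the on.exit mitigation mentioned above. The function name foo_safe and its fail argument are hypothetical, added only to trigger the error on demand:

```r
foo_safe <- function(fail = FALSE) {
    attach(mtcars)
    on.exit(detach(mtcars), add = TRUE)  # runs on normal return and on error
    if (fail) stop("simulated failure")
    mean(mpg)
}

res <- try(foo_safe(fail = TRUE), silent = TRUE)
leaked <- "mtcars" %in% search()         # mtcars was detached despite the error
```

This keeps the search path clean, but it still relies on every author remembering the on.exit() call.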

If you have a non-function variable that masks a function of the same name, you'll get that variable instead whenever the symbol is used outside a function call. You don't even get a masking warning:

foo <- function() {
    print(ls)
}
yikes <- data.frame(ls = 1)
attach(yikes)
foo() ## oops -- i wanted to see the def'n of base::ls

RStudio's diagnostics are not smart enough to understand it (really, a defect in RStudio, but ...)

[Screenshot, 2015-09-17 3:35 pm]

Given the number of base R functions that do non-standard evaluation, I bet 'clever' use of attach could break them.

Also, most students (especially beginners) don't understand (read?) the masking warnings.

@jimhester jimhester commented Sep 18, 2015

Another hazard is forgetting how many (and which) objects you have previously attach()'d and then repeatedly calling detach() with no arguments. It will keep detaching packages (including base packages like utils and graphics) until all that is left is base. Then you are stuck in a situation where ?, plot(), etc. will no longer work.

with(), subset(), within(), transform(), etc. are all better and safer options for doing non-standard evaluation using 'base' functions. I don't think I have ever used attach(), including interactively.
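A quick sketch of subset(), transform() and within() on the built-in mtcars data. The result names and the approximate mpg-to-km/L factor are illustrative assumptions:

```r
# Rows and columns selected by bare column names, no attach() needed.
heavy <- subset(mtcars, wt > 4, select = c(mpg, wt))

# New column computed from existing ones (0.425 is an approximate
# US mpg to km/L conversion factor).
cars_kpl <- transform(mtcars, kpl = mpg * 0.425)

# within() evaluates assignments inside the data frame and returns a copy.
cars_ratio <- within(mtcars, hp_per_cyl <- hp / cyl)
```

Each call returns a new data frame and leaves the search path alone.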

@jennybc jennybc closed this as completed Aug 30, 2016