start #1

Merged
merged 3 commits into from
Jul 24, 2023
47 changes: 47 additions & 0 deletions .github/workflows/render.yaml
@@ -0,0 +1,47 @@
# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples
# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
on:
  push:
    branches: [main, master]
  pull_request:
    branches: [main, master]
  workflow_dispatch:

name: render

jobs:
  render:
    runs-on: ubuntu-latest
    # Only restrict concurrency for non-PR jobs
    concurrency:
      group: render-${{ github.event_name != 'pull_request' || github.run_id }}
    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
      isExtPR: ${{ github.event.pull_request.head.repo.fork == true }}
    steps:

      - name: Configure Git user
        run: |
          git config --global user.name "$GITHUB_ACTOR"
          git config --global user.email "$GITHUB_ACTOR@users.noreply.github.com"

      - uses: actions/checkout@v3

      - uses: quarto-dev/quarto-actions/setup@v2

      - uses: r-lib/actions/setup-r@v2
        with:
          use-public-rspm: true

      - uses: r-lib/actions/setup-r-dependencies@v2

      - name: Render book
        run: quarto render

      - name: Deploy to GitHub Pages
        if: contains(env.isExtPR, 'false')
        id: gh-pages-deploy
        uses: JamesIves/github-pages-deploy-action@v4
        with:
          branch: gh-pages
          folder: _book
7 changes: 7 additions & 0 deletions .gitignore
@@ -0,0 +1,7 @@
.Rproj.user
.Rhistory
.RData
.Ruserdata

/.quarto/
_book
16 changes: 16 additions & 0 deletions Frustration-One-Year-With-R.Rproj
@@ -0,0 +1,16 @@
Version: 1.0

RestoreWorkspace: Default
SaveWorkspace: Default
AlwaysSaveHistory: Default

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8

RnwWeave: Sweave
LaTeX: pdfLaTeX

AutoAppendNewline: Yes
StripTrailingWhitespace: Yes
4,993 changes: 1 addition & 4,992 deletions README.md

Large diffs are not rendered by default.

14 changes: 14 additions & 0 deletions _quarto.yml
@@ -0,0 +1,14 @@
project:
  type: book
book:
  title: "Frustration: One Year With R"
  author: Reece Goding
  description: "What follows is an account of my experiences from about one year of roughly daily R usage."
  chapters:
    - index.qmd
    - general_feelings.qmd
    - what_r_does_right.qmd
    - what_r_does_wrong.qmd
    - the_tidyverse.qmd
    - conclusion.qmd
    - feedback.qmd
11 changes: 11 additions & 0 deletions conclusion.qmd
@@ -0,0 +1,11 @@


# Conclusion

If I were being generous, I would say that R teaches you some great lessons about functional programming while being a useful DSL and that its biggest fault is that it tries to do too much, ultimately becoming brutally inconsistent. I'd also say that the Tidyverse is a useful set of packages that, while unable to fix R and certainly not a panacea, do a lot to improve it within their specific domains. However, I'm not that generous.

The most damning thing about R is that much of [*The R Inferno*](https://www.burns-stat.com/pages/Tutor/R_inferno.pdf) still holds true. The fact that a nearly decade-old document that credibly compared R to a journey into Hell is still a useful reference manual speaks volumes about the language's attitude to change. To put it plainly, **R won't change**. If something about R frustrates you today, **it always will**. That's what kills the language for me. The popularity of the Tidyverse proves that R is broken and the continuing validity of [*The R Inferno*](https://www.burns-stat.com/pages/Tutor/R_inferno.pdf) proves that it will stay that way. You may be able to put a blanket of sanity on top of it, as the best packages try to, but you won't fix it. Unless you find said packages so useful that they make R worth it, I find it impossible to argue against jumping ship. My ultimate conclusion on R is that it's good, but doomed by the unshifting weight of the countless little problems that I've documented here. Personally, I'm going to give Python a shot and I wouldn't blame you for doing the same. Let's hope that I don't end up writing a document of this size complaining about that.

All that being said, I have no intention of uninstalling R or going out of my way to avoid it. I'd gladly use it professionally and I've learned enough of its semantic semtex to get really damn good at using R to do in few lines what other languages would do in many. I wasn't joking when I said that it's the best desktop calculator that I've ever used. But would I recommend learning it to anyone else? Absolutely not. We can do so much better.


32 changes: 32 additions & 0 deletions feedback.qmd
@@ -0,0 +1,32 @@


# Feedback

In late March 2022, this article suddenly exploded overnight. It was briefly [in the top 10 on Hacker News](https://twitter.com/HackerNewsTop10/status/1506221752191492103) and got about 20,000 views in one day. This came as quite a shock to me, given that I wasn't even done proofreading yet. I've read through as much of the online commentary on this article as I can find. The comments on [the Hacker News page](https://news.ycombinator.com/item?id=30764505) are by far the most in-depth. You can find a fair bit on Twitter and Reddit as well, but you'd have to go looking. I found that Twitter had the most positive reception, Reddit was more negative and Hacker News was mixed.

In April, I finished the proofreading and [sent this off to R-devel](https://stat.ethz.ch/pipermail/r-devel/2022-April/081605.html). I am very thankful for their comments and their sincere efforts to help me. I hope that my replies didn't come off as half-hearted. I was simply unequipped to handle the mass of feedback.

I won't single out any particular commenters, but there are some ideas and trends that I feel are worth addressing:

- From the negative feedback, I can't help but wonder if some of my examples were too trivial or too petty. This document would have been a lot easier to read and write if I only mentioned big issues. I've made some minor edits to address this, but it's hard to judge. A master would never make some of the mistakes that my [subsetting](#subsetting) section warns against, but does that mean I shouldn't even mention those issues?

- I find it interesting to note what hasn't been criticised. The following examples stand out to me:

  - Although many said they found my '`"es"` in `"test"`' challenge a bit too easy (I think they missed my point), I could only find one person who made any attempt at my [`mapply()` challenge](#mapply-challenge).
  - If my complaints about R not telling you what [the dangers of non-standard evaluation](#non-standard-evaluation) are had simple resolutions (e.g. a hyperlink), then I'd expect someone to have provided them. I've yet to see any feedback on this topic. The same is true of what I've said of [generic functions](#generic-functions-again).
  - Despite some fair criticism of my section on [subsetting](#subsetting), I don't think that anyone mentioned any disagreements with what I've said about [vector recycling](#vectorization-again).
  - I don't think that anyone disagreed with what I said about R's [documentation and error messages](#r-wont-help-you).

  It could just be luck that these parts weren't mentioned anywhere that I found, but you'll forgive me for concluding that the lack of criticism implies my points were very strong.

- A very common objection was that my [Ignorance section](#ignorance) invalidates much of my commentary. Of course, said ignorance makes me unable to know if they're right or not. The two most common criticisms were that my lack of expertise in the Tidyverse and/or `data.table` means that I've got nothing worthwhile to say and that using R as a programming language rather than a statistics tool is fundamentally wrong. Both of these criticisms are partly correct. Using R for interactive data analysis is very different from trying to program with it, so such users simply won't encounter many of the issues that I've mentioned. Similarly, swapping base R for the Tidyverse automatically nullifies many of my complaints. You can even go through my table of contents and cross sections off. `dplyr` and its tibble-focus already knock off most of my complaints about base R's variable manipulations, data types, subsetting rules, and vector rules. Don't get me wrong, the Tidyverse has its own problems. For example, I'd hate to develop anything reliant on the Tidyverse's unstable API. However, if you're doing a run-once piece of analysis, then it's probably great. It's just a shame to see so much of R replaced by its packages.

  - December 2022 update: I'm further into my career as a software developer than I was when I wrote this and I'm starting to put more and more weight on the above point. Every R developer I've encountered professionally has been an exclusive user of the Tidyverse. I remember one person whose only memory of her R training was that you "*need to type `library(tidyverse)` before anything works*". I have a growing suspicion that using R for anything other than the Tidyverse is steadily -- and perhaps even correctly -- becoming seen as simply wrong. It's unfortunate that I sincerely love many parts of it.

- I've perhaps undersold just how good R can be at what it's specialised for. [This chain of Hacker News comments](https://news.ycombinator.com/item?id=30765409) seems to get across something that I haven't. I've certainly said that R is a large mathematics and statistics tool that is easy to extend and has clear Scheme inspiration, but the sum of those comments seems to say it better. As for the idea that R is a ['Worse is Better'](https://www.dreamsongs.com/RiseOfWorseIsBetter.html) language, I find it appealing but I don't feel qualified to judge. If anything was 'Worse is Better', then it was probably S (which would make R "almost the right thing", in that essay's terms). However, I'm not historically knowledgeable enough to know key factors like how simple S's early implementations were. I hear that it was very easy to get running on Unix?

- I never made it clear that I understand why backwards compatibility is a priority for R. For example, R code appears in a lot of science papers and you don't want such code to become unrunnable or to change meaning.
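
For readers who have not met the behaviours referred to in the list above, here is a minimal base R sketch of vector recycling and of single-column subsetting. It is an illustration only and is not taken from the linked sections; the variable names (`recycled`, `uneven`, `df`) are made up for the example.

```r
# Base R only; no packages are assumed.

# Recycling: the shorter vector is silently repeated to match the longer one,
# so this computes 1+1, 2+2, 3+1, 4+2, 5+1, 6+2.
recycled <- 1:6 + 1:2
print(recycled)

# When the lengths do not divide evenly, R still recycles but emits a warning.
# Here the warning is caught and its message kept instead of the result.
uneven <- tryCatch(1:5 + 1:2, warning = function(w) conditionMessage(w))
print(uneven)

# Subsetting: selecting a single column of a data frame drops to a plain
# vector unless drop = FALSE is given.
df <- data.frame(a = 1:3, b = letters[1:3])
dropped <- df[, 1]
kept <- df[, 1, drop = FALSE]
print(is.data.frame(dropped))
print(is.data.frame(kept))
```

Whether these defaults count as conveniences or traps probably depends on whether you are using R interactively or programming with it, which is the distinction drawn in the list above.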

As a final point, making the changes to this document to reflect the changes coming in what I presume to be R version 4.1.4 has forced me to question my points about R being unable to change. I've not changed my mind yet, but time will tell. They certainly prove that R can change, but I think the real issue might be that it can't *fundamentally* change.


14 changes: 14 additions & 0 deletions general_feelings.qmd
@@ -0,0 +1,14 @@


# General Feelings

My overall feelings about R are tough to quantify. As I mentioned near the start, its ultimate problem is the sum of its little problems. However, if I must speak generally, then I think that the problem with R is that it's always some mix of the following:

1. A statistics language with countless useful libraries and an excellent collection of mathematical tools.
2. A Scheme-inspired language that tries to be functional while maintaining a C-like syntax.
3. Decades of haphazard patches for S.
4. A collection of [semantic semtex](https://wiki.c2.com/?SemanticSemtex) that is powerful in the hands of a master and crippling in the hands of a novice.

When it's anything but #3, R is great. Statisticians and mathematicians love it for #1 and programmers love it for #2 and #4. If it weren't for #3, R would be an amazing -- albeit, domain-specific -- language, but #3 is such a big factor that it makes the language unpredictable, inconsistent, and infuriating. Mixed with #4, it makes being an R novice hellish. It gives me little doubt that R is not the ideal tool for many of the jobs that it wants to do, but #1 and #2 leave me with equally little doubt that R can be a very good tool.

