tlevine committed Jul 10, 2013
1 parent dbf40cf commit b4d8e5b
Showing 8 changed files with 646 additions and 32 deletions.
3 changes: 3 additions & 0 deletions !/about/index.html
@@ -152,6 +152,9 @@ <h2 id="other-things">Other things</h2>
<p><a href="http://www.guardian.co.uk/profile/nicola-hughes">Nicola Hughes</a> <!-- notmuch show thread:00000000000215a7 --></p> <p><a href="http://www.guardian.co.uk/profile/nicola-hughes">Nicola Hughes</a> <!-- notmuch show thread:00000000000215a7 --></p>
</blockquote> </blockquote>


<p>He is <a href="http://osrc.dfm.io/tlevine">supposedly</a>
“one of the top 3% most active JavaScript users” on GitHub.</p>

<p>He has superpowers.</p> <p>He has superpowers.</p>


<blockquote> <blockquote>
167 changes: 150 additions & 17 deletions !/feed.xml
@@ -2,48 +2,135 @@
<feed xmlns="http://www.w3.org/2005/Atom"> <feed xmlns="http://www.w3.org/2005/Atom">
<id>http://www.thomaslevine.com/</id> <id>http://www.thomaslevine.com/</id>
<title>Thomas Levine</title> <title>Thomas Levine</title>
<updated>2013-07-26T07:00:00Z</updated> <updated>2013-07-10T07:00:00Z</updated>
<link rel="alternate" href="http://www.thomaslevine.com/"/> <link rel="alternate" href="http://www.thomaslevine.com/"/>
<link rel="self" href="http://www.thomaslevine.com/!/feed.xml"/> <link rel="self" href="http://www.thomaslevine.com/!/feed.xml"/>
<author> <author>
<name>Thomas Levine</name> <name>Thomas Levine</name>
<uri>http://www.thomaslevine.com</uri> <uri>http://www.thomaslevine.com</uri>
</author> </author>
<entry> <entry>
<id>tag:www.thomaslevine.com,2013-07-26:/!/magic-r-commands/index.html</id> <id>tag:www.thomaslevine.com,2013-07-10:/!/r-spells-for-data-wizards/index.html</id>
<title type="html">Magic R commands</title> <title type="html">R spells for data wizards</title>
<published>2013-07-26T07:00:00Z</published> <published>2013-07-10T07:00:00Z</published>
<updated>2013-07-26T07:00:00Z</updated> <updated>2013-07-10T07:00:00Z</updated>
<link rel="alternate" href="http://www.thomaslevine.com/!/magic-r-commands/index.html"/> <link rel="alternate" href="http://www.thomaslevine.com/!/r-spells-for-data-wizards/index.html"/>
<content type="html">&lt;p&gt;I’ve never come up with a good way for learning/teaching the cool parts of R. <content type="html">&lt;p&gt;I’ve never come up with a good way for learning/teaching the cool parts of R.
I guess that’s just how R is; it’s all a hack. Anyway, here are some magic I feel like that’s sort of how R is; there’s an awesome way to do everything,
incantations.&lt;/p&gt; but it’s all very specific and kind of hacky.&lt;/p&gt;


&lt;h2 id="general-stuff"&gt;General stuff&lt;/h2&gt; &lt;p&gt;I thought of some magic incantations that you might not find in
introductory R books/documentation/classes and wrote about them below.&lt;/p&gt;

&lt;h2 id="csv"&gt;CSV&lt;/h2&gt;
&lt;p&gt;When loading a CSV, don’t convert strings to factors.&lt;/p&gt; &lt;p&gt;When loading a CSV, don’t convert strings to factors.&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;read.csv('csvsoundsystem.com/soundsystem.csv', stringsAsFactors = F) &lt;pre&gt;&lt;code&gt;read.csv('csvsoundsystem.com/soundsystem.csv', stringsAsFactors = F)
&lt;/code&gt;&lt;/pre&gt; &lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Use ProjectTemplate.&lt;/p&gt; &lt;p&gt;When writing a CSV, don’t add the rownames.&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;library(ProjectTemplate) &lt;pre&gt;&lt;code&gt;write.csv(iris, file = 'iris.csv', row.names = F)
&lt;/code&gt;&lt;/pre&gt; &lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;&lt;code&gt;sqldf&lt;/code&gt; works on R data.frames and on other databases&lt;/p&gt; &lt;h2 id="indexing"&gt;Indexing&lt;/h2&gt;
&lt;p&gt;It’s easy to miss a level of indexing, especially with lists.&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;sqldf('SELECT foo FROM bar') # Use the bar data.frame &lt;pre&gt;&lt;code&gt;str(list(a = 3)[1][[1]])
sqldf('SELECT foo FROM bar', dbname = 'baz.db') # Use the baz.db SQLite database # num 3

str(list(a = 3)[1])
# List of 1
# $ a: num 3

str(list(a = 3))
# List of 1
# $ a: num 3
&lt;/code&gt;&lt;/pre&gt; &lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;&lt;code&gt;mapply&lt;/code&gt; maps along a matrix, passing multiple arguments to the function&lt;/p&gt; &lt;p&gt;You can use character vectors as indices.&lt;/p&gt;


&lt;p&gt;Show all factor levels in a ggplot&lt;/p&gt; &lt;pre&gt;&lt;code&gt;row.names(HairEyeColor)
# [1] "Black" "Brown" "Red" "Blond"

row.names(HairEyeColor) &amp;lt;- c('Pink', 'Blue', 'Green', 'Clear')
HairEyeColor['Pink',,]
# Sex
# Eye Male Female
# Brown 32 36
# Blue 11 9
# Hazel 10 5
# Green 3 2

HairEyeColor[,,'Male']
# Eye
# Hair Brown Blue Hazel Green
# Pink 32 11 10 3
# Blue 53 50 25 15
# Green 10 10 7 7
# Clear 3 30 5 8
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id="factors"&gt;Factors&lt;/h2&gt;
&lt;p&gt;Factor levels are sorted by default: numerically for numbers, alphabetically for strings.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;levels(factor(10:1))
# [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If you want to change that, just create a new factor,
specifying the level order manually.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;factor(parking$GarOrLot, levels = c('G', 'L'))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And you rename a level or levels like so.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;levels(OrchardSprays$treatment)[3:5] &amp;lt;- c('X', 'Y', 'Z')
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id="concatenating-text"&gt;Concatenating text&lt;/h2&gt;
&lt;p&gt;This is how you concatenate text.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;paste('abc', 'def', sep = '')
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In JavaScript, this would be &lt;code&gt;'abc' + 'def'&lt;/code&gt;. Sort of.
R’s &lt;code&gt;paste&lt;/code&gt; is more powerful because it supports vectors!
If you pass it vectors, &lt;code&gt;paste&lt;/code&gt; will ordinarily concatenate corresponding elements
across the vectors.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;paste(c('a','b','c'), 1:3)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If you want to concatenate the elements within a vector,
use &lt;code&gt;collapse&lt;/code&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;paste(c('Pack', 'my', 'box', 'with', 'five', 'dozen', 'liquor', 'jugs.'), collapse = ' ')
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In case that isn’t clear, it would look like this in JavaScript:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;['Pack', 'my', 'box', 'with', 'five', 'dozen', 'liquor', 'jugs.'].join(' ')
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id="plotting"&gt;Plotting&lt;/h2&gt;
&lt;p&gt;Show all factor levels in a ggplot.&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;ggplot(iris[1:50,]) + aes(x = Species, y = Sepal.Length) + &lt;pre&gt;&lt;code&gt;ggplot(iris[1:50,]) + aes(x = Species, y = Sepal.Length) +
scale_x_discrete('Species', drop = F) + geom_point() scale_x_discrete('Species', drop = F) + geom_jitter()
&lt;/code&gt;&lt;/pre&gt; &lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Also, in general, &lt;strong&gt;use ggplot&lt;/strong&gt;. Base R graphics are
&lt;a href="http://www.livestream.com/knerd/video?clipId=pla_a5d59285-9399-47dc-aaef-2b9a77142d5e"&gt;more work than they’re worth&lt;/a&gt;,
except maybe if you’re
&lt;a href="http://www.youtube.com/watch?v=rLZDvXPIDa0"&gt;making&lt;/a&gt;
&lt;a href="http://fms.csvsoundsystem.com"&gt;music&lt;/a&gt;
&lt;a href="http://www.youtube.com/watch?v=tcnoBL0tvpc"&gt;videos&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;That said, if you do use base R graphics, try &lt;code&gt;locator&lt;/code&gt;
when you’re perfecting the layout.&lt;/p&gt;

&lt;h2 id="maintenance"&gt;Maintenance&lt;/h2&gt; &lt;h2 id="maintenance"&gt;Maintenance&lt;/h2&gt;
&lt;p&gt;Update your packages.&lt;/p&gt; &lt;p&gt;Update your packages.&lt;/p&gt;


@@ -76,6 +163,52 @@ options(continue="+ ")
&lt;pre&gt;&lt;code&gt;Sys.setenv(R_HISTSIZE='100000') &lt;pre&gt;&lt;code&gt;Sys.setenv(R_HISTSIZE='100000')
sink(file = paste('~/.history/r-log-', strftime(Sys.time(), '%F %H:%M:%OS9'), '-', sep = ''), split=T) sink(file = paste('~/.history/r-log-', strftime(Sys.time(), '%F %H:%M:%OS9'), '-', sep = ''), split=T)
&lt;/code&gt;&lt;/pre&gt; &lt;/code&gt;&lt;/pre&gt;

&lt;h2 id="higher-order-functions"&gt;Higher-order functions&lt;/h2&gt;
&lt;p&gt;R’s “apply” functions would be called “maps” in other languages.
If you’re applying along a list or vector, &lt;code&gt;lapply&lt;/code&gt; (which always returns a list) or &lt;code&gt;sapply&lt;/code&gt; (which simplifies the result to a vector or matrix when it can) is convenient.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;apply&lt;/code&gt; maps along any dimension of an array; you specify the dimension as an argument.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;mapply&lt;/code&gt; maps over several vectors or lists in parallel, passing the corresponding elements as arguments to the function.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;rollapply&lt;/code&gt; is really cool. It applies a function with a rolling window.
For example, here’s a rolling z-score that &lt;a href="http://brianabelson.com"&gt;Brian&lt;/a&gt; wrote.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;library(zoo)

roll_z &amp;lt;- function(x){
scores &amp;lt;- (x - mean(x)) / sd(x) # inline z-scores; z() is not defined in base R or zoo
scores[length(x)]
}

z_change &amp;lt;- rollapply(rnorm(1000), 40, roll_z)
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id="other-stuff"&gt;Other stuff&lt;/h2&gt;
&lt;p&gt;Use ProjectTemplate.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;library(ProjectTemplate)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Use &lt;code&gt;str&lt;/code&gt; to find out something’s type.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;str(ChickWeight)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;sqldf&lt;/code&gt; works both on R data.frames and on other databases&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;sqldf('SELECT foo FROM bar') # Use the bar data.frame
sqldf('SELECT foo FROM bar', dbname = 'baz.db') # Use the baz.db SQLite database
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Use &lt;code&gt;download.file&lt;/code&gt; to download files.&lt;/p&gt;

&lt;p&gt;Sort one thing by another thing.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;iris[order(iris$Sepal.Length),]
cars$speed[order(cars$dist)]
&lt;/code&gt;&lt;/pre&gt;
</content> </content>
</entry> </entry>
<entry> <entry>
12 changes: 10 additions & 2 deletions !/index.html
@@ -76,13 +76,21 @@
</nav> </nav>
<header class="title-card"> <header class="title-card">
<h1> <h1>
<a href="socrata-summary/">Analyze all the datasets</a> <a href="r-spells-for-data-wizards/">R spells for data wizards</a>
</h1> </h1>
<div class="date"> <div class="date">
July 07, 2013 July 10, 2013
</div> </div>
</header> </header>
<div class="clearfix" id="links"> <div class="clearfix" id="links">
<div class="link">
<strong>
<a href="socrata-summary/">Analyze all the datasets</a>
</strong>
<footer>
Jul 07, 2013
</footer>
</div>
<div class="link"> <div class="link">
<strong> <strong>
<a href="ear-muffs/">Earmuffs</a> <a href="ear-muffs/">Earmuffs</a>
106 changes: 95 additions & 11 deletions !/magic-r-commands/index.html
@@ -76,33 +76,85 @@ <h1>
Magic R commands Magic R commands
</h1> </h1>
<div class='date'> <div class='date'>
July 26, 2013 July 10, 2013
</div> </div>
</header> </header>
<div id='article-wrapper'> <div id='article-wrapper'>
<article> <article>
<p>I’ve never come up with a good way for learning/teaching the cool parts of R. <p>I’ve never come up with a good way for learning/teaching the cool parts of R.
I guess that’s just how R is; it’s all a hack. Anyway, here are some magic I feel like that’s sort of how R is; there’s an awesome way to do everything,
incantations.</p> but it’s all very specific and hacky.</p>


<h2 id="general-stuff">General stuff</h2> <p>I tried to think of some magic incantations that you might not find in introductory
R books/documentation/classes.</p>

<h2 id="csv">CSV</h2>
<p>When loading a CSV, don’t convert strings to factors.</p> <p>When loading a CSV, don’t convert strings to factors.</p>


<pre><code>read.csv('csvsoundsystem.com/soundsystem.csv', stringsAsFactors = F)&#x000A;</code></pre> <pre><code>read.csv('csvsoundsystem.com/soundsystem.csv', stringsAsFactors = F)&#x000A;</code></pre>


<p>Use ProjectTemplate.</p> <p>When writing a CSV, don’t add the rownames.</p>


<pre><code>library(ProjectTemplate)&#x000A;</code></pre> <pre><code>write.csv(iris, file = 'iris.csv', row.names = F)&#x000A;</code></pre>


<p><code>sqldf</code> works on R data.frames and on other databases</p> <h2 id="indexing">Indexing</h2>
<p>It’s easy to miss a level of indexing, especially with lists.</p>


<pre><code>sqldf('SELECT foo FROM bar') # Use the bar data.frame&#x000A;sqldf('SELECT foo FROM bar', dbname = 'baz.db') # Use the baz.db SQLite database&#x000A;</code></pre> <pre><code>str(list(a = 3)[1][[1]])&#x000A;# num 3&#x000A;&#x000A;str(list(a = 3)[1])&#x000A;# List of 1&#x000A;# $ a: num 3&#x000A;&#x000A;str(list(a = 3))&#x000A;# List of 1&#x000A;# $ a: num 3&#x000A;</code></pre>


<p><code>mapply</code> maps along a matrix, passing multiple arguments to the function</p> <p>You can use character vectors as indices.</p>

<pre><code>row.names(HairEyeColor)&#x000A;# [1] "Black" "Brown" "Red" "Blond"&#x000A;&#x000A;row.names(HairEyeColor) &lt;- c('Pink', 'Blue', 'Green', 'Clear')&#x000A;HairEyeColor['Pink',,]&#x000A;# Sex&#x000A;# Eye Male Female&#x000A;# Brown 32 36&#x000A;# Blue 11 9&#x000A;# Hazel 10 5&#x000A;# Green 3 2&#x000A;&#x000A;HairEyeColor[,,'Male']&#x000A;# Eye&#x000A;# Hair Brown Blue Hazel Green&#x000A;# Pink 32 11 10 3&#x000A;# Blue 53 50 25 15&#x000A;# Green 10 10 7 7&#x000A;# Clear 3 30 5 8&#x000A;</code></pre>

<h2 id="factors">Factors</h2>
<p>Factor levels are sorted by default: numerically for numbers, alphabetically for strings.</p>

<pre><code>levels(factor(10:1))&#x000A;# [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"&#x000A;</code></pre>

<p>If you want to change that, just create a new factor,
specifying the level order manually.</p>

<pre><code>factor(parking$GarOrLot, levels = c('G', 'L'))&#x000A;</code></pre>
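<p>Here is a self-contained sketch of the same idea. The <code>sizes</code> vector is made up for illustration; it just stands in for a column like the one above.</p>

```r
# Hypothetical data standing in for a real column
sizes <- c('medium', 'small', 'large', 'small')

# Default: levels are sorted, so strings come out alphabetically
default_levels <- levels(factor(sizes))

# Explicit order: pass the levels in the order you want them
ordered_levels <- levels(factor(sizes, levels = c('small', 'medium', 'large')))

print(default_levels)
print(ordered_levels)
```

<p>The explicit order is what plotting and modelling functions will then use.</p>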

<p>And you rename a level or levels like so.</p>

<pre><code>levels(OrchardSprays$treatment)[3:5] &lt;- c('X', 'Y', 'Z')&#x000A;</code></pre>

<h2 id="concatenating-text">Concatenating text</h2>
<p>This is how you concatenate text.</p>

<pre><code>paste('abc', 'def', sep = '')&#x000A;</code></pre>

<p>In JavaScript, this would be <code>'abc' + 'def'</code>. Sort of.
R’s <code>paste</code> is more powerful because it supports vectors!
If you pass it vectors, <code>paste</code> will ordinarily concatenate corresponding elements
across the vectors.</p>

<pre><code>paste(c('a','b','c'), 1:3)&#x000A;</code></pre>

<p>If you want to concatenate the elements within a vector,
use <code>collapse</code></p>

<pre><code>paste(c('Pack', 'my', 'box', 'with', 'five', 'dozen', 'liquor', 'jugs.'), collapse = ' ')&#x000A;</code></pre>

<p>In case that isn’t clear, it would look like this in JavaScript:</p>

<pre><code>['Pack', 'my', 'box', 'with', 'five', 'dozen', 'liquor', 'jugs.'].join(' ')&#x000A;</code></pre>
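<p>The two arguments can be combined. As a small sketch (the variable names are only for illustration), <code>sep</code> joins corresponding elements first, and <code>collapse</code> then joins the results into one string.</p>

```r
# sep joins corresponding elements across the vectors
pairs <- paste(c('a', 'b', 'c'), 1:3, sep = '-')
print(pairs)       # "a-1" "b-2" "c-3"

# collapse then joins those results into a single string
one_string <- paste(c('a', 'b', 'c'), 1:3, sep = '-', collapse = ', ')
print(one_string)  # "a-1, b-2, c-3"
```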


<p>Show all factor levels in a ggplot</p> <h2 id="plotting">Plotting</h2>
<p>Show all factor levels in a ggplot.</p>


<pre><code>ggplot(iris[1:50,]) + aes(x = Species, y = Sepal.Length) +&#x000A; scale_x_discrete('Species', drop = F) + geom_point()&#x000A;</code></pre> <pre><code>ggplot(iris[1:50,]) + aes(x = Species, y = Sepal.Length) +&#x000A; scale_x_discrete('Species', drop = F) + geom_jitter()&#x000A;</code></pre>

<p>Also, in general, <strong>use ggplot</strong>. Base R graphics are
<a href="http://www.livestream.com/knerd/video?clipId=pla_a5d59285-9399-47dc-aaef-2b9a77142d5e">more work than they’re worth</a>,
except maybe if you’re
<a href="http://www.youtube.com/watch?v=rLZDvXPIDa0">making</a>
<a href="http://fms.csvsoundsystem.com">music</a>
<a href="http://www.youtube.com/watch?v=tcnoBL0tvpc">videos</a>.</p>

<p>That said, if you do use base R graphics, try <code>locator</code>
when you’re perfecting the layout.</p>


<h2 id="maintenance">Maintenance</h2> <h2 id="maintenance">Maintenance</h2>
<p>Update your packages.</p> <p>Update your packages.</p>
@@ -126,6 +178,38 @@ <h2 id="rprofile">.Rprofile</h2>
<p>Save your command history and output</p> <p>Save your command history and output</p>


<pre><code>Sys.setenv(R_HISTSIZE='100000')&#x000A;sink(file = paste('~/.history/r-log-', strftime(Sys.time(), '%F %H:%M:%OS9'), '-', sep = ''), split=T)&#x000A;</code></pre> <pre><code>Sys.setenv(R_HISTSIZE='100000')&#x000A;sink(file = paste('~/.history/r-log-', strftime(Sys.time(), '%F %H:%M:%OS9'), '-', sep = ''), split=T)&#x000A;</code></pre>

<h2 id="higher-order-functions">Higher-order functions</h2>
<p>R’s “apply” functions would be called “maps” in other languages.
If you’re applying along a list or vector, <code>lapply</code> (which always returns a list) or <code>sapply</code> (which simplifies the result to a vector or matrix when it can) is convenient.</p>

<p><code>apply</code> maps along any dimension of an array; you specify the dimension as an argument.</p>
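<p>A minimal sketch of the three functions so far, using a made-up vector and matrix. The second argument to <code>apply</code> picks the dimension: 1 for rows, 2 for columns.</p>

```r
# lapply always returns a list
squares_list <- lapply(1:3, function(i) i^2)

# sapply simplifies the same result to a vector
squares_vec <- sapply(1:3, function(i) i^2)

# A 2 x 3 matrix, filled column by column: rows are (1, 3, 5) and (2, 4, 6)
m <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)  # one value per row
col_sums <- apply(m, 2, sum)  # one value per column

str(squares_list)
print(squares_vec)
print(row_sums)
print(col_sums)
```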

<p><code>mapply</code> maps over several vectors or lists in parallel, passing the corresponding elements as arguments to the function.</p>
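<p>For instance, <code>mapply</code> can walk two vectors in step. This is a sketch with invented <code>widths</code> and <code>heights</code> vectors; the function receives one element from each on every call.</p>

```r
# Two hypothetical vectors of the same length
widths <- c(1, 2, 3)
heights <- c(10, 20, 30)

# The function is called with (1, 10), then (2, 20), then (3, 30)
areas <- mapply(function(w, h) w * h, widths, heights)
print(areas)
```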

<p><code>rollapply</code> is really cool. It applies a function with a rolling window.
For example, here’s a rolling z-score that <a href="http://brianabelson.com">Brian</a> wrote.</p>

<pre><code>library(zoo)&#x000A;&#x000A;roll_z &lt;- function(x){&#x000A; # inline z-scores; z() is not defined in base R or zoo&#x000A; scores &lt;- (x - mean(x)) / sd(x)&#x000A; scores[length(x)]&#x000A;}&#x000A;&#x000A;z_change &lt;- rollapply(rnorm(1000), 40, roll_z)&#x000A;</code></pre>

<h2 id="other-stuff">Other stuff</h2>
<p>Use ProjectTemplate.</p>

<pre><code>library(ProjectTemplate)&#x000A;</code></pre>

<p>Use <code>str</code> to find out something’s type.</p>

<pre><code>str(ChickWeight)&#x000A;</code></pre>

<p><code>sqldf</code> works both on R data.frames and on other databases</p>

<pre><code>sqldf('SELECT foo FROM bar') # Use the bar data.frame&#x000A;sqldf('SELECT foo FROM bar', dbname = 'baz.db') # Use the baz.db SQLite database&#x000A;</code></pre>

<p>Use <code>download.file</code> to download files.</p>

<p>Sort one thing by another thing.</p>

<pre><code>iris[order(iris$Sepal.Length),]&#x000A;cars$speed[order(cars$dist)]&#x000A;</code></pre>
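<p><code>order</code> also handles descending sorts and tie-breaking by a second key. Here is a sketch on a small made-up data frame; the column names are hypothetical.</p>

```r
# Hypothetical data frame for illustration
df <- data.frame(
  name = c('a', 'b', 'c', 'd'),
  group = c(2, 1, 2, 1),
  score = c(5, 7, 9, 3)
)

# Highest score first
by_score_desc <- df[order(df$score, decreasing = TRUE), ]

# Sort by group, then by score descending within each group
by_group_then_score <- df[order(df$group, -df$score), ]

print(by_score_desc)
print(by_group_then_score)
```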
</article> </article>
</div> </div>
<div id='pagination'> <div id='pagination'>
