Welcome to the course-fish497-2018 wiki!
This will serve as a communal knowledge base for the course.
A course-specific cheatsheet. Please edit at will! 😄 (Yes, you can use emoticons.)
Reading in Data
example using remote data
acacia <- read.csv("http://www.esapubs.org/archive/ecol/E095/064/ACACIA_DREPANOLOBIUM_SURVEY.txt", sep="\t", na.strings = "dead")
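To see what sep and na.strings do without needing a network connection, here is a minimal sketch that reads a small, made-up tab-separated text instead of the remote file. The object names raw_text and survey and the values in them are assumptions for illustration only.

```r
# Hypothetical tab-separated text standing in for a downloaded file.
raw_text <- "id\theight\n1\t2.5\n2\tdead\n3\t1.1"

# sep = "\t" splits columns on tabs.
# na.strings = "dead" turns the word "dead" into a missing value (NA).
survey <- read.csv(text = raw_text, sep = "\t", na.strings = "dead")

print(survey)
print(is.na(survey$height))  # shows which heights were recorded as "dead"
```

Because "dead" is converted to NA, the height column is read as numbers rather than text.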
Loading a package
Load a package with library(), for example library(dplyr). This needs to be done in every new R session that uses the package.
You could also load library(tidyverse), since the tidyverse includes both dplyr and ggplot2.
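A minimal sketch of the usual pattern, assuming dplyr is already installed on your computer: installing happens once, loading happens in every session.

```r
# Install once per computer. Left commented out so the script
# does not reinstall the package every time it runs.
# install.packages("dplyr")

# Load at the top of every script or new R session.
library(dplyr)
```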
Commenting within scripts
# is used to leave informative comments within an R script. Comments are especially useful when looking at scripts written in the past.
library(dplyr) ###loading the dplyr package for use
To determine the number of rows and columns
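One way to do this is with dim(), sketched here on iris, a data set built into R that stands in for your own table.

```r
# iris is built into R and is used here only as a stand-in for your own table.
dim(iris)   # number of rows, then number of columns
nrow(iris)  # rows only
ncol(iris)  # columns only
```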
To determine either the first six rows (head) or the last six rows (tail)
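A short sketch using the built-in iris data set as a stand-in for your own table:

```r
# iris is built into R and is used here only as a stand-in for your own table.
head(iris)         # first six rows
tail(iris)         # last six rows
head(iris, n = 3)  # n changes how many rows are shown
```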
To examine the structure of your table: class, length, content
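A short sketch using the built-in iris data set as a stand-in for your own table:

```r
# iris is built into R and is used here only as a stand-in for your own table.
str(iris)     # compact overview: class, dimensions, and the type of each column
class(iris)   # the kind of object
length(iris)  # for a data frame, the number of columns
```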
Summary statistics for each column/variable
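A short sketch using the built-in iris data set as a stand-in for your own table:

```r
# iris is built into R and is used here only as a stand-in for your own table.
summary(iris)               # one summary per column
summary(iris$Sepal.Length)  # summary of a single column
```

Numeric columns get a minimum, quartiles, mean, and maximum; factor columns get counts per level.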
The pipe, %>%, takes the output of one function and sends it directly to the next. Keyboard shortcut in RStudio:
- Mac : cmd + shift + M
- PC : ctrl + shift + M
So, instead of lengthy step-by-step code or hard-to-read nested functions, you can use %>% to funnel the output into the following function.
acacia %>%
  filter(height > 1) %>% ### keep rows where height is greater than 1, then...
  select(year, treatment, site) ### keep only year, treatment, and site columns
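To compare the two styles side by side, here is a minimal sketch on a small, made-up table. The data frame trees and its values are assumptions that stand in for the acacia data.

```r
library(dplyr)

# Hypothetical stand-in for the acacia table, with made-up values.
trees <- data.frame(
  year      = c(2012, 2012, 2013, 2013),
  treatment = c("control", "open", "control", "open"),
  site      = c("north", "north", "south", "south"),
  height    = c(0.8, 1.5, 2.1, 0.9)
)

# Nested version: has to be read from the inside out.
nested <- select(filter(trees, height > 1), year, treatment, site)

# Piped version: reads top to bottom, one step per line.
piped <- trees %>%
  filter(height > 1) %>%
  select(year, treatment, site)

print(piped)
```

Both versions run the same two steps; the pipe only changes how the code reads.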
dplyr makes use of simple verbs which help translate your thoughts into code.
Filter rows with filter()
filter: subset rows within a data frame
filter(acacia, height == 1) ### Keeps rows where height is equal to 1
Add new columns with mutate()
mutate(acacia, area = width * length) ### adds a new column, "area"
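A minimal sketch of adding a computed column with dplyr's mutate(), using a made-up table. The names plots and with_area and the values are assumptions for illustration.

```r
library(dplyr)

# Made-up measurements for illustration only.
plots <- data.frame(width = c(2, 3), length = c(4, 5))

# mutate() keeps every existing column and row, and adds the new column at the end.
with_area <- mutate(plots, area = width * length)

print(with_area)
```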
Group rows by a specific variable with group_by()
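Grouping is usually followed by a summary of each group. Here is a minimal sketch with dplyr's group_by() and summarize(), using a made-up table; the names trees and by_treatment and the values are assumptions for illustration.

```r
library(dplyr)

# Made-up data for illustration only.
trees <- data.frame(
  treatment = c("control", "control", "open", "open"),
  height    = c(1, 3, 2, 6)
)

# group_by() marks the groups; summarize() then returns one row per group.
by_treatment <- trees %>%
  group_by(treatment) %>%
  summarize(mean_height = mean(height), n_trees = n())

print(by_treatment)
```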
ggplot(data = <DATASET>) + <geom_FUNCTION>(mapping = aes(<MAPPINGS>))
To choose which variables go on the x and y axes
aes(x = example, y = example)
To color the plot by a variable
aes(color = example)
Setting what goes on the x and y axes in ggplot: use the aes() function.
ggplot(data = yourdataset, aes(x = xaxis, y = yaxis))
Using geom_point() you can change the size, color, and alpha (or transparency) of your plot's points.
labs() allows you to rename your labels to something more useful.
scale_x_log10() and scale_y_log10() allow you to put your axes on a log scale.
ggplot(yourdataset, aes(x = xaxis, y = yaxis)) +
  geom_point(size = 5, color = "red", alpha = 0.5) +
  labs(x = "X Label", y = "Y Label", title = "X and Y Axis") +
  scale_x_log10("X Label") +
  scale_y_log10("Y Label")
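Here is a runnable sketch of the same pattern with made-up data, so you can see the pieces working together. The data frame fish, its column names, and its values are assumptions for illustration.

```r
library(ggplot2)

# Made-up data for illustration only.
fish <- data.frame(
  length_cm = c(10, 20, 40, 80, 160),
  weight_g  = c(12, 95, 760, 6100, 49000)
)

p <- ggplot(fish, aes(x = length_cm, y = weight_g)) +
  geom_point(size = 5, color = "red", alpha = 0.5) +  # large, red, half-transparent points
  labs(title = "Weight versus length") +               # plot title
  scale_x_log10("Length (cm)") +                       # log-scaled x axis with its label
  scale_y_log10("Weight (g)")                          # log-scaled y axis with its label

print(p)
```

The axis names are given directly to the scale functions here, since a scale's name is what ends up as the axis label.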
Arguments in aes()
When using aes() to define the mappings of a plot, select the columns to be used for the x and y axes (or just the x axis for a histogram).
ggplot(<DATASET>) + geom_point(mapping = aes(x = <COLUMN>, y = <COLUMN>))
To format the data according to a third variable (for example, a different color per group), include that argument WITHIN the aes() function. To change the formatting manually without adding another variable, list that argument AFTER the aes() function.
ggplot(<DATASET>) + geom_point(mapping = aes(x = <COLUMN>, y = <COLUMN>, color = <COLUMN>))
ggplot(<DATASET>) + geom_point(mapping = aes(x = <COLUMN>, y = <COLUMN>), color = "red")
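The difference between the two placements can be sketched with made-up data. The data frame fish and its columns are assumptions for illustration.

```r
library(ggplot2)

# Made-up data for illustration only.
fish <- data.frame(
  length_cm = c(10, 20, 30, 40),
  weight_g  = c(12, 95, 320, 760),
  species   = c("a", "a", "b", "b")
)

# Inside aes(): color is mapped to a column, so each species gets its own
# color and ggplot adds a legend.
mapped <- ggplot(fish) +
  geom_point(mapping = aes(x = length_cm, y = weight_g, color = species))

# Outside aes(): color is a fixed setting, so every point is red and
# there is no legend.
fixed <- ggplot(fish) +
  geom_point(mapping = aes(x = length_cm, y = weight_g), color = "red")

print(mapped)
print(fixed)
```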
In ggplot2, if you are building a histogram and you want to rotate the x-axis labels to a 90 degree angle, add theme() to the plot:
ggplot(df, aes(x)) + geom_histogram() + theme(axis.text.x = element_text(angle = 90, hjust = 1))
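A runnable sketch of rotated x-axis labels on a histogram, using made-up data. The data frame catch and its column are assumptions for illustration.

```r
library(ggplot2)

# Made-up data for illustration only.
set.seed(1)
catch <- data.frame(length_cm = rnorm(100, mean = 50, sd = 10))

p <- ggplot(catch, aes(x = length_cm)) +
  geom_histogram(bins = 10) +
  # angle = 90 turns the tick labels sideways; hjust = 1 right-aligns them
  # so they sit against the axis.
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

print(p)
```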
Bins in geom_histogram()
The number of bins defaults to 30 in a histogram, but this doesn't always show the data clearly. To change it, set the bins argument of geom_histogram().
ggplot(data, mapping = aes(x, fill = color)) + geom_histogram(bins = <NUMBER>)
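To see the effect of the bins argument, here is a minimal sketch that draws the same made-up data with a small and a large number of bins. The data frame catch and the bin counts are assumptions for illustration.

```r
library(ggplot2)

# Made-up data for illustration only.
set.seed(1)
catch <- data.frame(length_cm = rnorm(200, mean = 50, sd = 10))

# Few bins: a smooth overall shape, but detail is lost.
coarse <- ggplot(catch, aes(x = length_cm)) +
  geom_histogram(bins = 10)

# Many bins: more detail, but also more noise.
fine <- ggplot(catch, aes(x = length_cm)) +
  geom_histogram(bins = 50)

print(coarse)
print(fine)
```

Trying a few values is usually the quickest way to find a bin count that shows the shape of the data.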
Committing an Edited Script
After editing your script, make sure you save it! It should be saved to a "scripts" folder in your directory.
The script file should appear in your Git window with a blue "M", indicating the file has been modified since the last commit. To commit the new version, check the staging box, click commit, write a descriptive commit message, and then click commit.
Don't forget to upload your commits to your repository by pushing!
If you're having trouble reading in a file...
Have you created an Rproj, and is it open? (If not, a relative path may not point to your file, because the working directory is not the project folder.)
Is the file name in quotation marks inside the parentheses?
Is the file saved in a directory within your project?
If it's saved in a directory within your project directory (such as a "data" folder), you might have to put "data/" before the file name.
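The checks above can be sketched in code. The file name survey.csv is hypothetical; replace it with your own file.

```r
# Hypothetical file name, used only for illustration.
path <- file.path("data", "survey.csv")  # builds the relative path "data/survey.csv"

getwd()            # the folder that relative paths start from
file.exists(path)  # FALSE means R cannot see the file at that relative path

# Only try to read the file if R can find it.
if (file.exists(path)) {
  survey <- read.csv(path)
}
```

If getwd() does not show your project folder, open the Rproj file first and try again.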
Where did my commit and commit message go??
To see your past commits and their messages, click on the "History" button next to "Changes". To see the changes you have made to a file, click the "Diff" button in your Git panel.
Utilizing Git for version control
Pushing commits to a remote repository such as GitHub stores them in the cloud, where they can be retrieved by pulling the project to a local computer. Commit often so that edits can be reversed, and push often so that your work is backed up. Pulling is how you pick up changes made by others when collaborating in a group.