andreaodell edited this page May 12, 2018 · 34 revisions

Welcome to the course-fish497-2018 wiki!

This will serve as a communal knowledge base for the course.

A course-specific cheatsheet. Please edit at will! 😄 (Yes, you can use emoticons.)


Reading in Data

Example using remote data:

acacia <- read.csv("http://www.esapubs.org/archive/ecol/E095/064/ACACIA_DREPANOLOBIUM_SURVEY.txt", sep="\t", na.strings = "dead")
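As a rough illustration of what the sep and na.strings arguments do, here is a minimal sketch that reads a tiny tab-separated table from a string instead of a URL. The survey_text table and its column names are made up for this example; they are not part of the acacia data.

```r
# Hypothetical inline data standing in for a tab-separated file.
survey_text <- "id\theight\tstatus\n1\t2.5\talive\n2\tdead\tdead\n3\t1.1\talive"

# text = reads from a string instead of a file or URL.
# sep = "\t" splits columns on tabs.
# na.strings = "dead" turns every "dead" entry into NA, in all columns.
survey <- read.csv(text = survey_text, sep = "\t", na.strings = "dead")

is.na(survey$height)   # which heights were recorded as "dead"
class(survey$height)   # the column stays numeric because "dead" became NA
```

Without na.strings, the height column would be read as text, because one of its entries is not a number.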

Loading a package

This must be done in every new R session in which you use the package.

library(dplyr)

You could also load library(tidyverse), since the tidyverse includes both dplyr and ggplot2.

Commenting within scripts

A # is used to leave informative comments within an R script. Everything after the # on that line is ignored by R. Comments are especially useful when looking at scripts written in the past.

library(dplyr)   ###loading the dplyr package for use

data.frame Inspections

Size:

To determine the number of rows and columns

nrow(acacia)
ncol(acacia)

Content:

To view the first six rows (head) or the last six rows (tail)

head(acacia)
tail(acacia)

Summary

To examine the structure of your table: class, length, content

str(acacia)

Summary statistics for each column/variable

summary(acacia)
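A couple of related base R helpers can be handy here as well. This is a small sketch using the built-in mtcars data set as a stand-in for acacia, so it runs without downloading anything.

```r
# mtcars ships with R; it is used here only as a stand-in data frame.
size <- dim(mtcars)        # number of rows and columns in one call
column_names <- names(mtcars)   # the column names as a character vector

size
column_names

# head() and tail() take an optional n to show more or fewer rows
head(mtcars, n = 3)
```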

dplyr

Pipes

A pipe takes the output of one function and sends it directly to the next using %>%

Keyboard shortcut in RStudio:

  • Mac : cmd + shift + M
  • PC : ctrl + shift + M

So, instead of lengthy step-by-step code or hard-to-read nested functions, you can use %>% to funnel the output into the following function

acacia %>%
  filter(height > 1) %>%            ### keep rows where height is greater than 1, then...
  select(year, treatment, site)     ### keep only year, treatment, and site columns
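To make the comparison concrete, here is a minimal sketch of the same two steps written as a nested call and as a pipe. It uses the built-in mtcars data set (not acacia), so the column names mpg, cyl, and wt are specific to this example.

```r
library(dplyr)

# Nested form: has to be read from the inside out
nested <- select(filter(mtcars, mpg > 25), mpg, cyl, wt)

# Piped form: reads top to bottom, one step per line
piped <- mtcars %>%
  filter(mpg > 25) %>%     # keep rows where mpg is greater than 25, then...
  select(mpg, cyl, wt)     # keep only the mpg, cyl, and wt columns

identical(nested, piped)   # both forms should give the same data frame
```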

Data Manipulation

dplyr makes use of simple verbs which help translate your thoughts into code.

  1. Filter rows with filter : subset rows within a data frame
filter(acacia, height == 1)    ### keeps rows where height is equal to 1
  2. Add new columns with mutate
mutate(acacia, area = width * length)   ### adds a new column, "area"
  3. Group rows by a specific variable with group_by
group_by(acacia, height)       ### groups rows that share the same height
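On its own, group_by only marks the groups; it is usually followed by another verb such as summarize. The sketch below shows one possible combination on the built-in mtcars data set. The column names (cyl, mpg) and the new names (mean_mpg, n_cars) belong to this example only.

```r
library(dplyr)

by_cyl <- mtcars %>%
  group_by(cyl) %>%                  # one group per number of cylinders
  summarize(mean_mpg = mean(mpg),    # average mpg within each group
            n_cars = n())            # number of rows in each group

by_cyl
```

The result has one row per group rather than one row per car.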

ggplot2

Basic template:

ggplot(data = <DATASET>) + 
     <geom_FUNCTION>(mapping = aes(<MAPPINGS>))

To map variables to the x and y axes

aes(x = example, y = example)

To add color to the plot

aes(color = example)

To set the x and y variables for the whole plot, put aes() inside ggplot():

ggplot(data = yourdataset, aes(x = xaxis, y = yaxis))

Using geom_point(), you can change the size, color, and alpha (transparency) of your plot's points. labs() lets you rename your labels to something more useful. scale_x_log10() and scale_y_log10() put the axes on a log scale.

Example:

ggplot(yourdataset, aes(x = xaxis, y = yaxis)) + 
    geom_point(size = 5, color = "red", alpha = 0.5) + 
    labs(x = "X Label", y = "Y Label", title = "X and Y Axis") + 
    scale_x_log10("X Label") + 
    scale_y_log10("Y Label")
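For a version that can be run as-is, here is a sketch of the same pattern using the built-in mtcars data set in place of yourdataset. The choice of disp and hp as the x and y variables is just for illustration.

```r
library(ggplot2)

p_scatter <- ggplot(mtcars, aes(x = disp, y = hp)) +
  geom_point(size = 3, color = "red", alpha = 0.5) +   # large, semi-transparent red points
  labs(x = "Displacement (cu. in.)",
       y = "Horsepower",
       title = "Horsepower vs. displacement") +
  scale_x_log10() +                                    # log10 x axis
  scale_y_log10()                                      # log10 y axis

p_scatter   # printing the object draws the plot
```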

Arguments in aes()

When using aes() to define mappings of a plot, select the columns to be used for the x and y axis (or just the x axis for a histogram).

ggplot(<DATASET>) +
  geom_point(mapping = aes(x = <COLUMN>, y = <COLUMN>))

To make the formatting (such as color) vary with a third variable, include that variable WITHIN the aes() function. To set the formatting manually to a fixed value, without using another variable, list that argument AFTER the aes() function.

ggplot(<DATASET>) +
  geom_point(mapping = aes(x = <COLUMN>, y = <COLUMN>, color = <COLUMN>))

ggplot(<DATASET>) +
  geom_point(mapping = aes(x = <COLUMN>, y = <COLUMN>), color = "red")
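The difference between the two templates can be seen with a small runnable sketch. It uses the built-in mtcars data set, and wt, mpg, and cyl are simply convenient columns from that data set.

```r
library(ggplot2)

# Color mapped to a variable: inside aes(), so ggplot2 picks the colors
# and adds a legend. factor() treats cyl as categories, not a number.
p_mapped <- ggplot(mtcars) +
  geom_point(mapping = aes(x = wt, y = mpg, color = factor(cyl)))

# Color set to a fixed value: outside aes(), so every point is red
# and no legend is drawn.
p_fixed <- ggplot(mtcars) +
  geom_point(mapping = aes(x = wt, y = mpg), color = "red")

p_mapped
p_fixed
```

A common slip is writing aes(color = "red"): that maps color to a constant label called "red" instead of setting the points to red.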

In ggplot2, if you are building a histogram and you want to rotate the x-axis labels to a 90 degree angle:

ggplot(df, aes(x)) +
  geom_histogram() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

Bins in geom_histogram()

By default, the histogram uses 30 bins, which does not always show the data clearly. To change the number of bins, set the bins argument of geom_histogram():

ggplot(data, mapping = aes(x, fill = color)) +
  geom_histogram(bins = <NUMBER>)    ### replace <NUMBER> with the number of bins you want
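As a runnable sketch of changing the bin count, the example below draws a histogram of the mpg column from the built-in mtcars data set with 10 bins. The data set and the value 10 are arbitrary choices for illustration.

```r
library(ggplot2)

p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10)    # 10 bins instead of the default 30

p_hist
```

Trying a few different values of bins is often the quickest way to find one that shows the shape of the data.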

Committing an Edited Script

After editing your script, make sure you save it! It should be saved to a "scripts" folder in your directory.

The script file should appear in your Git window with a blue "M", indicating the file has been modified since the last commit. To commit the new version, check the staging box, click Commit, write a descriptive commit message, and then click Commit.

Don't forget to upload your commits to your repository by pushing!

If you're having trouble reading in a file...

Have you created an Rproj, and is it open? (If not, the data file can't be read via a relative path.)

Is the file name in quotation marks inside the parentheses?

Is the data file saved in a directory within your project?

If it's saved in a directory within your project directory (such as a "data" folder), you might have to put "data/" before the file name.
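These checks can also be run from the console. The sketch below uses only base R; the file name my_data.csv is hypothetical, so substitute the name of your own file.

```r
# The folder R resolves relative paths against
# (the project folder when an Rproj is open)
getwd()

# Build the relative path; "my_data.csv" is a made-up file name
data_path <- file.path("data", "my_data.csv")
data_path

# TRUE only if the file is where R expects it
file.exists(data_path)

# What is actually inside the "data" folder (empty if the folder is missing)
list.files("data")
```

If file.exists() returns FALSE, compare the output of getwd() and list.files() with where the file was actually saved.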

Navigating Git

Where did my commit and commit message go??

To see your past commits and their commit messages, click the "History" button next to "Changes". You can also review the changes you have made by clicking the "Diff" button in your Git panel.

Utilizing Git for version control

Pushing commits to a remote repository (such as GitHub) stores them in the cloud, where they can be retrieved by pulling the project to a local computer. Commit and push often so that edits are backed up and can be reversed. Pulling is useful when collaborating with others in a group.
