# Set up

<img style="float: left;" src="Rlogo.png">


**Project template**

The [ProjectTemplate](https://cran.r-project.org/web/packages/ProjectTemplate/index.html) package is very useful for keeping a data analysis easy to follow and manageable.

Take the time to read through the `readme` files to understand how to use this system, but don't forget to customise it to suit your needs.

In [9]:
install.packages("ProjectTemplate")
library("ProjectTemplate")
create.project("../my_project", merge.strategy = "allow.non.conflict")

Updating HTML index of packages in '.Library'
Making 'packages.html' ... done


### Useful packages

**To load data**

[RMySQL](https://www.rdocumentation.org/packages/RMySQL/versions/0.10.13/topics/MySQLDriver-class), [RPostgreSQL](https://www.rdocumentation.org/search?q=RPostgresSQL), [RSQLite](https://www.rdocumentation.org/packages/RSQLite/versions/2.0) - If you'd like to read in data from a database, these packages are a good place to start. Choose the package that fits your type of database.

[readr](http://readr.tidyverse.org/) - The goal of readr is to provide a fast and friendly way to read rectangular data (like csv, tsv, and fwf). It is designed to flexibly parse many types of data found in the wild, while still cleanly failing when data unexpectedly changes.

[tibble](http://tibble.tidyverse.org/) - A tibble, or tbl_df, is a modern reimagining of the data.frame, keeping what time has proven to be effective, and throwing out what is not. Tibbles are data.frames that are lazy and surly: they do less (i.e. they don’t change variable names or types, and don’t do partial matching) and complain more (e.g. when a variable does not exist). This forces you to confront problems earlier, typically leading to cleaner, more expressive code. 
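
The "lazy and surly" behaviour is easiest to see side by side with a plain data.frame. This is a minimal sketch, assuming the tibble package is installed; the column names `sample_id` and `my var` are made up for illustration.

```r
library(tibble)

# The same (hypothetical) data as a data.frame and as a tibble
df <- data.frame(sample_id = 1:3)
tb <- tibble(sample_id = 1:3)

# data.frame silently partial-matches "sample" to "sample_id"
df$sample

# tibble refuses to guess: it returns NULL and raises a warning
tb$sample

# data.frame rewrites non-syntactic names, tibble keeps them as given
names(data.frame(`my var` = 1))
names(tibble(`my var` = 1))
```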

[XLConnect](https://cran.rstudio.com/web/packages/XLConnect/), [xlsx](https://cran.rstudio.com/web/packages/xlsx/) - These packages help you read and write Microsoft Excel files from R. You can also just export your spreadsheets from Excel as .csv files.

[foreign](http://www.rdocumentation.org/packages/foreign) - Want to read a SAS data set into R? Or an SPSS data set? Foreign provides functions that help you load data files from other programs into R.

R can handle plain text files – no package required. Just use the functions read.csv, read.table, and read.fwf. If you have even more exotic data, consult the CRAN guide to data import and export.
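
As a quick illustration of the base functions, the sketch below writes two small throwaway files to a temporary location and reads them back. The file contents and column names are invented for the example; in practice you would point `read.csv` or `read.table` at your own file.

```r
# Write a small comma-separated file to a temporary path (illustrative data)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("sample,site,count",
             "s1,north,12",
             "s2,south,7"), csv_path)

# read.csv assumes a header row and comma separators by default
samples <- read.csv(csv_path, stringsAsFactors = FALSE)
str(samples)

# read.table is the general form: state the header and separator explicitly
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("sample\tcount", "s1\t12", "s2\t7"), tsv_path)
samples_tsv <- read.table(tsv_path, header = TRUE, sep = "\t",
                          stringsAsFactors = FALSE)
```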

**To manipulate data**

[dplyr](http://blog.rstudio.org/2014/01/17/introducing-dplyr/) - Essential shortcuts for subsetting, summarizing, rearranging, and joining together data sets. dplyr is our go-to package for fast data manipulation.

[tidyr](http://blog.rstudio.org/2014/07/22/introducing-tidyr/) - Tools for changing the layout of your data sets. Use the gather and spread functions to convert your data into the tidy format, the layout R likes best.

[stringr](http://journal.r-project.org/archive/2010-2/RJournal_2010-2_Wickham.pdf) - Easy to learn tools for regular expressions and character strings.

[lubridate](http://www.r-statistics.com/2012/03/do-more-with-dates-and-times-in-r-with-lubridate-1-1-0/) - Tools that make working with dates and times easier.
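
To give a feel for how these four packages fit together, here is a small sketch that tidies a made-up wide table, cleans a text column, parses dates, and summarises by month. The table and its column names are hypothetical, and the packages are assumed to be installed. Newer tidyr releases also offer `pivot_longer` as an alternative to `gather`.

```r
library(dplyr)
library(tidyr)
library(stringr)
library(lubridate)

# Hypothetical wide table: one row per site, one column per sampling date
counts_wide <- tibble::tibble(
  site = c("Site a", "Site b"),
  `2018-01-15` = c(10, 4),
  `2018-02-15` = c(12, 7)
)

counts_long <- counts_wide %>%
  # tidyr: collect the date columns into key/value pairs (wide -> tidy)
  gather(key = "date", value = "count", -site) %>%
  mutate(
    # stringr: drop the "Site " prefix and upper-case the remaining label
    site = str_to_upper(str_remove(site, "Site ")),
    # lubridate: parse the text dates and pull out the month number
    date = ymd(date),
    month = month(date)
  )

# dplyr: total count per month
monthly <- counts_long %>%
  group_by(month) %>%
  summarise(total = sum(count))
```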

**To visualize data**

[ggplot2](http://docs.ggplot2.org/current/) - R's famous package for making beautiful graphics. ggplot2 lets you use the grammar of graphics to build layered, customizable plots.
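
A layered plot might look like the sketch below, which uses the `mtcars` data set that ships with R. The choice of variables and labels is only for illustration, and ggplot2 is assumed to be installed.

```r
library(ggplot2)

# Layer 1: points; layer 2: one straight-line fit per cylinder group
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

print(p)
```

Each `+` adds a layer or a setting, so a plot can be built up and adjusted one piece at a time.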

[ggvis](http://ggvis.rstudio.com/) - Interactive, web-based graphics built with the grammar of graphics.

[rgl](http://rgl.neoscientists.org/about.shtml) - Interactive 3D visualizations with R.

[htmlwidgets](http://www.htmlwidgets.org/) - A fast way to build interactive (JavaScript-based) visualizations with R.

[googleVis](https://cran.rstudio.com/web/packages/googleVis) - Lets you use Google Chart tools to visualize data in R. Google's motion chart grew out of Gapminder's Trendalyzer, the graphing software Hans Rosling made famous in his TED talk.



**Ecological & microbial analysis**

[vegan](https://cran.r-project.org/web/packages/vegan/index.html) - The vegan package provides tools for descriptive community ecology. It has most basic functions of diversity analysis, community ordination and dissimilarity analysis. Most of its multivariate tools can be used for other data types as well.
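
As a rough sketch of the diversity and dissimilarity functions, the example below computes a Shannon index per sample and Bray-Curtis dissimilarities between samples. The community matrix is invented for illustration (rows are samples, columns are species), and vegan is assumed to be installed.

```r
library(vegan)

# Hypothetical community matrix: 3 samples (rows) x 3 species (columns)
comm <- matrix(c(10, 0,  5,
                  3, 8,  2,
                  0, 1, 12),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("s1", "s2", "s3"),
                               c("spA", "spB", "spC")))

# Alpha diversity: one Shannon index per sample
shannon <- diversity(comm, index = "shannon")

# Beta diversity: pairwise Bray-Curtis dissimilarity between samples
bray <- vegdist(comm, method = "bray")
```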

[microbiome](https://www.bioconductor.org/packages/devel/bioc/vignettes/microbiome/inst/doc/vignette.html) - The microbiome R package facilitates exploration and analysis of microbiome profiling data, in particular 16S taxonomic profiling.

[phyloseq](https://bioconductor.org/packages/release/bioc/html/phyloseq.html) - phyloseq provides a set of classes and tools to facilitate the import, storage, analysis, and graphical display of microbiome census data.

[diveRsity](https://cran.r-project.org/web/packages/diveRsity/index.html) - Calculates genetic diversity partition statistics, genetic differentiation statistics, and locus informativeness for ancestry assignment. It also provides various options for calculating bootstrapped 95% confidence intervals, both across loci and for pairwise population comparisons, and for plotting these results interactively.

[Rhea](https://lagkouvardos.github.io/Rhea/) - A set of R scripts that encode a series of well-documented choices for the downstream analysis of Operational Taxonomic Units (OTUs) tables, including normalization steps, alpha- and beta-diversity analysis, taxonomic composition, statistical comparisons, and calculation of correlations.

[RevEcoR](https://cran.r-project.org/web/packages/RevEcoR/) - An implementation of the reverse ecology framework. Reverse ecology refers to the use of genomics to study ecology with no a priori assumptions about the organism(s) under consideration, linking organisms to their environment. It allows researchers to reconstruct the metabolic networks and study the ecology of poorly characterized microbial species from their genomic information, and has substantial potential for microbial community ecological analysis.

**For spatial data**

[sp](https://cran.rstudio.com/web/packages/sp), [maptools](http://www.rdocumentation.org/packages/maptools) - Tools for loading and using spatial data including shapefiles.

[maps](http://www.rdocumentation.org/packages/maps) - Easy to use map polygons for plots.

[ggmap](http://journal.r-project.org/archive/2013-1/kahle-wickham.pdf) - Download street maps straight from Google Maps and use them as a background in your ggplots.

**For time series and financial data**

[zoo](https://cran.rstudio.com/web/packages/zoo) - Provides the most popular format for saving time series objects in R.

[xts](https://cran.rstudio.com/web/packages/xts) - Very flexible tools for manipulating time series data sets.

[quantmod](http://www.quantmod.com/) - Tools for downloading financial data, plotting common charts, and doing technical analysis.
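
For a sense of what a zoo object looks like, this sketch builds a short daily series and smooths it with a three-day rolling mean. The values and dates are made up, and zoo is assumed to be installed.

```r
library(zoo)

# Six consecutive days of illustrative measurements, indexed by date
dates <- as.Date("2018-01-01") + 0:5
z <- zoo(c(1, 3, 2, 5, 4, 6), order.by = dates)

# Centred rolling mean over 3 observations; the two end points are dropped
smoothed <- rollmean(z, k = 3)
smoothed
```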

**Work with the web**

[XML](http://www.omegahat.org/RSXML/shortIntro.pdf) - Read and create XML documents with R.

[jsonlite](https://cran.rstudio.com/web/packages/jsonlite) - Read and create JSON data tables with R.

[httr](http://www.rdocumentation.org/packages/httr/functions/httr) - A set of useful tools for working with http connections.
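
As an example of the JSON side, the sketch below converts a small data frame to JSON text with jsonlite and reads it back. The data frame is invented for illustration and jsonlite is assumed to be installed.

```r
library(jsonlite)

# Illustrative data frame
samples <- data.frame(id = c("s1", "s2"),
                      reads = c(1200L, 950L),
                      stringsAsFactors = FALSE)

# Data frame -> JSON text (an array with one object per row)
json_text <- toJSON(samples, pretty = TRUE)
json_text

# JSON text -> data frame
round_trip <- fromJSON(json_text)
```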

**To report results**

[R Markdown](https://cran.r-project.org/web/packages/rmarkdown/index.html) - The perfect workflow for reproducible reporting. Write R code in your [markdown](https://daringfireball.net/projects/markdown/) reports. When you run render, R Markdown will replace the code with its results and then export your report as an HTML, pdf, or MS Word document, or an HTML or pdf slideshow. The result? Automated reporting. R Markdown is integrated straight into RStudio.

[shiny](http://shiny.rstudio.com/) - Easily make interactive web apps with R. A perfect way to explore data and share findings with non-programmers.

[xtable](https://www.rdocumentation.org/packages/xtable/versions/1.8-2/topics/xtable) - The xtable function takes an R object (like a data frame) and returns the latex or HTML code you need to paste a pretty version of the object into your documents. Copy and paste, or pair up with R Markdown.
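
A minimal sketch of the xtable workflow, using a few rows of the built-in `mtcars` data; the caption text is made up and xtable is assumed to be installed.

```r
library(xtable)

# Wrap the first rows of a built-in data frame as an xtable object
tab <- xtable(head(mtcars[, 1:3]), caption = "First rows of mtcars")

# Print the HTML markup; use type = "latex" (the default) for LaTeX instead
print(tab, type = "html")
```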

**Other Resources**

There are _many_ other resources available like those [here](https://www.computerworld.com/article/2921176/business-intelligence/great-r-packages-for-data-import-wrangling-visualization.html) and [here](https://www.analyticsvidhya.com/blog/2015/08/list-r-packages-data-analysis/) to start with.

In [16]:
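install.packages("tidyverse") # installs the core tidyverse packages: ggplot2, dplyr, tidyr, readr, purrr, tibble etc. (shiny is not part of the tidyverse and is installed separately)
library("lattice")
library("plyr")       # load plyr before the tidyverse so that dplyr's functions take precedence
library("tidyverse")  # attaches the core tidyverse packages and reports the conflicts shown below
library("colorspace")
library("RColorBrewer")
library("forcats")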


Attaching package: ‘dplyr’

The following objects are masked from ‘package:plyr’:

    arrange, count, desc, failwith, id, mutate, rename, summarise,
    summarize

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

Loading tidyverse: tibble
Loading tidyverse: tidyr
Loading tidyverse: readr
Loading tidyverse: purrr
Conflicts with tidy packages ---------------------------------------------------
arrange():   dplyr, plyr
compact():   purrr, plyr
count():     dplyr, plyr
failwith():  dplyr, plyr
filter():    dplyr, stats
id():        dplyr, plyr
lag():       dplyr, stats
mutate():    dplyr, plyr
rename():    dplyr, plyr
summarise(): dplyr, plyr
summarize(): dplyr, plyr
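
The messages above mean that several packages define a function with the same name, and the most recently attached package wins. One common way to stay unambiguous is to name the package explicitly with `::`. The sketch below shows this for `filter`, assuming dplyr is installed; the numbers are only illustrative.

```r
library(dplyr)  # attaching dplyr masks stats::filter and stats::lag

x <- c(1, 2, 3, 4, 5)

# stats::filter applies a linear filter to a series (here a 3-point moving average)
moving_avg <- stats::filter(x, rep(1 / 3, 3))

# dplyr::filter keeps the rows of a data frame that match a condition
efficient_cars <- dplyr::filter(mtcars, mpg > 30)
```

If you need both plyr and dplyr in one session, loading plyr first and dplyr second keeps dplyr's versions of `mutate`, `summarise` and friends in front.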
