# R Protocols Notebook  <a id="TOC"></a>   
### Shantel A. Martinez |  Update 2019.02.19  
-------------  
*This **notebook** contains summaries, data analysis, and explanations of R scripts and programs found useful for genomic selection and beyond.*  
*This notebook is meant to be shared for the purpose of transparency and as a collaborative resource. It is a working document, so it may contain mistakes, unintentionally incorrect analyses, and outdated material.*   
*If you do see a mistake, I would love to learn a better way to analyze my data. Please feel free to email me ([shantel.a.martinez@gmail.com](mailto:shantel.a.martinez@gmail.com)) with helpful feedback, and I will be sure to update this public R Protocols Notebook.*  

**TABLE OF CONTENTS:**  
**General Topic**  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; [Genomic Prediction](#gs)   
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; [GWAS](#gwas)   
**Other Useful Commands**  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; References for [online cheatsheets](#cheat)   
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; [Other Computing Resources](#other)   
**Coding Shortcuts**  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; [Git](#git)  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; [Anaconda](#anaconda)  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; [Jupyter](#jupyter)  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; [R](#r)  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; [Markdown](#md)  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; [Typora](#typora)  

## Genomic Prediction Modeling <a id="gs"></a>
Genomic Prediction models: rrBLUP, RKHS, and LASSO [script]() using 5-fold cross-validation.  
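
The script itself is not reproduced here. As a rough illustration of what 5-fold cross-validation involves, the sketch below uses simulated markers and a closed-form ridge regression as a simple stand-in for rrBLUP. The simulated data, the fixed `lambda`, and every object name are assumptions made for illustration, not the pipeline used in the script.

```r
# Minimal 5-fold cross-validation sketch on simulated data (all names hypothetical).
set.seed(1)
n <- 100   # number of lines (genotypes)
m <- 200   # number of markers
M <- matrix(sample(c(-1, 0, 1), n * m, replace = TRUE), nrow = n)  # marker matrix
u <- rnorm(m, sd = 0.1)                  # simulated marker effects
y <- as.vector(M %*% u + rnorm(n))       # simulated phenotype

# Ridge regression solved in closed form. lambda is fixed here for illustration;
# rrBLUP instead estimates the amount of shrinkage from variance components.
ridge_fit <- function(X, y, lambda) {
  mu <- mean(y)
  beta <- solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y - mu))
  list(mu = mu, beta = beta)
}

k <- 5
fold <- sample(rep(seq_len(k), length.out = n))  # random fold assignment

# For each fold: train on the other four folds, predict the held-out fold,
# and record the correlation between predicted and observed phenotypes.
accuracy <- sapply(seq_len(k), function(f) {
  test <- fold == f
  fit <- ridge_fit(M[!test, , drop = FALSE], y[!test], lambda = 50)
  pred <- as.vector(fit$mu + M[test, , drop = FALSE] %*% fit$beta)
  cor(pred, y[test])
})

mean(accuracy)  # average predictive ability across the five folds
```

Every line is held out exactly once, so the mean correlation summarizes how well the model predicts lines it has not seen.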




## Online Coding Cheatsheets <a id="cheat"></a>  
#### Simple cheat sheets for markdown:   
[Github Wiki](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) or [md Basics](https://www.markdownguide.org/basic-syntax/)  

#### Simple cheat sheets for Jupyter Notebook:   
[Jupyter Notebook shortcuts](http://maxmelnick.com/2016/04/19/python-beginner-tips-and-tricks.html)  
DataCamp's Jupyter and R Markdown [cheatsheet](https://datacamp-community-prod.s3.amazonaws.com/48093c40-5303-45f4-bbf9-0c96c0133c40)  


## Other Computing Resources <a id="other"></a>    
#### Data Management  
[Data organization for Spreadsheets](https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1375989)    


{::options parse_block_html="false" /}


--------- 


A [Nature article](https://www.nature.com/articles/d41586-018-06008-w?utm_source=twt_nr&utm_medium=social&utm_campaign=NNPnature) on why we need to be very transparent in our data analysis pipelines. If I ever want to publish my raw data files together with the scripts that organize and analyze the data, all the way down to producing the final figures, I know from personal experience that I need to be extremely organized, and I am starting [here](https://www.nature.com/articles/d41586-018-06008-w?utm_source=twt_nr&utm_medium=social&utm_campaign=NNPnature). I have high admiration for scientists who are very transparent, and I dream of getting to the point of publishing a GitHub repo with every raw data file and a 'clean' script for the public to follow. So, I'm working on those skills, and this [R style guide](https://google.github.io/styleguide/Rguide.xml) on how to tidy up my scripts is also a good start.    

Furthermore, here is a talk by Karl Broman on [collaborating reproducibly](https://t.co/yYQjWS768e). Feel free to get lost in his blog posts on everything coding, science, and reproducibility.   

#### Electronic Lab Notebook (ELN)   

Great [article](https://www.nature.com/articles/d41586-018-05895-3?utm_source=twt_nnc&utm_medium=social&utm_campaign=naturenews&sf195296490=1) on getting started with an electronic notebook.   
A [Nature article](https://www.nature.com/articles/d41586-018-07196-1?tm_source=twt_nnc&utm_medium=social&utm_campaign=naturenews&sf201140318=1) about why so many scientists love Jupyter Notebook (I am biased, I know).     


## Coding Shortcuts 
*When I have not used these commands or shortcuts recently, I forget them and end up searching for them again. Instead, I keep a running list of forgotten favorites here.*

### Git   <a id="git"></a>

*All research files are backed up onto GitHub*  
`git pull origin master` : Fetches `master` from the remote and merges it into the current branch, bringing this computer up to date with changes pushed from the other computer   
`git status`  : Shows which files are modified, staged, or untracked, and whether the branch is ahead of or behind the remote  
`git add -u`  : Stages all changes to tracked files, including deletions of previously tracked files. New (untracked) files are not added.  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; OR `git add folder/` to stage every change inside a specific folder, including new files  
`git commit -m 'enter commit comment here'`  
`git push origin master` : Uploads the local commits to `master` on the remote  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*Enter in user name (email) and password*

### Anaconda  <a id="anaconda"></a>

*Use the Anaconda terminal to access Jupyter Notebook*  
`cd /d D:`  
`cd /d "General Research Files"`  
`jupyter notebook` : Starts Jupyter Notebook in the web browser  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*Notebooks are found in the `/Lab notebook/` folder*  

### Jupyter  <a id="jupyter"></a>

*Shorthand keys*  
`Shift-Enter` run cell, select below  
`Ctrl-Enter` run cell  
`Alt-Enter` run cell, insert below  
`M` to markdown (press `Esc` first to leave edit mode)  
`Y` to code  

### Markdown  <a id="md"></a>

`<span style="color:#6E8B3D">Green Text</span>`:  <span style="color:#6E8B3D">Green Text</span>    
`<span style="color:#CD5C5C">Red Text</span>`: <span style="color:#CD5C5C">Red Text</span>   
`<span style="color:#4F94CD">Blue Text</span>`:  <span style="color:#4F94CD">Blue Text</span>  
*Colors only work when the Markdown is rendered as HTML, e.g. GitHub Pages or Jupyter notebooks*  

`<a id="abbr_name"></a>`:  Header anchor (place it next to the header)  
`[link text](#abbr_name)`: Link to that header anchor   
`&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`: Indent by six non-breaking spaces  

`<div style="text-align: right"> [TOC](#TOC) </div>`

`--------------` creates a horizontal rule


### Typora  <a id="typora"></a>

`Ctrl+/`: Source Code Mode  
`Ctrl+Shift+-`: Zoom Out   
`Ctrl+Shift+=`: Zoom In    

### R     <a id="r"></a>

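`colnames(k) <- sub("^X", "", colnames(k))` removes a leading `X` from every column name (use `"^V"` for `V`). This is useful after importing a data table, because R prefixes column names that are empty or start with a digit with `X`, and names columns `V1`, `V2`, ... when there is no header.   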

`c("#F8766D", "#7CAE00", "#00BFC4","#C77CFF")` : default ggplot2 colors  
`sum(!is.na(df))` counts how many values in `df` are NOT `NA` (equivalent to `length(df[!is.na(df)])`). A single column also works: `sum(!is.na(df$col))`   
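
A small sketch of the column-name cleanup and the non-`NA` count on a toy data frame. The data frame and its values are made up for illustration, and the pattern is anchored as `"^X"` so that only a leading `X` is stripped.

```r
# Toy data frame standing in for an imported table (hypothetical values).
k <- data.frame(X2017 = c(1.2, NA, 3.4),
                X2018 = c(NA, NA, 5.6),
                Line  = c("A", "B", "C"),
                stringsAsFactors = FALSE)

# Strip only a leading "X", so an X elsewhere in a name is left untouched.
colnames(k) <- sub("^X", "", colnames(k))
colnames(k)

# Count the non-missing values in the whole data frame ...
n_all <- sum(!is.na(k))

# ... and in a single column (backticks are needed for a name that starts with a digit).
n_2017 <- sum(!is.na(k$`2017`))

c(all = n_all, col_2017 = n_2017)
```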

`CV <- subset(CNLM, GID %in% myCVc$taxa)` keeps the rows of `CNLM` whose `GID` also appears in `myCVc$taxa`. There are 1059 GIDs in common    
`PCC <- merge(myCVc, CV, by="ID")`  merges two data frames on the column `ID`, which must have that name in both  
`names(myCVc)[14] <- "GID"` renames column 14 only   
`PHSred7$GIDx <- paste0("cuGS", PHSred7$GID)` adds a new column `GIDx` that prefixes each `GID` with `cuGS` (### becomes cuGS###)  
`PHSwhite$GIDx <- gsub("cuGSOH", "OH", PHSwhite$GIDx)` replaces `cuGSOH` in column `GIDx` with `OH`.    
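
The subsetting, renaming, merging, and ID-rewriting steps above can be tried end to end on toy data. The object names mirror the ones in the notes, but the contents and column positions are hypothetical.

```r
# Toy stand-ins for the data frames named above (all values hypothetical).
CNLM  <- data.frame(GID = c("101", "102", "103", "OH104"),
                    PHS = c(2.1, 3.4, 1.8, 4.0),
                    stringsAsFactors = FALSE)
myCVc <- data.frame(taxa = c("102", "OH104", "999"),
                    PC1  = c(0.3, -0.1, 0.7),
                    stringsAsFactors = FALSE)

# Keep only the CNLM rows whose GID also appears in myCVc$taxa.
CV <- subset(CNLM, GID %in% myCVc$taxa)
nrow(CV)  # number of GIDs in common

# Rename one column by position so both tables share a key, then merge on it.
names(myCVc)[1] <- "GID"
PCC <- merge(myCVc, CV, by = "GID")

# Build a prefixed ID column, then rewrite one prefix pattern.
PCC$GIDx <- paste0("cuGS", PCC$GID)
PCC$GIDx <- gsub("cuGSOH", "OH", PCC$GIDx)
PCC
```

`merge()` keeps only the keys present in both data frames by default, so the unmatched `999` row drops out of `PCC`.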

**lme4**  
`VarName` : Fixed Effect   
`(1|VarName)`: Random Effect; random intercept for each level of `VarName`, deviating from a fixed overall mean  
`x + (x|VarName)`: Fixed effect `x` plus a Random Effect; correlated random intercept and random slope of `x` for each level of `VarName`  
`(1|Env:GID)`: Random effect interaction; GID within Env (`(1|Env/GID)` expands to `(1|Env) + (1|Env:GID)`)  
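
The fixed and random effect syntax above can be tried on the `sleepstudy` data that ships with lme4. This is only a sketch on an example dataset, not an analysis from this notebook, and it is skipped when lme4 is not installed.

```r
# Sketch using lme4's bundled sleepstudy data; runs only if lme4 is installed.
if (requireNamespace("lme4", quietly = TRUE)) {
  data("sleepstudy", package = "lme4")

  # Fixed effect of Days plus a random intercept for each Subject.
  m1 <- lme4::lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

  # Fixed effect of Days plus correlated random intercept and slope per Subject.
  m2 <- lme4::lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

  # Variance components: m2 adds a slope variance and an intercept-slope correlation.
  print(lme4::VarCorr(m1))
  print(lme4::VarCorr(m2))
}
```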
