# My R Course 

This R document on GitHub: https://github.com/peterthorsteinson/R_Jupyter/blob/master/index.ipynb  
The related Python document on GitHub: https://github.com/peterthorsteinson/Py_Jupyter/blob/master/index.ipynb  

This is a compendium of notes, code examples, and links to relevant docs, blogs, and other references, organized
around many aspects of R related syntax, concepts, packages, mathematics, statistics, numerical methods, as well
as several important machine learning and optimization algorithms.

## Background Info

- Created by Ross Ihaka and Robert Gentleman (University of Auckland, New Zealand)
- Interpreted language
- Case sensitive
- Open-source implementation of the S language for statistical analysis and graphics
- GNU General Public License
- Available as source code and pre-compiled binary code
- Cross-platform including Linux, Windows and Mac operating systems
- Integrates with C, C++, .NET, Python, and Fortran
- Commands are separated by a semicolon or by a newline
- Commands can be grouped into a compound expression within curly braces
- Comments can be placed almost anywhere; a comment starts with a hash mark (`#`) and runs to the end of the line
- If a command is not complete at the end of a line, the R CLI provides the continuation prompt (+)
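
The syntax rules listed above can be seen in a few lines of R. This is a minimal sketch; the variable names (`x`, `X`, `a`, `b`, `hyp`) are illustrative and not taken from the course notebooks.

```r
# Case sensitivity: `x` and `X` are two different variables.
x <- 1
X <- 2

# Two commands on one line, separated by a semicolon.
a <- 3; b <- 4

# A compound expression in curly braces; its value is the value
# of the last expression evaluated inside it.
hyp <- {
  sq <- a^2 + b^2   # a comment may follow code on the same line
  sqrt(sq)
}

print(x == X)   # [1] FALSE
print(hyp)      # [1] 5
```

Typing the block's opening `hyp <- {` line on its own at the R prompt leaves the command incomplete, so the interpreter shows the `+` continuation prompt until the closing brace is entered.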

## Install R and RStudio

* [Download and install R from CRAN](https://cran.r-project.org)
* [Download and install RStudio](https://www.rstudio.com/products/rstudio/download)  


## Installing R Language

See: https://www.r-project.org/

## Installing IRkernel in Jupyter

See: https://irkernel.github.io

Start R from the Anaconda console with elevated permissions, then run the following at the R prompt:
* `install.packages('devtools')`
* `install.packages('IRkernel')`
* `IRkernel::installspec()`

* See also: [My Git Course](../MyGitCourse/index.ipynb)

## Learning Resources

[R Reference Card](https://cran.r-project.org/doc/contrib/Short-refcard.pdf)  
[R for Data Science - Grolemund & Wickham](http://r4ds.had.co.nz)  
[Exploratory Data Analysis with R - Roger Peng](https://bookdown.org/rdpeng/exdata)  
[R Programming for Data Science - Roger Peng](https://bookdown.org/rdpeng/rprogdatascience)  
[R tips: 16 HOWTO’s with examples for data analysts - Lingyun Zhang](https://bookdown.org/lyzhang10/lzhang_r_tips_book/)  
[Cookbook for R - Winston Chang](http://www.cookbook-r.com/)  
[UC Business Analytics R Programming Guide](http://uc-r.github.io/)  
[WikiBooks: R Programming](https://en.wikibooks.org/wiki/R_Programming)  
[CRAN: An Introduction to R](https://cran.r-project.org/doc/manuals/r-devel/R-intro.html)  
[CRAN: R Language Definition (HTML)](https://cran.r-project.org/doc/manuals/r-devel/R-lang.html)  
[CRAN: R Language Definition (PDF)](https://cran.r-project.org/doc/manuals/r-devel/R-lang.pdf)  
    
    
## Free R Language Books
* [R for Data Science](https://r4ds.had.co.nz/)
* [Hands-On Programming with R](https://rstudio-education.github.io/hopr/)
* [ggplot2: Elegant Graphics for Data Analysis](https://github.com/hadley/ggplot2-book)
* [R Notes for Professionals](https://books.goalkicker.com/RBook/)
* [R for Beginners](https://cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf)

## GitHub Repos
* https://ramnathv.github.io/pycon2014-r
* https://kingaa.github.io/R_Tutorial

- Outline
- Overview
- Install R and RStudio
- Data Science Workflow
- Good Reads

[A first example: gapminder](gapminder.ipynb)  

## R Topics

* [Explore RevoScaleR Functions](ExploreRevoScaleRFunctions.ipynb)
* [Getting Started](getting_started.ipynb)
* [The R Language](The_R_Language.ipynb)
* [Basic R Concepts](Basic_R_Concepts.ipynb)
* [00 R Syntax Examples](00_R_Syntax_Examples.ipynb)
* [01 Starter Demos](01_Starter_Demos.ipynb)
* [02 Simple R Syntax](02_Simple_R_Syntax.ipynb)
* [03 Atomic Classes](03_Atomic_Classes.ipynb)
* [04 Control Structures](04_Control_Structures.ipynb)
* [05 The Base Plotting System](05_The_Base_Plotting_System.ipynb)
* [06 Vectors](06_Vectors.ipynb)
* [07 The plotrix Graphics Package](07_The_plotrix_Graphics_Package.ipynb)
* [08 Lists](08_Lists.ipynb)
* [09 Factors](09_Factors.ipynb)  
* [10 Base R Sample Datasets](10_Base_R_Sample_Datasets.ipynb)
* [11 Attributes](11_Attributes.ipynb)
* [12 Data Frames](12_Data_Frames.ipynb)
* [13 Date Time](13_Date_Time.ipynb)
* [14 Manipulating Data](14_Manipulating_Data.ipynb)
* [15 Algorithms](15_Algorithms.ipynb)
* [16 Xxx](16_Xxx.ipynb)
* [17 Linear Algebra](17_Linear_Algebra.ipynb)
* [18 Probability Distributions](18_Probability_Distributions.ipynb)
* [19 Statistical Probability Functions](19_Statistical_Probability_Functions.ipynb)
* [20 Matrices](20_Matrices.ipynb)
* [21 Functions](21_Functions.ipynb)  
* [22 Logistic Regression](22_Logistic_Regression.ipynb)
* [23 Linear Regression](23_Linear_Regression.ipynb)
* [23b Linear Regression](23b_Linear_Regression.ipynb)
* [24 Regression Correlation](24_Regression_Correlation.ipynb)
* [25 Object Oriented Programming](25_Object_Oriented_Programming.ipynb)
* [26 The ggplot2 Package](26_The_ggplot2_Package.ipynb)
* [27 Calculus](27_Calculus.ipynb)
* [28 The dplyr Package](28_The_dplyr_Package.ipynb)
* [29 The tibble Package](29_The_tibble_Package.ipynb)
* [30 The readr Package](30_The_readr_Package.ipynb)
* [31 The tidyr Package](31_The_tidyr_Package.ipynb)
* [32 K-Nearest Neighbors](32_K_Nearest_Neighbors.ipynb)
* [33 Machine Learning Project](33_Machine_Learning_Project.ipynb)
* [34 Artificial Neural Networks](34_Artificial_Neural_Networks.ipynb)
* [35 Bayesian Networks](35_Bayesian_Networks.ipynb)
* [36 K-Means Clustering](36_K_Means_Clustering.ipynb)
* [37 Support Vector Machines](37_Support_Vector_Machines.ipynb)
* [38 MNIST Machine Learning](38_MNIST_Machine_Learning.ipynb)
* [39 Monte Carlo Simulation](39_Monte_Carlo_Simulation.ipynb)
* [40 Debugging and Performance](40_Debugging_Performance.ipynb)
* [41 The plotly Package](41_The_plotly_Package.ipynb)

   
## Data Science Workflow

![Data Science Workflow](images/DataScienceWorkflow.png "Data Science Workflow")

- **Data Source** First, the data must be imported from a data source such as a file, database, or web API. It is usually loaded into a data frame.
- **Tidy** Tidying data means storing it in a consistent structure in which each column is a variable and each row is an observation. Missing data and outliers may also be handled at this stage.
- **Transform** Transforming data includes focusing on specific variables, creating new convenience variables computed from existing ones, and calculating summary statistics. Together, tidying and transforming data are known as data wrangling.
- **Visualize** Data visualisation provides insight into your data that can deepen understanding and guide further inquiry. Visualisation requires human or artificial intelligence to interpret the underlying meaning embedded in the data.
- **Model** A data model is a mathematical or algorithmic representation of data relationships that can verify a hypothesis about the data or make new predictions from the data.
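
As a rough illustration of the five stages, the sketch below walks through them using only base R. It assumes the built-in `airquality` dataset as a stand-in for an external data source; the names `raw`, `tidy`, `TempC`, `monthly`, and `fit` are hypothetical and chosen only for this example.

```r
# Data source: airquality ships with base R (datasets package),
# standing in here for a file, database, or web API.
raw <- datasets::airquality

# Tidy: the data already has one variable per column and one
# observation per row; here we simply drop rows with missing values.
tidy <- na.omit(raw)

# Transform: derive a convenience variable (temperature in Celsius)
# and compute a summary statistic (mean ozone per month).
tidy$TempC <- (tidy$Temp - 32) * 5 / 9
monthly <- aggregate(Ozone ~ Month, data = tidy, FUN = mean)
print(monthly)

# Visualize: a scatter plot of ozone against temperature.
plot(tidy$TempC, tidy$Ozone,
     xlab = "Temperature (C)", ylab = "Ozone (ppb)")

# Model: a simple linear regression of ozone on temperature.
fit <- lm(Ozone ~ TempC, data = tidy)
print(coef(fit))
```

Dropping incomplete rows with `na.omit()` is only one way to handle missing data. The tidyverse packages covered in the topic list (readr, tidyr, dplyr, ggplot2) offer alternatives for each stage.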

© 2018 Peter Thorsteinson - This work is licensed under the [Creative Commons Attribution-NonCommercial-NoDerivs 3.0](https://creativecommons.org/licenses/by-nc-nd/3.0/us/).

BTC: 1D4d7Bhxj5QWcgbQyZCowTMEi36CNYcmQf