# Table of Contents
* [1 Install external packages](#Install-external-packages)
    * [1.1 Functions from external R packages](#Functions-from-external-R-packages)
    * [1.2 User-defined functions](#User-defined-functions)
        * [1.2.1 Passing optional arguments to a function in the function](#Passing-optional-arguments-to-a-function-in-the-function)
    * [1.3 Efficient evaluation of functions: apply, lapply, sapply](#Efficient-evaluation-of-functions:-apply,-lapply,-sapply)

[Back to course overview](../CourseOverviewR.ipynb)

# Install external packages
* Author: Johannes Maucher
* Last Update: 2017-03-13

R is primarily a functional language: functions are treated like any other data type. For example, functions can be assigned to variables and passed as arguments to other functions. Even simple operators such as *+* are functions. The conventional notation *x+y* is just a shortcut for *"+"(x,y)*.
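The following minimal sketch illustrates these statements. The helper name *applyTwice* is hypothetical and only serves the illustration.

```r
# The infix form and the prefix (function-call) form give the same result.
x <- 2
y <- 3
infix  <- x + y
prefix <- "+"(x, y)            # backticks work as well: `+`(x, y)
print(identical(infix, prefix))

# Functions are ordinary objects: one can be assigned to a variable ...
add <- `+`
print(add(10, 5))

# ... or passed as an argument to another function.
applyTwice <- function(f, value) f(f(value))   # hypothetical helper
print(applyTwice(sqrt, 16))                    # sqrt(sqrt(16))
```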

## Functions from external R packages
There exist more than 10000 R packages, which provide solutions for all kinds of problems. External R packages can be downloaded e.g. from [https://cran.r-project.org/](https://cran.r-project.org/). The list of all installed packages available in your current environment can be obtained by the following statement:

In [20]:
library()

Functions of external packages that are installed in your environment must be loaded before they can be used. The functions of package *X* are loaded by
*library(X)*. For example, the following statement loads the package *ggplot2*:

In [24]:
library(ggplot2)

External packages that are not yet installed in your environment can be downloaded and installed by

install.packages("NameOfPackage")

For example, the following statement downloads and installs the package *tagcloud*:

In [33]:
install.packages("tagcloud")

package 'tagcloud' successfully unpacked and MD5 sums checked

The downloaded binary packages are in
	C:\Users\maucher\AppData\Local\Temp\RtmpuKA0I2\downloaded_packages


This command is only successful if the package exists on the configured CRAN mirror. A list of all available CRAN mirrors can be obtained by the *getCRANmirrors()*-function. A particular mirror can then be set by

*options(repos=structure(c(CRAN="http://cloud.r-project.org/")))*

Example:

In [31]:
getCRANmirrors(all = FALSE, local.only = FALSE)
options(repos=structure(c(CRAN="http://cloud.r-project.org/")))

Name,Country,City,URL,Host,Maintainer,OK,CountryCode,Comment
0-Cloud [https],0-Cloud,0-Cloud,https://cloud.r-project.org/,"Automatic redirection to servers worldwide, currently sponsored by Rstudio",winston # stdout.org,1,us,secure_mirror_from_master
0-Cloud,0-Cloud,0-Cloud,http://cloud.r-project.org/,"Automatic redirection to servers worldwide, currently sponsored by Rstudio",winston # stdout.org,1,us,secure_mirror_from_master
Algeria [https],Algeria,Algiers,https://cran.usthb.dz/,University of Science and Technology Houari Boumediene,Boukala m c <mboukala # usthb.dz>,1,dz,
Algeria,Algeria,Algiers,http://cran.usthb.dz/,University of Science and Technology Houari Boumediene,Boukala m c <mboukala # usthb.dz>,1,dz,
Argentina (La Plata),Argentina,La Plata,http://mirror.fcaglp.unlp.edu.ar/CRAN/,Universidad Nacional de La Plata,esuarez # Fcaglp.unlp.edu.ar,1,ar,
Australia (Canberra) [https],Australia,Canberra,https://cran.csiro.au/,CSIRO,"Bill.Venables # CSIRO.au, ServiceDesk2 # CSIRO.au",1,au,secure_mirror_from_master
Australia (Canberra),Australia,Canberra,http://cran.csiro.au/,CSIRO,"Bill.Venables # CSIRO.au, ServiceDesk2 # CSIRO.au",1,au,secure_mirror_from_master
Australia (Melbourne) [https],Australia,Melbourne,https://cran.ms.unimelb.edu.au/,University of Melbourne,cran # ms.unimelb.edu.au,1,au,secure_mirror_from_master
Australia (Melbourne),Australia,Melbourne,http://cran.ms.unimelb.edu.au/,University of Melbourne,cran # ms.unimelb.edu.au,1,au,secure_mirror_from_master
Australia (Perth) [https],Australia,Perth,https://cran.curtin.edu.au/,Curtin University of Technology,unix # curtin.edu.au,1,au,secure_mirror_from_master
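The steps above (choosing a mirror, installing, loading) can be combined in a small helper. This is a minimal sketch: the function name *ensurePackage* is hypothetical, and the https variant of the cloud mirror from the list above is used as an assumption.

```r
# Select a mirror; a plain named character vector is sufficient here.
options(repos = c(CRAN = "https://cloud.r-project.org/"))
print(getOption("repos")["CRAN"])      # check which mirror is configured

# Hypothetical helper: load a package and install it first only if it is missing.
ensurePackage <- function(pkg) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
        install.packages(pkg)            # needs a reachable CRAN mirror
    }
    library(pkg, character.only = TRUE)  # pkg is a string, not a bare name
    invisible(TRUE)
}

# "stats" ships with R, so this call never triggers a download.
ensurePackage("stats")
print("package:stats" %in% search())
```

For a package that is not installed yet, e.g. *ensurePackage("tagcloud")*, the helper would call *install.packages()* once and load the package afterwards.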


## User-defined functions

Users can define their own functions. Encapsulating code in functions yields more structured and readable code. The most important advantage, however, is that routines which are required more than once need not be implemented repeatedly. A function is defined only once and can then be used wherever it is required.

The general syntax for functions in R is:

In [33]:
functionName<-function(listOfParameters){
 statements
 return (result)
}

The list of parameters within the parentheses that follow the keyword *function* defines the arguments, which are passed as input to the function. Within the function body (inside the curly brackets) arbitrarily complex statements are executed. The result of this computation is returned by the function. The defined function is assigned to a variable *functionName*. The function can be accessed via this variable name as shown below.

For example, the following code snippet defines a function that normalizes the values of the vector passed as its argument to the range from 0 to 1. The normalized values of the passed vector are returned by the function.

In [1]:
myNormalizer<-function(rawdata){
    # Min-max normalization: the smallest value maps to 0, the largest to 1
    maximum<-max(rawdata)
    minimum<-min(rawdata)
    normeddata<-(rawdata-minimum)/(maximum-minimum)
    return (normeddata)
}

Now this function can be executed wherever it is required by calling it by its name:

In [11]:
a<-10:20
A<-myNormalizer(a)
print(A)

 [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0


In [12]:
b<-c(33,34,20,52,60,71)
B<-myNormalizer(b)
print(B)

[1] 0.2549020 0.2745098 0.0000000 0.6274510 0.7843137 1.0000000
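The normalizer above divides by *maximum-minimum*, so a constant input vector yields NaN, and a single missing value turns the whole result into NA. The following sketch shows one possible way to guard against both cases. The function name *mySafeNormalizer* and the decision to return zeros for constant input are assumptions made for illustration only.

```r
# Hypothetical variant of myNormalizer() that guards two edge cases.
mySafeNormalizer <- function(rawdata) {
    minimum <- min(rawdata, na.rm = TRUE)   # ignore missing values
    maximum <- max(rawdata, na.rm = TRUE)
    if (maximum == minimum) {
        # Constant input: (x - min) / 0 would give NaN, so return zeros instead.
        # This is a design choice, not the only reasonable one.
        return(ifelse(is.na(rawdata), NA, 0))
    }
    # The value of the last expression is returned implicitly.
    (rawdata - minimum) / (maximum - minimum)
}

print(mySafeNormalizer(c(5, 5, 5)))          # constant vector
print(mySafeNormalizer(c(10, NA, 20, 15)))   # NA stays NA, the rest is scaled
```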


### Passing optional arguments to a function in the function
The function *myNormalizer()* shall now be extended such that it can also provide normalized values, which are rounded to a configurable number of digits. The standard R-function *round()* already has the parameter *digits*, which sets the number of digits after the decimal point. The *round()*-function shall now be applied in the new function *myRoundNormalizer()*. Hence, a value for the *digits* parameter of the *round()*-function must be passed to *myRoundNormalizer()*. Passing an arbitrary set of parameters through to an inner function can be realized by the *...*-argument (triple dot or ellipsis). This is demonstrated in the following code cells.

In [13]:
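myRoundNormalizer<-function(rawdata,round=FALSE,...){
    # The logical parameter 'round' does not hide the function round():
    # in call position R only looks for functions of that name.
    maximum<-max(rawdata)
    minimum<-min(rawdata)
    normeddata<-(rawdata-minimum)/(maximum-minimum)
    if (!round){
        return (normeddata)
    }else{
        # All arguments collected in ... (e.g. digits) are forwarded to round()
        return (round(normeddata,...))
    }
}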

In [14]:
B<-myRoundNormalizer(b)
print(B)
B<-myRoundNormalizer(b,round=T)
print(B)
B<-myRoundNormalizer(b,round=T,digits=2)
print(B)

[1] 0.2549020 0.2745098 0.0000000 0.6274510 0.7843137 1.0000000
[1] 0 0 0 1 1 1
[1] 0.25 0.27 0.00 0.63 0.78 1.00
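The same forwarding mechanism works for any inner function. The following minimal sketch uses a hypothetical wrapper *centerValue()* around the standard function *mean()* and additionally shows how the forwarded arguments can be inspected with *list(...)*.

```r
# Hypothetical wrapper: everything after 'x' is forwarded unchanged to mean().
centerValue <- function(x, ...) {
    extras <- list(...)                      # capture the forwarded arguments
    cat("forwarded:", names(extras), "\n")   # show which names were passed
    mean(x, ...)
}

values <- c(1, 2, 3, NA, 100)
print(centerValue(values))                   # NA, because of the missing value
print(centerValue(values, na.rm = TRUE))     # mean of 1, 2, 3 and 100
```

The wrapper itself does not need to know the parameter *na.rm*. It is accepted because *mean()* understands it.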


## Efficient evaluation of functions: apply, lapply, sapply
If there exist many sequences of numeric values (such as *a* and *b* above) and each sequence shall be normalized by the *myNormalizer*-function, one can just implement a loop, which invokes in each iteration the *myNormalizer*-function for an individual input argument (sequence of numeric values). Such an implementation would work, but it requires explicit bookkeeping for the loop index and the result container. It is more concise and idiomatic to use the R built-in function *lapply(list of variables, functionName)*.

As the following code snippet shows, no explicit loop is required:

In [15]:
columnlist<-list(l1=a,l2=b)
print(columnlist)

$l1
 [1] 10 11 12 13 14 15 16 17 18 19 20

$l2
[1] 33 34 20 52 60 71



In [16]:
columnlistNormed<-lapply(columnlist,myNormalizer)

In [17]:
columnlistNormed
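For comparison, the following sketch places the explicit loop next to the *lapply()* call. The data and the normalizer are repeated so that the block is self-contained. The variable names *loopResult* and *lapplyResult* are chosen for illustration only.

```r
myNormalizer <- function(rawdata) {
    (rawdata - min(rawdata)) / (max(rawdata) - min(rawdata))
}
columnlist <- list(l1 = 10:20, l2 = c(33, 34, 20, 52, 60, 71))

# Explicit loop: preallocate the result list, then fill it element by element.
loopResult <- vector("list", length(columnlist))
names(loopResult) <- names(columnlist)
for (name in names(columnlist)) {
    loopResult[[name]] <- myNormalizer(columnlist[[name]])
}

# lapply(): one call, the element names are carried over automatically.
lapplyResult <- lapply(columnlist, myNormalizer)

# Both approaches should produce the same list.
print(all.equal(loopResult, lapplyResult))
```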

Note that the first parameter of the *lapply()*-function is a list, which contains the objects on which the function (*myNormalizer()* in the example above) shall be executed. If a function shall be executed not on a list of objects but on an array or a matrix, the *apply()*-function can be used. This function has an additional parameter (*MARGIN*), which determines along which dimension of the multidimensional object the function is applied. This is demonstrated in the example below. Here the *myNormalizer()*-function is first applied to each row (second argument 1 in *apply()*) and then to each column (second argument 2 in *apply()*). Note that for rowwise application *apply()* stores the result of each row as a column, so the returned matrix is transposed relative to the input.

In [57]:
mymat=matrix(floor(runif(28)*20),nrow=4,ncol=7)

In [58]:
mymat

0,1,2,3,4,5,6
0,9,15,2,17,19,15
15,17,16,6,14,14,16
16,15,15,14,4,5,0
4,8,19,1,19,15,12


Rowwise normalization (each column of the result holds one normalized row of *mymat*):

In [59]:
mymatNormed=apply(mymat,1,myNormalizer)
mymatNormed

0,1,2,3
0.0,0.8181818,1.0,0.1666667
0.4736842,1.0,0.9375,0.3888889
0.7894737,0.9090909,0.9375,1.0
0.1052632,0.0,0.875,0.0
0.8947368,0.7272727,0.25,1.0
1.0,0.7272727,0.3125,0.7777778
0.7894737,0.9090909,0.0,0.6111111


Columnwise normalization:

In [60]:
mymatNormed=apply(mymat,2,myNormalizer)
mymatNormed

0,1,2,3,4,5,6
0.0,0.1111111,0.0,0.07692308,0.8666667,1.0,0.9375
0.9375,1.0,0.25,0.38461538,0.6666667,0.6428571,1.0
1.0,0.7777778,0.0,1.0,0.0,0.0,0.0
0.25,0.0,1.0,0.0,1.0,0.7142857,0.75
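Because the meaning of the second argument of *apply()* is easy to mix up, the following sketch repeats the experiment on a small fixed matrix. The names *m*, *normalize*, *byRow* and *byColumn* are chosen for illustration only.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)   # rows: (1, 3, 5) and (2, 4, 6)

# MARGIN = 1 applies the function to each row, MARGIN = 2 to each column.
print(apply(m, 1, sum))                # one sum per row
print(apply(m, 2, sum))                # one sum per column

normalize <- function(v) (v - min(v)) / (max(v) - min(v))

# Rowwise: each normalized row is stored as a COLUMN of the result (3 x 2).
byRow <- apply(m, 1, normalize)
print(dim(byRow))

# t() restores the original orientation (2 x 3).
print(t(byRow))

# Columnwise: the result already has the shape of the input (2 x 3).
byColumn <- apply(m, 2, normalize)
print(dim(byColumn))
```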


*sapply()* is similar to *lapply()*. However, it simplifies the result to a vector or a matrix instead of a list whenever this is possible:

In [23]:
columnmean<-lapply(columnlist,mean)
columnmean
class(columnmean)

In [24]:
columnmean<-sapply(columnlist,mean)
columnmean
class(columnmean)
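The difference between the return types can be checked directly. The sketch below repeats the list from above to be self-contained and additionally shows *vapply()*, a standard R function that works like *sapply()* but requires the expected type of each result to be declared. The variable names are chosen for illustration only.

```r
columnlist <- list(l1 = 10:20, l2 = c(33, 34, 20, 52, 60, 71))

asList   <- lapply(columnlist, mean)   # always a list
asVector <- sapply(columnlist, mean)   # simplified to a named numeric vector
print(class(asList))
print(class(asVector))

# If the function returns more than one value, sapply() builds a matrix
# with one column per list element.
asRange <- sapply(columnlist, range)
print(asRange)

# vapply() fails with an error if a result does not match FUN.VALUE,
# here: exactly one numeric value per list element.
checked <- vapply(columnlist, mean, FUN.VALUE = numeric(1))
print(checked)
```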