# Maximum Likelihood

This section is based on the [R programming wikibook](https://en.wikibooks.org/wiki/R_Programming/Maximum_Likelihood).

## Introduction


Maximum likelihood estimation is an optimization problem: you write down your log likelihood function and maximise it with a numerical optimization technique. Some techniques also require the score (the first derivative of the log likelihood) and/or the Hessian (the second derivative of the log likelihood).
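
As a minimal sketch of these three ingredients, consider an exponential model with an unknown rate. The model, the simulated data and the function names `loglik_exp`, `score_exp` and `hessian_exp` are assumptions chosen for illustration; they do not come from the wikibook.

```r
# Illustrative sketch only: exponential model with unknown rate (an assumed example)
set.seed(1)                      # arbitrary seed, for reproducibility only
x <- rexp(200, rate = 2)         # simulated data; the true rate of 2 is an assumption

# Log likelihood: n * log(rate) - rate * sum(x)
loglik_exp <- function(rate, x) length(x) * log(rate) - rate * sum(x)

# Score: first derivative of the log likelihood with respect to the rate
score_exp <- function(rate, x) length(x) / rate - sum(x)

# Hessian: second derivative of the log likelihood with respect to the rate
hessian_exp <- function(rate, x) -length(x) / rate^2

# For this model the score has a closed-form root, which is the maximiser
rate_hat <- 1 / mean(x)
score_exp(rate_hat, x)           # zero up to rounding error
hessian_exp(rate_hat, x)         # negative, so rate_hat is a maximum
```

Most models have no closed-form root for the score, which is why the remaining examples rely on numerical optimisers.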

## One dimension


If there is only one parameter, then the log likelihood can be optimised using the __optimize__ function.


### <u>Example 1</u> - Type 1 Pareto distribution

Note that in this example the minimum value is not optimised: the log likelihood fixes it at the sample minimum, `min(x)`, so only the shape is estimated numerically. Therefore this is a one-dimensional problem.

The __rpareto1__ function from the __actuar__ package is used to generate a random sample from a type 1 Pareto distribution with shape equal to 1 and minimum value equal to 500. The __dpareto1__ function, also from the __actuar__ package, is used with the option log = TRUE to write the log likelihood. Finally, __optimize__ is used with maximum = TRUE, and the lower and upper bounds for the parameter are provided using the interval option.

First, install the __actuar__ package.

In [3]:
install.packages("actuar")

Installing package into ‘/home/grosedj/R-packages’
(as ‘lib’ is unspecified)

also installing the dependency ‘expint’


“installation of package ‘expint’ had non-zero exit status”
“installation of package ‘actuar’ had non-zero exit status”


In [7]:
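library(actuar)
# Simulate 1000 draws from a type 1 Pareto distribution (shape 1, minimum 500)
y <- rpareto1(1000, shape = 1, min = 500)
# Log likelihood as a function of the shape; the minimum is fixed at min(x)
ll <- function(shape, x)
{
    sum(dpareto1(x, shape = shape, min = min(x), log = TRUE))
}
# Maximise over the shape; interval gives the lower and upper search bounds
optimize(f = ll, x = y, interval = c(0, 10), maximum = TRUE)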
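
The result of __optimize__ can be sanity-checked against the closed-form estimate of the shape, $n / \sum_i \log(x_i / m)$, which holds when the minimum $m$ is fixed. The sketch below is a base R variant that does not need __actuar__. It assumes the minimum is known to be 500 (rather than set to `min(x)` as above), and the names `ll_pareto`, `m` and `shape_closed` are hypothetical.

```r
# Illustrative sketch only: type 1 Pareto log likelihood written in base R
set.seed(42)                     # arbitrary seed, for reproducibility only
m <- 500                         # minimum value, assumed known here
x <- m / runif(1000)             # inverse-CDF sampling for shape = 1

# Log density: log(shape) + shape * log(m) - (shape + 1) * log(x), for x >= m
ll_pareto <- function(shape, x, m) {
    sum(log(shape) + shape * log(m) - (shape + 1) * log(x))
}

# Numerical maximisation over the shape parameter
fit <- optimize(f = ll_pareto, x = x, m = m, interval = c(0.01, 10), maximum = TRUE)

# Closed-form maximiser for a fixed minimum m
shape_closed <- length(x) / sum(log(x / m))

c(numerical = fit$maximum, closed_form = shape_closed)
```

The two values should agree to within the tolerance of __optimize__, which is a quick way to confirm that the log likelihood was written correctly.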


### <u>Exercise 1</u>

Find out more about the __optimize__ function.

In [8]:
help(optimize)

### <u>Exercise 2</u>

How could you use the __Curry__ function from the __functional__ package to help organise your functions and parameters for use with __optimize__?

### <u>Exercise 3</u>

Demonstrate your solution to Exercise 2.

## Multiple dimensions

For optimising more than one parameter, use the __optim__ function. Note that __optim__ minimises by default, so the example below passes the negative log likelihood.

### <u>Example 2</u> - Beta distribution

In [14]:
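# Simulate 1000 draws from a Beta(2, 2) distribution
y <- rbeta(1000, 2, 2)
# Negative log likelihood: optim minimises by default, hence the minus sign
negloglik <- function(mu, x)
{
    -sum(dbeta(x, mu[1], mu[2], log = TRUE))
}
# Both shape parameters are bounded below by 0
out <- optim(par = c(1, 1), fn = negloglik, x = y, method = "L-BFGS-B", lower = c(0, 0))
print(out)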

$par
[1] 1.842181 1.849503

$value
[1] -101.1554

$counts
function gradient 
       7        7 

$convergence
[1] 0

$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"



Note that the runtime of the optimiser grows with the dimension $d$ of the problem (i.e. the number of parameters). When no gradient function is supplied, __optim__ approximates the gradient by finite differences, which costs on the order of $d$ extra evaluations of the objective per gradient. Supplying a function that computes the gradient of the objective directly can greatly reduce this cost. __optim__ accepts such a function through its gr argument, and the __optimx__ package, which wraps many optimisers behind a common interface, can exploit it as well.
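
As a hedged sketch of supplying a gradient, the Beta example can be rerun with an analytic gradient passed to the `gr` argument of base __optim__. The function names `negloglik` and `negloglik_grad`, the seed, and the small positive lower bound are assumptions made for illustration; the gradient uses the standard digamma expressions for the Beta log likelihood.

```r
# Illustrative sketch only: Beta negative log likelihood with an analytic gradient
set.seed(123)                    # arbitrary seed, for reproducibility only
y <- rbeta(1000, 2, 2)

# Negative log likelihood in the two shape parameters
negloglik <- function(par, x) -sum(dbeta(x, par[1], par[2], log = TRUE))

# Gradient of the negative log likelihood with respect to (par[1], par[2])
negloglik_grad <- function(par, x) {
    n <- length(x)
    a <- par[1]
    b <- par[2]
    c(-(sum(log(x))    - n * (digamma(a) - digamma(a + b))),
      -(sum(log1p(-x)) - n * (digamma(b) - digamma(a + b))))
}

# Without gr, optim falls back on a finite-difference approximation
fit_numeric <- optim(par = c(1, 1), fn = negloglik, x = y,
                     method = "L-BFGS-B", lower = c(1e-6, 1e-6))

# With gr, the analytic gradient is used instead
fit_analytic <- optim(par = c(1, 1), fn = negloglik, gr = negloglik_grad, x = y,
                      method = "L-BFGS-B", lower = c(1e-6, 1e-6))

rbind(numeric = fit_numeric$par, analytic = fit_analytic$par)
```

Both fits should land on essentially the same estimates. With only two parameters the saving is small, but it grows with the number of parameters because each finite-difference gradient needs extra evaluations of the objective.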

### <u>Exercise 4</u>

Find details of the __optimx__ package and install it.