1\. **Maximum wind speed prediction at the Sprogø station**

The goal of this exercise is to predict the maximum wind speed expected once every 50 years, even though no measurements cover such a long period. The available data span only 21 years and were recorded at the Sprogø meteorological station in Denmark.

The annual maxima are assumed to follow a normal probability density function. That function will not be estimated here, however, because it maps a wind speed maximum to a probability. Finding the maximum wind speed occurring every 50 years requires the opposite direction: the wind speed must be obtained from a given probability. This is the role of the quantile function, and the goal of the exercise is to find it. In the current model, the maximum wind speed occurring every 50 years is defined as the upper 2% quantile.

By definition, the quantile function is the inverse of the cumulative distribution function. The latter describes the probability distribution of the annual maxima. In this exercise, the cumulative probability $p_i$ for a given year $i$ is defined as $p_i = i/(N+1)$, where $N = 21$ is the number of measured years. This makes it possible to assign a cumulative probability to every measured wind speed maximum. From those experimental points, the scipy.interpolate module can be used to fit the quantile function. Finally, the 50-year maximum is evaluated from the cumulative probability of the 2% quantile.
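
As a minimal sketch of the empirical probabilities defined above (the variable names are illustrative and no measured data are needed for this step), the values $p_i = i/(N+1)$ can be computed directly. It also shows that the target probability of 0.98 lies slightly above the largest empirical value, $21/22$, so the fitted quantile function will be evaluated just outside the range covered by the data.

```python
import numpy as np

N = 21                       # number of measured years, as stated in the text
ranks = np.arange(1, N + 1)  # rank i = 1..N of the sorted annual maxima
cprob = ranks / (N + 1)      # empirical cumulative probability p_i = i / (N + 1)

fifty_prob = 1.0 - 0.02      # cumulative probability of the upper 2% quantile

print("smallest p_i:", cprob[0])
print("largest p_i:", cprob[-1])
print("target lies beyond the data range:", fifty_prob > cprob[-1])
```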

Practically, load the dataset:

```python
import numpy as np
max_speeds = np.load('max-speeds.npy')
years_nb = max_speeds.shape[0]
```

Then compute the cumulative probability $p_i$ (`cprob`) and sort the maximum speeds from the data. Next, use `UnivariateSpline` from scipy.interpolate to define a quantile function, which estimates the wind speed corresponding to a given probability.

In the current model, the maximum wind speed occurring every 50 years is defined as the upper 2% quantile. As a result, the cumulative probability value will be:

```python
fifty_prob = 1. - 0.02
```

So the storm wind speed occurring every 50 years can be guessed as:

```python
fifty_wind = quantile_func(fifty_prob)
```



In [None]:
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
from scipy.interpolate import UnivariateSpline

max_speeds = np.load('max-speeds.npy')
years_nb = max_speeds.shape[0]
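# Rank index i = 1..N of the sorted annual maxima
years = np.arange(1, years_nb + 1)
# Empirical cumulative probability p_i = i / (N + 1)
cprob = years / (years_nb + 1)
# Sort the maxima so that they pair with the increasing p_i
max_speeds = np.sort(max_speeds)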

plt.plot(cprob, max_speeds,'x', label='Observed Data')
quantile_f = UnivariateSpline(cprob, max_speeds)
plt.plot(cprob, quantile_f(cprob), label='Predicted Data')
plt.legend()

fifty_prob = 1. - 0.02
fifty_wind = quantile_f(fifty_prob)
print('\nThe guessed storm wind speed occurring every 50 years is:', fifty_wind)

2\. **Curve fitting of temperature in Alaska** 

The temperature extremes in Alaska for each month, starting in January, are given by (in degrees Celsius):

max:  17,  19,  21,  28,  33,  38, 37,  37,  31,  23,  19,  18

min: -62, -59, -56, -46, -32, -18, -9, -13, -25, -46, -52, -58

* Plot these temperature extremes.
* Define a function that can describe min and max temperatures. 
* Fit this function to the data with scipy.optimize.curve_fit().
* Plot the result. Is the fit reasonable? If not, why?
* Is the time offset for min and max temperatures the same within the fit accuracy?
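
One possible way to approach the last question is sketched below. It assumes a periodic model (a cosine with a fixed 12-month period) rather than any particular choice made elsewhere in this notebook; the names `yearly_temps` and `fit_offset` and the initial guesses are illustrative assumptions. The uncertainty on each fitted time offset is taken from the diagonal of the covariance matrix returned by `curve_fit`, and the two offsets are compared against their combined uncertainty.

```python
import numpy as np
from scipy import optimize

temp_max = np.array([17, 19, 21, 28, 33, 38, 37, 37, 31, 23, 19, 18], dtype=float)
temp_min = np.array([-62, -59, -56, -46, -32, -18, -9, -13, -25, -46, -52, -58], dtype=float)
months = np.arange(12)  # 0 = January


def yearly_temps(t, avg, ampl, time_offset):
    """Assumed model: cosine with a fixed period of 12 months."""
    return avg + ampl * np.cos((t + time_offset) * 2 * np.pi / 12)


def fit_offset(temps, p0):
    """Fit the model and return (average, time offset, offset uncertainty)."""
    params, cov = optimize.curve_fit(yearly_temps, months, temps, p0=p0)
    avg, ampl, offset = params
    offset_err = np.sqrt(cov[2, 2])
    # (ampl, offset) and (-ampl, offset + 6) describe the same curve,
    # so normalise to a positive amplitude before comparing offsets.
    if ampl < 0:
        offset += 6
    return avg, offset % 12, offset_err


avg_max, off_max, err_max = fit_offset(temp_max, [20, 10, 0])
avg_min, off_min, err_min = fit_offset(temp_min, [-40, 20, 0])

# Offset difference wrapped into [-6, 6) months, with its combined uncertainty
diff = (off_max - off_min + 6) % 12 - 6
combined_err = np.hypot(err_max, err_min)
print("offset difference [months]:", diff, "+/-", combined_err)
```

If the absolute difference is not larger than a few times the combined uncertainty, the two offsets can be regarded as compatible within the fit accuracy.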

In [None]:
from scipy.stats import norm
from scipy import optimize

max_temp = np.array([17, 19, 21, 28, 33, 38, 37, 37, 31, 23, 19, 18])
min_temp = np.array([-62, -59, -56, -46, -32, -18, -9, -13, -25, -46, -52, -58])

months = np.linspace(1, 12, 12)
plt.scatter(months, max_temp, color='red', label='Max Temperature')
plt.scatter(months, min_temp, color='blue', label='Min Temperature')

# Define gaussian function
def gaussian(x, mean, sigma, amp, off):
    return amp*norm.pdf(x, mean, sigma) + off

# The fit of the minimum temperatures needs an initial guess (p0) for the
# parameters; without it, curve_fit does not converge to a good result.

max_p, max_p_cov = optimize.curve_fit(gaussian, months, max_temp)
min_p, min_p_cov = optimize.curve_fit(gaussian, months, min_temp, p0=[7, 1, 1, -60])

plt.plot(months, gaussian(months, *max_p), color='red', label='Max T predicted')
plt.plot(months, gaussian(months, *min_p), color='blue', label='Min T predicted')
plt.legend()
plt.show()

3\. **2D minimization of a six-hump camelback function**

$$
f(x,y) = \left(4-2.1x^2+\frac{x^4}{3} \right) x^2 +xy + (4y^2 -4)y^2
$$

The six-hump camelback function above has multiple global and local minima. Find the global minima of this function.

Hints:

* Variables can be restricted to $-2 < x < 2$ and $-1 < y < 1$.
* Use numpy.meshgrid() and pylab.imshow() to locate the regions visually.
* Use scipy.optimize.minimize(), optionally trying out several of its methods.

How many global minima are there, and what is the function value at those points? What happens for an initial guess of $(x, y) = (0, 0)$ ?
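
A hedged sketch of one way to answer the counting question: run `scipy.optimize.minimize()` from a small grid of starting points and collect the distinct end points that reach the lowest function value. The grid, the tolerance, and the helper name `camelback` are illustrative assumptions, not part of the exercise statement.

```python
import numpy as np
from scipy import optimize


def camelback(p):
    """Six-hump camelback function taking a single (x, y) pair."""
    x, y = p
    return (4 - 2.1 * x**2 + x**4 / 3) * x**2 + x * y + (4 * y**2 - 4) * y**2


# Grid of starting points inside -2 < x < 2 and -1 < y < 1 (avoids (0, 0))
starts = [(x0, y0)
          for x0 in np.linspace(-1.5, 1.5, 7)
          for y0 in np.linspace(-0.75, 0.75, 4)]

results = [optimize.minimize(camelback, start) for start in starts]
best_value = min(res.fun for res in results)

# Keep the end points whose value matches the best one; round the
# coordinates so that runs ending in the same minimum are merged.
global_minima = sorted({
    (round(float(res.x[0]), 1), round(float(res.x[1]), 1))
    for res in results
    if res.fun < best_value + 1e-4
})

print("lowest function value found:", best_value)
print("distinct minima reaching it:", global_minima)
```

A multi-start search like this only finds the minima that at least one starting point converges to, so the grid should be checked against the image of the function.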


In [None]:
def f(x,y):
    return (4 - 2.1*(x**2) + (x**4)/3)*(x**2) + x*y + (4*(y**2) - 4)*(y**2)
def f_p(x):
    return (4 - 2.1*(x[0]**2) + (x[0]**4)/3)*(x[0]**2) + x[0]*x[1] + (4*(x[1]**2) - 4)*(x[1]**2)

N = 100
x = np.linspace(-2,2,N)
y = np.linspace(-1,1,N)

x_v, y_v = np.meshgrid(x, y)

# origin='lower' puts y = -1 at the bottom so the image matches the axis labels
plt.imshow(f(x_v, y_v), extent=[-2, 2, -1, 1], origin='lower')
plt.colorbar()

# There are two global minima because the function is symmetric under
# (x, y) -> (-x, -y), i.e. f(x, y) = f(-x, -y); the other minima visible
# in the plot are only local.

# First initial guess
x1 = [0, -1]
min1 = optimize.minimize(f_p, x1)
print('First global min at:', min1.x, 'with value:', min1.fun)

# Second initial guess
x2 = [0, 1]
min2 = optimize.minimize(f_p, x2)
print('Second global min at:', min2.x, 'with value:', min2.fun)

#Initial guess (0,0)
x0 = [0,0]
min0 = optimize.minimize(f_p, x0)
print('Result obtained starting from point (0,0):', min0.fun)
# The gradient vanishes at (0,0), so the optimizer stays there.
# The point is a saddle point, not a minimum.

4\. **FFT of a simple dataset**

Perform a periodicity analysis on the lynx-hare population data.
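
To illustrate how a period can be read off a spectrum, here is a minimal sketch on a synthetic signal with a known period of 10 samples (the signal and all names are made up for illustration and are unrelated to the population data). The dominant period is the inverse of the positive frequency at which the magnitude of the FFT peaks.

```python
import numpy as np
from scipy import fftpack

n_samples = 40
t = np.arange(n_samples)                       # one sample per time unit
signal = 5.0 + np.sin(2 * np.pi * t / 10.0)    # synthetic data, period = 10

# Subtract the mean so the zero-frequency term does not dominate
spectrum = np.abs(fftpack.fft(signal - signal.mean()))
freqs = fftpack.fftfreq(n_samples, d=1.0)      # frequencies in cycles per sample

# Look only at positive frequencies and pick the strongest one
positive = freqs > 0
peak_freq = freqs[positive][np.argmax(spectrum[positive])]
peak_period = 1.0 / peak_freq

print("dominant period:", peak_period)
```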

In [None]:
from scipy import fftpack

data = np.loadtxt('populations.txt')
year, hares, lynxes, carrots = data.T

plt.plot( year, hares, year, lynxes, year, carrots ) 
plt.legend( ('Hare', 'Lynx', 'Carrot') )

In [None]:
fft_hares = fftpack.fft(hares)
fft_lynxes = fftpack.fft(lynxes)
sample_freq = fftpack.fftfreq(year.size)

plt.figure()
plt.plot(sample_freq, np.abs(fft_hares), label='Hares')
plt.plot(sample_freq, np.abs(fft_lynxes), label='Lynxes')
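# The x axis holds frequencies (inverse of the period), the y axis FFT magnitudes
plt.xlabel('Frequency [1/year]')
plt.ylabel('Amplitude')
plt.legend()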

5\. **FFT of an image**

* Examine the provided image `moonlanding.png`, which is heavily contaminated with periodic noise. In this exercise, we aim to clean up the noise using the Fast Fourier Transform.
* Load the image using pylab.imread().
* Find and use the 2-D FFT function in scipy.fftpack, and plot the spectrum (Fourier transform of) the image. Do you have any trouble visualising the spectrum? If so, why?
* The spectrum consists of high and low frequency components. The noise is contained in the high-frequency part of the spectrum, so set some of those components to zero (use array slicing).
* Apply the inverse Fourier transform to see the resulting image.
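
The slicing step can be sketched on a small synthetic image, so that the effect is easy to check (the image, the noise pattern, and the fraction `keep_fraction = 0.1` are illustrative assumptions, not values taken from `moonlanding.png`). In the array returned by `fft2`, low frequencies sit in the corners, so zeroing the central rows and columns removes the high-frequency components.

```python
import numpy as np
from scipy import fftpack

rows, cols = 64, 64
yy, xx = np.mgrid[0:rows, 0:cols]

# Synthetic smooth image plus a high-frequency periodic stripe pattern
clean = np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / (2 * 10.0 ** 2))
noisy = clean + 0.5 * np.sin(2 * np.pi * 20 * xx / cols)

im_fft = fftpack.fft2(noisy)

# Keep only the lowest frequencies along each axis; work on a copy
keep_fraction = 0.1
filtered_fft = im_fft.copy()
r, c = filtered_fft.shape
filtered_fft[int(r * keep_fraction):int(r * (1 - keep_fraction))] = 0
filtered_fft[:, int(c * keep_fraction):int(c * (1 - keep_fraction))] = 0

restored = fftpack.ifft2(filtered_fft).real

print("max error before filtering:", np.abs(noisy - clean).max())
print("max error after filtering:", np.abs(restored - clean).max())
```

For plotting a spectrum like `np.abs(im_fft)`, a logarithmic colour scale is usually needed because a few components are orders of magnitude larger than the rest.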

In [None]:
image = plt.imread("moonlanding.png")

# Use `fig` rather than `f` to avoid shadowing the function f defined in exercise 3
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(15, 10))
ax1.imshow(image, plt.cm.gray)
ax1.set_title("Original Image")

image_fft = fftpack.fft2(image)
spectrum = np.abs(image_fft)
ax2.plot(spectrum)
ax2.set_title('Spectrum')

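# Since there are many peaks, zero out the strongest components (magnitude
# above 2e3). Note that this is a magnitude threshold, not a frequency cutoff.
# Work on a copy so that image_fft itself is left unchanged.
filtered_fft = image_fft.copy()
filtered_fft[spectrum > 2e3] = 0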
filtered_spectrum = np.abs(filtered_fft)
ax3.plot(filtered_spectrum)
ax3.set_title('Filtered Spectrum')

filtered_image = fftpack.ifft2(filtered_fft).real
ax4.imshow(filtered_image, plt.cm.gray)
ax4.set_title("Filtered Image")