1\. **Maximum wind speed prediction at the Sprogø station**

The goal of this exercise is to predict the maximum wind speed expected once every 50 years, even though no measurements span such a period. The available data cover only 21 years of measurements at the Sprogø meteorological station in Denmark.

The annual maxima are assumed to follow a normal probability density function. However, that function is not what we estimate, because it maps a wind speed maximum to a probability. Finding the maximum wind speed occurring every 50 years requires the opposite direction: the wind speed must be found from a given probability. That is the role of the quantile function, and the goal of the exercise is to find it. In the current model, the maximum wind speed occurring every 50 years is defined as the upper 2% quantile.

By definition, the quantile function is the inverse of the cumulative distribution function. The latter describes the probability distribution of the annual maxima. In this exercise, the cumulative probability $p_i$ of the $i$-th smallest annual maximum is defined as $p_i = i/(N+1)$, with $N = 21$ the number of measured years. This gives the cumulative probability of every measured wind speed maximum. From those experimental points, the scipy.interpolate module can be used to fit the quantile function. Finally, the 50-year maximum is evaluated from the cumulative probability of the 2% quantile.
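
The construction described above can be sketched on synthetic data. The sample below is randomly generated (an assumption standing in for the real measurements, with an arbitrary mean and spread), and the variable names are illustrative. Note that with $N = 21$ the largest empirical probability is $21/22 \approx 0.955$, so evaluating the spline at 0.98 extrapolates beyond the measured points.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Synthetic stand-in for 21 annual maxima (NOT the Sprogø data)
rng = np.random.default_rng(0)
fake_maxima = rng.normal(loc=25.0, scale=3.0, size=21)

n_years = fake_maxima.size
# Empirical cumulative probability p_i = i / (N + 1) for the i-th smallest value
cprob = np.arange(1, n_years + 1) / (n_years + 1)
sorted_maxima = np.sort(fake_maxima)

# Quantile function: cumulative probability -> wind speed
quantile_func = UnivariateSpline(cprob, sorted_maxima)

fifty_prob = 1.0 - 0.02
estimate = float(quantile_func(fifty_prob))
print("largest empirical probability:", cprob.max())
print("50-year estimate on synthetic data:", estimate)
```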

Practically, load the dataset:

```python
import numpy as np
max_speeds = np.load('max-speeds.npy')
years_nb = max_speeds.shape[0]
```

Then compute the cumulative probabilities $p_i$ (`cprob`) and sort the maximum speeds. Use `UnivariateSpline` from scipy.interpolate to define a quantile function that maps cumulative probabilities to wind speeds.

In the current model, the maximum wind speed occurring every 50 years is defined as the upper 2% quantile. As a result, the cumulative probability value will be:

```python
fifty_prob = 1. - 0.02
```

The storm wind speed occurring every 50 years can then be estimated as:

``` python
fifty_wind = quantile_func(fifty_prob)
```



In [None]:
import numpy as np
from matplotlib import pyplot as plt
from scipy.interpolate import UnivariateSpline
from scipy import optimize
import scipy as sp

max_speeds = np.load('max-speeds.npy')
years_nb = max_speeds.shape[0]

In [None]:
cprob = (np.arange(years_nb) + 1)/(years_nb + 1)
sorted_max_speeds = np.sort(max_speeds)
quantile_func = UnivariateSpline(cprob, sorted_max_speeds)
fifty_prob = 1. - 0.02
fifty_wind = quantile_func(fifty_prob)


In [None]:
fifty_wind

In [None]:
plt.plot(sorted_max_speeds, cprob, '*')
# mark the 50-year estimate at its own cumulative probability (0.98), not at 1
plt.plot(fifty_wind, fifty_prob, 'r*')
nprob = np.linspace(0, 1, 1000)
fitted_max_speeds = quantile_func(nprob)
plt.plot(fitted_max_speeds, nprob, 'g--')

2\. **Curve fitting of temperature in Alaska** 

The temperature extremes in Alaska for each month, starting in January, are given by (in degrees Celsius):

max:  17,  19,  21,  28,  33,  38, 37,  37,  31,  23,  19,  18

min: -62, -59, -56, -46, -32, -18, -9, -13, -25, -46, -52, -58

* Plot these temperature extremes.
* Define a function that can describe min and max temperatures. 
* Fit this function to the data with scipy.optimize.curve_fit().
* Plot the result. Is the fit reasonable? If not, why?
* Is the time offset for min and max temperatures the same within the fit accuracy?
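
A periodic model is one candidate for the function asked for above. The sketch below assumes a cosine with a fixed 12-month period; the function names and the initial guesses are illustrative choices, not part of the exercise. Because `(ampl, offset)` and `(-ampl, offset + 6)` describe the same curve, the fitted parameters are normalized before the two offsets are compared.

```python
import numpy as np
from scipy import optimize

months = np.arange(1, 13)
temp_max = np.array([17, 19, 21, 28, 33, 38, 37, 37, 31, 23, 19, 18], dtype=float)
temp_min = np.array([-62, -59, -56, -46, -32, -18, -9, -13, -25, -46, -52, -58], dtype=float)

def yearly_temps(times, avg, ampl, time_offset):
    # Cosine with a fixed 12-month period (an assumed model)
    return avg + ampl * np.cos((times + time_offset) * 2 * np.pi / 12)

def fit_and_normalize(temps, guess):
    params, cov = optimize.curve_fit(yearly_temps, months, temps, guess)
    avg, ampl, offset = params
    # Pick ampl > 0 and wrap the offset into [0, 12) so fits are comparable
    if ampl < 0:
        ampl, offset = -ampl, offset + 6
    offset %= 12
    errors = np.sqrt(np.diag(cov))  # 1-sigma parameter uncertainties
    return np.array([avg, ampl, offset]), errors

# Rough initial guesses read off the data (assumptions)
res_max, err_max = fit_and_normalize(temp_max, [25, 10, 5])
res_min, err_min = fit_and_normalize(temp_min, [-40, 25, 5])
print("max: offset = {:.2f} +/- {:.2f} months".format(res_max[2], err_max[2]))
print("min: offset = {:.2f} +/- {:.2f} months".format(res_min[2], err_min[2]))
```

Comparing the difference between the two offsets with their combined uncertainty gives one way to answer the last question.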

In [None]:
m = np.arange(1,13)
MT = np.array([17, 19, 21, 28, 33, 38, 37, 37, 31, 23, 19, 18])
mt = np.array([-62, -59, -56, -46, -32, -18, -9, -13, -25, -46, -52, -58])
plt.plot(m,mt,'b*')
mt = mt+62 #needed to fit the gaussian
plt.plot(m,MT,'r*')
    
from scipy import optimize

def gaussian(x, a, x0, sigma, off):
    # np.exp replaces the scipy.exp alias, which recent SciPy releases no longer provide
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2)) + off

def f(times, avg, ampl, time_offset):
    return (avg + ampl * np.cos((times + time_offset) * 2 * np.pi / times.max()))

rM, cM = optimize.curve_fit(gaussian, m, MT, [36, 6, 4, 0])
rm, cm = optimize.curve_fit(gaussian,m, mt, [53, 7, 2, 0])

days = np.linspace(0, 12, num=365)
plt.plot(days, gaussian(days, *rM), 'r--')
plt.plot(days, gaussian(days, *rm)-62, 'b--')

3\. **2D minimization of a six-hump camelback function**

$$
f(x,y) = \left(4-2.1x^2+\frac{x^4}{3} \right) x^2 +xy + (4y^2 -4)y^2
$$

has multiple global and local minima. Find the global minima of this function.

Hints:

* Variables can be restricted to $-2 < x < 2$ and $-1 < y < 1$.
* Use numpy.meshgrid() and pylab.imshow() to find visually the regions.
* Use scipy.optimize.minimize(), optionally trying out several of its methods.

How many global minima are there, and what is the function value at those points? What happens for an initial guess of $(x, y) = (0, 0)$ ?


In [None]:
N = 1000
def f(x,y):
    return (4-2.1*x**2+x**4/3)*x**2+x*y+(4*y**2-4)*y**2
x = np.linspace(-2,2,num=N)
y = np.linspace(-1,1,num=N)
xm, ym = np.meshgrid(x, y)
plt.figure()
plt.imshow(f(xm, ym), origin='lower', extent=[-2, 2, -1, 1])  # origin='lower' so that y increases upward, matching extent
plt.colorbar()

In [None]:
#long time needed
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure(figsize=(20, 10))
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) is not accepted by recent Matplotlib
ax.plot_surface(xm, ym, f(xm, ym),cstride=1,rstride=1)
plt.show()

In [None]:
import matplotlib.cm as cm
plt.imshow(f(xm,ym), origin='lower', extent=[-2, 2, -1, 1])
plt.contour(xm,ym,f(xm,ym), cmap=cm.Blues, levels=np.arange(-2,7,0.1))
plt.show()

#### (0,0)
The origin is a saddle point (as the plot above shows): the gradient vanishes there, so a gradient-based minimizer started at $(0, 0)$ does not move at all.
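
The saddle-point claim can be checked directly from the derivatives of $f$. The sketch below uses hand-derived analytic expressions; the helper names `camel_gradient` and `camel_hessian` are illustrative. A zero gradient together with Hessian eigenvalues of opposite sign identifies a saddle.

```python
import numpy as np

def camel_gradient(x, y):
    # Analytic partial derivatives of f(x, y)
    dfdx = 8 * x - 8.4 * x**3 + 2 * x**5 + y
    dfdy = x - 8 * y + 16 * y**3
    return np.array([dfdx, dfdy])

def camel_hessian(x, y):
    # Analytic second derivatives of f(x, y)
    dxx = 8 - 25.2 * x**2 + 10 * x**4
    dxy = 1.0
    dyy = -8 + 48 * y**2
    return np.array([[dxx, dxy], [dxy, dyy]])

grad0 = camel_gradient(0.0, 0.0)
eigvals0 = np.linalg.eigvalsh(camel_hessian(0.0, 0.0))
print("gradient at origin:", grad0)
print("Hessian eigenvalues at origin:", eigvals0)
```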

In [None]:
def f(t):
    x,y = t
    return (4-2.1*x**2+x**4/3)*x**2+x*y+(4*y**2-4)*y**2
ig = [0,0]
result = optimize.minimize(f, ig)
print("fitted params {} , f={}".format(result.x, f(result.x)))

In [None]:
# create a grid of initial guesses
xs = np.linspace(-2,2,50)
ys = np.linspace(-1,1,25)
found = {}
count = 0

plt.figure()
plt.imshow(f([xm, ym]), origin='lower', extent=[-2, 2, -1, 1])  # origin='lower' so the markers line up with the image
plt.colorbar()
for a in xs:
    for b in ys:
        ig = [a,b]
        result = optimize.minimize(f, ig)
        if result.success:
            #print("for ig={} fitted params {} , f={}".format(ig, result.x, f(result.x)))
            # NumPy ndarrays are not hashable, so use a tuple as the key
            if tuple(result.x) in found:
                found[tuple(result.x)][0] +=1
            else:
                found[tuple(result.x)] = [1, f(result.x)]
        else:
            raise ValueError(result.message)
        plt.plot(a,b,'r*')
plt.title('starting points in red')
plt.show()

In [None]:
plt.figure()
plt.imshow(f([xm, ym]), origin='lower', extent=[-2, 2, -1, 1])  # origin='lower' so the markers line up with the image
plt.colorbar()
for i in found:
    plt.plot(i[0],i[1],'*',label='{},{} with val {}'.format(i[0],i[1],found[i][1]))
plt.title('minima found from those starting points')
#plt.legend()
plt.show()

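Found 1250 minima (equal to the number of starting points, because the unrounded coordinates are used as dictionary keys), clustered around 6 points.
The global minima are the two with $x \approx 0$: $(x, y) \approx (0.09, -0.71)$ and $(-0.09, 0.71)$, where $f \approx -1.03$.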
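
A minimal sketch of grouping the results: rounding the coordinates before using them as dictionary keys makes runs that end at the same minimum collapse into one entry. The grid size, the rounding to 3 decimals, and the name `camel` are arbitrary choices made for illustration.

```python
import numpy as np
from scipy import optimize

def camel(t):
    x, y = t
    return (4 - 2.1 * x**2 + x**4 / 3) * x**2 + x * y + (4 * y**2 - 4) * y**2

minima = {}
# Coarse grid of starting points (sizes chosen arbitrarily, avoiding the origin)
for a in np.linspace(-1.9, 1.9, 8):
    for b in np.linspace(-0.9, 0.9, 6):
        result = optimize.minimize(camel, [a, b])
        if not result.success:
            continue  # skip runs that did not converge
        # Round so that runs ending at the same minimum share one key
        key = tuple(float(v) for v in np.round(result.x, 3))
        minima[key] = float(result.fun)

best = min(minima.values())
global_minima = sorted(p for p, v in minima.items() if v < best + 1e-6)
print("distinct minima found:", len(minima))
print("global minima:", global_minima, "with f =", round(best, 4))
```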

4\. **FFT of a simple dataset**

Perform a periodicity analysis on the lynx-hare population data.

In [None]:
from scipy import fftpack
from scipy import signal
df = np.loadtxt('populations.txt')
year, hares, lynxes, carrots = df.T
plt.plot(year, hares, year, lynxes, year, carrots) 
plt.legend(('Hare', 'Lynx', 'Carrot'), loc=(1.01, 0.75)) 
ts = 1 #timestep
fig = plt.figure("FFT", figsize=(20,4))
fig2 = plt.figure("Peak")
c = 1
for i in (hares, lynxes, carrots):
    plt.figure('FFT')
    fft = fftpack.fft(i)
    fft = np.abs(fft)
    freq = fftpack.fftfreq(i.size,d=ts)
    
    ax1 = fig.add_subplot(1, 3, c)
    if (c==1):
        plt.title('Hares FFT power')
    elif (c==2):
        plt.title('Lynxes FFT power')
    else:
        plt.title('Carrot FFT power')
    ax1=plt.plot(freq, fft)
    plt.xlabel('Frequency')
    plt.ylabel('power')
    mask = np.where(freq > 0)
    l_freq = freq[mask]
    l_peak = l_freq[fft[mask].argmax()]
    plt.figure('Peak')
    axes = plt.axes([0.65, 0.4, 0.2, 0.4]) #numbers range from 0 to 1 and set position left-right, position up-down, width, height 
    plt.title('Peak frequency')
    # label each series with its own name (the original printed 'Hares' for all three)
    leg = ('Hares', 'Lynxes', 'Carrot')[c - 1]
    print('{} peak freq: {}'.format(leg, l_peak))
    
    plt.plot(freq[:4], fft[:4],label=leg)
    plt.setp(axes, yticks=[])
    hf_fft = fftpack.fft(i)
    hf_fft[np.abs(freq)>l_peak] = 0 #0 freq greater than peak
    f_ifft = fftpack.ifft(hf_fft)
    plt.figure('OrigVsFilt'+leg)
    plt.title(leg + ': original vs. filtered signal')
    plt.plot(year, i, label='Original signal')
    # the inverse FFT returns complex values; plot the real part
    plt.plot(year, f_ifft.real, linewidth=3, label='Filtered signal')
    plt.xlabel('Year')
    plt.ylabel('Amplitude')
    plt.legend(loc='best')

    c+=1
plt.figure('Peak')
plt.legend(bbox_to_anchor=(1,1))
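
As a sanity check of the peak search used in the cell above, the same steps can be run on a signal whose period is known in advance. The series below is synthetic (an assumption, not the population data): a 10-year sine sampled once per year for 20 years, so the peak should fall on a frequency of 0.1 per year.

```python
import numpy as np
from scipy import fftpack

# Synthetic yearly series with a known 10-year period (NOT the population data)
n_samples = 20
t = np.arange(n_samples)
signal_values = 50.0 + 10.0 * np.sin(2 * np.pi * t / 10.0)

spectrum = np.abs(fftpack.fft(signal_values))
freqs = fftpack.fftfreq(n_samples, d=1.0)

# Ignore the zero-frequency (mean) term and the negative frequencies
positive = freqs > 0
peak_freq = freqs[positive][spectrum[positive].argmax()]
period = 1.0 / peak_freq
print("peak frequency:", peak_freq, "-> period:", period, "years")
```

The period in years is the reciprocal of the peak frequency, which is the quantity of interest for the population cycles.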

5\. **FFT of an image**

* Examine the provided image `moonlanding.png`, which is heavily contaminated with periodic noise. In this exercise, we aim to clean up the noise using the Fast Fourier Transform.
* Load the image using pylab.imread().
* Find and use the 2-D FFT function in scipy.fftpack, and plot the spectrum (Fourier transform of) the image. Do you have any trouble visualising the spectrum? If so, why?
* The spectrum consists of high and low frequency components. The noise is contained in the high-frequency part of the spectrum, so set some of those components to zero (use array slicing).
* Apply the inverse Fourier transform to see the resulting image.
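
The array-slicing step in the list above can be sketched on a synthetic image, so that the result can be verified exactly. Everything below is an assumption made for illustration: the 64x64 test pattern, the noise frequency, and the `keep_fraction` value. In the unshifted output of `fft2`, low frequencies sit in the corners of the array, so the high frequencies are removed by zeroing the middle rows and columns.

```python
import numpy as np
from scipy import fftpack

# Synthetic 64x64 image: smooth pattern plus periodic high-frequency noise
size = 64
yy, xx = np.mgrid[0:size, 0:size]
clean = np.sin(2 * np.pi * xx / size) + np.cos(2 * np.pi * yy / size)
noisy = clean + 0.5 * np.sin(2 * np.pi * 20 * xx / size)

spectrum = fftpack.fft2(noisy)

# Keep only the lowest frequencies along each axis by zeroing the
# middle rows and columns of the unshifted spectrum
keep_fraction = 0.1
rows, cols = spectrum.shape
filtered = spectrum.copy()
filtered[int(rows * keep_fraction):int(rows * (1 - keep_fraction)), :] = 0
filtered[:, int(cols * keep_fraction):int(cols * (1 - keep_fraction))] = 0

denoised = fftpack.ifft2(filtered).real
print("max error before:", np.abs(noisy - clean).max())
print("max error after: ", np.abs(denoised - clean).max())
```

On a real photograph the separation is not this clean, so `keep_fraction` would need tuning by eye.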

In [None]:
import pylab
from matplotlib.colors import LogNorm

# Load image once; img is used for display, data for the transform
img = plt.imread('moonlanding.png')
data = img

fft = fftpack.fft2(data)
power = np.abs(fft)

plt.figure(figsize=(18, 18))
ax1 = plt.subplot(3, 2, 1)
ax1.set_title('Original Image')
ax1.imshow(img)

ax1 = plt.subplot(3, 2, 2)
ax1.set_title('Original Image - grey')
ax1.imshow(img, cmap=plt.cm.gray)

ax2 = plt.subplot(3, 2, 3)
ax2.imshow(power, norm=LogNorm(vmin=5))
ax2.set_title('FFT power log')

ax3 = plt.subplot(3, 2, 4)
ax3.plot(power)
ax3.set_title("Spectrum")

# Filtered FFT: zero the strongest coefficients. Scan thresholds k and stop at the
# first one for which more than 99.5% of the coefficients have magnitude <= k.
filtered_fft = fft.copy()
filtered_power = power.copy()
for k in range(80, 10000, 10):
    kept_fraction = 1 - np.count_nonzero(power > k) / fft.size
    if kept_fraction > 0.995:
        filtered_fft[power > k] = 0
        filtered_power[power > k] = 0
        print("threshold {} keeps {} of the coefficients".format(k, kept_fraction))
        break
ax4 = plt.subplot(3,2,5)
ax4.set_title('Filtered power')
ax4.plot(filtered_power)

# Plot results
ax5 = plt.subplot(3, 2, 6)
ax5.set_title('Cleared Image')
cleared = fftpack.ifft2(filtered_fft).real
ax5.imshow(cleared, cmap=plt.cm.gray)

plt.show()