# <span style="color:blue"> <center>Student / TD3 : 4TPU279U $-$ Bachelor 1st year $-$ spring 2023</center></span>
# <center>Introduction to python programming</center>
# <hr style="border:1px solid black"><center>   Survival guide for practical work </center><hr style="border:1px solid black">
</br>

<div style="text-align: right"> Credits: R. Boisgard, L. Truflandier, Philippe Paillou, Julien Burgin, Sara Zein, Leo Delmarre, Simon Villain-Guillot </div>

Working through a single example, this survival guide shows how to:

1. Plot the data using Python, with axis labels, units, and a legend.
2. Fit a linear model to the data.
3. Superimpose the fitted line on the plot and save the figure.
4. Use the linear model equation to make a prediction.

### Example

In water-resources engineering, the sizing of reservoirs depends on accurate estimates of
water flow in the river that is being impounded. For some rivers, long-term historical
records of such flow data are difficult to obtain. In contrast, meteorological data on
precipitation are often available for many years past. Therefore, it is often useful to
determine a relationship between flow and precipitation. This relationship can then be
used to estimate flows for years when only precipitation measurements were made. The
following data are available for a river: [<sup>1</sup>](#fn1)


Precip. (cm/yr)|  88.9 | 108.5 | 104.1 | 139.7 | 127.0 | 94.0 | 116.8 | 99.1 |
---------------|-------|-------|-------|-------|-------|------|-------|------|
Flow (m$^3$/s) |  14.6 |  16.7 |  15.3 |  23.2 |  19.5 | 16.1 |  18.1 | 16.6 |



<span id="fn1"> $^1$ Example taken from *Applied Numerical Methods with MATLAB* by S. C. Chapra.</span>

### <hr style="border:1px solid black"> Load the data <hr style="border:1px solid black">

#### Create numpy array with the raw data

> Here we first create Python lists from the data, then convert them to `numpy` arrays.

> Warning: numbers must be written with a decimal point (English notation), not a decimal comma!

In [None]:
from numpy import array

prec = [88.9, 108.5, 104.1, 139.7, 127.0, 94.0, 116.8, 99.1]
flow = [14.6,  16.7,  15.3,  23.2,  19.5, 16.1,  18.1, 16.6]

x = array(prec)
y = array(flow)

#### ... or load the data from a file using Python built-in functions

In [None]:
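from numpy import array

# read the file line by line; 'with' closes the file automatically
x = [] ; y = []
with open('data_examples/prec_flow.csv', 'r') as file:
    file.readline()                  # skip the header line
    for line in file:
        words = line.split(',')
        x.append(float(words[0]))    # first column: precipitation
        y.append(float(words[1]))    # second column: flow

# convert the lists to numpy arrays, as in the previous cell
x = array(x)
y = array(y)

print(x)
print()
print(y)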
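
The same kind of file can also be read in a single call with `numpy.loadtxt`. The sketch below is only an illustration: it assumes a file with one header line followed by comma-separated `precipitation,flow` rows, and it uses an in-memory stand-in (`csv_text`, a hypothetical name) holding the first three data points instead of the real file.

```python
from io import StringIO
from numpy import loadtxt

# In-memory stand-in for 'data_examples/prec_flow.csv'
# (assumed layout: one header line, then "precipitation,flow" rows)
csv_text = StringIO("prec,flow\n88.9,14.6\n108.5,16.7\n104.1,15.3\n")

# skiprows=1 ignores the header line; unpack=True returns one array per column
x_demo, y_demo = loadtxt(csv_text, delimiter=',', skiprows=1, unpack=True)

print(x_demo)
print(y_demo)
```

With the real file, the path `'data_examples/prec_flow.csv'` would be passed in place of `csv_text`.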

#### ... or load the data from a file using pandas

In [None]:
from pandas import read_csv

data_pandas = read_csv('data_examples/prec_flow.csv')
print(data_pandas)
#data_numpy_array = data_pandas.to_numpy()
#x = data_numpy_array[:,0]
#y = data_numpy_array[:,1]
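
Columns of a `DataFrame` can also be selected by name before converting them to numpy arrays, which avoids relying on the column order. This is a minimal sketch that assumes the columns are named `prec` and `flow`; it reads a small in-memory stand-in (`csv_text`, a hypothetical name) with the first three data points rather than the real file.

```python
from io import StringIO
from pandas import read_csv

# In-memory stand-in for the CSV file (column names 'prec' and 'flow' assumed)
csv_text = StringIO("prec,flow\n88.9,14.6\n108.5,16.7\n104.1,15.3\n")
data_demo = read_csv(csv_text)

# select each column by name and convert it to a numpy array
x_demo = data_demo['prec'].to_numpy()
y_demo = data_demo['flow'].to_numpy()

print(x_demo)
print(y_demo)
```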

### <hr style="border:1px solid black"> Plot the graphics <hr style="border:1px solid black">

#### Using Matplotlib

In [None]:
import matplotlib.pyplot as plt
from numpy import arange

plt.plot(x,y,marker='o',color='black', linestyle='')
plt.xlabel("Precipitation (cm/yr)")
plt.ylabel("Flow (m$^3$/s)")

plt.xticks(arange(min(x)+1.1,max(x)+1,5))
plt.yticks(arange(min(y)+0.4,max(y),1))
plt.title('Water flow as a function of precipitation')

#plt.xlim()
#plt.ylim()

#### Using Pandas

> pandas can also produce plots directly, using the function [`DataFrame.plot`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.plot.html)

In [None]:
data_pandas.plot(x='prec',y='flow',kind='scatter',color='black')

### <hr style="border:1px solid black"> Perform a linear fit <hr style="border:1px solid black">

> The fit is performed using `curve_fit` from the library `scipy.optimize`.

> We are given the data points $\{x_i,y_i\}^N_{i=1}$, the standard errors $\{\sigma_i\}^N_{i=1}$ on $\{y_i\}^N_{i=1}$, and the model function $f_\textrm{model}(x;a_0,a_1,...,a_m)$, where $\{a_k\}_{k=0}^m$ are the coefficients to be optimized. With $\chi^2$ defined as
$$\chi^2=\sum_{i=1}^{N}\frac{\left[y_i-f_\textrm{model}\left(x_i;a_0,a_1,...,a_m\right)\right]^2}{\sigma^2_i}$$
`curve_fit` minimizes $\chi^2$ with respect to these coefficients.

First we define the model function; in this case it is a linear function: $f(x)=a_0 +a_1x$

In [None]:
def fmodel(x,a0,a1):
    return a0 + a1*x
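
Before handing the problem to an optimizer, it can help to evaluate $\chi^2$ by hand for a straight line. The sketch below does this for two trial lines; the helper name `chi2` and the trial coefficients are illustrative choices, not part of the original guide, and $\sigma_i=1$ is assumed for every point.

```python
from numpy import array, ones

x = array([88.9, 108.5, 104.1, 139.7, 127.0, 94.0, 116.8, 99.1])
y = array([14.6,  16.7,  15.3,  23.2,  19.5, 16.1,  18.1, 16.6])
sig = ones(len(x))   # assumption: same standard error (1) on every point

def chi2(x, y, sig, a0, a1):
    # weighted residuals between the data and the straight line a0 + a1*x
    r = (y - (a0 + a1*x)) / sig
    return (r**2).sum()

# two hypothetical trial lines: the smaller chi2 is the better description
chi2_a = chi2(x, y, sig, a0=0.0,  a1=0.15)
chi2_b = chi2(x, y, sig, a0=0.84, a1=0.152)

print('chi2 = %.3f for a0 = 0.00, a1 = 0.150'%chi2_a)
print('chi2 = %.3f for a0 = 0.84, a1 = 0.152'%chi2_b)
```

An optimizer automates this comparison: it searches for the pair $(a_0,a_1)$ that gives the smallest $\chi^2$.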

Then we call the function `curve_fit`

> with inputs $(f_\textrm{model},x,y,\sigma)$.

> It returns the optimal values of the coefficients (`popt`) and the covariance matrix (`pcov`) of the coefficient estimates; the square roots of its diagonal elements give the standard deviations of the coefficients.

In [None]:
from scipy.optimize import curve_fit
from numpy import ones, sqrt

# standard error is set to 1 in this example
sig = ones((len(x)))

# call curve_fit to obtain the optimized coefficients and covariance
popt, pcov = curve_fit(f=fmodel, xdata=x, ydata=y, sigma=sig)

# extract optimized coefficients
[a0, a1] = popt

# standard deviations of the coefficients (square roots of the diagonal of pcov)
sa0 = sqrt(pcov[0,0])
sa1 = sqrt(pcov[1,1])

print('a0 = %.4f +/- %.4f'%(a0,sa0))
print('a1 = %.4f +/- %.4f'%(a1,sa1))
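
For a straight line, the result of `curve_fit` can be cross-checked against an independent least-squares routine. The sketch below uses `numpy.polyfit` with `deg=1`, a swapped-in technique that is not used elsewhere in this guide; note that it returns the coefficients from the highest degree down, so the slope comes first. The names `a0_check` and `a1_check` are illustrative.

```python
from numpy import array, polyfit
from scipy.optimize import curve_fit

x = array([88.9, 108.5, 104.1, 139.7, 127.0, 94.0, 116.8, 99.1])
y = array([14.6,  16.7,  15.3,  23.2,  19.5, 16.1,  18.1, 16.6])

def fmodel(x, a0, a1):
    return a0 + a1*x

# fit with curve_fit (equal weights on all points)
popt, pcov = curve_fit(f=fmodel, xdata=x, ydata=y)

# fit with polyfit: coefficients are returned as [slope, intercept]
a1_check, a0_check = polyfit(x, y, deg=1)

print('curve_fit : a0 = %.4f, a1 = %.4f'%(popt[0], popt[1]))
print('polyfit   : a0 = %.4f, a1 = %.4f'%(a0_check, a1_check))
```

Both routines minimize the same sum of squared residuals here, so the two lines of output should agree to the printed precision.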

### <hr style="border:1px solid black"> Plot the linear model <hr style="border:1px solid black">

Given $x_\textrm{model}\in[\min(x),\max(x)]$ sampled at $N=128$ evenly spaced points, compute $y_\textrm{model}=f_\textrm{model}(x_\textrm{model};a_0,a_1)$.

In [None]:
from numpy import linspace

x_model = linspace(min(x),max(x),128)
y_model = fmodel(x=x_model,a0=a0,a1=a1)

Plot the experimental data and the linear model function, then save the figure.

In [None]:
import matplotlib.pyplot as plt
from numpy import arange

plt.plot(x,      y,       marker='o', color='black', linestyle='', label='exp. data')
plt.plot(x_model,y_model, marker= '', color='blue',  linestyle='-',
         label='linear regression with $f(x)= %.3f + %.3f x$'%(a0,a1))

plt.xlabel("Precipitation (cm/yr)")
plt.ylabel("Flow (m$^3$/s)")

plt.xticks(arange(min(x)+1.1,max(x)+1,5))
plt.yticks(arange(min(y)+0.4,max(y),1))

plt.legend()
plt.title('Water flow as a function of precipitation')

plt.savefig('figure_prec_flow.png', dpi=300, format='png', transparent=True)

### <hr style="border:1px solid black"> Use the linear model <hr style="border:1px solid black">

Evaluate the annual water flow if the precipitation is 120 cm/year.

In [None]:
prec_new = 120   # precipitation in cm/yr
flow_new = fmodel(x=prec_new, a0=a0, a1=a1)
print('flow = %.4f m^3/s for prec. = %.1f cm/yr'%(flow_new, prec_new))
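
The covariance matrix `pcov` can also be used to attach an uncertainty to this prediction. The sketch below assumes first-order (linear) error propagation, which is an addition to the original guide: for $f=a_0+a_1x$ the variance is $\sigma_f^2=\sigma_{a_0}^2+x^2\sigma_{a_1}^2+2x\,\mathrm{cov}(a_0,a_1)$. The names `prec_new`, `flow_new` and `s_flow` are illustrative.

```python
from numpy import array, sqrt
from scipy.optimize import curve_fit

x = array([88.9, 108.5, 104.1, 139.7, 127.0, 94.0, 116.8, 99.1])
y = array([14.6,  16.7,  15.3,  23.2,  19.5, 16.1,  18.1, 16.6])

def fmodel(x, a0, a1):
    return a0 + a1*x

popt, pcov = curve_fit(f=fmodel, xdata=x, ydata=y)
a0, a1 = popt

prec_new = 120.0                          # precipitation in cm/yr
flow_new = fmodel(prec_new, a0, a1)       # predicted flow in m^3/s

# first-order error propagation, including the a0-a1 covariance term
var_flow = pcov[0,0] + prec_new**2 * pcov[1,1] + 2 * prec_new * pcov[0,1]
s_flow = sqrt(var_flow)

print('flow = %.2f +/- %.2f m^3/s for prec. = %.1f cm/yr'%(flow_new, s_flow, prec_new))
```

The covariance term matters here: the intercept and slope of a fitted line are usually strongly anti-correlated, so dropping it would overestimate the uncertainty.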