# Python Deliberate Practice

## An Introduction to Numerical Computing with Python

While the Python language is an excellent tool for general-purpose programming, with a highly readable syntax, rich and powerful data types (strings, lists, sets, dictionaries, arbitrary-length integers, etc.) and a very comprehensive standard library, it was not designed specifically for mathematical and scientific computing.  Neither the language nor its standard library has facilities for the efficient representation of multidimensional datasets, tools for linear algebra and general matrix manipulations (an essential building block of virtually all technical computing), or any data visualization facilities.

In particular, Python lists are very flexible containers that can be nested arbitrarily deep and can hold any Python object, but they are poorly suited to representing common mathematical constructs like vectors and matrices efficiently.  In contrast, much of our modern heritage of scientific computing has been built on top of libraries written in the Fortran language, which has native support for vectors and matrices as well as a library of mathematical functions that can efficiently operate on entire arrays at once.
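
To make the limitation concrete, here is a small sketch in plain Python (the variable names are illustrative only). On lists, `+` and `*` are sequence operations rather than arithmetic, so element-wise math has to be written out by hand:

```python
# Plain Python lists treat + and * as sequence operations, not arithmetic.
v = [1.0, 2.0, 3.0]
w = [10.0, 20.0, 30.0]

concatenated = v + w   # joins the two lists end to end
repeated = v * 2       # repeats the list, does not double each value

# Element-wise addition has to be spelled out explicitly.
elementwise_sum = [a + b for a, b in zip(v, w)]

print(concatenated)     # [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
print(repeated)         # [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
print(elementwise_sum)  # [11.0, 22.0, 33.0]
```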

## Basics of Numpy arrays

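We now turn our attention to the Numpy library, which forms the base layer for the entire 'scipy ecosystem'.  Once you have installed numpy, you can import it as shown in the first cell below.  The second cell times squaring 50,000 integers with a list comprehension against the same operation on a Numpy array: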

In [1]:
import numpy as np

In [2]:
x = range(50000)
y = np.arange(50000)

%timeit [e**2  for e in x]
%timeit y**2

10 loops, best of 3: 19.4 ms per loop
10000 loops, best of 3: 71.4 µs per loop


### The Numpy array structure
<center>
<img src="files/array_memory_strides.png" width=70%>
</center>
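
As a rough illustration of the idea behind the figure (a sketch, assuming a C-ordered `int64` array; the name `a` is illustrative), the `strides` attribute reports how many bytes to step in memory to reach the next element along each dimension:

```python
import numpy as np

a = np.arange(12, dtype=np.int64).reshape(3, 4)

print(a.itemsize)   # 8 bytes per int64 element
print(a.strides)    # (32, 8): 32 bytes to the next row, 8 to the next column

# The transpose reuses the same buffer and simply swaps the strides.
print(a.T.strides)  # (8, 32)
```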

### Arrays vs lists

As mentioned above, the main object provided by numpy is a powerful array.  We'll start by exploring how the numpy array differs from Python lists, creating a simple list and an array with the same contents:

In [3]:
lst = [10, 20, 30, 40]
arr = np.array([10, 20, 30, 40])

Elements of a one-dimensional array are accessed with the same syntax as a list:

In [4]:
lst[0]

10

In [5]:
arr[0]

10

In [6]:
arr[-1]

40

In [7]:
arr[2:]

array([30, 40])

The first difference to note between lists and arrays is that arrays are *homogeneous*; i.e. all elements of an array must be of the same type.  In contrast, lists can contain elements of arbitrary type. For example, we can change the last element in our list above to be a string:

In [8]:
lst[-1] = 'a string inside a list'
lst

[10, 20, 30, 'a string inside a list']

but the same cannot be done with an array, as we get an error message:

In [9]:
arr[-1] = 'a string inside an array'

ValueError: invalid literal for int() with base 10: 'a string inside an array'

### Array memory representation

The information about the type of an array is contained in its *dtype* attribute:

In [10]:
x = np.array([[1, 2], [3, 4]], dtype=np.uint8)
print(x)

[[1 2]
 [3 4]]


In [11]:
[b for b in bytes(x.data)]

[1, 2, 3, 4]

In [12]:
arr = np.array([10, 20, 123123])
arr.dtype

dtype('int64')

In [13]:
[b for b in bytes(arr.data)]

[10, 0, 0, 0, 0, 0, 0, 0, 20, 0, 0, 0, 0, 0, 0, 0, 243, 224, 1, 0, 0, 0, 0, 0]

Once an array has been created, its dtype is fixed and it can only store elements of that type.  In this example the dtype is integer, so if we store a floating point number it is automatically converted to an integer by discarding the fractional part:

In [14]:
arr[-1] = 1.234
arr

array([10, 20,  1])
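
The conversion truncates toward zero rather than rounding. A minimal sketch (the array names are illustrative) that also shows how to keep fractional values by converting to a floating-point dtype first:

```python
import numpy as np

ints = np.array([10, 20, 30])
ints[0] = 7.9     # stored as 7: the fractional part is dropped
ints[1] = -7.9    # stored as -7: truncation is toward zero, not rounding

# To keep fractional values, convert to a floating-point dtype first.
floats = ints.astype(np.float64)
floats[2] = 1.234

print(ints)    # [ 7 -7 30]
print(floats.dtype)  # float64
```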

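Fixed-size integer dtypes can also produce surprising results: arithmetic that leaves the representable range wraps around silently.  A `uint8` can only hold values from 0 to 255, so 255 + 1 becomes 0 and 0 - 1 becomes 255: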

In [15]:
x = np.array([0, 127, 255], dtype=np.uint8)
print(x)

[  0 127 255]


In [16]:
x + 1

array([  1, 128,   0], dtype=uint8)

In [17]:
x - 1

array([255, 126, 254], dtype=uint8)
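
One way to avoid the wraparound, sketched here under the assumption that a wider dtype is acceptable for your data, is to convert the array before doing the arithmetic; `np.iinfo` reports the range a given integer dtype can hold:

```python
import numpy as np

x = np.array([0, 127, 255], dtype=np.uint8)

wrapped = x + 1                     # stays uint8, so 255 + 1 wraps to 0
widened = x.astype(np.int16) + 1    # int16 has room for 256

limits = np.iinfo(np.uint8)
print(limits.min, limits.max)       # 0 255
```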

### Array creation

Above we created an array from an existing list; let us now see other ways in which we can create arrays.  A common need is to have an array initialized with a constant value, and very often this value is 0 or 1 (suitable as starting value for additive and multiplicative loops respectively); `zeros` creates arrays of all zeros, with any desired dtype:

In [18]:
np.zeros((5, 5), dtype=np.float64)

array([[ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.]])

In [19]:
np.zeros((2, 3), dtype=np.int64)

array([[0, 0, 0],
       [0, 0, 0]])

In [20]:
np.zeros((3,2), dtype=complex)

array([[ 0.+0.j,  0.+0.j],
       [ 0.+0.j,  0.+0.j],
       [ 0.+0.j,  0.+0.j]])

and similarly for `ones`:

In [21]:
np.ones(5)

array([ 1.,  1.,  1.,  1.,  1.])

Then there are the `linspace` and `logspace` functions to create linearly and logarithmically-spaced grids, respectively, with a fixed number of points and including both ends of the specified interval:

In [22]:
np.linspace(0, 1, 5)   # start, stop, num

array([ 0.  ,  0.25,  0.5 ,  0.75,  1.  ])

In [23]:
np.logspace(1, 4, 4)  # Logarithmic grid between 10^1 and 10^4

array([    10.,    100.,   1000.,  10000.])
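
As a small sketch of two related details: `linspace` can leave out the right endpoint with `endpoint=False`, and a `logspace` grid matches raising 10 to a `linspace` grid of exponents (the variable names are illustrative):

```python
import numpy as np

closed = np.linspace(0, 1, 5)                      # 0, 0.25, 0.5, 0.75, 1
half_open = np.linspace(0, 1, 5, endpoint=False)   # 0, 0.2, 0.4, 0.6, 0.8

log_grid = np.logspace(1, 4, 4)
same_grid = 10.0 ** np.linspace(1, 4, 4)

print(np.allclose(log_grid, same_grid))  # True
```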

Finally, it is often useful to create arrays with random numbers that follow a specific distribution.  The `np.random` module provides several random number generators.  For example, here we produce an array of 5 random samples taken from a normal distribution with mean 5 and standard deviation 1:

In [24]:
rng = np.random.RandomState(0)  # <-- seed value, do not have to specify, but useful for reproducibility

In [25]:
rng.normal(loc=5, scale=1, size=5)

array([ 6.76405235,  5.40015721,  5.97873798,  7.2408932 ,  6.86755799])

Or the same, but from a uniform distribution:

In [26]:
uni = rng.uniform(-10, 10, size=5)  # 5 random numbers, picked from a uniform distribution between -10 and 10
print(uni)

[-1.24825577  7.83546002  9.27325521 -2.33116962  5.83450076]
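
NumPy 1.17 and later also provide `np.random.default_rng`, which returns a `Generator` object with similar methods. The sketch below assumes such a version is installed and shows that two generators created with the same seed produce identical samples; the numbers themselves differ from the `RandomState` output above.

```python
import numpy as np

rng_a = np.random.default_rng(0)
rng_b = np.random.default_rng(0)

samples_a = rng_a.normal(loc=5, scale=1, size=5)
samples_b = rng_b.normal(loc=5, scale=1, size=5)

print(np.array_equal(samples_a, samples_b))  # True: same seed, same samples

uniform = rng_a.uniform(-10, 10, size=5)     # 5 draws between -10 and 10
```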


## Indexing with other arrays

Above we saw how to index arrays with single numbers and slices, just like Python lists.  But arrays allow for a more sophisticated kind of indexing which is very powerful: you can index an array with another array, and in particular with an array of boolean values.  This is particularly useful to extract information from an array that matches a certain condition.

Consider for example that in the array `uni` we want to replace all values above 0 with the value 10.  We can do so by first finding the *mask* that indicates where this condition is true or false:

In [27]:
mask = uni > 0
mask

array([False,  True,  True, False,  True], dtype=bool)

Now that we have this mask, we can use it either to read those values or to set them to 10:

In [28]:
print('Array:', uni)
print('Masked array:', uni[mask])

Array: [-1.24825577  7.83546002  9.27325521 -2.33116962  5.83450076]
Masked array: [ 7.83546002  9.27325521  5.83450076]


In [29]:
uni[mask] = 10
print(uni)

[ -1.24825577  10.          10.          -2.33116962  10.        ]
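
Masks can also be combined, and `np.where` offers a way to get the replaced values without modifying the original array. A minimal sketch with made-up values:

```python
import numpy as np

values = np.array([-1.5, 7.8, 9.3, -2.3, 5.8])

# Combine conditions with & (and), | (or), ~ (not); the parentheses are required.
in_range = (values > 0) & (values < 8)
print(values[in_range])   # [7.8 5.8]

# np.where builds a new array instead of modifying `values` in place.
replaced = np.where(values > 0, 10.0, values)
```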


### Arrays with more than one dimension

Most of the array creation methods can be used to construct >1D arrays:

In [30]:
np.zeros((3, 4))

array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

In [31]:
np.zeros((2, 3, 2, 2))

array([[[[ 0.,  0.],
         [ 0.,  0.]],

        [[ 0.,  0.],
         [ 0.,  0.]],

        [[ 0.,  0.],
         [ 0.,  0.]]],


       [[[ 0.,  0.],
         [ 0.,  0.]],

        [[ 0.,  0.],
         [ 0.,  0.]],

        [[ 0.,  0.],
         [ 0.,  0.]]]])

We can also reshape arrays to fit the desired shape:

In [32]:
np.arange(12)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [33]:
arr = np.arange(12).reshape((3, 4))
arr

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

With two-dimensional arrays we start seeing the power of numpy: while a nested list must be indexed by applying the `[ ]` operator repeatedly, multidimensional arrays support a more direct indexing syntax with a single `[ ]` and a set of indices separated by commas:

In [34]:
arr[0, 1]

1

In [35]:
arr[:, 0]

array([0, 4, 8])

If you only provide one index, then you will get an array with one fewer dimension containing that row:

In [36]:
print('First row:  ', arr[0])
print('Second row: ', arr[1])

First row:   [0 1 2 3]
Second row:  [4 5 6 7]
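
The comma syntax also accepts a slice in each position, which makes it easy to pull out a column or a rectangular block. A short sketch using the same 3x4 array:

```python
import numpy as np

arr = np.arange(12).reshape((3, 4))

element = arr[1, 2]        # row 1, column 2 -> 6
block = arr[:2, 1:3]       # first two rows, columns 1 and 2
last_column = arr[:, -1]   # every row, last column

print(block)
# [[1 2]
#  [5 6]]
```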


## Slicing, repeating, tiling

Extracting elements from a NumPy array works much as it does with lists:

In [37]:
x = np.array([1,2,3,4,5,6,7])
print(x[3:])
print(x[::-1])

[4 5 6 7]
[7 6 5 4 3 2 1]


In [38]:
print(x[::-2])

[7 5 3 1]


In [39]:
print(x[::-5])

[7 2]


In [40]:
print(x[::-10])

[7]


In [41]:
print(x[3::-1])

[4 3 2 1]


In [42]:
print(x[-2::-3])

[6 3]


In [43]:
print(x[-4::-1])

[4 3 2 1]
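
One important difference from lists is that a basic slice of an array is a *view* onto the same data, not a copy. The sketch below (names are illustrative) shows that writing to a slice changes the original array, and that `.copy()` gives an independent array:

```python
import numpy as np

lst = [1, 2, 3, 4]
lst_slice = lst[:2]
lst_slice[0] = 99          # the original list is untouched

arr = np.array([1, 2, 3, 4])
arr_view = arr[:2]
arr_view[0] = 99           # writes through to `arr`

arr_copy = arr[:2].copy()  # an explicit copy is independent
arr_copy[1] = -1

print(lst)   # [1, 2, 3, 4]
print(arr)   # [99  2  3  4]
```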


### Other numpy functions and array properties

Now that we have seen how to create arrays with more than one dimension, let's take a look at some other properties.

In [44]:
print('Data type                :', arr.dtype)
print('Total number of elements :', arr.size)
print('Number of dimensions     :', arr.ndim)
print('Shape (dimensionality)   :', arr.shape)
print('Memory used (in bytes)   :', arr.nbytes)

Data type                : int64
Total number of elements : 12
Number of dimensions     : 2
Shape (dimensionality)   : (3, 4)
Memory used (in bytes)   : 96


There are also many useful functions in numpy that operate on arrays, e.g.:

In [45]:
print('Minimum and maximum             :', np.min(arr), np.max(arr))
print('Sum and product of all elements :', np.sum(arr), np.prod(arr))
print('Mean and standard deviation     :', np.mean(arr), np.std(arr))

Minimum and maximum             : 0 11
Sum and product of all elements : 66 0
Mean and standard deviation     : 5.5 3.45205252953


By default, these functions are computed over all the elements of the array.  But for a multidimensional array, it's possible to do the computation along a single dimension, by passing the `axis` parameter; for example:

In [46]:
print('For the following array:\n', arr)
print('The sum of elements along the rows is    :', np.sum(arr, axis=1))
print('The sum of elements along the columns is :', np.sum(arr, axis=0))

For the following array:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
The sum of elements along the rows is    : [ 6 22 38]
The sum of elements along the columns is : [12 15 18 21]


As you can see in this example, the value of the `axis` parameter is the dimension which will be *consumed* once the operation has been carried out.  This is why, to add up the elements of each row, we use `axis=1`: the column dimension (of length 4) is collapsed, leaving one sum per row.
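
A quick way to check which dimension was consumed is to look at the shape of the result. The sketch below also uses the `keepdims` argument, which keeps the consumed dimension with length 1:

```python
import numpy as np

arr = np.arange(12).reshape((3, 4))

sum_axis0 = np.sum(arr, axis=0)                    # axis 0 (length 3) consumed -> shape (4,)
sum_axis1 = np.sum(arr, axis=1)                    # axis 1 (length 4) consumed -> shape (3,)
sum_axis1_2d = np.sum(arr, axis=1, keepdims=True)  # consumed axis kept -> shape (3, 1)

print(sum_axis0.shape, sum_axis1.shape, sum_axis1_2d.shape)  # (4,) (3,) (3, 1)
```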

Another widely used property of arrays is the `.T` attribute, which allows you to access the transpose of the array (NumPy does this without making a copy of the array, by manipulating its strides):

In [47]:
print('Array:\n', arr)
print('Transpose:\n', arr.T)
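
Because no copy is made, the transpose and the original array share the same memory. A small sketch that checks this with `np.shares_memory` and shows a write through the transpose reaching the original:

```python
import numpy as np

arr = np.arange(12).reshape((3, 4))
arr_t = arr.T

print(arr_t.shape)                   # (4, 3)
print(np.shares_memory(arr, arr_t))  # True

arr_t[0, 1] = 100                    # same element as arr[1, 0]
print(arr[1, 0])                     # 100
```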

## Operating with arrays

Standard mathematical operations Just Work (TM):

In [48]:
arr1 = np.arange(4)
arr2 = np.arange(10, 14)
print(arr1, '+', arr2, '=', arr1 + arr2)

[0 1 2 3] + [10 11 12 13] = [10 12 14 16]


Note that, unlike in MATLAB, multiplication is performed element-wise:

In [49]:
print(arr1, '*', arr2, '=', arr1 * arr2)

[0 1 2 3] * [10 11 12 13] = [ 0 11 24 39]


While this means that, in principle, arrays must always match in their dimensionality in order for an operation to be valid, numpy will *broadcast* dimensions when possible.  For example, suppose that you want to add the number 3 to every element of `arr1`; this works:

In [50]:
arr1 + 3

array([3, 4, 5, 6])

<img src="files/broadcast_1D.png"/>

### The broadcasting rules

This broadcasting behavior is powerful, especially because when numpy broadcasts to create new dimensions or to 'stretch' existing ones, it doesn't replicate the data.  In the example above the operation is carried out *as if* the 3 were a 1-d array with 3 in all of its entries, but no actual array was ever created.  This can save memory in cases when the arrays in question are large, with significant performance implications.
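
The "as if" can be made visible with `np.broadcast_to`, which returns a read-only view of the stretched value. In this sketch the stride of 0 bytes indicates that every entry of the view refers to the same single value in memory:

```python
import numpy as np

arr1 = np.arange(4)
stretched = np.broadcast_to(3, arr1.shape)   # read-only view of a single value

print(stretched)           # [3 3 3 3]
print(stretched.strides)   # (0,): every entry points at the same memory

total = arr1 + stretched   # same result as arr1 + 3
```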

The general rule is: when operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions and works its way forward, treating missing leading dimensions as having length 1. Two dimensions are considered compatible when

* they are equal, or
* either of them is 1 (or missing)

If these conditions are not met, a `ValueError: operands could not be broadcast together` exception is raised, indicating that the arrays have incompatible shapes. The size of the resulting array is the maximum size along each dimension of the input arrays.

Examples below:

```
(9, 5)   (9, 5)   (9, 5)   (9, 1)
   ( )   (9, 1)   (   5)   (   5)
------   ------   ------   ------
(9, 5)   (9, 5)   (9, 5)   (9, 5)

```
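
The rule can be sketched in a few lines of plain Python. The helper `broadcast_shape` below is a hypothetical illustration written for this tutorial, not a NumPy function; it reproduces the four examples above:

```python
from itertools import zip_longest

import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast shape of two shapes, or raise ValueError."""
    result = []
    # Walk both shapes from the trailing dimension; missing dimensions count as 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or a == 1 or b == 1:
            result.append(b if a == 1 else a)
        else:
            raise ValueError('incompatible shapes {} and {}'.format(shape_a, shape_b))
    return tuple(reversed(result))


print(broadcast_shape((9, 5), ()))      # (9, 5)
print(broadcast_shape((9, 5), (9, 1)))  # (9, 5)
print(broadcast_shape((9, 5), (5,)))    # (9, 5)
print(broadcast_shape((9, 1), (5,)))    # (9, 5)

# NumPy's own answer for the last case, for comparison.
print(np.broadcast(np.empty((9, 1)), np.empty((5,))).shape)  # (9, 5)
```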

<img src="files/broadcast_rougier.png"/>

Sketch from [Nicolas Rougier's NumPy tutorial](http://www.labri.fr/perso/nrougier/teaching/numpy/numpy.html)

### Visual illustration of broadcasting
<center>
<img src="files/numpy_broadcasting.svg" width=80%>
</center>

Sometimes, it is necessary to modify arrays before they can be used together.  Numpy allows you to add new axes to an array by indexing with `np.newaxis`:

In [51]:
c = np.arange(5)
d = np.arange(6)

print(c.shape)
print(d.shape)

c + d

(5,)
(6,)


ValueError: operands could not be broadcast together with shapes (5,) (6,) 

In [52]:
c = np.arange(5)
d = np.arange(6)

c = c[:, np.newaxis]

print(c.shape)
print('  ', d.shape)
print('-------')
print((c + d).shape)
 
#   d d d d d d
#
# c              c c c c c c   d d d d d d
# c              c c c c c c   d d d d d d
# c     +      = c c c c c c + d d d d d d
# c              c c c c c c   d d d d d d
# c              c c c c c c   d d d d d d

c + d

(5, 1)
   (6,)
-------
(5, 6)


array([[0, 1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5, 6],
       [2, 3, 4, 5, 6, 7],
       [3, 4, 5, 6, 7, 8],
       [4, 5, 6, 7, 8, 9]])
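
The same pattern works for any element-wise operation. As a small sketch, a multiplication table can be built by broadcasting a column against a row, which gives the same result as `np.outer`:

```python
import numpy as np

x = np.arange(1, 4)             # shape (3,)
table = x[:, np.newaxis] * x    # (3, 1) * (3,) -> (3, 3)

print(table)
# [[1 2 3]
#  [2 4 6]
#  [3 6 9]]

same = np.array_equal(table, np.outer(x, x))
print(same)                     # True
```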

For the full broadcasting rules, please see the official Numpy docs, which describe them in detail and with more complex examples.

Also see: [G-Node Summer School Advanced NumPy tutorial](https://github.com/stefanv/teaching/blob/master/2014_assp_split_numpy/numpy_advanced.ipynb)

As we mentioned before, Numpy ships with a full complement of mathematical functions that work on entire arrays, including logarithms, exponentials, trigonometric and hyperbolic trigonometric functions, etc.  Furthermore, scipy ships a rich special function library in the `scipy.special` module that includes Bessel, Airy, Fresnel, Laguerre and other classical special functions.  For example, sampling the sine function at 100 points between $0$ and $2\pi$ is as simple as:

In [53]:
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)
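
Because these functions operate element-wise, they can be combined freely in a single expression. A brief sketch that checks the identity sin²(x) + cos²(x) = 1 over the whole grid:

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

identity = np.sin(x) ** 2 + np.cos(x) ** 2   # should be 1 everywhere, up to rounding
print(np.allclose(identity, 1.0))            # True

damped = np.exp(-x) * np.sin(x)              # functions compose element-wise
print(damped.shape)                          # (100,)
```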

## Linear algebra in numpy

Numpy ships with a basic linear algebra library, and all arrays have a `dot` method whose behavior is that of the scalar dot product when its arguments are vectors (one-dimensional arrays) and the traditional matrix multiplication when one or both of its arguments are two-dimensional arrays:

In [54]:
v1 = np.array([2, 3, 4])
v2 = np.array([1, 0, 1])
print(v1, '·', v2, '=', v1.dot(v2))

[2 3 4] · [1 0 1] = 6


In [55]:
A = np.arange(6).reshape(2, 3)
print('A:\n', A)
print('v1:\n', v1)

A:
 [[0 1 2]
 [3 4 5]]
v1:
 [2 3 4]


In [56]:
A.dot(v1)

array([11, 38])

For matrix-matrix multiplication, the same dimension-matching rules must be satisfied, e.g. consider the difference between $A \times A^T$:

In [57]:
print(A.dot(A.T))

[[ 5 14]
 [14 50]]


and $A^T \times A$:

In [58]:
print(A.T.dot(A))

[[ 9 12 15]
 [12 17 22]
 [15 22 29]]


Since Python 3.5, the `@` operator lets us write this as:

```
A.T @ A
```

Furthermore, the `numpy.linalg` module includes additional functionality such as determinants, matrix norms, Cholesky, eigenvalue and singular value decompositions, etc.  For even more linear algebra tools, `scipy.linalg` contains the majority of the tools in the classic LAPACK libraries, and `scipy.sparse.linalg` provides functions to operate on sparse matrices.  We refer the reader to the Numpy and Scipy documentation for additional details on these.
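
As a small taste of `numpy.linalg`, here is a sketch that solves a 2x2 linear system and computes a determinant and eigenvalues. The matrix `M` and vector `b` are made-up values chosen so the answer is easy to check by hand:

```python
import numpy as np

M = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(M, b)         # solves M @ x = b; here x is [2, 3]
det = np.linalg.det(M)            # 3*2 - 1*1 = 5
eigvals = np.linalg.eigvalsh(M)   # eigenvalues of a symmetric matrix, in ascending order

print(np.allclose(M @ x, b))      # True
```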

## Reading and writing arrays to disk

Numpy lets you read and write arrays into files in a number of ways.  In order to use these tools well, it is critical to understand the difference between a *text* and a *binary* file containing numerical data.  In a text file, the number $\pi$ could be written as "3.141592653589793", for example: a string of digits that a human can read, in this case with 15 decimal digits.  In contrast, that same number written to a binary file would be encoded as 8 bytes that are not readable by a human but which contain exactly the same data that the variable `pi` had in the computer's memory.

The tradeoffs between the two modes are thus:

* Text mode: occupies more space, precision can be lost (if not all digits are written to disk), but is readable and editable by hand with a text editor.  Can *only* be used for one- and two-dimensional arrays.

* Binary mode: compact and exact representation of the data in memory, can't be read or edited by hand.  Arrays of any size and dimensionality can be saved and read without loss of information.

First, let's see how to read and write arrays in text mode.  The `np.savetxt` function saves an array to a text file, with options to control the precision, separators and even adding a header:

In [None]:
arr = np.arange(10).reshape(2, 5)
print(arr)                            
np.savetxt('test.out', arr)

In [None]:
!cat test.out

And this same type of file can then be read with the matching `np.loadtxt` function:

In [None]:
arr2 = np.loadtxt('test.out')
print(arr2)
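
The formatting options and the precision trade-off can be seen together in a short sketch. It writes to an in-memory `io.StringIO` buffer rather than a file on disk, and the column names in the header are made up for illustration:

```python
import io

import numpy as np

data = np.array([[np.pi, np.e], [1.0 / 3.0, 2.0 / 3.0]])

# Write to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
np.savetxt(buffer, data, fmt='%.3f', delimiter=',', header='col_a,col_b')
print(buffer.getvalue())

buffer.seek(0)
restored = np.loadtxt(buffer, delimiter=',')   # the header line starts with '#', so it is skipped

print(np.array_equal(data, restored))          # False: digits beyond the third decimal were dropped
print(np.allclose(data, restored, atol=1e-3))  # True
```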

For binary data, Numpy provides the `np.save` and `np.savez` routines.  The first saves a single array to a file with `.npy` extension, while the latter can be used to save a *group* of arrays into a single file with `.npz` extension.  The files created with these routines can then be read with the `np.load` function.

Let us first see how to use the simpler `np.save` function to save a single array:

In [None]:
np.save('test.npy', arr)
# Now we read this back
arr_loaded = np.load('test.npy')

print(arr)
print(arr_loaded)

print(arr_loaded.dtype)

# Let's see if any element is non-zero in the difference.
# A value of True would be a problem.
print('Any differences?', np.any(arr - arr_loaded))

Now let us see how `np.savez_compressed`, the compressed variant of `np.savez`, works.

In [None]:
np.savez_compressed('test.npz', first=arr, second=arr2)
arrays = np.load('test.npz')
arrays.files

The object returned by `np.load` from an `.npz` file works like a dictionary:

In [None]:
arrays['first']
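
The object returned for an `.npz` archive can also be used as a context manager, which closes the underlying file when the block ends. A minimal sketch, using an in-memory `io.BytesIO` buffer in place of a file on disk:

```python
import io

import numpy as np

a = np.arange(10).reshape(2, 5)
b = np.linspace(0, 1, 4)

buffer = io.BytesIO()                # stands in for a file such as 'test.npz'
np.savez(buffer, first=a, second=b)
buffer.seek(0)

with np.load(buffer) as archive:     # the archive is closed on leaving the block
    names = sorted(archive.files)
    first = archive['first']
    second = archive['second']

print(names)                         # ['first', 'second']
```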

This `.npz` format is a very convenient way to package compactly and without loss of information, into a single file, a group of related arrays that pertain to a specific problem.  At some point, however, the complexity of your dataset may be such that the optimal approach is to use one of the standard formats in scientific data processing that have been designed to handle complex datasets, such as NetCDF or HDF5.  

Fortunately, there are tools for manipulating these formats in Python, and for storing data in other ways such as databases.  A complete discussion of the possibilities is beyond the scope of this tutorial, but of particular interest for scientific users we at least mention the following:

* The `scipy.io` module contains routines to read and write Matlab files in `.mat` format and files in the NetCDF format that is widely used in certain scientific disciplines.

* For manipulating files in the HDF5 format, there are two excellent options in Python: the **PyTables** project offers a high-level, object oriented approach to manipulating HDF5 datasets, while the **h5py** project offers a more direct mapping to the standard HDF5 library interface.  Both are excellent tools; if you need to work with HDF5 datasets you should read some of their documentation and examples and decide which approach is a better match for your needs.

In [None]:
arrays['second']

## Advanced Language Strings 

In [None]:
'funky town'.capitalize()

In [None]:
'funky town'.capitalize().split()

In [None]:
[x.capitalize() for x in 'funky town'.split()]

In [None]:
'I want to take you to funky town'.split('to')

#### .strip(), .join() and .replace()

In [None]:
csv_string = 'cat, dog, spam, mouse, 3.1432'

In [None]:
csv_string.strip('\t')

In [None]:
csv_clean = [x.strip() for x in csv_string.split(',')]

In [None]:
csv_clean

In [None]:
csv_string = "cat, dog, spam, mouse,, 3.1432"

In [None]:
csv_string.replace(",,", ",")
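
The `.join()` method goes in the opposite direction to `.split()`: it glues a list of strings together with a separator. A short sketch that cleans the fields and reassembles the string:

```python
csv_string = 'cat, dog, spam, mouse,, 3.1432'

# Split on commas, trim whitespace, and drop empty fields.
fields = [field.strip() for field in csv_string.split(',')]
fields = [field for field in fields if field]

rejoined = ', '.join(fields)
print(rejoined)   # cat, dog, spam, mouse, 3.1432
```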

### String formatting 

In [None]:
'On {0}, I feel {1}'.format('Saturday', 'excited')

In [None]:
'{desire} to {place}'.format(desire = 'Fly me', place = 'the moon')

In [None]:
f = {'desire': 'Fly me', 'place': 'the moon'}

In [None]:
'{desire} to {place}'.format(**f)

### Formatting comes after a colon :

In [None]:
'{0:00.2f}'.format(3.14159,42)
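
A few more format specifications, as a sketch of the most common options (precision, width, zero padding and alignment). Formatted string literals, available since Python 3.6, accept the same specification after the colon:

```python
value = 3.14159

two_decimals = '{:.2f}'.format(value)          # two digits after the decimal point
padded = '{:8.2f}|'.format(value)              # right-aligned in a width of 8
zero_padded = '{:08.2f}'.format(value)         # zero-padded to a width of 8
aligned = '{:<6}|{:>6}|'.format('ab', 'cd')    # left- and right-aligned text
scientific = f'{value:.3e}'                    # f-strings use the same format spec

print(two_decimals)   # 3.14
print(padded)         #     3.14|
print(zero_padded)    # 00003.14
print(aligned)        # ab    |    cd|
print(scientific)     # 3.142e+00
```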

# File I/O (read/write) 

`open()` is a built-in function that returns a file object; `.close()` is a method of that object.

Open modes: `r` (read), `w` (write), `r+` (read + update), `rb` (read as binary stream, ...), `rt` (read as text file)

Writing data: `.write()` or `.writelines()`

In [None]:
%%file mydata.dat
This is my zeroth file I/O. Zing!

In [None]:
file_stream = open('mydata.dat', 'r')
print(type(file_stream))
file_stream.close()

In [None]:
f = open('test.dat', 'w')
f.write('This is my first I/O file. Zing! again.')
f.close()


In [None]:
! cat mydata.dat

! cat test.dat

In [None]:
f = open('test.dat', 'w')
f.writelines([ "a = ['This is my first I/O file.']\n", 'Zing! again.'])
f.close()
print(type('test.dat'))
! cat test.dat

In [None]:
f = open('test.dat', 'r')
data = f.readlines()   # a list with one string per line
f.close()
print(data)
print(type(data))
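
In practice, files are usually opened in a `with` block, which closes the file automatically even if an error occurs. A minimal sketch, writing to a temporary directory so that no files are left behind (the file name is illustrative):

```python
import os
import tempfile

# Write and read a small file inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.dat')

    with open(path, 'w') as f:    # closed automatically, even on error
        f.writelines(['first line\n', 'second line\n'])

    with open(path, 'r') as f:
        lines = [line.rstrip('\n') for line in f]   # iterate line by line

print(lines)   # ['first line', 'second line']
```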

# Lambda functions

Lambda expressions create small anonymous functions, an idea borrowed from Lisp and functional programming.

In [None]:
import math
tmp = lambda x: x**2
print(type(tmp))

In [None]:
tmp(4)

In [None]:
# forget about creating a function name... just do it

(lambda x, y: x**2 + y)(2, 6)

In [None]:
#Creating a list of lambda functions 
lamfun = [lambda x: x**2, lambda x: x**3, lambda x: x**4, \
             lambda y: math.sqrt(y) if y >= 0 else 'Really?' '{0:00.1f}'.format(y)]

In [None]:
for l in lamfun: print(l(-2))
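
Lambdas are most often used as short, throwaway arguments to other functions. A small sketch with `sorted` and `map` (the word list is made up):

```python
words = ['funky', 'Town', 'a', 'Zing']

by_length = sorted(words, key=lambda w: len(w))
case_insensitive = sorted(words, key=lambda w: w.lower())
squares = list(map(lambda x: x ** 2, range(5)))

print(by_length)          # ['a', 'Town', 'Zing', 'funky']
print(case_insensitive)   # ['a', 'funky', 'Town', 'Zing']
print(squares)            # [0, 1, 4, 9, 16]
```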

### Some working examples

In [None]:
%matplotlib inline

import requests
import numpy as np
import pandas as pd # pandas
import matplotlib.pyplot as plt # module for plotting
import datetime as dt # module for manipulating dates and times
import numpy.linalg as lin # module for performing linear algebra operations
import matplotlib

import sklearn.decomposition
import sklearn.metrics
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
# sklearn.cross_validation was removed in scikit-learn 0.20; model_selection replaces it
from sklearn.model_selection import cross_val_score



In [None]:
data_df = pd.read_csv('DatasetGP.csv')

In [None]:
data_df.head()

In [None]:
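# Training set: rows 1 to 479 (note that row 0 is skipped by this slice)
trainX = data_df.values[1:480,0:-1]   # every column except the last
trainY = data_df.values[1:480,2]      # column 2 is used as the target

In [None]:
# Test set: rows 481 to 599 (row 480 falls in neither set)
testX = data_df.values[481:600, 0:-1]
testY = data_df.values[481:600, 2]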

In [None]:
trainX

In [None]:
trainY

In [None]:
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import cross_val_score


def crossValidation_all(theta, nugget, nfold, trainX, trainY):
    """Box-plot the cross-validated R2 score for every (theta, nugget) pair."""
    # The legacy GaussianProcess(theta0=..., nugget=...) estimator was removed in
    # scikit-learn 0.20.  As an approximate stand-in (an assumption, not an exact
    # equivalent), theta sets a fixed RBF length scale and nugget is passed as alpha.
    scores = np.zeros((len(nugget) * len(theta), nfold))
    labels = []

    k = 0
    for t in theta:
        for n in nugget:
            kernel = RBF(length_scale=1.0 / np.sqrt(2.0 * t))
            gp = GaussianProcessRegressor(kernel=kernel, alpha=n, optimizer=None, normalize_y=True)
            scores[k, :] = cross_val_score(gp, trainX, trainY, scoring='r2', cv=nfold)
            labels.append('{}|{:.2f}'.format(t, n))
            k = k + 1

    plt.figure(figsize=(10,4))
    plt.boxplot(scores.T, sym='b+', labels = labels, whis = 0.5)
    plt.ylim([0,1])
    plt.title('R2 score as a function of theta and nugget')
    plt.ylabel('R2 Score')
    plt.xlabel('Choice of theta | nugget')
    plt.show()


theta = np.arange(1, 8, 2)
nfold = 10
nugget = np.arange(0.01, 0.2, 0.03)

crossValidation_all(theta, nugget, nfold, trainX, trainY)

In [None]:
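from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF


def predictAll(theta, nugget, trainX, trainY, testX, testY, title):
    """Fit a GP on the training data and compare its predictions with the test targets."""
    # Approximate replacement for the removed GaussianProcess(theta0=..., nugget=...)
    # estimator (an assumption): fixed RBF length scale from theta, nugget passed as alpha.
    kernel = RBF(length_scale=1.0 / np.sqrt(2.0 * theta))
    gp = GaussianProcessRegressor(kernel=kernel, alpha=nugget, optimizer=None, normalize_y=True)
    gp.fit(trainX, trainY)

    # return_std=True gives the predictive standard deviation directly
    predictedY, sigma = gp.predict(testX, return_std=True)

    # testY is a numpy array, so collect the results in a DataFrame
    results = pd.DataFrame({'observed': testY, 'predictedY': predictedY, 'sigma': sigma})

    print("Train score R2:", gp.score(trainX, trainY))
    print("Test score R2:", sklearn.metrics.r2_score(testY, predictedY))

    plt.figure(figsize = (9,8))
    plt.scatter(testY, predictedY)
    plt.plot([min(testY), max(testY)], [min(testY), max(testY)], 'r')
    plt.xlim([min(testY), max(testY)])
    plt.ylim([min(testY), max(testY)])
    plt.title('Predicted vs. observed: ' + title)
    plt.xlabel('Observed')
    plt.ylabel('Predicted')
    plt.show()

    return gp, results

gp_daily, results_daily = predictAll(3, 0.04, trainX, trainY, testX, testY, 'Daily')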