<center>
    <img src="http://sct.inf.utfsm.cl/wp-content/uploads/2020/04/logo_di.png" style="width:60%">
    <h1> INF285 - Computación Científica </h1>
    <h2> List Comprehension vs NumPy </h2>
    <h2> <a href="#acknowledgements"> [S]cientific [C]omputing [T]eam </a> </h2>
    <h2> Version: 1.00</h2>
</center>

<div id='toc' />

## Table of Contents
* [Introduction](#intro)
* [First Task](#first)
* [Second Task](#second)
* [Final Task](#final)
* [Acknowledgements](#acknowledgements)

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter
import matplotlib as mpl
mpl.rcParams['font.size'] = 14
mpl.rcParams['axes.labelsize'] = 20
mpl.rcParams['xtick.labelsize'] = 14
mpl.rcParams['ytick.labelsize'] = 14
%matplotlib inline

<div id='intro' />

# Introduction
[Back to TOC](#toc)

In this Jupyter Notebook we compare list comprehensions with NumPy for numerical computation, i.e. in Scientific Computing.
The idea is to highlight the main advantage of vectorized computation with NumPy over the plain Python you learn first.
The goal is to give you the tools to write cleaner and faster code in Python, where cleaner means fewer lines of code that are more readable, and faster means it takes far less time than a *traditional* implementation without NumPy.
Notice that we do not include __map__ in the comparison since, in most of the cases I have seen, list comprehensions are more common.

We strongly suggest that you take a look at the Jupyter Notebook **Bonus - 00 - The beginning.ipynb** to learn more about vectorized computing and other advantages of NumPy.

This Jupyter Notebook is organized as a sequence of tasks and comparisons, so it is important that you go through all of them.

<div id='first' />

# First Task: Build a list with the integers from 1 to N=100000.
[Back to TOC](#toc)

In [2]:
N = int(1e5)

# Python base
def buildListIntegers(N):
    out = []
    for i in range(1,N+1):
        out.append(i)
    return out
i1 = buildListIntegers(N)

t_out1 = %timeit -o buildListIntegers(N)

# NumPy version
i2 = np.arange(1,N+1, dtype=int)

t_out2 = %timeit -o np.arange(1,N+1, dtype=int)

4.13 ms ± 398 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
41.7 µs ± 2.72 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


In [3]:
print('NumPy is', t_out1.average/t_out2.average, 'times faster than the Python base version using \'append\'!!')

NumPy is 99.0342332003033 times faster than the Python base version using 'append'!!
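
Since this notebook is about list comprehensions, it is worth noting that the `append` loop above can also be written as a list comprehension or with `list(range(...))`. The following is a minimal sketch and not part of the original measurements. It uses the standard `timeit` module instead of the `%timeit` magic so it also runs outside Jupyter, and the names `build_with_comprehension`, `build_with_range` and `best_time` are ours, introduced only for illustration.

```python
import timeit
import numpy as np

N = int(1e5)

def build_with_comprehension(N):
    # List comprehension: same result as the 'append' loop, in one line.
    return [i for i in range(1, N + 1)]

def build_with_range(N):
    # Materialize the range object directly as a list.
    return list(range(1, N + 1))

def best_time(func, repeat=5, number=20):
    # Best average time per call (in seconds) over a few repetitions.
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number

t_comp = best_time(lambda: build_with_comprehension(N))
t_range = best_time(lambda: build_with_range(N))
t_numpy = best_time(lambda: np.arange(1, N + 1, dtype=int))

print('List comprehension:', t_comp, 's per call')
print('list(range(...)):  ', t_range, 's per call')
print('np.arange:         ', t_numpy, 's per call')
```

The absolute numbers depend on the machine, so we do not quote any here; the point is only that all three produce the same sequence of integers.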


<div id='second' />

# Second Task: Square the elements of the lists of integers.
[Back to TOC](#toc)

In [4]:
# Python base
i1_sq = [i**2 for i in i1]
t_out3 = %timeit -o [i**2 for i in i1]

# Intermediate implementation, notice the effect.
t_out3b = %timeit -o np.power(i1,2)

# NumPy version
i2_sq = np.power(i2,2)
t_out4a = %timeit -o i2**2
t_out4b = %timeit -o np.power(i2,2)

In this case we have used two versions of the NumPy alternative; both are much faster than the traditional Python version.

In [None]:
print('NumPy, version a, is', t_out3.average/t_out4a.average, 'times faster than the Python base version!!')
print('NumPy, version b, is', t_out3.average/t_out4b.average, 'times faster than the Python base version!!')

print('The intermediate implementation is OK but not the best, it is only',t_out3.average/t_out3b.average,'times better.')
print('The issue for the intermediate implementation is that NumPy needs to translate i1 to a NumPy array first, \nand that\'s where the execution uses extra time.')

NumPy, version a, is 84.25401621585367 times faster than the Python base version!!
NumPy, version b, is 24.81369717123068 times faster than the Python base version!!
The intermediate implementation is OK but not the best, it is only 1.0666681291827487 times better.
The issue for the intermediate implementation is that NumPy needs to translate i1 to a NumPy array first, 
and that's where the execution uses extra time.
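
To see where the time of the intermediate implementation goes, one can time the list-to-array conversion on its own. The sketch below assumes the same `N` as above and uses `np.asarray` to make explicit the conversion that `np.power` performs implicitly when it receives a list. The helper `best_time` is a hypothetical name used only for illustration, and the timings depend on the machine.

```python
import timeit
import numpy as np

N = int(1e5)
i1 = list(range(1, N + 1))            # Python list of integers
i2 = np.arange(1, N + 1, dtype=int)   # NumPy array with the same values

def best_time(func, repeat=5, number=20):
    # Best average time per call (in seconds) over a few repetitions.
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number

t_convert = best_time(lambda: np.asarray(i1))        # conversion only
t_power_array = best_time(lambda: np.power(i2, 2))   # squaring an existing array
t_power_list = best_time(lambda: np.power(i1, 2))    # conversion + squaring

# Both calls return a NumPy array holding the same values.
same_values = np.array_equal(np.power(i1, 2), np.power(i2, 2))

print('Conversion list -> array:', t_convert, 's per call')
print('np.power on the array:   ', t_power_array, 's per call')
print('np.power on the list:    ', t_power_list, 's per call')
print('Same result:', same_values)
```

If the data is going to be used more than once, converting it a single time and keeping the array avoids paying this cost on every call.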


We clearly notice (at least on my machine) that version "a" is faster than version "b", and it is arguably more readable, but we encourage you to use version "b" for advanced tasks to avoid or reduce numerical issues.
For instance, in the Jupyter Notebook "Bonus - 00 - Playing with Julia.ipynb" we highlight an issue with adding numbers that is handled correctly by **np.sum**. Thus, using NumPy is not only about clarity and speed, it is also about the correctness of the computation.
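
We do not reproduce that notebook here. As an illustration only, the sketch below shows one well-known effect of adding many floating-point numbers: a naive left-to-right accumulation gathers rounding error, while **np.sum** on a contiguous array typically gathers much less, since NumPy uses pairwise summation in many cases. The choice of adding `0.1` one million times is our assumption for this example, and `math.fsum` is used as the reference value.

```python
import math
import numpy as np

n = 10**6
values = np.full(n, 0.1)   # one million copies of 0.1

# Naive accumulation, adding one element at a time from left to right.
naive = 0.0
for v in values.tolist():
    naive += v

# Reference: correctly rounded sum of the stored floating-point values.
reference = math.fsum(values.tolist())

numpy_sum = np.sum(values)

err_naive = abs(naive - reference)
err_numpy = abs(numpy_sum - reference)

print('Error of the naive loop:', err_naive)
print('Error of np.sum:        ', err_numpy)
```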

<div id='final' />

# Third and Final Task: Evaluate the following expression,
[Back to TOC](#toc)

$$
\begin{align*}
[x_1,x_2]&= \displaystyle{\mathop{\mathrm{argmin}}_{\widehat{x}_1,\widehat{x}_2\in [-1,1]}}\,
\max_{x\in [-1,1]} |(x-\widehat{x}_1)\,(x-\widehat{x}_2)|.
\end{align*}
$$
This expression appears in the Jupyter Notebook **Bonus - 05 - Finding 2 Chebyshev PointsGraphically.ipynb**, where we explain why it is important.
Now, we will just evaluate it.

In [None]:
# We assume we will work over a discrete grid of points in [-1,1]
# and the number of discrete points will be N. This means that
# the continuous variables "x", "x_1", "x_2", "hat{x}_1" and "hat{x}_2"
# will be discretized in N points.

# Python base version
def find_x1_x2(N):
    list_i = buildListIntegers(N)
    # Here we apply a linear transformation to build the discrete version of "x",
    # which is stored in "x_discrete"; its elements are denoted by "xi" below.
    x_discrete = [(2*(i-1)/(N-1))-1 for i in list_i]
    min_value_outer = -1
    x1_hat_min = -1
    x2_hat_min = -1
    for x1_hat in x_discrete:
        for x2_hat in x_discrete:
            max_value_inner=-1
            for xi in x_discrete:
                value_tmp = abs((xi-x1_hat)*(xi-x2_hat))
                # Here we find the max |(x-x1_hat)*(x-x2_hat)|
                if value_tmp>max_value_inner:
                    max_value_inner=value_tmp
            if min_value_outer == -1 or min_value_outer>max_value_inner:
                min_value_outer = max_value_inner
                x1_hat_min = x1_hat
                x2_hat_min = x2_hat
    return x1_hat_min, x2_hat_min

# NumPy version, storing intermediate values.
def find_x1_x2_NumPy(N):
    x=np.linspace(-1,1,N)
    w = lambda x1,x2: np.max(np.abs((x-x1)*(x-x2)))
    wv=np.vectorize(w)
    [X,Y]=np.meshgrid(x,x)
    W=wv(X,Y)
    id_min = np.unravel_index(np.argmin(W, axis=None),W.shape)
    x1_hat = X[id_min]
    x2_hat = Y[id_min]
    return x1_hat, x2_hat

# Debugging the output
# print('found Python base:', find_x1_x2(N))
# print('found Numpy:', find_x1_x2_NumPy(N))
# print(np.sqrt(2)/2)

N = 350
t_out5 = %timeit -o find_x1_x2(N)
t_out6 = %timeit -o find_x1_x2_NumPy(N)

print('The NumPy version is', t_out5.average/t_out6.average, 'times faster than the Python base version!!')
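
Note that `np.vectorize` is provided mainly for convenience: according to the NumPy documentation its implementation is essentially a for loop, so the loops over `x1` and `x2` still run in Python. A possible alternative, sketched below under the assumption that an array with `N**3` floats fits in memory, replaces all three loops with broadcasting. The function name `find_x1_x2_broadcast` is ours and is not part of the original notebook, and we use a smaller `N` here to keep the memory footprint low.

```python
import numpy as np

def find_x1_x2_broadcast(N):
    x = np.linspace(-1, 1, N)
    # D[k, i] = x_k - x_i, for every grid point x_k and every candidate x_i.
    D = x[:, None] - x[None, :]
    # P[k, i, j] = |(x_k - x_i) * (x_k - x_j)|, built by broadcasting.
    P = np.abs(D[:, :, None] * D[:, None, :])
    # W[i, j] = max over k, i.e. the inner maximization over x.
    W = P.max(axis=0)
    # Outer minimization over the pair of candidates (x_i, x_j).
    i, j = np.unravel_index(np.argmin(W), W.shape)
    return x[i], x[j]

N_small = 101   # 101**3 floats is about 8 MB
x1_hat, x2_hat = find_x1_x2_broadcast(N_small)
print('found with broadcasting:', x1_hat, x2_hat)
print('reference value sqrt(2)/2:', np.sqrt(2) / 2)
```

The two values returned should be close to $\pm\sqrt{2}/2$, up to the resolution of the grid. For large `N` the three-dimensional array becomes expensive, in which case one could loop over one of the candidate axes and broadcast over the other two.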

<div id='acknowledgements' />

# Acknowledgements
[Back to TOC](#toc)

* _Material created by professor Claudio Torres_ (`ctorres@inf.utfsm.cl`). DI UTFSM. _May 2022._