<a href="https://colab.research.google.com/github/kknippenberg11/CPHC-intro-numpy/blob/main/why.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Why NumPy?

### Limitations of Traditional Python **(in Scientific Applications)**

* Python's list can contain objects of different types
  * e.g.: a = [1,'2',3.0,1+3j,[4,5],{'UT':'SLC'}]
  * <font color="green"><b>Pro</b></font>: Very flexible
  * <font color="red"><b>Con</b></font>: Slow
* In scientific applications, we mostly need <b><font color="blue">homogeneous</font></b> arrays (e.g. all int, float, double, complex, or str)<br>
  * C: double x[20]
  * Fortran: double precision, dimension(20) :: x
* CPython is effectively <b><font color="red">single-threaded</font></b> (only one thread runs Python bytecode at a time) due to its <b><font color="red">Global Interpreter Lock (GIL)</font></b><br>
  https://wiki.python.org/moin/GlobalInterpreterLock
  * <b><font color="green">Increasing</font></b> #cores/CPU but <b><font color="red">decreasing</font></b> clock rates
  * => multi-threading becomes a conditio sine qua non
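
The contrast between a heterogeneous list and a homogeneous array can be sketched as follows. This is only an illustration: the variable names `mixed` and `x` are chosen here, and `np.float64` is assumed as the NumPy counterpart of the C/Fortran double-precision declarations above.

```python
import numpy as np

# A plain list may mix types; every element is a separate Python object.
mixed = [1, '2', 3.0, 1+3j, [4, 5], {'UT': 'SLC'}]
type_names = [type(item).__name__ for item in mixed]
print(type_names)

# An ndarray holds a single element type (dtype) in one block of memory,
# comparable to `double x[20]` in C.
x = np.zeros(20, dtype=np.float64)
print(x.dtype, x.shape, x.itemsize, x.nbytes)
```

Because every element of `x` has the same type and size, NumPy can loop over the data in compiled code without inspecting each item.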

### What is NumPy?

* started in 2006 (based on the earlier packages Numeric and Numarray)
* provides:
  1. An array object <font color="green"><b>(ndarray)</b></font> over arbitrary items of the <font color="green"><b>same</b></font> type (i.e., we don't have to check or know the type of each individual item).
  2. Fast mathematical operations over arrays
  3. Linear Algebra, Fourier Transforms, Random Number Generation
* can run in a <font color="green"><b>multi-threaded</b></font> fashion by relying on C/Fortran libraries<br>
  such as BLAS/LAPACK (MKL, OpenBLAS, ...) and pseudo-random number generators (PRNGs) in C/C++.
* forms the cornerstone of a lot of (Python-based) scientific packages, such as:
  * scipy: fundamental library for scientific computing
  * matplotlib: 2D plotting
  * pandas: data structures & analysis
  * dask: parallel scaling module
  * scikit-learn: machine learning
  * scikit-image: image processing in Python
  * ...
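
The three kinds of functionality listed above can be tried out in a few lines. The following is a minimal sketch, not a benchmark: the seed, the array sizes and the test signal are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)      # random number generation (seed is arbitrary)

# A 3x3 matrix made diagonally dominant so that the linear system is well behaved.
a = rng.random((3, 3)) + 3 * np.eye(3)
b = rng.random(3)

# Fast elementwise mathematics over whole arrays, no explicit Python loop.
y = np.sqrt(a) + 2.0 * a

# Linear algebra: solve a @ x = b.
x = np.linalg.solve(a, b)
print(np.allclose(a @ x, b))

# Fourier transform: a sine with 4 cycles over 64 samples peaks in frequency bin 4.
signal = np.sin(2 * np.pi * 4 * np.arange(64) / 64)
spectrum = np.fft.rfft(signal)
peak = int(np.argmax(np.abs(spectrum)))
print(peak)
```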

#### Further Reading (Advanced)

<a href="http://conference.scipy.org/proceedings/scipy2018/pdfs/anton_malakhov.pdf">
    Composable Multi-Threading and Multi-Processing for Numeric Libraries (by Anton Malakhov et al. (Intel))</a>

### How to invoke NumPy?

In [None]:
!git clone https://github.com/kknippenberg11/CPHC-intro-numpy

Cloning into 'CPHC-intro-numpy'...
remote: Enumerating objects: 68, done.
remote: Counting objects: 100% (68/68), done.
remote: Compressing objects: 100% (58/58), done.
remote: Total 68 (delta 16), reused 49 (delta 9), pack-reused 0 (from 0)
Receiving objects: 100% (68/68), 3.36 MiB | 9.93 MiB/s, done.
Resolving deltas: 100% (16/16), done.


In [None]:
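import numpy as np
print(f"Numpy Version:{np.__version__}")
# show_config() prints the build configuration itself and returns None,
# so it is called directly rather than wrapped in another print call.
np.show_config()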

Numpy Version:1.26.4
Build Dependencies:
  blas:
    detection method: pkgconfig
    found: true
    include directory: /usr/local/include
    lib directory: /usr/local/lib
    name: openblas64
    openblas configuration: USE_64BITINT=1 DYNAMIC_ARCH=1 DYNAMIC_OLDER= NO_CBLAS=
      NO_LAPACK= NO_LAPACKE= NO_AFFINITY=1 USE_OPENMP= HASWELL MAX_THREADS=2
    pc file directory: /usr/local/lib/pkgconfig
    version: 0.3.23.dev
  lapack:
    detection method: internal
    found: true
    include directory: unknown
    lib directory: unknown
    name: dep139863411681952
    openblas configuration: unknown
    pc file directory: unknown
    version: 1.26.4
Compilers:
  c:
    args: -fno-strict-aliasing
    commands: cc
    linker: ld.bfd
    linker args: -Wl,--strip-debug, -fno-strict-aliasing
    name: gcc
    version: 10.2.1
  c++:
    commands: c++
    linker: ld.bfd
    linker args: -Wl,--strip-debug
    name: gcc
    version: 10.2.1
  cython:
    commands: cython
    linker: cython
    na

### Simple test/comparison with standard Python

In [None]:
SZ=10000

In [None]:
%timeit [item**2 for item in range(SZ)]

3.1 ms ± 73.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [None]:
%timeit np.arange(SZ)**2

13.5 µs ± 2.37 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
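
`%timeit` is an IPython magic and only works inside a notebook or IPython shell. Outside of Jupyter, a similar comparison can be made with the standard-library `timeit` module. The sketch below is one way to do it: the helper names `squares_list` and `squares_numpy` and the repeat counts are choices made for this example, and the measured times will differ from machine to machine.

```python
import timeit
import numpy as np

SZ = 10000

def squares_list():
    # Pure-Python version: one interpreted multiplication per element.
    return [item**2 for item in range(SZ)]

def squares_numpy():
    # NumPy version: the loop over the elements runs in compiled code.
    return np.arange(SZ)**2

# Both approaches produce the same values.
same = squares_list() == squares_numpy().tolist()
print(same)

# Best of 3 repeats, 20 calls each, reported per call.
t_list = min(timeit.repeat(squares_list, number=20, repeat=3)) / 20
t_np = min(timeit.repeat(squares_numpy, number=20, repeat=3)) / 20
print(f"list comprehension: {t_list * 1e6:.1f} µs per call")
print(f"NumPy array:        {t_np * 1e6:.1f} µs per call")
```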


### Documentation/Help

* http://www.numpy.org/
* <a href="https://www.enthought.com/wp-content/uploads/Enthought-MATLAB-to-Python-White-Paper.pdf">MATLAB to Python - A Migration Guide (Enthought)</a>
* <a href="https://www.scipy.org/scipylib/mailing-lists.html">Numpy Mailing lists</a>  
* <a href="https://stackoverflow.com/">Stack Overflow</a>  
* Directly within NumPy/Jupyter:
  * using TAB completion & ? (e.g. np.equal?)
  * help(&lt;name&gt;) e.g. help(np.equal)
  * np.info(&lt;name&gt;) e.g. np.info(np.equal)  # Info on object (including ufuncs)
* <a href="https://github.com/numpy/numpy">Numpy Source Code</a> (Github)  
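
The built-in help options can also be used from a plain Python script. The following sketch uses `np.equal` as the example object and a `StringIO` buffer (an illustrative choice) to capture the text that `np.info` would otherwise print to the screen.

```python
import io
import numpy as np

# np.info writes the documentation of an object to a file-like object
# (standard output by default); here it is captured in a string buffer.
buf = io.StringIO()
np.info(np.equal, output=buf)
doc_text = buf.getvalue()

# Show only the first line of the documentation.
print(doc_text.splitlines()[0])

# In Jupyter/IPython the same documentation is available with `np.equal?`,
# and in any Python session with help(np.equal).
```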