# NumPy beyond 2020

Ross Barnowski `rossbar@berkeley.edu` | [rossbar](https://github.com/rossbar) on GitHub

University of Michigan EECS | 1/30/2020

# What is NumPy?

> *NumPy is the fundamental package for scientific computing with Python*
> 
>  [numpy.org](https://numpy.org/)

Strong stuff.

In [1]:
# Code example: github graphql query for top starred projects with numpy as a dependency

# What does NumPy provide?

 - `ndarray`: A generic, n-dimensional array data structure
 - Sophisticated machinery for operating on array data (broadcasting, `ufuncs`)
 - Tools for common scientific/numerical tasks:
   * Random number generation (`np.random`)
   * Fourier analysis (`np.fft`)
   * Linear algebra (`np.linalg`)
 - Language extension/integration (C-API, `f2py`)
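
A minimal sketch of the pieces listed above, not taken from the original talk. The array values, the seed, and the variable names are illustrative assumptions; the calls themselves (`np.sqrt`, `np.random.default_rng`, `np.fft.rfft`, `np.linalg.solve`) are standard NumPy.

```python
import numpy as np

# ndarray: an n-dimensional array with a single element type
a = np.arange(6, dtype=np.float64).reshape(2, 3)

# Broadcasting: the shape-(3,) row is stretched across both rows of `a`
row = np.array([10.0, 20.0, 30.0])
shifted = a + row  # shape (2, 3)

# ufunc: np.sqrt is applied element-wise without a Python loop
roots = np.sqrt(a)

# Random number generation (seeded so the run is repeatable)
rng = np.random.default_rng(0)
samples = rng.normal(size=1000)

# Fourier analysis: a pure 5-cycle sine peaks in frequency bin 5
t = np.arange(64) / 64
spectrum = np.fft.rfft(np.sin(2 * np.pi * 5 * t))
peak_bin = int(np.argmax(np.abs(spectrum)))

# Linear algebra: solve the 2x2 system M @ x = b
M = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(M, b)
```

Each step operates on whole arrays at once, which is the style the rest of the ecosystem builds on.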

# A bit of history

 - **Mid-1990s/early 2000s**: The desire for high-performance numerical computation in Python leads to [`numeric`](https://numpy.org/_downloads/768fa66c250a0335ad3a6a30fae48e34/numeric-manual.pdf)
 - Early adopters included the [Space Telescope Science Institute (STScI)](http://www.stsci.edu/), which developed another array computation package to better suit its needs: `numarray`.
 - **2005**: The best ideas from `numeric` and `numarray` were combined in the development of a new library, `numpy`
   * This work was largely done by Travis Oliphant, then an assistant professor at Brigham Young University
 - **2006**: NumPy v1.0 released
 
[NumPy Development History](https://github.com/numpy/numpy/graphs/contributors)

# Where is NumPy used?

 - To produce the first image of a black hole 
   [Event Horizon Telescope Collaboration](https://github.com/achael/eht-imaging)
 - [To detect the gravitational wave signature from a neutron star merger](https://github.com/gwastro/pycbc)
 - [To discover fundamental particles like the Higgs Boson](https://github.com/cms-sw/cmssw)
   * Also [scikit-hep](https://scikit-hep.org/)
 - [Neuroimaging](https://nipy.org/nibabel/) - nipy uses `ndarray` as the fundamental structure for the entire stack
   * The fMRI visualization in [section 3.4](https://www.frontiersin.org/articles/10.3389/fninf.2014.00014/full#h4)
     is a nice, brief example

# Scope of NumPy

NumPy currently targets computation involving:

 * in-memory, homogeneously typed array data
 * CPU-based execution
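
A small sketch of what "homogeneously typed" and "in-memory" mean in practice. The values and variable names are illustrative assumptions, not from the original talk.

```python
import numpy as np

# Every element shares one dtype; mixed inputs are promoted to a common type
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# Elements live in one in-memory buffer: itemsize bytes per element
ints = np.arange(4, dtype=np.int32)
print(ints.itemsize, ints.nbytes)  # 4 16

# Assigning a float into an integer array truncates it to the array's dtype
ints[0] = 7.9
print(ints[0])  # 7
```

The single dtype is what lets NumPy store the data compactly and loop over it in compiled code.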
 
Important guiding principles:
 - **Stability**: A foundational component of the scientific Python ecosystem for going on 15 years
 - **Interoperability**: A *de facto* standard for array APIs in Python

# Adapting to community needs

 - In the early days, many new NumPy users were converts from MATLAB
   * See the [NumPy for MATLAB users](https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html) article in the docs
   
 - Now, the scientific Python ecosystem is feature-rich and powerful, and it attracts many new users
   * Users interested in specific applications (geoscience, image segmentation, bioinformatics, etc.) end up interacting with NumPy indirectly
   * Focus resources on supporting a stable, performant base for dependent libraries

# How is NumPy developed?

 - Collaboratively (caveat here about the bus factor)

Commitment to stability means proposed changes must go through extensive design and review:
 - NEPs (NumPy Enhancement Proposals): analogous to Python's PEPs, but specific to NumPy

In [None]:
# Rise of data science google trends here