# Introduction to the Python Scientific Ecosystem


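Compared with compiled languages such as C, pure Python is slow and memory-inefficient. Summing the integers from 0 to 10^8 - 1, first in C and then in Python, shows the speed gap.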

In [20]:
%%system
echo "#include <stdio.h>" > sum.c
echo "int main() { long long result=0; for (long long i=0; i < 1e8; i++) result += i; printf(\"%lld\\n\", result); }" >> sum.c
gcc sum.c
time ./a.out

['4999999950000000', '', 'real\t0m0.261s', 'user\t0m0.259s', 'sys\t0m0.000s']

In [22]:
%%time
print(sum(range(int(1e8))))

4999999950000000
CPU times: user 2.97 s, sys: 7.68 ms, total: 2.98 s
Wall time: 2.98 s
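
The timings above only cover speed. As a rough sketch of the memory side of the claim, the snippet below compares a plain list of integers with the standard library's `array` type, which stores raw 64-bit values in one contiguous buffer. The size `n` and the variable names are illustrative assumptions, not part of the original notebook, and the exact figures depend on the CPython version and platform.

```python
import sys
from array import array

n = 1_000_000  # illustrative size (assumption), smaller than the 1e8 benchmark

# A list holds pointers to separately allocated int objects,
# so count both the pointer table and the objects it refers to.
boxed = list(range(n))
list_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(x) for x in boxed)

# array("q") packs signed 64-bit integers into one contiguous buffer.
packed = array("q", range(n))
array_bytes = sys.getsizeof(packed)

print(f"list of ints: {list_bytes / n:.1f} bytes per element")
print(f"array('q'):   {array_bytes / n:.1f} bytes per element")
```

On CPython the list typically costs several times more per element than the packed buffer, because every integer is a full object with its own header.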


That's where the whole story could have ended... But...


## History

Quoting Wikipedia:

> The Python programming language was not originally designed for numerical computing,
but attracted the attention of the scientific and engineering community early on.
In 1995 the special interest group (SIG) matrix-sig was founded with the aim
of defining an array computing package; among its members was Python designer
and maintainer Guido van Rossum, who extended Python's syntax
(in particular the indexing syntax) to make array computing easier.

> An implementation of a matrix package was completed by Jim Fulton,
then generalized by Jim Hugunin and called Numeric
(also variously known as the "Numerical Python extensions" or "NumPy").
Hugunin, a graduate student at the Massachusetts Institute of Technology (MIT),
joined the Corporation for National Research Initiatives (CNRI)
in 1997 to work on JPython, leaving
Paul Dubois of Lawrence Livermore National Laboratory (LLNL) to take over as maintainer.
Other early contributors include David Ascher, Konrad Hinsen and Travis Oliphant.

> A new package called Numarray was written as a more flexible replacement for Numeric.
Like Numeric, it too is now deprecated. Numarray had faster operations for
large arrays, but was slower than Numeric on small ones, so for a time both packages
were used in parallel for different use cases. The last version of Numeric (v24.2)
was released on 11 November 2005, while the last version of numarray (v1.5.2)
was released on 24 August 2006.

> There was a desire to get Numeric into the Python standard library,
but Guido van Rossum decided that the code was not maintainable in its state then.

> In early 2005, NumPy developer Travis Oliphant wanted to unify the community
around a single array package and ported Numarray's features to Numeric,
releasing the result as NumPy 1.0 in 2006. This new project was part of SciPy.
To avoid installing the large SciPy package just to get an array object
this new package was separated and called NumPy.
Support for Python 3 was added in 2011 with NumPy version 1.5.0.

> In the early Oughts, the open source scientific computing community was reaching a critical point of maturity for Numeric/Numpy, IPython, Matplotlib, and SciPy. By 2010, a critical mass of projects were in need of a more formal structure to provide support and help to organize the community.

…enter [NumFOCUS](https://numfocus.org/)!

> Travis Oliphant (author of NumPy), Fernando Pérez (author of IPython), Perry Greenfield (author of Numarray and Astropy), John Hunter (author of Matplotlib), Jarrod Millman (release manager for SciPy), and Anthony Scopatz (who came up with the name “NumFOCUS”) became the founding board of NumFOCUS. Leah Silen was selected as the founding Executive Director. In fall of 2012, NumFOCUS received 501(c)(3) public charity status as a nonprofit in the United States. 

### Sponsored projects

![NumFOCUS-sponsored projects](image-20210118-143459.png)

## Notable libraries

### Arrays and dataframes

- **numpy** (efficient multidimensional arrays)
- **pandas** (tabular data)
- **xarray** (multidimensional arrays + meta-data)

### Mathematics and statistics

- scipy
- statsmodels
- sympy (symbolic maths)

### Visualization

- **matplotlib**
- **plotly** (interactive plots)
- bokeh
- seaborn
- altair
- vega (JavaScript)
- d3.js (JavaScript)
- hvplot
- ...many many others

### Image processing

- opencv
- scikit-image
- PIL (now maintained as Pillow)

### Performance

- numba
- cython
- jax

### Parallelization

- dask
- ray

### Big data

- (py)spark

### Machine learning

- scikit-learn
- tensorflow
- pytorch
