# A Review of EuroSciPy2015



# TO DO: Go through rest of Valerio Maggio's NumPy notebooks. Write something about each talk.


### 0. IPython Notebooks


### 1. Tutorial 1: An Introduction to Python (Joris Vankerschaver)


### 2. Tutorial 2: Never get in a data battle without Numpy arrays (Valerio Maggio)


### 3. numexpr


### 4. Interesting talks



## 0. Jupyter aka IPython Notebook

- Interactive programming interface
- Simultaneous development & documentation

All tutorials and most lectures at EuroSciPy2015 were given using IPython Notebooks.

We have adopted the Notebook for this presentation as a practical exercise.

In [None]:
pip install "ipython[notebook]"

In [None]:
ipython notebook

## 1. Tutorial #1: An Introduction to Python (Joris Vankerschaver)

IPython Notebook: https://github.com/jvkersch/python-tutorial-files

An overview of basic Python syntax and data structures, including:
- lists, tuples, dictionaries
- mutable vs immutable objects
- set, enumerate
- read from / write to files
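
A minimal sketch of the topics listed above, written for this review rather than taken from the tutorial notebook. All variable names and the temporary file name are illustrative assumptions.

```python
import os
import tempfile

# Lists are mutable: they can be changed in place.
numbers = [1, 2, 3]
numbers.append(4)

# Tuples are immutable: item assignment raises TypeError.
point = (1, 2)
try:
    point[0] = 10
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False

# Dictionaries map keys to values and can grow after creation.
ages = {"ada": 36}
ages["alan"] = 41

# A set keeps only the unique elements.
unique = set([1, 1, 2, 3])

# enumerate pairs each element with its index.
indexed = list(enumerate(["a", "b"]))

# Write to a file, then read it back (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w") as handle:
    handle.write("hello\n")
with open(path) as handle:
    content = handle.read()
```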

Ex: list comprehension

##### Accessing Python's Source Code

cf. http://stackoverflow.com/questions/8608587/finding-the-source-code-for-built-in-python-functions

We can get help for a built-in Python function, such as range, with a single question mark:

In [2]:
range?

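A double question mark also shows the source code, but only for objects implemented in Python. Built-ins such as range are implemented in C, so range?? shows the same help as range?; their source can be read by downloading it from the Python.org Mercurial repositories: https://hg.python.org/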

In [5]:
range??

In [1]:
import inspect
# inspect.getsourcefile(range)
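
The `inspect` call above is commented out, most likely because it fails for built-ins. The sketch below shows the difference under that assumption: `inspect.getsource` works for a function written in Python (`json.dumps` is used here purely as an example), while `inspect.getsourcefile(range)` raises `TypeError` because `range` is implemented in C.

```python
import inspect
import json

# json.dumps is written in Python, so its source text can be retrieved.
source = inspect.getsource(json.dumps)
first_line = source.splitlines()[0]

# range is implemented in C: there is no Python source file to report.
try:
    inspect.getsourcefile(range)
    builtin_has_source = True
except TypeError:
    builtin_has_source = False
```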

In [None]:
# An illustrative "Counter" class (not a Python built-in) that implements the
# iterator protocol driving loops such as "for i in range(0,10) ..."

class Counter(object):
    def __init__(self, low, high):
        self.current = low
        self.high = high

    def __iter__(self):
        'Returns itself as an iterator object'
        return self

    def __next__(self):
        'Returns the next value until current exceeds high'
        if self.current > self.high:
            raise StopIteration
        else:
            self.current += 1
            return self.current - 1

    # Python 2 calls next(); Python 3 calls __next__()
    next = __next__
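
As a rough sketch of how the iterator protocol is used, the loop below does by hand what a `for` statement does: it calls `iter()` once, then `next()` until `StopIteration` is raised. The generator function `counter` is a hypothetical, shorter equivalent of the Counter class above (values from `low` to `high` inclusive), not code from the tutorial.

```python
# What "for value in [10, 20, 30]" does behind the scenes.
iterator = iter([10, 20, 30])
collected = []
while True:
    try:
        value = next(iterator)
    except StopIteration:
        break
    collected.append(value)


def counter(low, high):
    """Yield low, low + 1, ..., high (inclusive), like the Counter class."""
    current = low
    while current <= high:
        yield current
        current += 1


values = list(counter(3, 6))
```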

We can view the source code for a particular function, such as range, in the Python.org Mercurial Repository: https://hg.python.org/cpython/file/c6880edaf6f3/Objects/rangeobject.c

In [1]:
for i in range(0,10):
    print(i**i)

1
1
4
27
256
3125
46656
823543
16777216
387420489


In [78]:
# List comprehension (with filter)
[a*2 for a in range(0,10) if a>3]

[8, 10, 12, 14, 16, 18]

## 2. Tutorial 2: Never get in a data battle without Numpy arrays (Valerio Maggio)

IPython Notebook: https://github.com/leriomaggio/numpy_euroscipy2015

+ Numpy 100

https://github.com/rougier/numpy-100


In [33]:
import numpy as np

# A NumPy array stores its element type in the dtype attribute
# (plain Python ints, lists, etc. have no such attribute)
a = np.array([1, 2, 3])
a.dtype

dtype('int64')

In [55]:
# Typecast variables into float and complex numbers
b = np.float64(64)
c = np.complex128(b)
print("R(c) = {}".format(c.real))
print("I(c) = {}".format(c.imag))

R(c) = 64.0
I(c) = 0.0


In [58]:
# Specify type of array elements
x = np.ones(4, 'int8')
x

array([1, 1, 1, 1], dtype=int8)

In [60]:
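# Wrap-around: 256 is outside the int8 range (-128 to 127), so it is stored as 0
# (recent NumPy releases raise OverflowError here instead of wrapping)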
x[0] = 256
x

array([0, 1, 1, 1], dtype=int8)
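
The wrap-around can be made explicit with a small sketch. It assumes nothing beyond NumPy itself: `np.iinfo` reports the representable range, and `astype` keeps only the low 8 bits of each value, which is where the 0 above comes from. The input values are chosen purely for illustration.

```python
import numpy as np

# int8 can represent the integers from -128 to 127.
info = np.iinfo(np.int8)
limits = (info.min, info.max)

# Casting to int8 keeps only the low 8 bits, so out-of-range values wrap.
values = np.array([127, 128, 256, 257, -129])
wrapped = values.astype(np.int8)
```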

In [75]:
# Define a record (structured) dtype and create an array with that dtype
rt = np.dtype([('title', np.str_, 40), ('track', np.int32), ('time', np.float32)])
songs = np.array([('Four Minutes and Thirty Three Seconds', 1, 273)], dtype=rt)
songs

array([('Four Minutes and Thirty Three Seconds', 1, 273.0)], 
      dtype=[('title', 'S40'), ('track', '<i4'), ('time', '<f4')])
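
Once a record dtype is defined, each field can be accessed by name as if it were a column. The sketch below repeats the record definition so that it runs on its own; the second row is made up for illustration only, and `'U40'` (a 40-character unicode string) is used for the title field.

```python
import numpy as np

rt = np.dtype([('title', 'U40'), ('track', np.int32), ('time', np.float32)])
songs = np.array(
    [('Four Minutes and Thirty Three Seconds', 1, 273),
     ('A Second Track', 2, 180.5)],   # hypothetical row, not real data
    dtype=rt,
)

# Field access by name returns a regular array for that column.
titles = songs['title']
total_time = songs['time'].sum()

# Boolean masks built from one field can select whole records.
long_tracks = songs[songs['time'] > 200]['title']
```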

## 3. numexpr

@ https://github.com/pydata/numexpr

- JIT (Just-in-time) compilation for significant speed-up of numerical calculations
  - numexpr evaluates multiple-operator array expressions many times faster than NumPy can. It accepts the expression as a string, analyzes it, rewrites it more efficiently, and compiles it on the fly into code for its internal virtual machine (VM). Due to its integrated just-in-time (JIT) compiler, it does not require a compiler at runtime.

- Multithreading to make use of multiple CPU cores
  - numexpr builds multi-threading support directly into its internal virtual machine, written in C. This lets it bypass Python's GIL and achieve near-optimal parallel performance in vector expressions, especially on CPU-bound operations (memory-bound ones were already numexpr's strong point).

- Can be used to evaluate expressions on NumPy arrays and in pandas (it is the default engine of pandas.eval when installed)
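
One reason the numexpr documentation gives for its speed-up is that it avoids full-size temporary arrays by evaluating the expression block by block. The sketch below imitates that idea in plain NumPy. It is an illustration of the principle only, not numexpr's implementation; `evaluate_in_chunks` and the chunk size of 1024 are hypothetical.

```python
import numpy as np

a = np.arange(1e4)
b = np.arange(1e4)

# NumPy applies one operator at a time; every step allocates a full-size array.
t1 = a * b
t2 = 4.1 * a
t3 = t1 - t2
t4 = 2.5 * b
full = t3 > t4


def evaluate_in_chunks(a, b, chunk=1024):
    """Evaluate a*b - 4.1*a > 2.5*b on small blocks to keep temporaries small."""
    out = np.empty(a.shape, dtype=bool)
    for start in range(0, a.size, chunk):
        stop = start + chunk
        x = a[start:stop]
        y = b[start:stop]
        out[start:stop] = x * y - 4.1 * x > 2.5 * y
    return out


chunked = evaluate_in_chunks(a, b)
```

Both approaches perform the same element-wise operations, so the results are identical; only the size of the intermediate arrays differs.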

Ex: https://code.google.com/p/numexpr/

cf. https://github.com/leriomaggio/numpy_euroscipy2015/blob/master/06_Numexpr.ipynb

In [27]:
import numexpr as ne
import numpy as np

a = np.arange(1e4)
b = np.arange(1e4)

print("NumPy   >> ")
%timeit c = (a*b-4.1*a > 2.5*b)

NumPy   >> 
The slowest run took 20.55 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 28 µs per loop


In [28]:
print("NumExpr >> ")
%timeit ne.evaluate('a*b-4.1*a > 2.5*b')
# %timeit ne.evaluate('4*a*b-b')

# %timeit a*b-4.1*a > 2.5*b
# %timeit ne.evaluate('a*b-4.1*a > 2.5*b')

NumExpr >> 
1000 loops, best of 3: 473 µs per loop


## 4. Interesting talks

#### ReScience

https://www.euroscipy.org/2015/schedule/presentation/17/

@ https://github.com/ReScience/ReScience/wiki

#### Massively parallel implementation in Python of a pseudo-spectral DNS code for turbulent flows (Mikael Mortensen)
https://www.euroscipy.org/2015/schedule/presentation/6/

A Navier-Stokes solver written using only NumPy and mpi4py that can perform as fast as a C++ implementation.

#### Dashboarding with the IPython notebook for online introspection into long-running experiments (Thomas Greg Corcoran)
https://www.euroscipy.org/2015/schedule/presentation/19/

Query data while an experiment is running.

#### HoloViews: Building complex visualizations easily for reproducible science (Jean-Luc Stevens, Philipp Rudiger)
https://www.euroscipy.org/2015/schedule/presentation/18/
http://ioam.github.io/holoviews/

