# <center>Five Things</center>
## <center>(to make your research life easier)</center>
### <center>(if you use Python)</center>
### <center>Britton Smith</center>
#### <center>Stobie Talk - 14 February, 2020</center>

# numba - http://numba.pydata.org/
```
pip install numba
```
### "Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code."

In [1]:
import numpy as np
import time

In [2]:
x = np.random.random(10000000)
y = np.random.random(10000000)

In [3]:
def monte_carlo_pi(x, y):
    nsamples = x.size
    acc = 0
    for i in range(nsamples):
        if (x[i] ** 2 + y[i] ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples

In [4]:
t1 = time.time()
print (monte_carlo_pi(x, y))
t2 = time.time()
try1 = t2 - t1
print (f"That took {try1} seconds.")

3.1419288
That took 8.27987813949585 seconds.


In [5]:
def monte_carlo_pi_numpy(x, y):
    nsamples = x.size
    # Vectorized: the comparison gives a boolean array, and summing it counts the hits.
    acc = ((x**2 + y**2) < 1).sum()
    return 4.0 * acc / nsamples

In [6]:
t1 = time.time()
print (monte_carlo_pi_numpy(x, y))
t2 = time.time()
try2 = t2 - t1
print (f"That took {try2} seconds ({(try1/try2):.2f} speedup).")

3.1419288
That took 0.1450340747833252 seconds (57.09 speedup).


In [7]:
from numba import jit

In [8]:
@jit(nopython=True)
def monte_carlo_pi_numba(x, y):
    nsamples = x.size
    acc = 0
    for i in range(nsamples):
        if (x[i] ** 2 + y[i] ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples

In [9]:
# First call: the timing includes numba's JIT compilation of the function.
t1 = time.time()
print (monte_carlo_pi_numba(x, y))
t2 = time.time()
try3 = t2 - t1
print (f"That took {try3} seconds ({(try1/try3):.2f} speedup).")

3.1419288
That took 0.25689077377319336 seconds (32.23 speedup).


In [10]:
# Second call: the compiled machine code is reused, so only the loop itself is timed.
t1 = time.time()
print (monte_carlo_pi_numba(x, y))
t2 = time.time()
try4 = t2 - t1
print (f"That took {try4} seconds ({(try1/try4):.2f} speedup).")

3.1419288
That took 0.012539148330688477 seconds (660.32 speedup).
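
The gap between the two numba timings above comes from compilation happening on the first call. A minimal sketch of a helper that times several calls in a row, so the first-call overhead shows up on its own line. The name `time_calls` is hypothetical, and the pure-Python `monte_carlo_pi_list` is only a stand-in so the example runs without numba installed.

```python
import random
import time


def time_calls(func, *args, repeats=3):
    """Call func(*args) `repeats` times; return (last result, list of durations)."""
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        result = func(*args)
        durations.append(time.perf_counter() - start)
    return result, durations


def monte_carlo_pi_list(xs, ys):
    # Same algorithm as monte_carlo_pi, written for plain Python lists.
    acc = 0
    for xi, yi in zip(xs, ys):
        if xi ** 2 + yi ** 2 < 1.0:
            acc += 1
    return 4.0 * acc / len(xs)


rng = random.Random(0)  # fixed seed so reruns give the same estimate
xs = [rng.random() for _ in range(100_000)]
ys = [rng.random() for _ in range(100_000)]

pi_estimate, durations = time_calls(monte_carlo_pi_list, xs, ys)
print(pi_estimate)
for n, dt in enumerate(durations, start=1):
    print(f"call {n}: {dt:.4f} seconds")
```

Passing `monte_carlo_pi_numba` and the NumPy arrays `x`, `y` to the same helper should show a first duration that is noticeably larger than the rest, since only that call pays for compilation.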


# unyt - http://unyt.readthedocs.io/
```
pip install unyt
```
### "This package provides a python library for working with data that has physical units."

In [11]:
from unyt import unyt_array, unyt_quantity

In [12]:
H0 = unyt_quantity(70, 'km/s/Mpc')

In [13]:
print (H0)

70 km/(Mpc*s)


In [14]:
print (H0.in_cgs())

2.268545503e-18 1/s


In [15]:
print ((1/H0).to('Gyr'))

13.968460307330654 Gyr


In [16]:
m = unyt_array(np.logspace(5, 7, 3), 'Msun')
v = unyt_array(np.logspace(4, 6, 3), 'cm/s')
E = 0.5*m*v**2
print (E)

[5.e+12 5.e+15 5.e+18] Msun*cm**2/s**2


In [17]:
print (E.in_cgs())

[9.9420793e+45 9.9420793e+48 9.9420793e+51] erg
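
As a sanity check, the conversions above can be reproduced by hand. A rough sketch using plain floats; the conversion factors are assumed approximate values typed in here, not taken from unyt, so agreement is only expected to several significant figures.

```python
# Assumed conversion factors (approximate, CGS).
CM_PER_KM = 1.0e5
CM_PER_MPC = 3.0856775814913673e24
SEC_PER_GYR = 1.0e9 * 365.25 * 86400.0  # Julian years
G_PER_MSUN = 1.98841e33

# H0 = 70 km/s/Mpc expressed in 1/s, and the Hubble time 1/H0 in Gyr.
H0_cgs = 70.0 * CM_PER_KM / CM_PER_MPC
hubble_time_gyr = 1.0 / H0_cgs / SEC_PER_GYR

# E = 0.5 * m * v**2 with m in grams and v in cm/s gives erg.
masses_g = [m * G_PER_MSUN for m in (1e5, 1e6, 1e7)]
speeds_cm_s = (1e4, 1e5, 1e6)
energies_erg = [0.5 * m * v ** 2 for m, v in zip(masses_g, speeds_cm_s)]

print(f"H0 = {H0_cgs:.6e} 1/s")
print(f"1/H0 = {hubble_time_gyr:.4f} Gyr")
print("E =", [f"{e:.5e} erg" for e in energies_erg])
```

Doing this by hand means tracking every factor yourself; with unyt the units travel with the arrays and `in_cgs()` or `to()` does the bookkeeping.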


# yt.save_as_dataset - https://yt-project.org/
```
pip install yt
```
### The `save_as_dataset` function allows you to save arbitrary arrays (with units) and reload them with yt.

Documentation [here](https://yt-project.org/docs/dev/analyzing/saving_data.html).

In [18]:
import yt

In [19]:
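# Wrap the arrays and the constant in yt's unit-aware types.
m = yt.YTArray(m, 'Msun')
v = yt.YTArray(v, 'cm/s')
H0 = yt.YTQuantity(H0, 'km/s/Mpc')

# Arrays to be saved as fields.
data = {'mass': m,
        'velocity': v}

# Extra values saved alongside the fields; they come back as attributes of the loaded dataset.
metadata = {"hubble_constant": H0}

yt.save_as_dataset(metadata, data=data, filename='my_data.h5')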

yt : [INFO     ] 2020-02-14 11:20:44,862 Saving field data to yt dataset: my_data.h5.


In [20]:
ds = yt.load('my_data.h5')



In [21]:
print (ds.data['mass'])
print (ds.data['velocity'].to('km/s'))
print (ds.hubble_constant)

[  100000.  1000000. 10000000.] Msun
[ 0.1  1.  10. ] km/s
70.0 km/(Mpc*s)


# yt.parallel_objects - https://yt-project.org/
```
pip install yt
```
### The `parallel_objects` function distributes the iterations of a regular loop across MPI processes, which can span multiple nodes.

Documentation [here](https://yt-project.org/docs/dev/analyzing/parallel_computation.html#parallelizing-over-multiple-objects).

In [22]:
for i in range(10):
    yt.mylog.info(i)

yt : [INFO     ] 2020-02-14 11:22:31,160 0
yt : [INFO     ] 2020-02-14 11:22:31,161 1
yt : [INFO     ] 2020-02-14 11:22:31,162 2
yt : [INFO     ] 2020-02-14 11:22:31,162 3
yt : [INFO     ] 2020-02-14 11:22:31,163 4
yt : [INFO     ] 2020-02-14 11:22:31,163 5
yt : [INFO     ] 2020-02-14 11:22:31,164 6
yt : [INFO     ] 2020-02-14 11:22:31,164 7
yt : [INFO     ] 2020-02-14 11:22:31,165 8
yt : [INFO     ] 2020-02-14 11:22:31,166 9


In [23]:
for i in yt.parallel_objects(range(10)):
    yt.mylog.info(i)

yt : [INFO     ] 2020-02-14 11:22:46,465 0
yt : [INFO     ] 2020-02-14 11:22:46,466 1
yt : [INFO     ] 2020-02-14 11:22:46,466 2
yt : [INFO     ] 2020-02-14 11:22:46,467 3
yt : [INFO     ] 2020-02-14 11:22:46,467 4
yt : [INFO     ] 2020-02-14 11:22:46,468 5
yt : [INFO     ] 2020-02-14 11:22:46,469 6
yt : [INFO     ] 2020-02-14 11:22:46,470 7
yt : [INFO     ] 2020-02-14 11:22:46,471 8
yt : [INFO     ] 2020-02-14 11:22:46,472 9
