# Harley Wood School for Astronomy 2019 

<img src="https://research.smp.uq.edu.au/asa2019/static/asa19/img/HWSA2019-logo.png" width=300>

## Part IV - Timing and profiling code snippets to help write faster code

In this part of the workshop we will look at an example code to reproduce HR diagrams using Gaia data.

<img src="https://www.cosmos.esa.int/documents/29201/1666086/kine_all.png/8b9de0b4-8eb1-ad73-0922-9bf323687f6e?t=1524224828914" width=400>

The Gaia Hertzsprung-Russell diagrams above, showing Gaia absolute magnitude versus GBP-GRP colour, are plotted as a function of the stars' tangential velocity (VT). They use Gaia DR2 stars with relative parallax uncertainty better than 10% and low extinction (E(B-V)<0.015), together with astrometric and photometric quality filters. The colour scale represents the square root of the density.


## Table of Contents

1. [When and What to Optimise](#When-and-what-to-optimise)
2. [Timing Code Snippets](#Timing-code-snippets)
3. [Profiling Code (line by line)](#Profiling-code-line-by-line)
4. [Advanced Track](#Advanced-Track)


### Required libraries

This notebook uses several Python packages that come standard with the [Anaconda Python distribution](http://continuum.io/downloads). The primary libraries that we'll be using are:

* **astropy**
* **numpy**
* **pandas**
* **line_profiler**
* **memory_profiler**

If you have created a new environment with `conda env create -f hwsa-environment.yml`, then you are all set. If not, make sure you have all of the packages you need by installing them with `conda`:

    conda install [package name]
    conda install -c astropy astroquery
    
`conda` may ask you to update some of the packages if you don't have the most recent version. Allow it to do so.

Alternatively, you can install the packages with [pip](https://pip.pypa.io/en/stable/installing/) (a Python package manager):

    pip install [package name]
    
Be sure to restart your kernel if you had to install new packages.

# When and What to Optimise


**Code is read far more frequently than written**

Optimisation mantras:

1. [Obligatory XKCD](https://xkcd.com/1205/)

2. Do not optimise unless you have to (your time is *the most precious*)

3. Optimise for readability (imagine very cranky future developers, and reduce their cognitive load)

4. Variety of optimisations 
    - Readability
    - Performance   
    - Memory
    - Lines of code (code-golf)
    
5. Remember, every line of code is a potential source of bugs and future maintenance headaches. Choose wisely!




# Timing Code Snippets

The `%timeit` magic command is essential for quantifying how long a piece of code takes to run. What are the challenges in timing a piece of code, you ask?

- Ensure that the timing is representative of the average case
- Get some sense of the scatter in the runtimes

`%timeit` runs a chunk of code in a loop, choosing the number of loops automatically, and repeats the measurement several times so that it can report a mean and a standard deviation. There are lots of customisable options; see the manual [here](https://docs.python.org/3/library/timeit.html).
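Outside a notebook, the same idea can be sketched with the standard-library `timeit` module. This is a minimal sketch, not the notebook's own code: the helper `abs_mag` and its input values are made up for illustration, and the `repeat` and `number` settings are arbitrary choices.

```python
import math
import timeit


def abs_mag(apparent_mag, parallax_mas):
    """Hypothetical helper: absolute magnitude from a parallax in milliarcseconds."""
    return apparent_mag + 5 * math.log10(parallax_mas) - 10


n_calls = 10_000  # calls per run (arbitrary choice)
n_runs = 5        # independent runs, to get a sense of the scatter

# timeit.repeat returns one total time (in seconds) per run
run_times = timeit.repeat(lambda: abs_mag(12.0, 2.5), repeat=n_runs, number=n_calls)

per_call = [t / n_calls for t in run_times]
mean_time = sum(per_call) / len(per_call)
print(f"best: {min(per_call):.3e} s, mean: {mean_time:.3e} s per call")
```

Reporting both the best and the mean time across runs addresses the two challenges listed above: a representative number and a feel for the scatter.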


In [None]:
%%timeit
#read data 
from astropy.io.votable import parse_single_table
table = parse_single_table("async_20190630210155.vot")
t = table.to_table(use_names_over_ids=True)

# That takes a LOOOONG time

Can we reduce the total read time? Perhaps we can read in only the specific columns that we *need*.

In [None]:
%load_ext memory_profiler

In [1]:
%%time
#read data (only if `table` is not already defined in this session)
if globals().get('table') is None:
    from astropy.io.votable import parse_single_table
    columns = ['phot_g_mean_mag', 'parallax']
    table = parse_single_table("async_20190630210155.vot", columns=columns)
    print("Done reading table")

t = table.to_table(use_names_over_ids=True)



Done reading table
CPU times: user 20min 31s, sys: 41 s, total: 21min 12s
Wall time: 22min 13s


In [None]:
%%file mprun_demo.py
from astropy.io.votable import parse_single_table

def read_votable():
    # Read only the columns we need; the file is parsed on every call
    columns = ['phot_g_mean_mag', 'parallax']
    table = parse_single_table("async_20190630210155.vot", columns=columns)
    print("Done reading table")
    return table.to_table(use_names_over_ids=True)

In [None]:
from mprun_demo import read_votable
%mprun -f read_votable read_votable()

In [None]:
%load_ext line_profiler
from mprun_demo import read_votable
%lprun -f read_votable read_votable()

# (After some time) now we have the data. 


In [3]:
%%time
df = t.to_pandas()

#check the data frame
df.head()

CPU times: user 2.13 s, sys: 2.36 s, total: 4.49 s
Wall time: 4.74 s


In [17]:
%%timeit
#convert to pandas df and calculate absolute mag
import numpy as np
from math import log10

df['mg'] = 0
df['dist'] = 0

for c, v in enumerate(df['phot_g_mean_mag']):
    
    p =df.loc[c,'parallax']
    if p>0:
        df.loc[c,'mg'] = v + 5 * log10(p) - 10
        df.loc[c,'dist'] = 1000/p
    else:
        df.loc[c,'mg'] = np.nan
        df.loc[c,'dist'] = np.nan

KeyboardInterrupt: 

In [30]:
def second_attempt_at_abs_mag_and_dist(df):
    import pandas as pd
    import numpy as np
    import math

    df['mg2'] = 0
    df['dist2'] = 0

    for c, v in enumerate(df['phot_g_mean_mag']):

        p =df.loc[c,'parallax']
        if p>0:
            df.loc[c,'mg2'] = v + 5 * math.log10(p) - 10
            df.loc[c,'dist2'] = 1000/p
        else:
            df.loc[c,'mg2'] = np.nan
            df.loc[c,'dist2'] = np.nan

%prun second_attempt_at_abs_mag_and_dist(df)

KeyboardInterrupt: 

In [33]:
#convert to pandas df and calculate absolute mag
def second_attempt_at_abs_mag_and_dist(df):
    import pandas as pd
    import numpy as np
    import math

    df['mg2'] = 0
    df['dist2'] = 0

    for c, v in enumerate(df['phot_g_mean_mag']):

        p =df.loc[c,'parallax']
        if p>0:
            df.loc[c,'mg2'] = v + 5 * np.log10(p) - 10
            df.loc[c,'dist2'] = 1000/p
        else:
            df.loc[c,'mg2'] = np.nan
            df.loc[c,'dist2'] = np.nan

#%timeit second_attempt_at_abs_mag_and_dist(df)

In [34]:
%lprun -f second_attempt_at_abs_mag_and_dist second_attempt_at_abs_mag_and_dist(df)

*** KeyboardInterrupt exception caught in code being profiled.

# For loops are sloooow in Python

But we do need to loop. How do we do that? List comprehensions to the rescue!

In [40]:
#convert to pandas df and calculate absolute mag
def third_attempt_at_abs_mag_and_dist(df):
    import numpy as np
    import math

    apparent_mags = df['phot_g_mean_mag']
    parallax = df['parallax']
    # note: the loop variable `plx` is the parallax (in mas), not a distance
    abs_mags = [mag + 5*math.log10(plx) - 10 if plx > 0 else np.nan for mag, plx in zip(apparent_mags, parallax)]
    dists = [1000.0/plx if plx > 0 else np.nan for plx in parallax]
    
    df['mg3'] = abs_mags
    df['dist3'] = dists


In [41]:
%lprun -f third_attempt_at_abs_mag_and_dist third_attempt_at_abs_mag_and_dist(df)



In [42]:
%timeit third_attempt_at_abs_mag_and_dist(df)

2.03 s ± 15.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
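The loop-versus-comprehension comparison can also be tried without the Gaia table. The sketch below uses plain Python lists of made-up magnitudes and parallaxes (the sizes, ranges and function names are assumptions for illustration only). In the notebook, much of the gain likely comes from dropping the per-element `df.loc` access rather than from the comprehension itself, so it is worth measuring both effects separately.

```python
import math
import random
import timeit

random.seed(42)
n_stars = 20_000  # small synthetic sample, not the Gaia data
apparent_mags = [random.uniform(5.0, 20.0) for _ in range(n_stars)]
parallaxes = [random.uniform(-1.0, 10.0) for _ in range(n_stars)]  # mas, some negative


def with_loop(mags, plxs):
    """Explicit for loop that appends one value at a time."""
    out = []
    for mag, plx in zip(mags, plxs):
        if plx > 0:
            out.append(mag + 5 * math.log10(plx) - 10)
        else:
            out.append(math.nan)
    return out


def with_comprehension(mags, plxs):
    """Same calculation written as a list comprehension."""
    return [mag + 5 * math.log10(plx) - 10 if plx > 0 else math.nan
            for mag, plx in zip(mags, plxs)]


def same(a, b):
    """Equality that treats NaN as equal to NaN."""
    return (math.isnan(a) and math.isnan(b)) or a == b


loop_result = with_loop(apparent_mags, parallaxes)
comp_result = with_comprehension(apparent_mags, parallaxes)
results_agree = all(same(a, b) for a, b in zip(loop_result, comp_result))

# Best of 3 runs, 5 calls per run
t_loop = min(timeit.repeat(lambda: with_loop(apparent_mags, parallaxes), repeat=3, number=5))
t_comp = min(timeit.repeat(lambda: with_comprehension(apparent_mags, parallaxes), repeat=3, number=5))
print(f"results agree: {results_agree}")
print(f"loop: {t_loop:.3f} s, comprehension: {t_comp:.3f} s")
```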


In [43]:
#convert to pandas df and calculate absolute mag
def fourth_attempt_at_abs_mag_and_dist(df):
    import numpy as np

    apparent_mags = df['phot_g_mean_mag'].to_numpy()
    parallax = df['parallax'].to_numpy()
    # note: the loop variable `plx` is the parallax (in mas), not a distance
    abs_mags = [mag + 5*np.log10(plx) - 10 if plx > 0 else np.nan for mag, plx in zip(apparent_mags, parallax)]
    dists = [1000.0/plx if plx > 0 else np.nan for plx in parallax]
    
    df['mg4'] = abs_mags
    df['dist4'] = dists

%timeit fourth_attempt_at_abs_mag_and_dist(df)

KeyboardInterrupt: 

In [44]:
#convert to pandas df and calculate absolute mag
def fifth_attempt_at_abs_mag_and_dist(df):
    import numpy as np

    apparent_mags = df['phot_g_mean_mag'].to_numpy()
    parallax = df['parallax'].to_numpy()
    # Non-positive parallaxes trigger RuntimeWarnings here; they are masked to NaN below
    abs_mags = apparent_mags + 5.0*np.log10(parallax) - 10
    dist = 1000.0/parallax

    bad_inds = (~np.isfinite(parallax) | (parallax <= 0))
    abs_mags[bad_inds] = np.nan
    dist[bad_inds] = np.nan
  
    df['mg5'] = abs_mags
    df['dist5'] = dist

%timeit fifth_attempt_at_abs_mag_and_dist(df)



93.9 ms ± 927 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [45]:
#convert to pandas df and calculate absolute mag
def sixth_attempt_at_abs_mag_and_dist(df):
    import numpy as np

    apparent_mags = df['phot_g_mean_mag'].to_numpy()
    parallax = df['parallax'].to_numpy()
    
    abs_mags = np.full_like(apparent_mags, np.nan)
    dist = np.full_like(parallax, np.nan)

    good_inds = (np.isfinite(parallax) & (parallax > 0))
    abs_mags[good_inds] = apparent_mags[good_inds] + 5.0*np.log10(parallax[good_inds]) - 10
    dist[good_inds] = 1000.0/parallax[good_inds]

    df['mg6'] = abs_mags
    df['dist6'] = dist

%timeit sixth_attempt_at_abs_mag_and_dist(df)



133 ms ± 2.29 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [46]:
def add_abs_mag_and_distance(df):
    import numpy as np
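    # Mask non-positive (and NaN) parallaxes so both columns come out as NaN,
    # matching the earlier attempts (otherwise distances can be negative or infinite)
    parallax = df['parallax'].where(df['parallax'] > 0)
    df['optim_abs_mag'] = df['phot_g_mean_mag'] + 5*np.log10(parallax) - 10
    df['optim_dist'] = 1000.0/parallax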
       
%timeit  add_abs_mag_and_distance(df)



88.6 ms ± 2.66 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
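A useful habit when optimising is to check the fast version against a slow but obviously correct reference. The sketch below does this on a small synthetic sample rather than the Gaia table; the array sizes, value ranges and the function name `vectorised_abs_mag_and_dist` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stars = 1_000  # small synthetic sample, not the Gaia data
apparent_mags = rng.uniform(5.0, 20.0, n_stars)
parallax = rng.uniform(-1.0, 10.0, n_stars)  # mas, includes negative values
parallax[:3] = [0.0, np.nan, -2.0]           # force a few awkward cases


def vectorised_abs_mag_and_dist(mags, plx):
    """Compute absolute magnitude and distance (pc) only where the parallax is usable."""
    good = np.isfinite(plx) & (plx > 0)
    abs_mag = np.full(plx.shape, np.nan)
    dist = np.full(plx.shape, np.nan)
    abs_mag[good] = mags[good] + 5.0 * np.log10(plx[good]) - 10.0
    dist[good] = 1000.0 / plx[good]
    return abs_mag, dist


abs_mag, dist = vectorised_abs_mag_and_dist(apparent_mags, parallax)

# Slow reference: one element at a time
expected_mag = np.array([m + 5.0 * np.log10(p) - 10.0 if p > 0 else np.nan
                         for m, p in zip(apparent_mags, parallax)])
expected_dist = np.array([1000.0 / p if p > 0 else np.nan for p in parallax])

mags_match = np.allclose(abs_mag, expected_mag, equal_nan=True)
dists_match = np.allclose(dist, expected_dist, equal_nan=True)
print(f"magnitudes match: {mags_match}, distances match: {dists_match}")
```

Comparing with `equal_nan=True` matters here, because the masked entries are NaN in both arrays and NaN never compares equal to itself by default.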


# Profiling Code (line by line)




In [23]:
%reload_ext line_profiler

In [None]:
def second_attempt_at_abs_mag_and_dist(df):
    import numpy as np
    from math import log10

    df['mg2'] = 0
    df['dist2'] = 0

    for c, v in enumerate(df['phot_g_mean_mag']):

        p =df.loc[c,'parallax']
        if p>0:
            df.loc[c,'mg2'] = v + 5 * log10(p) - 10
            df.loc[c,'dist2'] = 1000/p
        else:
            df.loc[c,'mg2'] = np.nan
            df.loc[c,'dist2'] = np.nan

In [None]:
%lprun -f second_attempt_at_abs_mag_and_dist second_attempt_at_abs_mag_and_dist(df)
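`%lprun` reports time per line, whereas `%prun` (used earlier) reports time per function. Outside a notebook, the per-function view can be sketched with the standard-library `cProfile` and `pstats` modules. The function `slow_abs_mags` and its synthetic inputs below are made up for illustration and stand in for the much slower DataFrame loop.

```python
import cProfile
import io
import math
import pstats


def slow_abs_mags(mags, parallaxes):
    """Deliberately element-by-element version, so there is something to profile."""
    out = []
    for mag, plx in zip(mags, parallaxes):
        out.append(mag + 5 * math.log10(plx) - 10 if plx > 0 else math.nan)
    return out


# Synthetic inputs (assumed values, not the Gaia data)
mags = [10.0 + 0.001 * i for i in range(50_000)]
parallaxes = [0.5 + 0.0001 * i for i in range(50_000)]

profiler = cProfile.Profile()
profiler.enable()
result = slow_abs_mags(mags, parallaxes)
profiler.disable()

# Collect the report in a string, sorted by cumulative time, top 5 entries
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report shows how many times each function was called and how long it took in total, which is usually enough to decide which function deserves a line-by-line look with `%lprun`.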

# Advanced Track 


## Challenge #1: Difficulty Rating - *Easy*

There is an open issue on the astropy repo, [issue no 8946](https://github.com/astropy/astropy/issues/8946). See if you can figure out what causes the high memory usage.

Hint: You will need `memory_profiler` for the first challenge.

## Challenge #2: Difficulty Rating - *Medium*

Perform the entire operation in parallel. First, create the parallel interface with `multiprocessing`, and then create an MPI parallel implementation with `mpi4py`.

Hint: Use the package `schwimmbad` for a customisable solution.

## Challenge #3: Difficulty Rating - *Medium*

Open a pull-request to fix [issue no 8946](https://github.com/astropy/astropy/issues/8946)

## Challenge #4: Difficulty Rating - *Ninja*

Open a pull-request to fix [issue no 8946](https://github.com/astropy/astropy/issues/8946) *AND* to read-in only some *Nrows* of data (i.e., add a new keyword). Any addition must maintain backwards compatibility. 

**Disclaimer** (MS) I do not know how to do this. If you are (interested in) attempting this, please let me know - I can put you in touch with astropy developers that would

In [50]:
gaia_dr2_np_arrays = df.to_numpy()


AttributeError: 'numpy.ndarray' object has no attribute 'savez'

In [53]:
import numpy as np
np.savez("async_20190630210155.npz", gaia_dr2_np_arrays)

OSError: [Errno 22] Invalid argument

In [54]:
df.to_pickle("async_20190630210155.pkl")