# Recap

In order of priority/time taken

1. pandas init dict
    - `basal_area_aw_df = pd.DataFrame(columns=['BA_Aw'], index=xrange(max_age))`
    - find a faster way to create this data frame
    - relax the tolerance for aspen
2. pandas set item
    - use the `.at` accessor for scalar get/set
    - http://pandas.pydata.org/pandas-docs/stable/indexing.html#fast-scalar-value-getting-and-setting
3. lambdas
    - use cython for the gross tot vol and merch vol functions
    - might be wise to refactor these first to have conventional names, keyword arguments, and a base implementation to get rid of the boilerplate
    - don't be deceived: the callable is a minuscule portion; `Series.__getitem__` is taking most of the time
    - again, using `.at` here would probably be a significant improvement
4. BasalAreaIncrementNonSpatialAw
    - this is actually slow because of the number of times the `BAfromZeroToDataAw` function is called, as shown above
    - relaxing the tolerance may help
    - indeed, the tolerance is 0.01 * some value, while the other factor finder functions have a tolerance of 0.1, I think
    - can also use cython for the increment functions
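
One possible way to avoid the slow frame creation in item 1, sketched under assumptions: fill a NumPy array inside the loop and wrap it in a DataFrame once at the end. The `max_age` value and the per-age calculation below are placeholders, not the real simulation logic.

```python
import numpy as np
import pandas as pd

max_age = 250  # hypothetical value; the real max_age comes from the simulation

# Approach from the notes: create an empty, all-NaN frame up front.
empty_df = pd.DataFrame(columns=['BA_Aw'], index=range(max_age))

# Alternative sketch: do the scalar writes on a plain float array,
# then build the frame a single time.
basal_area = np.full(max_age, np.nan)
for age in range(max_age):
    basal_area[age] = 0.1 * age  # placeholder for the real increment logic

basal_area_aw_df = pd.DataFrame({'BA_Aw': basal_area}, index=range(max_age))
```

Whether this is actually faster for gypsy needs to be measured with `%%timeit` on the real code path.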

Do a profiling run that includes I/O (reading the input data and writing the plot curves to files) in the next run.


# Characterize what is happening

Indexing with `df[]` or `series[]` is slow for scalars (lambdas, pandas set item).

`BasalAreaIncrementNonSpatialAw` runs a lot for Aw; use the same tolerance as is used for the other species.

The merch vol, increment, and gross vol functions are pure Python. Cython would be effective.

# Decide on the action

- use the same tolerance for Aw as for the other species
- use `.at` instead of `[]` or `.ix`? Compare these in an MWE
- creating the data frame is slow, maybe because it is built from a dict; see if this can be improved
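
To illustrate why the tolerance matters, here is a toy sketch, not the gypsy implementation. It assumes the factor finder repeatedly calls a simulation until the result is within `tolerance` of a target; bisection and `toy_simulation` are stand-ins chosen only to make the example runnable, and the tolerance values are treated as absolute numbers for simplicity.

```python
def find_factor(target, tolerance, simulate):
    """Bisect on a factor until simulate(factor) is within tolerance of target.

    Returns the factor and the number of simulate() calls it took.
    """
    low, high = 0.0, 10.0
    calls = 0
    while True:
        factor = (low + high) / 2.0
        value = simulate(factor)
        calls += 1
        if abs(value - target) <= tolerance:
            return factor, calls
        if value < target:
            low = factor
        else:
            high = factor


def toy_simulation(factor):
    """Hypothetical monotonic stand-in for the basal area simulation."""
    return 3.0 * factor


target = 20.0
_, strict_calls = find_factor(target, 0.01, toy_simulation)
_, relaxed_calls = find_factor(target, 0.1, toy_simulation)
print('strict: %d calls, relaxed: %d calls' % (strict_calls, relaxed_calls))
```

Every call saved in the outer search also saves all of the increment calls made inside one simulation, which is where the profile shows the time going.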

# MWEs

In [29]:
%%timeit
d = pd.DataFrame(columns=['A'])
for i in xrange(1000):
    # note: append returns a new frame; the result is discarded here, so d stays empty
    d.append({'A': i}, ignore_index=True)

1 loop, best of 3: 1.39 s per loop


In [30]:
%%timeit
d = pd.DataFrame(columns=['A'], index=xrange(1000))
for i in xrange(1000):
    d.loc[i,'A'] = i

1 loop, best of 3: 150 ms per loop
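
A further MWE worth running, sketched here outside of `%%timeit` so it is self-contained: the same fill loop with `.loc` and with `.at`. The frame size and column name mirror the cells above; no timings are quoted because they were not measured for these notes.

```python
import timeit

import pandas as pd

n = 1000
df = pd.DataFrame({'A': [0.0] * n}, index=range(n))


def fill_with_loc():
    # general-purpose label indexer
    for i in range(n):
        df.loc[i, 'A'] = i


def fill_with_at():
    # scalar-only accessor, intended for single-value get/set
    for i in range(n):
        df.at[i, 'A'] = i


loc_seconds = timeit.timeit(fill_with_loc, number=3)
at_seconds = timeit.timeit(fill_with_at, number=3)
print('loc: %.3f s, at: %.3f s' % (loc_seconds, at_seconds))
```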


# Revise the code

Go on. Do it.

# Review code changes

In [2]:
%%bash
# git log --since 2016-11-09 --oneline

In [3]:
# ! git diff HEAD~7 ../gypsy

# Tests

Do tests still pass?

# Run profiling

In [18]:
import pandas as pd

from gypsy.forward_simulation import simulate_forwards_df



In [19]:
data = pd.read_csv('../private-data/prepped_random_sample_300.csv', index_col=0, nrows=10)

In [20]:
%%prun -D forward-sim-2.prof -T forward-sim-2.txt -q
result = simulate_forwards_df(data)

 
*** Profile stats marshalled to file u'forward-sim-1.prof'. 

*** Profile printout saved to text file u'forward-sim-1.txt'. 


In [21]:
!head forward-sim-2.txt

         10055657 function calls (9875729 primitive calls) in 76.264 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   492069    6.857    0.000    6.857    0.000 GYPSYNonSpatial.py:427(BasalAreaIncrementNonSpatialAw)
  1836602    6.527    0.000    9.190    0.000 {isinstance}
796652/624746    3.102    0.000    4.823    0.000 {len}
     7191    2.670    0.000   40.459    0.006 GYPSYNonSpatial.py:959(BAfromZeroToDataAw)
   511948    2.020    0.000    3.373    0.000 {getattr}


In [22]:
!diff -y forward-sim-2.txt forward-sim-1.txt

diff: forward-sim.txt: No such file or directory


# Compare performance visualizations

Now use either of these commands to visualize the profiling

```
pyprof2calltree -k -i forward-sim-1.prof forward-sim-1.txt

# or

dc run --service-ports snakeviz notebooks/forward-sim-1.prof
```
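
If neither viewer is available, the `.prof` file can also be inspected with the standard library. A minimal sketch, assuming a toy workload (`busy_work` is hypothetical) in place of `simulate_forwards_df` and a temporary file in place of `forward-sim-1.prof`:

```python
import cProfile
import os
import pstats
import tempfile


def busy_work(n):
    """Toy workload standing in for simulate_forwards_df."""
    return sum(i * i for i in range(n))


# hypothetical file name, written to a temporary directory
profile_path = os.path.join(tempfile.mkdtemp(), 'example.prof')

profiler = cProfile.Profile()
profiler.enable()
busy_work(100000)
profiler.disable()
profiler.dump_stats(profile_path)

# load the dumped stats and print the five entries with the most internal time
stats = pstats.Stats(profile_path)
stats.sort_stats('tottime').print_stats(5)
```

Sorting by `'tottime'` matches the "Ordered by: internal time" view in the `%%prun` printout above.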

### Old

![definitive reference profile screenshot](forward-sim-1-performance.png)

### New

![1st iteration performance](forward-sim-2-performance.png)

## Summary of performance improvements

`forward_simulation` is now 4x faster due to the changes outlined in the code review section above.

On my hardware, this brings 1000 plots down to ~8 minutes.

On Carol's hardware, this brings 1000 plots down to ~25 minutes.

For 1 million plots, we're looking at 5 to 17 days on desktop hardware.
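
The extrapolation above is simple linear scaling from the per-1000-plot timings. As a quick check, assuming run time grows linearly with the number of plots:

```python
MINUTES_PER_DAY = 24 * 60


def days_for(n_plots, minutes_per_1000_plots):
    """Serial run time in days, assuming time scales linearly with plot count."""
    return n_plots / 1000.0 * minutes_per_1000_plots / MINUTES_PER_DAY


fast_days = days_for(1000000, 8)   # ~8 minutes per 1000 plots
slow_days = days_for(1000000, 25)  # ~25 minutes per 1000 plots
print('%.1f to %.1f days' % (fast_days, slow_days))
```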



# Profile with I/O


In [None]:
! rm -rfd gypsy-output

In [None]:
output_dir = 'gypsy-output'

In [20]:
%%prun -D forward-sim-2.prof -T forward-sim-2.txt -q
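# restart the kernel first; imports and output_dir must be defined again afterwards
import os
import pandas as pd
from gypsy.forward_simulation import simulate_forwards_df

output_dir = 'gypsy-output'
data = pd.read_csv('../private-data/prepped_random_sample_300.csv', index_col=0, nrows=10)
result = simulate_forwards_df(data)
os.makedirs(output_dir)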
for plot_id, df in result.items():
    filename = '%s.csv' % plot_id
    output_path = os.path.join(output_dir, filename)
    df.to_csv(output_path)


 
*** Profile stats marshalled to file u'forward-sim-1.prof'. 

*** Profile printout saved to text file u'forward-sim-1.txt'. 


# Identify new areas to optimize



- from last time:
    - parallel (3 cores) gets us to 2 - 6 days - save for last
    - AWS with 36 cores gets us to 4 - 12 hours ($6.70 - $20.10 USD on a c4.8xlarge instance in US West Region)
- now:
    - 
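
The "from last time" figures can be reproduced with a rough sketch. Assumptions: perfectly linear scaling across cores, billing by whole hours, and an hourly rate of $1.675, which is only implied by the quoted $6.70 for 4 hours and is not taken from a price list.

```python
import math

N_PLOTS = 1000000
MINUTES_PER_1000_PLOTS = (8, 25)  # my hardware, Carol's hardware
HOURLY_RATE_USD = 1.675           # assumption: implied by $6.70 / 4 hours


def wall_hours(minutes_per_1000, cores):
    """Wall-clock hours for all plots, assuming ideal scaling across cores."""
    serial_hours = N_PLOTS / 1000.0 * minutes_per_1000 / 60.0
    return serial_hours / cores


for cores in (3, 36):
    fast, slow = [wall_hours(m, cores) for m in MINUTES_PER_1000_PLOTS]
    print('%d cores: %.1f to %.1f hours' % (cores, fast, slow))

# cost on 36 cores, rounding each run up to whole billed hours
costs = [math.ceil(wall_hours(m, 36)) * HOURLY_RATE_USD
         for m in MINUTES_PER_1000_PLOTS]
print('cost: $%.2f to $%.2f' % tuple(costs))
```

Real speedup will be lower than this if the per-plot work is not fully independent or if I/O becomes the bottleneck.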

# Identify some means of optimization

In order of priority/time taken

1.
2.