<h1 align="center">PROFILING- Why Is It Still So Slow?<br> How Do I Run A Billion Iterations Without Taking A Long Lunch Break?</h1>
<br>

Better yet is the question, ***'Can we make it run faster?'***

**Disclaimer:** Before approaching the subject of profiling and speed improvement, a couple of things need to be made abundantly clear. First, this is just a toy program for a functional-example teaching approach and NOT something designed for production use! Second, based on the above, this code is NOT highly Pythonic nor optimized. Third, the profiling we will be doing is old-school, 'poor-man's' timing runs.

Regardless of these facts you will see dramatic results from what we will now embark on!

## Scaling Runs

All scaling runs use the time reported by the Jupyter Notebook 'Done' message or by the Qt message box from our Qt GUI application. Both have one simple modification: the addition of a loop to run the 'Timing Loop' 3 times and take the average. The iteration count was increased by a factor of 10 per run, from 100 to 1 billion. The results from all the scaling runs we will discuss are listed below.
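The 'run it 3 times and take the average' idea can be sketched roughly as follows. This is a minimal illustration, not the actual timing code from the notebook or the Qt application: `average_runtime` and `dummy_workload` are hypothetical names, and the workload is just a stand-in for the attractor loop.

```python
import time


def average_runtime(func, *args, repeats=3):
    """Run func(*args) `repeats` times and return the mean wall-clock time in seconds."""
    elapsed = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)


def dummy_workload(n):
    """Hypothetical stand-in for the attractor 'Timing Loop'."""
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total


avg = average_runtime(dummy_workload, 100_000)
print('Average over 3 runs: {:.4f} s'.format(avg))
```

Averaging a few runs smooths out one-off hiccups, but it will not hide a fixed start-up cost that every run pays.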

In [None]:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

%matplotlib inline

In [None]:
#number of iterations used in scale profiling tests
numIters = np.array([100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000])

#Function based script with numba.jit enabled
jit = np.array([0.4933, 0.5543, 0.5737, 0.5923, 0.7379, 1.3825, 8.3346, 406.154])

#Final class based Qt project
calcTime=   [0.0009, 0.0129, 0.0747, 0.6073, 6.1585, 60.7318, 601.6905, 6172.3073]
renderTime= [0.0268, 0.0258, 0.0189, 0.0239, 0.0438,  0.4268,   1.0261,   12.5624]
totalTime=  [0.0448, 0.0577, 0.1156, 0.6671, 6.2902, 61.2784, 605.0613, 6188.4471]

staticJit=  [0.0156, 0.0313, 0.0489, 0.2356, 1.9807,  19.4169, 193.2154, 2083.1466]

#Pure c++ and OpenGL using pixel binning for shading (No Datashader)
pureC =     [0.0093, 0.0147, 0.0706,0.1776, 0.2213, 0.4147, 2.5003, 121.8462]

#Speedup seen when using numba.jit: Qt total time divided by jit time (a ratio, not a percentage)
pdiff = totalTime/jit


## Function based code with numba.jit enhancement VS the class approach with no numba.jit

The first thing we will look at is the comparison of our first function-based code, which included numba.jit, versus the sad effects of turning our code into a class-based approach which was incapable of running our original numba.jit enhancement.

As we know just from the 'feel' of it, running with numba.jit is indeed much faster (uhm, compiled code, duhhh) versus non-jit enhanced code. But now we can see just how much numba.jit has helped us!

Note the beautiful(?) linear nature of the plot. Remember we are working with a purely iterative function, making it linear in its scaling. The advantage is that once you get a feel for the run times at a few iteration counts, you can fairly accurately predict the runtime for a new one. Meaning: increase the iterations by 1 order of magnitude and the time will increase by 1 order of magnitude!
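To make the 'one magnitude in, one magnitude out' rule concrete, here is a small sketch that scales a measured runtime to a new iteration count. The helper name `predict_runtime` is hypothetical; the two timings are the Qt project's calculation times from the results listed above.

```python
def predict_runtime(known_iters, known_time, new_iters):
    """Scale a measured runtime linearly to a new iteration count."""
    return known_time * (new_iters / known_iters)


# Measured calculation time of the Qt project at 1 million iterations
measured_iters, measured_time = 1_000_000, 6.1585

# Predict 10 million iterations and compare with the measured 60.7318 s
predicted = predict_runtime(measured_iters, measured_time, 10_000_000)
print('Predicted: {:.3f} s, measured: 60.7318 s'.format(predicted))
```

The prediction lands at roughly 61.6 seconds against a measured 60.7 seconds, which is close enough to plan a lunch break around.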


In [None]:
fig, (ax1, ax2) = plt.subplots(1,2,figsize=(12, 4),sharey=True)
fig.suptitle('Scaling profile for numba.jit\nLeft Plot=Linear, Right Plot=Semilogx')

ax1.plot(numIters, totalTime, label='No jit')
ax1.plot(numIters, jit, label='jit enabled')
ax1.set_xlabel('Number of iterations')
ax1.set_ylabel('Runtime (seconds)')
ax1.legend(loc=0)
ax1.grid()

ax2.semilogx(numIters, totalTime, label='No jit')
ax2.semilogx(numIters, jit, label='jit enabled')
ax2.set_xlabel('Number of iterations')
ax2.legend(loc=0)
ax2.grid()

plt.show()

## @jit speed increase over the Qt based code
Numba claims that by using @jit to compile code you can expect upwards of a 50% speed increase. We can't tell that from the above plots, but if we calculate the actual speedup (the Qt total time divided by the jit time) we can see we do extremely well, except at the extremes (100 and 1 billion iterations).

Why the disparity? Well, when you have a simple set of equations and a low number of iterations over it, it's just going to be, in this case, blazingly fast no matter what. On the other end, well, 1 billion is simply A LOT to expect when you take into account that, unless you are running on a cluster, there is probably a lot of junk running in the background which over time will suck the life out of your performance. Does that really account for the huge drop in speedup at the 1 billion point? I doubt it, but I have not investigated further.
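For readers who prefer numbers to plots, the speedup can be tabulated directly from the timings listed earlier. This sketch uses plain Python lists rather than the notebook's numpy arrays, and the variable names are only illustrative.

```python
num_iters = [100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000]

# Timings (seconds) copied from the scaling-run results above
total_time = [0.0448, 0.0577, 0.1156, 0.6671, 6.2902, 61.2784, 605.0613, 6188.4471]
jit_time = [0.4933, 0.5543, 0.5737, 0.5923, 0.7379, 1.3825, 8.3346, 406.154]

# Ratio of Qt total time to jit time; values below 1 mean the jit run was slower
speedup = [t / j for t, j in zip(total_time, jit_time)]

for n, s in zip(num_iters, speedup):
    print('{:>13,d} iterations: {:6.2f}x'.format(n, s))

best_index = speedup.index(max(speedup))
print('Largest speedup at {:,d} iterations'.format(num_iters[best_index]))
```

Note that the ratio is below 1 for the three smallest runs, peaks at 100 million iterations, and drops again at 1 billion.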


In [None]:
fig, (ax1, ax2) = plt.subplots(1,2,figsize=(12, 4),sharey=True)
fig.suptitle('Speedup using numba.jit\nLeft Plot=Linear, Right Plot=Semilogx')

ax1.plot(numIters, pdiff)
ax2.semilogx(numIters, pdiff)
ax1.set_ylabel('Speedup (Qt time / jit time)')
ax1.set_xlabel('Number of iterations')
ax1.grid()

ax2.set_xlabel('Number of iterations')
ax2.grid()

plt.show()

## Where's the real slowdown, calculating the points or rendering them?

So the real question is: where is the real time sink in our code, the calculation of the attractor or the rendering of all the points? If, like me, you spend your life rendering things, you would instantly shout out RENDERING! Rendering even today can be PAINFULLY slow, as in days for a single plate, and when you have 5 seconds (at roughly 30fps) to render, boom, say goodbye to a large set of render nodes for a week (seriously!).

But, happily, we are not raytracing with all the latest and greatest new tools thrown into the mix!

So the question still remains: which is slower, calculating or rendering?

In [None]:
fig, (ax1, ax2) = plt.subplots(1,2,figsize=(12, 4),sharey=True)
fig.suptitle('Calculation time vs Render Time\nLeft Plot=Linear, Right Plot=Semilogx')

ax1.plot(numIters, calcTime, label='Calc Time')
ax1.plot(numIters, renderTime, label='Render Time')
ax1.set_xlabel('Number of iterations')
ax1.set_ylabel('Runtime (seconds)')
ax1.legend(loc=0)
ax1.grid()

ax2.semilogx(numIters, calcTime, label='Calc Time')
ax2.semilogx(numIters, renderTime, label='Render Time')
ax2.set_xlabel('Number of iterations')
ax2.legend(loc=0)
ax2.grid()

plt.show()

Clearly Datashader is EXTREMELY efficient at rendering large datasets!

All our time is being used on calculating the data! Plus as expected, we are still seeing a nice(?) linear relationship when increasing the number of iterations.
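The split between calculating and rendering can also be read straight off the numbers. The sketch below computes what fraction of the total time went into calculation for each run; it uses plain lists and illustrative variable names, with the timings copied from the results above.

```python
num_iters = [100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000]

# Qt project timings (seconds) copied from the scaling-run results above
calc_time = [0.0009, 0.0129, 0.0747, 0.6073, 6.1585, 60.7318, 601.6905, 6172.3073]
total_time = [0.0448, 0.0577, 0.1156, 0.6671, 6.2902, 61.2784, 605.0613, 6188.4471]

# Share of each run spent calculating the attractor points
calc_share = [c / t for c, t in zip(calc_time, total_time)]

for n, share in zip(num_iters, calc_share):
    print('{:>13,d} iterations: {:5.1f}% of total time is calculation'.format(n, 100 * share))
```

For the tiny runs the calculation is only a small slice of the total, but from 100,000 iterations upward it accounts for more than 90% of the runtime, and more than 99% at the large end.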

But is Datashader really static in render time regardless of how much data you throw at it?

The plot below proves that no, it is not. But what is nice, at least in this usage of Datashader, is that it's pretty linear as well. Not only that, but when manipulating and rendering 1 BILLION data points it still only takes ~12 seconds! That's extremely sexy for those of us who now live in a world of instant computing gratification!


In [None]:
fig, (ax1, ax2) = plt.subplots(1,2,figsize=(12, 4),sharey=True)
fig.suptitle('Datashader render Time Scaling\nLeft Plot=Linear, Right Plot=Semilogx')

ax1.plot(numIters, renderTime)
ax1.set_xlabel('Number of iterations')
ax1.set_ylabel('Runtime (seconds)')
ax1.grid()

ax2.semilogx(numIters, renderTime)
ax2.set_xlabel('Number of iterations')
ax2.grid()

plt.show()

## Well that's not shocking news, but what can we do about it?

So we have seen that @jit does not work in classes. We have also explored the potential of using @jitclass() and found it pretty much prohibitive. So what are the options?

First, let's go back and recognize that @jit does indeed work if it's JUST a function and not embedded in a class. Yet we are beginning to see the power of, and absolute need for, classes. If Python is truly this magical 'dream it and it already exists' language, then where's the solution?

## Introducing the @staticmethod

OK, how about this: what if we could have a class that had functions in it, but they were not actually part of the class? Confused yet? Well, here's the solution: **@staticmethod**, a vanilla, albeit not overly discussed, core Python feature.

The magic of @staticmethod is that it creates a static function, which can live within the 'container' of the class but is not associated with the class 'object' itself. If you are familiar with Java, it's somewhat akin to a static method used as a global function, but do not confuse it with Fortran's or C++'s concept of a global! As for Julia and its ability to create a new scope at almost any time, yeah well, let's just not go there right now!


- The basic construct of a @staticmethod is to add this decorator before the function in your class.
- Next, unlike (almost) all other functions in a class, you DO NOT add 'self' as the first or any parameter!!!
- Now add your function code.
- When you want to call the code you still go through the class, but you do not need to pre-instantiate a variable as a handle to the class: the class name itself works.

Below is a quick example of how such a function is created

In [None]:
import numpy as np

class AttractorEquations:
    
    @staticmethod
    def Clifford(x,y,z,a,b,c,d,*args):
        Xn = np.sin(a * y) + c * np.cos(a * x)
        Yn = np.sin(b * x) + d * np.cos(b * y)
        Zn = 0.0
        return Xn, Yn, Zn
    

There are several important 'caveats' of a @staticmethod function that you need to make note of.
- We now have to find a way to pass all the variables we need for the function to work. In this case we can NOT use variables in the scope of the class (again, we are just using the class as a shell and have no association to the class as an object!). Thus no setters or class member variable access!
- Likewise anything we return has to literally be returned, no getters!
- One other thing is the '\*args' parameter. This is cool and we have seen it earlier, but now we are using the concept of \*args slightly differently. Remember we actually have 7 different attractor variables we are making use of in all our various attractor equations. Yet for Clifford we only need 4 (a,b,c,d). Since we desire a generic method to handle any of the equations regardless of the parameters needed, we pass all of them to every one of our attractor equation functions. So in this case the unused variables e, f, and g are just caught by \*args and simply ignored after that point. It's only 3 variables in this case, but remember the 167 parameter equation I mentioned earlier? How would you like to deal with that for EVERY equation function? That brings up the question of how you would deal with passing 167 parameters. Answer: \*args. Pass them all in one variable and then parse them in your function, or call another @staticmethod parsing function.
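Here is a small sketch of the \*args caveat in action. `DemoEquations` is a hypothetical stand-in for AttractorEquations that uses the standard `math` module instead of numpy so it runs anywhere, and the parameter values are purely illustrative, not taken from the attractor configuration file.

```python
import math


class DemoEquations:
    """Hypothetical stand-in for AttractorEquations (math instead of numpy)."""

    @staticmethod
    def Clifford(x, y, z, a, b, c, d, *args):
        # e, f and g arrive in args and are simply ignored
        Xn = math.sin(a * y) + c * math.cos(a * x)
        Yn = math.sin(b * x) + d * math.cos(b * y)
        return Xn, Yn, 0.0

    @staticmethod
    def count_extras(x, y, z, a, b, c, d, *args):
        # Shows how many unused parameters were swallowed by *args
        return len(args)


# All seven parameters a..g are always passed (illustrative values only)
params = (-1.4, 1.6, 1.0, 0.7, 0.0, 0.0, 0.0)

point = DemoEquations.Clifford(0.0, 0.0, 0.0, *params)
extras = DemoEquations.count_extras(0.0, 0.0, 0.0, *params)
print(point, extras)
```

The caller never needs to know how many parameters a given equation really uses, which is exactly what makes a generic dispatch loop possible.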


As long as we understand and work in these confines then life is grand.

Now, to access this function you can either use an instance<br>
```python
eqns = AttractorEquations()
x[i+1],y[i+1],z[i+1] = eqns.Clifford(x[i], y[i], z[i], a, b, c, d, e, f, g)
```
or just<br>
```python
x[i+1],y[i+1],z[i+1] = AttractorEquations.Clifford(x[i], y[i], z[i], a, b, c, d, e, f, g)
```

## @staticmethod has a sister - @classmethod

We are not going to dig too much into it, but there is a similar, albeit opposite, decorator called **@classmethod**. There are three extremely important times when you could, and usually should, use @classmethod.
- When creating an abstract base class, especially when you have functions that the child classes are intended to override!
- Creating factory methods/classes.
- When creating a polymorphic class with various optional constructors.

What makes @classmethod different from @staticmethod? In a @staticmethod function you do not use 'self' because the function is not within the 'object' of the class, just the container of it.

In a @classmethod you also do not include 'self' as a parameter. Instead you pass 'cls' in as the first parameter; it refers to the class itself rather than to an instance.

In [None]:
class Researcher(object):
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name    

    @classmethod
    def from_string(cls, szName):
        first_name, last_name = map(str, szName.split(' '))
        gra = cls(first_name, last_name)
        return gra

funnestNameEver = Researcher.from_string('Ranomi Kromowidjojo')  

print(funnestNameEver)

#Ranomi Kromowidjojo is a Dutch freestyle swimmer who won the 2008 Olympic gold medal in the 4x100 relay
#and the 50m and 100m freestyle golds at the 2012 Olympics.
#As far as I know she still holds the world record in the 50m freestyle short course.
#But more exciting, to me anyways, is how insanely fun it is to say her name!

Two major things to note here.
- We can use the hosting class(object)'s class members and if desired class functions.
- Note that when you print out the returning object from a @classmethod you actually get an object!

Plus 3 not so widely known facts, even among @classmethod users.
- You can call a @staticmethod from a @classmethod
- Likewise you can call another @classmethod from a @classmethod (I guess this could be a factory of factories(?))
- You do not have to name your 'cls' variable 'cls'. Heck, you can call it 'Kromowidjojo' if you like, but by convention please always use 'cls' so others will know what is going on when reading your code!
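The first two of those facts can be sketched in a few lines. `ResearcherDemo` is a hypothetical variant of the Researcher class above; the `split_name` helper, the `from_csv` factory and its 'Last, First' input layout are assumptions made up for this illustration.

```python
class ResearcherDemo:
    """Hypothetical variant of Researcher used to illustrate the notes above."""

    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    @staticmethod
    def split_name(full_name):
        # Plain helper: needs neither the instance nor the class
        first, last = full_name.split(' ', 1)
        return first, last

    @classmethod
    def from_string(cls, full_name):
        # A @classmethod calling a @staticmethod
        first, last = cls.split_name(full_name)
        return cls(first, last)

    @classmethod
    def from_csv(cls, line):
        # A @classmethod calling another @classmethod ('Last, First' is assumed)
        last, first = line.split(',')
        return cls.from_string('{} {}'.format(first.strip(), last.strip()))

    def __str__(self):
        return '{} {}'.format(self.first_name, self.last_name)


swimmer = ResearcherDemo.from_csv('Kromowidjojo, Ranomi')
print(swimmer)
```

Because both factories build the object through `cls`, a subclass calling them gets an instance of the subclass rather than of the base class.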

## Back to our @jit issue
Well, that was a nice segue, but what does it have to do with getting @jit to work inside a class?
I'm so glad you asked! Remember when you tried to find out why it was failing? It did not like being, or using, members or functions in a class (true, this is a poor definition of what is really happening, but this is the effective result).

But now we have shown that we can create a class that has our equations in it, and they are pure static(ish, for C/C++ folks) functions. So what if we modify our function to this:

In [None]:
import numpy as np
from numba import jit

class AttractorEquations:
    
    @staticmethod
    @jit
    def Clifford(x,y,z,a,b,c,d,*args):
        Xn = np.sin(a * y) + c * np.cos(a * x)
        Yn = np.sin(b * x) + d * np.cos(b * y)
        Zn = 0.0
        return Xn, Yn, Zn

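**IMPORTANT NOTE:** The ordering of @staticmethod and @jit is critical. @staticmethod must be on top, with @jit directly above the function definition. Decorators are applied from the bottom up, so numba compiles the plain function first and @staticmethod then wraps the compiled result. Reverse the order and numba is handed a staticmethod object instead of a plain function, and it barfs errors on you.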
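The bottom-up application order can be seen without numba at all. In the sketch below, `fake_jit` is a hypothetical stand-in decorator (it is NOT numba.jit and compiles nothing); it only records what kind of object it was handed.

```python
import functools


def fake_jit(func):
    """Hypothetical stand-in for numba.jit: records the type it receives."""
    fake_jit.received = type(func).__name__

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


class Demo:
    @staticmethod   # applied second: wraps the already decorated function
    @fake_jit       # applied first: sees the plain function
    def add(a, b):
        return a + b


result = Demo.add(2, 3)
print(fake_jit.received, result)
```

With this ordering the stand-in receives an ordinary function, which is what a compiling decorator wants to see. Swapping the two lines would hand it a staticmethod object instead.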

# A full working example of our Attractor classes making use of @staticmethod and @jit


Below is a modified version of the Notebook code we created earlier, which lets us keep our much more flexible class design and still use numba.jit.

In [None]:
import numpy as np
import pandas as pd
import datashader as ds
from datashader import transfer_functions as tf
from datashader import utils
from numba import jit
import time 
import os
import sys  # needed for sys.exit() in Attractor_Config
from colorcet import palette


In [None]:
class Attractor_Config():
    """Utility class to open the Attractors configuration file
       and extract the parameters for a chosen attractor type"""
    
    def __init__(self, fName=""):
        self._df = None
        self._fName = fName
        self._coords = []
        self._params = []
        
        if self._fName:  # skip loading when no file name was given (default is "")
            self.get_AttractorConfigFile()
    
    def get_AttractorConfigFile(self):
        """Open the Attractors configuration file"""
        if os.path.isfile(self._fName) and os.access(self._fName, os.R_OK):
            self._df = pd.read_csv(self._fName)
            self._df.set_index('Attractor', inplace=True)   
        else:
            print("ERROR: The file is either missing or it's not readable")
            sys.exit(1)
    
    def getAttractor(self, fxn):
        """Get the configuration for the desired attractor"""
        a = self._df.loc[fxn]

        self._coords = [a.iloc[1], a.iloc[2], a.iloc[3]]

        self._params = np.array([])
        for v in range(4, np.size(a)):
            self._params = np.append(self._params, a.iloc[v])

        return self._coords, self._params
    
    @property
    def coords(self):
        """Getter - return coords"""
        return self._coords
    
    @property
    def df(self):
        """Getter - return df"""
        return self._df

    @property
    def fName(self):
        """Getter - return fName """
        return self._fName
    
    @property
    def params(self):
        """Getter - return params"""
        return self._params

    
    @df.setter
    def df(self, val):
        """Setter - sets df"""
        self._df = val
        
    @coords.setter
    def coords(self, val):
        """Setter - sets coords"""
        self._coords = val
        
    @fName.setter
    def fName(self, val):
        """Setter - sets fName"""
        self._fName = val
        
    @params.setter
    def params(self, val):
        """Setter - sets params"""
        self._params = val
    

In [None]:
class AttractorEquations():
    @staticmethod
    @jit
    def Pickover(x, y, z, a, b, c, d, *args):
        Xn =  np.sin(a*y) - z*np.cos(b*x)
        Yn =  z*np.sin(c*x) - np.cos(d*y)
        Zn =  np.sin(x) 

        return Xn, Yn, Zn
    
    @staticmethod
    @jit
    def Clifford(x,y,z,a,b,c,d,*args):
        Xn = np.sin(a * y) + c * np.cos(a * x)
        Yn = np.sin(b * x) + d * np.cos(b * y)
        Zn = 0.0
        return Xn, Yn, Zn

In [None]:
class Attractors(object):
    """Compute an attractor trajectory and render it with Datashader"""
    def __init__(self, *args, **kwargs):
        self.n = 100000
        
        self.parse_kwargs(**kwargs)
        self.parse_coords()
        self.parse_avars()
        self._cmap = 'bgy'

    def parse_kwargs(self, **kwargs):
        svd_opts = ['fxn', 'coords', 'avars']
        
        for key in svd_opts:
          if key in kwargs:
            setattr(self, key, kwargs[key])

    def parse_coords(self):
        self.x = self.coords[0]
        self.y = self.coords[1]
        self.z = self.coords[2]

    def parse_avars(self):
        self.a = self.avars[0]
        self.b = self.avars[1]
        self.c = self.avars[2]
        self.d = self.avars[3]
        self.e = self.avars[4]
        self.f = self.avars[5]
        self.g = self.avars[6]

    @staticmethod
    @jit
    def trajectory(fn, x0, y0, z0, a=0, b=0, c=0, d=0, e=0, f=0, g=0, n=1000):
        eqns = AttractorEquations()
        fxn_dispatch = {'Clifford' : eqns.Clifford}
        print(fn)
       
        x, y,z = np.zeros(n), np.zeros(n), np.zeros(n)
        x[0], y[0], z[0] = x0, y0, z0

        
        for i in np.arange(n-1):
            x[i+1], y[i+1], z[i+1] = fxn_dispatch[fn](x[i], y[i], z[i], a, b, c, d, e, f, g)
        return pd.DataFrame(dict(x=x, y=y, z=z))
    
    def dsplot(self,  fn, coords, avars, n,cmap='bgy'):
        """Return a Datashader image by collecting `n` trajectory points for the given attractor `fn`"""
        cmap = palette[cmap][::-1]      
        df  = self.trajectory(fn, *coords, *avars, n=n)
        cvs = ds.Canvas(plot_width = 400, plot_height = 400)
        agg = cvs.points(df, 'x', 'y')
        img = tf.shade(agg, cmap=cmap)
        return img
    
    @property
    def cmap(self):
        """Getter - return cmap"""
        return self._cmap
    
    @cmap.setter
    def cmap(self, val):
        """Setter - sets cmap"""
        self._cmap = val

In [None]:
#if __name__ == '__main__':
fName = "attractors.atr"

attrConfig = Attractor_Config(fName)
coords, avars = attrConfig.getAttractor('Clifford')

start = time.time()
attr = Attractors(fxn='Clifford', coords=coords, avars=avars)
attr.cmap = 'bgy'

img = attr.dsplot(fn='Clifford',coords=coords, avars=avars, n=1000000,cmap='kbc')
end = time.time()
print('Time: {}'.format(end-start))

img

# But does it help?

So now we did some fancy-pants things to our code and it magically (again, Python is the best example of true magic in our world) just simply works, when in all reality it probably shouldn't!
BUT did all of the work pay off? Is it really worth it to change our code to make this adjustment? Well, let's jump back to our profiling data.

Below we have a plot which includes our previous results:<br>
The orange line is our base function code using @jit<br>
The blue line is our Qt project code.<br>
The green line is our new @staticmethod + @jit code

In [None]:
fig, ax = plt.subplots()
ax.plot(numIters, totalTime, label='No jit')
ax.plot(numIters, jit, label='jit enabled')
ax.plot(numIters, staticJit, label='Static jit')

ax.set_title('Scaling profile for @staticmethod + numba.jit\non Clifford attractor')
ax.set_xlabel('Number of iterations')
ax.set_ylabel('Runtime (seconds)')
ax.legend(loc=0)
ax.grid()
plt.show()

While the speedup with our new code does not match up to our original purely function-based code, it is vastly superior to our Qt code as it currently stands!


# Can we do something even better yet with Python?

Using an old C++/OpenGL program I wrote many years ago, a scaling run was done with the exact same attractor equation and parameters. The profiling plot below shows that even this older C++/OpenGL code runs faster than the combination of Python's @staticmethod and numba.jit. Or heck, even faster than our purely function-based Python script. This should be a no-brainer really, but it's interesting to see what difference, in this case, there is.
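To put numbers on that difference, the sketch below compares all four implementations at the 1 billion iteration mark, using the last entry of each timing list from the results above. The dictionary keys are just descriptive labels chosen for this illustration.

```python
# Runtime in seconds at 1 billion iterations, copied from the results above
runtime_1e9 = {
    'Qt project (no jit)': 6188.4471,
    '@staticmethod + @jit': 2083.1466,
    'Function based + jit': 406.154,
    'Pure C++/OpenGL': 121.8462,
}

baseline = runtime_1e9['Pure C++/OpenGL']

# How many times slower each version is than the C++ run
slowdown = {name: t / baseline for name, t in runtime_1e9.items()}

for name, factor in slowdown.items():
    print('{:<22s} {:9.1f} s  ({:5.1f}x the C++ time)'.format(name, runtime_1e9[name], factor))
```

At this size the function-based jit script takes a bit more than 3 times as long as the C++ program, the @staticmethod + @jit version about 17 times, and the Qt project about 51 times.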


In [None]:
fig, ax = plt.subplots()
ax.plot(numIters, totalTime, label='No jit')
ax.plot(numIters, jit, label='jit enabled')
ax.plot(numIters, staticJit, label='Static jit')
ax.plot(numIters, pureC, label='Pure c++')

ax.set_title('Scaling profile including pure C++/OpenGL\non Clifford attractor')
ax.set_xlabel('Number of iterations')
ax.set_ylabel('Runtime (seconds)')
ax.legend(loc=0)
ax.grid()
plt.show()

So what does that mean for Python code enhancement?

**Cython!!!** Cython allows you to write C extensions inside of Python itself. Plus there are other ways to blend other languages with Python.<br>

Will it really help? This story is to be continued---


# The moral of this story

So the real question is, again, is it worth converting our code to add in @staticmethod and @jit? Well, if you plan on rendering more than 1 billion points and doing it more than once, then yes.<br>
**Note:** I found, even after bumping the memory as high as I can in the Jupyter config files, that Jupyter Notebook croaks at 100 billion iterations. My C++ code easily (albeit wasting an entire weekend) can render at least 1 trillion iterations!!!
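One possible contributor to that limit (an assumption on my part, not something measured here) is simply memory: the `trajectory` function above allocates three `np.zeros(n)` arrays, and numpy's default float64 takes 8 bytes per value. The back-of-the-envelope sketch below estimates the size of those arrays alone, ignoring the extra DataFrame and any other overhead; `trajectory_bytes` is a hypothetical helper name.

```python
BYTES_PER_FLOAT64 = 8   # numpy's default dtype for np.zeros
ARRAYS = 3              # x, y and z in trajectory()


def trajectory_bytes(n):
    """Rough size in bytes of the x, y, z arrays for n iterations."""
    return ARRAYS * BYTES_PER_FLOAT64 * n


for n in (10**9, 10**11):
    gigabytes = trajectory_bytes(n) / 1e9
    print('{:,d} iterations: about {:,.0f} GB'.format(n, gigabytes))
```

That works out to roughly 24 GB of raw arrays for 1 billion iterations and about 2,400 GB for 100 billion, so a design that stores every point will hit a wall long before one that bins points as it goes.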

***More importantly the answer is NO!***
Wait, what? Why the heck would I say that? Because Cython is a better solution? Maybe, but let's assume no.
OK then, again, WHAT?
The point is you now know that tools like @jit exist, and that they can be made to work in the flexible, all-powerful world of classes. Why the heck would you intentionally design something like this and then plan to change it? Well, unless you're teaching a workshop. But other than this exception:<br>
***Design it right from the beginning and there is no time wasted refactoring all your code!***

**You now have a suite of new tools available to you - Go forth and create new magic of your own!**