# <font color=red>Python Profiling:</font> *`(using the right tool for the situation)`*

# `Mr Fugu Data Science`

# (◕‿◕✿)

*Motivation*: `Wouldn't it be better to measure where the bottleneck is instead of relying on an estimate or a theoretical idea?`

+ If you have a program with a function that takes, say, 10 minutes to run, is this 30% or 90% of the total run time?
    + It would serve us to check whether our code is inefficient. But without experience or guidance, how would you know in the first place that your code is slow or poorly written?
    
*`Stepping back and doing a few things will aid in our inspection.`*

`1.)` Investigate the run time of our program

* Get a baseline of what we are dealing with

`2.)` Then find functions that may be holding us back

`3.)` If you need a more granular approach consider a line by line search when you start to narrow down areas


**Before we move on, there is something we should really consider: blocks of code often depend on other pieces of code, which complicates how your program runs. These dependencies can cause extra function calls, slow you down, or make it hard to trace what is calling what.**
+ Do not assume that something is bogging everything down just because it is called often
+ Understand that you may want to exclude program startup or initial overhead from profiling
+ Optimizing code isn't always a good bet, because the result can be harder to maintain, keep stable, or read

# **`Profiling:`**

+ Consider `profiling` as measuring the time or memory that a program, a function, or even a single line of code uses, so you can figure out which resources it is occupying.  
    + One thing we can look at is the number of function calls. If you notice something is called frequently, take note, but this by itself doesn't always mean there is an issue to resolve.
    

`Two types of Profiling:`

+ **Deterministic:**
    * monitors every event (function call, return, exception). **`While accurate, it adds performance overhead`**. This would be better run on small functions or operations.
    + Think of it like this: every event is recorded rather than sampled, so the call counts you get are exact

+ **Statistical:**
    * **`Less accurate but also uses less overhead`**, because it takes periodic samples instead of recording every event.
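
To make the deterministic idea concrete, here is a minimal sketch that hooks the standard library's `sys.setprofile` and counts every Python-level function call. The names `count_calls`, `helper` and `work` are hypothetical, and a real deterministic profiler also records timings; this only illustrates the "record every event" approach.

```python
import sys

call_counts = {}

def count_calls(frame, event, arg):
    # The interpreter invokes this hook for every profiling event.
    # We only keep "call" events, which mark entry into a Python function.
    if event == "call":
        name = frame.f_code.co_name
        call_counts[name] = call_counts.get(name, 0) + 1

def helper(x):
    return x * 2

def work(n):
    total = 0
    for i in range(n):
        total += helper(i)
    return total

sys.setprofile(count_calls)      # start recording events
try:
    result = work(5)
finally:
    sys.setprofile(None)         # always remove the hook again

print(result, call_counts)
```

Because nothing is sampled, `helper` is counted exactly five times and `work` once. The price is that the hook runs on every single event, which is where the overhead comes from.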
    
Something really useful and pretty cool is the call graph: look into **gprof2dot**, for example. It will `convert your profiler's output into a graph`-like structure showing which functions are calling each other.


# `Other Tools:`

+ vprof
+ pyflame
+ StackImpact
    
https://medium.com/@antoniomdk1/hpc-with-python-part-1-profiling-1dda4d172cdf

# `Native Python Profiling examples:` 

+ **`timeit:`** benchmark code blocks or lines of code
    + Not used on an entire program
    + Code needs to be isolated
    
+ **`cProfile:`** runs on an entire program
    + records each function call, then reports the number of calls plus the total and per-call time for each function
        + Downside: high overhead, do not use this for production work! Consider it only for development.

+ **`time:`** just a stopwatch 
    + you place it by hand around the code you want to measure, so it gives no per-function breakdown of an entire program 
    
* If your code takes a long time to run, use `time` (a single run) instead of `timeit` (many repeated runs): you get an answer sooner but lose some accuracy    

[external resources](https://www.infoworld.com/article/3600993/9-nifty-libraries-for-profiling-python-code.html)

+  **`Always consider what your 'profiler' is measuring to get an idea of whether it is best for your circumstance!`**

# `Time:` measuring the time for our code in a single run

# `Ex.)`

`from time import time`

`start_time = time()`

`your code here`

`end_time = time()`

`print(f'It took {end_time - start_time} seconds!')`
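
A runnable version of the stopwatch pattern above might look like the sketch below. Two assumptions are mine rather than the original's: `slow_sum` is a hypothetical workload, and `time()` is swapped for `time.perf_counter()`, a standard-library clock intended for measuring short durations.

```python
from time import perf_counter

def slow_sum(n):
    """Hypothetical workload: add up the integers below n in a loop."""
    total = 0
    for i in range(n):
        total += i
    return total

start_time = perf_counter()
result = slow_sum(1_000_000)      # the code being timed
end_time = perf_counter()

elapsed = end_time - start_time
print(f"slow_sum returned {result} in {elapsed:.4f} seconds")
```

Keep in mind this is a single run, so the number will vary from one execution to the next.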

# `timeit:` execution time over multiple passes ("runs")

# `Ex.)`
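
As a minimal sketch, assuming a hypothetical function `build_squares` stands in for the isolated code you want to benchmark, the standard-library `timeit` module can repeat the measurement for you:

```python
import timeit

def build_squares():
    """Hypothetical code under test: build a list of 1,000 squares."""
    return [i * i for i in range(1000)]

runs_per_measurement = 200

# Call the function 200 times per measurement, and take 5 measurements.
timings = timeit.repeat(build_squares, repeat=5, number=runs_per_measurement)

# Each entry is the total time for 200 calls; the minimum is usually the
# least disturbed by other activity on the machine.
best_per_call = min(timings) / runs_per_measurement
print(f"best of 5: {best_per_call * 1e6:.1f} microseconds per call")
```

In a notebook, the `%timeit build_squares()` magic does roughly the same thing and picks the number of runs automatically.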

**`Lastly, if you are timing long-running calls, consider %time instead of %timeit. While it is less precise, it is faster.`**

*https://scipy-lectures.org/advanced/optimizing/index.html*

`___________________________________________________`


# `cProfile:`
+ **`Measures wall clock time; think of this as elapsed time`**
    + Consider it as if we are measuring the time for a function to run
        + `You are NOT looking at every line of code!`
            + In that case you would need to do something else like a `line profiler`
+ Deterministic


# `EX.)`

Your plain old script, as you would call it from the command line:

`$ python3 some_file.py`

If you would like to use cProfile (the module name is case sensitive):

`$ python3 -m cProfile some_file.py`

Or Another example:

`python -m cProfile -s tottime some_program.py`

* `-s tottime` sorts the printed table by the total time spent inside each function itself (time spent in the functions it calls is not included)
    + anything near the top is what you focus on changing/optimizing if possible.
    
    
# `Ex.) `

+ If you would like to run the code on a block instead of the entire program, you can encapsulate everything:

`import cProfile`

`cp = cProfile.Profile()`

`cp.enable()`

`the code you want to profile goes here`

`cp.disable()`

`cp.print_stats()`
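
Filled in with a workload, the pattern above could look like this sketch. The functions `slow_square_sum` and `run_workload` are hypothetical, and routing the report through `pstats.Stats` is my addition so the table can be sorted by `tottime` and cut to the top few rows.

```python
import cProfile
import io
import pstats

def slow_square_sum(n):
    """Hypothetical hot spot: sum of squares in a plain loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def run_workload():
    return [slow_square_sum(20_000) for _ in range(20)]

cp = cProfile.Profile()
cp.enable()
results = run_workload()          # only this block is profiled
cp.disable()

# Collect the report in a string, sorted by time spent inside each function.
buffer = io.StringIO()
stats = pstats.Stats(cp, stream=buffer)
stats.sort_stats("tottime").print_stats(5)   # show the top 5 rows only
report = buffer.getvalue()
print(report)
```

Anything that ran outside the `enable()` / `disable()` pair, such as imports or setup, stays out of the table.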



**'code snippet from Toucan Toco, link below!'**


[command line flags & similar](https://www.ibm.com/docs/en/aix/7.1?topic=names-command-parameters) | [beginner command line flags](https://jgefroh.medium.com/a-beginners-guide-to-linux-command-line-56a8004e2471)

* `There is an issue that you need to consider: the default printout is a flat table of the functions called. It does not show how they relate to each other, such as which function depends on which`

# <font color=red>Cons</font>: 

**`1.) Large overhead`**

**`2.) The printout shows one line per function`**

**`3.) Real-world use will be an issue: expect your code to run slower while it is being profiled`**

**`4.) Very important note: a function may be slow in general, or slow only for specific inputs!`**

In [None]:
# code Example to see print out of table

import cProfile
import pandas as pd
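
To see what the printed table looks like without any extra dependencies, here is a small standard-library sketch. The functions `parse_row`, `load_rows` and `main` are hypothetical stand-ins for your own code; `print_callers` is included because it is one way to recover who-calls-whom information that the flat table leaves out.

```python
import cProfile
import io
import pstats

def parse_row(i):
    return str(i).zfill(6)

def load_rows(n):
    rows = []
    for i in range(n):
        rows.append(parse_row(i))
    return rows

def main():
    rows = load_rows(50_000)
    return len(rows)

profiler = cProfile.Profile()
row_count = profiler.runcall(main)     # profile a single call to main()

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
# Columns: ncalls, tottime (inside the function itself), percall,
# cumtime (including everything it calls), percall, filename:lineno(function)
stats.strip_dirs().sort_stats("cumulative").print_stats(10)
# List which functions called parse_row.
stats.print_callers("parse_row")
table = buffer.getvalue()
print(table)
```

Sorting by `cumulative` puts `main` at the top, which is useful for following the time downward through the call chain.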



`___________________________________________________`

# `Line Profiler:` *this is a good option if you know which block of code is slowing you down but are not sure where exactly* 



+ There is an important side note that easily gets overlooked. `If you run some code that uses a library or external package, keep in mind that computations may be running in the background without you being aware of them.`

If you would like to install:

`python -m pip install line_profiler`

Or if using Anaconda:

`conda install line_profiler`

or 

`pip install line_profiler`
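
Once it is installed, a minimal sketch of line-by-line profiling could look like the following. The function `normalize` and its data are hypothetical, and the guarded import is only there so the snippet still runs (without the report) when `line_profiler` is missing.

```python
try:
    from line_profiler import LineProfiler   # third-party package
except ImportError:
    LineProfiler = None                       # fall back to a plain run

def normalize(values):
    """Hypothetical function we suspect is slow somewhere inside."""
    total = sum(values)
    scaled = [v / total for v in values]
    return scaled

data = list(range(1, 1001))

if LineProfiler is not None:
    lp = LineProfiler()
    profiled_normalize = lp(normalize)   # wrap the function to be inspected
    result = profiled_normalize(data)
    lp.print_stats()                     # per-line hits and timings
else:
    result = normalize(data)
```

The report lists each line of `normalize` with how often it ran and how much time it took, which is the granularity cProfile does not give you.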

# `External Software to Visually See Usage`

`Ex.)`

+ Snakeviz
+ gprof2dot

# `Optimizing:`

+ Be careful when optimizing code: you can end up introducing code that is difficult to manage, update or read.


In [3]:
# Real World Usage (Example)

# Good reading resources:

https://nyu-cds.github.io/python-performance-tuning/02-cprofile/

https://cloud.google.com/profiler/docs/profiling-python



# `Please Like, Share &` <font color=red>SUB</font>`scribe`

# `Citations & Help:`

# ◔̯◔


https://www.toucantoco.com/en/tech-blog/python-performance-optimization

https://wiki.python.org/moin/PythonSpeed/PerformanceTips#Profiling_Code

https://stackoverflow.com/questions/582336/how-do-i-profile-a-python-script

https://machinelearningmastery.com/profiling-python-code/ 

https://www.infoworld.com/article/3600993/9-nifty-libraries-for-profiling-python-code.html

https://medium.com/@narenandu/profiling-and-visualization-tools-in-python-89a46f578989

https://towardsdatascience.com/how-to-profile-your-code-in-python-e70c834fad89

https://pythonspeed.com/articles/beyond-cprofile/

https://betterprogramming.pub/a-comprehensive-guide-to-profiling-python-programs-f8b7db772e6

https://medium.com/geekculture/profiling-and-optimizing-your-python-code-64fe694b7f7f

https://scipy-lectures.org/advanced/optimizing/index.html

https://docs.nersc.gov/development/languages/python/profiling-debugging-python/

https://medium.com/uncountable-engineering/pythons-line-profiler-32df2b07b290
