# <font size=12 color=red>Scaling Data with Limited Memory</font> (Part 1 of 3) 
# <font color=blue>Python Profiling:</font> *(using the right tool for the situation)*

# `Mr Fugu Data Science`

# (◕‿◕✿)

*Motivation*: `wouldn't it be better to see where the bottleneck is instead of relying on an estimate or a theoretical idea?`

+ If your program has a function that takes, say, 10 minutes to run, is that 30% or 90% of the total run time?
    + It also pays to check whether our own code is inefficient. But without experience or guidance, how would you know that your code is slow or poorly written in the first place?
    
*`Stepping back and doing a few things will aid in our inspection.`*
+ Investigate the run time of our program
    + Find functions that may be holding us back
        + If needed, go one level more granular and do a line-by-line search once you have narrowed down the problem areas

# **`Profiling:`** 

+ `Profiling means finding out how long specific operations take inside a script; it is NOT benchmarking`. A profiler collects statistics across the entire program as it runs.
    + To benchmark an isolated snippet instead, you would want to use something such as "timeit"

`Two types of Profiling:`

+ **Deterministic:**
    * monitors every event (function call, function return, exception); **`accurate, but adds performance overhead`**. This is better run on small functions or operations.

+ **Statistical:**
    * **`Less accurate, but uses less overhead`** because it takes periodic samples instead of recording every event.
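
To make the "monitoring events" idea concrete, here is a minimal sketch of a deterministic hook built on `sys.setprofile` from the standard library. The names `count_calls`, `helper`, and `work` are made up for illustration, and this only counts calls; a real deterministic profiler such as cProfile also records timings.

```python
import sys
from collections import Counter


def count_calls(func, *args, **kwargs):
    """Run func and count every Python-level function call it triggers."""
    counts = Counter()

    def on_event(frame, event, arg):
        # The interpreter invokes this hook on every call/return event,
        # which is where the overhead of deterministic profiling comes from.
        if event == "call":
            counts[frame.f_code.co_name] += 1

    sys.setprofile(on_event)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook again
    return result, counts


def helper(x):
    return x + 1


def work(n):
    total = 0
    for i in range(n):
        total += helper(i)
    return total


result, counts = count_calls(work, 5)
print(result, dict(counts))
```

Because the hook fires on every single call, the count is exact, but the program runs slower while it is installed. A statistical profiler would instead peek at the call stack at intervals and accept some inaccuracy.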
    
Something really useful and pretty cool is the call graph; look into **gprof2dot**, for example. It converts profiler output into a graph-like structure showing which functions call each other.
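
As a rough sketch of how a call graph could be produced: gprof2dot reads a stats file, so the first step is to have cProfile write one. The function `fib` and the file name `output.pstats` are placeholders chosen for this example; the shell command in the trailing comment assumes gprof2dot and Graphviz are installed.

```python
import cProfile
import os
import tempfile


def fib(n):
    # Deliberately naive recursion so the profile has plenty of calls.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


profiler = cProfile.Profile()
profiler.runcall(fib, 15)

# Save the raw statistics in the binary format that pstats and gprof2dot read.
stats_path = os.path.join(tempfile.gettempdir(), "output.pstats")
profiler.dump_stats(stats_path)
print("wrote", stats_path)

# Then, from a shell in the directory holding the file:
#   gprof2dot -f pstats output.pstats | dot -Tpng -o output.png
```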


# `Other Tools:`

+ vprof
+ pyflame
+ StackImpact
    
https://medium.com/@antoniomdk1/hpc-with-python-part-1-profiling-1dda4d172cdf

# `Native Python Profiling examples:`

+ **`Timeit:`** benchmarks code blocks or single lines of code
    + Not used on an entire program
    + Code needs to be isolated
    
+ **`cProfile:`** runs on the entire program
    + evaluates each function call, then gives the average time per call and the number of calls for each function
        + Downside: high overhead, so do not use this for production work! Consider it only for development.

+ **`Time:`** just a stopwatch 
    + you place the timing calls by hand, so it cannot break down an entire program for you
    
    

[external resources](https://www.infoworld.com/article/3600993/9-nifty-libraries-for-profiling-python-code.html)

**`Always consider what your 'profiler' is measuring, so you can judge whether it suits your circumstance!`**

# `Time:` measuring the time for our code in a single run

# `Ex.)`
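
A minimal sketch of the stopwatch approach, using `time.perf_counter` from the standard library. The function `slow_sum` is a stand-in invented for this example; swap in whatever code you want to time.

```python
import time


def slow_sum(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


start = time.perf_counter()            # start the stopwatch
result = slow_sum(1_000_000)
elapsed = time.perf_counter() - start  # stop the stopwatch

print(f"slow_sum returned {result} in {elapsed:.4f} s")
```

This is a single measurement, so it can swing from run to run depending on what else the machine is doing.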

# `Timeit:` execution time over multiple passes ("runs")

# `Ex.)`
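
One way this might look with the standard-library `timeit` module. The statement being timed (sorting a 1,000-element list) is only an illustrative choice; `repeat=5` and `number=1_000` are arbitrary values picked for the sketch.

```python
import timeit

# setup runs once per repeat and is not included in the timing.
setup = "data = list(range(1_000))"
stmt = "sorted(data, reverse=True)"

# 5 repeats, each executing the statement 1,000 times.
timings = timeit.repeat(stmt=stmt, setup=setup, repeat=5, number=1_000)

best = min(timings)
print(f"best of 5: {best / 1_000 * 1e6:.2f} microseconds per call")
```

Taking the minimum of the repeats is a common choice, since the slower runs mostly reflect interference from other processes rather than the code itself.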

# `cProfile:`
+ Measures wall-clock time; think of this as elapsed time
    + Consider it as measuring how long each function takes to run
        + `You are NOT looking at every line of code!`
            + For that you would need something else, like a `line profiler`
+ Deterministic


# `EX.)`

Your basic old script, run from the command line:

`$ python3 some_file.py`

If you would like to use cProfile (the module name is case-sensitive):

`$ python3 -m cProfile some_file.py`

[command line flags & similar](https://www.ibm.com/docs/en/aix/7.1?topic=names-command-parameters) | [beginner command line flags](https://jgefroh.medium.com/a-beginners-guide-to-linux-command-line-56a8004e2471)
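
cProfile can also be driven from inside a script instead of the command line. The sketch below assumes two toy functions, `square` and `sum_of_squares`, invented for the example; it profiles one call and prints the five most expensive entries sorted by cumulative time using `pstats`.

```python
import cProfile
import io
import pstats


def square(x):
    return x * x


def sum_of_squares(n):
    return sum(square(i) for i in range(n))


profiler = cProfile.Profile()
total = profiler.runcall(sum_of_squares, 10_000)  # profile just this call

# Format the collected statistics into a text table.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.strip_dirs().sort_stats("cumulative").print_stats(5)

report = buffer.getvalue()
print(report)
```

Each row of the table is one function, with its call count (`ncalls`), time spent in the function itself (`tottime`), and time including everything it called (`cumtime`).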

* `There is an issue you need to consider: the printout is a table of the functions that were called, but it gives no idea of how they relate to each other, such as which function depends on which`
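
One partial workaround, sketched here under the assumption that you profile from inside Python: `pstats` can list who called a given function through `print_callers`. The functions `parse`, `load`, and `main` are hypothetical stand-ins for a real program.

```python
import cProfile
import io
import pstats


def parse(line):
    return [int(token) for token in line.split(",")]


def load(lines):
    rows = []
    for line in lines:
        rows.append(parse(line))
    return rows


def main():
    rows = load(["1,2,3"] * 2_000)
    return sum(sum(row) for row in rows)


profiler = cProfile.Profile()
answer = profiler.runcall(main)

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).strip_dirs()
# The argument is a regular expression matched against "file:line(function)".
stats.print_callers(r"\(parse\)")

callers_report = buffer.getvalue()
print(callers_report)
```

The report shows `parse` on the left and the functions that called it on the right. The companion method `print_callees` goes in the other direction.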

**`Problems 'Cons':`**

1. Large overhead
2. Each function is represented by a single line of the printout
3. Real-world use will be an issue, and you should expect slower results
4. Very important note: a specific function may simply be slow, and a function may also be slow only for specific inputs!
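
The last point is easy to demonstrate. In this sketch the hypothetical function `contains` does a linear search, so the same function is fast when the target sits at the front of the list and slow when the target is missing; the list size and repeat count are arbitrary.

```python
import timeit


def contains(items, target):
    """Linear search: the cost depends on where (or whether) target appears."""
    for item in items:
        if item == target:
            return True
    return False


data = list(range(100_000))

# Same function, two different inputs.
best_case = timeit.timeit(lambda: contains(data, 0), number=50)
worst_case = timeit.timeit(lambda: contains(data, -1), number=50)

print(f"target at the front: {best_case:.5f} s")
print(f"target missing:      {worst_case:.5f} s")
```

If you only profiled with the first input, you would never see that this function can become a bottleneck.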

# `Profiling Static & Dynamic data`

# Line Profiler: 
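
A possible starting point, assuming the third-party `line_profiler` package is installed (`pip install line_profiler`). The function `normalize` is invented for illustration, and the fallback branch exists only so the sketch still runs without the package.

```python
try:
    from line_profiler import LineProfiler  # third-party package
except ImportError:
    LineProfiler = None


def normalize(values):
    total = sum(values)
    scaled = [v / total for v in values]
    return sorted(scaled)


values = list(range(1, 10_001))

if LineProfiler is not None:
    profiler = LineProfiler()
    profiled_normalize = profiler(normalize)  # wrap the function to time it line by line
    output = profiled_normalize(values)
    profiler.print_stats()                    # per-line hits and timings
else:
    output = normalize(values)
    print("line_profiler is not installed; ran normalize() without profiling")
```

Unlike cProfile, the printed table has one row per source line of `normalize`, which helps once you already know which function is the slow one.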

# `External Software to visually see usage`

`Ex.)`

# `Optimizing:`

+ Be careful when optimizing code: you can start introducing code that is difficult to manage, update, or read.


# Real World Usage (Example)

# Good Reading resource:

https://nyu-cds.github.io/python-performance-tuning/02-cprofile/

# `Please Like, Share &` <font color=red>SUB</font>`scribe`

# `Citations & Help:`

# ◔̯◔


https://www.toucantoco.com/en/tech-blog/python-performance-optimization

https://wiki.python.org/moin/PythonSpeed/PerformanceTips#Profiling_Code

https://stackoverflow.com/questions/582336/how-do-i-profile-a-python-script

https://machinelearningmastery.com/profiling-python-code/ 

https://www.infoworld.com/article/3600993/9-nifty-libraries-for-profiling-python-code.html

https://medium.com/@narenandu/profiling-and-visualization-tools-in-python-89a46f578989

https://towardsdatascience.com/how-to-profile-your-code-in-python-e70c834fad89

https://pythonspeed.com/articles/beyond-cprofile/

https://betterprogramming.pub/a-comprehensive-guide-to-profiling-python-programs-f8b7db772e6

https://medium.com/geekculture/profiling-and-optimizing-your-python-code-64fe694b7f7f