# Python + HPC
### Jefferson Fialho Coelho
### 17200455 - ADS - Evening



### Agenda

* HPC
    * HPC Cluster Architecture
    * Parallel programming and CPU architecture
    * Vectorization
* Python
    * A little bit about the language
    * GIL
    * Experiments
        * numexpr
* Final explanations
* When am I going to use this thing in my life?
* Career and opportunities


# HPC
## High Performance Computing

The practice of aggregating **computing power** in a way that **delivers much higher performance** than one could get out of a typical desktop computer or workstation in order to solve large **problems in science**, **engineering**, or **business**.




### HPC Cluster Architecture

![title](img/HPC-cluster-architecture.png)

## Parallel programming

![title](img/parallel.png)

### CPU architecture example

![title](img/tile.png)

# Vectorization

![title](https://datascience.blog.wzb.eu/wp-content/uploads/10/2018/02/vectorization.png)
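A minimal sketch of what vectorization looks like from the Python side, assuming only that NumPy is installed: the same element-wise addition written as an explicit Python loop and as a single array expression. The function names `add_loop` and `add_vectorized` are made up for illustration. Whether the compiled loop actually uses SIMD instructions depends on the NumPy build and the CPU.

```python
import numpy as np

def add_loop(a, b):
    """Add two arrays one element at a time in the Python interpreter."""
    out = np.empty_like(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out

def add_vectorized(a, b):
    """Add two arrays with one array expression; the loop runs in compiled code."""
    return a + b

a = np.arange(8, dtype=np.float64)
b = np.full(8, 10.0)

loop_result = add_loop(a, b)
vector_result = add_vectorized(a, b)
```

Both functions return the same values; the difference is where the loop runs.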

### [TOP 1 = SUMMIT: DOE/SC/OAK RIDGE NATIONAL LABORATORY](https://www.top500.org/resources/top-systems/summit-doescoak-ridge-national-laboratory/)
### TOP500.org

![title](img/summit-supercomputer-800x450.jpg)

![title](https://sempreupdate.com.br/wp-content/uploads/2018/12/python.jpg)

### Python

* Created by **Guido van Rossum** and first released in **1991**
* Python has been used in both **industry and academia** for well over a decade
* **Interpreted, high-level, general-purpose programming language**
* This **high-productivity language** has become one of the most popular abstractions for **scientific computing** and **machine learning**
    * Yet the base Python language remains **single-threaded** =(

* The language's core philosophy ([The Zen of Python](https://www.python.org/dev/peps/pep-0020/)):
    * **Beautiful** is better than ugly
    * **Explicit** is better than implicit
    * **Simple** is better than complex
    * **Complex** is better than complicated
    * **Readability** counts
    
## Just how is productivity in these fields being maintained with a single-threaded language?

## The reason is the level of abstraction the language design adopted. 

* It ships with many tools to wrap C code
* In the base language it favors multiprocessing over multithreading (the `multiprocessing` package in the Python standard library)
* The community has adopted the paradigm of dispatching to faster C-based libraries (e.g. Intel® MKL, OpenBLAS), and this has become the preferred way to implement parallelism in Python
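One way to picture the dispatch pattern, as a sketch rather than a benchmark: the same dot product written as a pure-Python loop and as a call to `np.dot`, which hands the work to compiled code (and to a BLAS library such as OpenBLAS or Intel MKL when NumPy is built against one). The helper `dot_pure_python` is hypothetical and exists only for comparison.

```python
import numpy as np

def dot_pure_python(u, v):
    """Dot product computed element by element in the interpreter."""
    total = 0.0
    for ui, vi in zip(u, v):
        total += ui * vi
    return total

rng = np.random.default_rng(seed=0)
u = rng.random(10_000)
v = rng.random(10_000)

slow = dot_pure_python(u, v)  # every multiply-add goes through Python bytecode
fast = float(np.dot(u, v))    # one call; the loop runs inside compiled code
```

The two results agree up to floating-point rounding; only the second leaves the interpreter for the duration of the loop.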

##  GIL
* The **Global Interpreter Lock**, or [GIL](https://wiki.python.org/moin/GlobalInterpreterLock), is a mutex that **protects access** to Python objects, **preventing multiple threads** from executing Python bytecodes at once. 
* This lock is necessary mainly because **CPython's memory management is not thread-safe**.
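A rough sketch of the GIL's effect, assuming a standard (GIL-enabled) CPython build: the same CPU-bound function is run four times in a row and then in four threads. The names `count_down`, `run_serial`, and `run_threaded` are illustrative, and the timings depend entirely on the machine, so none are shown here.

```python
import threading
import time

def count_down(n):
    """CPU-bound work done entirely in Python bytecode."""
    while n > 0:
        n -= 1
    return n

def run_serial(n, repeats):
    """Run count_down `repeats` times, one after another."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start

def run_threaded(n, repeats):
    """Run count_down in `repeats` threads at the same time."""
    threads = [threading.Thread(target=count_down, args=(n,))
               for _ in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

serial_seconds = run_serial(200_000, 4)
threaded_seconds = run_threaded(200_000, 4)
print(f"serial:   {serial_seconds:.3f} s")
print(f"threaded: {threaded_seconds:.3f} s")
```

Because only one thread can execute Python bytecode at a time, the threaded run is typically no faster than the serial one for work like this.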

## How to cheat? 🤔

![title](img/python-gil-visualization.png)

# Experiments


## NumExpr 2.0

The numexpr package supplies routines for the fast, element-wise evaluation of array expressions using vectorization.

In [None]:
import math
import sys
import numexpr as ne
import time
import matplotlib.pyplot as plt
import numpy as np

# 10⁶ values between -1.0 and 1.0
x = np.linspace(-1., 1., int(1e6))

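# Compute sin(2*pi*x) with NumPy, repeated n_runs times (serial)
def calc_single(x, n_runs):
    for _ in range(n_runs):
        sin = np.sin(2. * np.pi * x)
    return sin

# Compute sin(2*pi*x) with numexpr, repeated n_runs times (parallel)
def calc_parallel(x, n_runs):
    pi = np.pi  # numexpr looks up `pi` and `x` in the calling frame
    for _ in range(n_runs):
        sin = ne.evaluate('sin(2*pi*x)',
                          optimization='aggressive')
    return sin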


In [None]:
# Get time of 500 serial runs of sin calc
%timeit calc_single(x,500)

# Plot an example of sin
plt.plot(x, calc_single(x,1))

In [None]:
# Get time of 500 parallel runs of sin calc
%timeit calc_parallel(x,500)

# Plot an example of sin
plt.plot(x, calc_parallel(x,1))

## How does numexpr achieve nearly a 25x speedup?

* It uses **vectorized routines** from the vector math library in Intel MKL (when numexpr is built with MKL support).
* The entire computation **stays in low-level** code before completing and **returning the result** back to the **Python layer**. 
* This method also **avoids multiple trips through the Python interpreter**, cutting down on single-threaded sections while also providing a concise syntax.
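A small sketch of the points above, assuming `numexpr` and NumPy are installed: NumPy evaluates the expression one operator at a time and allocates intermediate arrays, while `ne.evaluate` compiles the whole string and evaluates it in compiled code. The array names `a` and `b` are illustrative, and the variables are passed explicitly through `local_dict` instead of relying on frame lookup.

```python
import numexpr as ne
import numpy as np

a = np.linspace(0.0, 1.0, 1_000_000)
b = np.linspace(1.0, 2.0, 1_000_000)

# NumPy: each operator is a separate call that returns a temporary array
numpy_result = 2.0 * a + 3.0 * b ** 2

# numexpr: the full expression is parsed once and evaluated in compiled code
numexpr_result = ne.evaluate("2.0 * a + 3.0 * b ** 2",
                             local_dict={"a": a, "b": b})
```

The two results match up to floating-point rounding; any speed difference depends on the array size, the number of cores, and how numexpr was built.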

## Final explanations

* ⬆ Complex, but **simple** (concise)
* ⬆ **Easy to learn** (?)
* ⬆ **Readable** documentation
* ⬆ It's possible to extract **high performance**
* ⬇ Can be a **black box** (in some situations)

## Nice, but when am I going to use this thing in my life?

![title](https://i.imgur.com/U4SezaC.png)

![title](img/salt.png)

## [Video: AI turns rough sketches into realistic landscapes (GauGAN)](https://mashable.com/video/ai-rough-sketches-realistic-landscapes-gaugan/)

# Career and opportunities


![title](https://www.daxx.com/uploads/most-wanted-languages-in-programming-usa-201820181106.png)
[Source: Stack Overflow Developer Survey 2018](https://insights.stackoverflow.com/survey/2018/)

![title](img/js.png)
[Source: Glassdoor](https://www.glassdoor.com/Salaries/javascript-developer-salary-SRCH_KO0,20.htm)

![title](https://www.daxx.com/uploads/average-salary-for-python-developers-by-state20181116.png)
[Source: DAXX](https://www.daxx.com/blog/development-trends/python-developer-salary-usa)

## Brazil - Average Python developer salaries (per month)

![title](img/pysalario.png)
[Source: Lovemondays](https://www.lovemondays.com.br/salarios/cargo/salario-desenvolvedor-python)

## Brazil - Job position salaries (per month)

![title](img/pytabela.png)
[Source: Lovemondays](https://www.lovemondays.com.br/salarios/cargo/salario-desenvolvedor-python)

# Thanks! 🤙


### Presentation source 
#### GitHub:

* [http://tiny.cc/tei_jeff](http://tiny.cc/tei_jeff)