# 10:50 AM: Requests under the hood

[Cory Benfield](https://lukasa.co.uk)

[This is the requests library](https://github.com/kennethreitz/requests)

Why does code get messy?
* "You end up with a function where the only defined behavior is ... well it does whatever it does now, so let's not mess with it"
* "When it works, we don't think about it"
* Pragmatism
* The quick hacky fix is never temporary.
* Users don't complain about it
* Over time you enshrine every edge case ever into your code.
* Things are complex AND they evolve


**Hero-worship, while fine, is fundamentally predicated on a lie. Our heroes are not dramatically better at writing code than we are.**

* Any non-trivial software is flawed.
* We need to not be afraid of our mistakes.

Questions/Discussions:
* How do you balance the need to do things right with a need to immediately fix things? At some point, the hacks are okay, but acknowledge that they are hacks.
* Is it possible to deossify a codebase? When the combined weight of all the historical decisions makes it impossible to keep up, someone else will probably come up with something better.

# 11:30 AM: Type uWSGI; Press Enter; What Happens?

[Asheesh Laroia](https://github.com/paulproteus); [Philip James](https://twitter.com/phildini)

Previous talk: Type Python, Press Enter, What Happens


[YouTube link](https://www.youtube.com/watch?v=XVhSjZYwZJo)

[Slides](https://speakerdeck.com/phildini/type-python-press-enter-what-happens-pydx-2015)

Running uWSGI

```bash 
uwsgi --master --http :8000 --module catserve.wsgi 
```

**How does uWSGI handle processes?**

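* WSGI is a synchronous protocol, so a single worker handles one request at a time and one slow request blocks everyone else

To un-ruin the service, run two worker processes with `-p 2`; there are now 2 uWSGI workers.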
```bash 
uwsgi --master --http :8000 --module catserve.wsgi -p 2
```

**How does uWSGI handle networking?**

* Userland communicates with the kernel through syscalls

What has to happen before a client can connect to the uWSGI server?
1) uWSGI issues the `socket()` syscall to create the socket it will use for the port
    * The kernel hands uWSGI back a handle (a file descriptor) for the socket
2) `bind()` attaches the socket to the port
3) `listen()` tells the kernel to expose the port and make it available for incoming connections
4) This is when uWSGI spawns its workers, which keep the initial file descriptors. The worker processes issue `epoll_wait()`, which tells the kernel that they are ready and waiting for any connections.

So, when the browser connects to the port:
* Both worker processes try to "pick up the phone"
* One wins, and handles the connection through a separate socket.
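
The syscall sequence above can be sketched with Python's standard `socket` and `selectors` modules. This is only an illustration, not uWSGI's implementation: the helper names `make_listener` and `serve_one` and the `meow` reply are made up, and everything runs in one process instead of forked workers. On Linux, `selectors.DefaultSelector` is backed by epoll.

```python
import selectors
import socket


def make_listener(port=0):
    """socket() -> bind() -> listen(). Port 0 asks the kernel for a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", port))
    listener.listen()
    return listener


def serve_one(listener, reply=b"meow\n", timeout=5.0):
    """Wait until the listening socket is readable, then answer one client."""
    with selectors.DefaultSelector() as selector:  # epoll on Linux
        selector.register(listener, selectors.EVENT_READ)
        if not selector.select(timeout):
            return None  # nobody called within the timeout
    # accept() returns a new, separate socket dedicated to this client.
    conn, peer = listener.accept()
    with conn:
        conn.sendall(reply)
    return peer
```

Calling `make_listener()` and then `serve_one(listener)` while a client connects to the chosen port answers that single client; the listening socket itself stays open for the next caller.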

**Why use uWSGI?**
* Graceful code reloading
* Tuneability
* Security
* Making config files
* Other features

How does code reloading work?

* Start with a new version of the code in place
* uWSGI sends SIGHUP to the worker processes
* uWSGI tells any worker that is not busy to stop, and marks a busy one as "needs to exit" once it finishes
* A worker always checks whether it should exit before calling `epoll_wait()`
* uWSGI erases its own memory and becomes a new process with the same PID and the same socket, which never closes, so new connections can queue while the new code loads.
* Always waits for all worker processes to exit, and by default waits 60 seconds. You can make it reload worker processes one by one.

**Making sure the resource usage meets your needs**
* The `-p` parameter lets you change the # of processes

**Security**

> "Security is a word, that when you say it, everyone in the room feels inadequate"

```
GET / HTTP/1.1
HOST: catserve.io
```
vs. 
```
GET / HTTP/1.1
HOST: catserve.io
HOST: catserve.biz
```

* Interactions between the thing running your code and the rest of the stuff it runs could turn into security issues

**Config Files**

Translate the bash command from the beginning into:
```
[uwsgi]
master = 1
http = :8000
module = catserve.wsgi
processes = 2
```

**Features**
* Static file serving (high performance implementation in C)
* Max requests per worker
* Queuing systems
* HTTPS support, HTTP2 support
* uwsgitop
* memory-report
* async

# 12:10 PM: Grokking the GIL: Write Fast and Thread Safe Python
[Jesse Jiryu Davis](https://twitter.com/jessejiryudavis)

We're going to look at Python 2.7 source code. 

['This is the GIL'](https://github.com/certik/python-2.7/blob/master/Python/ceval.c)

GIL:
* Mutual exclusion lock
* Initialized at the beginning of interpreter start up
* The GIL is born locked by the main thread
* Mutex is a lock that threads have to hold whenever they are executing Python code

> One thread runs Python, while N others sleep or await I/O.

All of your threads can be doing other things. The only thing that two threads can't do at once in Python is run Python.

> "I like to think of my Python programs as an old mainframe computer."

How do we write fast and thread-safe Python with CPython today?

2 Kinds of Python multitasking
* Cooperative: a thread voluntarily drops the GIL while it does something else (such as waiting on I/O)
* Preemptive: the interpreter forces a thread to drop the GIL
    * Happens every 1K bytecodes in Python 2.
    * In Python 3, happens on a timer instead (every 15 milliseconds, per the talk).
    
    
**Cooperative**
    
`Py_BEGIN_ALLOW_THREADS`
   
[Socket Connection source code here](https://github.com/certik/python-2.7/blob/c360290c3c9e55fbd79d6ceacdfc7cd4f393c1eb/Modules/_multiprocessing/socket_connection.c)
```c
Py_BEGIN_ALLOW_THREADS
res = _conn_sendall(conn->handle, message, length+4);
Py_END_ALLOW_THREADS
```


**Preemptive**

Two step Python execution:
* Python text compiled into bytecode
* Bytecode is interpreted by the interpreter, which executes like a little VM according to the instructions in ceval.c
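
The two-step execution above can be observed with the standard `dis` module. A small sketch, assuming Python 3.4+ for `dis.get_instructions`; the function `increment` is made up for illustration, and the exact opcode names vary between Python versions. The point is that one line of Python text becomes several bytecode instructions, and the interpreter may switch threads between them.

```python
import dis

n = 0


def increment():
    global n
    n += 1  # one line of Python text, several bytecode instructions


# Collect the names of the bytecode instructions the function compiles to.
opnames = [instruction.opname for instruction in dis.get_instructions(increment)]

if __name__ == "__main__":
    dis.dis(increment)  # human-readable listing of the same bytecode
```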

[Back to ceval.c](https://github.com/certik/python-2.7/blob/master/Python/ceval.c)

Debuggers register trace functions, which the interpreter executes every time it starts a function call. 
` if (tstate->c_profilefunc != NULL)`

This is where the GIL forces preemptive multithreading
> `/* Give another thread a chance */` 

**How do we use this to write fast code?**

* Cooperative multitasking: finish jobs faster if the jobs primarily await I/O.
* Preemptive multitasking: simulate parallelism.

If there's a GIL, do we need to worry about thread safety?

Unsafe:
```python
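import sys
import threading

# Python 2 API: consider a thread switch after every bytecode so the race is
# easy to hit. (Python 3 replaced this with sys.setswitchinterval().)
sys.setcheckinterval(1)

n = 0


def foo():
    global n
    n += 1  # not atomic: the load, the add and the store are separate bytecodes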

threads = []

for i in range(1000):
    t = threading.Thread(target=foo)
    threads.append(t)

for t in threads:
    t.start()

for t in threads:
    t.join()

print(n)
```

Sometimes prints 1000, sometimes prints 998.

* You have to lock around shared mutable state
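
A minimal sketch of the locked version, using `threading.Lock` from the standard library. The names `counter`, `counter_lock`, `increment` and `run` are made up for this example. Holding the lock around the read-modify-write means a thread switch in the middle of `counter += 1` can no longer lose an update.

```python
import threading

counter = 0
counter_lock = threading.Lock()


def increment():
    global counter
    # Only one thread at a time may run the read-modify-write below.
    with counter_lock:
        counter += 1


def run(num_threads=1000):
    """Start num_threads threads that each increment once; return the total."""
    global counter
    counter = 0
    threads = [threading.Thread(target=increment) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```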

**So how do we make code go fast?**
* Concurrency: when a job finishes faster by waiting for multiple I/O operations at the same time.
* Parallelism: when a job finishes faster when multiple threads do things at the same time.
    * Python threads can't do this


Use threads to make a concurrent task go faster: for an I/O-bound task, threads can be a good answer.
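
A rough sketch of the I/O-bound case, assuming Python 3 (`concurrent.futures`). `time.sleep` stands in for a real network call because it releases the GIL the same way blocking I/O does; `fake_download`, `run_serial`, `run_threaded` and `timed` are hypothetical helpers, not code from the talk.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(delay):
    """Stand-in for blocking I/O: sleeping releases the GIL."""
    time.sleep(delay)
    return delay


def run_serial(delays):
    return [fake_download(d) for d in delays]


def run_threaded(delays):
    # One thread per job; they all wait at the same time.
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        return list(pool.map(fake_download, delays))


def timed(func, delays):
    """Return (result, elapsed seconds) for func(delays)."""
    start = time.perf_counter()
    result = func(delays)
    return result, time.perf_counter() - start
```

With four jobs that each wait 0.2 seconds, the serial version takes roughly the sum of the waits while the threaded version takes roughly the longest single wait.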

How to do parallelism despite the GIL:
* Processes are completely distinct; they have independent GILs and can run simultaneously on multi-core machines
    * Communication is a lot more complex
* Threads share memory
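
A sketch of the process-based approach with the standard `multiprocessing` module. It assumes a Unix-like system where the `fork` start method is available; elsewhere, use the default context and an `if __name__ == "__main__":` guard. `count_primes`, `worker` and `run_in_processes` are made-up names. Because processes do not share memory, each result has to be sent back explicitly, here over a pipe.

```python
import multiprocessing
import os


def count_primes(limit):
    """CPU-bound work: count the primes below limit by trial division."""
    count = 0
    for candidate in range(2, limit):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count


def worker(send_end, limit):
    # Runs in a child process with its own interpreter and its own GIL.
    send_end.send((os.getpid(), count_primes(limit)))
    send_end.close()


def run_in_processes(limits):
    """Run one process per limit; return a list of (pid, prime count)."""
    ctx = multiprocessing.get_context("fork")  # assumption: Unix-like OS
    jobs = []
    for limit in limits:
        recv_end, send_end = ctx.Pipe(duplex=False)
        proc = ctx.Process(target=worker, args=(send_end, limit))
        proc.start()
        send_end.close()  # the parent only reads from this pipe
        jobs.append((proc, recv_end))
    results = []
    for proc, recv_end in jobs:
        results.append(recv_end.recv())
        proc.join()
    return results
```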

[Article version of the talk with code](https://emptysqua.re/blog/series/grok-the-gil/)


# 4:30 PM: The Glory of pdb's set_trace

Nicole Zuckerman

* Inspect variables in real time
* Pretty-print values (`pp`) and see variables
* Look at the backtrace of the current function
```
(pdb) bt
```
* Look at whatever variables are in that context
* Move up or down the call stack
* Travel through execution.
* `q` doesn't clean up
* `l` or `list` shows the 5 lines above and the 5 lines below the line you're on.
* Change live code in real time


**Other debugging tools**

Dunder dicts
`foo.__dict__`

dir
`dir(obj)`
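
A small illustration of the difference between the two, using a made-up `Cat` class: `__dict__` holds only the attributes stored on the instance itself, while `dir()` also lists names inherited from the class.

```python
class Cat(object):
    species = "Felis catus"  # class attribute, shared by every instance

    def __init__(self, name):
        self.name = name  # instance attribute, stored in tom.__dict__

    def speak(self):
        return "meow"


tom = Cat("Tom")
instance_attrs = tom.__dict__  # only what lives on this instance
all_names = dir(tom)           # instance, class and inherited names
```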

**Pdb gotchas**
* Harder when code is not running in a place with a prompt (celery, cron job)
* Try not to use variable names that collide with pdb commands (e.g. `n`, `c`, `l`)
* `help` tells you what commands are available
* Hitting the "enter" key re-executes the previous command, like `!!` in bash

**Post-mortem debugging**
* `pdb.pm()` takes no arguments and enters post-mortem debugging of the last traceback.
* But if you use `pdb.post_mortem()` you can pass it a specific traceback
* Breakpoints are set by line number, and you can give a file name before the line number
* Conditional breakpoints or setting breaks at particular functions.
* Can drop into a pdb if/when a test fails.
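
A minimal sketch of post-mortem debugging with an explicit traceback. The wrapper `run_with_post_mortem` is hypothetical; `pdb.post_mortem` and `sys.exc_info` are standard library calls. When the wrapped function raises, an interactive `(Pdb)` prompt opens at the frame that raised, so this needs a terminal attached.

```python
import pdb
import sys


def run_with_post_mortem(func, *args):
    """Call func(*args); if it raises, open pdb at the failing frame."""
    try:
        return func(*args)
    except Exception:
        tb = sys.exc_info()[2]  # traceback of the exception being handled
        pdb.post_mortem(tb)     # interactive: inspect the failed frame
        raise                   # re-raise once the debugging session ends
```

For example, `run_with_post_mortem(lambda: 1 / 0)` would stop inside the lambda with the `ZeroDivisionError` traceback loaded.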

# 5:10 PM : Algorithmic Music Generation
Padmaja V Bhagwat

[Github](https://github.com/unnati-xyz/music-generation)

Steps to algorithmically generate music:

1. Convert mp3 files to np-tensors
2. Train the model
3. Generate the music

**To convert the mp3 files to np-tensors**
* Uncompress the music: mp3 to monaural WAV using LAME
* Divide it into blocks of equal size, zero-padding where needed
* Convert from the time domain to the frequency domain using a discrete Fourier transform
    * Use an RNN that remembers previous information
    * LSTM: remembering information for a longer period of time is something the model does by default
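
The blocking and transform steps above can be sketched in plain Python. This is an illustration only, not the project's code: it uses a naive O(n^2) DFT from the standard library so the example has no dependencies, whereas a real pipeline would more likely use NumPy's FFT. `split_into_blocks` and `dft` are made-up names.

```python
import cmath


def split_into_blocks(samples, block_size):
    """Split samples into equal-sized blocks, zero-padding the last one."""
    blocks = []
    for start in range(0, len(samples), block_size):
        block = list(samples[start:start + block_size])
        block.extend([0.0] * (block_size - len(block)))  # zero padding
        blocks.append(block)
    return blocks


def dft(block):
    """Naive discrete Fourier transform: time domain -> frequency domain."""
    n = len(block)
    return [
        sum(block[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
```

Each block of time-domain samples becomes a list of complex frequency components of the same length.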
    
LSTM Steps
* Forget gate layer
* Input gate layer + tanh layer decides what to update
* Old cell state gets updated
* Sigmoid layer decides output

Generating music
* Taking first chunk of training data as seed sequence
* Iteratively add sequences to the seed
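
The generation loop can be sketched as follows. `predict_next` is a hypothetical stand-in for the trained model, and feeding it a fixed-size window equal to the seed length is an assumption of this example, not something stated in the talk.

```python
def generate(predict_next, seed, num_steps):
    """Grow a sequence from a seed by repeatedly appending the model's output.

    predict_next: hypothetical callable mapping recent chunks to the next chunk.
    """
    sequence = list(seed)
    window = len(seed)  # assumption: the model sees a fixed-size context
    for _ in range(num_steps):
        next_chunk = predict_next(sequence[-window:])
        sequence.append(next_chunk)
    return sequence
```

With a toy predictor such as `lambda context: context[-1] + 1`, the seed `[1, 2, 3]` grows by one element per step.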

**Challenges**
* Data intensive
* Memory intensive (32 GiB of memory, 8 vCPUs, EBS-only, 64-bit platform)
* 14 hours to do 2K iterations

**Python libraries**
* LAME and SoX to convert mp3 files
* Numpy/Scipy
* Matplotlib
* Used Keras with Theano as backend
