# Crashing Python (but FASTER!)

Or, how I learned to stop worrying and love foreign function interfaces.

## Who

Paul Kehrer - just some guy who makes bad life choices

## What

A brief demo of cffi, a tool for invoking code using the C calling convention from Python...

...and __PyPy__, which you should *really* be using.

## Why

Sometimes you want to...

* leverage the vast body of existing C code out there.
* do something Python is bad at for a variety of reasons (cryptography!)
* wring every last drop of performance out of a specific task

...and sometimes you just want to watch Python burn.

#### Disclaimer

Apologies in advance for the verbosity of these slides. The presentation setup here prevents presenter's notes and I have the memory of a goldfish, so you're gonna see some text. (If it's in parentheses those sentences are not for you! Quit reading this part.)

Also, please ask questions. Stop me at any time.

## In the beginning

In the before time, in the long long ago, there was CPython, and it was good. When you wanted to call some C you got out your abacus and protractor and, after a few years of effort, got this.

```c
static PyObject *
ALG_getattro(PyObject *self, PyObject *attr)
{
	if (!PyString_Check(attr))
		goto generic;

	if (PyString_CompareWithASCIIString(attr, "digest_size")==0)
		return PyInt_FromLong(DIGEST_SIZE);
	if (PyString_CompareWithASCIIString(attr, "name")==0)
		return PyString_FromString(_MODULE_STRING);     /* we should try to be compatible with hashlib here */

  generic:
#if PYTHON_API_VERSION >= 1011          /* Python 2.2 and later */
	return PyObject_GenericGetAttr(self, attr);
#else
	if (PyString_Check(attr) < 0) {
		PyErr_SetObject(PyExc_AttributeError, attr);
		return NULL;
	}
	return Py_FindMethod(ALG_methods, (PyObject *)self, PyString_AsString(attr));
#endif
}
```
(Code on loan from the British Museum)

After several centuries of this, the world's brightest minds decided they could do better (as well as potentially support alternate runtimes), and so they did:

* SWIG (created ~1346 AD; also responsible for the bubonic plague)
* ctypes (created in 1839 concurrently with the daguerreotype)

These provided a variety of conveniences, but lacked a certain je ne sais quoi (that's French for they still sucked).

There's also Cython, but it isn't really meant for the same purpose so we're going to pretend it doesn't exist.

### cffi - a new challenger has entered the ring!

Pretend I knew how to insert a Street Fighter II image here.

I clearly didn't, but this is my presentation so we're operating in my reality. (Wait for knowing nods.)

Oh right, the presentation! Where was I?

In [2]:
import cffi

That little import carries some immense power. Let's do something simple with it. First, get an ffi instance.

In [2]:
ffi = cffi.FFI()

The ffi object has a few functions we care about:

* ```cdef``` -- This lets us define types and function signatures
* ```dlopen``` -- This lets us open a precompiled library and use the functions we defined via ```cdef```
* ```verify``` -- This lets us write our own C code to call libraries, write new functions, et cetera.
* ```new``` -- Create any type the ffi object knows about. This can be C builtins or types from ```cdef```
* ```gc``` -- Registers a destructor for an object. When the object returned by ```gc``` is garbage collected, the function you provided is called on the original pointer.

With that out of the way let's define a function we want to be available. ```sprintf``` is both dangerous and easily bound so obviously we should use that. C people can't resist danger's siren song.

In [3]:
ffi.cdef("int sprintf(char *, const char *, ...);")
lib = ffi.dlopen(None)

We dlopen'd None because when dealing with the normal C namespace we don't need to specify a library.

Now we'll create a new 50 byte character array and invoke sprintf with some arguments to populate it.

In [4]:
buf = ffi.new("char[50]")
data = ffi.new("char[]", "tech talk")
res = lib.sprintf(buf, "Hello %s!", data)
print(ffi.buffer(buf, res)[:])

Hello tech talk!


"Well that was a boring and obviously contrived example", says the audience while sneering at the gallant presenter.

Don't be so jaded, that was pretty cool. We wrote a few lines of Python and managed to invoke a C function and get its return value!
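The cell above appears to have been run on Python 2, where a plain string is already a byte string. On Python 3, cffi wants ```bytes``` for ```char *``` arguments, so a rough equivalent might look like the sketch below. It assumes a Unix-like system where ```ffi.dlopen(None)``` exposes the C library.

```python
import cffi

ffi = cffi.FFI()
ffi.cdef("int sprintf(char *, const char *, ...);")
lib = ffi.dlopen(None)  # assumption: Unix-like platform with libc in the process namespace

buf = ffi.new("char[50]")
data = ffi.new("char[]", b"tech talk")  # bytes, not str; variadic arguments must be cdata
res = lib.sprintf(buf, b"Hello %s!", data)

greeting = ffi.buffer(buf, res)[:]  # copy exactly `res` bytes out of the C buffer
print(greeting.decode("ascii"))
```

The return value of ```sprintf``` is the number of characters written (not counting the terminating NUL), which is why it doubles as the slice length.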

### Still boring and contrived

In [8]:
ffi = cffi.FFI()
ffi.cdef("char *hello_world(void);")
lib = ffi.verify("""
#include <stdlib.h>
#include <string.h>

char *hello_world(void) {
    char *hello = (char *)malloc(6);
    strcpy(hello, "hello");
    return hello;
}
""")
print(ffi.string(lib.hello_world()))

hello


Rather than ```dlopen``` an existing library, we define a function and an implementation using ```verify``` and then call it. You can use this to write self-contained C like the example above or write code that interfaces with other C libraries to simplify the interface to the Python layer. This particular example leaks memory, because nothing ever frees the buffer that ```hello_world``` allocated with ```malloc``` (this will be a recurring theme).
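One way to sidestep that leak is to let Python own the memory: allocate the buffer with ```ffi.new``` and have C fill it in. The sketch below swaps the custom ```hello_world``` function for libc's ```strcpy``` so it needs no compiler; that substitution, and the Unix-like ```dlopen(None)```, are assumptions made for illustration.

```python
import cffi

ffi = cffi.FFI()
ffi.cdef("char *strcpy(char *, const char *);")
lib = ffi.dlopen(None)  # assumption: Unix-like platform

# Memory from ffi.new() belongs to the Python object and is released
# automatically once `hello` is garbage collected, so nothing leaks.
hello = ffi.new("char[6]")  # 5 characters plus the terminating NUL
lib.strcpy(hello, b"hello")

message = ffi.string(hello)
print(message)
```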

### Something more useful

Let's encrypt something!

In [5]:
ffi = cffi.FFI()
ffi.cdef("""
typedef ... ENGINE;
typedef ... EVP_CIPHER;
typedef ... EVP_CIPHER_CTX;
int EVP_CipherInit_ex(EVP_CIPHER_CTX *, const EVP_CIPHER *, ENGINE *,
                      const unsigned char *, const unsigned char *, int);
int EVP_CipherUpdate(EVP_CIPHER_CTX *, unsigned char *, int *,
                     const unsigned char *, int);
int EVP_CipherFinal_ex(EVP_CIPHER_CTX *, unsigned char *, int *);
int EVP_CIPHER_CTX_cleanup(EVP_CIPHER_CTX *);
void EVP_CIPHER_CTX_init(EVP_CIPHER_CTX *);
EVP_CIPHER_CTX *EVP_CIPHER_CTX_new(void);
void EVP_CIPHER_CTX_free(EVP_CIPHER_CTX *);
const EVP_CIPHER *EVP_get_cipherbyname(const char *);
""")
lib = ffi.dlopen("libcrypto")

What did we just do?!

* Set up some opaque typedefs
* Defined a bunch of functions
* Opened the libcrypto library (this can also be done with an absolute path)

Warning, you are entering a C zone. Memory leaks and caveats abound. Don't use this code! We are going to...

* Create a key
* Create an initialization vector
* Assign some plaintext data (a multiple of 16 bytes long, natch)
* Get an EVP_CIPHER \* using a string OpenSSL understands (cffi can automatically convert Python strings for use with char \* arguments)
* Initialize a context and set up the encryption operation
* Create a buffer and an int \* for the function to store its output
* Fetch the output

In [6]:
import binascii
import os
key = os.urandom(32)
iv = os.urandom(16)
pt = "my data is so very confidential."
evp_cipher = lib.EVP_get_cipherbyname("aes-256-cbc")
assert evp_cipher != ffi.NULL
ctx = lib.EVP_CIPHER_CTX_new()
assert ctx != ffi.NULL
lib.EVP_CIPHER_CTX_init(ctx)
res = lib.EVP_CipherInit_ex(ctx, evp_cipher, ffi.NULL, ffi.NULL, ffi.NULL, 1)
assert res != 0
res = lib.EVP_CipherInit_ex(ctx, ffi.NULL, ffi.NULL, key, iv, 1)
assert res != 0
buf = ffi.new("unsigned char[]", 32)
outlen = ffi.new("int *")
res = lib.EVP_CipherUpdate(ctx, buf, outlen, pt, len(pt))
assert res != 0
ct = ffi.buffer(buf)[:outlen[0]]
res = lib.EVP_CipherFinal_ex(ctx, buf, outlen)
assert res != 0
ct += ffi.buffer(buf)[:outlen[0]]
print(binascii.hexlify(ct))

6a13574342d15d1d42c761951f1b44959cf4b238c8a34a1315fcec92a1c6d4d56250cfa6acf27abc30a7c8ca1f206c54


And now, decryption...

In [7]:
ctx = lib.EVP_CIPHER_CTX_new()
assert ctx != ffi.NULL
lib.EVP_CIPHER_CTX_init(ctx)
res = lib.EVP_CipherInit_ex(ctx, evp_cipher, ffi.NULL, ffi.NULL, ffi.NULL, 0)
assert res != 0
res = lib.EVP_CipherInit_ex(ctx, ffi.NULL, ffi.NULL, key, iv, 0)
assert res != 0
buf = ffi.new("unsigned char[]", 32)
outlen = ffi.new("int *")
res = lib.EVP_CipherUpdate(ctx, buf, outlen, ct, len(ct))
decrypted = ffi.buffer(buf)[:outlen[0]]
res = lib.EVP_CipherFinal_ex(ctx, buf, outlen)
decrypted += ffi.buffer(buf)[:outlen[0]]
print(decrypted)

my data is so very confidential.


### Mission Accomplished

Except for those pesky memory issues...

cffi lets you do C things. That includes all the good and all of the bad. What did we do wrong in the previous example?

* We leaked memory with the EVP_CIPHER_CTX \* we created using EVP_CIPHER_CTX_new.
* We didn't check the return value on some of our decryption calls
* We created a return buffer of exactly 32 bytes. This is safe in this particular case, but knowing the absolute maximum amount of data that can be written back to a buffer is critically important in C
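On that last point: OpenSSL's documentation for the EVP update calls puts the worst case at the input length plus one cipher block, so sizing the buffer from the input is a safer habit than hard-coding 32. A minimal sketch, where ```new_cipher_buffer``` is a hypothetical helper name and not part of cffi or OpenSSL:

```python
import cffi

ffi = cffi.FFI()

AES_BLOCK_SIZE = 16  # bytes; the same for every AES key size


def new_cipher_buffer(input_len, block_size=AES_BLOCK_SIZE):
    """Allocate an output buffer large enough for one EVP_CipherUpdate call."""
    # Upper bound used here: the input length plus one full cipher block.
    return ffi.new("unsigned char[]", input_len + block_size)


buf = new_cipher_buffer(32)
print(len(buf))
```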

### Crashing!

In [8]:
ffi = cffi.FFI()
ffi.cdef("int sprintf(char *, const char *, ...);")
lib = ffi.dlopen(None)
buf = ffi.new("char[5]")
lib.sprintf(buf, "this is quite a bit longer than 5 bytes oh no")
# buffer overflow!

45

Remember how I said ```sprintf``` was dangerous? It doesn't know or care how many bytes I actually allocated to the buffer I passed it. It will just blithely write the entire string to that char \* until it's done. This means that while we allocated 5 bytes, in reality it wrote 46 bytes (the 45 characters it reported plus a terminating NUL)...41 of which were into some part of memory it didn't own. Whoops.
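For contrast, here is a hedged sketch of the bounded cousin, ```snprintf```, which takes the buffer size and stops writing when it runs out of room. Same assumption as before: a Unix-like system where ```dlopen(None)``` gives us libc.

```python
import cffi

ffi = cffi.FFI()
ffi.cdef("int snprintf(char *, size_t, const char *, ...);")
lib = ffi.dlopen(None)  # assumption: Unix-like platform

buf = ffi.new("char[5]")
text = b"this is quite a bit longer than 5 bytes oh no"

# snprintf writes at most len(buf) bytes, including the terminating NUL.
needed = lib.snprintf(buf, len(buf), text)

# The return value is the length it *wanted* to write, so this detects truncation.
truncated = needed >= len(buf)
print(ffi.string(buf), needed, truncated)
```

The output is cut short, but nothing outside the 5 bytes we own gets touched.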

In [None]:
ffi = cffi.FFI()
ffi.cdef("""
typedef ... EVP_CIPHER_CTX;
EVP_CIPHER_CTX *EVP_CIPHER_CTX_new(void);
void EVP_CIPHER_CTX_free(EVP_CIPHER_CTX *);
""")
lib = ffi.dlopen("libcrypto")
ctx = lib.EVP_CIPHER_CTX_new()
ctx = ffi.gc(ctx, lib.EVP_CIPHER_CTX_free)
# do something with the ctx
lib.EVP_CIPHER_CTX_free(ctx)

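In this case the explicit ```EVP_CIPHER_CTX_free``` call frees the context right away, and at some later date (when the gc chooses to run) the function registered with ```ffi.gc``` frees it a second time. That double free is where it will crash, because you're freeing memory you no longer own.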
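The way out is to pick exactly one owner. If you hand a pointer to ```ffi.gc```, never free it by hand. The sketch below uses libc's ```malloc```/```free``` as a stand-in for the OpenSSL context functions (an assumption, so it runs without libcrypto), and ```counting_free``` is a hypothetical wrapper that only exists to show the destructor runs once.

```python
import gc

import cffi

ffi = cffi.FFI()
ffi.cdef("void *malloc(size_t); void free(void *);")
lib = ffi.dlopen(None)  # assumption: Unix-like platform

free_calls = []


def counting_free(ptr):
    # Hypothetical wrapper so we can observe how often the memory is released.
    free_calls.append(1)
    lib.free(ptr)


raw = lib.malloc(16)
assert raw != ffi.NULL
owned = ffi.gc(raw, counting_free)  # from here on, only the gc wrapper frees it

# ... do something with `owned`, but never call lib.free(owned) yourself ...

del owned, raw
gc.collect()
print(len(free_calls))
```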

In [None]:
ffi = cffi.FFI()
ffi.cdef("""
typedef struct {
    int *value;
} THING;
int get_thing_value(THING *);
""")
lib = ffi.verify("""
typedef struct {
    int *value;
} THING;
int get_thing_value(THING *thing) {
    return *thing->value;
}
""")
def create_thing():
    thing = ffi.new("THING *")
    value = ffi.new("int *", 1001)
    thing.value = value
    return thing

thing = create_thing()
print(lib.get_thing_value(thing))

So...what will that code do?

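If you guessed "maybe work, maybe give garbage data" you win! The struct stores only the raw address held by ```value```. Once ```create_thing``` returns, nothing references ```value``` anymore, so cffi frees that memory and ```thing.value``` is left pointing at whatever happens to live there next.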
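One possible fix is to keep both allocations alive for the same lifetime. The sketch below does that with a small wrapper class; ```Thing``` is a hypothetical name, and the struct is declared in ABI mode so no compiler is needed.

```python
import gc

import cffi

ffi = cffi.FFI()
ffi.cdef("typedef struct { int *value; } THING;")


class Thing(object):
    """Hypothetical wrapper that owns both allocations for the same lifetime."""

    def __init__(self, number):
        self._value = ffi.new("int *", number)  # kept alive by this attribute
        self.ptr = ffi.new("THING *")
        self.ptr.value = self._value  # the struct stores only the raw address


thing = Thing(1001)
gc.collect()  # `_value` survives because `thing` still references it
print(thing.ptr.value[0])
```

Anything that needs a ```THING *``` would be passed ```thing.ptr```, and the integer stays valid for as long as ```thing``` does.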

### Other Cool Things

* Invoke C callbacks with functions written in Python
* Define opaque or fully specified structs and any amount in between
* Treat C arrays as iterables with proper bounds checking
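The array point is easy to try for yourself. A minimal sketch (the variable names are made up for illustration):

```python
import cffi

ffi = cffi.FFI()
squares = ffi.new("int[5]", [0, 1, 4, 9, 16])

as_list = list(squares)  # arrays with a known length are iterable
total = sum(squares)

try:
    squares[5] = 25  # one past the end
except IndexError:
    out_of_bounds = True  # cffi refuses instead of scribbling on memory
else:
    out_of_bounds = False

print(as_list, total, out_of_bounds)
```

Note that the check only applies to arrays whose length cffi knows. A bare pointer such as ```int *``` gets no such protection.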

Remember at the beginning of this interminable talk when I said "C calling convention"? Other languages can compile libraries that use that convention. Languages like Rust!

```rust
#![crate_type = "dylib"]


#[no_mangle]
pub extern fn square(value: i32) -> i32 {
        value * value
}
```

rustc square.rs

In [3]:
ffi = cffi.FFI()
ffi.cdef("int square(int);")
lib = ffi.dlopen("libsquare.dylib")
lib.square(3)

9

### cffi challenges

* cffi likes to compile on import. That's great for development, terrible for shipping packages. Weird workarounds are required (see: setup.py in pyca/cryptography)
* The nature of the way cffi does things means you have runtime dependencies on cffi, pycparser, and a slow import (~0.7s on a modern Core i7 for pyca/cryptography, tens of seconds on a Raspberry Pi...ugh)

But help is on the way. cffi 1.0 is designed to have a pre-compile stage. This will remove the runtime dependency on cffi and drastically speed up import.

Coming Real Soon Now™ (1.0dev2 currently in PyPI if you're feeling adventurous)
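In released versions of cffi this pre-compile stage is called out-of-line mode. A rough sketch of the ABI flavor follows, which generates a plain Python module and needs no C compiler. The module name ```_abs_abi``` and the choice of libc's ```abs``` are assumptions for illustration, and the build and import steps would normally live in separate files.

```python
import importlib
import sys
import tempfile

import cffi

# Build step (normally run once, for example from setup.py).
ffibuilder = cffi.FFI()
ffibuilder.set_source("_abs_abi", None)  # None selects ABI mode: no C compiler needed
ffibuilder.cdef("int abs(int);")
build_dir = tempfile.mkdtemp()
ffibuilder.compile(tmpdir=build_dir)  # writes _abs_abi.py into build_dir

# Import step (what end users run): the cdef is not parsed again here.
sys.path.insert(0, build_dir)
importlib.invalidate_caches()
mod = importlib.import_module("_abs_abi")
lib = mod.ffi.dlopen(None)  # assumption: Unix-like platform
result = lib.abs(-9)
print(result)
```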

### finally...

When writing cffi code follow these simple rules:

* Don't do it unless you legitimately need it
* Check every return code and verify every memory allocation/release (you can use ```ffi.gc``` to register free functions to be called when the Python variable goes out of scope, but that can lead to use-after-free bugs, so be careful!)
* Assume everything can and will fail
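For the return code rule, a tiny helper beats scattering bare asserts around (asserts vanish under ```python -O```). This is only a sketch: ```check``` and ```OpenSSLCallError``` are hypothetical names, and it assumes the convention of the EVP calls used earlier, which return 1 on success.

```python
class OpenSSLCallError(Exception):
    """Hypothetical exception type for a failed C call."""


def check(res, what):
    # Assumption: like the EVP_Cipher* calls above, 1 means success.
    if res != 1:
        raise OpenSSLCallError("%s failed (returned %r)" % (what, res))
    return res


# Usage: wrap each call, e.g. check(lib.EVP_CipherUpdate(...), "EVP_CipherUpdate")
ok = check(1, "EVP_CipherInit_ex")

try:
    check(0, "EVP_CipherFinal_ex")
except OpenSSLCallError as exc:
    failure = str(exc)
print(ok, failure)
```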

### @reaperhulk on Freenode, Twitter, et al

Sometimes I write this thing:
https://github.com/pyca/cryptography