# <span style="color:blue">Bulletproofing Code</span>

## What could go wrong? Unfortunately, lots!

![](images/worst_thing_that_could_happen.png)

## <span style="color:darkorange">Potential Problems: </span>
- Bugs (code crashes, brittle to unexpected inputs)
- Code "works", but gives incorrect results
- Cannot reliably and automatically generate the same results each time
- External resources, like code dependencies and data, change outside your control
- Code is slow and/or uses a lot of memory
- Your code is hard to understand
- Your code is hard to change

## <span style="color:darkorange">Make your code work first before trying to optimize it</span>

![](images/knuth.jpg)

# <span style="color:blue">Example: Extracting Information from Earnings Calls</span>

In [None]:
from pathlib import Path
import json

In [None]:
base_path = Path("../tests/data/ciq_transcript_samples")

In [None]:
sample_file = base_path / "Tesla/transcript_id__1025482.json"
text = sample_file.read_text()
transcript_dict = json.loads(text)
components = transcript_dict["components"]
tesla_answers = [c for c in components if c["componenttypename"] == "Answer"]

In [None]:
sample_file = base_path / "GM/transcript_id__1000656.json"
text = sample_file.read_text()
transcript_dict = json.loads(text)
components = transcript_dict["components"]
gm_answers = [c for c in components if c["componenttypename"] == "Answer"]
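
The two cells above differ only in the file name, so the shared steps could be wrapped in one function. This is a minimal sketch, assuming the JSON layout used above (a top-level `components` list whose items carry `componenttypename`). `load_answers` is a hypothetical helper name, not part of the workshop code.

```python
import json
from pathlib import Path


def load_answers(path):
    """Return the "Answer" components of one transcript JSON file."""
    transcript_dict = json.loads(Path(path).read_text())
    components = transcript_dict["components"]
    return [c for c in components if c["componenttypename"] == "Answer"]
```

With this in place, each company needs a single call, for example `tesla_answers = load_answers(base_path / "Tesla/transcript_id__1025482.json")`.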

In [None]:
# The loop only kept the final item, so take the last Tesla answer directly
text = tesla_answers[-1]["text"]
text

In [None]:
import textacy
doc = textacy.make_spacy_doc(text)

# <span style="color:blue">Testing, Error Detection, and Profiling</span>

![](images/165-minor-change.png)

## Testing

In [None]:
import numpy as np
def exponentiate(num):
    return np.exp(num)

In [None]:
exponentiate(3)
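
Calling the function by hand shows a value, but it does not check it. One way to turn that call into a repeatable test is sketched below, assuming the tests are plain `assert` statements inside functions named `test_*`, which a runner such as pytest would collect if they were saved in a test file. The test function names are illustrative.

```python
import math

import numpy as np


def exponentiate(num):
    return np.exp(num)


def test_exponentiate_scalar():
    # e**0 is exactly 1; other values are compared with a tolerance
    assert exponentiate(0) == 1.0
    assert math.isclose(exponentiate(3), math.e ** 3)


def test_exponentiate_array():
    # np.exp works element-wise, so arrays should pass through too
    result = exponentiate(np.array([0.0, 1.0]))
    assert np.allclose(result, [1.0, math.e])
```

Floating-point results are compared with `math.isclose` and `np.allclose` rather than `==`, because exact equality is brittle for values that cannot be represented exactly.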

## Exception Handling

In [None]:
try:
    print("trying to open a file...")
    path = Path("a/path/to/nowhere")
    text = path.read_text()  # raises FileNotFoundError: the path does not exist
    print("found it!")
except FileNotFoundError:
    print("doh!")
finally:
    print("we're done here")
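
Catching a specific exception type, rather than everything, keeps unrelated bugs from being silently swallowed. The sketch below applies that idea to loading a transcript file. `read_transcript` is a hypothetical helper, and returning `None` for a missing file is an assumption about what the caller wants.

```python
import json
from pathlib import Path


def read_transcript(path):
    """Load a transcript JSON file, or return None if the file is missing."""
    path = Path(path)
    try:
        text = path.read_text()
    except FileNotFoundError:
        # Expected failure: report it and let the caller decide what to do
        print(f"no such file: {path}")
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # Unexpected failure: re-raise with the file name attached
        raise ValueError(f"{path} is not valid JSON") from err
```

A missing file is treated as recoverable, while a corrupt file raises an error that names the file, so the failure is easy to trace.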

## Profiling

Example from: https://realpython.com/numpy-array-programming/

In [1]:
import numpy as np
np.random.seed(444)

In [2]:
x = np.random.choice([False, True], size=100000)
x

array([ True, False,  True, ...,  True, False,  True])

In [3]:
def count_transitions(x) -> int:
    count = 0
    for i, j in zip(x[:-1], x[1:]):
        if j and not i:
            count += 1
    return count

In [4]:
%timeit count_transitions(x)

5.47 ms ± 59.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [5]:
%timeit np.count_nonzero(x[:-1] < x[1:])

75.9 µs ± 565 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
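
The `%timeit` magic only exists inside IPython and Jupyter. In a plain script, the standard-library `timeit` module gives a comparable measurement. The sketch below assumes a pure-Python list in place of the NumPy array so that it has no dependencies; its timings will not match the numbers shown above.

```python
import random
import timeit


def count_transitions(x) -> int:
    count = 0
    for i, j in zip(x[:-1], x[1:]):
        if j and not i:
            count += 1
    return count


random.seed(444)
x = [random.choice([False, True]) for _ in range(100_000)]

# Call the function 10 times per measurement, take 3 measurements,
# and report the fastest per-call time.
timings = timeit.repeat(lambda: count_transitions(x), number=10, repeat=3)
best_per_call = min(timings) / 10
print(f"best of 3: {best_per_call * 1e3:.2f} ms per call")
```

Reporting the minimum rather than the mean reduces the influence of other processes running on the machine during the measurement.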


In [9]:
%load_ext memory_profiler

The memory_profiler extension is already loaded. To reload it, use:
  %reload_ext memory_profiler


In [10]:
%%memit import numpy as np
np.count_nonzero(x[:-1] < x[1:])

peak memory: 56.12 MiB, increment: -0.05 MiB


## Configuring your Code Dependencies

Conda environments are cheap to create and easy to delete.

In [None]:
! conda env list

Notice how many packages there are. Each one is an opportunity for something to change and potentially break your code! When choosing a package, prefer ones with a sizable support community over one-offs from an undergraduate class project.

In [None]:
! conda list

Tip: export your (pinned) dependencies to a file. You can use this file to re-create your environment reproducibly, anywhere, any number of times.

In [None]:
! conda env export --from-history

In [None]:
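# Drop the machine-specific "prefix:" line, then rename the environment.
# Note: `sed -i ''` is the BSD/macOS form; GNU sed (Linux) takes `sed -i` without the empty string.
! conda env export --from-history | grep -v "^prefix: " > environment.yml
! sed -i '' 's/workshop-env/test-env/g' environment.yml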

In [None]:
! conda env create -f environment.yml

In [None]:
! conda env list

In [None]:
! conda env remove -n test-env
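
Alongside a pinned environment file, it can help to record the package versions actually in use when results are generated. The sketch below uses the standard-library `importlib.metadata` for this. `installed_versions` is a hypothetical helper, and the package names passed to it are only examples.

```python
import platform
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the current environment
            versions[name] = None
    return versions


print(installed_versions(["numpy", "textacy"]))
```

Saving this dictionary next to the output files makes it possible to tell later which versions produced a given result.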

## Configuring the Entire Environment

![](images/horizontal-logo-monochromatic-white.png)

![](images/container-what-is-container.png)

https://www.docker.com/

In [None]:
! docker build .. -t workshop:latest
# Open terminal to docker image: docker run -i -t workshop:latest /bin/bash