# <span style="color:blue">Bulletproofing Code</span>

## What could go wrong? Unfortunately, lots!

![](images/worst_thing_that_could_happen.png)

## <span style="color:darkorange">Potential Problems: </span>
- Bugs (code crashes, brittle to unexpected inputs)
- Code "works", but gives incorrect results
- Cannot reliably and automatically generate the same results each time
- External resources, like code dependencies and data, change outside your control
- Code is slow and/or uses a lot of memory
- Your code is hard to understand
- Your code is hard to change

## <span style="color:darkorange">Make your code work before trying to optimize it</span>

![](images/knuth.jpg)

# <span style="color:blue">Example: Extracting Information from Earnings Calls</span>

In [None]:
from kelloggrs.load_data import *
from sqlalchemy import Table, MetaData, inspect
from pathlib import Path
import pandas as pd
import yaml  # imported explicitly rather than relying on the star import above

In [None]:
with open(Path("../tests/config.yaml")) as conf_file:
    config = yaml.load(conf_file, Loader=yaml.FullLoader)
engine = create_database(config)
engine = insert_records(Path(config["data_path"]) / "Tesla", engine)
engine = insert_records(Path(config["data_path"]) / "GM", engine)

In [None]:
inspector = inspect(engine)

# Get table information
print(inspector.get_table_names())

# Get column information
columns = inspector.get_columns('Transcript')
for c in columns:
    print(c)

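# Print every row of Component (engine.execute() was removed in SQLAlchemy 2.0)
from sqlalchemy import text
with engine.connect() as conn:
    for row in conn.execute(text("select * from Component")):
        print(row)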

In [None]:
transcript_df = pd.read_sql_table("Transcript", con=engine)
component_df = pd.read_sql_table("Component", con=engine)

In [None]:
transcript_df

In [None]:
component_df

# <span style="color:blue">Testing, Error Detection, and Profiling</span>

![](images/165-minor-change.png)

## Testing

In [None]:
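from sqlalchemy import text

# engine.execute() was removed in SQLAlchemy 2.0, so run the query on a connection
with engine.connect() as conn:
    n_transcripts = conn.execute(text("select count(*) from Transcript")).scalar()
print(f"Number of Transcripts: {n_transcripts}")
assert n_transcripts == 137, f"expected 137 transcripts, found {n_transcripts}"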

In [None]:
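from sqlalchemy import text

# engine.execute() was removed in SQLAlchemy 2.0, so run the query on a connection
with engine.connect() as conn:
    n_components = conn.execute(text("select count(*) from Component")).scalar()
print(f"Number of Components: {n_components}")
assert n_components == 11445, f"expected 11445 components, found {n_components}"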
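Ad hoc assertions like these can be moved into test functions so that a test runner such as pytest can execute them automatically. The following is a minimal sketch, not the workshop's actual test suite: it assumes a hypothetical in-memory SQLite table built with the standard-library `sqlite3` module in place of the real database, and the helper names `make_demo_db` and `count_rows` are invented for illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database (hypothetical stand-in for the real one)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Transcript (id integer primary key, title text)")
    conn.executemany(
        "insert into Transcript (title) values (?)",
        [("Q1 call",), ("Q2 call",), ("Q3 call",)],
    )
    return conn


def count_rows(conn: sqlite3.Connection, table: str) -> int:
    """Return the number of rows in `table`."""
    # Table names cannot be bound as query parameters, so pass trusted names only.
    return conn.execute(f"select count(*) from {table}").fetchone()[0]


def test_transcript_count():
    """pytest collects functions named test_*; a failed assert fails the test."""
    conn = make_demo_db()
    try:
        assert count_rows(conn, "Transcript") == 3
    finally:
        conn.close()


# Calling the test directly also works outside of pytest.
test_transcript_count()
```

Keeping the expected counts in a test file means a change to the loading code that silently drops rows is caught the next time the tests run.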

![](images/project-structure.png)

## Exception Handling

In [None]:
try:
    print("trying divide by 0")
    100/0
    print("Infinity and beyond!")
except ZeroDivisionError:
    print("Can't do that.")
finally:
    print("Time to clean up this mess")
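To make the order of execution explicit, here is a small sketch that records which branches run. The function name `divide_with_log` is made up for this example. It also uses the optional `else` clause, which runs only when no exception was raised.

```python
def divide_with_log(numerator, denominator):
    """Divide two numbers and record which clauses of the try statement ran."""
    log = []
    result = None
    try:
        result = numerator / denominator
    except ZeroDivisionError:
        # Runs only when the division fails
        log.append("except")
    else:
        # Runs only when the try block finished without an exception
        log.append("else")
    finally:
        # Runs in both cases, which makes it the place for cleanup
        log.append("finally")
    return result, log


print(divide_with_log(100, 0))  # (None, ['except', 'finally'])
print(divide_with_log(100, 4))  # (25.0, ['else', 'finally'])
```

Catching the specific `ZeroDivisionError` rather than a bare `except:` lets unrelated errors, such as a `TypeError` from a bad input, surface instead of being hidden.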

## Profiling

In [None]:
import spacy
nlp = spacy.load("en_core_web_md")

In [None]:
answers_df = component_df.loc[component_df['componenttypename'] == "Answer"]
texts = list(answers_df['text'])[:500]

In [None]:
%%time
docs = []
for text in texts:
    docs.append(nlp(text))

In [None]:
%%time
docs = []
with nlp.disable_pipes('tagger', 'parser'):
    for text in texts:
        docs.append(nlp(text))
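The `%%time` magic only exists inside IPython and Jupyter. In a plain Python script, one possible substitute is a small context manager built on `time.perf_counter`. This is a sketch under that assumption; the `timer` helper and the toy workload are invented for illustration and do not involve spaCy.

```python
import time
from contextlib import contextmanager


@contextmanager
def timer(label, results):
    """Store the wall-clock duration of the with-block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
n = 100_000

# Slow version: add up the squares one at a time
with timer("loop", timings):
    total = sum(i * i for i in range(n))

# Fast version: closed-form formula for 0^2 + 1^2 + ... + (n-1)^2
with timer("formula", timings):
    total_formula = (n - 1) * n * (2 * n - 1) // 6

assert total == total_formula  # same answer, different cost
for label, seconds in timings.items():
    print(f"{label}: {seconds:.6f} s")
```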

### Example from: https://realpython.com/numpy-array-programming/

In [None]:
import numpy as np
np.random.seed(444)

In [None]:
x = np.random.choice([False, True], size=100000)
x

In [None]:
def count_transitions(x) -> int:
    count = 0
    for i, j in zip(x[:-1], x[1:]):
        if j and not i:
            count += 1
    return count

In [None]:
%timeit count_transitions(x)

In [None]:
%timeit np.count_nonzero(x[:-1] < x[1:])
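Before trusting the faster version, it is worth confirming that it returns the same answer as the loop. A minimal sketch of that check follows; the wrapper name `count_transitions_vectorized` is introduced here for illustration, and the array is smaller than the one timed above.

```python
import numpy as np


def count_transitions(x) -> int:
    """Count False -> True transitions with an explicit Python loop."""
    count = 0
    for i, j in zip(x[:-1], x[1:]):
        if j and not i:
            count += 1
    return count


def count_transitions_vectorized(x) -> int:
    """Same count: on booleans, False < True is the only pair where a < b holds."""
    return int(np.count_nonzero(x[:-1] < x[1:]))


# Hand-checkable case: transitions occur at positions 0->1 and 3->4
small = np.array([False, True, True, False, True])
assert count_transitions(small) == 2
assert count_transitions_vectorized(small) == 2

# Random case: the two implementations must agree
rng = np.random.default_rng(444)
x_check = rng.choice([False, True], size=1000)
assert count_transitions(x_check) == count_transitions_vectorized(x_check)
```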

In [None]:
%load_ext memory_profiler

In [None]:
%%memit 
import numpy as np
np.count_nonzero(x[:-1] < x[1:])

## Configuring your Code Dependencies

Conda environments are cheap to create and easy to delete.

In [None]:
! conda env list

Notice how many packages there are. Each one is an opportunity for something to change and break your code! When you choose a package, prefer one with a sizable support community over a one-off from an undergraduate class project.

In [None]:
! conda list

Tip: export your (pinned) dependencies to a file. You can use this file to re-create your environment reproducibly, anywhere, and any number of times.

In [None]:
! conda env export --from-history

In [None]:
! conda env export --from-history | grep -v "^prefix: " > environment.yml
# Rename the environment in the file (BSD/macOS sed syntax; with GNU sed on Linux, drop the '')
! sed -i '' 's/workshop-env/test-env/g' environment.yml

In [None]:
! conda env create -f environment.yml

In [None]:
! conda env list

In [None]:
! conda env remove -n test-env
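Pinned versions can also be checked from inside Python, for example at the top of an analysis script. The sketch below uses the standard-library `importlib.metadata` module. The `check_pins` helper and the package names in the example dictionary are hypothetical, and real pins would come from your own environment file.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(pins: dict) -> dict:
    """Compare installed package versions with pinned ones.

    Returns a dict mapping each problematic package name to a description;
    an empty dict means every pin matches.
    """
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if installed != wanted:
            problems[name] = f"installed {installed}, pinned {wanted}"
    return problems


# Hypothetical pins for illustration only
example_pins = {"hypothetical-missing-package-xyz": "1.0.0"}
print(check_pins(example_pins))
```

A script could raise an error when the returned dictionary is not empty, so that results are never produced with unexpected package versions.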

## Configuring the Entire Environment

![](images/horizontal-logo-monochromatic-white.png)

![](images/container-what-is-container.png)

https://www.docker.com/

In [None]:
! docker build .. -t workshop:latest
# Open terminal to docker image: docker run -i -t workshop:latest /bin/bash