# Machete-mode debugging:
## Hacking your way out of a tight spot

By Ned Batchelder
http://bit.ly/mdebug


# Python can be chaotic
 * dynamic typing
 * no protected, private, or final: anything can change
 * Python has no way to prevent things from happening
 * Nothing is off limits

# This chaos can be used to our advantage
We can use it to hack our way out of a tight spot.


...but this code isn't meant to sit there in production.
It's awful, so get rid of it after you're done using it.


# Case 1: double importing

Modules were being imported more than once,
so their classes were defined twice.
Django was unhappy about this.



See the slides for code on how module imports work.
The gist: Python tries to return the same module object every time you import it (imported modules are cached in `sys.modules`).

Somehow, this promise got broken... but how?

To find out, he edited the actual Django model module in place (which you generally shouldn't do).
Using the `inspect` module, he added some code that printed the stack to a file whenever the module got imported.

Big shocker... it's imported twice!
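
A minimal sketch of that kind of throwaway logging. The helper name `log_import_stack()` and the log path are hypothetical, not taken from the talk. Because module-level code runs on every import, calling the helper at the top of the suspect module leaves one stack dump per import:

```python
import inspect

def log_import_stack(log_path, label):
    """Append the current call stack to log_path (hypothetical debugging helper)."""
    with open(log_path, "a") as log:
        log.write(f"--- {label} is being imported\n")
        # context=0 skips source-line lookup; [1:] skips this helper's own frame.
        for info in inspect.stack(0)[1:]:
            log.write(f"  {info.filename}:{info.lineno} in {info.function}\n")

# Temporary line to paste at the top of the suspect module, for example:
# log_import_stack("/tmp/import_stacks.log", __name__)
```

Two `---` headers in the log mean two imports, and the frames under each header show who triggered them.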

## So what happened?

The import was reaching the same module under two different names:
* `foo.bar.bin`
* `foo.bin`

## Why?
Some joker was using `sys.path.append()` to monkey with the path
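
The effect can be reproduced in a few lines. This is a sketch with made-up names (`machete_pkg`, `machete_models`) built in a temporary directory, not the project from the talk. When a package directory is itself on `sys.path`, the same file becomes importable under a second name, and Python treats it as a second module:

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk: <root>/machete_pkg/machete_models.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "machete_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w"):
    pass
with open(os.path.join(pkg_dir, "machete_models.py"), "w") as f:
    f.write("class Model:\n    pass\n")

sys.path.append(root)      # normal: makes machete_pkg importable
sys.path.append(pkg_dir)   # the "joker" move: exposes its contents as top-level modules
importlib.invalidate_caches()

via_package = importlib.import_module("machete_pkg.machete_models")
via_top_level = importlib.import_module("machete_models")

same_file = os.path.samefile(via_package.__file__, via_top_level.__file__)
same_module = via_package is via_top_level
same_class = via_package.Model is via_top_level.Model
print(same_file, same_module, same_class)

# Undo the path changes made for the demonstration.
sys.path.remove(root)
sys.path.remove(pkg_dir)
```

Same file, but two module objects and two distinct `Model` classes, which is exactly what upsets a framework that registers classes by name.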

## Lessons:
* `import` actually executes code
  * We could use this fact to help find the problem
  * however, this generally shouldn't be used in production
* In the case of debugging: "wrong" is okay
  * This code lives 10 minutes... don't commit it
* Avoid `sys.path` manipulation

# Case 2: Tests leaving temp files behind

* ~8000 tests, and we're not sure where these directories and files are being created
* As a dynamic language, Python doesn't have the same static analysis tools that other languages have
* Can't write extra stuff into the files the tests create, because the tests would then fail

## Let's monkeypatch the standard library

We want to stuff information into the file name (lol)

We make a function `my_sneaky_function()` and assign it over the function that normally writes the file.

### Read the source!
Found `tempfile._get_candidate_names()`, which is where these tempfile things get their names
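
A rough sketch of such a patch, assuming CPython's `tempfile` module, where `_get_candidate_names()` returns an iterator and callers pull names from it with `next()`. That function is private, so this may break between Python versions. The class `CallerTaggedNames` is a hypothetical helper, not the code from the slides:

```python
import itertools
import re
import sys
import tempfile

class CallerTaggedNames:
    """Iterator of temp-file names that embed the calling function (hypothetical helper)."""

    def __init__(self):
        self._counter = itertools.count()

    def __iter__(self):
        return self

    def __next__(self):
        # Walk up the stack past tempfile's own frames to find who asked for the file.
        frame = sys._getframe(1)
        while frame.f_back is not None and frame.f_globals.get("__name__") == "tempfile":
            frame = frame.f_back
        label = f"{frame.f_code.co_name}_{frame.f_lineno}_{next(self._counter)}"
        # Keep only characters that are safe in a file name.
        return re.sub(r"\W", "_", label)

_tagged_names = CallerTaggedNames()

def my_sneaky_function():
    """Stand-in for tempfile._get_candidate_names (private API, may change)."""
    return _tagged_names

# The monkeypatch: tempfile looks this name up at call time, so it picks up ours.
tempfile._get_candidate_names = my_sneaky_function
```

After the patch, a directory left behind by a function called `make_scratch_dir` has a name like `tmpmake_scratch_dir_<line>_<n>`, so the leftovers identify their creators.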

## How do we get in our patch before the function is used?

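Python has a feature where a line in a `*.pth` file in `site-packages` gets executed at interpreter startup, before any of your code runs, if:
* the line starts with `import `

so we can swap in our own version of `tempfile._get_candidate_names()` before anything else gets to use the original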
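
The mechanism can be tried without touching the real `site-packages`. `site.addsitedir()` processes the `.pth` files in any directory the same way. The file name `machete_demo.pth` and the attribute `sys.machete_pth_ran` below are made up for the demonstration:

```python
import os
import site
import sys
import tempfile

demo_dir = tempfile.mkdtemp()
pth_path = os.path.join(demo_dir, "machete_demo.pth")
with open(pth_path, "w") as f:
    # A line starting with "import " is executed rather than treated as a path.
    f.write("import sys; sys.machete_pth_ran = True\n")

# Process the directory's .pth files, as Python does for site-packages at startup.
site.addsitedir(demo_dir)
pth_ran = getattr(sys, "machete_pth_ran", False)
print(pth_ran)

# Clean up the demonstration.
if demo_dir in sys.path:
    sys.path.remove(demo_dir)
os.remove(pth_path)
os.rmdir(demo_dir)
```

In the debugging scenario, the `.pth` line would instead import a small module that applies the monkeypatch.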

## Lessons:
* std lib is readable
  * It's on your disk... go look at it
  * It's also patchable
* Use whatever you can touch and change
  * That dirty feeling can go away after 10 minutes
  
* Look into `addCleanup()` for unit testing

# Case 3: Who's modifying `sys.path`?

* Something's modifying it incorrectly
* grep didn't find it
* Must be in 3rd party code? (because grep didn't find it in ours)

## Data breakpoints
"Break when this data changes in a certain way"
`pdb` doesn't have this :(

Instead: Write a trace function


Trace functions are executed on every line (makes it slow, but that doesn't matter)

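In [2]:
# See slides for relevant example code

import pdb
import sys

def trace(frame, event, arg):
    # Checked on every traced event: has the unwanted entry shown up yet?
    if sys.path[0].endswith('/lib'):
        pdb.set_trace() # Break into the debugger... a badly named function that is useful
    return trace  # Returning the function keeps tracing inside each new frame

sys.settrace(trace)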
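
For experimenting with the idea away from the real problem, here is a self-contained variant that records where the change was first noticed instead of dropping into `pdb`. The `watched` list stands in for `sys.path`, and all function names are hypothetical:

```python
import sys

watched = ["/original/entry"]   # stand-in for sys.path
hits = []                       # (function name, line number) where the change was first seen

def trace(frame, event, arg):
    # Called for every traced event; check the condition each time.
    if watched[0].endswith("/lib") and not hits:
        hits.append((frame.f_code.co_name, frame.f_lineno))
    return trace  # keep tracing inside this frame too

def innocent():
    return len(watched)

def culprit():
    watched.insert(0, "/some/lib")
    return len(watched)  # the change is noticed on the event for this line

def main():
    innocent()
    culprit()

sys.settrace(trace)
try:
    main()
finally:
    sys.settrace(None)  # always switch tracing off again

print(hits)
```

The recorded position is the first event after the mutation, so it points at the line following the offending statement, inside the function that made the change.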

## Culprit:
nose: it has a "helpful" feature that modifies the path based on a folder name... in this case we didn't want that behavior

## Lessons:

* It's not just _your_ code
* dynamic analysis is your friend
* sometimes you have to use a "big hammer"
  * this might be overkill, but so what

# Case 4: Why is random different?


* We want random, but repeatable code
* We seed the random generator
* For some reason despite the seed, the _first_ result was different from all other calls

## Let's divide by 0!
Use `1/0` as a booby trap

Monkey patch `random.random` with: `lambda: 1/0`

Your code probably doesn't raise `ZeroDivisionError` normally, so the traceback points straight at whoever made the call
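
A sketch of the booby trap, with a hypothetical `sneaky_import_time_code()` standing in for the unknown caller. The helper `find_random_caller()` is also made up for illustration:

```python
import random
import traceback

def find_random_caller(suspect):
    """Run suspect() with random.random booby-trapped; return the traceback text, or None."""
    original = random.random
    random.random = lambda: 1/0  # any call now raises ZeroDivisionError
    try:
        suspect()
    except ZeroDivisionError:
        return traceback.format_exc()
    finally:
        random.random = original  # disarm the trap
    return None

def sneaky_import_time_code():
    # Hypothetical stand-in for the third-party code that consumed a random number.
    return random.random()

report = find_random_caller(sneaky_import_time_code)
print(report)
```

One limitation: code that ran `from random import random` before the patch holds its own reference to the original function and will not trip the trap.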

## Culprit:
During import, someone called `random.random()` with a default seed.

## Lessons:
* Exceptions are a useful tool to get information
  * The message can be dynamic
* Don't be afraid to blow things up
* Sometimes you get lucky
  * If not, you have to try something else
* Don't share global state
  * There was one global random number sequence (that anyone can screw up because it's mutable)
* Use your own `random.Random()` object
* _Do_ suspect 3rd party code
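
To illustrate the point about global state: a private `random.Random` instance keeps its own sequence, so calls to the module-level functions elsewhere cannot disturb it. The seed `42` and the helper `draw()` are arbitrary choices for the sketch:

```python
import random

def draw(rng, interfere):
    """Draw three numbers from rng, optionally poking the global generator in between."""
    values = []
    for _ in range(3):
        if interfere:
            random.random()  # someone else using the shared module-level generator
        values.append(rng.random())
    return values

undisturbed = draw(random.Random(42), interfere=False)
disturbed = draw(random.Random(42), interfere=True)
print(undisturbed == disturbed)
```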

# Overall lessons

* Break convention to get what you need
  * But only for debugging
* Play with running code (dynamic analysis)
