# Debugging Python Programs

<div style="text-align: center;">
    <img src="../slides/images/propythonbp.png" alt="Best Practices Book" width="400">
</div>

Material download: https://github.com/krother/debugging_tutorial/tree/master

**Dr. Kristian Rother**

www.academis.eu

## Goal: are monkeys and bananas genetically similar?

<div style="text-align: center;">
    <img src="../slides/images/BananaMonkey.png" alt="Banana Monkey proteins" width="700">
</div>

## Input: Protein sequences (strings)
>tr|A0A075B6H5|A0A075B6H5_HUMAN T cell receptor..\
METVVTTLPREGGVGPSRKMLLLLLLLGPGSGLSAV \
...

## Output
Average character count in chimp, banana and human (as reference).

# What is a protein?
Proteins are tiny molecular machines that do all kinds of things in living cells. For example, antibodies, digestive enzymes and spider silk are all made of protein.

Proteins are chains made of 20 chemical building blocks, which is why we can easily represent and analyze them as strings.

<div style="text-align: center;">
    <img src="../slides/images/Protein_S100A8_PDB_1mr8.png" alt="Proteins structure" width="500">
    <p style="margin-top: 5px; font-style: italic; text-align: center;">Protein S100A8 PDB 1mr8 by Emw - Own work, CC BY-SA 3.0</p>
</div>

https://commons.wikimedia.org/w/index.php?curid=8821449

In [49]:
import pandas as pd

# Task: execute parse_uniprot.py

# Fix all bugs

In [68]:
%run parse_uniprot_0.py

  fragment = int(bool(re.search('\(fragment\)', name)))


SyntaxError: expected ':' (parse_uniprot_0.py, line 65)

## ! SyntaxError: Something obvious is wrong

<div style="text-align: center;">
    <img src="../slides/images/indentation.png" alt="indentation" width="500">
</div>

# Technique #1: Read the Error Message

Look at the source code at the corresponding location. \
**Rule of thumb: read Python error messages from bottom to top.**
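A minimal sketch of why reading from the bottom helps: the last line of a traceback names the exception and its message, and the frame just above it is where the exception was raised. The functions `parse_length` and `load_record` are made up for this illustration and are not part of parse_uniprot.py.

```python
import traceback


def parse_length(text):
    # int() raises ValueError when the text is not a number
    return int(text)


def load_record(fields):
    return parse_length(fields[1])


try:
    load_record(["A0A024R161", "not-a-number"])
except ValueError as exc:
    report = traceback.format_exc()
    # The last traceback entry is the innermost frame, i.e. where it failed
    innermost = traceback.extract_tb(exc.__traceback__)[-1]

last_line = report.strip().splitlines()[-1]
print(last_line)        # exception type and message: start reading here
print(innermost.name)   # function in which the exception was raised
```

The lines above the last one then tell you how the program got there, one call at a time.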

Fixed the first issue by adding the missing `:` after `for aa in AMINO_ACIDS`.

In [69]:
%run parse_uniprot_1.py

  fragment = int(bool(re.search('\(fragment\)', name)))
  fragment = int(bool(re.search('\(fragment\)', name)))


FileNotFoundError: [Errno 2] No such file or directory: 'output/sample_1.csv'

## ! Exceptions at Runtime

<div style="text-align: center;">
    <img src="../slides/images/fire.png" alt="fire" width="500">
</div>

**Python is not the easiest language to debug**

In [25]:
data = [1, 2, 3, 4

SyntaxError: incomplete input (3141716909.py, line 1)

In [26]:
data = 1, 2, 3, 4]

SyntaxError: unmatched ']' (4121360865.py, line 1)
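The two cells above contain almost the same mistake but produce different messages. As a small sketch, the standard-library `ast.parse` can be used to check snippets for syntax errors without running them. The exact wording of the message depends on the Python version and on how the code is compiled, so the helper `check_syntax` (a name chosen for this example) only reports what it is given.

```python
import ast

snippets = [
    "data = [1, 2, 3, 4",     # closing bracket missing
    "data = 1, 2, 3, 4]",     # opening bracket missing
    "data = [1, 2, 3, 4]",    # correct
]


def check_syntax(source):
    """Return None if the source parses, else a short error description."""
    try:
        ast.parse(source)
    except SyntaxError as err:
        return f"line {err.lineno}: {err.msg}"
    return None


results = {source: check_syntax(source) for source in snippets}
for source, problem in results.items():
    print(repr(source), "->", problem or "ok")
```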

## ! Nasty fact: Errors propagate

<div style="text-align: center;">
    <img src="../slides/images/propagation.png" alt="propagation" width="700">
</div>

Fixed `with open(output_fn) as outfile:` by adding the write mode `'w'`: `with open(output_fn, 'w') as outfile:`
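Why the missing `'w'` leads to a `FileNotFoundError`: without a mode, `open()` opens the file for reading, so the file must already exist. The sketch below reproduces this in a temporary directory; the file name `sample.csv` is only a placeholder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    output_fn = os.path.join(tmp_dir, "sample.csv")   # placeholder name

    # Default mode is 'r' (read): a missing file raises FileNotFoundError
    try:
        with open(output_fn) as outfile:
            pass
        read_failed = False
    except FileNotFoundError:
        read_failed = True

    # Mode 'w' creates the file (or empties an existing one)
    with open(output_fn, "w") as outfile:
        outfile.write("accession,length\n")

    created = os.path.exists(output_fn)

print(read_failed, created)
```

Note that `'w'` creates the file but not missing directories, so the output folder itself has to exist.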

In [70]:
%run parse_uniprot_2.py

  fragment = int(bool(re.search('\(fragment\)', name)))


Fixed the warning by turning the pattern into a raw string (prefix `r`): changed `fragment = int(bool(re.search('\(fragment\)', name)))` to `fragment = int(bool(re.search(r'\(fragment\)', name)))`
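In a normal string literal, `\(` is an invalid escape sequence, which is what Python warns about. A raw string keeps the backslashes exactly as typed, so the regular expression engine receives them. A short sketch with two example names (adapted from the protein headers used in this tutorial):

```python
import re

plain = "\\(fragment\\)"    # backslashes escaped by hand
raw = r"\(fragment\)"       # raw string: backslashes kept as typed
same_pattern = plain == raw

names = [
    "t cell receptor beta variable 21/or9-2 (pseudogene) (fragment)",
    "guanine nucleotide-binding protein subunit gamma",
]
# 1 if the literal text "(fragment)" occurs in the name, else 0
flags = [int(bool(re.search(raw, name))) for name in names]
print(same_pattern, flags)
```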

In [71]:
%run parse_uniprot_3.py

# Technique #2: Add print Statements

<div style="text-align: center;">
    <img src="../slides/images/print.png" alt="print" width="500">
</div>

**print() is a bit like shooting holes into a wall to see what is inside**

Inserted `print(header, seq)` after `for header, seq in read_fasta(input_fn):`
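A plain `print(header, seq)` can flood the screen and hides invisible characters. One possible refinement, sketched below, prints a shortened view and uses `repr()` so that whitespace becomes visible. The helper `describe` and the sample record (including its embedded line break) are assumptions for illustration, not the actual output of `read_fasta()`.

```python
def describe(header, seq):
    """Build a compact debug line; !r applies repr() to show hidden characters."""
    return f"header={header[:30]!r} len(seq)={len(seq)} seq_start={seq[:16]!r}"


# Hypothetical record standing in for what read_fasta() might yield
records = [
    (">tr|A0A075B6H5|A0A075B6H5_HUMAN T cell receptor", "METVVTTLPREGG\nVGPSRK"),
]

debug_lines = [describe(header, seq) for header, seq in records]
for line in debug_lines:
    print(line)
```

In the printed line the line break shows up as the two characters `\n` instead of silently wrapping the output.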

%run parse_uniprot_4.py

In [107]:
pd.read_csv("../proteins/output/sample_4.csv")

Unnamed: 0,accession,name,length,A,C,D,E,F,G,H,...,M,N,P,Q,R,S,T,V,W,Y
0,A0A096LPF7,uncharacterized protein os=homo sapiens pe=4 sv=,61,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


## Introspection functions

### more elegant alternatives to print():

| Function                     | Purpose                           |
|------------------------------|-----------------------------------|
| `dir(x)`                     | List attributes of object `x` (`dir()` without argument lists names in the current scope) |
| `locals()`                   | Examine local namespace           |
| `globals()`                  | Examine global namespace          |
| `help(x)`                    | Access help interactively         |
| `type(x)`                    | Examine object type               |
| `isinstance(x, cl)`          | Examine if object is instance of class `cl` |
| `issubclass(cl1, cl2)`       | Examine if `cl1` is a subclass of `cl2` |
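A quick sketch of these functions applied to a protein sequence string. The function `summarize` exists only to show what `locals()` returns inside a function.

```python
seq = "METVVTTLPREGG"

seq_type = type(seq)                  # <class 'str'>
has_count = "count" in dir(seq)       # str objects provide a .count() method
is_string = isinstance(seq, str)
bool_is_int = issubclass(bool, int)   # bool is a subclass of int


def summarize(seq):
    length = len(seq)
    return sorted(locals())           # names defined in this function so far


local_names = summarize(seq)
print(seq_type, has_count, is_string, bool_is_int, local_names)
```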


# Technique #3: Scientific Method
What could be the reason the program is not doing what it is supposed to do?

<div style="text-align: center;">
    <img src="../slides/images/scimethod.png" alt="scientific method" width="300">
</div>

In `aa_counts.append(seq.count('aa'))`, removed the quotes around `'aa'` so that the loop variable `aa` is counted instead of the literal string `'aa'`.
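The hypothesis "we count the text `'aa'` instead of each amino acid" can be tested in isolation. The sketch below assumes that `AMINO_ACIDS` holds the 20 standard one-letter codes; the real constant in parse_uniprot.py may be defined differently.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # assumption: 20 standard one-letter codes
seq = "MEAPAQLLFLLLLWLPDTTREIVMTQSPPTLSLSPGERVTLSCRASQSVSS"

# Buggy version: counts the literal two-character string 'aa' every time
buggy = [seq.count('aa') for aa in AMINO_ACIDS]
# Fixed version: counts the amino acid held in the loop variable
fixed = [seq.count(aa) for aa in AMINO_ACIDS]

print(sum(buggy), sum(fixed), len(seq))
```

The buggy counts are all zero, which matches the zeros seen in the CSV output before the fix.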

In [79]:
%run parse_uniprot_5.py

>tr|A0A024R161|A0A024R161_HUMAN Guanine nucleotide-binding protein subunit gamma OS=Homo sapiens GN=DNAJC25-GNG10 PE=3 SV=1 MGAPLLSPGWGAGAAGRRWWMLLAPLLPALLLVRPAGALVEGLYCGTRDCYEVLGVSRSA
GKAEIARAYRQLARRYHPDRYRPQPGDEGPGRTPQSAEEAFLLVATAYETLKVSQAAAEL
QQYCMQNACKDALLVGVPAGSNPFREPRSCALL

>tr|A0A075B6F4|A0A075B6F4_HUMAN T cell receptor beta variable 21/OR9-2 (pseudogene) (Fragment) OS=Homo sapiens GN=TRBV21OR9-2 PE=4 SV=1 XRFLSEPTRCLRLLCCVALSFWGAASMDTKVTQRPRFLVKANEQKAKMDCVPIKRHSYVY
WYHKTLEEELKFFIYFQNEEIIQKAEIINERFSAQCPQNSPCTLEIQSTESGDTARYFCA
NSK

>tr|A0A075B6H5|A0A075B6H5_HUMAN T cell receptor beta variable 20/OR9-2 (non-functional) (Fragment) OS=Homo sapiens GN=TRBV20OR9-2 PE=4 SV=1 METVVTTLPREGGVGPSRKMLLLLLLLGPGSGLSAVVSQHPSRVICKSGTSVNIECRSLD
FQATTMFWYRQLRKQSLMLMATSNEGSEVTYEQGVKKDKFPINHPNLTFSALTVTSAHPE
DSSFYICSAR

>tr|A0A075B6H7|A0A075B6H7_HUMAN Immunoglobulin kappa variable 3-7 (non-functional) (Fragment) OS=Homo sapiens GN=IGKV3-7 PE=1 SV=1 MEAPAQLLFLLLLWLPDTTREIVMTQSPPTLSLSPGERVTLSCRASQSVSS

In [108]:
pd.read_csv("../proteins/output/sample_5.csv")

Unnamed: 0,accession,name,length,A,C,D,E,F,G,H,...,M,N,P,Q,R,S,T,V,W,Y
0,A0A096LPF7,uncharacterized protein os=homo sapiens pe=4 sv=,61,7,3,0,5,4,3,2,...,0,1,5,2,8,8,1,0,1,0


**Making a hypothesis works when we know the output structure.**

# Technique #4: Explain the problem to someone

<div style="text-align: center;">
    <img src="../slides/images/rubber_duck.jpg" alt="Rubber Duck" width="500">
    <p style="margin-top: 5px; font-style: italic; text-align: center;">Rubber Duck by Florentijn Hofman in Hong Kong, CC-BY-SA 3.0</p>
</div>

Also see: https://en.wikipedia.org/wiki/Rubber_duck_debugging

Changed the indentation of the following lines so that they run inside the loop, once per sequence: \
    `row = [accession, name, length] + aa_counts` \
    `writer.writerows([row])`
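Explaining the code line by line makes this kind of bug visible: a statement placed after a loop runs only once. The sketch below uses made-up records and two hypothetical helper functions to show the difference.

```python
import csv
import io

records = [("P1", "MKV"), ("P2", "ACDE"), ("P3", "GG")]   # made-up data


def write_after_loop(records):
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for accession, seq in records:
        row = [accession, len(seq)]
    writer.writerow(row)          # dedented: runs once, with the last row only
    return buffer.getvalue().splitlines()


def write_inside_loop(records):
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for accession, seq in records:
        row = [accession, len(seq)]
        writer.writerow(row)      # indented: runs once per record
    return buffer.getvalue().splitlines()


outside = write_after_loop(records)
inside = write_inside_loop(records)
print(outside)
print(inside)
```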

In [111]:
%run parse_uniprot_6.py

In [112]:
pd.read_csv("../proteins/output/sample_6.csv")

Unnamed: 0,accession,name,length,A,C,D,E,F,G,H,...,M,N,P,Q,R,S,T,V,W,Y
0,A0A024R161,guanine nucleotide-binding protein subunit gam...,156,24,5,4,9,2,15,1,...,3,2,13,7,14,7,4,8,3,7
1,A0A075B6F4,t cell receptor beta variable 21/or9-2 (pseudo...,126,9,7,3,11,8,2,2,...,2,5,5,7,8,9,7,5,2,5
2,A0A075B6H5,t cell receptor beta variable 20/or9-2 (non-fu...,133,6,3,3,7,5,9,3,...,5,4,7,5,7,16,11,10,1,3
3,A0A075B6H7,immunoglobulin kappa variable 3-7 (non-functio...,118,8,2,4,4,4,6,0,...,2,1,10,9,6,16,11,4,2,6
4,A0A075B6H8,immunoglobulin kappa variable 1d-42 (non-funct...,119,7,2,8,2,7,9,1,...,3,1,8,5,5,15,4,5,3,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
250,A0A096LPE4,uncharacterized protein os=homo sapiens pe=4 sv=,129,6,2,3,6,9,8,6,...,3,4,13,6,6,11,8,5,1,1
251,A0A096LPE9,uncharacterized protein os=homo sapiens pe=4 sv=,142,5,5,1,4,14,5,7,...,1,3,19,2,1,12,11,7,2,4
252,A0A096LPF4,uncharacterized protein os=homo sapiens pe=4 sv=,24,1,0,0,0,0,1,0,...,1,0,0,1,3,4,1,4,0,1
253,A0A096LPF6,uncharacterized protein os=homo sapiens pe=4 sv=,65,7,1,1,1,4,8,3,...,1,0,3,4,4,5,3,3,0,2


# Technique #5: Cleaning up

Deleted the line `fragment = '(fragment)' in name`, whose result was never used, and removed the `name` column.

In [123]:
%run parse_uniprot_7.py

In [124]:
pd.read_csv("../proteins/output/sample_7.csv")

Unnamed: 0,accession,length,A,C,D,E,F,G,H,I,...,M,N,P,Q,R,S,T,V,W,Y
0,A0A024R161,156,24,5,4,9,2,15,1,1,...,3,2,13,7,14,7,4,8,3,7
1,A0A075B6F4,126,9,7,3,11,8,2,2,7,...,2,5,5,7,8,9,7,5,2,5
2,A0A075B6H5,133,6,3,3,7,5,9,3,4,...,5,4,7,5,7,16,11,10,1,3
3,A0A075B6H7,118,8,2,4,4,4,6,0,4,...,2,1,10,9,6,16,11,4,2,6
4,A0A075B6H8,119,7,2,8,2,7,9,1,6,...,3,1,8,5,5,15,4,5,3,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
250,A0A096LPE4,129,6,2,3,6,9,8,6,3,...,3,4,13,6,6,11,8,5,1,1
251,A0A096LPE9,142,5,5,1,4,14,5,7,4,...,1,3,19,2,1,12,11,7,2,4
252,A0A096LPF4,24,1,0,0,0,0,1,0,0,...,1,0,0,1,3,4,1,4,0,1
253,A0A096LPF6,65,7,1,1,1,4,8,3,5,...,1,0,3,4,4,5,3,3,0,2


# Technique #6: Assertions

We create a consistency check for **failing early**. Add the following assertion

`assert sum(aa_counts) == len(seq)`

in parse_uniprot.py at the end of the `parse()` function.
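A bare `AssertionError:` says little about what went wrong. As a sketch, the assertion can carry a message with the two numbers that disagree. The function `count_amino_acids` and the alphabet are assumptions for this example, not the tutorial's actual code.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # assumption: 20 standard one-letter codes


def count_amino_acids(seq):
    aa_counts = [seq.count(aa) for aa in AMINO_ACIDS]
    # Fail early, and say which numbers disagree
    assert sum(aa_counts) == len(seq), (
        f"counted {sum(aa_counts)} residues, but sequence has {len(seq)} characters"
    )
    return aa_counts


ok = count_amino_acids("MKVAC")

try:
    count_amino_acids("XRFLSE")   # 'X' is not in the assumed alphabet
    failed_early = False
except AssertionError as err:
    failed_early = True
    message = str(err)

print(failed_early, message)
```

Keep in mind that assertions are skipped when Python runs with the `-O` option, so they are a development aid rather than input validation.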

In [125]:
%run parse_uniprot_8.py

AssertionError: 

# Technique #7: Interactive Debugger

**An interactive debugger allows us to watch our program at work in slow motion**
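Interactively, a debugger is usually started by placing `import pdb; pdb.set_trace()` (or `breakpoint()` in Python 3.7+) in the code. To keep the example below runnable without a keyboard, it is a sketch that feeds the commands to the standard-library `pdb.Pdb` from a string instead; the function `total_length` is made up for the demonstration.

```python
import io
import pdb


def total_length(seqs):
    total = 0
    for seq in seqs:
        total += len(seq)
    return total


# Commands a person would normally type at the debugger prompt:
# print the argument, then continue to the end
commands = io.StringIO("p seqs\nc\n")
transcript = io.StringIO()

debugger = pdb.Pdb(stdin=commands, stdout=transcript,
                   nosigint=True, readrc=False)
result = debugger.runcall(total_length, ["MKV", "ACDE"])

print(transcript.getvalue())   # what the debugger displayed
print(result)
```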

In [133]:
%run parse_uniprot_9.py

> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(57)[0;36mparse[1;34m()[0m
[1;32m     55 [1;33m    [1;32mimport[0m [0mpdb[0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     56 [1;33m    [0mpdb[0m[1;33m.[0m[0mset_trace[0m[1;33m([0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m---> 57 [1;33m    [1;32mwith[0m [0mopen[0m[1;33m([0m[0moutput_fn[0m[1;33m,[0m [1;34m'w'[0m[1;33m)[0m [1;32mas[0m [0moutfile[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     58 [1;33m        [0mwriter[0m [1;33m=[0m [0mcsv[0m[1;33m.[0m[0mwriter[0m[1;33m([0m[0moutfile[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     59 [1;33m        [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mLABELS[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  b 68


Breakpoint 8 at c:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py:68


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  n


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(66)[0;36mparse[1;34m()[0m
[1;32m     64 [1;33m            [0mlength[0m [1;33m=[0m [0mlen[0m[1;33m([0m[0mseq[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m---> 66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m    67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  aa_counts


[24, 5, 4, 9, 2, 15, 1, 1, 3, 21, 3, 2, 13, 7, 14]


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  aa_counts


[24, 5, 4, 9, 2, 15, 1, 1, 3, 21, 3, 2, 13, 7, 14, 7, 4, 8]


ipdb>  c


> [1;32mc:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py[0m(67)[0;36mparse[1;34m()[0m
[1;32m     65 [1;33m            [0maa_counts[0m [1;33m=[0m [1;33m[[0m[1;33m][0m[1;33m[0m[1;33m[0m[0m
[0m[1;32m     66 [1;33m            [1;32mfor[0m [0maa[0m [1;32min[0m [0mAMINO_ACIDS[0m[1;33m:[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m1[1;32m--> 67 [1;33m                [0maa_counts[0m[1;33m.[0m[0mappend[0m[1;33m([0m[0mseq[0m[1;33m.[0m[0mcount[0m[1;33m([0m[0maa[0m[1;33m)[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m8[1;32m    68 [1;33m            [0mrow[0m [1;33m=[0m [1;33m[[0m[0maccession[0m[1;33m,[0m [0mlength[0m[1;33m][0m [1;33m+[0m [0maa_counts[0m[1;33m[0m[1;33m[0m[0m
[0m[1;31m3[1;32m    69 [1;33m            [0mwriter[0m[1;33m.[0m[0mwriterows[0m[1;33m([0m[1;33m[[0m[0mrow[0m[1;33m][0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[0m


ipdb>  c


> c:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py(68)parse()
     66             for aa in AMINO_ACIDS:
1    67                 aa_counts.append(seq.count(aa))
8--> 68             row = [accession, length] + aa_counts
3    69             writer.writerows([row])
7    70             assert sum(aa_counts) == length

ipdb>  c


> c:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_9.py(69)parse()
1    67                 aa_counts.append(seq.count(aa))
8    68             row = [accession, length] + aa_counts
3--> 69             writer.writerows([row])
7    70             assert sum(aa_counts) == length
     71 if __name__ == '__main__':

ipdb>  aa_counts


[24, 5, 4, 9, 2, 15, 1, 1, 3, 21, 3, 2, 13, 7, 14, 7, 4, 8, 3, 7]


ipdb>  length


156


ipdb>  sum(aa_counts)


153


ipdb>  q


BdbQuit: 

## Stepwise Execution in ipdb

| Command | Description         |
|---------|---------------------|
| l, ll   | List lines          |
| n       | Execute next line   |
| s       | Step into function  |
| c       | Continue execution  |
| q       | Abort               |
| ?       | See other commands  |


## Breakpoints in ipdb

| Command                       | Description                          |
|-------------------------------|--------------------------------------|
| `b`                           | List breakpoints                     |
| `b <file:line>`               | Add breakpoint at specified line     |
| `b <function>`                | Add breakpoint at specified function |
| `b <file:line>, <condition>`  | Add breakpoint with condition        |
| `cl <number>`                 | Remove breakpoint                    |
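
A conditional breakpoint can also be approximated directly in the source code. The following is a minimal sketch, not the tutorial's own code: the helper `count_amino_acids` is hypothetical, while `breakpoint()` is the built-in available since Python 3.7.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def count_amino_acids(seq):
    """Return one count per standard amino acid (hypothetical helper)."""
    aa_counts = [seq.count(aa) for aa in AMINO_ACIDS]
    # Same idea as `b <file:line>, <condition>`: stop only when the
    # condition holds, i.e. when some characters were not counted.
    if sum(aa_counts) != len(seq):
        breakpoint()  # opens the debugger; PYTHONBREAKPOINT=0 disables it
    return aa_counts
```

Calling `count_amino_acids("ACDX")` would drop into the debugger because `X` is not counted, whereas a sequence made only of standard letters passes through silently.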


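The sequence length (156) was larger than the sum of the amino acid counts (153) because `seq` still contained newline characters. Fixed by adding `.strip()`: `seq += line` became `seq += line.strip()`.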
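
To see why a missing `.strip()` inflates the length, consider this small sketch. The three lines are made-up sample data, not taken from the input file. A difference of three characters, as in 156 vs. 153, would be consistent with three line breaks inside one sequence.

```python
# Hypothetical sequence lines as they come from a text file:
# each one still carries its trailing newline.
lines = ["METVVTTLPR\n", "EGGVGPSRKM\n", "LLLL\n"]

seq_raw = ""
seq_clean = ""
for line in lines:
    seq_raw += line            # keeps the "\n" characters
    seq_clean += line.strip()  # drops surrounding whitespace

# The raw string is longer by exactly one character per line.
print(len(seq_raw), len(seq_clean))  # prints: 27 24
```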

In [167]:
%run parse_uniprot_10.py

> c:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_10.py(57)parse()
     55     import pdb
     56     pdb.set_trace()
---> 57     with open(output_fn, 'w') as outfile:
     58         writer = csv.writer(outfile)
     59         writer.writerows([LABELS])


ipdb>  l


     52 
     53 def parse(input_fn, output_fn):
     54     # prepare output file
     55     import pdb
     56     pdb.set_trace()
---> 57     with open(output_fn, 'w') as outfile:
     58         writer = csv.writer(outfile)
     59         writer.writerows([LABELS])
     60 

ipdb>  b 69, length != sum(aa_counts)


Breakpoint 9 at c:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_10.py:69


ipdb>  c


> c:\users\praka\desktop\springboard\homeworks\unit 24.2.2 debugging_tutorial-master\proteins\parse_uniprot_10.py(69)parse()
     67                 aa_counts.append(seq.count(aa))
     68             row = [accession, length] + aa_counts
9--> 69             writer.writerows([row])
     70             assert sum(aa_counts) == length
     71 


ipdb>  seq


'XRFLSEPTRCLRLLCCVALSFWGAASMDTKVTQRPRFLVKANEQKAKMDCVPIKRHSYVYWYHKTLEEELKFFIYFQNEEIIQKAEIINERFSAQCPQNSPCTLEIQSTESGDTARYFCANSK'


ipdb>  q


BdbQuit: 

In [170]:
set('XRFLSEPTRCLRLLCCVALSFWGAASMDTKVTQRPRFLVKANEQKAKMDCVPIKRHSYVYWYHKTLEEELKFFIYFQNEEIIQKAEIINERFSAQCPQNSPCTLEIQSTESGDTARYFCANSK') - set("ACDEFGHIKLMNPQRSTVWY")

{'X'}

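`X` is not one of the 20 standard amino acid letters (it stands for an unknown residue), so it is never counted. It is removed with `seq.replace('X', '')`.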
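
The set difference used above can be wrapped in a small helper that catches any unexpected letter, not only `X`. This is a sketch under assumptions: the function name `split_unknown` is hypothetical and not part of `parse_uniprot_11.py`.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def split_unknown(seq):
    """Return (cleaned sequence, sorted list of unexpected characters)."""
    # Characters that occur in seq but are not standard amino acids.
    unknown = sorted(set(seq) - set(AMINO_ACIDS))
    # Keep only the standard letters, preserving their order.
    cleaned = "".join(ch for ch in seq if ch in AMINO_ACIDS)
    return cleaned, unknown


cleaned, unknown = split_unknown("XRFLSEPTRX")
print(cleaned, unknown)  # prints: RFLSEPTR ['X']
```

Returning the unexpected characters alongside the cleaned sequence makes it possible to report them instead of dropping them silently.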

In [178]:
%run parse_uniprot_11.py

In [179]:
pd.read_csv("../proteins/output/sample_11.csv")

Unnamed: 0,accession,length,A,C,D,E,F,G,H,I,...,M,N,P,Q,R,S,T,V,W,Y
0,A0A024R161,153,24,5,4,9,2,15,1,1,...,3,2,13,7,14,7,4,8,3,7
1,A0A075B6F4,122,9,7,3,11,8,2,2,7,...,2,5,5,7,8,9,7,5,2,5
2,A0A075B6H5,130,6,3,3,7,5,9,3,4,...,5,4,7,5,7,16,11,10,1,3
3,A0A075B6H7,116,8,2,4,4,4,6,0,4,...,2,1,10,9,6,16,11,4,2,6
4,A0A075B6H8,117,7,2,8,2,7,9,1,6,...,3,1,8,5,5,15,4,5,3,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
250,A0A096LPE4,126,6,2,3,6,9,8,6,3,...,3,4,13,6,6,11,8,5,1,1
251,A0A096LPE9,139,5,5,1,4,14,5,7,4,...,1,3,19,2,1,12,11,7,2,4
252,A0A096LPF4,23,1,0,0,0,0,1,0,0,...,1,0,0,1,3,4,1,4,0,1
253,A0A096LPF6,63,7,1,1,1,4,8,3,5,...,1,0,3,4,4,5,3,3,0,2


# Technique #8: Minimize the input

Our input file is too big!

Create a smaller file with just 3 entries, e.g.:

`>tr|ABC12345|ABC12345_HUMAN python fake protein one OS=Homo sapiens` \
`PYTHQNMMXVII` 

`>tr|ABC12346|ABC12346_HUMAN python fake protein two OS=Homo sapiens` \
`PYTHQNTVXY`

`>tr|ABC12347|ABC12347_HUMAN python fake protein three OS=Homo sapiens` \
`PYTHQNTVXMMYQN`
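
Instead of writing fake entries by hand, a small input can also be cut from the original file. The sketch below copies the first few records of a FASTA-style file. The function name `write_first_records` and any file names passed to it are assumptions for illustration.

```python
def write_first_records(input_fn, output_fn, n_records=3):
    """Copy the first n_records FASTA entries from input_fn to output_fn.

    Returns the number of records actually written.
    """
    seen = 0
    with open(input_fn) as infile, open(output_fn, "w") as outfile:
        for line in infile:
            if line.startswith(">"):
                seen += 1
                if seen > n_records:
                    break  # enough records copied, stop reading
            if seen > 0:
                # Lines before the first header are skipped.
                outfile.write(line)
    return min(seen, n_records)
```

A call such as `write_first_records("big.fasta", "mini.fasta", 3)` would then produce a three-entry file (both file names are placeholders).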

In [201]:
%run parse_uniprot_12.py

In [202]:
pd.read_csv("../proteins/output/mini_12.csv")

Unnamed: 0,accession,length,A,C,D,E,F,G,H,I,...,M,N,P,Q,R,S,T,V,W,Y
0,ABC12345,11,0,0,0,0,0,0,1,2,...,2,1,1,1,0,0,1,1,0,1
1,ABC12346,9,0,0,0,0,0,0,1,0,...,0,1,1,1,0,0,2,1,0,2


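The output contains only two of the three entries: the last record is missing. This kind of error occurs when we write file parsers ourselves over and over again. \
Fixed by adding `yield header, seq` at the bottom of the parser, so that the final header and sequence are yielded as well.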
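
The pattern behind this bug can be sketched with a small generator. This is not the code of `parse_uniprot_13.py`: the function `read_fasta` is a hypothetical stand-in that only shows where the final `yield` has to go.

```python
def read_fasta(lines):
    """Yield (header, seq) pairs from an iterable of FASTA lines."""
    header, seq = None, ""
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, seq  # emit the previous record
            header, seq = line[1:], ""
        elif line:
            seq += line
    # Without this final yield the last record is silently dropped,
    # because no further ">" line arrives to trigger the yield above.
    if header is not None:
        yield header, seq


records = list(read_fasta([">one", "PYTH", "QN", ">two", "TVXY"]))
print(records)  # prints: [('one', 'PYTHQN'), ('two', 'TVXY')]
```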

In [203]:
%run parse_uniprot_13.py

In [204]:
pd.read_csv("../proteins/output/mini_13.csv")

Unnamed: 0,accession,length,A,C,D,E,F,G,H,I,...,M,N,P,Q,R,S,T,V,W,Y
0,ABC12345,11,0,0,0,0,0,0,1,2,...,2,1,1,1,0,0,1,1,0,1
1,ABC12346,9,0,0,0,0,0,0,1,0,...,0,1,1,1,0,0,2,1,0,2
2,ABC12347,13,0,0,0,0,0,0,1,0,...,2,2,1,2,0,0,2,1,0,2


# Technique #9: Code Review

Conduct a code review of `pipeline.py` with your neighbour.

* Which part of the code is clear to you? Which is not?
* Do you find any bugs?
* What would you improve in the code?

## Code Review

The control mechanism of the lock of a vault for nuclear waste has been designed for safe operation. It ensures that the vault can only be accessed if the radiation shields are in place or the radiation level in the vault is below a threshold (`DANGER_LEVEL`). That means:

* If the remote-controlled radiation shields are in place, the door may be opened by an authorized operator.
* If the radiation level in the room is below the threshold, the door may be opened by an authorized operator.
* An authorized operator may open the door by entering a code.
  
The code below controls the door lock. Note that the safe state is that no entry is possible. Develop an argument for safety that shows that the code is potentially unsafe.

(adapted from I. Sommerville, Software Engineering, 9th edition)

Trivia: Code reviews are seen as superior to automated testing when engineering safety-critical software.

## Code Review
Review the following (fictional!) code for a nuclear vault door:

<div style="text-align: center;">
    <img src="../slides/images/code.png" alt="Code" width="400">
</div>

(code adapted from I. Sommerville, Software Engineering, 9th edition)

# Technique #10: Log information

<div style="text-align: center;">
    <img src="../slides/images/log.png" alt="Logging" width="400">
</div>

## Log verbosity levels

<div style="text-align: center;">
    <img src="../slides/images/verbosity_levels.png" alt="verbosity levels" width="400">
</div>
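
As a rough sketch of how verbosity levels work in practice, the example below uses Python's standard `logging` module with its levels DEBUG, INFO, WARNING, ERROR and CRITICAL. The logger name `parse_uniprot` and the helper `check_counts` are assumptions made for illustration.

```python
import logging

logging.basicConfig(
    level=logging.INFO,  # messages below INFO (i.e. DEBUG) are suppressed
    format="%(asctime)s %(levelname)s %(message)s",
)
logger = logging.getLogger("parse_uniprot")  # logger name is an assumption


def check_counts(accession, length, aa_counts):
    """Log a warning when counts and length disagree; return True if OK."""
    logger.debug("checking %s", accession)
    total = sum(aa_counts)
    if total != length:
        logger.warning("%s: length=%d but counts sum to %d",
                       accession, length, total)
        return False
    logger.info("%s: counts are consistent", accession)
    return True
```

Switching `level` to `logging.DEBUG` would also show the `checking ...` messages, while `logging.WARNING` would hide everything except the mismatches.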

# Summary: What we know about debugging
* error messages in Python are not always helpful
* syntax errors: Python does not execute anything
* some errors cause a program to stop with an Exception
* read error messages from bottom to top
* semantic errors: the program does not do the right thing
* errors are distinct from the underlying defects
* defects propagate through the program

<div style="text-align: center;">
    <img src="../slides/images/thank_you.png" alt="Thank you" width="400">
</div>