# Debugging

## Syntax Errors

Given two lists, one of people's names and another of their scores, create a list of tuples such that for each person you have a tuple of their name and their score.

You might come up with a solution that looks like:

In [1]:
names = ['a','b','c','d','e']
scores = [90,76,55,82,88]

In [2]:
people_and_scores = []
for i in range(len(names)):
    people_and_scores.append((names[i],scores[i]))
people_and_scores
    

[('a', 90), ('b', 76), ('c', 55), ('d', 82), ('e', 88)]

There's a better way of doing it: the built-in zip() function.

Let's take a look at the documentation for zip():
https://docs.python.org/3.5/library/functions.html#zip
Hmmmmm. Not all that useful, so let's try it out:

In [3]:
zip(names,scores)

<zip at 0x70ff08176e80>

In [4]:
for i in zip(names,scores):
    print(i)

('a', 90)
('b', 76)
('c', 55)
('d', 82)
('e', 88)


In [5]:
people_and_scores2 = []
for i in zip(names,scores):
    people_and_scores2.append(i)
people_and_scores2

[('a', 90), ('b', 76), ('c', 55), ('d', 82), ('e', 88)]

In [6]:
people_and_scores3 = list(zip(names,scores))
people_and_scores3

[('a', 90), ('b', 76), ('c', 55), ('d', 82), ('e', 88)]
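
The <zip at 0x...> output from cell 3 is a hint that zip() returns a lazy iterator rather than a list. A minimal sketch of what that implies, using the same names and scores lists (first_pass, second_pass and short are names made up for this example): the iterator can be consumed only once, and zip() stops at the shorter input.

```python
names = ['a', 'b', 'c', 'd', 'e']
scores = [90, 76, 55, 82, 88]

pairs = zip(names, scores)      # a lazy iterator, nothing is paired up yet
first_pass = list(pairs)        # consumes the iterator
second_pass = list(pairs)       # already exhausted, so this is empty

print(first_pass)
print(second_pass)

# zip() stops as soon as the shortest input runs out
short = list(zip(names, scores[:3]))
print(short)
```

If you need to walk over the pairs more than once, store the result of list(zip(...)) as in cell 6.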

Ok, but let's say you had a structure that looks like people_and_scores and you wanted to extract just the names.  How would you do that?

In [7]:
names = []
for i in people_and_scores:
    names.append(i[0])
names
    

['a', 'b', 'c', 'd', 'e']

Back to our documentation:
https://docs.python.org/3.5/library/functions.html#zip

There's a blurb there about
> zip() in conjunction with the * operator can be used to unzip a list:

followed by a code example:

```
>>> x = [1, 2, 3]
>>> y = [4, 5, 6]
>>> zipped = zip(x, y)
>>> list(zipped)
[(1, 4), (2, 5), (3, 6)]
>>> x2, y2 = zip(*zip(x, y))
>>> x == list(x2) and y == list(y2)
True
```

Googling for "python zip explained" turns up:
https://stackoverflow.com/questions/19339/transpose-unzip-function-inverse-of-zip#19343

In [8]:
list(list(zip(*people_and_scores))[0])

['a', 'b', 'c', 'd', 'e']
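
To see what the * is doing in cell 8, it can help to unpack both columns at once. A minimal sketch, assuming people_and_scores holds the (name, score) tuples built earlier; names_out and scores_out are names made up for this example.

```python
people_and_scores = [('a', 90), ('b', 76), ('c', 55), ('d', 82), ('e', 88)]

# zip(*pairs) is the same as zip(pairs[0], pairs[1], ...):
# each tuple becomes one argument, so zip() groups the first items
# together, then the second items.
names_out, scores_out = zip(*people_and_scores)

print(names_out)          # a tuple of the names
print(list(scores_out))   # convert to a list if you need one
```

One caveat: unpacking into two variables like this raises a ValueError when the input list is empty, because zip() then produces nothing to unpack.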

Note also, however, that there's a link that talks about using generators:

https://stackoverflow.com/questions/30805000/how-to-unzip-an-iterator


In [9]:
import itertools

In [10]:
names,scores = itertools.tee(people_and_scores)

In [11]:
names

<itertools._tee at 0x70ff08121240>

In [12]:
for n in names:
    print(n,type(n))

('a', 90) <class 'tuple'>
('b', 76) <class 'tuple'>
('c', 55) <class 'tuple'>
('d', 82) <class 'tuple'>
('e', 88) <class 'tuple'>


In [13]:
names = (x[0] for x in names)

In [14]:
for n in names:
    print(n,type(n))

In [15]:
for i in names:
    print(i)
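
Cells 14 and 15 print nothing because the loop in cell 12 already consumed the names iterator, so the generator expression in cell 13 wraps an iterator that has nothing left in it. Note too that tee() does not split the tuples into columns; it only returns independent iterators over the same items. A sketch of one way to get the intended result, where name_iter, score_iter and the *_list variables are names introduced for this example:

```python
import itertools

people_and_scores = [('a', 90), ('b', 76), ('c', 55), ('d', 82), ('e', 88)]

# tee() gives two independent iterators over the SAME tuples
name_iter, score_iter = itertools.tee(people_and_scores)

# Pick one column out of each tuple lazily. Do not loop over
# name_iter or score_iter before this, or they will be used up.
names = (pair[0] for pair in name_iter)
scores = (pair[1] for pair in score_iter)

names_list = list(names)
scores_list = list(scores)
leftover = list(names)    # a generator is exhausted after one pass

print(names_list)
print(scores_list)
print(leftover)
```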

Ok, let's try this again.

### Top-level goal:  to create a list of (lat, lon) tuples where lat is between X and Y

We're going to read a file efficiently using a generator:


In [16]:
filename = 'data/ride_final2.csv'

In [17]:
def read_lat_and_lon_by_line(filename):
    with open(filename) as f:
        while True:
            line = f.readline()
            if not line:
                break
            data = line.split(',')
            yield (data[1],data[2])

In [18]:
f = read_lat_and_lon_by_line(filename)

In [19]:
f

<generator object read_lat_and_lon_by_line at 0x70ff080c9740>

In [20]:
count = 0
for i in read_lat_and_lon_by_line(filename):
    count = count+1
    if count > 5:
        break
    print(i)

('"Latitude"', '"Longitude"')
('"504719750"', '"-998493490"')
('"504717676"', '"-998501870"')
('"504716354"', '"-998506792"')
('"504714055"', '"-998515244"')


Let's get rid of the first line (the header line):

In [21]:
def read_lat_and_lon_by_line(filename):
    with open(filename) as f:
        first = True
        while True:
            line = f.readline()
            if first
                line = f.readline()
                first = False
            if not line:
                break
            data = line.split(',')
            yield (data[1],data[2])

SyntaxError: invalid syntax (855364294.py, line 6)

Line 6 of the cell reads "if first" with no trailing colon. Adding the colon fixes it:

In [22]:
def read_lat_and_lon_by_line(filename):
    with open(filename) as f:
        first = True
        while True:
            line = f.readline()
            if first:
                line = f.readline()
                first = False
            if not line:
                break
            data = line.split(',')
            yield (data[1],data[2])

In [23]:
count = 0
for i in read_lat_and_lon_by_line(filename):
    count = count+1
    if count > 5:
        break
    print(i)

('"504719750"', '"-998493490"')
('"504717676"', '"-998501870"')
('"504716354"', '"-998506792"')
('"504714055"', '"-998515244"')
('"504711900"', '"-998523278"')
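
A sketch of a shorter way to skip the header, assuming the same column layout as the output above (latitude in column 1, longitude in column 2). The function name read_lat_and_lon_skipping_header, the "Id" column and the sample file contents are made up for this example, so that it runs without the real data file.

```python
import os
import tempfile

def read_lat_and_lon_skipping_header(filename):
    with open(filename) as f:
        next(f, None)             # discard the header line, if there is one
        for line in f:            # a file object yields one line at a time
            data = line.rstrip('\n').split(',')
            yield (data[1], data[2])

# Tiny stand-in for the real data file (hypothetical contents)
sample = (
    '"Id","Latitude","Longitude"\n'
    '"1","504719750","-998493490"\n'
    '"2","504717676","-998501870"\n'
)
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write(sample)

rows = list(read_lat_and_lon_skipping_header(tmp.name))
os.remove(tmp.name)
print(rows)
```

Calling next() once before the loop removes the need for the first flag, and with it the if statement that caused the syntax error.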


In [24]:
((lat,lon) for (lat,lon) in read_lat_and_lon_by_line(filename) if lon < -998493490 )

<generator object <genexpr> at 0x70ff080c9c10>
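
Cell 24 shows no error only because a generator expression does nothing until something iterates over it. The values coming out of read_lat_and_lon_by_line are strings that still carry their quote characters, and Python 3 refuses to compare a string with an integer. A minimal sketch, with a hard-coded rows list standing in for the file reader (rows, lazy, fixed and west are names made up for this example):

```python
rows = [('"504719750"', '"-998493490"'), ('"504717676"', '"-998501870"')]

# Building the generator runs none of the comparisons yet
lazy = ((lat, lon) for (lat, lon) in rows if lon < -998493490)

try:
    next(lazy)                # the comparison runs here, and fails
    error_message = None
except TypeError as err:
    error_message = str(err)
print(error_message)

# Stripping the quotes and converting to int makes the filter work
fixed = [(int(lat.strip('"')), int(lon.strip('"'))) for (lat, lon) in rows]
west = [(lat, lon) for (lat, lon) in fixed if lon < -998493490]
print(west)
```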

In [25]:
import csv
def read_lat_and_lon_with_reader(filename):
    with open(filename, 'r') as csvfile:
        csvreader = csv.DictReader(csvfile)
        for row in csvreader:
            yield (int(row['Latitude']),int(row['Longitude']))
        
    

In [26]:
g = ((lat,lon) for (lat,lon) in read_lat_and_lon_with_reader(filename) if lon < -998493490 )

In [27]:
for r in g:
    print(r)

(504717676, -998501870)
(504716354, -998506792)
(504714055, -998515244)
(504711900, -998523278)
(504709729, -998531192)
(504707299, -998540018)
(504705967, -998544934)
(504703695, -998553170)
(504701326, -998561924)
(504700547, -998564668)
(504698641, -998571568)
(504696909, -998577942)
(504695977, -998581247)
(504695051, -998584562)
(504692926, -998592970)
(504692111, -998597619)
(504691299, -998606407)
(504691173, -998612263)
(504690996, -998620769)
(504690902, -998629455)
(504690919, -998630192)
(504690924, -998630680)
(504691253, -998633926)
(504691172, -998633950)
(504684780, -998633605)
(504677722, -998633704)
(504675542, -998633688)
(504673200, -998633697)
(504666467, -998633752)
(504665599, -998633751)
(504661165, -998633676)
(504656517, -998633739)
(504653635, -998633704)
(504646885, -998633610)
(504645873, -998633596)
(504638364, -998633928)
(504636279, -998634532)
(504635317, -998635011)
(504631201, -998637323)
(504630451, -998638943)
(504630652, -998641677)
(504630408, -998
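
The stated goal was a list of tuples with lat between X and Y, whereas the cells above filter on lon and leave the result as a generator. A sketch of the remaining step, where lat_min and lat_max stand in for X and Y (the values are arbitrary, chosen for illustration) and a short points list stands in for read_lat_and_lon_with_reader(filename):

```python
points = [
    (504719750, -998493490),
    (504717676, -998501870),
    (504716354, -998506792),
    (504714055, -998515244),
]

lat_min = 504715000   # hypothetical X
lat_max = 504718000   # hypothetical Y

# The chained comparison keeps only rows with lat_min <= lat <= lat_max;
# the list comprehension materialises the result as a list.
in_band = [(lat, lon) for (lat, lon) in points if lat_min <= lat <= lat_max]
print(in_band)
```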

Next: debugging with print() statements

For example, let's say there's some bad data in the data file.
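
A minimal sketch of what print() debugging might look like here, assuming the bad data is a value that int() cannot parse. The function name parse_rows_with_prints and the sample rows are made up for this example.

```python
def parse_rows_with_prints(rows):
    good = []
    for line_number, row in enumerate(rows, start=1):
        try:
            good.append((int(row['Latitude']), int(row['Longitude'])))
        except ValueError as err:
            # Print enough context to find the offending row in the file
            print('bad row', line_number, repr(row), '->', err)
    return good

sample_rows = [
    {'Latitude': '504719750', 'Longitude': '-998493490'},
    {'Latitude': 'n/a', 'Longitude': '-998501870'},   # deliberately bad
    {'Latitude': '504716354', 'Longitude': '-998506792'},
]
parsed = parse_rows_with_prints(sample_rows)
print(parsed)
```

Printing repr(row) rather than the row itself makes stray whitespace and quote characters visible.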



Understanding error stacks
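
A sketch of how to read a traceback, using a made-up pair of functions (parse_point and parse_all). Python prints the call chain from the outermost call down to the line that raised, with the exception itself on the last line, so reading from the bottom up is usually the quickest route to the cause.

```python
import traceback

def parse_point(text):
    return int(text)           # raises ValueError for bad input

def parse_all(texts):
    return [parse_point(t) for t in texts]

try:
    parse_all(['1', 'two', '3'])
except ValueError:
    # format_exc() returns the same text Python would have printed
    tb_text = traceback.format_exc()
    print(tb_text)
```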

A passing reference to pdb and pdb.set_trace()
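
As a placeholder for this topic, a sketch of where a pdb.set_trace() call could go. The function first_bad_index and its debug flag are hypothetical; the flag defaults to False so that running the cell does not stop at a debugger prompt.

```python
import pdb

def first_bad_index(values, debug=False):
    for index, value in enumerate(values):
        try:
            int(value)
        except ValueError:
            if debug:
                # Execution pauses here. At the (Pdb) prompt, try
                # p value, p index, w (show the stack) and c (continue).
                pdb.set_trace()
            return index
    return None

result = first_bad_index(['504719750', 'oops', '504716354'])
print(result)
```

Calling first_bad_index(..., debug=True) drops into the debugger at the first bad value. From Python 3.7 on, the built-in breakpoint() does the same job as pdb.set_trace() by default.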

PixieDebugger?

## Copy-and-paste errors

From https://datascienceplus.com/how-to-achieve-parallel-processing-in-python-programming/

Copy the following:
```
import multiprocessing as multip
print(“Total number of processors on your machine is: ”, multip.cpu_count())
```

What's wrong?
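
One likely answer: the quotation marks in the print() call are typographic (curly) quotes picked up from the web page, which Python does not accept as string delimiters. A sketch of one way to confirm that kind of problem, keeping the pasted line in a string so that this cell itself still runs (pasted, compiled_ok and suspects are names made up for this example):

```python
import unicodedata

# The pasted line, with the curly quotes written as escapes
pasted = 'print(\u201cTotal number of processors on your machine is: \u201d, multip.cpu_count())'

try:
    compile(pasted, '<pasted>', 'exec')
    compiled_ok = True
except SyntaxError as err:
    compiled_ok = False
    print('SyntaxError:', err.msg)

# List every non-ASCII character with its position and Unicode name
suspects = [(i, unicodedata.name(ch)) for i, ch in enumerate(pasted) if ord(ch) > 127]
print(suspects)
```

Retyping the two quote characters as plain " in the editor is then enough to make the original snippet run.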