# Python Lectures (3)

## Back to basics: types and operations

Today, I am going to go back to basics.

We have basic types for values:

* Numbers
  * Integers 
  * Floating point numbers
* Characters
* Strings

and some container types:

* Arrays
* Maps/Dictionaries
* Lists

Each of these types has its own specific operations.
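
One thing worth knowing about the list above: Python has no separate character type, a character is just a string of length one. Here is a small sketch (the variable names are only illustrative) that asks each value for its type and then uses a few type-specific operations:

```python
# Inspect the type of a few basic values.
n = 42          # integer
x = 3.14        # floating point number
c = "a"         # a "character" is a string of length 1
s = "hello"     # string

type_names = [type(v).__name__ for v in (n, x, c, s)]
print(type_names)

# Each type has its own operations: arithmetic for numbers,
# concatenation and repetition for strings.
quotient, remainder = n // 5, n % 5   # integer division and remainder
joined, repeated = s + c, c * 3       # concatenation and repetition
print(quotient, remainder, joined, repeated)
```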

### Arrays

In [1]:
A = [1,4,2,3,5,1,2,3,4]
B = [7,6,3,4,5,7]
A + B

[1, 4, 2, 3, 5, 1, 2, 3, 4, 7, 6, 3, 4, 5, 7]

In [2]:
A[1]

4
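
What I call an array here is Python's built-in `list`. A minimal sketch of a few more list operations, using the same `A` as above (the other names are mine):

```python
A = [1, 4, 2, 3, 5, 1, 2, 3, 4]

first = A[0]        # indices start at 0
last = A[-1]        # negative indices count from the end
middle = A[2:5]     # slice: items at positions 2, 3 and 4
A.append(9)         # lists are mutable: this changes A in place

print(first, last, middle, len(A))
```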

### Sets

In [3]:
U = {0, 1.2, 3, "a"}
V = {"c", -2, 10.1, 1}
print(U.union(V))
print(U.intersection(V))

{0, 1.2, 1, 3, 10.1, -2, 'a', 'c'}
set()
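
The intersection above is empty because `U` and `V` share no elements. As a sketch, here is the same idea with a made-up set `W` that does overlap with `U`, written with the operator forms of the same methods:

```python
U = {0, 1.2, 3, "a"}
W = {3, "a", "z"}    # W is a made-up set that shares two elements with U

common = U & W       # same as U.intersection(W)
either = U | W       # same as U.union(W)
only_u = U - W       # elements of U that are not in W

print(common, only_u)
```

Sets are unordered, so the order in which the elements are printed may differ from run to run.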


### Dictionaries

In [4]:
X = {"a": 1, "b": 2}
Y = {"c": 3, "d": 4}
X.update(Y)
print(X)
print(X["d"])

{'b': 2, 'd': 4, 'a': 1, 'c': 3}
4
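
Note that `update` changes `X` in place. If you would rather keep `X` intact, one option is to build a new dictionary. A minimal sketch, with a hypothetical `Y` whose key `"b"` clashes with `X`:

```python
X = {"a": 1, "b": 2}
Y = {"c": 3, "b": 20}    # hypothetical: "b" appears in both

merged = {**X, **Y}      # new dictionary; on a clash the right-hand value wins
print(merged)
print(X)                 # X is unchanged

# .get returns a default instead of raising KeyError for a missing key
missing = X.get("z", 0)
print(missing)
```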


### Tuples

In [5]:
U = (1,2,3)
V = (5,6,4)
U+V

(1, 2, 3, 5, 6, 4)
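
Tuples look like lists, but they cannot be changed once created. A short sketch of what that means in practice (names other than `U` are illustrative):

```python
U = (1, 2, 3)

try:
    U[0] = 10            # tuples do not support item assignment
except TypeError as err:
    message = str(err)
print(message)

a, b, c = U              # unpacking into separate names
pair_key = {(1, 2): "tuples can be dictionary keys"}
print(a, b, c, pair_key[(1, 2)])
```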

## Filters, maps and reductions

Comprehension is a common Pythonic idiom we use to collect objects in a specific container:

In [6]:
[ -1 if(i%2==0) else 1 for i in range(10)]

[-1, 1, -1, 1, -1, 1, -1, 1, -1, 1]
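
The same syntax works for other containers too. A minimal sketch (all names here are illustrative) with a filtering `if`, a dictionary comprehension and a set comprehension:

```python
signs = [-1 if i % 2 == 0 else 1 for i in range(10)]   # list, as above
evens = [i for i in range(10) if i % 2 == 0]            # a trailing `if` filters
squares = {i: i * i for i in range(5)}                  # dictionary comprehension
remainders = {i % 3 for i in range(10)}                 # set comprehension

print(evens)
print(squares)
```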

But there is another way of producing the same kind of effect. Say you have a collection, and you'd like to apply a function to each element of that collection and collect the results:

In [7]:
def fn(x):
    return((-1)**x)
        
map(fn, range(20))

<map at 0x7f6b2c8dbef0>
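
In Python 3, `map` does not return a list but a lazy iterator, which is why the output above is only an object address. A minimal sketch of how to see the actual values (the names `lazy`, `values` and `leftover` are mine):

```python
def fn(x):
    return (-1) ** x

lazy = map(fn, range(20))   # nothing is computed yet
values = list(lazy)         # consuming the iterator applies fn to each item
print(values[:6])

leftover = list(lazy)       # a map object can only be consumed once
print(leftover)
```

Note that the signs come out opposite to the comprehension above, since `(-1)**0` is `1`.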

But we didn't have to define a separate function for this purpose. We can define it *inline*, right where we call `map`:

In [8]:
map(lambda x: (-1)**x, range(20))

<map at 0x7f6b2c076400>

You can also filter the contents of a container using a propositional function, that is, a function that returns true or false. Below it is again written as an *anonymous function*: a function without a name, introduced by `lambda`.

In [9]:
filter(lambda x: x%2==0 and x>3, range(10))

<filter at 0x7f6b2c0766a0>
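
Like `map`, `filter` is lazy in Python 3, so wrap it in `list` to see what survived. This sketch also shows a reduction, which the section title promises but I have not shown yet. It uses `functools.reduce` from the standard library, and the names `kept`, `total` and `same` are mine:

```python
from functools import reduce

kept = list(filter(lambda x: x % 2 == 0 and x > 3, range(10)))
print(kept)

# A reduction combines all items into a single value.
total = reduce(lambda acc, x: acc + x, kept, 0)
print(total)

# The same filter written as a comprehension
same = [x for x in range(10) if x % 2 == 0 and x > 3]
```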

## Another application

Today, I am going to give you a set of *unstructured* data (less structured than JSON or XML, but more than plain text). You can download it from [here](http://download.geonames.org/export/zip/TR.zip). Unzip the file into the directory where you keep this notebook.

The data consists of the postal codes in Turkey, and is in [comma separated values (CSV)](http://en.wikipedia.org/wiki/Comma-separated_values) format, with tabs instead of commas as the separator. And of course, there is a Python library for that :)

In [11]:
import csv

# The file is tab-delimited; `with` closes it even if reading fails
with open("TR.txt", 'r', encoding='UTF8') as data:
    dataReader = csv.reader(data, delimiter="\t")
    raw = [x for x in dataReader]
print(raw[1])

['TR', '02700', 'Burçakli', 'Adiyaman', '02', 'Gerger', '8631677', '', '', '38.0283', '38.9567', '4']
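
If you don't have the file at hand yet, you can try the reader on a single line. This is a sketch that feeds `csv.reader` one tab-delimited line built from the row printed above, through `io.StringIO` instead of a real file (the variable names are mine):

```python
import csv
import io

# One line in the same tab-delimited layout as the row printed above
sample = "TR\t02700\tBurçakli\tAdiyaman\t02\tGerger\t8631677\t\t\t38.0283\t38.9567\t4\n"

reader = csv.reader(io.StringIO(sample), delimiter="\t")
row = next(reader)

# Pick out the columns by position (counting from 0)
zip_code, mahalle, il, ilce = row[1], row[2], row[3], row[5]
print(zip_code, mahalle, il, ilce)
print(len(row))
```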


The data is separated into columns. The relevant columns for us are the 2nd (the zip code), the 3rd ("mahalle", the neighborhood), the 4th ("il", the province) and the 6th ("ilce", the district). I am going to create a table for reverse lookup: given the zip code, where does it belong?

In [13]:
reverseLookup = {x[1]: [x[2],x[3],x[5]] for x in raw}

def getDistrict(zipCode):
    # the parameter is not called `zip`, so the built-in zip() stays usable
    y = reverseLookup[zipCode]
    print(y[1], y[2], y[0])
    
getDistrict('34353')

İstanbul Beşiktaş Abbasağa
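
Looking up a zip code that is not in the table raises a `KeyError`. Here is a sketch of a more forgiving variant. The helper `describe` is hypothetical, and the one-entry table stands in for the real `reverseLookup`:

```python
# A tiny stand-in for the table built above (one entry taken from the output)
reverseLookup = {"34353": ["Abbasağa", "İstanbul", "Beşiktaş"]}

def describe(zipCode):
    # hypothetical helper: returns a string instead of printing
    entry = reverseLookup.get(zipCode)   # None when the key is missing
    if entry is None:
        return "unknown zip code: " + zipCode
    mahalle, il, ilce = entry
    return " ".join([il, ilce, mahalle])

print(describe("34353"))
print(describe("00000"))
```

Also keep in mind that a dictionary holds one value per key: if several rows share a zip code, the comprehension keeps only the last of them.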


But I would really like the lookup to work in the other direction. That is, I'd like to give the province and the district and get all the neighborhoods there with their zip codes. For that I am going to need nested dictionaries:

In [14]:
lookup = {}

for x in raw:
    try:
        temp1 = lookup[x[3]]                 # province already seen?
        try:
            temp2 = temp1[x[5]]              # district already seen?
            temp2.update({x[2]: x[1]})       # add neighborhood -> zip code
        except KeyError:
            temp1.update({x[5]: {x[2]: x[1]}})
    except KeyError:
        lookup.update({x[3]: {x[5]: {x[2]: x[1]}}})
        
for i in lookup['İstanbul']['Beşiktaş'].keys():
    print("{:<12}\t{:<5}".format(i, lookup['İstanbul']['Beşiktaş'][i]))

Abbasağa    	34353
Bebek       	34342
Levent      	34330
Etiler      	34337
Türkali     	34357
Arnavutköy  	34345
Akatlar     	34335
Ortaköy     	34347
Levazim     	34340
Gayrettepe  	34349
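
The nested `try` blocks can be replaced by `dict.setdefault`, which returns the inner dictionary if it exists and creates an empty one otherwise. This is a sketch of that alternative, not the version I used above. The three sample rows are assembled from values printed earlier, with the columns we don't use left empty:

```python
# Rows with zip code, mahalle, il and ilce at positions 1, 2, 3 and 5
rows = [
    ["TR", "34353", "Abbasağa", "İstanbul", "", "Beşiktaş"],
    ["TR", "34342", "Bebek", "İstanbul", "", "Beşiktaş"],
    ["TR", "02700", "Burçakli", "Adiyaman", "02", "Gerger"],
]

lookup = {}
for x in rows:
    # setdefault returns the existing inner dictionary or creates an empty one
    lookup.setdefault(x[3], {}).setdefault(x[5], {})[x[2]] = x[1]

for mahalle, zipCode in lookup["İstanbul"]["Beşiktaş"].items():
    print("{:<12}\t{:<5}".format(mahalle, zipCode))
```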


## A finished version

Again, I will give you a finished version that I will use on the command line:

    #!/usr/bin/env python3

    ## Backend

    import csv

    FILE = "/home/kaygun/local/lib/TR.txt"

    # The file is tab-delimited and UTF-8 encoded
    with open(FILE, 'r', encoding='UTF8') as data:
        dataReader = csv.reader(data, delimiter="\t")
        raw = [x for x in dataReader]

    # zip code -> [mahalle, il, ilce], with stray whitespace stripped
    reverseLookup = {x[1].strip(" "): [s.strip(" \t") for s in (x[2], x[3], x[5])] for x in raw}

    def getDistrict(zipCode):
        y = reverseLookup[zipCode]
        print("{:<12} {:<12} {:<12}".format(y[1], y[2], y[0]))

    # il -> ilce -> mahalle -> zip code
    lookupData = {}

    for x in raw:
        try:
            temp1 = lookupData[x[3]]
            try:
                temp2 = temp1[x[5]]
                temp2.update({x[2]: x[1]})
            except KeyError:
                temp1.update({x[5]: {x[2]: x[1]}})
        except KeyError:
            lookupData.update({x[3]: {x[5]: {x[2]: x[1]}}})

    def getZipCode(a, b):
        if b == "":
            # no district given: list every district of the province
            for j in lookupData[a].keys():
                print("{:<12}".format(j))
                for i in lookupData[a][j].keys():
                    print(" "*12, end=" ")
                    print("{:<11} \t {:<5}".format(i, lookupData[a][j][i]))
        else:
            for i in lookupData[a][b].keys():
                print("{:<11} \t {:<5}".format(i, lookupData[a][b][i]))


    ## Front end

    import sys

    n = len(sys.argv)

    if n < 3:
        print("error")
        sys.exit(1)
    elif sys.argv[1] == "reverse":
        getDistrict(sys.argv[2])
    elif sys.argv[1] == "zip":
        if n == 3:
            getZipCode(sys.argv[2], "")
        else:
            getZipCode(sys.argv[2], sys.argv[3])
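
The hand-rolled check on `sys.argv` could also be written with the standard `argparse` module, which gives you usage messages for free. This is only a sketch of that alternative: the function `parse` and the argument names `zipcode`, `il` and `ilce` are my own choices, and the result would still have to be passed on to `getDistrict` or `getZipCode`.

```python
import argparse

def parse(argv):
    # hypothetical front end: two subcommands mirroring "reverse" and "zip"
    parser = argparse.ArgumentParser(description="Turkish postal code lookup")
    sub = parser.add_subparsers(dest="command", required=True)

    rev = sub.add_parser("reverse", help="zip code -> district")
    rev.add_argument("zipcode")

    fwd = sub.add_parser("zip", help="province [district] -> zip codes")
    fwd.add_argument("il")
    fwd.add_argument("ilce", nargs="?", default="")   # district is optional

    return parser.parse_args(argv)

args = parse(["zip", "İstanbul", "Beşiktaş"])
print(args.command, args.il, args.ilce)
```

In a real script you would call `parse(sys.argv[1:])` instead of passing a fixed list.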

## Homework

You can download IETT bus line information from [here](iettbus.json). The data is in [JSON](http://en.wikipedia.org/wiki/JSON) format. Until next time, look at the data, try to understand it, and come up with ideas as to what we can do with it. We'll discuss.
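
To get started, here is a sketch of one way to look at the shape of a JSON document you have never seen. The helper `outline` is hypothetical, and the text it is tried on is a placeholder I made up, not the real contents of `iettbus.json`:

```python
import json

def outline(obj, depth=0, max_depth=2):
    """Return lines describing the nesting of a decoded JSON value."""
    pad = "  " * depth
    if isinstance(obj, dict):
        lines = [pad + "dict with keys " + ", ".join(sorted(obj))]
        children = list(obj.values())
    elif isinstance(obj, list):
        lines = [pad + "list of length {}".format(len(obj))]
        children = obj[:1]          # look at the first item only
    else:
        return [pad + type(obj).__name__]
    if depth < max_depth:
        for child in children:
            lines.extend(outline(child, depth + 1, max_depth))
    return lines

# Placeholder text, NOT the real contents of iettbus.json
text = '{"lines": [{"code": "X1", "stops": ["A", "B"]}]}'
report = outline(json.loads(text))
print("\n".join(report))
```

For the actual file you would open it and use `json.load` on the file object instead of `json.loads` on a string.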