## Accessing SQL via Python
In this continuation of the previous notebook, we wrap the DB functions in a small library so that the code is easier to reuse as the project grows.

We start by reading in the file as before. Note the import of our `wedge_helper` library, which lives in the file `wedge_helper.py`. Open that file up (maybe in Spyder?) so that you can see what's inside.

In [1]:
import sqlite3
from wedge_helper import *

input_file = "OwnerTransactions_30.txt"

# Let's just open the file and print the first few lines to the screen.
# Iterating over the file object reads lazily; readlines() would load the whole file first.
with open(input_file,'r',encoding="Latin-1") as ifile :
    for idx, line in enumerate(ifile) :
        print(line.strip().split("\t"))
        if idx > 3 :
            break

['datetime', 'register_no', 'emp_no', 'trans_no', 'upc', 'description', 'trans_type', 'trans_subtype', 'trans_status', 'department', 'quantity', 'Scale', 'cost', 'unitPrice', 'total', 'regPrice', 'altPrice', 'tax', 'taxexempt', 'foodstamp', 'wicable', 'discount', 'memDiscount', 'discountable', 'discounttype', 'voided', 'percentDiscount', 'ItemQtty', 'volDiscType', 'volume', 'VolSpecial', 'mixMatch', 'matched', 'memType', 'staff', 'numflag', 'itemstatus', 'tenderstatus', 'charflag', 'varflag', 'batchHeaderID', 'local', 'organic', 'display', 'receipt', 'card_no', 'store', 'branch', 'match_id', 'trans_id']
['2010-01-01T14:11:12Z', '16', '54', '119', '0000000040505', 'Oatscreme Shake $5.49', 'I', 'NA', 'NA', '14', '1', '0', '0.8', '5.49', '5.49', '5.49', '0', '1', '0', '1', '0', '0', '0', '1', '0', '0', '0.00000000', '1', '0', '0', '0', '0', '0', 'NA', '0', '5', '0', '0', '0', '0', 'NULL', '0', 'NULL', 'NA', '0', '19134', '1', '0', '0', '1']
['2010-01-01T14:11:28Z', '16', '54', '119', '034
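
If you find yourself peeking at files a lot, `itertools.islice` is a tidy alternative to the counter-and-`break` pattern. Here's a minimal sketch; the `peek` function and the three-column sample text are made up for illustration and stand in for the real file.

```python
import io
from itertools import islice

# Hypothetical stand-in for the real file so that this sketch runs on its own.
sample = io.StringIO(
    "datetime\tregister_no\tupc\n"
    "2010-01-01T14:11:12Z\t16\t0000000040505\n"
    "2010-01-01T14:11:28Z\t16\t0000000040506\n"
)

def peek(lines, n=5, delimiter="\t"):
    """Return the first n lines, each split into fields, without reading the rest."""
    return [line.rstrip("\n").split(delimiter) for line in islice(lines, n)]

rows = peek(sample, n=2)
for row in rows:
    print(row)
```

Because `islice` stops after `n` lines, the rest of the file is never read. The same call works on a handle returned by `open(input_file, encoding="Latin-1")`.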

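Let's follow our standard pattern: open the DB and create a cursor. As written, the cell below stores the DB in the file `change_me.db`; pass `':memory:'` instead if you want an in-memory DB.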

In [2]:
db = sqlite3.connect("change_me.db") # Use ':memory:' for an in-memory DB, or a directory + file if you want to store the results.
cur = db.cursor()

init_db(cur) # take a look at the .py file to see what happened here.
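
To give a feel for what a function like `init_db` could be doing, here's a minimal sketch. It is not the code in `wedge_helper.py`: the name `init_db_sketch` and the shortened column list are assumptions for illustration, and the real table has all fifty columns from the header above.

```python
import sqlite3

def init_db_sketch(cur):
    """Hypothetical stand-in for init_db: drop and recreate an empty table."""
    cur.execute("DROP TABLE IF EXISTS transactions")
    cur.execute("""CREATE TABLE transactions (
                       datetime     TEXT,
                       register_no  INTEGER,
                       emp_no       INTEGER,
                       trans_no     INTEGER,
                       department   INTEGER,
                       total        REAL,
                       trans_status TEXT,
                       card_no      INTEGER)""")

db_demo = sqlite3.connect(":memory:")
cur_demo = db_demo.cursor()
init_db_sketch(cur_demo)

# PRAGMA table_info returns one row per column; the name is in position 1.
columns = [row[1] for row in cur_demo.execute("PRAGMA table_info(transactions)")]
print(columns)
db_demo.close()
```

Dropping the table first means you can re-run the cell without hitting a "table already exists" error, which matters when the DB lives in a file rather than in memory.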

In [3]:
with open(input_file,'r',encoding="Latin-1") as ifile : # same encoding as the preview cell above
    populate_db(db,ifile,delimiter="\t",limit=None)
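
The call above takes a connection, an open file, a delimiter, and a row limit. Here's a sketch of how a loader with that shape might work, using `csv.reader` and `executemany`. Treat it as an illustration only: `populate_db_sketch`, the `demo` table, and the sample rows are all hypothetical, and the real function in `wedge_helper.py` may do things differently.

```python
import csv
import io
import sqlite3
from itertools import islice

def populate_db_sketch(db, file_handle, delimiter="\t", limit=None):
    """Hypothetical loader: skip the header, insert up to `limit` rows, commit.

    Returns the number of rows inserted. limit=None loads everything.
    """
    # QUOTE_NONE keeps stray quote characters in descriptions from confusing the parser.
    reader = csv.reader(file_handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    next(reader)  # skip the header row
    rows = [tuple(fields) for fields in islice(reader, limit)]
    db.executemany("INSERT INTO demo VALUES (?, ?, ?)", rows)
    db.commit()
    return len(rows)

# Made-up sample data standing in for the real file.
sample = io.StringIO(
    "datetime\tcard_no\ttotal\n"
    "2010-01-01T14:11:12Z\t19134\t5.5\n"
    "2010-01-01T14:11:28Z\t19134\t1.25\n"
    "2010-01-02T09:00:00Z\t10378\t3.0\n"
)

db_demo = sqlite3.connect(":memory:")
db_demo.execute("CREATE TABLE demo (datetime TEXT, card_no INTEGER, total REAL)")
n_loaded = populate_db_sketch(db_demo, sample, limit=2)
total_spend = db_demo.execute("SELECT sum(total) FROM demo").fetchone()[0]
print(n_loaded, total_spend)
db_demo.close()
```

The `?` placeholders let SQLite handle quoting, and the column types declared on the table convert the text fields to numbers on insert. A small `limit` is handy for testing before you load the whole file.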

And now let's run a query and print out the results in a semi-pretty fashion. Check out [this page](https://docs.python.org/3.2/library/string.html#format-specification-mini-language) to learn more about the formatting tricks.

In [4]:
result = cur.execute('''SELECT card_no,
                               date(datetime) as date,
                               sum(total) AS spend
                        FROM transactions
                        WHERE trans_type = 'I'
                        GROUP BY card_no, date
                        ''')

for idx,row in enumerate(result) :
    print("On {1}, card_no = {0} spent {2:,.2f}.".format(row[0],row[1],row[2]))
    if idx > 20 :
        break

On 2011-04-06, card_no = 10378 spent 30.67.
On 2011-04-09, card_no = 10378 spent 31.27.
On 2011-05-03, card_no = 10378 spent 179.53.
On 2011-06-06, card_no = 10378 spent 10.13.
On 2011-08-09, card_no = 10378 spent 47.59.
On 2011-09-05, card_no = 10378 spent 89.47.
On 2011-09-06, card_no = 10378 spent 34.64.
On 2011-09-12, card_no = 10378 spent 39.06.
On 2011-09-29, card_no = 10378 spent 40.42.
On 2011-09-30, card_no = 10378 spent 15.95.
On 2011-10-03, card_no = 10378 spent 123.06.
On 2011-10-05, card_no = 10378 spent 24.29.
On 2011-10-10, card_no = 10378 spent 28.04.
On 2011-10-28, card_no = 10378 spent 10.18.
On 2011-10-29, card_no = 10378 spent 2.69.
On 2011-11-06, card_no = 10378 spent 8.41.
On 2011-11-16, card_no = 10378 spent 35.74.
On 2011-11-22, card_no = 10378 spent 37.83.
On 2011-12-01, card_no = 10378 spent 35.07.
On 2011-12-07, card_no = 10378 spent 115.38.
On 2011-12-09, card_no = 10378 spent 22.92.
On 2011-12-11, card_no = 10378 spent -33.63.
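
Here's a quick look at what the pieces of that format string do. The `row` values below are made up so that the thousands separator shows up.

```python
# A made-up row in the same (card_no, date, spend) order as the query results.
row = (10378, "2011-12-07", 1234.5)

# {1} and {0} pick arguments by position; `,` adds a thousands separator
# and `.2f` fixes the output at two decimal places.
line = "On {1}, card_no = {0} spent {2:,.2f}.".format(*row)
print(line)

# The same format spec works inside an f-string.
same = f"On {row[1]}, card_no = {row[0]} spent {row[2]:,.2f}."

# `>10` right-aligns in a field ten characters wide, handy for lining up columns.
padded = "{:>10,.2f}".format(-33.63)
print(repr(padded))
```

Unpacking with `*row` saves typing `row[0],row[1],row[2]`, and the width specifier keeps negative amounts like the last line above aligned with the rest.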


Now let's return to our big query that correctly captures things like sales, transactions, and items. 

In [8]:
result = cur.execute('''SELECT card_no,
                                   department,
                                   substr(date(datetime),1,4) AS year,
                                   substr(date(datetime),6,2) AS month,
                                   sum(total) AS spend,
                                   count(distinct(date(datetime) || register_no ||
                                           emp_no || trans_no)) as Transactions,
                                   sum(CASE WHEN (trans_status = 'V' or trans_status = 'R') THEN -1 ELSE 1 END) as Items
                                   FROM transactions
                                   WHERE department != 0 and
                                        department != 15 and
                                        trans_status != 'M' and
                                        trans_status != 'C' and
                                        trans_status != 'J' and
                                       (trans_status = '' or 
                                        trans_status = ' ' or 
                                        trans_status = 'V' or 
                                        trans_status = 'R') and card_no = 18736
                          GROUP BY card_no, department, year, month
                          ORDER BY year, month''')

In [9]:
# Here's a way to print row-by-row results. Using `enumerate` is a good trick to give yourself a counter.
for idx,row in enumerate(result) :
    print(row)
    if idx > 10 :
        break

(18736, 4, '2011', '04', -3.29, 1, -1)
(18736, 4, '2011', '05', -3.29, 1, -1)
(18736, 2, '2011', '06', -2.01, 2, -2)
(18736, 5, '2011', '06', -6.3, 1, -1)
(18736, 4, '2011', '07', -6.58, 2, -2)
(18736, 17, '2011', '07', -2.69, 1, -1)
(18736, 4, '2011', '08', -3.29, 1, -1)
(18736, 4, '2011', '09', -4.99, 1, -1)
(18736, 2, '2011', '10', -7.15, 1, -1)
(18736, 4, '2011', '10', -3.79, 1, -1)
(18736, 2, '2011', '11', -5.0, 2, -2)
(18736, 4, '2011', '12', -3.99, 1, -1)
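
The transaction and item counts in that query are easier to follow on a tiny table. Here's a sketch with four made-up rows: three lines from one trip to the register (one of them a void) and one line from a second trip. The `'-'` separator in the transaction key is an addition of this sketch, not part of the query above; it keeps, say, register 1 with employee 23 from colliding with register 12 and employee 3.

```python
import sqlite3

db_demo = sqlite3.connect(":memory:")
db_demo.execute("""CREATE TABLE transactions (
                       datetime TEXT, register_no INTEGER, emp_no INTEGER,
                       trans_no INTEGER, trans_status TEXT, total REAL)""")

# Made-up rows for illustration only.
rows = [
    ("2011-04-06T10:00:00Z", 1, 7, 100, "",  3.0),
    ("2011-04-06T10:00:05Z", 1, 7, 100, "",  2.0),
    ("2011-04-06T10:00:09Z", 1, 7, 100, "V", -2.0),  # void of the previous item
    ("2011-04-06T15:30:00Z", 2, 9, 41,  "",  4.5),
]
db_demo.executemany("INSERT INTO transactions VALUES (?,?,?,?,?,?)", rows)

spend, n_trans, n_items = db_demo.execute("""
    SELECT sum(total),
           count(DISTINCT date(datetime) || '-' || register_no || '-' ||
                          emp_no || '-' || trans_no),
           sum(CASE WHEN trans_status IN ('V','R') THEN -1 ELSE 1 END)
    FROM transactions""").fetchone()

print(spend, n_trans, n_items)
db_demo.close()
```

A transaction is identified by the combination of date, register, employee, and transaction number, so counting distinct keys counts trips rather than line items. The `CASE` expression subtracts one for each void or return, so a voided item nets out to zero instead of being counted twice.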


Closing the connection explicitly is a best practice, so run the cell below. The connection also closes on its own when you quit the notebook. If it's an in-memory DB, the data is gone once the connection closes!

In [10]:
db.close()
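
If you'd rather not rely on remembering to call `close`, `contextlib.closing` will do it for you, even when a query raises an error. A minimal sketch, using a throwaway in-memory DB and a made-up table `t`:

```python
import sqlite3
from contextlib import closing

with closing(sqlite3.connect(":memory:")) as db_demo:
    # Using the connection itself as a context manager commits on success
    # and rolls back on an exception; it does NOT close the connection.
    with db_demo:
        db_demo.execute("CREATE TABLE t (x INTEGER)")
        db_demo.execute("INSERT INTO t VALUES (1)")
    count = db_demo.execute("SELECT count(*) FROM t").fetchone()[0]

# Outside the `closing` block the connection is closed, so further use fails.
try:
    db_demo.execute("SELECT 1")
    closed = False
except sqlite3.ProgrammingError:
    closed = True

print(count, closed)
```

Note the two different jobs here: `with db_demo:` manages the transaction, while `closing(...)` manages the connection's lifetime.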