# Tutorial 2: Interacting with Databases in Sessions

In this tutorial we will create and then interact with a *relational* SQLite database, using `sqlalchemy`'s `Session` object.

We will:

* Create a relational database containing two linked tables: "Objects" (with ID, ra and dec) and "Distances" (with ID and distance).
* Retrieve data into numpy arrays so that it can be used For Science.

This tutorial was adapted for astronomy from [the one at `pythoncentral.io`](http://pythoncentral.io/introductory-tutorial-python-sqlalchemy/).

### Requirements

You will need to have `sqlalchemy` installed.
```
pip install sqlalchemy
```

In [15]:
import sqlalchemy as sq
import os
import numpy as np
import random

In the previous tutorial, we used a somewhat manual approach where we separately defined a Table in a database, a mapper that would connect a python object to the database, and the python object itself. This is tedious.

In this tutorial, we will use an alternative approach: declaratives. The advantage of using declaratives is that they allow us to define `Table`, mappers and python objects at once in a class definition. Let's see.

In [3]:
from sqlalchemy.ext.declarative import declarative_base

## Creating the Database

Initialize a declarative base class, from which our table classes will inherit.

In [4]:
Base = declarative_base()

Now we define Python classes for the two linked tables: "Objects" and "Distances".

In [5]:
class Object(Base):
    __tablename__ = 'object'
    
    #Define columns like we did in the last notebook
    id = sq.Column(sq.Integer, primary_key=True)
    ra = sq.Column(sq.Float, nullable=False)
    dec = sq.Column(sq.Float, nullable=False)

    
class Distance(Base):
    __tablename__ = 'distance'
    
    # Again, define columns as above. Each column is declared as a
    # class attribute and is accessible on instances as a normal attribute.
    id = sq.Column(sq.Integer, primary_key=True)
    object_id = sq.Column(sq.Integer, sq.ForeignKey("object.id"), nullable=False)
    distance = sq.Column(sq.Float, nullable=False)
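
To see that a declarative class really does define the `Table` along with the Python class, we can look at the `__table__` attribute that the base class attaches. This is a minimal sketch: it repeats the two class definitions so that it runs on its own, and it imports `declarative_base` from `sqlalchemy.orm`, which assumes SQLAlchemy 1.4 or newer.

```python
import sqlalchemy as sq
# Assumption: SQLAlchemy 1.4+, where declarative_base is also available here.
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Object(Base):
    __tablename__ = 'object'
    id = sq.Column(sq.Integer, primary_key=True)
    ra = sq.Column(sq.Float, nullable=False)
    dec = sq.Column(sq.Float, nullable=False)


class Distance(Base):
    __tablename__ = 'distance'
    id = sq.Column(sq.Integer, primary_key=True)
    object_id = sq.Column(sq.Integer, sq.ForeignKey("object.id"), nullable=False)
    distance = sq.Column(sq.Float, nullable=False)


# Each class carries the Table that the declarative machinery built for it.
object_columns = Object.__table__.columns.keys()

# The foreign key on Distance points at the 'id' column of the 'object' table.
fk_targets = [fk.target_fullname for fk in Distance.__table__.foreign_keys]

# Both tables are registered on the shared metadata of the base class.
registered_tables = sorted(Base.metadata.tables)

print(object_columns)
print(fk_targets)
print(registered_tables)
```

Nothing has touched a database yet at this point: the tables exist only as Python-side descriptions until `create_all` is called.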

Remove the database file if it already exists from a previous run, then create a local engine to store the data.

In [6]:
dbfile = 'sessions.db'

# Remove a database file left over from a previous run, if any.
try:
    os.remove(dbfile)
except FileNotFoundError:
    pass

In [7]:
engine = sq.create_engine('sqlite:///'+dbfile)

Now, create tables in the database that we defined above.

In [8]:
Base.metadata.create_all(engine)
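
One way to confirm what `create_all` did is to ask the database itself which tables it now holds. The sketch below is an illustration rather than part of the original notebook: it uses an in-memory SQLite database (`sqlite://`) instead of `sessions.db`, repeats the class definitions so that it is self-contained, and assumes SQLAlchemy 1.4 or newer for the `sqlalchemy.orm` import.

```python
import sqlalchemy as sq
# Assumption: SQLAlchemy 1.4+.
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Object(Base):
    __tablename__ = 'object'
    id = sq.Column(sq.Integer, primary_key=True)
    ra = sq.Column(sq.Float, nullable=False)
    dec = sq.Column(sq.Float, nullable=False)


class Distance(Base):
    __tablename__ = 'distance'
    id = sq.Column(sq.Integer, primary_key=True)
    object_id = sq.Column(sq.Integer, sq.ForeignKey("object.id"), nullable=False)
    distance = sq.Column(sq.Float, nullable=False)


# In-memory database, used here only so the example leaves no file behind.
engine = sq.create_engine('sqlite://')
Base.metadata.create_all(engine)

# The inspector reads the schema back from the database.
inspector = sq.inspect(engine)
table_names = sorted(inspector.get_table_names())
object_columns = [col['name'] for col in inspector.get_columns('object')]

print(table_names)
print(object_columns)
```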

Check the size of the database file. It already contains the table definitions, but there aren't any entries yet.

In [9]:
!wc -c $dbfile

   12288 sessions.db


## Inserting Data

For all subsequent operations (insert, update, delete, etc.), we are going to use a `Session`, created by a `sessionmaker`, which manages all conversations with the database. A session represents a 'staging zone' for the objects we add or load, much like the staging area in git: changes are not persisted to the database until they are committed.

Notice that we import `sessionmaker` from the ORM module of sqlalchemy. ORM stands for object-relational mapping, a technique for making relational data compatible with object-oriented languages.

In [10]:
from sqlalchemy.orm import sessionmaker

Bind the engine to the metadata of the Base class so that the declaratives can be accessed through a DBSession instance.

In [11]:
Base.metadata.bind = engine

In [12]:
DBSession = sessionmaker(bind=engine)

In [13]:
session = DBSession()
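
The 'staging zone' behaviour can be checked directly: an object added to a session is only pending until `commit()` is called, and `rollback()` discards it. The following is a small self-contained sketch under a few assumptions: it uses an in-memory SQLite database and a single `Object` table rather than the tutorial's `sessions.db`, and it assumes SQLAlchemy 1.4 or newer.

```python
import sqlalchemy as sq
# Assumption: SQLAlchemy 1.4+.
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Object(Base):
    __tablename__ = 'object'
    id = sq.Column(sq.Integer, primary_key=True)
    ra = sq.Column(sq.Float, nullable=False)
    dec = sq.Column(sq.Float, nullable=False)


engine = sq.create_engine('sqlite://')  # in-memory database for illustration
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Stage an object: it is pending in the session, not yet in the database.
staged = Object(ra=10.0, dec=-31.0)
session.add(staged)
is_pending = staged in session.new

# Roll back: the pending object is discarded and nothing is written.
session.rollback()
count_after_rollback = session.query(Object).count()

# Stage another object and commit: now the row is persisted.
session.add(Object(ra=10.5, dec=-32.0))
session.commit()
count_after_commit = session.query(Object).count()

print(is_pending, count_after_rollback, count_after_commit)
```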

Insert 100 new objects into the database, with `ra` values drawn uniformly between `9.0` and `11.0` and `dec` values drawn uniformly between `-33.0` and `-30.0`.

In [16]:
for i in range(100):
    ra = random.uniform(9.0, 11.0)
    dec = random.uniform(-33.0, -30.0)
    new_object = Object(ra=ra, dec=dec)
    session.add(new_object)
# Commit once, after all objects have been staged.
session.commit()

Create 100 distance entries for the objects we inserted above.

In [18]:
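# Object ids start at 1, so loop over 1..100 to match them.
for i in range(1, 101):
    d = random.uniform(40, 60)
    new_distance = Distance(distance=d, object_id=i)
    session.add(new_distance)
# Commit once, after all distances have been staged.
session.commit()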
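
A caveat about the `ForeignKey` on `object_id`: SQLite only rejects a row that points at a non-existent object when its `foreign_keys` pragma is switched on, and in most builds it is off by default. The sketch below illustrates this with the standard-library `sqlite3` module rather than SQLAlchemy, on a hand-written schema that only approximates the tables above. The helper `try_orphan_insert` is a hypothetical name introduced for this example.

```python
import sqlite3


def try_orphan_insert(enforce):
    """Insert a distance row whose object_id matches no object.

    Returns True if SQLite accepted the row, False if it was rejected.
    """
    conn = sqlite3.connect(":memory:")
    # The pragma must be set before any transaction is open.
    conn.execute("PRAGMA foreign_keys = ON" if enforce else "PRAGMA foreign_keys = OFF")
    conn.execute(
        "CREATE TABLE object ("
        "id INTEGER PRIMARY KEY, ra FLOAT NOT NULL, dec FLOAT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE distance ("
        "id INTEGER PRIMARY KEY, "
        "object_id INTEGER NOT NULL REFERENCES object(id), "
        "distance FLOAT NOT NULL)"
    )
    try:
        # No object rows exist, so object_id=0 refers to nothing.
        conn.execute("INSERT INTO distance (object_id, distance) VALUES (0, 50.0)")
        accepted = True
    except sqlite3.IntegrityError:
        accepted = False
    finally:
        conn.close()
    return accepted


accepted_without_enforcement = try_orphan_insert(enforce=False)
accepted_with_enforcement = try_orphan_insert(enforce=True)
print(accepted_without_enforcement, accepted_with_enforcement)
```

Without enforcement, a mismatched `object_id` is stored silently, so it is worth checking that the ids used for distances match the ids of the inserted objects.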

## Querying Data

Make a query for the first object in the database and see its attributes.

In [19]:
obj = session.query(Object).first()
obj.id, obj.ra, obj.dec

(1, 9.491550129901015, -31.495161639778225)

Retrieve all distances in the form of a numpy array.

In [20]:
dist = session.query(Distance)
np.array([d.distance for d in dist])

array([ 41.55631757,  57.90184078,  51.29596897,  45.87167048,
        48.98650734,  43.19694914,  54.90462618,  47.85986943,
        56.73155748,  59.09925831,  47.38922876,  55.26618305,
        58.9605005 ,  48.86112729,  50.92573277,  42.07525971,
        58.19197932,  49.67756417,  48.50427822,  55.6368257 ,
        56.58733926,  54.41680138,  49.73433608,  44.98956906,
        51.47046032,  55.88777029,  55.99326791,  54.82224542,
        40.54903374,  45.4968623 ,  42.01675236,  54.33336558,
        47.30987913,  52.43771133,  58.15421697,  47.81324135,
        51.09859509,  43.04859157,  52.7484232 ,  40.2992613 ,
        55.5925934 ,  42.39493602,  56.35458455,  50.50169572,
        56.20704611,  45.04368455,  53.62119493,  54.19584577,
        53.83046376,  44.16106615,  57.96905467,  43.15174839,
        55.21043269,  54.61693205,  55.83102914,  44.76144331,
        59.85239727,  57.82957266,  52.23613573,  52.94035273,
        58.72359882,  53.52323215,  43.3831426 ,  52.53
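
The same idea extends to several columns at once. Querying the columns themselves, rather than whole `Object` instances, returns one row per object, which can be packed into a two-dimensional array. This is a sketch with three made-up coordinate pairs in an in-memory database. It assumes SQLAlchemy 1.4 or newer, and none of the values come from the tutorial's data.

```python
import numpy as np
import sqlalchemy as sq
# Assumption: SQLAlchemy 1.4+.
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Object(Base):
    __tablename__ = 'object'
    id = sq.Column(sq.Integer, primary_key=True)
    ra = sq.Column(sq.Float, nullable=False)
    dec = sq.Column(sq.Float, nullable=False)


engine = sq.create_engine('sqlite://')  # in-memory database for illustration
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Illustrative values only.
session.add_all([
    Object(ra=9.5, dec=-30.5),
    Object(ra=10.0, dec=-31.0),
    Object(ra=10.5, dec=-32.5),
])
session.commit()

# Ask for the two columns only; each row unpacks as (ra, dec).
rows = session.query(Object.ra, Object.dec).order_by(Object.id).all()
coords = np.array([(ra, dec) for ra, dec in rows])

print(coords.shape)   # one row per object, one column per coordinate
print(coords[:, 0])   # all ra values
```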

Apply the `filter()` method to a query to get specific results.

In [27]:
d = session.query(Distance).filter(Distance.object_id==80)
d.first().distance

58.97370028186927
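
Because the two tables are linked through `object_id`, a filter can also span both of them with a join. The sketch below selects the ids of objects whose distance exceeds a threshold. The data, the explicit ids, and the threshold of `50.0` are made up for illustration. It uses an in-memory database and assumes SQLAlchemy 1.4 or newer.

```python
import sqlalchemy as sq
# Assumption: SQLAlchemy 1.4+.
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Object(Base):
    __tablename__ = 'object'
    id = sq.Column(sq.Integer, primary_key=True)
    ra = sq.Column(sq.Float, nullable=False)
    dec = sq.Column(sq.Float, nullable=False)


class Distance(Base):
    __tablename__ = 'distance'
    id = sq.Column(sq.Integer, primary_key=True)
    object_id = sq.Column(sq.Integer, sq.ForeignKey("object.id"), nullable=False)
    distance = sq.Column(sq.Float, nullable=False)


engine = sq.create_engine('sqlite://')  # in-memory database for illustration
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Illustrative values only, with explicit ids so the link is easy to follow.
session.add_all([
    Object(id=1, ra=9.5, dec=-30.5),
    Object(id=2, ra=10.0, dec=-31.0),
    Object(id=3, ra=10.5, dec=-32.5),
])
session.add_all([
    Distance(object_id=1, distance=45.0),
    Distance(object_id=2, distance=55.0),
    Distance(object_id=3, distance=58.5),
])
session.commit()

# Join each object to its distance, then keep only the distant ones.
rows = (
    session.query(Object.id, Distance.distance)
    .join(Distance, Distance.object_id == Object.id)
    .filter(Distance.distance > 50.0)
    .order_by(Object.id)
    .all()
)
far_objects = [(obj_id, dist) for obj_id, dist in rows]
print(far_objects)
```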