# Django Object Relational Mapper (ORM)

## Introduction

This notebook provides a brief overview of the [Django](https://docs.djangoproject.com/en/3.1/) [ORM](https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping), which allows us to access the database using Python objects. We will assume you have some working knowledge of Python, databases ([PostgreSQL](https://www.postgresql.org/)) and [object-oriented programming](https://pythonguides.com/object-oriented-programming-python/). New concepts relating to the Django ORM will be introduced as we work through the notebook.

The objective of the notebook is to get you familiar with the Python commands that you can use to access the data. By the end of the notebook you should understand how to do the following using Python with Django:

* Query the data
* Write data to the database
* Write post-processing scripts using the data

Some cells in the notebook are intentionally left blank so that you can fill them in to test your understanding of Django syntax and practise writing the code.

## Initial Setup

To get started, run the cell below, which configures our Python environment so that it has access to Django shell functionality. **You must run this cell before any other cells**.

In [1]:
# Setup environment
import sys
import os
import django
sys.path.append('src')
os.environ['DJANGO_SETTINGS_MODULE'] = 'demo.settings'
os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"
django.setup()

## Context

For the purposes of this tutorial, let's suppose we are the managers of a book shop franchise. There are a number of stores and books that we stock. We've got a database system to reflect the physical system (because that's what everyone else is doing).

## Models

The most important concept in the ORM is the [Django Model](https://docs.djangoproject.com/en/3.1/topics/db/models/). A model is a Pythonic representation of a database table: it defines the name and data type of each field in the table. For those interested, here is how the `models.py` files for our books, stores, inventory and sales look:

```
# src/stores/models.py
from django.db import models

class Store(models.Model):
    id = models.AutoField(primary_key=True)
    name = models.CharField(max_length=100)
    address = models.TextField()

# src/books/models.py
from django.db import models
from stores.models import Store

class Book(models.Model):
    id = models.AutoField(primary_key=True)
    title = models.CharField(max_length=100)
    author = models.CharField(max_length=200)
    price = models.FloatField()
    store = models.ForeignKey(Store, to_field='id', null=True, blank=True, on_delete=models.SET_NULL)

# src/sales/models.py
from django.db import models
from books.models import Book
from stores.models import Store

class Sale(models.Model):
    id = models.AutoField(primary_key=True)
    book = models.ForeignKey(Book, to_field='id', on_delete=models.DO_NOTHING)
    store = models.ForeignKey(Store, to_field='id', on_delete=models.DO_NOTHING)
    # auto_now=True resets this field to the current time on every save()
    sold_at = models.DateTimeField(auto_now=True)

# src/inventory/models.py
from django.db import models
from books.models import Book
from stores.models import Store

class Inventory(models.Model):
    id = models.AutoField(primary_key=True)
    book = models.ForeignKey(Book, to_field='id', on_delete=models.DO_NOTHING)
    store = models.ForeignKey(Store, to_field='id', on_delete=models.DO_NOTHING)
    quantity = models.IntegerField()
```

This reflects the types of data contained in each table in the database. From the model definitions, you can immediately understand the database schema. These files can also be generated automatically from an existing database, for example with Django's `inspectdb` management command.

This [page](https://docs.djangoproject.com/en/3.1/topics/db/models/) provides an in-depth reference for Django Models that you can read for more information.

# Simple Queries

### Check our inventory!

The first step is to check which books we currently have stocked across all of our book stores. Actually, we don't even know which book stores we have, so let's check that too. To access the data, you need to import the relevant Django models. Run the cells below to import the models and run a simple query.

In [2]:
from books.models import Book
from stores.models import Store

In [3]:
all_books = Book.objects.all()
all_stores = Store.objects.all()

print(all_books)
print(all_stores)

<QuerySet [<Book: Book object (1)>, <Book: Book object (2)>, <Book: Book object (3)>, <Book: Book object (4)>, <Book: Book object (5)>, <Book: Book object (6)>]>
<QuerySet [<Store: Store object (1)>]>


Each model class has an `objects` manager, and calling `.objects.all()` on it returns the rows of the table as a `QuerySet`. A `QuerySet` behaves much like a list (you can iterate over it, index it and slice it), but it is lazy: the database is only queried when the results are actually needed. In this case, since we queried for `all()`, the result is every row of the table. If we apply a filter to the query, the `QuerySet` will contain only a subset of the rows.

Let's look at one of the books. We can retrieve items (rows) from our `QuerySet` by indexing, as we do for lists.

In [4]:
print(all_books[0])
print(type(all_books[0]))

Book object (1)
<class 'books.models.Book'>


The book, which is a row in the database, is represented as an instance of the Django model defined above. When it is printed, it just shows `Book object (1)`, which is not very informative. Let's look at the properties of this object.

In [5]:
all_books.first().__dict__

{'_state': <django.db.models.base.ModelState at 0x1103d7550>,
 'id': 1,
 'title': 'Sapiens',
 'author': 'Yuval Harari',
 'price': 29.99,
 'store_id': 1}

We can retrieve the information about the book by inspecting this `__dict__` attribute. This is handy for looking at all properties of an object (looking at the values of all fields in the row).

If we want to write a script that makes use of a specific property, there is an easier way to access it: each field is available as an attribute of the object. Let's print the titles of all the books we have in our store.

In [6]:
for book in Book.objects.all():
    print(book.title)

Sapiens
The New Jim Crow: Mass Incarceration in the Age of Colorblindness
Range: Why Generalists Triumph in a Specialized World
The Splendid and the Vile: A Saga of Churchill, Family, and Defiance During the Blitz
The Spy and the Traitor: The Greatest Espionage Story of the Cold War
Breath from Salt: A Deadly Genetic Disease, a New Era in Science, and the Patients and Families Who Changed Medicine


We also have a Store table in the database with one entry. What is the name of the store?

In [15]:
# TODO(user): Retrieve the name of the Store object in the database.




Cool! So we know how to get all rows from the database for a table of interest. We know how to access objects in that list, and look at properties of those objects. 

# Write data

### Opening a new shop

Let's suppose we've just opened up a new shop. We need to update the database to reflect this change.

In [7]:
Store.objects.create(
    name="Awesome Books",
    address="36 Stirling Hwy, Perth, AU 6009"
)

<Store: Store object (2)>

Now if we look at all of the stores, you will see that we have two. Do this in the cell below.

In [8]:
# TODO(user): Check that there are now two stores. Check the names of each of the stores
# to make sure one of them is the one you just created




**NOTE**: If you run a cell twice you will accidentally create two objects, because each run writes a new row with the same content to the database. If you do this by accident, no worries. This database has no uniqueness constraints to prevent duplicate rows, but a production database probably will.

To access one particular object, even when duplicates of it exist, you can use its `id` field. This is the row id, which is unique for each object. In Django, this is:

``` 
crawley_store = Store.objects.get(id=2)
```

### Add stock to the existing shop!

So we now have two shops and some number of books. Since our database is [normalised](https://en.wikipedia.org/wiki/Database_normalization), the table that tracks the number of books we have in stock is separate from the Book table. We haven't had the chance yet to look at our inventory, so let's do that quickly.

In [17]:
from inventory.models import Inventory

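# select_related fetches the related Book and Store rows in the same query,
# so item.book and item.store do not trigger extra lookups per row.
for item in Inventory.objects.select_related("book", "store"):
    print(f"({item.quantity}, {item.book.title}, {item.store.name})\n")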

(4, Sapiens, Perth Book Shop)

(13, The New Jim Crow: Mass Incarceration in the Age of Colorblindness, Perth Book Shop)

(7, Range: Why Generalists Triumph in a Specialized World, Perth Book Shop)

(9, The Splendid and the Vile: A Saga of Churchill, Family, and Defiance During the Blitz, Perth Book Shop)

(15, The Spy and the Traitor: The Greatest Espionage Story of the Cold War, Perth Book Shop)

(341, Breath from Salt: A Deadly Genetic Disease, a New Era in Science, and the Patients and Families Who Changed Medicine, Perth Book Shop)



Great, so now we know how many copies of each book we currently have in each store. Let's add some stock to the new bookshop.

In [18]:
# Adding some Books to the new Store by updating the Inventory table.

Inventory.objects.create(
    quantity=10,
    book_id=1,
    store_id=2
)

Inventory.objects.create(
    quantity=15,
    book_id=4,
    store_id=2
)

Inventory.objects.create(
    quantity=15,
    book_id=6,
    store_id=2
)

<Inventory: Inventory object (9)>

### New books!

We've got a new book that we want to add to our shop in Crawley. There are a lot of students in Crawley so we decided to add a book that targets them. Now that you've seen how to create items in the Inventory, you will fill out the content for this cell. The information for the book is here:

```
title = "This Side of Paradise"
author = "Francis Scott Fitzgerald"
price = 22.99
```

In [None]:
# TODO(user): Create Book and add the Inventory row





If you have added the Book and Inventory rows correctly, you should now see 4 different items in the new Crawley book store.

In [29]:
# TODO(user): Run this to check that you have added the book and inventory rows to the database
# NOTE: If you ran the cell above multiple times this should still show True.

print(f"Correct number of books in the Crawley store? {Inventory.objects.filter(store_id=2).count() >= 4}")
print(f"Correct number of books in all the stores? {Inventory.objects.count() >= 10}")

Correct number of books in the Crawley store? False
Correct number of books in all the stores? False


# Complex Queries

### Generating sales

Let's generate some random sales so that we can demonstrate some complex queries. We're going to generate a lot of sales, with dates spread from 2010 to the present, to populate the database.

In [33]:
# Create some sales records in the database

import random
from datetime import datetime, timedelta
from sales.models import Sale

# variables for generating random datetime values
min_year = 2010
max_year = datetime.now().year
start = datetime(min_year, 1, 1, 0, 0, 0)
years = max_year - min_year + 1
end = start + timedelta(days=365 * years)

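# Generate sales
N = 2000
book_count = Book.objects.count()
store_count = Store.objects.count()
for i in range(N):
    random_date = start + (end - start) * random.random()
    sale = Sale.objects.create(
        book_id=random.randint(1, book_count),
        store_id=random.randint(1, store_count),
    )
    # `sold_at` uses auto_now=True, so save() would overwrite it with the
    # current time; update() writes the random date straight to the database.
    Sale.objects.filter(id=sale.id).update(sold_at=random_date)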

In [34]:
# Confirm that we have a lot of sales

Sale.objects.count()

2001

# Analysis