### In this demo we walk through the basics of creating a table in Apache Cassandra, inserting rows of data, and running a simple CQL query to validate the data. We also discuss the importance of denormalization and why one table per query is an encouraged practice in Apache Cassandra.

In [1]:
import cassandra

In [2]:
from cassandra.cluster import Cluster

In [3]:
cluster=Cluster(['127.0.0.1'])
session=cluster.connect()


## Create a keyspace so that we can create tables

In [4]:
session.execute("create keyspace if not exists udacity with replication={'class':'SimpleStrategy','replication_factor':1}")

<cassandra.cluster.ResultSet at 0x1127a2898>

In [5]:
session.set_keyspace('udacity')

### Because I want to run two different queries, I need two tables that partition the data differently.
* The music library table is partitioned by year, so year is the partition key, and artist name is the clustering column that makes each primary key unique.
* The album library table is partitioned by artist name, so artist name is the partition key, and year is the clustering column that makes each primary key unique. More on primary keys in the next lesson and demo.

`Table Name: music_library
column 1: Year
column 2: Artist Name
column 3: Album Name
PRIMARY KEY(year, artist name)`


` Table Name: album_library 
column 1: Artist Name
column 2: Year
column 3: Album Name
PRIMARY KEY (artist name, year)`
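
To see why the two layouts serve different queries, the grouping can be mimicked in plain Python. This is only an illustrative sketch, not how Cassandra stores data internally: the `group_by_partition` helper and the dictionaries are hypothetical stand-ins for partitions, and the sample rows are the ones inserted later in this demo.

```python
from collections import defaultdict

# Sample rows as (year, artist_name, album_name).
ROWS = [
    (1970, "The Beatles", "Let it Be"),
    (1965, "The Beatles", "Rubber Soul"),
    (1965, "The Who", "My Generation"),
    (1966, "The Monkees", "The Monkees"),
    (1970, "The Carpenters", "Close To You"),
]


def group_by_partition(rows, partition_index, clustering_index):
    """Hypothetical helper: group rows by one column, sort each group by another."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_index]].append(row)
    for key in partitions:
        partitions[key].sort(key=lambda r: r[clustering_index])
    return dict(partitions)


# music_library: partition by year, order within a partition by artist name.
music_library = group_by_partition(ROWS, partition_index=0, clustering_index=1)
# album_library: partition by artist name, order within a partition by year.
album_library = group_by_partition(ROWS, partition_index=1, clustering_index=0)

print(music_library[1970])            # all albums released in 1970
print(album_library["The Beatles"])   # all albums by The Beatles
```

Each lookup touches a single group, which is the property the two Cassandra tables are designed to provide for their respective queries.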


In [8]:
query="create table if not exists"
session.execute(query + " music_library(year int, artist_name text, album_name text, primary key(year, artist_name))")

<cassandra.cluster.ResultSet at 0x1127ee080>

In [9]:
session.execute(query+" album_library(artist_name text,year int, album_name text,primary key(artist_name,year))")

<cassandra.cluster.ResultSet at 0x1127e7208>

## Insert into the database 

In [10]:
query = "INSERT INTO music_library (year, artist_name, album_name)"
query = query + " VALUES (%s, %s, %s)"

query1 = "INSERT INTO album_library (artist_name, year, album_name)"
query1 = query1 + " VALUES (%s, %s, %s)"


In [11]:
session.execute(query, (1970, "The Beatles", "Let it Be"))


<cassandra.cluster.ResultSet at 0x112799c18>

In [12]:
try:
    session.execute(query, (1965, "The Beatles", "Rubber Soul"))
except Exception as e:
    print(e)
    
try:
    session.execute(query, (1965, "The Who", "My Generation"))
except Exception as e:
    print(e)

try:
    session.execute(query, (1966, "The Monkees", "The Monkees"))
except Exception as e:
    print(e)

try:
    session.execute(query, (1970, "The Carpenters", "Close To You"))
except Exception as e:
    print(e)
    
try:
    session.execute(query1, ("The Beatles", 1970, "Let it Be"))
except Exception as e:
    print(e)
    
try:
    session.execute(query1, ("The Beatles", 1965, "Rubber Soul"))
except Exception as e:
    print(e)
    
try:
    session.execute(query1, ("The Who", 1965, "My Generation"))
except Exception as e:
    print(e)

try:
    session.execute(query1, ("The Monkees", 1966, "The Monkees"))
except Exception as e:
    print(e)

try:
    session.execute(query1, ("The Carpenters", 1970, "Close To You"))
except Exception as e:
    print(e)

### It might have felt unnatural to insert duplicate data into two tables. If I had normalized these tables, I wouldn't need extra copies! While this is true, remember that there are no `JOINS` in Apache Cassandra. To get the benefits of high availability and scalability, the data must be denormalized.
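
Since every album has to be written to both tables, one way to keep the copies in step is to route each insert through a single function. The sketch below assumes `session` is the connected driver session from earlier in this demo; `add_album` and the two constants are hypothetical names, not part of the driver.

```python
MUSIC_INSERT = (
    "INSERT INTO music_library (year, artist_name, album_name) "
    "VALUES (%s, %s, %s)"
)
ALBUM_INSERT = (
    "INSERT INTO album_library (artist_name, year, album_name) "
    "VALUES (%s, %s, %s)"
)


def add_album(session, year, artist_name, album_name):
    """Hypothetical helper: write the same album to both denormalized tables."""
    # Column order differs per table, so the parameters are reordered to match.
    session.execute(MUSIC_INSERT, (year, artist_name, album_name))
    session.execute(ALBUM_INSERT, (artist_name, year, album_name))
```

A call such as `add_album(session, 1970, "The Beatles", "Let it Be")` would then replace a pair of the separate inserts above. The two writes are still independent statements, so a failure between them can leave the tables out of step.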


### Let's Validate our Data Model

`select * from music_library WHERE YEAR=1970`

In [17]:
session.execute("select * from music_library WHERE YEAR=1970")



<cassandra.cluster.ResultSet at 0x11289bb70>
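
`session.execute` returns a `ResultSet`, so the cell above shows only the object's address rather than the rows themselves. A minimal sketch for reading the results follows, assuming the driver's default row factory, where each row behaves like a named tuple with columns available as attributes. `albums_for_year` is a hypothetical helper name, and `session` is the session created earlier.

```python
def albums_for_year(session, year):
    """Hypothetical helper: return (year, artist_name, album_name) tuples for one year."""
    rows = session.execute("select * from music_library WHERE YEAR=%s", (year,))
    # Iterating the result yields one row object per matching record.
    return [(row.year, row.artist_name, row.album_name) for row in rows]
```

With the rows inserted above, `albums_for_year(session, 1970)` should return the entries for "Let it Be" and "Close To You".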

## Drop the tables we just created

In [19]:
session.execute("drop table music_library")
session.execute("drop table album_library")

<cassandra.cluster.ResultSet at 0x112788080>

In [20]:
session.shutdown()
cluster.shutdown()