# Lesson 3 Exercise 1: Three Queries Three Tables
![](../images/postgresSQLlogo.png)

Walk through the basics of creating tables in Apache Cassandra, inserting rows of data, and running simple CQL queries to validate the information. You will practice denormalization and the concept of one table per query, which is an encouraged practice with Apache Cassandra.

Remember, replace ##### with your answer.

### Create a connection to the database

In [None]:
from src.database import (
    get_cs_cluster,
    get_cs_session,
    close_cs_session,
    shutdown_cs_cluster,
    create_cs_keyspace,
    drop_cs_keyspace,
    set_cs_keyspace,
)

cluster = get_cs_cluster()
session = get_cs_session(cluster)
keyspace_name = "my_keyspace"
drop_cs_keyspace(session, keyspace_name)  # drop any existing keyspace (nice for reruns)
create_cs_keyspace(session, keyspace_name)
set_cs_keyspace(session, keyspace_name)  # set the keyspace to the one we just created
tables = set()

Let's imagine we would like to start creating a Music Library of albums. 

We want to ask three questions of the data:
1. Give every album in the music library that was released in a given year
```SQL
SELECT * FROM music_library WHERE YEAR=1970
```
2. Give every album in the music library that was created by a given artist  
```SQL
SELECT * FROM artist_library WHERE artist_name='The Beatles'
```
3. Give all the information from the music library about a given album
```SQL
SELECT * FROM album_library WHERE album_name='Close To You'
```

Because we want to do three different queries, we will need different tables that partition the data differently. 

![](images/table1.png)

![](images/table2.png)

![](images/table0.png)
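
To make the idea of partitioning the same data in different ways more concrete, here is a minimal pure-Python sketch. It does not talk to Cassandra at all: `partition_by` is a hypothetical helper that simply groups rows by one column, standing in for a partition key. The sample rows are the same albums inserted further down in this exercise.

```python
from collections import defaultdict

# Sample albums as (year, artist_name, album_name).
ALBUMS = [
    (1970, "The Beatles", "Let It Be"),
    (1965, "The Beatles", "Rubber Soul"),
    (1965, "The Who", "My Generation"),
    (1970, "The Carpenters", "Close To You"),
    (1966, "The Monkees", "The Monkees"),
]


def partition_by(rows, key_index):
    """Group rows by the value at key_index (a stand-in for a partition key)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key_index]].append(row)
    return dict(partitions)


# One grouping per query: each lookup below touches a single "partition".
by_year = partition_by(ALBUMS, 0)
by_artist = partition_by(ALBUMS, 1)
by_album = partition_by(ALBUMS, 2)

print(by_year[1970])
print(by_artist["The Beatles"])
print(by_album["Close To You"])
```

Each of the three questions becomes a direct lookup in the grouping built for it, which is the intuition behind giving each query its own table.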

### TO-DO: Create the tables. 

In [None]:
from src.database import create_cs_table, drop_cs_table

# music_library
table_name_ml = "music_library"
tables.add(table_name_ml)
drop_cs_table(session, table_name_ml)  # drop the table if it exists
column_names = [
    "year",
    "artist_name",
    "album_name",
]
column_types = [
    "int",
    "text",
    "text",
]
columns_ml = dict(zip(column_names, column_types))
primary_keys = ["year", "artist_name"]
create_cs_table(
    session,
    table_name_ml,
    columns_ml,
    primary_keys,
)

# artist_library
table_name_al = "artist_library"
tables.add(table_name_al)
drop_cs_table(session, table_name_al)  # drop the table if it exists
column_names = [
    "artist_name",
    "year",
    "album_name",
]
column_types = [
    "text",
    "int",
    "text",
]
columns_al = dict(zip(column_names, column_types))
primary_keys = ["artist_name", "year"]
create_cs_table(
    session,
    table_name_al,
    columns_al,
    primary_keys,
)

# album_library
table_name_alb = "album_library"
tables.add(table_name_alb)
drop_cs_table(session, table_name_alb)  # drop the table if it exists
column_names = [
    "album_name",
    "artist_name",
    "year",
]
column_types = [
    "text",
    "text",
    "int",
]
columns_alb = dict(zip(column_names, column_types))
primary_keys = ["album_name", "artist_name"]
create_cs_table(
    session,
    table_name_alb,
    columns_alb,
    primary_keys,
)
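
The `create_cs_table` helper hides the CQL it sends. As a rough sketch of what such a statement could look like, the hypothetical function below (`build_create_table_cql` is not part of `src.database`, and the real helper may build its statement differently) assembles a `CREATE TABLE` string from the same arguments. In a CQL `PRIMARY KEY (a, b)` clause, the first column is the partition key and the remaining columns are clustering columns.

```python
def build_create_table_cql(table_name, columns, primary_keys):
    """Build a CQL CREATE TABLE statement (hypothetical helper, for illustration only).

    columns maps column name -> CQL type; primary_keys lists the partition key
    first, followed by any clustering columns.
    """
    column_defs = ", ".join(f"{name} {cql_type}" for name, cql_type in columns.items())
    key_def = ", ".join(primary_keys)
    return (
        f"CREATE TABLE IF NOT EXISTS {table_name} "
        f"({column_defs}, PRIMARY KEY ({key_def}))"
    )


example_columns = {"year": "int", "artist_name": "text", "album_name": "text"}
statement = build_create_table_cql("music_library", example_columns, ["year", "artist_name"])
print(statement)
```

Changing the order of `primary_keys` changes which column partitions the data, which is exactly what distinguishes the three tables above.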

### TO-DO: Insert data into the tables

In [None]:
from src.database import insert_cs_rows

# music_library
insert_cs_rows(
    session,
    table_name_ml,
    list(columns_ml.keys()),
    [
        (1970, "The Beatles", "Let It Be"),
        (1965, "The Beatles", "Rubber Soul"),
        (1965, "The Who", "My Generation"),
        (1970, "The Carpenters", "Close To You"),
        (1966, "The Monkees", "The Monkees"),
    ],
)

# artist_library
insert_cs_rows(
    session,
    table_name_al,
    list(columns_al.keys()),
    [
        ("The Beatles", 1970, "Let It Be"),
        ("The Beatles", 1965, "Rubber Soul"),
        ("The Who", 1965, "My Generation"),
        ("The Carpenters", 1970, "Close To You"),
        ("The Monkees", 1966, "The Monkees"),
    ],
)

# album_library
insert_cs_rows(
    session,
    table_name_alb,
    list(columns_alb.keys()),
    [
        ("Let It Be", "The Beatles", 1970),
        ("Rubber Soul", "The Beatles", 1965),
        ("My Generation", "The Who", 1965),
        ("Close To You", "The Carpenters", 1970),
        ("The Monkees", "The Monkees", 1966),
    ],
)

It might have felt unnatural to insert duplicate data into the tables. If these tables were normalized, there would be no need for extra copies. While this is true, remember that there are no `JOINS` in Apache Cassandra. To get the benefits of high availability and scalability, the data must be denormalized.
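
One way to keep the duplicated copies consistent is to derive every table's row from a single album record. The sketch below is only an illustration of that idea: `TABLE_COLUMNS` and `fan_out` are hypothetical names, and the column orders are copied from the table definitions in this exercise.

```python
# Column order for each table, mirroring the definitions used in this exercise.
TABLE_COLUMNS = {
    "music_library": ("year", "artist_name", "album_name"),
    "artist_library": ("artist_name", "year", "album_name"),
    "album_library": ("album_name", "artist_name", "year"),
}


def fan_out(album):
    """Return one row tuple per table for a single album record (hypothetical helper)."""
    return {
        table: tuple(album[column] for column in columns)
        for table, columns in TABLE_COLUMNS.items()
    }


album = {"year": 1970, "artist_name": "The Beatles", "album_name": "Let It Be"}
rows_per_table = fan_out(album)
for table, row in rows_per_table.items():
    print(table, row)
```

Writing all three rows from one source record means that the copies cannot drift apart because of a typo in one of the hand-written insert lists.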


### TO-DO: Validate the Data Model

In [None]:
query = "SELECT * FROM music_library WHERE YEAR=1970"
try:
    rows = session.execute(query)
except Exception as e:
    print(e)
else:
    # Only iterate when the query succeeded; otherwise `rows` is undefined.
    for row in rows:
        print(row.year, row.artist_name, row.album_name)

### Your output should be:
1970 The Beatles Let It Be<br>
1970 The Carpenters Close To You

### TO-DO: Validate the Data Model

In [None]:
query = "SELECT * FROM artist_library WHERE artist_name='The Beatles'"
try:
    rows = session.execute(query)
except Exception as e:
    print(e)
else:
    # Only iterate when the query succeeded; otherwise `rows` is undefined.
    for row in rows:
        print(row.artist_name, row.album_name, row.year)

### Your output should be:
The Beatles Rubber Soul 1965<br>
The Beatles Let It Be 1970

### TO-DO: Validate the Data Model

In [None]:
query = "SELECT * FROM album_library WHERE album_name='Close To You'"
try:
    rows = session.execute(query)
except Exception as e:
    print(e)
else:
    # Only iterate when the query succeeded; otherwise `rows` is undefined.
    for row in rows:
        print(row.artist_name, row.year, row.album_name)

### Your output should be:
The Carpenters 1970 Close To You

### Finally, drop the tables and close the session and cluster connection

In [None]:
for table in tables:
    drop_cs_table(session, table)
close_cs_session(session)
shutdown_cs_cluster(cluster)