# L1 Exercise 2: Creating a Table with Apache Cassandra
<img src="https://upload.wikimedia.org/wikipedia/commons/5/5e/Cassandra_logo.svg" width="250" height="250">

### Walk through the basics of Apache Cassandra. Complete the following tasks:<li> Create a table in Apache Cassandra, <li> Insert rows of data,<li> Run a simple CQL query to validate the information. <br>
`#####` denotes where the code needs to be completed.
    
Note: __Do not__ click the blue Preview button in the lower taskbar

<center><h1><span style='color:blue'>Environment preparation</span></h1></center>

The Udacity environment has been prepared to ease the student's task, i.e. it already has an Apache Cassandra instance available for the training exercises.

To run the exercise elsewhere, let's create one based on Kubernetes.

* Add the Cassandra driver module to Python
* Deploy Apache Cassandra on K8s

In [1]:
# Install the Apache Cassandra driver module for Python
!pip install cassandra-driver



<h3><span style='color:blue'>Using K8S Cassandra</span></h3>
You need a K8s cluster available, e.g. Minikube, Minishift, or Docker (with K8s).

Helm is needed too; see [helm.sh](http://helm.sh).

In [2]:
from time import sleep

In [3]:
#Checks if Helm V3 is available
helm_version = !helm version --short
assert helm_version[0][:2] == 'v3', "Expected HELM version not available, visit https://helm.sh"
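
The prefix check above can also be written as a small pure function, which is easier to test on a machine without Helm. This is only a sketch: `helm_major_version` is a hypothetical helper name, and the sample strings are assumed examples of what `helm version --short` may print, not captured output.

```python
import re

def helm_major_version(version_line):
    """Return the major version from a `helm version --short` line, or None."""
    # Only accept lines that start with "v<digits>." (assumed Helm v3 short format).
    match = re.match(r"v(\d+)\.", version_line.strip())
    return int(match.group(1)) if match else None

# Assumed sample strings, for illustration only.
print(helm_major_version("v3.0.2+g19e47ee"))
print(helm_major_version("Client: v2.16.1+gbbdfe5e"))
```

In the notebook, the assertion could then read `assert helm_major_version(helm_version[0]) == 3`.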

In [4]:
!helm repo add bitnami https://charts.bitnami.com/bitnami

"bitnami" has been added to your repositories


In [5]:
CHART_INSTANCE_NAME = 'dend-l1e2'

In [6]:
helm_chart_out = !helm install {CHART_INSTANCE_NAME} bitnami/cassandra
for c_out in helm_chart_out: print(c_out)

NAME: dend-l1e2
LAST DEPLOYED: Sun Jan 19 13:43:57 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
** Please be patient while the chart is being deployed **

Cassandra can be accessed through the following URLs from within the cluster:

  - CQL: dend-l1e2-cassandra.default.svc.cluster.local:9042
  - Thrift: dend-l1e2-cassandra.default.svc.cluster.local:9160

To get your password run:

   export CASSANDRA_PASSWORD=$(kubectl get secret --namespace default dend-l1e2-cassandra -o jsonpath="{.data.cassandra-password}" | base64 --decode)

Check the cluster status by running:

   kubectl exec -it --namespace default $(kubectl get pods --namespace default -l app=cassandra,release=dend-l1e2 -o jsonpath='{.items[0].metadata.name}') nodetool status

To connect to your Cassandra cluster using CQL:

1. Run a Cassandra pod that you can use as a client:

   kubectl run --namespace default dend-l1e2-cassandra-client --rm --tty -i --restart='Never' \
   --env CASSANDRA_PASS

In [7]:
cassandra_port_forward_command = helm_chart_out[-2].strip()

In [9]:
# Waits until Cassandra is running on K8s
max_checks_cassandra_run = 40

!kubectl get pods

while max_checks_cassandra_run > 0:

    cassandra_is_running = !kubectl get pods|fgrep {CHART_INSTANCE_NAME}|fgrep "1/1"|fgrep "Running"
    
    if len(cassandra_is_running) > 0 and cassandra_is_running[0] != 'No resources found.':
        break
    else:
        sleep(5)

        max_checks_cassandra_run -= 1

!kubectl get pods
assert max_checks_cassandra_run > 0, "Probably Cassandra is not running"

NAME                    READY   STATUS    RESTARTS   AGE
dend-l1e2-cassandra-0   1/1     Running   0          2m47s
NAME                    READY   STATUS    RESTARTS   AGE
dend-l1e2-cassandra-0   1/1     Running   0          2m48s
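
The waiting loop above is a bounded polling pattern: try a check, sleep, and give up after a fixed number of attempts. A minimal sketch of the same idea as a reusable function follows; `wait_until` and `ready` are hypothetical names, and the defaults simply mirror the 40 attempts and 5 seconds used in the cell.

```python
from time import sleep

def wait_until(check, attempts=40, delay=5.0):
    """Call check() up to `attempts` times; return True as soon as it succeeds."""
    for _ in range(attempts):
        if check():
            return True
        sleep(delay)
    return False

# Toy usage: the check succeeds on its third call.
calls = []

def ready():
    calls.append(1)
    return len(calls) >= 3

print(wait_until(ready, attempts=5, delay=0))
```

Returning a boolean keeps the final `assert` in one place and avoids decrementing a counter by hand.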


<h3><span style='color:blue'>Open Proxy to Cassandra on K8s</span></h3>
Run the next command in a separate terminal if you are not running this notebook in Jupyter ;-)

In [10]:
%%script env --bg bash --out console_out
nohup kubectl port-forward --namespace default svc/dend-l1e2-cassandra 9042:9042 &

In [11]:
# Checks if proxy is enabled
pids_kubectl_proxy = !ps -ef|fgrep 'kubectl port-forward'|fgrep $CHART_INSTANCE_NAME|cut -d ' ' -f4
assert len(pids_kubectl_proxy) > 1, f"No kubectl proxy found, try in a console: '{cassandra_port_forward_command}'"

In [12]:
# Extracts the kubectl command from the Helm notes and runs it to get the Cassandra password
cassandra_password = helm_chart_out[16].split('(')[1][:-1]
cassandra_password = !{cassandra_password}
cassandra_password = cassandra_password[0]
cassandra_password

'ROp5iWjMfr'
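
The command printed in the Helm notes pipes the secret through `base64 --decode`. If you prefer to fetch only the raw `jsonpath` value with `kubectl` and decode it in Python, a sketch could look like this; `decode_secret` is a hypothetical helper and the sample value is made up for illustration.

```python
import base64

def decode_secret(encoded):
    """Decode a base64-encoded Kubernetes secret value to text."""
    return base64.b64decode(encoded.strip()).decode("utf-8")

# Made-up sample value standing in for the kubectl output.
sample = base64.b64encode(b"example-password").decode("ascii")
print(decode_secret(sample + "\n"))
```

This avoids depending on a fixed line index in the Helm output, which may change between chart versions.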

<center><h1><span style='color:blue'>Udacity - DEND - L1E2 - Exercise</span></h1></center>

#### Import Apache Cassandra python package

In [13]:
import cassandra

### Create a connection to the database

In [14]:
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
try: 
    # Added connection auth for bitnami / helm / cassandra bundle
    auth_provider = PlainTextAuthProvider(username='cassandra', password=cassandra_password)
    cluster = Cluster(['127.0.0.1'], auth_provider=auth_provider) # 127.0.0.1 is reached through the kubectl port-forward (or a locally installed Apache Cassandra instance)
    session = cluster.connect()
except Exception as e:
    print(e)
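
If the connection fails, the cell above only prints the error, so `session` stays undefined and later cells fail with a `NameError`. One way to get an earlier, clearer signal is to check that the forwarded port accepts TCP connections before connecting. This is a sketch using only the standard library; `port_is_open` is a hypothetical helper name.

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For this exercise the call would be `port_is_open('127.0.0.1', 9042)`, since 9042 is the CQL port forwarded earlier.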
 



### TO-DO: Create a keyspace to do the work in 

In [15]:
## TO-DO: Create the keyspace
KEYSPACE = 'library'

try: 
    session.execute("""
    
    CREATE KEYSPACE IF NOT EXISTS %s
    WITH REPLICATION = {
    'class' :'SimpleStrategy', 'replication_factor' : '1'
    }
    
    """ % KEYSPACE)
except Exception as e:
    print(f"Failed KeySpace Creation: {e}")
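
Keyspace names cannot be passed as bound parameters, which is why the cell above uses string formatting. When the name comes from a variable, it may be worth validating it first. The sketch below builds the same statement; `create_keyspace_cql` is a hypothetical helper, and the regular expression is a deliberately conservative assumption (a letter followed by letters, digits or underscores, at most 48 characters).

```python
import re

def create_keyspace_cql(name, replication_factor=1):
    """Build a CREATE KEYSPACE statement after validating the identifier."""
    # Conservative check; rejects anything that is not a plain identifier.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]{0,47}", name):
        raise ValueError(f"Invalid keyspace name: {name!r}")
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        "WITH REPLICATION = "
        f"{{'class': 'SimpleStrategy', 'replication_factor': {int(replication_factor)}}}"
    )

print(create_keyspace_cql("library"))
```

The resulting string can be passed to `session.execute` in place of the formatted literal.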

### TO-DO: Connect to the Keyspace

In [16]:
## To-Do: Add in the keyspace you created
try: 
    session.set_keyspace(KEYSPACE)
except Exception as e:
    print(f"Error on connection to Keyspace: {e}")

### Create a Song Library that contains a list of songs, including the song name, artist name, year, album it was from, and if it was a single. 

`song_title
artist_name
year
album_name
single`

### TO-DO: You need to create a table to be able to run the following query: 
`select * from songs WHERE year=1970 AND artist_name='The Beatles'`

In [17]:
## TO-DO: Complete the query below
try: 
    session.execute("""
    
    CREATE TABLE IF NOT EXISTS songs (
        song_title text,
        artist_name text,
        year int,
        album_name text,
        single text,
        PRIMARY KEY( year, artist_name)
    )
    
    """)
except Exception as e:
    print(f"Error on table creation: {e}")
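
With `PRIMARY KEY (year, artist_name)`, `year` is the partition key and `artist_name` is a clustering column, so Cassandra keeps one row per distinct `(year, artist_name)` pair and a later `INSERT` with the same key overwrites the earlier values. The following pure-Python simulation (not Cassandra itself, and simplified to whole-row replacement) illustrates the consequence; `upsert` is a hypothetical helper.

```python
def upsert(table, row, key_columns=("year", "artist_name")):
    """Store `row` under its primary key, replacing any row with the same key."""
    key = tuple(row[column] for column in key_columns)
    table[key] = row

songs = [
    {"year": 1970, "artist_name": "The Beatles", "song_title": "Across The Universe"},
    {"year": 1970, "artist_name": "The Beatles", "song_title": "Let It Be"},
]

# Key as defined in the exercise: the second song replaces the first.
table = {}
for song in songs:
    upsert(table, song)
print(len(table), table[(1970, "The Beatles")]["song_title"])

# Adding song_title to the key keeps both rows.
wide_table = {}
for song in songs:
    upsert(wide_table, song, key_columns=("year", "artist_name", "song_title"))
print(len(wide_table))
```

The two rows in this exercise have different years, so the table as defined is sufficient here. Adding `song_title` as a further clustering column is one possible design if several songs per artist and year must be stored, and the target query would still be valid.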

### TO-DO: Insert the following two rows in your table
`First Row:  "Across The Universe", "The Beatles", "1970", "False", "Let It Be"`

`Second Row: "Think For Yourself", "The Beatles", "1965", "False", "Rubber Soul"`

In [18]:
## Add in query and then run the insert statement
try: 
    session.execute("""
    
    INSERT INTO songs 
        (song_title,artist_name,year,album_name,single) 
    VALUES 
        ('Across The Universe', 'The Beatles', 1970, 'Let It Be', 'False')
    
    """)
except Exception as e:
    print(f"Error on INSERT: {e}")
    
try: 
    session.execute("""

    INSERT INTO songs 
        (song_title,artist_name,year,album_name,single) 
    VALUES 
        ('Think For Yourself', 'The Beatles', 1965, 'Rubber Soul', 'False')
    
    """)
except Exception as e:
    print(f"Error on INSERT: {e}")
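
The two inserts differ only in their values, so they can share one prepared statement. The sketch below assumes a `session` object with the driver's `prepare` and `execute` methods; `insert_songs`, `INSERT_SONG` and `SONGS` are hypothetical names, and `single` is kept as text to match the table created above.

```python
INSERT_SONG = (
    "INSERT INTO songs (song_title, artist_name, year, album_name, single) "
    "VALUES (?, ?, ?, ?, ?)"
)

SONGS = [
    ("Across The Universe", "The Beatles", 1970, "Let It Be", "False"),
    ("Think For Yourself", "The Beatles", 1965, "Rubber Soul", "False"),
]

def insert_songs(session, rows):
    """Prepare the INSERT once, then execute it for each row tuple."""
    prepared = session.prepare(INSERT_SONG)
    for row in rows:
        session.execute(prepared, row)
    return len(rows)
```

With the session from the connection cell, the call would be `insert_songs(session, SONGS)`. Binding values this way also avoids quoting mistakes in hand-written literals.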



### TO-DO: Validate your data was inserted into the table.

In [19]:
## TO-DO: Complete and then run the select statement to validate the data was inserted into the table
try: 
    rows = session.execute("""

    SELECT year, album_name, artist_name
    FROM songs
    
    """)
except Exception as e:
    print(f"Error on SELECT: {e}")
    
for row in rows:
    print(row.year, row.album_name, row.artist_name)

1965 Rubber Soul The Beatles
1970 Let It Be The Beatles
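
With the driver's default row factory, each returned row is a named tuple, which is why `row.year` works above. If dictionaries are more convenient, for example to build a DataFrame, a conversion sketch could look like this; `rows_to_dicts` is a hypothetical helper, and a `namedtuple` stands in for the driver's rows so the example runs without a database.

```python
from collections import namedtuple

def rows_to_dicts(rows):
    """Convert named-tuple rows into plain dictionaries."""
    return [dict(row._asdict()) for row in rows]

# Stand-in rows shaped like the query result above.
Row = namedtuple("Row", ["year", "album_name", "artist_name"])
sample = [
    Row(1965, "Rubber Soul", "The Beatles"),
    Row(1970, "Let It Be", "The Beatles"),
]
print(rows_to_dicts(sample))
```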


### TO-DO: Validate the Data Model with the original query.

`select * from songs WHERE YEAR=1970 AND artist_name='The Beatles'`

In [20]:
##TO-DO: Complete the select statement to run the query 
try: 
    rows = session.execute("""

    SELECT year, album_name, artist_name
    FROM songs
    WHERE year=1970 AND artist_name='The Beatles'
    """)
except Exception as e:
    print(f"Error on SELECT: {e}")
    
for row in rows:
    print(row.year, row.album_name, row.artist_name)

1970 Let It Be The Beatles


### Finally, close the session and cluster connection

In [21]:
session.shutdown()
cluster.shutdown()
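
If an earlier cell raises, the two shutdown calls above are never reached. In a script, rather than a notebook, a context manager can guarantee the cleanup. This is a sketch under the assumption that `cluster` offers `connect()` and `shutdown()` as used in this exercise; `open_session` is a hypothetical helper name.

```python
from contextlib import contextmanager

@contextmanager
def open_session(cluster):
    """Yield a connected session and always shut down session and cluster."""
    session = cluster.connect()
    try:
        yield session
    finally:
        # Runs on normal exit and when the body raises.
        session.shutdown()
        cluster.shutdown()
```

Usage would be `with open_session(cluster) as session:` followed by the queries.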

<h2><span style='color:blue'>Remove Environment</span></h2>

In [22]:
# Clears proxy
pids_kubectl_proxy = !ps -ef|fgrep 'kubectl port-forward'|fgrep $CHART_INSTANCE_NAME|cut -d ' ' -f4
!kill -9 {pids_kubectl_proxy[0]}
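
Selecting the PID with `cut -d ' ' -f4` depends on how many spaces `ps -ef` happens to print before the PID column. Splitting on runs of whitespace is less fragile, since the PID is the second column of `ps -ef`. The sketch below is a hypothetical alternative; `port_forward_pids` is an invented name and the sample lines are made up for illustration.

```python
def port_forward_pids(ps_lines, release):
    """Return PIDs of `kubectl port-forward` processes for the given release."""
    pids = []
    for line in ps_lines:
        fields = line.split()
        is_proxy = "kubectl port-forward" in line and release in line
        # Skip the grep/fgrep helper processes that also match the pattern.
        if is_proxy and "grep" not in line and len(fields) > 1 and fields[1].isdigit():
            pids.append(int(fields[1]))
    return pids

# Made-up `ps -ef` lines for illustration.
sample = [
    "user   4242     1  0 13:45 ?  00:00:00 kubectl port-forward --namespace default svc/dend-l1e2-cassandra 9042:9042",
    "user   4250  4100  0 13:46 ?  00:00:00 fgrep kubectl port-forward",
]
print(port_forward_pids(sample, "dend-l1e2"))
```

In the notebook, `ps_lines` could be the list returned by `!ps -ef`.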

In [23]:
# Removes chart instances
!helm uninstall {CHART_INSTANCE_NAME}

release "dend-l1e2" uninstalled


In [None]:
# Removes persistent Volume
!kubectl get pvc|fgrep {CHART_INSTANCE_NAME}|cut -d ' '  -f1| xargs -t kubectl delete pvc

kubectl delete pvc data-dend-l1e2-cassandra-0
persistentvolumeclaim "data-dend-l1e2-cassandra-0" deleted
