<b> Name : Deepak Gautam <br />
NetID: dg1308 </b>

## Apache Cassandra

The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.

Cassandra's data model offers the convenience of column indexes with the performance of log-structured updates, strong support for denormalization and materialized views, and powerful built-in caching. 

http://cassandra.apache.org/

Manual and Installation: http://docs.datastax.com/en/cassandra/2.2/cassandra/cassandraAbout.html

For Ubuntu: http://docs.datastax.com/en/cassandra/2.0/cassandra/install/installDeb_t.html

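How to add a repository:

The `add-apt-repository` command is provided by the `python-software-properties` package on older Ubuntu releases (newer releases ship it in `software-properties-common`); see
http://askubuntu.com/questions/493460/how-to-install-add-apt-repository-using-the-terminal

`sudo apt-get install python-software-properties`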

Install Java

```
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
```

Install Cassandra 2.0.11

```
echo "deb http://debian.datastax.com/community stable main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
curl -L http://debian.datastax.com/debian/repo_key | sudo apt-key add -
sudo apt-get update
sudo apt-get install dsc20=2.0.11-1 cassandra=2.0.11
```

Install the Python client, cassandra-driver

`pip install cassandra-driver`

Start Cassandra (it should start automatically after installation)

`sudo service cassandra start`


### Connecting to Cassandra

Before we can start executing any queries against a Cassandra cluster, we need to set up an instance of `Cluster`. As the name suggests, you will typically have one instance of `Cluster` for each Cassandra cluster you want to interact with.

The simplest way to create a Cluster is like this:

In [1]:
from cassandra.cluster import Cluster

cluster = Cluster(protocol_version=2)

Open a session, then create a keyspace (roughly analogous to creating a database in a relational system)

In [None]:
session = cluster.connect()

In [14]:
session.execute("CREATE KEYSPACE demo WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };")
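The replication settings in the statement above can be parameterised. The following is a minimal sketch, assuming a hypothetical helper named `create_keyspace_cql` (it is not part of the driver and only formats a string). It adds CQL's `IF NOT EXISTS` clause so that re-running the cell does not fail when the keyspace is already there. The default `replication_factor` of 3 mirrors the statement above; on a single-node test install, 1 is usually sufficient.

```python
import re


def create_keyspace_cql(name, replication_factor=3):
    """Build a CREATE KEYSPACE statement that uses SimpleStrategy.

    Hypothetical helper for illustration only.
    """
    # Keyspace names cannot be passed as bound query parameters, so the
    # name is validated before it is interpolated into the statement.
    if not re.match(r"^[A-Za-z][A-Za-z0-9_]*$", name):
        raise ValueError("invalid keyspace name: %r" % (name,))
    replication_factor = int(replication_factor)
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return (
        "CREATE KEYSPACE IF NOT EXISTS {0} WITH REPLICATION = "
        "{{ 'class' : 'SimpleStrategy', 'replication_factor' : {1} }};"
    ).format(name, replication_factor)
```

With an open session, this would be used as `session.execute(create_keyspace_cql('demo', 1))`.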

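The `connect()` method takes an optional keyspace argument that sets the default keyspace for all queries made through that `Session`. The session above was created before the `demo` keyspace existed, so set the default afterwards with `set_keyspace()`: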

In [15]:
session.set_keyspace('demo')

### Executing Queries

Now that we have a `Session` we can begin to execute queries. The simplest way to execute a query is to use `execute()`:

In [17]:
session.execute("CREATE TABLE users (firstname text,lastname text,age int,email text,city text, PRIMARY KEY (lastname));")

In [20]:
session.execute("""

insert into users (lastname, age, city, email, firstname) values ('Jones', 35, 'Austin', 'bob@example.com', 'Bob')

""")

Query `users` to select values for lastname = Jones

In [21]:
result = session.execute("select * from users where lastname='Jones' ")[0]
print result.firstname, result.age

Bob 35
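Indexing the result with `[0]` assumes that at least one row matched; on an empty result it fails. A minimal sketch of a safer pattern, assuming a hypothetical helper `first_row` that works on anything iterable (which covers the value returned by `session.execute()`):

```python
def first_row(rows, default=None):
    """Return the first row of an iterable result, or `default` if empty.

    Hypothetical helper for illustration; it never indexes the result,
    so it does not raise when the query matched nothing.
    """
    return next(iter(rows), default)
```

For example, `row = first_row(session.execute("select * from users where lastname='Smith'"))` would leave `row` as `None` when there is no such user, and the caller can check for that before reading `row.firstname`.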


Update value for Jones

In [22]:
session.execute("update users set age = 36 where lastname = 'Jones'")

In [23]:
result = session.execute("select * from users where lastname='Jones' ")[0]
print result.firstname, result.age

Bob 36
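The statements above embed their values directly in the CQL string. `execute()` also accepts the values as a separate second argument, with `%s` placeholders in the query (used for every type, not only strings), and the driver then handles quoting. A minimal sketch, assuming the `users` table created earlier; the function and constant names are hypothetical:

```python
# Placeholders are always %s, regardless of the column type.
INSERT_USER = (
    "INSERT INTO users (lastname, age, city, email, firstname) "
    "VALUES (%s, %s, %s, %s, %s)"
)
UPDATE_AGE = "UPDATE users SET age = %s WHERE lastname = %s"


def insert_user(session, lastname, age, city, email, firstname):
    """Insert one user, passing the values as query parameters."""
    return session.execute(INSERT_USER, (lastname, age, city, email, firstname))


def update_age(session, lastname, age):
    """Change the age stored for the given lastname (the primary key)."""
    return session.execute(UPDATE_AGE, (age, lastname))
```

With the session from above, `insert_user(session, 'Jones', 35, 'Austin', 'bob@example.com', 'Bob')` followed by `update_age(session, 'Jones', 36)` would reproduce the insert and update shown earlier.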


Delete the row for Jones. The table is then empty, so the loop below prints nothing.

In [24]:
session.execute("delete from users where lastname = 'Jones'")
result = session.execute("select * from users")
for x in result:
    print x.age

### ORM using Cassandra


In [25]:
import uuid
from cassandra.cqlengine import columns
from cassandra.cqlengine import connection
from datetime import datetime
from cassandra.cqlengine.management import sync_table
from cassandra.cqlengine.models import Model

In [26]:
class ExampleModel(Model):
    example_id      = columns.UUID(primary_key=True, default=uuid.uuid4)
    example_type    = columns.Integer(index=True)
    created_at      = columns.DateTime()
    description     = columns.Text(required=False)
    def __repr__(self):
        return '%s %s' % (self.example_type, self.description)

In [27]:
connection.setup(['127.0.0.1'], "demo", protocol_version=2)
sync_table(ExampleModel)



In [28]:
em1 = ExampleModel.create(example_type=0, description="example1", created_at=datetime.now())
em2 = ExampleModel.create(example_type=0, description="example2", created_at=datetime.now())
em3 = ExampleModel.create(example_type=0, description="example3", created_at=datetime.now())
em4 = ExampleModel.create(example_type=0, description="example4", created_at=datetime.now())
em5 = ExampleModel.create(example_type=1, description="example5", created_at=datetime.now())
em6 = ExampleModel.create(example_type=1, description="example6", created_at=datetime.now())
em7 = ExampleModel.create(example_type=1, description="example7", created_at=datetime.now())
em8 = ExampleModel.create(example_type=1, description="example8", created_at=datetime.now())
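The eight `create()` calls above follow a regular pattern: four rows per `example_type`, with descriptions numbered from 1. A minimal sketch of the same population step as a loop, assuming a hypothetical helper `populate` that accepts any class exposing a `create(**kwargs)` method such as `ExampleModel`:

```python
from datetime import datetime


def populate(model_cls, per_type=4, types=(0, 1)):
    """Create `per_type` rows for each example_type and return them in order.

    Hypothetical helper for illustration; descriptions are numbered
    example1, example2, ... across all types, as in the cell above.
    """
    created = []
    counter = 0
    for example_type in types:
        for _ in range(per_type):
            counter += 1
            created.append(model_cls.create(
                example_type=example_type,
                description="example{0}".format(counter),
                created_at=datetime.now(),
            ))
    return created
```

Called as `rows = populate(ExampleModel)`, the fifth element `rows[4]` would correspond to `em5` above.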

#### Retrieving objects

Once you’ve populated Cassandra with data, you’ll probably want to retrieve some of it. This is accomplished with QuerySet objects. 

http://datastax.github.io/python-driver/cqlengine/queryset.html

#### Retrieving all objects
The simplest query you can make is to return all objects from a table. This is accomplished with the `.all()` method, which returns a `QuerySet` of all objects in a table. Using the `ExampleModel`, we would get all objects like this:

In [29]:
all_objects = ExampleModel.objects.all()

#### Iterating over the queryset

In [30]:
for each in ExampleModel.objects.all():
    print each

ExampleModel <example_id=a83cc896-e78e-411c-a0cb-72a0784d2aa2>
ExampleModel <example_id=32bbddc1-84cc-4a76-af86-eeef1c13655f>
ExampleModel <example_id=426bf5d7-f845-40b1-802d-73d77e8f2374>
ExampleModel <example_id=295f0a60-46ff-4040-8e4c-c6123b186335>
ExampleModel <example_id=254241b2-9af8-4e5b-b649-e78f064da65c>
ExampleModel <example_id=5fa8bd8d-0bec-492c-a5eb-7a8aa363ab71>
ExampleModel <example_id=c2fb9e8b-352b-4e2a-8ec8-b4e413ff201a>
ExampleModel <example_id=4ac6edb3-ff7e-4075-8482-1b9871d5d4f9>


In [31]:
for each in all_objects:
    print each.example_type, each.description

1 example6
1 example7
0 example2
1 example8
0 example1
1 example5
0 example4
0 example3


#### Counting the number of objects

In [None]:
ExampleModel.objects.count()

#### Filtering the objects by value

In [None]:
q = ExampleModel.objects(example_type=1)
q.count()

In [None]:
for instance in q:
    print instance.description

In [None]:
for instance in q:
    print instance

In [None]:
q2 = q.filter(example_id=em5.example_id)

In [None]:
for instance in q2:
    print instance.description

Drop a keyspace

In [None]:
from cassandra.cluster import Cluster
cluster = Cluster(protocol_version=2)
session = cluster.connect()
session.execute('DROP KEYSPACE demo;')
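The cleanup above fails if the keyspace has already been dropped, and it leaves the connection open. A minimal sketch of a more forgiving teardown, assuming a hypothetical helper `drop_keyspace`; it relies on CQL's `IF EXISTS` clause and on the driver's `Cluster.shutdown()` method:

```python
def drop_keyspace(cluster, keyspace):
    """Drop `keyspace` if it exists, then close the cluster's connections.

    Hypothetical helper for illustration. The keyspace name is
    interpolated into the statement, so pass only trusted names.
    """
    session = cluster.connect()
    try:
        session.execute("DROP KEYSPACE IF EXISTS {0};".format(keyspace))
    finally:
        # Release sockets and background threads even if the drop fails.
        cluster.shutdown()
```

With the imports from the cell above, this would be called as `drop_keyspace(Cluster(protocol_version=2), 'demo')`.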