<img src="https://d24cdstip7q8pz.cloudfront.net/t/ineuron1/content/common/images/final%20logo.png" height=50 alt="iNeuron.ai logo">

## 12 Cassandra

* The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. 
* Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. 
* Cassandra's support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.

**Major companies that use Cassandra**

Cassandra is in use at Activision, Apple, BazaarVoice, Best Buy, CERN, Constant Contact, Comcast, eBay, Fidelity, GitHub, Hulu, ING, Instagram, Intuit, Macy's™, Macquarie Bank, Microsoft, McDonald's, Netflix, New York Times, Outbrain, Pearson Education, Sky, Spotify, Uber, Walmart, and thousands of other companies that have large, active data sets. In fact, Cassandra is used by 40% of the Fortune 100.

### Top Features

**FAULT TOLERANT:** Data is automatically replicated to multiple nodes for fault-tolerance. Replication across multiple data centers is supported. Failed nodes can be replaced with no downtime.

**PERFORMANT:** Cassandra consistently outperforms popular NoSQL alternatives in benchmarks and real applications, primarily because of fundamental architectural choices.

**DECENTRALIZED:** There are no single points of failure. There are no network bottlenecks. Every node in the cluster is identical.

**SCALABLE:** Some of the largest production deployments include Apple's, with over 75,000 nodes storing over 10 PB of data, Netflix (2,500 nodes, 420 TB, over 1 trillion requests per day), Chinese search engine Easou (270 nodes, 300 TB, over 800 million requests per day), and eBay (over 100 nodes, 250 TB).

**DURABLE:** Cassandra is suitable for applications that can't afford to lose data, even when an entire data center goes down.

**YOU'RE IN CONTROL:** Choose between synchronous or asynchronous replication for each update. Highly available asynchronous operations are optimized with features like Hinted Handoff and Read Repair.

**ELASTIC:** Read and write throughput both increase linearly as new machines are added, with no downtime or interruption to applications.

**PROFESSIONALLY SUPPORTED:** Cassandra support contracts and services are available from third parties.

[More Information](https://cassandra.apache.org/doc/latest/)


### STEPS
**To run Cassandra with Python on a Windows machine, follow this step-by-step process:**

- **Java Installation**
- **Cassandra Installation**
- **Python2 Installation**
- **Apache Thrift Installation**
- **Cassandra Execution with Python**

Download links:

- Step 1, Java: https://www.oracle.com/java/technologies/oracle-java-archive-downloads.html
- Step 2, Cassandra: https://www.apache.org/dyn/closer.lua/cassandra/3.11.10/apache-cassandra-3.11.10-bin.tar.gz
- Step 3, Apache Thrift: https://thrift.apache.org/download


### 12.1 Java Installation
* To run Cassandra, Java must be installed on the system.
* Go to the official Oracle Java [link](https://www.oracle.com/java/technologies/javase-jre8-downloads.html) and download it.

* Check your system type and download the matching Java build.

<img src="imgs/systeminfo.jpg" width="600"/>

<img src="imgs/oracle_jre.jpg" width="600"/>

>**Note:** In my case, this particular Java version [jre1.8.0_261] was throwing an error, so I used jdk1.8.0_181 instead. Previous releases are available at this [Link](https://www.oracle.com/java/technologies/oracle-java-archive-downloads.html).
>
>**Note:** Only the JRE is required, from any Java 7 or Java 8 release.

* Download the file, then double-click it to run the installer and follow the steps shown in the images.

<img src="imgs/oracle_jre2.jpg" width="500"/>

<img src="imgs/oracle_jre3.jpg" width="500"/>

<img src="imgs/oracle_jre4.jpg" width="500"/>

* To set the environment variable on the local system, open `Environment Variables`.

<img src="imgs/environmentVariable.jpg" width="400"/>
<img src="imgs/environmentVariable2.jpg" width="500"/>

>**Note:** Set the variable name to `JAVA_HOME` and paste the Java installation path on the `C` drive as its value.

* To verify the installation and the PATH, run the following command in Command Prompt:
```cmd
java -version
```

* To verify the `JAVA_HOME` environment variable, run the following command in Command Prompt:
```cmd
echo %JAVA_HOME%
```


### 12.2 Cassandra Installation

* **[Link](https://cassandra.apache.org/download/)** to the download page, which lists the available versions.

<img src="imgs/cassandra1.jpg" width="600"/>

* On Windows, select the stable version; it will redirect to this **[link](https://www.apache.org/dyn/closer.lua/cassandra/3.11.7/apache-cassandra-3.11.7-bin.tar.gz)**.

<img src="imgs/cassandra2.jpg" width="600"/>

* Once it is downloaded, extract it.

<img src="imgs/cassandra3.jpg" width="600"/>

* Copy the whole extracted folder and paste it onto the `C` drive.

<img src="imgs/cassandra4.jpg" width="600"/>

* Set the `CASSANDRA_HOME` environment variable, as shown in the figure below.

<img src="imgs/cassandra5.jpg" width="500"/>

* To verify the `CASSANDRA_HOME` environment variable, run the following command in Command Prompt:
```cmd
echo %CASSANDRA_HOME%
```

<img src="imgs/cassandra6.jpg" width="500"/>
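
The two `echo` checks for `JAVA_HOME` and `CASSANDRA_HOME` can also be done in one go from Python. The following is only a minimal sketch: the helper names `check_env` and `report` are made up for this illustration, and it only checks that each variable is set and points to an existing folder, not that the installation itself works.

```python
from __future__ import print_function

import os


def check_env(names, environ=None):
    """Return {name: path or None} for each environment variable in names."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name) or None for name in names}


def report(status):
    """Build one human-readable line per variable, sorted by name."""
    lines = []
    for name in sorted(status):
        path = status[name]
        if path is None:
            lines.append("%s is not set" % name)
        elif not os.path.isdir(path):
            lines.append("%s = %s (folder not found)" % (name, path))
        else:
            lines.append("%s = %s" % (name, path))
    return lines


if __name__ == "__main__":
    for line in report(check_env(["JAVA_HOME", "CASSANDRA_HOME"])):
        print(line)
```

If a variable is reported as not set, reopen the command prompt first, since environment variables are only picked up by prompts opened after the change.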


### 12.3 Python2 Installation

* Python 2.x is required here.
* **[Link](https://www.python.org/downloads/)** to download the latest Python 2 version.

* Alternatively, create a Python 2.7 environment directly from the Anaconda Prompt:
```cmd
conda create --name python2 python=2.7
```

<img src="imgs/python2.jpg" width="500"/>

* To check the installed Python version:

```cmd
python --version
```

### 12.4 Apache Thrift Installation

* Apache Thrift is also required here; it is a framework for scalable cross-language service development.
* **[Link](https://thrift.apache.org/download/)** to download.

<img src="imgs/ApacheThrift1.jpg" width="500"/>

<img src="imgs/ApacheThrift2.jpg" width="500"/>

>**Note:** The installation is as simple as the previous ones.


### 12.5 Cassandra Execution with Python

**Step1:** Open anaconda prompt and execute 
```cmd
conda activate python2
```

**Step2:** Install the Cassandra driver for Python
```cmd
pip install cassandra-driver
```
<img src="imgs/cassandra10.jpg" width="500"/>
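
To confirm that the driver landed in the active `python2` environment, a quick import check can help. This is a small sketch, not part of the original walkthrough; the function name `driver_version` is hypothetical, while `cassandra` is the top-level package that `pip install cassandra-driver` provides.

```python
from __future__ import print_function


def driver_version():
    """Return the installed cassandra-driver version, or None if it is missing."""
    try:
        import cassandra  # top-level package installed by cassandra-driver
    except ImportError:
        return None
    return getattr(cassandra, "__version__", "unknown")


if __name__ == "__main__":
    version = driver_version()
    if version is None:
        print("cassandra-driver is not installed in this environment")
    else:
        print("cassandra-driver version:", version)
```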

**Step3:** To run the Cassandra service, open a new Command Prompt as administrator and change to the Cassandra folder path.

<img src="imgs/cassandra7.jpg" width="500"/>

```cmd
cassandra.bat -h
```

>**Note:** This starts the Cassandra service.

**Step4:** Now go back to the same Anaconda Prompt and first change to the Cassandra folder path.

<img src="imgs/cassandra8.jpg" width="500"/>

```cmd
cqlsh
```
>**Note:** `cqlsh` lets you interact with the Cassandra cluster started in `Step3`.

**Next, we want to interact with Cassandra from a Python file. To do so, we first create a `keyspace`.**

**Step4:** Create KEYSPACE 

```cmd
CREATE KEYSPACE test_keyspace WITH replication ={'class':'SimpleStrategy','replication_factor':'1'} AND durable_writes='true';
```
<img src="imgs/cassandra11_keyspace.jpg" width="600"/>

**DROP KEYSPACE [KEYSPACE name];**

**DESCRIBE KEYSPACE;**

**CREATE KEYSPACE;**

These commands can be used as required.
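
The same keyspace can also be created from Python instead of `cqlsh`. The sketch below assumes a single local node reachable at `127.0.0.1` on the driver's default port. The function names `create_keyspace_cql` and `create_keyspace` are illustrative, and `IF NOT EXISTS` is added here so the call can be repeated safely.

```python
from __future__ import print_function


def create_keyspace_cql(name, replication_factor=1):
    """Build a CREATE KEYSPACE statement using SimpleStrategy.

    Keyspace names cannot be bound as query parameters, so `name`
    must come from trusted input.
    """
    return (
        "CREATE KEYSPACE IF NOT EXISTS %s "
        "WITH replication = {'class': 'SimpleStrategy', "
        "'replication_factor': '%d'} AND durable_writes = true"
        % (name, replication_factor)
    )


def create_keyspace(name, host="127.0.0.1"):
    """Run the statement against a local node (assumed host, default port)."""
    from cassandra.cluster import Cluster  # lazy import; needs cassandra-driver

    cluster = Cluster([host])
    try:
        session = cluster.connect()
        session.execute(create_keyspace_cql(name))
    finally:
        cluster.shutdown()


if __name__ == "__main__":
    try:
        create_keyspace("test_keyspace")
        print("keyspace test_keyspace is ready")
    except Exception as exc:  # e.g. driver missing or Cassandra not running
        print("Could not create the keyspace: %s" % exc)
```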

**Step5:** Create TABLE

<img src="imgs/cassandra11_table.jpg" width="600"/>

* Before creating a table, a KEYSPACE must be selected. To do so, execute:

```cmd
USE test_keyspace;
```
<img src="imgs/cassandra12_insertINtable.jpg" width="600"/>

* To create the table:
```cmd
CREATE TABLE python_test(id uuid PRIMARY KEY, first_name text, last_name text);
```

**DROP TABLE [TABLE name];**

**DESCRIBE TABLE [TABLE name];**

**CREATE TABLE;**

These commands can be used as required.

**Step6:** Execute Python Script file

* Open an Anaconda Prompt with the `python2` environment active and change to the folder where the Python script file is located.

<img src="imgs/cassandra14.jpg" width="600"/>

```cmd
REM Run the Python script file

python cassandratest
```
<img src="imgs/cassandra17.jpg" width="600"/>
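
The contents of the script are not listed in this section, so the following is only a sketch of what such a script might look like, not the original file. It assumes the `test_keyspace` keyspace and `python_test` table from the previous steps, `text` columns for the two names, and a local node at `127.0.0.1`. The helper `new_record` and the sample values are hypothetical.

```python
from __future__ import print_function

import uuid

# The Python driver uses %s placeholders for bound values in simple statements.
INSERT_CQL = "INSERT INTO python_test (id, first_name, last_name) VALUES (%s, %s, %s)"


def new_record(first_name, last_name):
    """Return the parameter tuple for INSERT_CQL with a fresh random UUID."""
    return (uuid.uuid4(), first_name, last_name)


def main(host="127.0.0.1", keyspace="test_keyspace"):
    """Connect to the local node and insert one sample record."""
    from cassandra.cluster import Cluster  # lazy import; needs cassandra-driver

    cluster = Cluster([host])
    try:
        session = cluster.connect(keyspace)
        session.execute(INSERT_CQL, new_record("John", "Doe"))  # sample values
        print("1 record inserted into python_test")
    finally:
        cluster.shutdown()


if __name__ == "__main__":
    try:
        main()
    except Exception as exc:  # e.g. driver missing or Cassandra not running
        print("Could not run the example: %s" % exc)
```

Generating the `id` with `uuid.uuid4()` on the client side keeps each insert independent, which matches the `uuid` primary key of the table.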

* Similarly, we can add more records here.

* Showing the table directly at the `cqlsh` prompt:
```cmd
SELECT * FROM python_test;
```
<img src="imgs/cassandra13.jpg" width="600"/>

* Showing the table by running a Python script file:

<img src="imgs/cassandra15.jpg" width="600"/>
<img src="imgs/cassandra16.jpg" width="600"/>
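
Reading the table from Python could look roughly like the sketch below. It is not the script from the screenshots; it assumes the same local node, keyspace, and table as before, and relies on the driver's default behaviour of returning rows whose columns are accessible as attributes. The helper name `format_rows` is made up for this example.

```python
from __future__ import print_function


def format_rows(rows):
    """Turn rows with id/first_name/last_name attributes into printable lines."""
    return ["%s | %s | %s" % (row.id, row.first_name, row.last_name) for row in rows]


def main(host="127.0.0.1", keyspace="test_keyspace"):
    """Connect to the local node and print every row of python_test."""
    from cassandra.cluster import Cluster  # lazy import; needs cassandra-driver

    cluster = Cluster([host])
    try:
        session = cluster.connect(keyspace)
        rows = session.execute("SELECT id, first_name, last_name FROM python_test")
        for line in format_rows(rows):
            print(line)
    finally:
        cluster.shutdown()


if __name__ == "__main__":
    try:
        main()
    except Exception as exc:  # e.g. driver missing or Cassandra not running
        print("Could not run the example: %s" % exc)
```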

>**Note:** Throughout this process we had three windows open: two Anaconda Prompts with the `python2` environment and one Command Prompt used to run the Cassandra server.