# Simility Cassandra Requests Example

This notebook contains an example of how the Simility Cassandra Requests sub-package can be used to return information related to Cassandra entities in a Simility environment.

## Requirements

To run, you'll need the following:

* The Simility Requests package installed - see the readme for installation instructions.

----

## Import packages

In [2]:
from simility_requests.cassandra_requests import ReturnCassandraDatatypes, ReturnCassandraPipelineOutputMapping, ReturnPipelineOutputDatatypes
from simility_apis.set_password import set_password

import pandas as pd
import numpy as np
import json

---

## Set your password

Before using the *cassandra_requests* module, you need to provide the password you use to log in to the Simility environment:

In [3]:
set_password()

Please provide your password for logging into the Simility platform:  ·········


---

## ReturnCassandraDatatypes

This class returns the Cassandra datatypes of the fields present in both Cassandra and pipeline output.

Firstly, we need to instantiate the ReturnCassandraDatatypes class. To do this, we need to provide the *url*, *app_prefix*, *user* and *base_entity* for the environment we're interested in.

In [4]:
params = {
    "url": 'http://sim-ds.us-central1.gcp.dev.paypalinc.com',
    "app_prefix": 'james_testing',
    "user": 'james@simility.com',
    "base_entity": 'transaction'
}

In [5]:
r = ReturnCassandraDatatypes(**params)

Then we can run the *.request()* method to return the Cassandra datatypes of the fields present in both Cassandra and pipeline output for the given base entity:

In [6]:
cass_dtypes = r.request()

### Outputs

The *.request()* method returns a dataframe containing the Cassandra datatypes of the fields present in both Cassandra and pipeline output:

In [7]:
cass_dtypes.head()

| Entity | ReferenceField | CassandraFieldName | PipelineOutputFieldName | CassandraDatatype |
|---|---|---|---|---|
| account_number | account_number | avg_order_total_per_account_number_1day | account_number_avg_order_total_per_account_num... | DOUBLE |
| account_number | account_number | avg_order_total_per_account_number_30day | account_number_avg_order_total_per_account_num... | DOUBLE |
| account_number | account_number | avg_order_total_per_account_number_7day | account_number_avg_order_total_per_account_num... | DOUBLE |
| account_number | account_number | avg_order_total_per_account_number_90day | account_number_avg_order_total_per_account_num... | DOUBLE |
| account_number | account_number | eid | account_number_eid | TEXT |
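
The output above suggests that *Entity*, *ReferenceField* and *CassandraFieldName* form the index of the returned dataframe, with *PipelineOutputFieldName* and *CassandraDatatype* as columns. Assuming that is the case, the sketch below shows a few ways such a dataframe could be sliced. It uses a hand-built stand-in (`cass_dtypes_example`, a hypothetical name) with rows copied from this notebook's outputs, so it runs without a Simility environment:

```python
import pandas as pd

# Stand-in for the dataframe returned by .request(); the rows are copied
# from the outputs shown in this notebook, not fetched from Simility.
rows = [
    ("account_number", "account_number", "avg_order_total_per_account_number_1day", "DOUBLE"),
    ("account_number", "account_number", "avg_order_total_per_account_number_30day", "DOUBLE"),
    ("account_number", "account_number", "eid", "TEXT"),
]
index = pd.MultiIndex.from_tuples(
    [(entity, ref, field) for entity, ref, field, _ in rows],
    names=["Entity", "ReferenceField", "CassandraFieldName"],
)
cass_dtypes_example = pd.DataFrame(
    {
        # In the outputs shown, the pipeline output name is the entity and
        # the Cassandra field name joined with an underscore.
        "PipelineOutputFieldName": [f"{entity}_{field}" for entity, _, field, _ in rows],
        "CassandraDatatype": [dtype for *_, dtype in rows],
    },
    index=index,
)

# Select every field that belongs to one entity (first index level).
account_fields = cass_dtypes_example.xs("account_number", level="Entity")

# Keep only the fields stored as DOUBLE.
double_fields = cass_dtypes_example[cass_dtypes_example["CassandraDatatype"] == "DOUBLE"]

# Turn the index levels into ordinary columns, e.g. before exporting to CSV.
flat = cass_dtypes_example.reset_index()
```

If the real dataframe uses a different index layout, `reset_index()` followed by ordinary column filtering is the safer starting point.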


---

## ReturnPipelineOutputDatatypes

This class returns the Cassandra datatype associated with each field in pipeline output.

Firstly, we need to instantiate the ReturnPipelineOutputDatatypes class. To do this, we need to provide the *url*, *app_prefix*, *user* and *base_entity* for the environment we're interested in.

In [8]:
params = {
    "url": 'http://sim-ds.us-central1.gcp.dev.paypalinc.com',
    "app_prefix": 'james_testing',
    "user": 'james@simility.com',
    "base_entity": 'transaction'
}

In [9]:
r = ReturnPipelineOutputDatatypes(**params)

Then we can run the *.request()* method to return the Cassandra datatype associated with each field in pipeline output.

In [10]:
po_dtypes = r.request()

### Outputs

The *.request()* method returns a dictionary of the Cassandra datatype (values) associated with each field in pipeline output (keys):

In [11]:
po_dtypes

{'account_number_avg_order_total_per_account_number_1day': 'DOUBLE',
 'account_number_avg_order_total_per_account_number_30day': 'DOUBLE',
 'account_number_avg_order_total_per_account_number_7day': 'DOUBLE',
 'account_number_avg_order_total_per_account_number_90day': 'DOUBLE',
 'account_number_eid': 'TEXT',
 'account_number_num_distinct_transaction_per_account_number_1day': 'INT',
 'account_number_num_distinct_transaction_per_account_number_30day': 'INT',
 'account_number_num_distinct_transaction_per_account_number_7day': 'INT',
 'account_number_num_distinct_transaction_per_account_number_90day': 'INT',
 'account_number_num_fraud_transactions_per_account_number_1day': 'INT',
 'account_number_num_fraud_transactions_per_account_number_30day': 'INT',
 'account_number_num_fraud_transactions_per_account_number_7day': 'INT',
 'account_number_num_fraud_transactions_per_account_number_90day': 'INT',
 'account_number_num_fraud_transactions_per_account_number_lifetime': 'INT',
 'account_number_n
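
Since the result is a plain dictionary, it can be regrouped with standard Python. The sketch below groups fields by datatype and then translates each Cassandra datatype into a pandas dtype. It uses a small subset of the output above (`po_dtypes_example`, a hypothetical name); the `CASSANDRA_TO_PANDAS` mapping is an illustrative assumption, not something provided by the Simility Requests package:

```python
from collections import defaultdict

# Subset of the dictionary shown above, copied by hand.
po_dtypes_example = {
    "account_number_avg_order_total_per_account_number_1day": "DOUBLE",
    "account_number_eid": "TEXT",
    "account_number_num_distinct_transaction_per_account_number_1day": "INT",
    "account_number_num_fraud_transactions_per_account_number_lifetime": "INT",
}

# Group pipeline output fields by their Cassandra datatype.
fields_by_dtype = defaultdict(list)
for field, dtype in po_dtypes_example.items():
    fields_by_dtype[dtype].append(field)

# Hypothetical translation from Cassandra datatypes to pandas dtypes;
# extend it for any other datatypes present in your environment.
CASSANDRA_TO_PANDAS = {"DOUBLE": "float64", "INT": "Int64", "TEXT": "string"}

# Unknown datatypes fall back to the generic "object" dtype.
pandas_dtypes = {
    field: CASSANDRA_TO_PANDAS.get(dtype, "object")
    for field, dtype in po_dtypes_example.items()
}
```

A dictionary shaped like `pandas_dtypes` can be passed to `DataFrame.astype` to cast pipeline output columns, provided the dataframe contains those columns.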

---

## ReturnCassandraPipelineOutputMapping

This class returns the Cassandra field name associated with each field in pipeline output.

Firstly, we need to instantiate the ReturnCassandraPipelineOutputMapping class. To do this, we need to provide the *url*, *app_prefix*, *user* and *base_entity* for the environment we're interested in.

In [12]:
params = {
    "url": 'http://sim-ds.us-central1.gcp.dev.paypalinc.com',
    "app_prefix": 'james_testing',
    "user": 'james@simility.com',
    "base_entity": 'transaction'
}

In [13]:
r = ReturnCassandraPipelineOutputMapping(**params)

Then we can run the *.request()* method to return the Cassandra field name associated with each field in pipeline output.

In [14]:
cass_po_mapping = r.request()

### Outputs

The *.request()* method returns a dictionary of the Cassandra field name (values) associated with each pipeline output field (keys):

In [15]:
cass_po_mapping

{'account_number_avg_order_total_per_account_number_1day': 'account_number.avg_order_total_per_account_number_1day',
 'account_number_avg_order_total_per_account_number_30day': 'account_number.avg_order_total_per_account_number_30day',
 'account_number_avg_order_total_per_account_number_7day': 'account_number.avg_order_total_per_account_number_7day',
 'account_number_avg_order_total_per_account_number_90day': 'account_number.avg_order_total_per_account_number_90day',
 'account_number_eid': 'account_number.eid',
 'account_number_num_distinct_transaction_per_account_number_1day': 'account_number.num_distinct_transaction_per_account_number_1day',
 'account_number_num_distinct_transaction_per_account_number_30day': 'account_number.num_distinct_transaction_per_account_number_30day',
 'account_number_num_distinct_transaction_per_account_number_7day': 'account_number.num_distinct_transaction_per_account_number_7day',
 'account_number_num_distinct_transaction_per_account_number_90day': 'accoun
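
In the output above, each Cassandra field name has the form `entity.field`. Assuming that format holds for every entry, the sketch below splits each value into its entity and field name and builds a reverse lookup from Cassandra field name to pipeline output field. It uses a two-entry subset of the output (`cass_po_mapping_example`, a hypothetical name):

```python
# Subset of the dictionary shown above, copied by hand.
cass_po_mapping_example = {
    "account_number_avg_order_total_per_account_number_1day": "account_number.avg_order_total_per_account_number_1day",
    "account_number_eid": "account_number.eid",
}

# Split "entity.field" on the first dot into (entity, field name).
split_mapping = {}
for po_field, cass_field in cass_po_mapping_example.items():
    entity, _, field_name = cass_field.partition(".")
    split_mapping[po_field] = (entity, field_name)

# Reverse lookup: Cassandra field name -> pipeline output field name.
# This assumes each Cassandra field maps to exactly one pipeline output field.
po_field_by_cass_field = {
    cass_field: po_field for po_field, cass_field in cass_po_mapping_example.items()
}
```

`str.partition` splits only on the first dot, so a field name that itself contains a dot stays intact.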

----

## The End

That's it folks - if you have any queries or suggestions please put them in the *#sim-datatools-help* Slack channel or email James directly.