## Import All Necessary Libraries
* **OCI** - Python SDK that turns Python calls into OCI API requests
* **JSON** - JSON to Python dictionary conversion when reading and writing files
* **PPRINT** - Module for pretty-printing output while debugging
* **DATETIME** - Converts datetime strings returned by the OCI API into datetime objects for datetime arithmetic

In [6]:
import oci
import json
import pprint
import pandas as pd
import datetime
from operator import itemgetter
import asyncio
import time
import logging
import concurrent.futures as cf

## Helper Functions
* Helper functions that keep the notebook code cleaner

In [7]:
from helpers import list_region_subscriptions
from helpers import fetch_compartment_heirarchy
from helpers import search_region_and_populate
from oci_clients import clients_init

## Define Resource List to Query
* Initialize the clients that will be used
* Condition variables - we are only interested in querying active resources
* The `||` (OR) operator makes sure we fetch resources that are in the Active, Running, or Available state

## Supported Resources in Search
* [List of Supported Resources](https://docs.cloud.oracle.com/en-us/iaas/Content/Search/Concepts/queryoverview.htm#resourcetypes)

In [8]:
resourcetype_list = [
    "instance",
    "dbsystem",
    "vmcluster",
    "odainstance",
    "bootvolume",
    "bootvolumebackup",
    "volumebackup",
    "volumebackuppolicy",
    "volume",
    "datascienceproject",
    "datasciencemodel",
    "datasciencenotebooksession",
    "datacatalog",
    "analyticsinstance",
    "autonomousdatabase",
    "integrationinstance",
    "vcn",
    "subnet",
    "vnic",
    "securitylist",
    "routetable",
    "natgateway",
    "servicegateway",
    "onstopic",
    "onssubscription",
    "stream",
    "connectharness",
    "bucket",
    "vault",
    "filesystem",
    "apigateway",
    "apideployment",
]
condition_list = [
    "lifecycleState = 'RUNNING'",
    "lifecycleState = 'AVAILABLE'",
    "lifecycleState = 'ACTIVE'",
]
resourceString = (", ").join(resourcetype_list)
conditionString = (" || ").join(condition_list)
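As a rough illustration of how the two joined strings might be combined, the sketch below assembles an OCI structured search query of the form `query <types> resources where <conditions>`. How `search_region_and_populate` actually builds its query is not shown in this notebook, so `build_search_query` is a hypothetical helper and the short input lists are only for demonstration.

```python
def build_search_query(resource_types, conditions):
    """Hypothetical helper: assemble a structured search query string."""
    # Resource types are comma separated; conditions are OR-ed with "||".
    resource_string = ", ".join(resource_types)
    condition_string = " || ".join(conditions)
    return "query {} resources where {}".format(resource_string, condition_string)


example_query = build_search_query(
    ["instance", "volume"],
    ["lifecycleState = 'RUNNING'", "lifecycleState = 'AVAILABLE'"],
)
print(example_query)
```

Keeping the resource types and conditions as plain lists makes it easy to add or drop a resource type without touching the query-building logic.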

## Supported Clients & Resources
- Search returns the list of OCIDs that match the query string
- Use the OCIDs to drill down into individual resources and understand your tenancy better
 - **Identity Client** - To find out which regions (data center geographies) the tenancy is subscribed to
 - **Search Client** - To fetch all resources that satisfy the query conditions
 - **Compute Client** - To drill down into compute resources for VM and bare metal server information
 - **Database Client** - To drill down into database resources for database-specific information
 - **Analytics Client** - To drill down into analytics resources for Analytics Instance information
 - **VCN Client** - To drill down into VCNs, subnets, LPGs, DRGs, load balancers, etc.
 - **Notifications Client** - To drill down into notification topics, subscriptions, etc.
 - **API-GW Client** - To drill down into API Gateways and API Deployments
 - **Block Storage Client** - To drill down into block volumes and boot volumes
 - **Object Storage Client** - To drill down into Object Storage resources
 - **Streams Client** - To drill down into streams, Kafka Connect harnesses, etc.

## Set Up the OCI Config
* Read the OCI config from the default path, or provide the path where the config file is available

In [9]:
config = oci.config.from_file()
tenancy_id = config["tenancy"]

## List all Regions Subscribed in Tenancy
 - **Search** endpoints are regional and hence we will iterate asynchronously over all regions
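The `list_region_subscriptions` helper is not shown in this notebook. A minimal sketch of what such a helper might do is given below, assuming it wraps the SDK's `IdentityClient.list_region_subscriptions` call and reads `region_name` from each entry. `subscribed_region_names` and the stand-in client are hypothetical and exist only so the sketch runs without credentials.

```python
from types import SimpleNamespace


def subscribed_region_names(identity_client, tenancy_id):
    """Return the region names the tenancy is subscribed to."""
    response = identity_client.list_region_subscriptions(tenancy_id)
    return [subscription.region_name for subscription in response.data]


# With the real SDK this would look roughly like:
#   identity_client = oci.identity.IdentityClient(config)
#   subscribed_region_names(identity_client, config["tenancy"])

# Stand-in client that mimics the response shape (illustrative values only).
fake_client = SimpleNamespace(
    list_region_subscriptions=lambda tenancy_id: SimpleNamespace(
        data=[
            SimpleNamespace(region_name="ap-mumbai-1"),
            SimpleNamespace(region_name="eu-frankfurt-1"),
        ]
    )
)
demo_regions = subscribed_region_names(fake_client, "ocid1.tenancy.oc1..example")
print(demo_regions)
```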

In [10]:
region_names = list_region_subscriptions(config)

'Fetching all regions in tenancy'
("List of regions subscribed to : ['ap-mumbai-1', 'eu-frankfurt-1', "
 "'ap-hyderabad-1', 'us-phoenix-1']")


## Fetch Compartment Hierarchy
 - **Compartment** is an IAM resource and is global
 - **Compartment_KV** - Lookup table from compartment OCID to compartment name
 - **Compartment Parent OCID KV** - Lookup table from compartment OCID to parent compartment OCID

In [11]:
compartment_kv, compartment_parent_ocid_kv = fetch_compartment_heirarchy(config)

'Populate Compartment Herirachies in Tenancy'
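To show how the two lookup tables can work together, the sketch below walks the parent links upward to build a full compartment path. It assumes the shapes described above (OCID to name, and OCID to parent OCID); `compartment_path` and the sample OCIDs are made up for illustration and are not part of `helpers`.

```python
def compartment_path(ocid, names, parents):
    """Walk parent links upward and return a 'root/child/...' style path."""
    parts = []
    seen = set()  # guards against accidental cycles in the lookup table
    while ocid in names and ocid not in seen:
        seen.add(ocid)
        parts.append(names[ocid])
        ocid = parents.get(ocid)
    return "/".join(reversed(parts))


# Illustrative data only; real keys would be full compartment OCIDs.
demo_names = {"ocid.root": "mytenancy", "ocid.a": "dev", "ocid.b": "network"}
demo_parents = {"ocid.root": None, "ocid.a": "ocid.root", "ocid.b": "ocid.a"}

demo_path = compartment_path("ocid.b", demo_names, demo_parents)
print(demo_path)
```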


## Mark Start of Execution
Print the start time; together with the end time printed later, it shows how long populating the entire tenancy tree takes

In [12]:
pprint.pprint("Start Time : {}".format(time.strftime("%b %d %Y %H:%M:%S",time.localtime())))

'Start Time : Apr 07 2020 13:22:52'


## Search and Populate
The Search Region and Populate function has three stages:
1. Fetch the search results for the search query provided
2. Use the search results to drill down into resource-specific information
3. Consolidate both and populate a JSON tree

In [13]:
executor = cf.ThreadPoolExecutor(max_workers=20)
returnFlag = [
    search_region_and_populate(
        executor, config, region_name, resourceString, conditionString,
        compartment_kv, compartment_parent_ocid_kv,
    )
    for region_name in region_names
]

## Asynchronous Parallelism
- `concurrent.futures` is used to run all independent tasks in separate threads
- `asyncio.gather` is used to run the asynchronous, non-blocking calls to multiple API endpoints concurrently
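The combination described above can be sketched with the standard library alone. This is an assumption about the general pattern, not the actual body of `search_region_and_populate`: `blocking_lookup`, `populate_region`, and `populate_all` are hypothetical names, and `time.sleep` stands in for a blocking SDK call.

```python
import asyncio
import concurrent.futures as cf
import time


def blocking_lookup(region_name):
    """Stand-in for a blocking SDK call against one regional endpoint."""
    time.sleep(0.1)
    return "{}: done".format(region_name)


async def populate_region(executor, region_name):
    # Hand the blocking call to the thread pool and await its result.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, blocking_lookup, region_name)


async def populate_all(region_names):
    with cf.ThreadPoolExecutor(max_workers=4) as executor:
        tasks = [populate_region(executor, name) for name in region_names]
        # gather returns results in the same order as the awaitables passed in.
        return await asyncio.gather(*tasks)


demo_results = asyncio.run(populate_all(["ap-mumbai-1", "eu-frankfurt-1"]))
print(demo_results)
```

Inside a Jupyter cell an event loop is already running, so `await populate_all(...)` would be used there instead of `asyncio.run(...)`.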

In [14]:
await asyncio.gather(*(returnFlag))

'Initializing Resource Specific Clients & Regions '
'Initialize Search Client in Region : eu-frankfurt-1'
'Initialize Compute Client in Region : eu-frankfurt-1'
'Initialize DB Client in Region : eu-frankfurt-1'
'Initialize Analytics Client in Region : eu-frankfurt-1'
'Initialize Networking Client in Region : eu-frankfurt-1'
'Initialize Data Science Client in Region : eu-frankfurt-1'
'Initialize Block Storage Client in Region : eu-frankfurt-1'
'Initialize Object Storage Client in Region : eu-frankfurt-1'
'Initialize Notifications Client in Region : eu-frankfurt-1'
'Initialize API-GW Client in Region : eu-frankfurt-1'
'Initialize Streaming Client in Region : eu-frankfurt-1'
'Initialize Functions Client in Region : eu-frankfurt-1'
'Initialize Integration Client in Region : eu-frankfurt-1'
'Initialize Vaults Client in Region : eu-frankfurt-1'
'Initialize Oracle Digital Assistant Client in Region : eu-frankfurt-1'
'Initialize Data Catalog Client in Region : eu-frankfurt-1'
'Initialize File 

[None, None, None, None]

## Mark End of Execution
Print End Time for populating the entire tenancy tree

In [15]:
pprint.pprint("End Time : {}".format(time.strftime("%b %d %Y %H:%M:%S",time.localtime())))

'End Time : Apr 07 2020 13:23:46'
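The elapsed time can be worked out from the two printed timestamps. The sketch below parses them with the same format string used in the `strftime` calls; it assumes an English locale for the `%b` month abbreviation.

```python
import datetime

TIME_FORMAT = "%b %d %Y %H:%M:%S"

# Timestamps copied from the start and end outputs of this run.
start = datetime.datetime.strptime("Apr 07 2020 13:22:52", TIME_FORMAT)
end = datetime.datetime.strptime("Apr 07 2020 13:23:46", TIME_FORMAT)

elapsed_seconds = (end - start).total_seconds()
print("Elapsed: {:.0f} s".format(elapsed_seconds))  # Elapsed: 54 s
```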


## Data Load into Data Frame
- Load JSON Data and Append it to Data Frame 

In [19]:
frames = []
for region_name in region_names:
    # ap-hyderabad-1 is excluded in this run
    if region_name != "ap-hyderabad-1":
        frames.append(pd.read_json("region_distribution-" + region_name + ".json"))
# DataFrame.append was removed in pandas 2.0; pd.concat is its replacement
resource_dist_df = (
    pd.concat(frames, ignore_index=True, sort=False) if frames else pd.DataFrame()
)
pprint.pprint(resource_dist_df.head())

        region    availability_domain   resource_type shape OCPU_Qty  \
0  ap-mumbai-1  VPLM:AP-MUMBAI-1-AD-1      BootVolume   N/A      N/A   
1  ap-mumbai-1                   None    SecurityList   N/A      N/A   
2  ap-mumbai-1                   None  ServiceGateway   N/A      N/A   
3  ap-mumbai-1                   None             Vcn   N/A      N/A   
4  ap-mumbai-1                   None          Subnet   N/A      N/A   

  license_model                        display_name  \
0           N/A                   DC1 (Boot Volume)   
1           N/A     Default Security List for d3vcn   
2           N/A  Service Gateway-windows_addemo_vcn   
3           N/A                               d3vcn   
4           N/A   Private Subnet-windows_addemo_vcn   

                                       resource_ocid  \
0  ocid1.bootvolume.oc1.ap-mumbai-1.abrg6ljrwsqsb...   
1  ocid1.securitylist.oc1.ap-mumbai-1.aaaaaaaawsv...   
2  ocid1.servicegateway.oc1.ap-mumbai-1.aaaaaaaaq...   
3  ocid1.vcn

## Sample Data Manipulation to Calculate Number of Active Days
- An important goal in our tenancy was to limit the number of active days and turn off resources as required.
- This calculation shows how long each resource has been active.

In [17]:
resource_dist_df["Qty"] = 1
resource_dist_df["TimeNow"] = datetime.datetime.now(datetime.timezone.utc)
resource_dist_df["CreatedOn"] = pd.to_datetime(resource_dist_df["CreatedOn"])
resource_dist_df["ActiveForDays"] = resource_dist_df["TimeNow"].sub(
    resource_dist_df["CreatedOn"], axis=0
)
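For a single resource, the same arithmetic can be sketched with plain `datetime`. The timestamp and the fixed "now" value below are made up for illustration, and `active_days` is a hypothetical helper. In pandas, a whole-day count could likewise be read from a timedelta column with `.dt.days`.

```python
import datetime


def active_days(created_on_iso, now):
    """Return the number of whole days between creation time and now."""
    created_on = datetime.datetime.fromisoformat(created_on_iso)
    return (now - created_on).days


# A fixed, timezone-aware "now" keeps the example reproducible.
fixed_now = datetime.datetime(2020, 4, 7, 13, 0, tzinfo=datetime.timezone.utc)
demo_days = active_days("2020-03-01T09:30:00+00:00", fixed_now)
print(demo_days)
```

Both values must be timezone-aware (or both naive); subtracting an aware datetime from a naive one raises a `TypeError`.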

## Further Manipulation
* Convert the Results to String for Easy Visualization

In [18]:
resource_dist_df["TimeNow"] = resource_dist_df["TimeNow"].astype(str)
resource_dist_df["CreatedOn"] = resource_dist_df["CreatedOn"].astype(str)
resource_dist_df["ActiveForDays"] = resource_dist_df["ActiveForDays"].astype(str)

## Export Data
* Write the processed data to CSV
* Write the processed data to JSON

In [None]:
# to_csv returns None when given a path, so its result is not assigned
resource_dist_df.to_csv("processed_data.csv", sep=",", header=True, index=True)
out = resource_dist_df.to_json(orient="index")
with open("processed_json.json", "w") as f:
    f.write(out)