# CloudFormation Stack Creation

This step creates all the AWS resources required to host the cluster. We use the AWS **CloudFormation** service to provision them.
The following AWS resources are used to host the cluster:
* A VPC
* Two subnets
* A Security Group allowing traffic to and from the cluster
* An Internet Gateway
* One VPC-IGW Attachment
* A Route Table
* One Subnet Association for each Subnet
* One Route to the IGW in the RT

### Importing Required Libraries ::
* __boto3__: Required to connect to AWS and perform operations on it
* __botocore__: Required to handle exceptions raised by boto3 calls
* __paramiko__: Required to run commands inside EC2 instances
* __json__: Converts native Python dictionaries to strings so they can be written to files
* __pickle__: Stores the configuration dictionary, which is used by the next notebooks
* __datetime__, __pprint__, __sys__, __time__: General purpose use

In [1]:
import boto3, botocore, paramiko
from datetime import datetime
import pprint, sys, time, json, pickle
from botocore.exceptions import ClientError

### Reading user defined configurations & Declaring other variables::
* Users can provide specific settings through the configuration file, using the provided format. If the user does not provide a configuration, or provides one in the wrong format, default values are used. Please check the **README.MD** file for the default values.

In [2]:
try:
    with open("cluster_config.json", "r") as config_file:
        user_config = json.load(config_file)

    run_id = datetime.now().strftime('%Y%m%d%H%M%S')
    user_config['RunId'] = run_id
except Exception as e:
    print("Unexpected error while reading user configuration: " + str(e))
    exit()
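
The fallback to default values described above could be sketched as follows. This is a minimal sketch, not the notebook's actual behaviour: the helper name `load_config_with_defaults` and the `DEFAULTS` dictionary are hypothetical, and the values are only the ones that appear as `.get()` fallbacks elsewhere in this notebook (README.MD remains the authoritative list).

```python
import json

# Hypothetical defaults, copied from the .get() fallbacks used in this notebook.
DEFAULTS = {
    "Region": "us-east-1",
    "ProjectTag": "SparkCluster",
    "UserName": "root",
    "AZList": ["us-east-1a", "us-east-1b"],
    "IPWhitelist": [],
}

def load_config_with_defaults(path, defaults=DEFAULTS):
    """Return the defaults overlaid with the user's JSON config, if it can be read."""
    config = dict(defaults)
    try:
        with open(path, "r") as config_file:
            loaded = json.load(config_file)
        # Only a JSON object can override the defaults.
        if isinstance(loaded, dict):
            config.update(loaded)
    except (OSError, json.JSONDecodeError):
        # Missing or malformed file: keep the defaults unchanged.
        pass
    return config
```

With a helper like this, a missing `cluster_config.json` yields the defaults instead of stopping the notebook, while keys supplied by the user still take precedence.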

### Creating boto3 session, clients and resources::
These resources will be used to connect and run different tasks on AWS.

In [3]:
try:
    session = boto3.session.Session(region_name=user_config.get('Region', "us-east-1"))
    cf_client = session.client('cloudformation')
    cf_resource = session.resource('cloudformation')
except Exception as e:
    print("Unexpected error while creating boto3 session, client and resources: " + str(e))
    exit()

### Check for already running Stack under current user::
Only one cluster is allowed per user, so each user may have only one ACTIVE stack. In this section we check whether the user already has a stack in an ACTIVE status.

In [4]:
try:
    cf_stack_details = cf_client.list_stacks(
        StackStatusFilter=[
            'CREATE_IN_PROGRESS', 'CREATE_COMPLETE', 'ROLLBACK_IN_PROGRESS', 'DELETE_IN_PROGRESS', 'UPDATE_IN_PROGRESS', 'UPDATE_COMPLETE_CLEANUP_IN_PROGRESS', 'UPDATE_COMPLETE', 'UPDATE_ROLLBACK_IN_PROGRESS', 'UPDATE_ROLLBACK_COMPLETE_CLEANUP_IN_PROGRESS', 'UPDATE_ROLLBACK_COMPLETE', 'REVIEW_IN_PROGRESS', 'IMPORT_IN_PROGRESS', 'IMPORT_COMPLETE', 'IMPORT_ROLLBACK_IN_PROGRESS', 'IMPORT_ROLLBACK_COMPLETE'
        ]
    )
    stack_list = cf_stack_details['StackSummaries']
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()


In [5]:
try:
    stack_exists_flag = False
    for stack in stack_list:
        stack_desc = cf_client.describe_stacks(StackName=stack['StackId'])
        for fetched_stack in stack_desc['Stacks']:
            tag_count = 2
            for tag in fetched_stack['Tags']:
                if ((tag['Key'] == "Project") and (tag['Value'] == user_config.get('ProjectTag', "SparkCluster"))) or ((tag['Key'] == "User") and (tag['Value'] == user_config.get('UserName', "root"))):
                    tag_count -= 1
            if tag_count <= 0:
                print("User('" + user_config.get('UserName', "root") + "') already has one existing ACTIVE Spark Cluster stack. Only one stack is allowed at a time.")
                stack_exists_flag = True
                break
        if stack_exists_flag:
            break
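    if stack_exists_flag:
        # NOTE: exit() is commented out and the flag is reset here, so the check above
        # only prints a warning and stack creation still proceeds in the cells below.
        # Re-enable exit() (and drop the reset) to enforce the one-stack-per-user rule.
        # exit()
        stack_exists_flag = False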
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()
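
The tag comparison in the cell above counts down from 2 and treats a stack as belonging to the user once both the Project and the User tag match. The same rule can be expressed as a small predicate that is easy to test in isolation. This is a sketch only; the function name `stack_belongs_to_user` is hypothetical and the sample tags are illustrative.

```python
def stack_belongs_to_user(tags, project_tag, user_name):
    """Return True only when both the Project and the User tag match."""
    # Tag keys are unique per stack, so a dictionary lookup is sufficient.
    tag_map = {tag["Key"]: tag["Value"] for tag in tags}
    return tag_map.get("Project") == project_tag and tag_map.get("User") == user_name

# Illustrative tags in the shape returned under 'Tags' by describe_stacks.
example_tags = [
    {"Key": "Project", "Value": "SparkCluster"},
    {"Key": "User", "Value": "root"},
    {"Key": "Name", "Value": "SparkClusterVPC_20200331205712"},
]

print(stack_belongs_to_user(example_tags, "SparkCluster", "root"))   # True
print(stack_belongs_to_user(example_tags, "SparkCluster", "alice"))  # False
```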

## Stack Creation
If no Spark Cluster stack is ACTIVE for the current user, one Spark Cluster stack will be created.

**Note:** *This stack does not launch any node instances; those are launched separately. Only supporting resources such as the VPC and subnets are created here.*

In [6]:
try:
    if not stack_exists_flag:
        available_az_list = user_config.get('AZList', ['us-east-1a', 'us-east-1b'])
        if available_az_list:
            az_1 = available_az_list[0]
            if len(available_az_list) >= 2:
                az_2 = available_az_list[1]
            else:
                az_2 = available_az_list[0]
        else:
            az_1, az_2 = ('us-east-1a', 'us-east-1b')
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()

#### Adding Base properties to the template
The template starts with the following top-level sections:
* AWSTemplateFormatVersion
* Description
* Parameters(empty)
* Resources(empty)
* Outputs(empty)

In [7]:
try:
    if not stack_exists_flag:
        stack_template = {
            "AWSTemplateFormatVersion": "2010-09-09",
            "Description" : "This template is used to create Stack for SparkClusterUsingAWSEC2 utility.",
            "Parameters" : {},
            "Resources" : {},
            # Outputs must be a mapping; an empty tuple would serialize to a JSON list.
            "Outputs" : {}
        }
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()

#### Adding Required parameter definition to the template
The following parameter is required when executing the template:
* VPC CIDR Block

In [8]:
try:
    if not stack_exists_flag:
        stack_template['Parameters']['SparkClusterCIDR'] = {
            "Description": "Provide the IP4 CIDR block that will be used by the VPC.",
            "Type": "String",
            "AllowedPattern" : "[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}/[0-9]{1,3}"
        }
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()
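
The `AllowedPattern` above only checks the shape of the value, not its numeric ranges. The sketch below reproduces the pattern with Python's `re` module (using `re.fullmatch`, on the assumption that CloudFormation requires the pattern to match the whole parameter value) and adds a stricter check with the standard `ipaddress` module. Both helper names are hypothetical.

```python
import ipaddress
import re

# Same expression as the AllowedPattern in the template.
ALLOWED_PATTERN = r"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/[0-9]{1,3}"

def matches_allowed_pattern(value):
    """Mimic the template check: the whole value must match the pattern."""
    return re.fullmatch(ALLOWED_PATTERN, value) is not None

def is_valid_ipv4_cidr(value):
    """Stricter local check: octets must be 0-255 and the prefix 0-32."""
    try:
        ipaddress.IPv4Network(value, strict=False)
        return True
    except ValueError:
        return False

for candidate in ["172.172.0.0/22", "999.999.999.999/999", "172.172.0.0"]:
    print(candidate, matches_allowed_pattern(candidate), is_valid_ipv4_cidr(candidate))
```

The second candidate passes the pattern but is not a valid CIDR block, so validating the configured value locally before calling CloudFormation can give an earlier and clearer error.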

#### Adding VPC to the template
* SparkClusterVPC

In [9]:
try:
    if not stack_exists_flag:
        stack_template['Resources']['SparkClusterVPC'] = {
            "Type" : "AWS::EC2::VPC",
            "Properties" : {
                "CidrBlock" : {
                    "Ref" : "SparkClusterCIDR"
                },
                "EnableDnsHostnames" : True,
                "EnableDnsSupport" : True,
                "Tags" : [
                    {
                        "Key" : "Project",
                        "Value" : user_config.get('ProjectTag', "SparkCluster")
                    },
                    {
                        "Key" : "User",
                        "Value" : user_config.get('UserName', "root")
                    },
                    {
                        "Key" : "Name",
                        "Value" : "SparkClusterVPC_" + run_id
                    }
                ]
            }
        }
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()
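
The same three tags (Project, User, Name) are attached to most resources in this notebook. A small helper could build them in one place so that only the name prefix changes per resource. This is a sketch under that assumption; `build_tags` is a hypothetical name and is not used by the cells in this notebook.

```python
def build_tags(user_config, run_id, name_prefix):
    """Return the Project/User/Name tag list used for a CloudFormation resource."""
    return [
        {"Key": "Project", "Value": user_config.get("ProjectTag", "SparkCluster")},
        {"Key": "User", "Value": user_config.get("UserName", "root")},
        {"Key": "Name", "Value": name_prefix + "_" + run_id},
    ]

# Illustrative call with an empty configuration, so the defaults apply.
print(build_tags({}, "20200331205712", "SparkClusterVPC"))
```

A resource definition could then use `"Tags": build_tags(user_config, run_id, "SparkClusterVPC")` instead of repeating the literal list.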

#### Adding two Subnets to the template
* SparkClusterSubnet1
* SparkClusterSubnet2

In [10]:
try:
    if not stack_exists_flag:
        stack_template['Resources']['SparkClusterSubnet1'] = {
            "Type" : "AWS::EC2::Subnet",
            "Properties" : {
                "AvailabilityZone" : az_1,
                "CidrBlock" : {
                    "Fn::Select" : [
                        0,
                        {
                            "Fn::Cidr" : [
                                {
                                    "Fn::GetAtt" : [ "SparkClusterVPC", "CidrBlock" ]
                                },
                                2,
                                9
                            ]
                        }
                    ]
                },
                "MapPublicIpOnLaunch" : True,
                "Tags" : [
                    {
                       "Key" : "Project",
                       "Value" : user_config.get('ProjectTag', "SparkCluster")
                    },
                    {
                        "Key" : "User",
                        "Value" : user_config.get('UserName', "root")
                    },
                    {
                       "Key" : "Name",
                       "Value" : "SparkClusterSubnet1_" + run_id
                    }
                ],
                "VpcId" : {
                    "Ref" : "SparkClusterVPC"
                }
            }
        }

        stack_template['Resources']['SparkClusterSubnet2'] = {
            "Type" : "AWS::EC2::Subnet",
            "Properties" : {
                "AvailabilityZone" : az_2,
                "CidrBlock" : {
                    "Fn::Select" : [
                        1,
                        {
                            "Fn::Cidr" : [
                                {
                                    "Fn::GetAtt" : [ "SparkClusterVPC", "CidrBlock" ]
                                },
                                2,
                                9
                            ]
                        }
                    ]
                },
                "MapPublicIpOnLaunch" : True,
                "Tags" : [
                    {
                       "Key" : "Project",
                       "Value" : user_config.get('ProjectTag', "SparkCluster")
                    },
                    {
                        "Key" : "User",
                        "Value" : user_config.get('UserName', "root")
                    },
                    {
                       "Key" : "Name",
                       "Value" : "SparkClusterSubnet2_" + run_id
                    }
                ],
                "VpcId" : {
                    "Ref" : "SparkClusterVPC"
                }
            }
        }
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()
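
Each subnet's `CidrBlock` uses `Fn::Cidr` with a count of 2 and 9 subnet bits, which asks for two blocks with 9 host bits each, that is /23 blocks (32 - 9 = 23). Assuming the /22 VPC range that this notebook passes to the stack when it is created, the arithmetic can be checked locally with Python's `ipaddress` module. The helper `cidr_blocks` is a hypothetical IPv4-only approximation of `Fn::Cidr`, not an AWS API.

```python
import ipaddress
from itertools import islice

def cidr_blocks(ip_block, count, cidr_bits):
    """Approximate Fn::Cidr for IPv4: the first `count` blocks with `cidr_bits` host bits."""
    network = ipaddress.ip_network(ip_block)
    new_prefix = 32 - cidr_bits  # 9 host bits -> /23
    return [str(subnet) for subnet in islice(network.subnets(new_prefix=new_prefix), count)]

# Fn::Select index 0 feeds SparkClusterSubnet1, index 1 feeds SparkClusterSubnet2.
print(cidr_blocks("172.172.0.0/22", 2, 9))
# ['172.172.0.0/23', '172.172.2.0/23']
```

A /22 range holds exactly two /23 blocks, so the two subnets together fill the whole VPC range.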

#### Adding Security Group to the template
* SparkClusterSecurityGroup

In [11]:
try:
    if not stack_exists_flag:
        stack_template['Resources']['SparkClusterSecurityGroup'] = {
            "Type" : "AWS::EC2::SecurityGroup",
            "Properties" : {
                "GroupDescription" : "This security group will filter inbound and outbound traffic to the cluster.",
                "GroupName" : "SparkClusterSecurityGroup_" + run_id,
                "Tags" : [
                    {
                       "Key" : "Project",
                       "Value" : user_config.get('ProjectTag', "SparkCluster")
                    },
                    {
                        "Key" : "User",
                        "Value" : user_config.get('UserName', "root")
                    },
                    {
                       "Key" : "Name",
                       "Value" : "SparkClusterSecurityGroup_" + run_id
                    }
                ],
                "VpcId" : {
                    "Ref" : "SparkClusterVPC"
                }
            }
        }
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()

#### Adding default inbound & outbound rules to the template
* SparkClusterSGIngress1
* SparkClusterSGEgress

In [12]:
try:
    if not stack_exists_flag:
        stack_template['Resources']['SparkClusterSGIngress1'] = {
            "Type": "AWS::EC2::SecurityGroupIngress",
            "Properties": {
                "GroupId": { 
                    "Ref": "SparkClusterSecurityGroup"
                },
                "IpProtocol": "-1",
                "FromPort": "-1",
                "ToPort": "-1",
                "CidrIp": {
                    "Fn::GetAtt" : [ "SparkClusterVPC", "CidrBlock" ]
                }
            }
        }

        stack_template['Resources']['SparkClusterSGEgress'] = {
            "Type": "AWS::EC2::SecurityGroupEgress",
            "Properties": {
                "GroupId": { 
                    "Ref": "SparkClusterSecurityGroup"
                },
                "IpProtocol": "-1",
                "FromPort": "-1",
                "ToPort": "-1",
                "CidrIp": "0.0.0.0/0"
            }
        }
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()

#### Create InternetGateway and Attach it to VPC
* Create Internet Gateway to allow traffic to the Spark Nodes
* Attach it to the Spark Cluster VPC

In [13]:
try:
    if not stack_exists_flag:
        stack_template['Resources']['SparkClusterIGW'] = {
            "Type": "AWS::EC2::InternetGateway",
            "Properties" : {
                "Tags" : [
                    {
                        "Key" : "Project",
                        "Value" : user_config.get('ProjectTag', "SparkCluster")
                    },
                    {
                        "Key" : "User",
                        "Value" : user_config.get('UserName', "root")
                    },
                    {
                        "Key" : "Name",
                        # Name the gateway after itself rather than the security group.
                        "Value" : "SparkClusterIGW_" + run_id
                    }
                ]
            }
        }
        
        stack_template['Resources']['SparkClusterIGWAttachment'] = {
            "Type" : "AWS::EC2::VPCGatewayAttachment",
            "Properties" : {
                "VpcId" : {
                    "Ref" : "SparkClusterVPC" 
                },
                "InternetGatewayId" : {
                    "Ref" : "SparkClusterIGW"
                }
            }
        }
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()

#### Route table creation and configuration
* Create one route table
* Associate created subnets to the RT
* Add one route to the created IGW in the RT

In [14]:
try:
    if not stack_exists_flag:
        stack_template['Resources']['SparkClusterRT'] = {
            "Type" : "AWS::EC2::RouteTable",
            "Properties" : {
                "VpcId" : {
                    "Ref" : "SparkClusterVPC"
                },
                "Tags" : [
                    {
                        "Key" : "Project",
                        "Value" : user_config.get('ProjectTag', "SparkCluster")
                    },
                    {
                        "Key" : "User",
                        "Value" : user_config.get('UserName', "root")
                    },
                    {
                        "Key" : "Name",
                        # Name the route table after itself rather than the security group.
                        "Value" : "SparkClusterRT_" + run_id
                    }
                ]
            }
        }
        
        stack_template['Resources']['SparkClusterSubnetRTAssociation1'] = {
            "Type" : "AWS::EC2::SubnetRouteTableAssociation",
            "Properties" : {
                "SubnetId" : {
                    "Ref" : "SparkClusterSubnet1"
                },
                "RouteTableId" : {
                    "Ref" : "SparkClusterRT"
                }
            }
        }
        
        stack_template['Resources']['SparkClusterSubnetRTAssociation2'] = {
            "Type" : "AWS::EC2::SubnetRouteTableAssociation",
            "Properties" : {
                "SubnetId" : {
                    "Ref" : "SparkClusterSubnet2"
                },
                "RouteTableId" : {
                    "Ref" : "SparkClusterRT"
                }
            }
        }
        
        stack_template['Resources']['SparkClusterRoute'] = {
            "Type" : "AWS::EC2::Route",
            # The route needs the gateway to be attached to the VPC, so depend on the attachment.
            "DependsOn" : "SparkClusterIGWAttachment",
            "Properties" : {
                "RouteTableId" : {
                    "Ref" : "SparkClusterRT"
                },
                "DestinationCidrBlock" : "0.0.0.0/0",
                "GatewayId" : {
                    "Ref" : "SparkClusterIGW"
                }
            }
        }
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()

#### Adding Outputs to the template
* VPCId
* SubnetId1
* SubnetId2
* SecurityGroupId

In [15]:
try:
    if not stack_exists_flag:
        stack_template['Outputs'] = {
            "VPCId" : {
                "Description": "The VPC ID",  
                "Value" : { "Ref" : "SparkClusterVPC" }
            },
            "SubnetId1" : {
                "Description": "The Subnet ID",  
                "Value" : { "Ref" : "SparkClusterSubnet1" }
            },
            "SubnetId2" : {
                "Description": "The Subnet ID",  
                "Value" : { "Ref" : "SparkClusterSubnet2" }
            },
            "SecurityGroupId" : {
                "Description": "The Security Group ID",  
                "Value" : { "Ref" : "SparkClusterSecurityGroup" }
            }
        }
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()

#### Whitelisting provided IPs
* Add one separate inbound rule in the security group for each IP

In [16]:
try:
    if not stack_exists_flag:
        white_listed_ips = user_config.get('IPWhitelist', [])
        if white_listed_ips:
            for i in range(len(white_listed_ips)):
                stack_template['Resources']['SparkClusterSGIngress' + str(i+2)] = {
                    "Type": "AWS::EC2::SecurityGroupIngress",
                    "Properties": {
                        "GroupId": { 
                            "Ref": "SparkClusterSecurityGroup"
                        },
                        "IpProtocol": "-1",
                        "FromPort": "-1",
                        "ToPort": "-1",
                        "CidrIp": white_listed_ips[i] if "/" in white_listed_ips[i] else white_listed_ips[i] + "/32"
                    }
                }
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()
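
The cell above appends `/32` to any whitelist entry that has no prefix. The sketch below does the same normalization and additionally validates each entry with the standard `ipaddress` module, so a malformed address fails locally rather than during stack creation. The helper name `to_cidr` is hypothetical, and the second sample value is illustrative.

```python
import ipaddress

def to_cidr(entry):
    """Return the entry as an IPv4 CIDR string; a bare address becomes a single host (/32)."""
    if "/" not in entry:
        entry = entry + "/32"
    # strict=False tolerates host bits, e.g. '10.0.0.5/24' becomes '10.0.0.0/24'.
    # Invalid input raises ValueError.
    return str(ipaddress.IPv4Network(entry, strict=False))

print(to_cidr("103.77.137.192"))  # 103.77.137.192/32
print(to_cidr("10.0.0.5/24"))     # 10.0.0.0/24
```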

## Create Stack Template file
This file will be used to create the stack using boto3 library.

In [17]:
try:
    if not stack_exists_flag:
        pprint.pprint(stack_template)
        with open(user_config['WorkspaceDirectory'] + "/SparkClusterOnAWSEC2_Stack_" + user_config['UserName'] + "_" + str(run_id) + ".json", 'w') as stack_file:
            json.dump(stack_template, stack_file)
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()

{'AWSTemplateFormatVersion': '2010-09-09',
 'Description': 'This template is used to create Stack for '
                'SparkClusterUsingAWSEC2 utility.',
 'Outputs': {'SecurityGroupId': {'Description': 'The Security Group ID',
                                 'Value': {'Ref': 'SparkClusterSecurityGroup'}},
             'SubnetId1': {'Description': 'The Subnet ID',
                           'Value': {'Ref': 'SparkClusterSubnet1'}},
             'SubnetId2': {'Description': 'The Subnet ID',
                           'Value': {'Ref': 'SparkClusterSubnet2'}},
             'VPCId': {'Description': 'The VPC ID',
                       'Value': {'Ref': 'SparkClusterVPC'}}},
 'Parameters': {'SparkClusterCIDR': {'AllowedPattern': '[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}/[0-9]{1,3}',
                                     'Description': 'Provide the IP4 CIDR '
                                                    'block that will be used '
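
Because the template is assembled as a plain dictionary over many cells, a mistyped logical name in a `Ref` or `Fn::GetAtt` would only surface when CloudFormation rejects the template. A local check along the following lines could catch that earlier. This is a sketch; `find_dangling_refs` is a hypothetical helper, it skips pseudo parameters such as `AWS::Region`, and it does not replace validation by CloudFormation itself.

```python
def find_dangling_refs(template):
    """Return Ref / Fn::GetAtt targets that are not defined as Parameters or Resources."""
    defined = set(template.get("Parameters", {})) | set(template.get("Resources", {}))
    dangling = set()

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "Ref" and isinstance(value, str):
                    # Pseudo parameters (AWS::Region, ...) are always available.
                    if value not in defined and not value.startswith("AWS::"):
                        dangling.add(value)
                elif key == "Fn::GetAtt" and isinstance(value, list) and value:
                    if value[0] not in defined:
                        dangling.add(value[0])
                else:
                    walk(value)
        elif isinstance(node, (list, tuple)):
            for item in node:
                walk(item)

    walk(template)
    return sorted(dangling)

# Illustrative template with one deliberately mistyped reference.
sample = {
    "Parameters": {"SparkClusterCIDR": {"Type": "String"}},
    "Resources": {
        "SparkClusterVPC": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": {"Ref": "SparkClusterCIDR"}},
        },
        "SparkClusterRT": {
            "Type": "AWS::EC2::RouteTable",
            "Properties": {"VpcId": {"Ref": "SparkClusterVpc"}},
        },
    },
}
print(find_dangling_refs(sample))  # ['SparkClusterVpc']
```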
                                              

## Create Stack using generated Template::
The create_stack API of the boto3 library is used to launch the stack.

In [18]:
try:
    if not stack_exists_flag:
        response = cf_client.create_stack(
            StackName="SparkClusterStack-" + user_config['UserName'] + "-" + str(run_id),
            TemplateBody=json.dumps(stack_template),
            Parameters=[
                {
                    'ParameterKey': "SparkClusterCIDR",
                    'ParameterValue': user_config['CidrBlock'].split("/")[0] + "/22"
                },
            ],
            OnFailure="ROLLBACK",
            Tags=[
                {
                    "Key" : "Project",
                    "Value" : user_config.get('ProjectTag', "SparkCluster")
                },
                {
                    "Key" : "User",
                    "Value" : user_config.get('UserName', "root")
                },
                {
                    "Key" : "Name",
                    "Value" : "SparkClusterVPC_" + run_id
                }
            ]
        )
        user_config['StackId'] = response['StackId']
        print(response)
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()

{'StackId': 'arn:aws:cloudformation:us-east-1:928765701029:stack/SparkClusterStack-ccbp-dev-user-saumalya-20200331205712/193a33b0-7364-11ea-a2ca-1272d872aba7', 'ResponseMetadata': {'RequestId': '0298b947-70d5-4824-8eb1-761495fcd5ba', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '0298b947-70d5-4824-8eb1-761495fcd5ba', 'content-type': 'text/xml', 'content-length': '425', 'date': 'Tue, 31 Mar 2020 15:27:14 GMT'}, 'RetryAttempts': 0}}


#### Check creation status:
The describe_stacks API of the boto3 library is used to poll the stack status.

In [19]:
try:
    stack_created_flag = False
    if not stack_exists_flag:
        probing_limit = 30
        for _ in range(probing_limit):
            stack_desc = cf_client.describe_stacks(StackName=user_config['StackId'])['Stacks'][0]
            if stack_desc['StackStatus'] == "CREATE_COMPLETE":
                print("Stack('" + stack_desc['StackName'] + "') is created. It is ready to be used.")
                stack_created_flag = True
                stack_output_list = stack_desc['Outputs']
                break
            elif stack_desc['StackStatus'] == "ROLLBACK_COMPLETE":
                print("Stack('" + stack_desc['StackName'] + "') creation has FAILED. Initiating DELETE STACK process.")
                cf_client.delete_stack(StackName=stack_desc['StackName'])
                print("Deletion process initiated. It will take 3-4 minutes based on the network.")
                break
            else:
                print("Stack creation process is still not Completed or Failed, going to sleep for 10 seconds...")
                time.sleep(10)
        else:
            print("Maximum waiting period (5 mins) is over. Please check using AWS console.")
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()

Stack creation process is still not Completed or Failed, going to sleep for 10 seconds...
Stack creation process is still not Completed or Failed, going to sleep for 10 seconds...
Stack creation process is still not Completed or Failed, going to sleep for 10 seconds...
Stack creation process is still not Completed or Failed, going to sleep for 10 seconds...
Stack creation process is still not Completed or Failed, going to sleep for 10 seconds...
Stack creation process is still not Completed or Failed, going to sleep for 10 seconds...
Stack('SparkClusterStack-ccbp-dev-user-saumalya-20200331205712') is created. It is ready to be used.
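
The polling loop above stops only on `CREATE_COMPLETE` or `ROLLBACK_COMPLETE`. The sketch below separates the polling logic from the AWS call so that it can be tested without a network connection, and it also treats a few other CloudFormation statuses as terminal failures. The function `wait_for_stack` and its parameters are hypothetical; the attempt count and delay mirror the 30 probes of 10 seconds used above.

```python
import time

SUCCESS_STATES = {"CREATE_COMPLETE"}
# Statuses after which creation can no longer succeed.
FAILURE_STATES = {"CREATE_FAILED", "ROLLBACK_COMPLETE", "ROLLBACK_FAILED", "DELETE_COMPLETE"}

def wait_for_stack(get_status, attempts=30, delay=10, sleep=time.sleep):
    """Poll get_status() until a terminal state; return (succeeded, last_status)."""
    status = None
    for _ in range(attempts):
        status = get_status()
        if status in SUCCESS_STATES:
            return True, status
        if status in FAILURE_STATES:
            return False, status
        sleep(delay)
    # Attempts exhausted while the stack was still in progress.
    return False, status

# Offline demonstration with a scripted sequence of statuses and no real sleeping.
scripted = iter(["CREATE_IN_PROGRESS", "CREATE_IN_PROGRESS", "CREATE_COMPLETE"])
print(wait_for_stack(lambda: next(scripted), sleep=lambda seconds: None))
# (True, 'CREATE_COMPLETE')
```

In this notebook, `get_status` could be a lambda that returns `cf_client.describe_stacks(StackName=user_config['StackId'])['Stacks'][0]['StackStatus']`. boto3 also ships a built-in waiter, `cf_client.get_waiter('stack_create_complete')`, which may be a simpler option when custom progress messages are not needed.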


In [20]:
try:
    if stack_created_flag:
        stack_output_dict = {output['OutputKey']: output['OutputValue'] for output in stack_output_list}
        user_config['VPCId'] = stack_output_dict['VPCId']
        user_config['SubnetList'] = [stack_output_dict['SubnetId1'], stack_output_dict['SubnetId2']]
        user_config['SecurityGroupId'] = stack_output_dict['SecurityGroupId']
        pprint.pprint(user_config)

        with open(user_config['WorkspaceDirectory'] + "/SparkClusterOnAWSEC2_" + user_config['UserName'] + "_CurrentStatus.pkl", 'wb') as pickle_handle:
            pickle.dump(user_config, pickle_handle, protocol=pickle.HIGHEST_PROTOCOL)
except Exception as e:
    print("Unexpected error while creating Cloudformation Stack: " + str(e))
    exit()

{'AZList': ['us-east-1a', 'us-east-1b', 'us-east-1c'],
 'CidrBlock': '172.172.0.0/16',
 'IPWhitelist': ['103.77.137.192', '192.168.247.1', '52.90.143.85'],
 'InstanceType': 't2.micro',
 'KeyPairName': 'SparkCluster',
 'KeyPairPath': '/Volumes/WorkSpace/AWS/Access_Keys',
 'ProjectTag': 'SparkCluster',
 'Region': 'us-east-1',
 'RunId': '20200331205712',
 'SecurityGroupId': 'sg-08da31c8f5d7911e0',
 'SlaveCount': 3,
 'StackId': 'arn:aws:cloudformation:us-east-1:928765701029:stack/SparkClusterStack-ccbp-dev-user-saumalya-20200331205712/193a33b0-7364-11ea-a2ca-1272d872aba7',
 'SubnetList': ['subnet-0bc031e1a7754a079', 'subnet-02a980cb830e1e85b'],
 'UserName': 'ccbp-dev-user-saumalya',
 'VPCId': 'vpc-00f4768a57605928b',
 'WorkspaceDirectory': '/Volumes/WorkSpace/POC/SparkClusterEC2/WrkSpc'}
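
The pickled dictionary is meant to be read back by the next notebooks. A minimal round trip might look like the sketch below. The helper names `save_status` and `load_status`, the file name, and the sample values are illustrative assumptions; the real path is built from `WorkspaceDirectory` and `UserName` as in the cell above.

```python
import os
import pickle
import tempfile

def save_status(config, path):
    """Write the configuration dictionary to a pickle file."""
    with open(path, "wb") as handle:
        pickle.dump(config, handle, protocol=pickle.HIGHEST_PROTOCOL)

def load_status(path):
    """Read the configuration dictionary back from a pickle file."""
    # Only unpickle files you created yourself; loading a pickle can execute code.
    with open(path, "rb") as handle:
        return pickle.load(handle)

# Illustrative round trip in a temporary directory.
status = {"RunId": "20200331205712", "SubnetList": ["subnet-a", "subnet-b"]}
path = os.path.join(tempfile.mkdtemp(), "SparkClusterOnAWSEC2_example_CurrentStatus.pkl")
save_status(status, path)
restored = load_status(path)
print(restored == status)  # True
```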
