# LakeFormation Example Notebook
***Creating LakeFormation and Secured Database with Granular Security Access***

___

## Contents

1. [Introduction](#Introduction)
2. [Setup](#Setup)
    1. [Imports and Parameters](#Imports-and-Parameters)
3. LakeFormation
    1. Get account lake formation settings
    2. Set account lake formation settings
    3. Create LakeFormation Policy tags
    4. Add policy tag permissions to orbit lake-creator IAM role
    5. Add policy tag permissions mapping to database and tables

___
## Introduction

In this notebook, we demonstrate how to change the account's LakeFormation settings, create Policy Tags, and assign Policy Tag permissions to IAM roles.

This is the first step in setting up our Data Lake before we can securely start analyzing our data, typically through reporting, visualization, advanced analytics and machine learning methodologies.


---

#### Author: AWS Professional Services Emerging Technology and Intelligent Platforms Group
#### Date: June 10 2021


## Setup

#### Imports and Parameters
First, let's import the modules we need for Lake Formation and create the service clients we will use later on.

Next, let's define the role ARNs and the policy tag key used throughout the notebook, and assert that we are running in the `lake-admin` team space
(**Note:** this notebook cannot be run from any other team space):

In [None]:
# Imports

import boto3
import pprint
# Import orbit helpers
from aws_orbit_sdk.common import get_workspace


In [None]:
# Clients
lfc = boto3.client('lakeformation')
iamc = boto3.client('iam')
ssmc = boto3.client('ssm')
gluec = boto3.client('glue')

In [None]:
# Define parameters
workspace = get_workspace()

# The pod role of the lake-admin team; the sibling role ARNs are derived from it by name
orbit_data_lake_admin_role_arn = workspace['EksPodRoleArn']
orbit_data_lake_creartor_role_arn = orbit_data_lake_admin_role_arn.replace("-admin-", "-creator-")
orbit_data_lake_user_role_arn = orbit_data_lake_admin_role_arn.replace("-admin-", "-user-")
orbit_env_admin_role_arn = orbit_data_lake_admin_role_arn.replace("-lake-admin-role", "-admin")

# The account ID is the second-to-last field of the role ARN; it is used as the catalog ID
catalog_id = orbit_data_lake_admin_role_arn.split(':')[-2]

orbit_env_lf_tag_key = workspace['env_name'] + '-security-level'

team_space = workspace['team_space']
assert team_space == 'lake-admin', 'This notebook must be run from the lake-admin team space.'
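
The role ARNs above are derived by string replacement, and the catalog ID is read from the account field of the ARN. The following is a minimal sketch of that parsing, assuming a standard six-field ARN. `parse_arn` and `sibling_role_arn` are hypothetical helper names (not part of the Orbit SDK), and the ARN is a made-up example.

```python
def parse_arn(arn):
    """Split an ARN into its six colon-separated fields."""
    parts = arn.split(":", 5)  # the resource field may itself contain colons
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    keys = ("prefix", "partition", "service", "region", "account_id", "resource")
    return dict(zip(keys, parts))


def sibling_role_arn(role_arn, old, new):
    """Derive another role's ARN by name substitution, failing loudly if nothing matches."""
    if old not in role_arn:
        raise ValueError(f"{old!r} not found in {role_arn!r}")
    return role_arn.replace(old, new)


# Hypothetical ARN, for illustration only
example_arn = "arn:aws:iam::123456789012:role/orbit-dev-env-lake-admin-role"
example_account_id = parse_arn(example_arn)["account_id"]
example_creator_arn = sibling_role_arn(example_arn, "-admin-", "-creator-")
```

Raising when the substring is missing guards against a silent no-op: a plain `str.replace` would return the admin ARN unchanged if the role naming convention ever differed.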

# Get account lake formation settings

In [None]:
lf_get_account_setting_response = lfc.get_data_lake_settings(
    CatalogId=catalog_id
)

In [None]:
assert lf_get_account_setting_response['DataLakeSettings']

In [None]:
pprint.pprint(lf_get_account_setting_response)
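
The full response printed above can be verbose. As a minimal sketch, the administrator ARNs can be pulled out of it with a small helper. `admin_arns` is a hypothetical name, and the sample response below is illustrative rather than real output.

```python
def admin_arns(settings_response):
    """Return the principal identifiers of the Lake Formation administrators."""
    settings = settings_response.get("DataLakeSettings", {})
    return [
        admin["DataLakePrincipalIdentifier"]
        for admin in settings.get("DataLakeAdmins", [])
    ]


# Illustrative response shape with a made-up ARN
sample_response = {
    "DataLakeSettings": {
        "DataLakeAdmins": [
            {"DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/example-admin"}
        ]
    }
}
sample_admins = admin_arns(sample_response)
```

In the notebook, the same helper could be called as `admin_arns(lf_get_account_setting_response)`.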

# Set account lake formation settings

- Add orbit lake admin to account LakeFormation administrators
- Change account lake formation default permissions for newly created databases and tables

In [None]:
def add_lake_formation_admin(role_arn, role_check_flag=True):
    """Add role_arn to the account's Lake Formation admins. Returns 0 on success, -1 otherwise."""
    lf = boto3.client('lakeformation')
    iam = boto3.client('iam')
    initial_settings = lf.get_data_lake_settings()

    lf_admins = initial_settings['DataLakeSettings']['DataLakeAdmins']
    print(f"previous admins:{lf_admins}")
    new_lf_admins = []

    # Drop admins whose IAM role no longer exists, since put_data_lake_settings cannot handle them
    for admin in lf_admins:
        admin_role_name = admin['DataLakePrincipalIdentifier'].split('/')[-1]
        try:
            if role_check_flag:
                iam.get_role(RoleName=admin_role_name)
            new_lf_admins.append(admin)
        except Exception as e:
            print(f"invalid role name from datalake settings: {admin_role_name} ({e})")

    # Append the new admin only if it is not already listed, so reruns do not add duplicates
    if all(admin['DataLakePrincipalIdentifier'] != role_arn for admin in new_lf_admins):
        new_lf_admins.append({
            'DataLakePrincipalIdentifier': role_arn
        })

    print(f"new admins:{new_lf_admins}")

    initial_settings['DataLakeSettings']['DataLakeAdmins'] = new_lf_admins
    print(initial_settings['DataLakeSettings'])

    response = lf.put_data_lake_settings(
        DataLakeSettings=initial_settings['DataLakeSettings']
    )

    if response['ResponseMetadata']['HTTPStatusCode'] == 200:
        return 0
    print("failed putting data lake settings")
    print(response)
    return -1

In [None]:
add_response = add_lake_formation_admin(role_arn=orbit_data_lake_admin_role_arn, role_check_flag=True)
assert add_response == 0  # check this call here, since the next cell overwrites add_response

In [None]:
add_response = add_lake_formation_admin(role_arn=orbit_data_lake_creartor_role_arn, role_check_flag=False)


In [None]:
assert add_response == 0
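
The cells above cover the first bullet of this section only. For the second bullet, here is a minimal sketch, assuming the goal is for newly created databases and tables to rely on Lake Formation permissions alone. Clearing `CreateDatabaseDefaultPermissions` and `CreateTableDefaultPermissions` removes the default `IAM_ALLOWED_PRINCIPALS` grant. `without_iam_defaults` is a hypothetical helper, and the sample settings are illustrative.

```python
import copy


def without_iam_defaults(settings):
    """Return a copy of DataLakeSettings with the default permissions cleared."""
    updated = copy.deepcopy(settings)  # leave the caller's dict untouched
    updated["CreateDatabaseDefaultPermissions"] = []
    updated["CreateTableDefaultPermissions"] = []
    return updated


# Illustrative settings, shaped like the 'DataLakeSettings' field of the response
iam_default = {
    "Principal": {"DataLakePrincipalIdentifier": "IAM_ALLOWED_PRINCIPALS"},
    "Permissions": ["ALL"],
}
sample_settings = {
    "DataLakeAdmins": [
        {"DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/example-admin"}
    ],
    "CreateDatabaseDefaultPermissions": [iam_default],
    "CreateTableDefaultPermissions": [iam_default],
}
updated_settings = without_iam_defaults(sample_settings)
```

In the notebook, the result would then be applied with `lfc.put_data_lake_settings(CatalogId=catalog_id, DataLakeSettings=updated_settings)`, starting from the settings returned by `lfc.get_data_lake_settings`.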


# Create LakeFormation Policy tags

In [None]:
# Delete the tag if it already exists so that the create call in the next cell can be rerun
try:
    delete_lf_tag_response = lfc.delete_lf_tag(
        CatalogId=catalog_id,
        TagKey=orbit_env_lf_tag_key
    )
except Exception as e:
    print('Creating lake formation policy tags for the first time.')
    print(e)

In [None]:
create_lf_tag_response = lfc.create_lf_tag(
    CatalogId=catalog_id,
    TagKey=orbit_env_lf_tag_key,
    TagValues=[
        'sec-1',
        'sec-2',
        'sec-3',
        'sec-4',
        'sec-5',
    ]
)

# Add policy tag permissions to orbit lake-creator IAM role
- Add DESCRIBE and ASSOCIATE permissions, each with the grant option.


In [None]:
try:
    lake_creator_revoke_permissions_response = lfc.revoke_permissions(
        CatalogId=catalog_id,
        Principal={
            'DataLakePrincipalIdentifier': orbit_data_lake_creartor_role_arn
        },
        Resource={
            'LFTag': {
                'CatalogId': catalog_id,
                'TagKey': orbit_env_lf_tag_key,
                'TagValues': [
                    'sec-1',
                    'sec-2',
                    'sec-3',
                    'sec-4',
                    'sec-5',
                ]
            }
        },
        Permissions=['DESCRIBE', 'ASSOCIATE'],
        PermissionsWithGrantOption=['DESCRIBE', 'ASSOCIATE']
    )
except Exception as e:
    print(f'Granting Lake Formation policy tag permissions to {orbit_data_lake_creartor_role_arn} for the first time.')
    print(e)

In [None]:
lake_creator_grant_permissions_response = lfc.grant_permissions(
    CatalogId=catalog_id,
    Principal={
        'DataLakePrincipalIdentifier': orbit_data_lake_creartor_role_arn
    },
    Resource={
        'LFTag': {
            'CatalogId': catalog_id,
            'TagKey': orbit_env_lf_tag_key,
            'TagValues': [
                'sec-1',
                'sec-2',
                'sec-3',
                'sec-4',
                'sec-5',
            ]
        }
    },
    Permissions=['DESCRIBE', 'ASSOCIATE'],
    PermissionsWithGrantOption=['DESCRIBE', 'ASSOCIATE']
)
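
The revoke and grant calls above take identical arguments. Here is a minimal sketch of building those arguments once so that the two calls cannot drift apart. `security_levels` and `lf_tag_permission_kwargs` are hypothetical helpers, and the account ID, role ARN, and tag key are placeholders.

```python
def security_levels(count=5):
    """Return the tag values 'sec-1' .. 'sec-<count>'."""
    return [f"sec-{i}" for i in range(1, count + 1)]


def lf_tag_permission_kwargs(catalog_id, principal_arn, tag_key, tag_values, permissions):
    """Build keyword arguments shared by grant_permissions and revoke_permissions."""
    return {
        "CatalogId": catalog_id,
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "LFTag": {
                "CatalogId": catalog_id,
                "TagKey": tag_key,
                "TagValues": list(tag_values),
            }
        },
        "Permissions": list(permissions),
        # The same permissions are also passed on with the grant option
        "PermissionsWithGrantOption": list(permissions),
    }


# Placeholder values, for illustration only
example_tag_kwargs = lf_tag_permission_kwargs(
    catalog_id="123456789012",
    principal_arn="arn:aws:iam::123456789012:role/example-creator",
    tag_key="example-security-level",
    tag_values=security_levels(),
    permissions=["DESCRIBE", "ASSOCIATE"],
)
```

With the notebook's own variables, the two cells would reduce to `lfc.revoke_permissions(**kwargs)` inside the `try` block followed by `lfc.grant_permissions(**kwargs)`.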


# Add policy tag permissions mapping to database and tables

- Grant DESCRIBE on databases and SELECT on tables to the orbit lake-user IAM role
- The policy tag expression that selects those databases and tables is `<env_name>-security-level` = `sec-5`

In [None]:
lu_db_grant_permissions_response = lfc.grant_permissions(
    CatalogId=catalog_id,
    Principal={
        'DataLakePrincipalIdentifier': orbit_data_lake_user_role_arn
    },
    Resource={
        'LFTagPolicy': {
            'CatalogId': catalog_id,
            'ResourceType': 'DATABASE',
            'Expression': [
                {
                    'TagKey': orbit_env_lf_tag_key,
                    'TagValues': ['sec-5']
                }
            ]
        }
    },
    Permissions=['DESCRIBE']
)


In [None]:
lu_db_table_grant_permissions_response = lfc.grant_permissions(
    CatalogId=catalog_id,
    Principal={
        'DataLakePrincipalIdentifier': orbit_data_lake_user_role_arn
    },
    Resource={
        'LFTagPolicy': {
            'CatalogId': catalog_id,
            'ResourceType': 'TABLE',
            'Expression': [
                {
                    'TagKey': orbit_env_lf_tag_key,
                    'TagValues': ['sec-5']
                }
            ]
        }
    },
    Permissions=['SELECT']
)
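
The database and table grants above differ only in `ResourceType` and `Permissions`. The following minimal sketch parameterizes that shared shape. `lf_tag_policy_kwargs` is a hypothetical helper, and the account ID, role ARN, and tag key are placeholders.

```python
def lf_tag_policy_kwargs(catalog_id, principal_arn, resource_type, expression, permissions):
    """Build grant_permissions keyword arguments for an LF-tag policy resource.

    `expression` maps each tag key to the list of tag values it should match.
    """
    if resource_type not in ("DATABASE", "TABLE"):
        raise ValueError(f"unsupported resource type: {resource_type!r}")
    return {
        "CatalogId": catalog_id,
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "LFTagPolicy": {
                "CatalogId": catalog_id,
                "ResourceType": resource_type,
                "Expression": [
                    {"TagKey": key, "TagValues": list(values)}
                    for key, values in expression.items()
                ],
            }
        },
        "Permissions": list(permissions),
    }


# Placeholder values, for illustration only
example_expression = {"example-security-level": ["sec-5"]}
example_user_arn = "arn:aws:iam::123456789012:role/example-user"
example_db_kwargs = lf_tag_policy_kwargs(
    "123456789012", example_user_arn, "DATABASE", example_expression, ["DESCRIBE"]
)
example_table_kwargs = lf_tag_policy_kwargs(
    "123456789012", example_user_arn, "TABLE", example_expression, ["SELECT"]
)
```

Each result can be passed to the notebook's client as `lfc.grant_permissions(**kwargs)`.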


# End of Orbit lake admin demo notebook.