# Tag-based Security Access with Lake Formation Demonstration
***Lake User Read Permission Examples Using Tag-based Security Access***
___

## Contents
1. [Introduction](#Introduction)
2. [Set Up](#Set-Up)
 1. [Imports and Parameters](#Imports-and-Parameters)
 2. [Establish Athena Connection](#Establish-Athena-Connection)
3. [Demonstrate use of Secured LakeFormation Databases](#Demonstrate-use-of-Secured-LakeFormation-Databases)
 1. Read from a secured database table with sec-5 tag
 2. Read from secured table with sec-4
 3. Read from secured columns with sec-2 policy tagging
4. Check secured data bucket access
5. [Redshift Demo](#Redshift-Demo)
 1. [Connect to Redshift](#Connect-to-Redshift)
 2. [Create External Schema](#Create-External-Schema)
 3. [Perform Queries with Security Checks](#Perform-Queries-with-Security-Checks)

---
## Introduction
This notebook dives deep into tag-based security access in AWS Lake Formation. It illustrates the following:

* Verifying Lake Formation tag-based access control over databases, tables, and columns.

* Working with Athena, Redshift, and Glue.
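
The tag-based model can be illustrated with a small, self-contained sketch. This is a simplified model for intuition only, not Lake Formation's implementation: the tag key `security`, the helper names, the database-level tag, and the grant shown for the lake user are all assumptions. A resource counts as visible when, for every tag key in the grant, the resource's effective tag value is one of the granted values; tables inherit tags from their database and columns from their table unless overridden.

```python
from typing import Dict, Iterable, Optional

TagMap = Dict[str, str]


def effective_tags(*levels: Optional[TagMap]) -> TagMap:
    """Merge tags in database -> table -> column order; later levels override."""
    merged: TagMap = {}
    for level in levels:
        merged.update(level or {})
    return merged


def is_visible(grant: Dict[str, Iterable[str]], resource_tags: TagMap) -> bool:
    """True when every granted tag key matches one of the allowed values."""
    return all(resource_tags.get(key) in set(values) for key, values in grant.items())


# Hypothetical grant: the lake user may read resources tagged sec-5 only.
lake_user_grant = {"security": ["sec-5"]}

# Assumed tag assignments, mirroring the examples later in this notebook.
db_tags = {"security": "sec-5"}
carrier_claims = effective_tags(db_tags)                           # inherits sec-5
inpatient_claims = effective_tags(db_tags, {"security": "sec-4"})  # table override
sp_depressn = effective_tags(db_tags, None, {"security": "sec-2"})  # column override

print(is_visible(lake_user_grant, carrier_claims))
print(is_visible(lake_user_grant, inpatient_claims))
print(is_visible(lake_user_grant, sp_depressn))
```

Under these assumptions only `carrier_claims` is visible, which matches the behaviour the cells below verify against the real catalog.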

---

##### Author: AWS Professional Services Emerging Technology and Intelligent Platforms Group
##### Date: June 10 2020


In [None]:
%reload_ext sql


In [None]:
# Imports
import boto3
# import orbit helpers
from aws_orbit_sdk.database import get_athena
from aws_orbit_sdk.common import get_workspace
workspace = get_workspace()

team_space = workspace['team_space']
region = workspace['region']
assert team_space == 'lake-user'

catalog_id = workspace['EksPodRoleArn'].split(':')[-2] 
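
The `catalog_id` above is the account ID, which is the fifth colon-separated field of the role ARN. A minimal sketch of the same extraction with a basic format check; the function name and the example ARN are made up for illustration:

```python
def account_id_from_arn(arn: str) -> str:
    """Return the account ID field from an ARN such as an IAM role ARN."""
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"Not a valid ARN: {arn!r}")
    return parts[4]


# Illustrative ARN only; the real value comes from workspace['EksPodRoleArn'].
example_arn = "arn:aws:iam::123456789012:role/example-lake-user-role"
print(account_id_from_arn(example_arn))
```

For an IAM role ARN this returns the same field as `split(':')[-2]`, but it fails loudly if the input is not shaped like an ARN.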

In [None]:
lfc = boto3.client('lakeformation')
iamc = boto3.client('iam')
ssmc = boto3.client('ssm')
gluec = boto3.client('glue')
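
With a Lake Formation client available, the tags attached to a table can be inspected through `get_resource_lf_tags`. The sketch below is not part of the original demo: the helper name is hypothetical, and it assumes the calling role is permitted to read tag assignments (the lake user may not be). The client is passed in as a parameter, so `lfc` from the cell above can be used.

```python
from typing import Any, Dict, List


def _as_map(tags: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Convert a list of LF-tag entries into {tag key: [tag values]}."""
    return {tag["TagKey"]: tag["TagValues"] for tag in tags}


def table_lf_tags(client: Any, catalog_id: str, database: str, table: str) -> Dict[str, Any]:
    """Return the LF-tags assigned to a table and to its columns."""
    response = client.get_resource_lf_tags(
        CatalogId=catalog_id,
        Resource={
            "Table": {
                "CatalogId": catalog_id,
                "DatabaseName": database,
                "Name": table,
            }
        },
        ShowAssignedLFTags=True,
    )
    return {
        "table": _as_map(response.get("LFTagsOnTable", [])),
        "columns": {
            column["Name"]: _as_map(column.get("LFTags", []))
            for column in response.get("LFTagsOnColumns", [])
        },
    }


# Example call (requires AWS credentials and sufficient permissions):
# table_lf_tags(lfc, catalog_id, "cms_secured_db", "beneficiary_summary")
```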

In [None]:
secured_glue_db = "cms_secured_db"

In [None]:
%reload_ext sql
%config SqlMagic.autocommit=False # for engines that do not support autocommit
athena = get_athena()
%connect_to_athena -database default

In [None]:
%sql show databases

In [None]:
%connect_to_athena -database cms_secured_db


In [None]:
%catalog -database cms_secured_db

# Establish Athena Connection

In [None]:
%reload_ext sql
athena_url = athena.get_connection_to_athena(secured_glue_db)['db_url']
athena_url

In [None]:
%sql $athena_url

In [None]:
%%sql $athena_url

SELECT 1 as "Test"

# Demonstrate use of Secured LakeFormation Databases

## Read from a secured database table with sec-5 tag

In [None]:
%%sql $athena_url secured_carrier_claims <<

select * from cms_secured_db.carrier_claims limit 2

In [None]:
secured_carrier_claims

## Read from secured table with sec-4

In [None]:
cms_secured_db_response=%catalog -database cms_secured_db

In [None]:
cms_secured_db_tables_for_lake_user = [table_name for table_name in cms_secured_db_response.data.keys() ]

In [None]:
assert ('inpatient_claims' not in cms_secured_db_tables_for_lake_user)

In [None]:
%%sql $athena_url secured_inpatient_claims <<
select * from cms_secured_db.inpatient_claims limit 1

In [None]:

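try:
    secured_inpatient_claims
except NameError:
    print('Success')  # the denied query never assigned the result variable
else:
    raise AssertionError('inpatient_claims should not be readable by lake-user')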

## Read from secured columns with sec-2 policy tagging

In [None]:
%%sql $athena_url secured_beneficiary_summary <<
select * from cms_secured_db.beneficiary_summary limit 1

In [None]:
assert('sp_depressn' not in secured_beneficiary_summary.field_names)
assert('sp_diabetes' not in secured_beneficiary_summary.field_names)

In [None]:
secured_beneficiary_summary

## Check secured data bucket access

In [None]:
!aws s3 ls s3://orbit-dev-env-{region}-secured-demo-lake-044923722733-smqduj/

## Redshift Demo

Next, we connect to a Redshift cluster and query the same databases. Redshift integrates with Lake Formation, so the tag-based read permissions on the secured dataset are the same as in the Athena demo above.


#### Connect to Redshift
First, let's connect to Redshift using our db_url and check that the connection was successful:

In [None]:
%reload_ext sql
from aws_orbit_sdk.database import get_redshift
rs = get_redshift()

In [None]:
%connect_to_redshift -cluster db-lf -reuse -start -func Standard Nodes=3

In [None]:
%%sql

SELECT 1 as "Test"

In [None]:
%%ddl
drop schema if exists cms_secured_db


In [None]:
%%ddl
drop schema if exists cms_raw_db

In [None]:
%create_external_schema -s cms_raw_db -g cms_raw_db
%create_external_schema -s cms_secured_db -g cms_secured_db


In [None]:
redshift_conn = rs.connect_to_redshift('db-lf')
conn_url = redshift_conn['db_url']

In [None]:
%sql $conn_url

#### Create External Schema
Now, let's create external schemas in our Redshift cluster based on the schema and metadata in our existing databases:


In [None]:
%create_external_schema -s cms_raw_db -g cms_raw_db
%create_external_schema -s cms_secured_db -g cms_secured_db
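
For reference, Redshift defines an external schema backed by the Glue Data Catalog with a `CREATE EXTERNAL SCHEMA ... FROM DATA CATALOG` statement. What the `%create_external_schema` magic runs internally is not shown in this notebook, so the sketch below only builds an equivalent statement by hand. The helper name, the name validation rule, and the IAM role ARN are placeholders, not values from this environment.

```python
import re

# Conservative check used only in this sketch to keep names safe to interpolate.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def external_schema_ddl(schema: str, glue_database: str, iam_role_arn: str) -> str:
    """Build a CREATE EXTERNAL SCHEMA statement for a Glue Data Catalog database."""
    for name in (schema, glue_database):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"Unexpected characters in name: {name!r}")
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{glue_database}'\n"
        f"IAM_ROLE '{iam_role_arn}'"
    )


# Placeholder role ARN; substitute the role attached to your cluster.
placeholder_role = "arn:aws:iam::123456789012:role/example-redshift-role"
print(external_schema_ddl("cms_secured_db", "cms_secured_db", placeholder_role))
```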

#### Perform Queries with Security Checks
Finally, we query the same tables and check our read permissions for each. We read once from the unsecured database and once from the secured database, whose table is tagged at level 5 and contains two more tightly secured columns.

As you can see, the security permissions match those observed with Athena:

In [None]:
%%sql $conn_url unsecured <<

select *
from cms_raw_db.beneficiary_summary
limit 5


In [None]:
unsecured

In [None]:
assert('sp_depressn' in unsecured.field_names)
assert('sp_diabetes' in unsecured.field_names)

In [None]:
%%sql $conn_url secured <<


select *
from cms_secured_db.beneficiary_summary
limit 5

In [None]:
secured

In [None]:
assert('sp_depressn' not in secured.field_names)
assert('sp_diabetes' not in secured.field_names)
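
The same check can be expressed as a set difference between the column lists of the two result sets. A small sketch follows; the helper name is hypothetical, and the hard-coded lists stand in for `unsecured.field_names` and `secured.field_names` (the `bene_id` column is illustrative only):

```python
from typing import Iterable, List


def hidden_columns(unsecured_fields: Iterable[str], secured_fields: Iterable[str]) -> List[str]:
    """Columns present in the unsecured result but missing from the secured one."""
    visible = set(secured_fields)
    return [name for name in unsecured_fields if name not in visible]


# Stand-ins for unsecured.field_names and secured.field_names.
unsecured_fields = ["bene_id", "sp_depressn", "sp_diabetes"]
secured_fields = ["bene_id"]

print(hidden_columns(unsecured_fields, secured_fields))
```

In the notebook itself, `hidden_columns(unsecured.field_names, secured.field_names)` would list every column that the tag policy withholds from the lake user, rather than checking two names individually.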

In [None]:
# Delete the Redshift cluster
%delete_redshift_cluster -cluster db-lf

# End of Lake User demo notebook