# Policy Filter Conversion from C7N DSL to CEL

Wait, we can do that?

Yes. For many (but not **all**) policies, we can convert the C7N DSL for a filter to CEL.

How does it work?

Show me examples.

How do I test it?

## The `xlate.c7n_to_cel` Tool

In [64]:
from xlate.c7n_to_cel import C7N_Rewriter

In [65]:
policy_text = """
name: elb-delete-any-new
resource: elb
comment: 'Any ELB created in any us-west-2 VPC will be immediately deleted, with some
 exceptions.

  '
filters:
- key: CreatedTime
  op: greater-than
  type: value
  value: 0.011
  value_type: age
- key: VPCId
  op: in
  type: value
  value:
  - vpc-12345678
  - vpc-23456789
- key: tag:ASSET
  op: not-in
  type: value
  value:
  - ASSET_1
  - ASSET_2
  - ASSET_3
actions:
- key: custodian_decom
  type: tag
  value: Load Balancer created in VPC that is being decommissioned - ELB deleted
- type: delete
- action_descs:
  - The ELB has been deleted.
  - ' '
  - The West VPC is being decommissioned.  Create any
    new resources in the (Dev) East VPC.
  cc:
  - custodian@enterprise.com
  from: custodian@enterprise.com
  policy_url: NA
  subject: '[custodian][{{ account }}] ELB in unallowed account/VPC - {{ region }}'
  template: fs-default.html
  to:
  - resource-owner
  transport:
    topic: arn:aws:sns:{region}:123456789012:c7n-notifications
    type: sns
  type: notify
  violation_desc: 'The following Load Balancer(s) were created in a VPC that is not
    allowed or is being decommissioned:'
"""

In [66]:
print(C7N_Rewriter.c7n_rewrite(policy_text))

Now - duration("15m50s") > timestamp(Resource["CreatedTime"]) && ['vpc-12345678', 'vpc-23456789'].contains(Resource["VPCId"]) && ! ['ASSET_1', 'ASSET_2', 'ASSET_3'].contains(Resource["Tags"].filter(x, x["Key"] == "ASSET")[0]["Value"])


That's awkward; let's reformat it.

We can then use it to replace the original filters in the policy:

    - filters:
      -  type: cel
         expr: |
            Now - duration("15m50s") > timestamp(Resource["CreatedTime"]) 
            && ['vpc-12345678', 'vpc-23456789'].contains(Resource["VPCId"]) 
            && ! ['ASSET_1', 'ASSET_2', 'ASSET_3']
            .contains(Resource["Tags"].filter(x, x["Key"] == "ASSET")[0]["Value"])

This raises an interesting question.

"15m50s"? Should that be "15m"? The C7N age value `0.011` is measured in days, which works out to about 950 seconds. It is a much coarser way to state the intent than a CEL duration.

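The arithmetic behind that duration can be sketched in a few lines. This is only an illustration: the helper name `days_to_cel_duration` is made up here, and truncating the fractional second is an assumption about how the value gets shortened, not a statement about what `xlate.c7n_to_cel` does internally.

```python
from datetime import timedelta


def days_to_cel_duration(days: float) -> str:
    """Format a C7N age (in days) as an h/m/s duration string (hypothetical helper)."""
    # Assumption: fractional seconds are truncated, so 950.4 s becomes 950 s.
    total_seconds = int(timedelta(days=days).total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if seconds or not parts:
        parts.append(f"{seconds}s")
    return "".join(parts)


# 0.011 days * 86400 s/day = 950.4 s, which is 15 minutes and 50 seconds.
print(days_to_cel_duration(0.011))
```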
The `Resource["Tags"].filter(x, x["Key"] == "ASSET")[0]["Value"]` expression is not especially readable, but it is precise.

It filters the values in `Resource["Tags"]` to create a sub-list of the tags whose key is "ASSET". Ideally, there's exactly one. Item 0 of this list should have a `["Value"]`, which is what we want to examine.

This is a common enough construct that we will have an extension function for it: `Resource.key("ASSET")`. We'll work without the `c7nlib` extensions to start.
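
For readers more at home in Python than CEL, the same lookup can be written as a list comprehension. This is a rough equivalent for illustration only; the function name `tag_value` and the extra `Owner` tag are invented for this sketch and are not part of `c7nlib`.

```python
from typing import Any, Dict


def tag_value(resource: Dict[str, Any], key: str) -> str:
    """Plain-Python analogue of Resource["Tags"].filter(x, x["Key"] == key)[0]["Value"]."""
    # Step 1: keep only the tags whose "Key" matches.
    matches = [tag for tag in resource["Tags"] if tag["Key"] == key]
    # Step 2: take item 0 and read its "Value".
    # With no matching tag this raises IndexError.
    return matches[0]["Value"]


# Illustrative data; the "Owner" tag is made up for this example.
sample = {
    "Tags": [
        {"Key": "Owner", "Value": "team-a"},
        {"Key": "ASSET", "Value": "SOMEAPP"},
    ]
}
print(tag_value(sample, "ASSET"))
```

A resource with no matching tag has no item 0, so the lookup fails instead of returning a value. That is worth keeping in mind when a filter runs against resources that may be untagged.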

In [67]:
cel = """
        Now - duration("15m50s") > timestamp(Resource["CreatedTime"]) 
        && ['vpc-12345678', 'vpc-23456789'].contains(Resource["VPCId"]) 
        && ! ['ASSET_1', 'ASSET_2', 'ASSET_3']
        .contains(Resource["Tags"].filter(x, x["Key"] == "ASSET")[0]["Value"])
"""

Let's define (or query) some ELB resources and evaluate this CEL expression.

First, let's create a mock `CELFilter` to test with.

In [68]:
import celpy
from typing import Dict, Any

class CELFilter:
    decls = {
        "Resource": celpy.celtypes.MapType,
        "Now": celpy.celtypes.TimestampType,
    }

    def __init__(self, expr: str) -> None:
        env = celpy.Environment(annotations=CELFilter.decls)
        ast = env.compile(expr)
        self.functions = {}  # c7nlib.FUNCTIONS may need to be mocked to help develop or debug.
        self.prgm = env.program(ast, self.functions)
        
    def process(self, resource: celpy.celtypes.Value, now: str) -> bool:
        activation = {
            "Resource": resource,
            "Now": celpy.celtypes.TimestampType(now),
        }
        return self.prgm.evaluate(activation)

In [69]:
filter_1 = CELFilter(cel)

In [70]:
resource = {
    "ResourceType": "elb",
    "Tags": [
        {"Key": "ASSET", "Value": "SOMEAPP"},
    ],
    "CreatedTime": "2020-10-17T18:15:00Z",
    "VPCId": "vpc-23456789",
}


At 18:19, is this ready for action? (Hint: no, it's not old enough.)

In [71]:
now_1 = "2020-10-17T18:19:20Z"
filter_1.process(celpy.json_to_cel(resource), now_1)

BoolType(False)

How about at 19:20? Now it's been over an hour.

In [72]:
now_2 = "2020-10-17T19:20:21Z"
filter_1.process(celpy.json_to_cel(resource), now_2)

BoolType(True)
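
One way to gain confidence in these two results is to restate the filter in plain Python and compare answers. The sketch below does not use `celpy` at all; the names `parse_utc` and `expected_match` are invented here, and it assumes the three clauses mean what the C7N policy intended (old enough, in a listed VPC, ASSET tag not exempt).

```python
from datetime import datetime, timedelta
from typing import Any, Dict


def parse_utc(text: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing Z as an aware UTC datetime."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def expected_match(resource: Dict[str, Any], now: str) -> bool:
    """Hand-written restatement of the three CEL clauses (illustrative only)."""
    cutoff = parse_utc(now) - timedelta(minutes=15, seconds=50)
    old_enough = cutoff > parse_utc(resource["CreatedTime"])
    in_vpc = resource["VPCId"] in ["vpc-12345678", "vpc-23456789"]
    asset = [t for t in resource["Tags"] if t["Key"] == "ASSET"][0]["Value"]
    exempt = asset in ["ASSET_1", "ASSET_2", "ASSET_3"]
    return old_enough and in_vpc and not exempt


elb = {
    "ResourceType": "elb",
    "Tags": [{"Key": "ASSET", "Value": "SOMEAPP"}],
    "CreatedTime": "2020-10-17T18:15:00Z",
    "VPCId": "vpc-23456789",
}
print(expected_match(elb, "2020-10-17T18:19:20Z"))  # only 4m20s old
print(expected_match(elb, "2020-10-17T19:20:21Z"))  # over an hour old
```

If the plain-Python answer and the CEL answer ever disagree for a resource, that is a good place to start looking at either the translation or the assumptions in this restatement.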

## More Testing Goodness: the `demo/celdemo.py` tool

How can we more fully automate this testing?

There are two paths.

-  A little shell-level thing to do CEL evaluation in the context of AWS CLI describes.

-  A behave-based framework to do more formal acceptance-type tests.

See the `celdemo.py` script.

(Note: `celpy` is NOT installed here, so it needs to be made visible to Python. Setting `PYTHONPATH` lets us
use the package without installing it.)

In [73]:
%env PYTHONPATH=src

In [74]:
!python celdemo.py --cel '355./113.'

DoubleType(3.1415929203539825) from Now '2020-11-13T19:41:47.215115', Resource None


In [75]:
!python celdemo.py --cel 'Now+duration("1h")' --now "2020-09-10T11:12:13Z"

TimestampType('2020-09-10T12:12:13Z') from Now '2020-09-10T11:12:13Z', Resource None


This tool creates a CEL activation with two globals.

-  `Now` is set from the command-line `--now` value. Pick a time of day to test against.

-  `Resource` is each resource read from a file or stdin.

You can provide a bunch of resources to apply against a CEL expression and see what the results will be.
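
The shape of that loop can be sketched as follows. This is not the code in `celdemo.py`; it is a hypothetical outline, using plain Python values rather than CEL types, of how one activation per resource might be built from a list of file names where `-` stands for stdin.

```python
import io
import json
import sys
from typing import Any, Dict, Iterable, Iterator, TextIO


def activations(
    now: str, names: Iterable[str], stdin: TextIO = sys.stdin
) -> Iterator[Dict[str, Any]]:
    """Yield one {"Now": ..., "Resource": ...} mapping per JSON document (hypothetical)."""
    for name in names:
        if name == "-":
            # A lone "-" means: read the JSON document from standard input.
            resource = json.load(stdin)
        else:
            with open(name) as source:
                resource = json.load(source)
        yield {"Now": now, "Resource": resource}


# Simulate a resource piped in on stdin.
fake_stdin = io.StringIO('{"Tags": [{"Key": "ASSET", "Value": "Forbidden"}]}')
for activation in activations("2020-09-10T11:12:13Z", ["-"], stdin=fake_stdin):
    print(activation["Now"], activation["Resource"])
```

Every resource sees the same `Now`, so a batch of resources is judged against a single, repeatable point in time.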

You can, for example, use the AWS CLI to describe resources and examine them with a CEL expression to locate compliant and non-compliant resources.

We'll create a fake `resource1.json` file that simulates an AWS CLI describe.

In [76]:
sample_doc_1 = {
    "Tags": 
    [
        {
            "Key": "ASSET",
            "Value": "Forbidden"
        }
    ]
}
from pathlib import Path
import json
with Path("resource1.json").open("w") as rsrc1:
    json.dump(sample_doc_1, rsrc1)

Now we can process `resource1.json`. 

In [77]:
!python celdemo.py --cel 'Resource.Tags.filter(t, t["Key"]=="ASSET")[0]["Value"]' --now "2020-09-10T11:12:13Z" resource1.json

StringType('Forbidden') from Now '2020-09-10T11:12:13Z', Resource {'Tags': [{'Key': 'ASSET', 'Value': 'Forbidden'}]}


In [78]:
!cat resource1.json | python celdemo.py --cel 'Resource.Tags.filter(t, t["Key"]=="ASSET")[0]["Value"]' --now "2020-09-10T11:12:13Z" --format json -

StringType('Forbidden') from Now '2020-09-10T11:12:13Z', Resource {'Tags': [{'Key': 'ASSET', 'Value': 'Forbidden'}]}


We can try to process multiple JSON files, which gives us one result per file. A resource that has no `Tags` produces an evaluation error instead of a value.

In [79]:
!python celdemo.py --cel 'Resource.Tags.filter(t, t["Key"]=="ASSET")[0]["Value"]' --now "2020-09-10T11:12:13Z" *.json

TypeError("'CELEvalError' object is not iterable") from Now '2020-09-10T11:12:13Z', Resource {'creationTimestamp': '2018-07-06T05:04:03Z', 'deleteProtection': False, 'name': 'projects/project-123/zones/us-east1-b/instances/dev/ec2', 'instanceSize': 'm1.standard'}
StringType('Forbidden') from Now '2020-09-10T11:12:13Z', Resource {'Tags': [{'Key': 'ASSET', 'Value': 'Forbidden'}]}
