# Objective

This notebook shows how to connect to AWS and runs a Hello World against some of the services used:

1) S3
2) Lambda (try the hello world from the midterm)
3) CloudWatch

To connect, an access key was created in IAM and stored in a local `.env` file. The credentials are temporary and will expire in 30 days. Take care when creating these keys, and use different keys for dev and prod.


* Outside a POC environment, consider different security practices such as SSO.
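Because the session depends on three values from the `.env` file, it can help to check that they are present before building the session. The sketch below is one possible check, not part of the original notebook; the helper name `missing_env_keys` is hypothetical, while the key names are the ones the notebook reads from `config`.

```python
# Key names match the ones the notebook reads from the .env file.
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key", "region")


def missing_env_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in a dotenv-style mapping."""
    return [key for key in required if not values.get(key)]


# Illustrative values only; in the notebook the mapping would come from
# dotenv_values("../.env").
example = {
    "aws_access_key_id": "AKIAEXAMPLE",
    "aws_secret_access_key": "",
    "region": "us-east-1",
}
print(missing_env_keys(example))
```

An empty list means all three values are set; anything else names the keys to fix before connecting.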


In [1]:
# include in requirements
# !pip install boto3
# !pip install pandas
# !pip install pydot
# !pip install python-dotenv

In [2]:
import sys

import boto3
from botocore.config import Config
from dotenv import dotenv_values
import pandas as pd


config = dotenv_values("../.env") 

print(sys.version)

3.11.8 (v3.11.8:db85d51d3e, Feb  6 2024, 18:02:37) [Clang 13.0.0 (clang-1300.0.29.30)]


In [3]:
# Initialize a session using the AWS credentials from the .env file
session = boto3.Session(aws_access_key_id=config["aws_access_key_id"],
                        aws_secret_access_key=config["aws_secret_access_key"],
                        region_name=config["region"])

s3 = session.client('s3')
lambda_client = session.client("lambda")
cloudwatch = session.client('logs')  # CloudWatch Logs client

In [4]:
d = s3.list_buckets()

# show current buckets
b = [n["Name"] for n in d["Buckets"]]


# validate that the buckets we need are visible with these credentials
assert 'fmi-lambda-demo' in b, "missing the lambda demo" # midterm
assert 'team4-cosmicai' in b, "missing our team4 cosmicai S3 connection" 

b

['aws-athena-query-results-211125778552-us-east-1',
 'cosmicai-data',
 'cosmicai2',
 'fmi-lambda-demo',
 'group2-s3-bucket',
 'group4-s3-bucket',
 'sagemaker-studio-211125778552-3zpozdpwzcx',
 'sagemaker-studio-211125778552-rrp76qgcj1n',
 'sagemaker-us-east-1-211125778552',
 'team-one-cosmic-data',
 'team-one-s3-cosmic',
 'team2cosmicai',
 'team3cosmicai',
 'team4-cosmicai']
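The two `assert` statements stop at the first missing bucket. As an alternative sketch, the helper below reports every missing bucket at once. The function name `missing_buckets` is hypothetical; the response shape (`{"Buckets": [{"Name": ...}]}`) is the one `list_buckets` returns in the cell above.

```python
def missing_buckets(list_buckets_response, required):
    """Return the required bucket names that are not in a list_buckets response."""
    present = {bucket["Name"] for bucket in list_buckets_response.get("Buckets", [])}
    return sorted(set(required) - present)


# Hand-written stand-in for the response of s3.list_buckets()
fake_response = {"Buckets": [{"Name": "fmi-lambda-demo"}, {"Name": "cosmicai-data"}]}
print(missing_buckets(fake_response, ["fmi-lambda-demo", "team4-cosmicai"]))
```

In the notebook this could be called as `missing_buckets(d, ["fmi-lambda-demo", "team4-cosmicai"])`, with an empty list meaning the connection sees everything it needs.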

We need the log group first; then we can get its log streams.
***

### Log Groups


In [5]:
# log groups

l = []
# a single call returns one page of results (up to 50 groups by default)
r = cloudwatch.describe_log_groups()

for group in r['logGroups']:
    l.append(group['logGroupName'])

df_log_groups = pd.DataFrame(l, columns=["log_group_names"])

# general view if needed: everything except the sagemaker groups
# df_log_groups[~df_log_groups.log_group_names.str.contains("sagemaker")].head(25)

df_log_groups[df_log_groups.log_group_names.str.contains("cosmic")]


    log_group_names
16  /aws/lambda/cosmic-executor
17  /aws/lambda/cosmic-init
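A single `describe_log_groups` call returns only one page of results, so an account with many log groups may need pagination. The sketch below uses the boto3 paginator for that operation. The function name `list_log_group_names` and its `contains` parameter are hypothetical and not part of the original notebook.

```python
def list_log_group_names(logs_client, contains=None):
    """Collect log group names across all pages, optionally filtered by substring."""
    paginator = logs_client.get_paginator("describe_log_groups")
    names = []
    for page in paginator.paginate():
        for group in page.get("logGroups", []):
            name = group["logGroupName"]
            if contains is None or contains in name:
                names.append(name)
    return names


# Usage in the notebook (requires valid credentials), assuming `cloudwatch`
# is the logs client created earlier:
# list_log_group_names(cloudwatch, contains="cosmic")
```

The returned list can be passed to `pd.DataFrame(..., columns=["log_group_names"])` in the same way as the list built in the cell above.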


### Log Streams
***

In [6]:
LOG_GROUP = "/aws/lambda/cosmic-executor"

l = []

r = cloudwatch.describe_log_streams(logGroupName=LOG_GROUP)
for stream in r['logStreams']:
    l.append(stream['logStreamName'])

df_log_streams_raw = pd.DataFrame(l, columns=["raw_streams"])

In [7]:
# use this as a stream name
df_log_streams_raw.iloc[-10].values[0]

'2024/10/30/[$LATEST]c5a5b6db81604d00b8f11913eec357c0'

In [8]:
# optional step to make the stream names more readable
# only works when every name contains "[$LATEST]"; otherwise the two-column split breaks down

df_log_streams = df_log_streams_raw["raw_streams"].str.split(r"\[\$LATEST\]", expand=True)
df_log_streams.columns = ["date_pulled", "stream_hash"]

df_log_streams

   date_pulled  stream_hash
0  2024/10/25/  21e0b49bd8af4286a13e15883d763c13
1  2024/10/25/  2ee42e29a75a4dcbbb204ad968bcb830
2  2024/10/25/  72f0c2d5be274cc08d678d1248aef18f
3  2024/10/25/  f15c35a3fbc942ab94516108b1a63d32
4  2024/10/28/  18a6fd4b9e5f4b00b7defd8ba205e4f5
5  2024/10/28/  57c6ecca5e0a448f8868c32fedb8a138
6  2024/10/29/  0a3156e23c444aa5886e99268b303d58
7  2024/10/29/  1790ee8cfd9149bf82dda03f6306ab56
8  2024/10/29/  206e52bc463d455f9916bc981f0af290
9  2024/10/29/  22bf807bda0a40f7bb9cd10b9fbdb725
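The split above depends on the literal `[$LATEST]` marker. A slightly more general sketch is to parse each name with a regular expression that accepts any bracketed version label and returns `None` when the name does not fit. The helper `parse_stream_name` and its output keys are hypothetical; the name layout (`YYYY/MM/DD/[version]id`) is taken from the stream names shown in this notebook.

```python
import re
from datetime import date

# Layout observed in this notebook: YYYY/MM/DD/[version]identifier
STREAM_RE = re.compile(r"^(\d{4})/(\d{2})/(\d{2})/\[([^\]]+)\](\w+)$")


def parse_stream_name(name):
    """Split a Lambda log stream name into date, version label, and identifier."""
    match = STREAM_RE.match(name)
    if match is None:
        return None  # name does not follow the expected layout
    year, month, day, version, stream_id = match.groups()
    return {
        "date": date(int(year), int(month), int(day)),
        "version": version,
        "stream_id": stream_id,
    }


parsed = parse_stream_name("2024/10/30/[$LATEST]c5a5b6db81604d00b8f11913eec357c0")
print(parsed)
```

Having a real `date` object rather than a string makes it easier to sort or filter the streams by day.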


In [9]:
# can now get log events

# last stream returned; the default ordering is by stream name, which starts with the date
LOG_STREAM = df_log_streams_raw.iloc[-1].values[0]

r = cloudwatch.get_log_events(logGroupName=LOG_GROUP, logStreamName=LOG_STREAM)

for event in r['events']:
    print(f"Timestamp: {event['timestamp']}, Message: {event['message']}")
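The `timestamp` field of each event is an integer count of milliseconds since the Unix epoch (UTC), which is hard to read when printed raw. The sketch below converts it to an ISO 8601 string; the helper name `format_event` is hypothetical.

```python
from datetime import datetime, timezone


def format_event(event):
    """Render one log event with a readable UTC timestamp."""
    # CloudWatch Logs timestamps are in milliseconds, so divide by 1000.
    ts = datetime.fromtimestamp(event["timestamp"] / 1000, tz=timezone.utc)
    return f"{ts.isoformat()} {event['message'].rstrip()}"


# Hand-written event in the same shape as the entries of r['events']
sample_event = {"timestamp": 1730246400000, "message": "START RequestId: example\n"}
print(format_event(sample_event))
```

In the loop above, `print(format_event(event))` could replace the existing `print` call.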

In [10]:
for stream in df_log_streams_raw["raw_streams"]:
    r = cloudwatch.get_log_events(logGroupName=LOG_GROUP, logStreamName=stream)

    # any valid events?
    for event in r['events']:
        print(f"Timestamp: {event['timestamp']}, Message: {event['message']}")
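One way to turn the printed messages into metrics is to pick out the `REPORT` line that Lambda writes at the end of each invocation. The sketch below assumes the usual tab-separated `REPORT` layout (Duration, Billed Duration, Memory Size, Max Memory Used, and sometimes Init Duration); it should be checked against the real messages in these log groups. The helper name `parse_report_line` and the output key names are hypothetical.

```python
import re

# Fields that commonly appear on a Lambda REPORT line (assumed layout).
FIELD_RE = re.compile(
    r"(Duration|Billed Duration|Memory Size|Max Memory Used|Init Duration): "
    r"([\d.]+) (ms|MB)"
)


def parse_report_line(message):
    """Return the numeric fields of a REPORT line, or None for other messages."""
    if not message.startswith("REPORT"):
        return None
    metrics = {}
    for label, value, unit in FIELD_RE.findall(message):
        key = label.lower().replace(" ", "_") + "_" + unit.lower()
        metrics[key] = float(value)
    return metrics


# Illustrative message only, not taken from the actual logs
sample = ("REPORT RequestId: abc-123\tDuration: 102.25 ms\tBilled Duration: 103 ms\t"
          "Memory Size: 128 MB\tMax Memory Used: 60 MB\t\n")
print(parse_report_line(sample))
```

Collecting the non-`None` results into a list of dictionaries gives something that can be passed straight to `pd.DataFrame`.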

In [11]:
# r


In [12]:
# TODO: get metrics out of the runs
#       and pipe them into data the team can use

# TODO: try a lambda function hello world
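For the Lambda hello world, a minimal sketch of a synchronous invocation is shown below. It assumes that a function already exists and that it returns JSON. The helper name `invoke_hello`, the function name `"hello-world"`, and the payload are placeholders, not names from this project.

```python
import json


def invoke_hello(lambda_client, function_name, payload):
    """Invoke a Lambda function synchronously and return (status code, decoded result)."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function to finish
        Payload=json.dumps(payload).encode("utf-8"),
    )
    body = response["Payload"].read()  # the payload is a stream of bytes
    return response["StatusCode"], json.loads(body)


# Usage in the notebook (requires valid credentials and an existing function),
# passing the Lambda client created from the session earlier:
# status, result = invoke_hello(<lambda client>, "hello-world", {"name": "team4"})
```

A status code of 200 indicates that a synchronous invocation completed; errors raised inside the function are reported in the response rather than as an exception, so the decoded result is worth inspecting as well.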