<a href="https://colab.research.google.com/github/IrynaGg/Python-and-cyber/blob/main/Testing_lambda_function.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Testing the lambda function code locally as a part of the PythonProject

# Connecting to AWS Simple Storage Service (S3)
---
To connect to S3, read files from it and write files to it, you will need to do the following:

*   get an access key (which you should keep hidden)
*   install the Python library `boto3`
*   write a function to get an S3 connection
*   read or write a file as you need to



### Use this code cell to install boto3 for use in this worksheet
---
Use this code to do this:
`!pip install boto3`

***Note 1***: you will need to do this each time you start a new session in the worksheet. If someone else copies your worksheet, they will need to install it too.

***Note 2***: once you have installed it in a session, you won't need to install it again during that session, so put the code in a cell that you only run once.
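One way to avoid reinstalling within a session is to check whether the package can already be imported. This is a minimal sketch using only the standard library; the helper name `is_installed` is illustrative and not part of the worksheet.

```python
import importlib.util

def is_installed(package_name):
    """Return True if the named top-level package can be imported in this session."""
    return importlib.util.find_spec(package_name) is not None

# Only a reminder is printed here; the install itself still happens in its own cell
if not is_installed("boto3"):
    print("boto3 is missing: run `!pip install boto3` in a cell first")
```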

In [1]:
!pip install boto3

Collecting boto3
  Downloading boto3-1.34.59-py3-none-any.whl (139 kB)
Collecting botocore<1.35.0,>=1.34.59 (from boto3)
  Downloading botocore-1.34.59-py3-none-any.whl (12.0 MB)
Collecting jmespath<2.0.0,>=0.7.1 (from boto3)
  Downloading jmespath-1.0.1-py3-none-any.whl (20 kB)
Collecting s3transfer<0.11.0,>=0.10.0 (from boto3)
  Downloading s3transfer-0.10.0-py3-none-any.whl (82 kB)
Installing collected packages: jmespath, botocore, s3transfer, boto3
Successfully installed boto3-1.34.59 botocore-1.34.59 jmespath-1.0.1 s3transfer-0.10.0


In [9]:
!pip install botocore



### Save your access key, and secret key in an environment variable
---

During testing you will save the keys in environment variables here. When you create a function in Lambda you will use the environment variables there.

1.  In the AWS console, go to the IAM service. Follow the instructions here to create your access keys: https://docs.google.com/document/d/1_FhKLVLSaBdck1e-Pm4mlUQkIj9BGwPfuquMPHpXdls/edit?usp=sharing
2.  Download the keys as a CSV so that you have a permanent record to copy and paste from (kept on your own device, or in your own cloud storage if you can't store it on the device). You will use them to connect to S3.

3.  Create a bucket (a container, like a folder) to store your files. Follow the instructions here: https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-bucket.html (there is quite a bit of information about changing settings; leave everything as the default for this exercise)

4.  Use the code cell below to input the access key (saved as `ACCESS_KEY`), the secret access key (saved as `SECRET_ACCESS_KEY`) and the bucket name (saved as `BUCKET_NAME`). All three are stored in environment variables that are only available in this worksheet while you are using it.

***Note***: you will need to do this each time you come back to the worksheet. If someone else copies your worksheet, they should not be able to upload to your S3 as they won't know the keys and they won't know the bucket name (as you will also save that in an environment variable).

In [2]:
import os
from IPython.display import clear_output

def set_environment_variable_values():
  ACCESS_KEY = input("Please enter the AWS access key: ")
  SECRET_ACCESS_KEY = input("Please enter the AWS secret access key: ")
  BUCKET_NAME = input("Please enter the name of the bucket in S3: ")
  os.environ['ACCESS_KEY'] = ACCESS_KEY
  os.environ['SECRET_ACCESS_KEY'] = SECRET_ACCESS_KEY
  os.environ['BUCKET_NAME'] = BUCKET_NAME
  clear_output()
  return None

set_environment_variable_values()
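Because `input` echoes what is typed, the secret key is briefly visible on screen before the output is cleared. A possible alternative, sketched here under the assumption that hidden input is wanted, is the standard library's `getpass`. The helper name `store_setting` and its `ask` parameter are hypothetical and only illustrate the idea.

```python
import os
from getpass import getpass

def store_setting(name, prompt, ask=getpass):
    """Ask for a value with `ask` (hidden input by default) and keep it in an environment variable."""
    os.environ[name] = ask(prompt)

# In the worksheet this might be called as:
# store_setting('SECRET_ACCESS_KEY', "Please enter the AWS secret access key: ")
```

Passing the prompt function in as `ask` keeps the helper easy to try without typing anything, for example with `ask=lambda prompt: "dummy"`.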


### You can use the cell below to check that your environment variables have been saved, but clear the output before you upload this to GitHub
---


In [None]:
print(os.environ.get('ACCESS_KEY'), os.environ.get('SECRET_ACCESS_KEY'), os.environ.get('BUCKET_NAME'))
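A safer way to run this check is to print only a masked form of each value, so that nothing sensitive is left in the output if it is not cleared. The sketch below is an illustration only; the helper name `masked` is an assumption and not part of the worksheet.

```python
import os

def masked(value, visible=4):
    """Hide all but the last `visible` characters; report unset values explicitly."""
    if not value:
        return "<not set>"
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]

# Show which settings exist without revealing them in full
for name in ("ACCESS_KEY", "SECRET_ACCESS_KEY", "BUCKET_NAME"):
    print(name, masked(os.environ.get(name)))
```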

### Create a connection to the S3 bucket
---

In order to work with files in the bucket you will need a 'client'.  This will be the worker that will do the fetching and storing.  The code below will set up this client and the output will show that a client has been created.
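Since `os.environ.get` returns `None` for a variable that was never set, it can help to confirm that all three settings exist before creating the client. A minimal sketch, assuming the variable names used in this worksheet; the helper `missing_settings` is hypothetical.

```python
import os

REQUIRED_SETTINGS = ("ACCESS_KEY", "SECRET_ACCESS_KEY", "BUCKET_NAME")

def missing_settings(env=os.environ):
    """Return the names of required settings that are unset or empty."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]

missing = missing_settings()
if missing:
    print("Please set these before connecting:", ", ".join(missing))
```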

In [3]:
import boto3

def get_S3_client():
  # create the S3 client using the keys saved in the environment variables
  client = boto3.client(
    "s3",
    aws_access_key_id=os.environ.get('ACCESS_KEY'),
    aws_secret_access_key=os.environ.get('SECRET_ACCESS_KEY')
  )
  return client

s3_client = get_S3_client()
print(s3_client)

<botocore.client.S3 object at 0x7ad1f4993b80>


### Opening a file from S3
---

You can upload this file to your bucket, through the AWS console.  

*  Download the file (population.csv) from here:  https://drive.google.com/file/d/1Mj2f56YrgWL6eYUF9zOf0Pph8NhJAIe0/view?usp=sharing

*  Open the AWS dashboard and select S3 as the service.  

*  Find your bucket and click on its link to open it.  

*  Click on **upload**, select the file and upload

Now that the file is in the bucket, use the code below to open it. The cell reads `schools_list.csv`, so pass the name of whichever file you uploaded to `get_file`.

In [10]:
import csv
import io
import json
import os

import boto3
import botocore  # provides exception classes such as botocore.exceptions.ClientError
import pandas as pd

def get_file(filename):
  # get the file from the bucket
  file_object = s3_client.get_object(Bucket=os.environ.get('BUCKET_NAME'), Key=filename)

  # read the file contents into an in-memory bytes buffer, then load them into a table using the pandas read_csv function
  data_file = io.BytesIO(file_object['Body'].read())
  data = pd.read_csv(data_file)
  return data

data = get_file('schools_list.csv')
print(data)



def add_school_data_to_bucket(client, filename, filedata):
    # add the rows in filedata whose school name is not already in the CSV file stored in the bucket
    bucket_name = os.environ.get("BUCKET_NAME")
    local_path = '/tmp/schools_list.csv'
    try:
        obj = client.get_object(Bucket=bucket_name, Key=filename)
    except client.exceptions.NoSuchKey:
        # the file does not exist yet, so create it with a header row
        headers = ["Name", "Latitude", "Longitude"]
        with open(local_path, 'w', newline='') as f:
            writer = csv.writer(f)
            writer.writerow(headers)
            writer.writerows(filedata)
        client.upload_file(local_path, bucket_name, filename)
        return "New file has been successfully created", filedata
    except client.exceptions.ClientError as e:
        # any other S3 error: report its code instead of overwriting the file
        return e.response["Error"]["Code"], []

    existing_rows = list(csv.reader(obj['Body'].read().decode('utf-8').splitlines()))
    names = {row[0] for row in existing_rows if row}
    new_data_rows = [row for row in filedata if row[0] not in names]
    if not new_data_rows:
        return "These schools are already in the file", []

    # write the existing rows followed by the new ones, then upload the result
    with open(local_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerows(existing_rows)
        writer.writerows(new_data_rows)
    client.upload_file(local_path, bucket_name, filename)
    return "New data has been successfully added", new_data_rows

def show_schools_data_in_bucket(client, filename):
    # read the CSV file from the bucket and return its rows as a list of lists
    try:
        obj = client.get_object(Bucket=os.environ.get("BUCKET_NAME"), Key=filename)
        data = obj['Body'].read().decode('utf-8').splitlines()
        data_array = list(csv.reader(data))
        print(data_array)
        return "The data has been found", data_array
    except client.exceptions.ClientError as e:
        if e.response["Error"]["Code"] == "NoSuchKey":
            return "There is no such file", []
        return e.response["Error"]["Code"], []
    except Exception:
        return "There was an error", []




                                               Name   Latitude  Longitude
0              Bannockburn Primary School & Nursery  51.486917   0.101556
1              St Margaret Clitherow Primary School  51.501033   0.113299
2                Blackburn Primary School & Nursery  51.486917   0.101556
3                St Greggs Clitherow Primary School  51.501033   0.113299
4                            Parkway Primary School  51.492800   0.134105
5   St Michael's East Wickham C of E Primary School  51.469340   0.119751
6                    Jo Richardson Community School  51.534071   0.126278
7              Woolwich Polytechnic School for Boys  51.503728   0.108188
8                         Eastbury Community School  51.541569   0.091161
9                                 King Henry School  51.473433   0.166117
10                            Sydney Russell School  51.548620   0.133125
11                                  Langdon Academy  51.532661   0.068439
12        The Garard Academy Bexley Pr
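The merging step inside `add_school_data_to_bucket` (keep the existing rows, add only schools whose name is not already listed) can be tried on its own without S3. The sketch below uses only the `csv` module; the function name `merge_new_rows` is illustrative, and the sample rows are taken from the table above.

```python
import csv

def merge_new_rows(existing_csv_text, new_rows):
    """Return (all_rows, added_rows), skipping new rows whose name is already present."""
    existing_rows = list(csv.reader(existing_csv_text.splitlines()))
    known_names = {row[0] for row in existing_rows if row}
    added_rows = [row for row in new_rows if row[0] not in known_names]
    return existing_rows + added_rows, added_rows

existing = "Name,Latitude,Longitude\nParkway Primary School,51.4928,0.134105\n"
incoming = [["Parkway Primary School", 51.4928, 0.134105],
            ["Langdon Academy", 51.532661, 0.068439]]
all_rows, added = merge_new_rows(existing, incoming)
print(added)
```

Comparing on the first column only mirrors the worksheet's approach; two different schools with the same name would be treated as duplicates.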

### Upload a new file into your bucket

---

For this exercise you are going to make a new data file (using the pandas `to_csv` function to write the data in CSV format into an in-memory `StringIO` text buffer, whose contents can then be stored on S3).
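The idea of building the file contents in memory, rather than on disk, can be sketched with the standard library alone. This is not the worksheet's own code (which uses pandas); `rows_to_csv_text` is a hypothetical helper and the sample row comes from the schools table.

```python
import csv
import io

def rows_to_csv_text(headers, rows):
    """Write the header and rows into an in-memory text buffer and return the CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = rows_to_csv_text(["Name", "Latitude", "Longitude"],
                            [["Langdon Academy", 51.532661, 0.068439]])
csv_bytes = csv_text.encode("utf-8")  # bytes that could be sent as the body of an upload
```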

In [14]:
import boto3
import json
#from functions import add_school_data_to_bucket, show_schools_data_in_bucket


def lambda_handler(event, context):
    filename = "schools_list.csv"
    client = boto3.client('s3')
    method = event.get("httpMethod")
    if method == "POST":
        # default response, used when the body or its "data" key is missing
        message, return_data, statuscode = "Error in the POST request occurred", [], 404
        request = event.get("body")
        # the body may arrive as a JSON string rather than a dict
        if isinstance(request, str):
            request = json.loads(request)
        if isinstance(request, dict) and "data" in request:
            data = request["data"]
            if not data:
                message = "Please enter valid data"
            else:
                return_data = save_a_copy(data, filename)
                #message, return_data = add_school_data_to_bucket(client, filename, data)
                message = "Yay! Success"
                statuscode = 200
    elif method == "GET":
        # a DataFrame cannot be passed to json.dumps, so convert it to a list of row dictionaries
        return_data = get_file(filename).to_dict(orient="records")
        message = "Yay! Success"
        statuscode = 200
        #message, return_data = show_schools_data_in_bucket(client, filename)
    else:
        message, return_data = "Error occurred", []
        statuscode = 404
    return {'statusCode': statuscode,
            'headers': {'Content-Type': 'application/json',
                        'Access-Control-Allow-Headers': 'Content-Type,X-Api-Key',
                        'Access-Control-Allow-Methods': 'POST',
                        'Access-Control-Allow-Origin': '*'},
            'body': json.dumps({"message": message, "data": return_data})
            }


In [19]:
event = {
    "httpMethod": "POST",
    "body": {
        "data": [
            [
                "Bannockburn Primary School & Nursery",
                51.4869172,
                0.1015561
            ],
            [
                "St Margaret Clitherow Primary School",
                51.50103289999999,
                0.1132992
            ]
        ]
    }
}


lambda_handler(event, None)

[['Bannockburn Primary School & Nursery', 51.4869172, 0.1015561], ['St Margaret Clitherow Primary School', 51.50103289999999, 0.1132992]]


{'statusCode': 200,
 'headers': {'Content-Type': 'application/json',
  'Access-Control-Allow-Headers': 'Content-Type,X-Api-Key',
  'Access-Control-Allow-Methods': 'POST',
  'Access-Control-Allow-Origin': '*'},
 'body': '{"message": "Yay! Success", "data": {"ResponseMetadata": {"RequestId": "4142VHF92V51S36H", "HostId": "1rj3Oyc1ack/h6+H6QPqF2lLBtjZdEKJQJWOkHp9eEPhSeMd4YtopVWf3dppeGLPajw4k+vsqQg=", "HTTPStatusCode": 200, "HTTPHeaders": {"x-amz-id-2": "1rj3Oyc1ack/h6+H6QPqF2lLBtjZdEKJQJWOkHp9eEPhSeMd4YtopVWf3dppeGLPajw4k+vsqQg=", "x-amz-request-id": "4142VHF92V51S36H", "date": "Mon, 11 Mar 2024 14:44:46 GMT", "x-amz-server-side-encryption": "AES256", "etag": "\\"ddb9933f885c2a08fb1a37b457cb3b00\\"", "server": "AmazonS3", "content-length": "0"}, "RetryAttempts": 0}, "ETag": "\\"ddb9933f885c2a08fb1a37b457cb3b00\\"", "ServerSideEncryption": "AES256"}}'}
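The `body` of the response is a JSON string, so it has to be decoded before the message and data can be inspected. A minimal sketch, assuming a response shaped like the one returned by `lambda_handler`; the helper name `read_response` is hypothetical.

```python
import json

def read_response(response):
    """Return (status code, message, data) from a handler-style response dict."""
    body = json.loads(response["body"])
    return response["statusCode"], body["message"], body["data"]

# A hand-made response in the same shape, so the helper can be tried without AWS
sample = {"statusCode": 200,
          "body": json.dumps({"message": "Yay! Success", "data": []})}
print(read_response(sample))
```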

In [18]:
def save_a_copy(filedata, filename):
  # first make a copy of the data so the original is not changed (print it so that you know it has been done)
  new_data = filedata.copy()
  print(new_data)

  # a list of rows has no column names, so add the ones used in schools_list.csv
  if not isinstance(new_data, pd.DataFrame):
    new_data = pd.DataFrame(new_data, columns=["Name", "Latitude", "Longitude"])

  # make an in-memory text buffer, then write the data into it in CSV format
  file_object = io.StringIO()
  new_data.to_csv(file_object, index=False)

  # upload the file to the bucket with the filename and the file contents
  response = s3_client.put_object(Bucket=os.environ.get('BUCKET_NAME'), Body=file_object.getvalue(), Key=filename)
  return response

response = save_a_copy(data, 'schools_list.csv')
print(response)

                                               Name   Latitude  Longitude
0              Bannockburn Primary School & Nursery  51.486917   0.101556
1              St Margaret Clitherow Primary School  51.501033   0.113299
2                Blackburn Primary School & Nursery  51.486917   0.101556
3                St Greggs Clitherow Primary School  51.501033   0.113299
4                            Parkway Primary School  51.492800   0.134105
5   St Michael's East Wickham C of E Primary School  51.469340   0.119751
6                    Jo Richardson Community School  51.534071   0.126278
7              Woolwich Polytechnic School for Boys  51.503728   0.108188
8                         Eastbury Community School  51.541569   0.091161
9                                 King Henry School  51.473433   0.166117
10                            Sydney Russell School  51.548620   0.133125
11                                  Langdon Academy  51.532661   0.068439
12        The Garard Academy Bexley Pr

### Create a serverless function that will read a file and return the data

---


Learn how to read and write files to an S3 bucket, keeping the access keys secret. Write and test a serverless function that will read a CSV file from S3 and return the contents.
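A read-only function of this kind can be sketched and tested locally by passing in a stand-in for the S3 client. Everything named here (`read_csv_handler`, `FakeS3Client`, the `demo-bucket` name) is hypothetical; a deployed Lambda handler takes only `event` and `context`, so there the client, bucket and key would be set up outside the function.

```python
import csv
import io
import json

def read_csv_handler(event, context, client, bucket, key):
    """Return the rows of a CSV object as JSON; `client` needs a get_object method."""
    try:
        obj = client.get_object(Bucket=bucket, Key=key)
    except Exception as error:  # deliberately broad in this sketch
        return {"statusCode": 500,
                "body": json.dumps({"message": str(error), "data": []})}
    rows = list(csv.reader(obj["Body"].read().decode("utf-8").splitlines()))
    return {"statusCode": 200,
            "body": json.dumps({"message": "The data has been found", "data": rows})}

class FakeS3Client:
    """Stand-in for an S3 client so the handler can be tried without AWS."""
    def __init__(self, files):
        self.files = files  # maps key -> CSV text

    def get_object(self, Bucket, Key):
        return {"Body": io.BytesIO(self.files[Key].encode("utf-8"))}

fake = FakeS3Client({"schools_list.csv":
                     "Name,Latitude,Longitude\nLangdon Academy,51.532661,0.068439\n"})
result = read_csv_handler({}, None, fake, "demo-bucket", "schools_list.csv")
print(result["statusCode"])
```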