# Programmatic AWS

### Introduction

In this lesson, we'll see how we can work with AWS using the boto3 library.

### Starting with S3

The boto3 library is a Python library that allows us to interact with the AWS resources in our account.

Let's get started by working with the S3 service.

You can get a sense of S3 by logging into AWS, searching for S3 in the toolbar, and clicking on the S3 service.

<img src="./visit-s3.png" width="100%">

From there, you'll see that S3 allows us to create buckets -- which are a bit like folders -- and in each bucket we can store objects (i.e. files).

Get started by clicking on create bucket.

<img src="./s3-buckets.png" width="100%">

And then let's create a new bucket.

> Your bucket name will need to be different.

<img src="./s3-create-bucket.png" width="80%">

Note that a bucket name must be unique across all of AWS, so you'll need to choose a name that no one else is using.

From there, click on your bucket, and you can drag and drop a file into the bucket and then click on the upload button.

<img src="./upload-file.png" width="80%">

> You can download the yelp lunch data [here](https://github.com/ledeprogram/courses/blob/master/foundations/mapping/tilemill/yelp-lunch-nyc.csv), if you prefer.

Ok, so we just uploaded an object to an S3 bucket.

By default the bucket is not available to the public -- it's only available to the creator of the bucket and other users on the account.  (Under permissions, we could change this level of access, but we don't need to here.)

### Access from the command line

We can also view both the bucket and the object from the command line.  Type `aws s3 ls` in your terminal.

You should see your bucket listed.  To list the objects inside it, run `aws s3 ls s3://your-bucket-name`, substituting your own bucket name.

> Troubleshoot: If this did not work, you may have to take a look at the lessons on [setting up your aws account](https://colab.research.google.com/github/data-engineering-jigsaw/aws-iam/blob/main/index.ipynb) and [setting up the command line](https://colab.research.google.com/github/data-eng-10-21/aws-command-line/blob/main/2-aws-command-line.ipynb).

Ok, so this worked because we previously ran `aws configure` and entered our access key ID and secret access key.  Under the hood, AWS stored these credentials in a file on your computer located at `~/.aws/credentials`.

<img src="./creds.png" width="80%">
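The credentials file is a plain INI-style text file, with one section per profile (for example `[default]`). As a rough sketch, and assuming that layout, we can inspect it with Python's standard `configparser` module. The helper names `list_profiles` and `has_access_key` are made up for this illustration and are not part of boto3 or the AWS CLI.

```python
import configparser
from pathlib import Path


def list_profiles(credentials_path):
    """Return the profile names (INI sections) found in a credentials file."""
    parser = configparser.ConfigParser()
    # read() silently skips files that do not exist, so a missing file gives []
    parser.read(credentials_path)
    return parser.sections()


def has_access_key(credentials_path, profile="default"):
    """Check whether a profile defines aws_access_key_id, without printing it."""
    parser = configparser.ConfigParser()
    parser.read(credentials_path)
    return parser.has_option(profile, "aws_access_key_id")


# Assumption: the file lives at the default location used by `aws configure`
default_path = Path.home() / ".aws" / "credentials"
print(list_profiles(default_path))
```

Note that the sketch only reports profile names, never the keys themselves, since the secret access key should not be printed or shared.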

### Working with Boto3

Ok, now enough of the AWS command line.  Let's move on to working with boto3.

We can get started by running `pip3 install boto3`, or by installing from the `requirements.txt` file in the `src` directory.

Now let's move through some of the methods for working with our S3 buckets.

> You do not need to know these too deeply, but it is good to see what you can do through the boto3 library.

### Reading from s3

In the `src/1_read_file.py` script you can see some methods for reading from s3.  Run these interactively with `python3 -i src/1_read_file.py`.  **Remember** that you'll have to change the bucket name you are reading from to match your bucket.

Ok let's walk through some of the code.

> We connect with the s3 client.

```python
import boto3
# We can connect to S3 by creating a client

s3 = boto3.client('s3')

# And from there, we can list all of the buckets we have access to

s3.list_buckets()

# If we want to list any of the individual files in a bucket, we can do so with the list objects method

s3.list_objects(Bucket='jigsaw-initial-bucket')

# And from there, we can get an individual object (or file)

obj = s3.get_object(Bucket='jigsaw-initial-bucket', Key='yelp-lunch-nyc.csv')
```
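Both `list_buckets` and `list_objects` return Python dictionaries rather than simple lists of names. As a minimal sketch, assuming the usual response shape (a `Buckets` list whose entries have a `Name`, and a `Contents` list whose entries have a `Key`, which is missing when the bucket is empty), the hypothetical helpers below pull out just the names. The sample dictionaries are hand-written stand-ins, not real AWS output.

```python
def bucket_names(list_buckets_response):
    """Extract bucket names from the dictionary returned by s3.list_buckets()."""
    return [bucket["Name"] for bucket in list_buckets_response.get("Buckets", [])]


def object_keys(list_objects_response):
    """Extract object keys from the dictionary returned by s3.list_objects(...)."""
    # "Contents" is absent for an empty bucket, so default to an empty list
    return [obj["Key"] for obj in list_objects_response.get("Contents", [])]


# Hand-written stand-ins for real responses (assumed shape)
sample_buckets = {"Buckets": [{"Name": "jigsaw-initial-bucket"}]}
sample_objects = {"Contents": [{"Key": "yelp-lunch-nyc.csv"}]}

print(bucket_names(sample_buckets))  # ['jigsaw-initial-bucket']
print(object_keys(sample_objects))   # ['yelp-lunch-nyc.csv']

# With a real client this would look like:
#   bucket_names(s3.list_buckets())
#   object_keys(s3.list_objects(Bucket='jigsaw-initial-bucket'))
```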

In the calls above, we identify a bucket with the `Bucket=` argument and an object with the `Key=` argument.  From there, we can get the contents of the object with the following:

```python
obj['Body'].read()
```
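The `read()` call returns raw bytes, so for a CSV file we usually want to decode those bytes into text and then parse the rows. Here is a small sketch using only the standard library. The helper name `parse_csv_bytes`, the UTF-8 encoding, and the sample columns are assumptions for illustration; the real Yelp file will have its own columns.

```python
import csv
import io


def parse_csv_bytes(raw_bytes, encoding="utf-8"):
    """Decode CSV bytes (such as obj['Body'].read()) into a list of row dicts."""
    text = raw_bytes.decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


# Made-up sample bytes standing in for the contents of an S3 object
sample = b"name,rating\nCafe A,4.5\nDeli B,4.0\n"
rows = parse_csv_bytes(sample)
print(rows[0]["name"])  # Cafe A

# With a real object this would look like:
#   rows = parse_csv_bytes(obj['Body'].read())
```

Keep in mind that the body is a stream, so it can only be read once per `get_object` call.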

### Write operations

Now let's move through some operations for creating buckets and then uploading objects to those buckets.

> Follow along with this in the `2_write_bucket.py` file.

```python
import boto3

# Once again, we start by connecting to the s3 client
s3 = boto3.client('s3') 

# Then we create a bucket with the following
# (outside the us-east-1 region, create_bucket also needs a CreateBucketConfiguration argument)
bucket = s3.create_bucket(Bucket='jigsaw-sample-json')
# Again, we use the Bucket argument to specify the name of a bucket

# Next, let's upload a file to our bucket.  We can do so with the s3.upload_file method.

s3.upload_file('./yelp-lunch-nyc.csv', 'jigsaw-sample-json', 'lunch.csv')

# So above we specify the local file we are uploading, then the name of the bucket we are uploading it to, and finally the name of the object (the key).  

# From there, we can see if this was successful by getting our object, and then reading the contents of it.

obj = s3.get_object(Bucket='jigsaw-sample-json', Key='lunch.csv')

obj['Body'].read()
```
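One detail to watch for: `create_bucket` with only a `Bucket` argument works when your client is using the `us-east-1` region, while other regions also expect a `CreateBucketConfiguration` with a matching `LocationConstraint`. The sketch below builds the right arguments for either case. The helper name `create_bucket_kwargs` is hypothetical, and the region names are only examples.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for s3.create_bucket for the given region."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be passed as a constraint
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


print(create_bucket_kwargs("jigsaw-sample-json", "us-east-1"))
print(create_bucket_kwargs("jigsaw-sample-json", "us-west-2"))

# With a real client this would look like:
#   s3 = boto3.client('s3', region_name='us-west-2')
#   s3.create_bucket(**create_bucket_kwargs('jigsaw-sample-json', 'us-west-2'))
```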

### One last thing

Beyond uploading a file to S3, we can also upload data directly from Python, without first saving it to a file.  To do so, we can use the `s3.put_object` method.

This time we specify the Body, Bucket and Key.

> Whereas with our s3.upload_file method we specified the filename, bucket and key.

```python
import json

json_obj = {'hello': 'world'}

s3.put_object(
    Body=json.dumps(json_obj),
    Bucket='jigsaw-sample-json',
    Key='hello_world.json'
)

# Above we use json.dumps to convert our data to json, and then set that as the body, followed by the bucket we want to write to, and the name of our object
```

From there, we can read our data from the bucket, and then use `json.loads` to convert from a JSON string back into the corresponding Python data.

```python
obj = s3.get_object(Bucket='jigsaw-sample-json', Key='hello_world.json')

text = obj['Body'].read()

hello_world_dict = json.loads(text)
```
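If we find ourselves doing this round trip often, it can help to wrap the two steps in small functions. The following is one possible sketch, not the lesson's own code: `put_json` and `get_json` are hypothetical helper names, and they take the client as an argument so they work with any `boto3.client('s3')` object. Setting `ContentType` is optional and is included here only as a suggestion.

```python
import json


def put_json(s3_client, bucket, key, data):
    """Serialize Python data to JSON and store it as an object in the bucket."""
    body = json.dumps(data).encode("utf-8")
    s3_client.put_object(
        Body=body,
        Bucket=bucket,
        Key=key,
        ContentType="application/json",  # optional metadata describing the body
    )


def get_json(s3_client, bucket, key):
    """Fetch an object from the bucket and parse its body as JSON."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read())


# With a real client this would look like:
#   s3 = boto3.client('s3')
#   put_json(s3, 'jigsaw-sample-json', 'hello_world.json', {'hello': 'world'})
#   get_json(s3, 'jigsaw-sample-json', 'hello_world.json')
```

Passing the client in, rather than creating it inside each function, also makes the helpers easy to try out with a stand-in object before touching a real bucket.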

### Summary

Ok, how's that for a whirlwind tour?  In this lesson, we started using the boto3 library to create both S3 buckets and objects, as well as read from those buckets and objects.

Here are some of the key methods.

* Read

```python
import boto3

s3 = boto3.client('s3')

s3.list_objects(Bucket='jigsaw-initial-bucket')

obj = s3.get_object(Bucket='jigsaw-initial-bucket', Key='yelp-lunch-nyc.csv')

obj['Body'].read()
```

* Write

```python
import json

bucket = s3.create_bucket(Bucket='jigsaw-sample-json')

s3.upload_file('./yelp-lunch-nyc.csv', 'jigsaw-sample-json', 'lunch.csv')

# upload some data
json_obj = {'hello': 'world'}

s3.put_object(
    Body=json.dumps(json_obj),
    Bucket='jigsaw-sample-json',
    Key='hello_world.json'
)
```