# Exercises due by EOD 2018.11.16

## goal

in this homework assignment we will utilize the `aws` `cli`, the `boto3` `python` sdk, and the `s3` service

## method of delivery

as mentioned in our first lecture, the method of delivery may change from assignment to assignment. we will include this section in every assignment to provide an overview of how we expect homework results to be submitted, and to provide background notes or explanations for "new" delivery concepts or methods.

this week you will be submitting the results of your homework via an email to **BOTH** Zach (rzl5@georgetown.edu) and Carlos (chb49@georgetown.edu) titled "2018.11.16 answers", by uploading files to an `s3` bucket, and by pushing commits to your `gu511_git_hw` repository on `github`

summary:

| exercise | deliverable | method of delivery |
|----------|-------------|--------------------|
| 1 | an `s3` bucket name | include in your submission email |
| 2 | none | none |
| 3 | a file `iam.py` | uploaded to your `s3` homework submission bucket |
| 4 | a file `spot_price_history.ipynb` | uploaded to your `s3` homework submission bucket |
| 5 | your `aws` "canonical id" value | include in your submission email |
| 6 | the publicly available `url` of your `s3`-hosted static webpage | include in your submission email |
| 7 | the publicly available `url` of an "alarm clock" message | include in your submission email |
| 7 | a `bash` command for running `alarm_clock.py` | include in your submission email |
| 8 | a file `clientside.py` | uploaded to your `s3` homework submission bucket |
| 9 | two commits on your `master` branch | pushed to `github` |

# exercise 1: create an `s3` bucket for homework submission

<span style="color:red;font-weight:bold">UPDATE: the interface for creating buckets [has changed](https://docs.aws.amazon.com/AmazonS3/latest/dev/WhatsNew.html). the previous behavior is available via checking out previous `git` commits</span>


## .1 create a new `s3` bucket

1. call it whatever you want
1. leave all other permissions alone


## .2: grant us *bucket* permissions

after that bucket has been created:

1. open the `s3` web console page for the bucket you just created
1. grant my `aws` account *bucket* permissions (for listing files, mostly)
    1. click on the "Permissions" tab
    1. click on the "Access Control List" button
    1. click the "+ add account" button
    1. add my email address (`rzl5@georgetown.edu`) and check all four boxes
    1. click "Save"

<br><div align="center"><img src="http://drive.google.com/uc?export=view&id=1wokO60WWYXtxhVrFF0VZBoGUgywmKZv5" width="1000"></div>


## .3: grant us *file* permissions within that bucket

continue in the "Permissions" tab and do the following:

1. open the `s3` web console page for the bucket you just created
1. click on the "Permissions" tab
1. click the "Bucket Policy" button -- you should see an editor
1. click the "policy generator" link at the bottom of the editor
1. generate a policy
    1. change "Select Type of Policy" to "S3 Bucket Policy"
    1. Principal = 134461086921
    1. AWS Service = Amazon S3
    1. Actions = click the "All Actions" button
    1. Amazon Resource Name (ARN) = `arn:aws:s3:::YOUR_BUCKET_NAME_HERE,arn:aws:s3:::YOUR_BUCKET_NAME_HERE/*` (replace both instances of `YOUR_BUCKET_NAME_HERE` with the simple bucket name)
        + note that the above is *two* `arn` values separated by a comma, the first is the `arn` of the bucket, and the second is an `arn` matching the path of any key in that bucket. *both* are necessary.
        + for example, my value is `arn:aws:s3:::testshare.lamberty.io,arn:aws:s3:::testshare.lamberty.io/*`
    1. click Add Statement
    1. click "Generate Policy"
    1. copy the pop-up's contents
1. back on the previous policy editor page, in the editor field, paste the `json` policy you just generated and copied
1. click "Save"

<br><div align="center"><img src="http://drive.google.com/uc?export=view&id=1w-St2yR_2Rls6OPngnFSWc1vpf0Dw2xG" width="700"></div>


what you generate in the "generate policy" step should look like the text below:

```json
{
  "Id": "Policy1542229407821",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1542229397978",
      "Action": "s3:*",
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::testshare.lamberty.io",
        "arn:aws:s3:::testshare.lamberty.io/*"
      ],
      "Principal": {
        "AWS": [
          "134461086921"
        ]
      }
    }
  ]
}
```

this has the effect of allowing our AWS account to have full permissions on your *files* instead of just the bucket.
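
if you would rather see the structure of this policy in code than in the policy generator, the same document can be assembled as a plain `python` dictionary. this is only a sketch: the function name `build_submission_policy` is made up for illustration, and the generator's `Id` and `Sid` values are left out because they are optional labels.

```python
import json


def build_submission_policy(bucket_name, account_id='134461086921'):
    """return a bucket policy dict granting `account_id` all s3 actions.

    hypothetical helper: mirrors the generated policy above, minus the
    optional `Id` and `Sid` labels
    """
    bucket_arn = 'arn:aws:s3:::{}'.format(bucket_name)
    return {
        'Version': '2012-10-17',
        'Statement': [
            {
                'Action': 's3:*',
                'Effect': 'Allow',
                # both arns are needed: the bucket itself, and every key in it
                'Resource': [bucket_arn, bucket_arn + '/*'],
                'Principal': {'AWS': [account_id]},
            }
        ],
    }


policy = build_submission_policy('testshare.lamberty.io')
print(json.dumps(policy, indent=2))
```

the printed `json` should have the same `Action`, `Effect`, `Resource`, and `Principal` values as the generated policy, and could be pasted into the bucket policy editor in its place.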


## >> going forward, I will refer to this as your "`s3` homework submission bucket" <<

##### include the name of your bucket `s3://xxxxxxxxxxxx` in your submission email

# exercise 2: using `aws boto3` on your local laptop

in class, we created an `iam role` for our `ec2` servers, and the permissions which are granted to that `iam role` are the permissions we have when using `boto3` on that server.

in one of the below steps, you *must* be using your local laptop -- that `iam role` does not apply to you in that case!

in order to use `boto3` from your local laptop, you will need to install it, and then authenticate with the access keys associated with your `iam` account. to do this, you must:

1. install `boto3` via `conda install boto3` on your local laptop
1. get your `iam` account access key id and secret access key
    1. you can get these via the `iam` web console, the `csv` file you already saved, or the credentials file on your `ec2` instance `~/.aws/credentials`
1. use those credentials when authenticating (see below)
    
assuming you have your credentials, there are two ways you can authenticate. first, you could create a profile, and in `boto3` always use profiles:

```sh
aws configure --profile your_profile_name
# enter the id
# enter the secret
# enter us-east-1
# just press enter (don't write anything)
```

and then in a `python` session

```python
import boto3

session = boto3.session.Session(profile_name='your_profile_name')
```

**alternatively**, you could directly pass your access key and secret information to the session when you create it (that is, never create a profile). in a `python` session this would look like


```python
import boto3

session = boto3.session.Session(
    aws_access_key_id='YOUR_AWS_ACCESS_KEY_ID',
    aws_secret_access_key='YOUR_AWS_SECRET_ACCESS_KEY',
    region_name='us-east-1'
)
```

which of these two you want to do is up to you!
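
whichever you choose, it can be worth confirming that the session really is authenticated as you before moving on. one way to do that, sketched below, is to ask the `sts` (security token service) client who the caller is. the helper name `describe_identity` is an assumption made for this example; `get_caller_identity` is a real `boto3` call.

```python
def describe_identity(session):
    """return (account id, arn) for the credentials behind `session`.

    hypothetical helper: `session` is a boto3 session created either way above
    """
    sts = session.client('sts')
    # get_caller_identity works for any valid credentials
    identity = sts.get_caller_identity()
    return identity['Account'], identity['Arn']
```

with either of the sessions created above, `describe_identity(session)` should return your 12-digit account id and the `arn` of your `iam` user. an error here usually means the keys or the profile name are wrong.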

##### there is nothing to submit for this exercise

# exercise 3: using `aws` `boto3` to acquire basic information

complete all incomplete parts of the following three functions. save these functions in a file `iam.py`.

in order to be able to *run* this code on your `ec2` instance (which you will probably want to do, for debugging purposes), you will need to add either the `IAMReadOnlyAccess` or `IAMFullAccess` policy to *whatever* account you are authenticating with, e.g.

+ if you are running code on your `ec2` instance and added an `IAM` service `role` to that instance, you should add that policy to that service `role`
+ if you are using a `default` configuration and authentication keys associated with a user (e.g. `gu511`, or your personal user), you should add that policy to that `user` or a `group` that `user` is in 

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
Module: iam.py

Description:
    generate lists of users, roles, and groups from the `iam` service
    
"""


import boto3


def get_users():
    # create a boto3 session object
    session = boto3.session.Session()
    
    # create an iam resource object
    iam = session.resource('iam')
    
    # iterate over all `iam users` and extract the 
    # `name` member into a list
    names = [
        user.name
        for user in iam.users.all()
    ]
    
    return names
    
    
def get_roles():
    # create a boto3 session object
    # --------------- #
    # FILL ME IN !!!! #
    # --------------- #
    
    # create an iam resource object
    # --------------- #
    # FILL ME IN !!!! #
    # --------------- #
    
    # iterate over all `iam roles` and extract the 
    # `name` member into a list
    # --------------- #
    # FILL ME IN !!!! #
    # --------------- #
    
    return roles
    
    
def get_groups():
    # create a boto3 session object
    # --------------- #
    # FILL ME IN !!!! #
    # --------------- #
    
    # create an iam resource object
    # --------------- #
    # FILL ME IN !!!! #
    # --------------- #
    
    # iterate over all `iam groups` and extract the 
    # `name` member into a list
    # --------------- #
    # FILL ME IN !!!! #
    # --------------- #
    
    return groups  
```

##### upload `iam.py` to your `s3` homework submission bucket
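
the upload itself can be done through the web console, the `aws` `cli`, or `boto3`. if you want to try the `boto3` route, a minimal sketch might look like the following. `upload_homework` is a made-up helper name, and it assumes you pass in an `s3` client you created yourself.

```python
import os


def upload_homework(s3_client, local_path, bucket_name):
    """upload `local_path` to `bucket_name`, keyed by its file name.

    hypothetical helper: `s3_client` is a boto3 s3 client
    """
    key = os.path.basename(local_path)
    # upload_file(Filename, Bucket, Key) is the s3 client's managed upload
    s3_client.upload_file(local_path, bucket_name, key)
    return 's3://{}/{}'.format(bucket_name, key)
```

for example, `upload_homework(boto3.session.Session().client('s3'), 'iam.py', 'YOUR_BUCKET_NAME_HERE')` would place the file at the top level of your bucket.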

# exercise 4: using `boto3` to get spot price history

it is possible to pull spot price history for various types of machines, in various regions, and between arbitrary start and end times. in particular, it's possible to pull an entire day's worth of spot prices, all using pretty straightforward functions in the `boto3` library.
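
to give a rough idea of the shape of such a request, here is a sketch that builds a one-day time window and pages through the `ec2` client's `describe_spot_price_history` results. the helper names `day_window` and `fetch_spot_prices` are assumptions made for this example, and the notebook may organize things differently, so treat the notebook's own outline as authoritative.

```python
import datetime


def day_window(day):
    """return (start, end) UTC datetimes covering all of `day`."""
    start = datetime.datetime.combine(
        day, datetime.time.min, tzinfo=datetime.timezone.utc
    )
    end = start + datetime.timedelta(days=1)
    return start, end


def fetch_spot_prices(ec2_client, instance_type, start, end):
    """collect every spot price record for `instance_type` between start and end.

    hypothetical helper: `ec2_client` is a boto3 ec2 client
    """
    # results come back in pages, so let the paginator follow the tokens
    paginator = ec2_client.get_paginator('describe_spot_price_history')
    pages = paginator.paginate(
        InstanceTypes=[instance_type],
        StartTime=start,
        EndTime=end,
    )
    records = []
    for page in pages:
        records.extend(page['SpotPriceHistory'])
    return records
```

each record is a dictionary; note that the `SpotPrice` value arrives as a string, so it needs converting before any arithmetic or plotting.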

download the neighboring `spot_price_history.ipynb` `jupyter` notebook file to your local laptop and launch a `jupyter` notebook server to interact with that notebook. you **must** do this on your local laptop -- `colab` will not work, as you need to have `boto3` installed and your `aws` access key credentials available to you.

the notebook contains an outline of a simple `python` process which uses `boto3` functions to download spot price information, load it into a `pandas` dataframe, and display that information using `plotly`.

it also includes several code cells which simply read

```python
# --------------- #
# FILL ME IN !!!! #
# --------------- #
```

you should... you know... fill them in.

if you have done everything correctly, the `assert` statements in that notebook should all pass without throwing `AssertionError`s.

once you have filled them all in, **save** all changes

##### upload `spot_price_history.ipynb` to your `s3` homework submission bucket

# exercise 5: getting your `aws` canonical user id

`aws` has two separate unique identifiers for your account -- your aws account id (the 12-digit number we have seen on our account pages, and in our `arn`s) and a second one called your "canonical user id". the canonical user id is, by all accounts, an artifact of the early days of `aws`.

because `s3` was among the first services, many of the steps you might go through in `s3`  to permission accounts will use this other older id (the canonical user id) instead of the account id. hooray for technical debt!!

use one of the methods described [in the amazon account identifiers documentation](https://docs.aws.amazon.com/general/latest/gr/acct-identifiers.html#FindingCanonicalId) to find your account's "canonical user id" value.
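
one of those methods goes through the `s3` api: listing your buckets also reports the owner of those buckets, and the owner's `ID` is the canonical user id. a minimal `boto3` sketch of that route is below. the helper name `get_canonical_user_id` is made up for this example, and it assumes your credentials are allowed to list buckets (the `s3:ListAllMyBuckets` permission).

```python
def get_canonical_user_id(s3_client):
    """return the canonical user id of the account behind `s3_client`.

    hypothetical helper: `s3_client` is a boto3 s3 client
    """
    # list_buckets reports the bucket owner alongside the bucket list
    response = s3_client.list_buckets()
    return response['Owner']['ID']
```

the value you get back should be a long hexadecimal string, clearly different from your 12-digit account id.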

##### include your canonical id in your submission email

# exercise 6: create a static webpage

let's make a webpage!


## 6.1: get an `html` file

we're going to need some `html` (hypertext markup language) for our webpage. If you have a page you really want to show off to the world, feel free to use it -- otherwise, feel free to use [our example](https://s3.amazonaws.com/shared.rzl.gu511.com/index.html). edit it. go wild.


## 6.2: upload an `html` file

you may remember that when we were creating buckets in class we mentioned that a bucket can be configured for "static website hosting". create a new bucket (name it whatever you want); after creating it, we will configure it to host a static website.

upload the `html` file from the previous step as `index.html`. when you upload it, grant public read access to it in the permissions page. we'll add an error document in a step below, but feel free to do that now if you already know what you want to do with that.

leave the rest of the configurations as-is.


## 6.3: turn your `bucket` into a webpage

open your `bucket` in the `s3` web console and navigate to the properties tab. update the "static website hosting" card and make sure our index document is "index.html" and our error document is "error.html". copy the endpoint `url` they give you on this card.
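
the same card can also be filled in from `boto3`, which may help make clear what the console is actually storing. the sketch below is an illustration only: the helper names are made up, and the endpoint format is inferred from the `us-east-1` example url used later in this exercise (other regions may format it differently, so copy the `url` from the console if in doubt).

```python
def website_configuration(index_document='index.html', error_document='error.html'):
    """return the settings dict that the static website hosting card edits."""
    return {
        'IndexDocument': {'Suffix': index_document},
        'ErrorDocument': {'Key': error_document},
    }


def website_endpoint(bucket_name, region='us-east-1'):
    """return the website endpoint url (format assumed from the us-east-1 example)."""
    return 'http://{}.s3-website-{}.amazonaws.com'.format(bucket_name, region)


def enable_website(s3_client, bucket_name):
    """turn on static website hosting for `bucket_name` and return its endpoint.

    hypothetical helper: `s3_client` is a boto3 s3 client
    """
    s3_client.put_bucket_website(
        Bucket=bucket_name,
        WebsiteConfiguration=website_configuration(),
    )
    return website_endpoint(bucket_name)
```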


## 6.4: try it out

navigate to that endpoint you were given while configuring the bucket to be a static webpage. what do you see?


## 6.5: 403 errors

if you missed a step along the way, the default behavior will be to return to you a `403 FORBIDDEN` error. For my bucket, for example:

```
403 Forbidden

    * Code: AccessDenied
    * Message: Access Denied
    * RequestId: A7BA5343504C695B
    * HostId: 7KtvPPnjmQAk2Ry4CeYn58+I1IL1+W+tV633d2/SX5c6XmIFqvewLMTUGwKxrgaY33tzlOF0jek=

An Error Occurred While Attempting to Retrieve a Custom Error Document
    * Code: AccessDenied
    * Message: Access Denied
```

if you didn't receive this error, skip down to the next portion. otherwise:

first, read [what a 403 error is](https://en.wikipedia.org/wiki/HTTP_403) (or [any `http` status code](https://en.wikipedia.org/wiki/List_of_HTTP_status_codes), for that matter). 

after this, read [the static website hosting documentation](https://docs.aws.amazon.com/AmazonS3/latest/dev/HowDoIWebsiteConfiguration.html) for details on how to configure permissions to allow users to access this site.

*note*: the documentation tells you how to open *an entire bucket*, so keep in mind this will make all the items in the configured bucket public. It is possible to make a single file public from the file summary page, fwiw.

using the information from the documentation, make sure that the endpoint you tried above (and possibly received a 403 for) is now publicly accessible


## 6.6: add an error page

locally, copy the `html` file you are using as your index page to a new file called `error.html`, and edit that new file to contain an error message. this might be as simple as replacing the text inside the header and first paragraph so that they contain warnings that a "page is missing" or "this url was an error". If you have your own `error.html` file, feel free to use that instead. just make it different in some human-readable way from the `index.html` file.

upload that new `error.html` file, and then go back to the static website hosting configuration for your bucket and make sure the error document is set to the newly uploaded `error.html`. 


## 6.7: verify that missing pages redirect to `error.html`

verify that a url that doesn't exist takes you to that `error.html` page. take the endpoint `url` from before (for example, mine is: http://wp.rzl.gu511.com.s3-website-us-east-1.amazonaws.com/) and add a meaningless path to the end of it. again, for example: http://wp.rzl.gu511.com.s3-website-us-east-1.amazonaws.com/pagedontexistyo.php

verify that the page that is displayed is your error page.
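
a browser is all you need for this check, but the same two requests can be scripted. the sketch below uses only the `python` standard library. the names `fetch` and `check_site` are made up for this example, and it assumes the error document is served with an error status (typically 404) rather than 200.

```python
import urllib.error
import urllib.request


def fetch(url):
    """return (status code, body text) for `url`, even for error statuses."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status, response.read().decode('utf-8', 'replace')
    except urllib.error.HTTPError as err:
        # the error document arrives *with* an error status, so read it too
        return err.code, err.read().decode('utf-8', 'replace')


def check_site(endpoint, fetch_page=fetch):
    """return True if the index loads and a missing path shows a different page."""
    base = endpoint.rstrip('/')
    index_status, index_body = fetch_page(base + '/')
    missing_status, missing_body = fetch_page(base + '/pagedontexistyo.php')
    return (
        index_status == 200
        and missing_status >= 400
        and missing_body != index_body
    )
```

calling `check_site` with your own endpoint `url` should return `True` once both pages are in place and public.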


## 6.8: let us see!

send us the url of your static webpage. we will visit

+ the path itself, and
+ a path that doesn't exist (e.g. the `http://your.url.at.amazonaws.com/pagedontexistyo.php`)

to verify that both the index and error pages are available.


##### include your `url` in the body of your submission email

# exercise 7: a really sad alarm clock


## 7.1: getting familiar with the script

download [this `python` file](https://s3.amazonaws.com/shared.rzl.gu511.com/alarm_clock.py) and review it to figure out what it does.

the elements below the `command line` comment block implement a command line interface (`cli`) for this python script. check out the `cli` options by trying out

```bash
python alarm_clock.py --help
```

*note*: this file has `boto3` as a prerequisite, so you have to execute the above command in an environment where `boto3` is installed. additionally, it assumes a `default` profile exists, or an `iam` role for an `ec2` server that it is running on, so make sure those exist as well.


## 7.2: create an alarm clock bucket

create a *new* `s3` bucket (i.e. don't use your homework submission `s3` bucket) and make it fully visible to the public.

going forward, I will refer to this as "the alarm clock bucket".


## 7.3: post a message

use the `alarm_clock.py` file to post a message to that new bucket you created in the previous step.


## 7.4: send us some proof!

email us the command you wrote to use `alarm_clock.py` to upload a message as a file to `s3`, and send us the link to the resulting file. verify that the `url` works for other users by opening it in an incognito browser window (this way you will certainly not be logged in to the `aws` web console).
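
if you are not sure what the link should look like, the sketch below builds a url in the same style as the course's shared links (e.g. the `alarm_clock.py` download link above). `public_object_url` is a made-up helper; it assumes a bucket in `us-east-1`, and you will need to supply whatever key the script actually wrote.

```python
import urllib.parse


def public_object_url(bucket_name, key):
    """return a path-style url for `key` in `bucket_name`.

    hypothetical helper: follows the format of the course's shared links
    """
    # quote the key so spaces and other special characters survive in the url
    quoted_key = urllib.parse.quote(key)
    return 'https://s3.amazonaws.com/{}/{}'.format(bucket_name, quoted_key)
```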


##### include a single `bash` command and a `url` in your submission email

# exercise 8: encrypting passwords with `kms`

in the `s3` lecture, I mentioned that `s3` supports *server-side* encryption as a simple check box. that is, whenever a file is received by `s3` it will encrypt it (jumble the contents so they are unreadable to a human) with a secret key, and it will decrypt that file (reverse that jumbling) whenever someone who is approved (e.g. *you*) requests that file. I also mentioned that *client-side* encryption -- where you as a user jumble the contents before you even send them to `s3`, and decrypt the jumbled contents when they're sent back to you -- is another option, but it requires extra effort, and I didn't elaborate on that extra effort.

before that, in the web scraping lectures, I mentioned that it might be possible to store an *encrypted* (jumbled) version of a password in plain text on your local machine, and that if you knew how to *decrypt* (un-jumble) that encrypted version, this would be more secure than saving the regular password in plain text with password protection. but I didn't discuss how you might do that at all.

let's walk through how *client-side encryption* can be done relatively easily using the `aws kms` service and the `python` `boto3` library.

the end result here will be a handful of `python` functions which can encrypt and decrypt messages, upload a secret message to `s3`, and download and decrypt that same message.


## 8.1: create a kms key

go to the `aws iam` web console page. the left-hand menu has, as one of its options, ["Encryption Keys"](https://console.aws.amazon.com/iam/home#/encryptionKeys/us-east-1). navigate to that place, and create a new key.

+ pick an alias you can remember and easily type
+ tag if you want!
+ for the administrator, select your personal `iam` user.
+ for the usage permissions, make sure to add the `iam` `role` you have given your `ec2` server
    + we added this in the lecture on `aws` `cli`, but you can see it in the "description" window for your `ec2` server with a name "IAM role", or you can right click the `ec2` instance, select "Instance Settings > Attach/Replace IAM Role"


## 8.2: encrypt a message

let's assume that in the previous step you named your key `mykey` (replace all occurrences of `mykey` below with whatever you actually used for your key alias name).

encrypting a message is simple with the `kms` client's `encrypt` method:

```python
import boto3

session = boto3.session.Session(region_name='us-east-1')

message = b'evs'
keyalias = 'mykey'

# note: the kms service does not have a *resource* object yet, so we use a client
kms = session.client('kms')

response = kms.encrypt(
    KeyId='alias/{}'.format(keyalias),
    Plaintext=message,
)

encryptedmessage = response['CiphertextBlob']
print(encryptedmessage)
```

if at any point you run into permissions issues, please resolve those issues using the `iam` service.


## 8.3: write encrypted message to an `s3` file

using the process we demoed in a mini-exercise in the `s3` lecture where we upload a *string* (not a local text file) into a file on `s3`, create a file on `s3` with the encrypted message as its contents.


## 8.4: download that file from `s3`

using the process we demoed in a mini-exercise in the `s3` lecture, download the file you just posted to `s3` into a string object (*i.e.* don't download to file).
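
as a reminder of the general shape of those two steps, here is a minimal sketch using the `s3` *client*. the lecture demo may have used a different interface (e.g. the `s3` resource), and the helper names `write_bytes` and `read_bytes` are made up for this example.

```python
def write_bytes(s3_client, bucket_name, key, payload):
    """store `payload` (bytes) as the object `key` without touching local disk.

    hypothetical helper: `s3_client` is a boto3 s3 client
    """
    s3_client.put_object(Bucket=bucket_name, Key=key, Body=payload)


def read_bytes(s3_client, bucket_name, key):
    """return the contents of the object `key` as bytes."""
    response = s3_client.get_object(Bucket=bucket_name, Key=key)
    # `Body` is a streaming object; read() pulls the whole thing into memory
    return response['Body'].read()
```

the encrypted message from 8.2 is a `bytes` object, so it can be passed as `payload` directly, and what comes back from `read_bytes` is ready to hand to `kms`.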


## 8.5: decrypt the message inside the downloaded file

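the `kms` client object you created above has a `decrypt` method that takes an encrypted message and returns the original plaintext. for symmetric keys like the one you created in 8.1, the encrypted message itself records which key was used to encrypt it, so you don't need to tell `decrypt` which key to use.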

apply that `kms.decrypt` function to the encrypted message you just downloaded

```python
import boto3

session = boto3.session.Session(region_name='us-east-1')

kms = session.client('kms')

response = kms.decrypt(CiphertextBlob=encryptedmessage)
decryptedmessage = response['Plaintext']
print(decryptedmessage)
```

## 8.6: put it all together

use the code you generated above to fill in the details of the `python` script `clientside.py`, available on my shared `s3` bucket here:

https://s3.amazonaws.com/shared.rzl.gu511.com/clientside.py

fill in the regions marked by comment boxes:

```python
# ---------------- #
# FILL THIS IN !!! #
# ---------------- #
```

## 8.7: validate your script

if your script is working as expected, you should be able to do the following. in `python`:

```python
import clientside

key_alias = 'YOUR_KEY_ALIAS'
assert clientside.decrypt(clientside.encrypt('helloworld', key_alias), key_alias) == b'helloworld' 
```

and from the command line

```sh
python clientside.py upload -k YOUR_KEY_ALIAS -m 'hello world' -b YOUR_BUCKET_NAME -s testencr.txt
python clientside.py download -k YOUR_KEY_ALIAS -b YOUR_BUCKET_NAME -s testencr.txt
b'hello world'
```


## 8.8: epilogue: application to passwords

*you don't have to do the following: this is just an explanation of how you can use the above to work with passwords*

in the "encrypt a message" section above, suppose the "message" you wanted to encrypt was a plain-text password you entered manually as

```python
import getpass

message = getpass.getpass(prompt="Your Password: ")
```

you could now easily take that plain-text password and encrypt it. you could then quite easily write that encrypted password to a file anywhere on your computer -- say, `~/.secrets/mycredentials.json`.

then, when you want to *use* that password, you could do the following with a `kms` client created the same way as above:

```python
encryptedPw = read_pw_from_file("/home/ubuntu/.secrets/mycredentials.json")
plaintextPw = kms.decrypt(CiphertextBlob=encryptedPw)['Plaintext']
```
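
`read_pw_from_file` above is left undefined. one possible shape for it, together with a matching function for saving the encrypted password, is sketched below. both the `json` layout and the name `save_encrypted_pw` are assumptions made for this example. the encrypted bytes are base64-encoded because `json` cannot hold raw bytes.

```python
import base64
import json


def save_encrypted_pw(path, encrypted_pw):
    """write ciphertext bytes to `path` as base64 text inside a json document."""
    # json cannot hold raw bytes, so encode them as ascii-safe base64 first
    payload = {'password': base64.b64encode(encrypted_pw).decode('ascii')}
    with open(path, 'w') as f:
        json.dump(payload, f)


def read_pw_from_file(path):
    """return the ciphertext bytes previously stored by `save_encrypted_pw`."""
    with open(path) as f:
        payload = json.load(f)
    return base64.b64decode(payload['password'])
```

only the encrypted bytes ever touch the disk; the plain-text password exists only in memory, after the `kms.decrypt` call.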

not too much effort for a little extra security.


##### upload your updated version of `clientside.py` to your `s3` homework submission bucket

# exercise 9: `merge` two `branch`es with overlapping edits to the same file: a `CONFLICT`!

## 9.1: make a local update to `README.md`

you, continuing your unblemished record of being astute and diligent, notice that we never added a description of the `dspipeline.py` file to our `README.md`. you decide to update that.

update `README.md` to read:

```
# 511 github repo

the primary function of this repo is to develop `git` skills over the course of the year.

## repository contents

+ `helloworld.py`
    + run with `python helloworld.py`
    + this will greet you and then tell you the current time
+ `rzl.py`
    + run with `python rzl.py`
    + this will offer you the ramblings of a teacher who thinks he is funnier than he is
+ `dspipeline.py`
    + a file containing some utilities for building data science pipelines, and an example that trains several models on adult salary data and selects the best based on cross validated metrics
```


## 9.2: update `master`

`add` this change, `commit` it with a message `README: including dspipeline description`, and `push` to `github`


## 9.3: fetch my new `branch`

after pushing to `master` and checking on `github`, you notice that I have sneakily added my *own* updates to `README.md` as a new `branch` called `yolo`.

use `git fetch --all` to download that `branch` into your local copy of the repository.

*note: this branch will be pushed on Saturday afternoon to make sure all users have had time to update their `github` repos from the previous assignment*


## 9.4: `merge` my changes in with yours

use [`git merge`](https://git-scm.com/docs/git-merge) to `merge` the change that I made on the `yolo` branch into `master`. 

`git` will do its `diff`-calculating magic and realize it can't simply combine the edits like it did last time. you get

```sh
git merge yolo
```
```
Auto-merging README.md
CONFLICT (content): Merge conflict in README.md
Automatic merge failed; fix conflicts and then commit the result.
```

ugh seriously zach... it's just... like, some communication would be appreciated. some effort.


## 9.5: resolve the `CONFLICT`

if you check `git status` right now, you will be informed that there are conflicts in the `merge` process:

```sh
git status
```

```
On branch master
Your branch is up to date with 'origin/master'.

You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Unmerged paths:
  (use "git add <file>..." to mark resolution)

	both modified:   README.md

no changes added to commit (use "git add" and/or "git commit -a")
```

let's listen to `git` -- let's fix conflicts in our file and then `git commit`

in an editor, open `README.md` and look for conflicts (demarcated by `<<<<<<< HEAD` and `>>>>>>> yolo`). edit the entire section between those two markers (removing the marker lines themselves) so that it includes only the lines that you think are appropriate (that is, yours).
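
inside a conflicted region, `git` puts your version first, then a `=======` separator line, then the other branch's version. it is easy to leave one of those three marker lines behind by accident, so a quick check before committing can help. the sketch below is one way to do that in `python`; `conflict_lines` is a made-up helper, and it will also flag any ordinary line that happens to start with seven `=` characters.

```python
CONFLICT_MARKERS = ('<<<<<<< ', '=======', '>>>>>>> ')


def conflict_lines(text):
    """return the 1-based numbers of lines in `text` that start with a marker."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.startswith(CONFLICT_MARKERS)
    ]
```

reading `README.md` into a string and passing it to `conflict_lines` should give you an empty list once the conflict is fully resolved.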

after you've edited, `add` the edited `CONFLICT`-less file and `commit` it with message `README: zach is not even trying any more`


`push` the updated `master` `branch` to `github`