<a href="https://colab.research.google.com/github/rzl-ds/gu511_hw/blob/master/hw07.ipynb" target="_parent">
    <img src="https://colab.research.google.com/assets/colab-badge.svg"/>
</a>

# Exercises due by EOD 2020.10.30

## goal

in this homework assignment we will use `iam` to set up some inter-account access, use `boto3` to integrate with several `aws` services, and (as always) update our github repos

## method of delivery

*as mentioned in our first lecture, the method of delivery may change from assignment to assignment. we will include this section in every assignment to provide an overview of how we expect homework results to be submitted, and to provide background notes or explanations for "new" delivery concepts or methods.*

this week you will be submitting the results of your homework via attachments to your submission email and commits to our shared github repos

+ subject: "hw07 answers"
+ to: rzl5@georgetown.edu, ip221@georgetown.edu

summary:

| exercise | deliverable | method of delivery | points |
|----------|-------------|--------------------|--------|
| 1 | an `arn` | included in submission email | 5 |
| 2 | none | none | 5 |
| 3 | a `python` file `myiam.py` | attached to your submission email | 5 |
| 4 | a `jupyter notebook` file `[YOUR_GU_ID].spot_price_history.ipynb` | attached to your submission email | 20 |
| 5 | a `git` commit on the `pipeline` branch | we will see it on our shared `github` repo | 5 |

total points: 40

<div style="border: 1px solid lightgrey;">

# exercise 1: give us read access to your `iam` `user`s, `group`s, and `role`s

in this exercise you will set up an `iam` `role` to allow list access of your `iam` `user`s, `group`s, and `role`s to *an entirely different `aws` account* (mine).

this sort of cross-account permission wrangling can come up when you have separate `aws` accounts for separate teams, departments, or companies working on the same project, or a separate production, UAT, or development environment. it is one way of solving the problem; another (better) way would be for you to create a new `user` in your account that you allow me to log in as.

because we have covered only `ec2` and `iam` for now, we'll focus on granting `iam` permissions. in the future we will share `s3` permissions in a nearly identical way.

## 1.1: understanding `arn`s

`aws` has a proprietary way of uniquely describing `aws` resources called the [Amazon Resource Name](https://docs.aws.amazon.com/general/latest/gr/aws-arns-and-namespaces.html) or `arn`. these long strings have a standard format:

```
arn:partition:service:region:account-id:the_resource_stuff
```

where

+ `partition` is, for our purposes, basically always `aws`
+ `service` is the `aws` service we are discussing, so `ec2`, `iam`, `s3`, etc.
+ `region` is the geographic region in which the service is being used (our default so far has been `us-east-1`)
    + depending on the service, this can sometimes be left blank (no characters, just two `::` in a row)
+ `account-id` is your globally unique `aws` account id
    + this can also sometimes be left blank
+ `the_resource_stuff` is a formatted string that is service-dependent and defines unique items within that service
    + in `iam` this might be your `user` name or `group` name
    + if the resource is described with a `path` (like in `s3` or `iam`), these paths often allow wildcards (`*`) to match multiple paths
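to make the format above concrete, here is a minimal sketch that splits an `arn` string into its five named pieces. the `Arn` tuple and `parse_arn` function are names invented for this illustration (they are not part of `boto3`), and the account id in the example is a made-up placeholder.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str      # may be an empty string (e.g. for iam)
    account_id: str  # may also be an empty string (e.g. for s3 buckets)
    resource: str


def parse_arn(arn: str) -> Arn:
    # split on the first five colons only, because the resource
    # portion is allowed to contain colons of its own
    parts = arn.split(':', 5)
    if len(parts) != 6 or parts[0] != 'arn':
        raise ValueError(f'not a valid arn: {arn!r}')
    return Arn(*parts[1:])


# hypothetical iam user arn; note the empty region between the two colons
example = parse_arn('arn:aws:iam::123456789012:user/some_user')
print(example)
```

the `region` field of `example` is an empty string, which is the "two `::` in a row" case described above.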

in a previous homework assignment we asked you to get your `iam` `user`'s `arn` and you did so by navigating to your `user`'s page in the `iam` service. on my page, that `arn` value is:

<br><div align="center"><img src="https://drive.google.com/uc?export=view&id=0ByQ4VmO-MwEEVkJzR1hiTm1zSjg" width="700px"></div>

we will use my `arn` below

## 1.2: create a policy to allow listing

use the `iam` dashboard to create a new `policy`. if you use the visual editor, you are looking to set

1. the `service` is `iam`
1. the `actions` we want are `list` actions for `user`s, `role`s, and `group`s (3 List level actions)
1. leave all other configuration options as-is

name the `policy` `allow_zach_iam_list`.

while you are creating it, you should be able to see it on a second `json` tab. the complete, correct policy will look like:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "iam:ListRoles",
                "iam:ListUsers",
                "iam:ListGroups"
            ],
            "Resource": "*"
        }
    ]
}
```
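if you would like to double check the `json` you copied out of the console, one option is a small local check like the sketch below. it only inspects the text of the policy document and never talks to `aws`. the function name `allowed_actions` is made up for this example.

```python
import json

EXPECTED_ACTIONS = {'iam:ListRoles', 'iam:ListUsers', 'iam:ListGroups'}


def allowed_actions(policy_json: str) -> set:
    """collect every action named in an `Allow` statement of a policy document"""
    policy = json.loads(policy_json)
    statements = policy['Statement']
    # `Statement` may be a single object or a list of objects
    if isinstance(statements, dict):
        statements = [statements]
    actions = set()
    for statement in statements:
        if statement.get('Effect') != 'Allow':
            continue
        action = statement.get('Action', [])
        # `Action` may be a single string or a list of strings
        actions.update([action] if isinstance(action, str) else action)
    return actions


# paste the contents of the console's json tab between the triple quotes
policy_json = """
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": ["iam:ListRoles", "iam:ListUsers", "iam:ListGroups"],
            "Resource": "*"
        }
    ]
}
"""

print(allowed_actions(policy_json) == EXPECTED_ACTIONS)
```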

## 1.3: create an `iam role` with the above policy

that `policy` is all well and good but it doesn't *apply* to anyone. no one *has* that policy.

so create a `role` for me and I'll use it!

+ create a `role` of type "Another AWS account" and use my account number: `134461086921`
    + don't check either of the two "Options" boxes
+ attach the `policy` you just created (`allow_zach_iam_list`) to this `role`
+ name this `role` `zachs_iam_listing_role`
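behind the scenes, choosing "Another AWS account" gives the `role` a *trust policy* that names the other account as the principal allowed to assume it. the sketch below builds a document of roughly that shape; treat it as an approximation of what the console generates (the console may add extra keys such as an empty `Condition`), and note that `trust_policy` is a name invented for this example.

```python
import json


def trust_policy(trusted_account_id: str) -> dict:
    """approximate trust policy letting another account assume a role"""
    return {
        'Version': '2012-10-17',
        'Statement': [
            {
                'Effect': 'Allow',
                # the `root` principal stands for the trusted account as a whole
                'Principal': {'AWS': f'arn:aws:iam::{trusted_account_id}:root'},
                'Action': 'sts:AssumeRole',
            }
        ],
    }


print(json.dumps(trust_policy('134461086921'), indent=4))
```

you can compare the printed document against the "Trust relationships" tab of the `role` once it is created.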

## 1.4: verify the `arn` of the `role` is correct

after all of this, you should be able to open the `role`'s summary page to see the `role` `arn`, and that `arn` should have a value like

```
arn:aws:iam::YOUR_ACCT_NUMBER_HERE:role/zachs_iam_listing_role
```

and it should have an attached `policy` `allow_zach_iam_list`. verify that that is the case.
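for context, here is a hedged sketch of how someone in the *other* account might use that `role` `arn`: they ask `sts` to assume the role, then build a `boto3` session from the temporary credentials that come back. this is an illustration and not necessarily how your submission will be checked. the function names and the session name `hw07_check` are made up, and `boto3` is imported inside the function so the small helper can be read and tested on its own.

```python
def session_kwargs_from_assume_role(response: dict) -> dict:
    """translate an `sts` assume-role response into `Session` keyword arguments"""
    creds = response['Credentials']
    return {
        'aws_access_key_id': creds['AccessKeyId'],
        'aws_secret_access_key': creds['SecretAccessKey'],
        'aws_session_token': creds['SessionToken'],
    }


def list_users_via_role(role_arn: str) -> list:
    # imported here so the helper above works without boto3 installed
    import boto3

    sts = boto3.client('sts')
    # `hw07_check` is an arbitrary label for the temporary session
    response = sts.assume_role(RoleArn=role_arn, RoleSessionName='hw07_check')

    # a session built from the temporary credentials acts *as the role*
    session = boto3.session.Session(**session_kwargs_from_assume_role(response))
    iam = session.resource('iam')
    return [user.name for user in iam.users.all()]
```

calling `list_users_via_role('arn:aws:iam::YOUR_ACCT_NUMBER_HERE:role/zachs_iam_listing_role')` from the trusted account would only succeed if both the trust relationship and the attached `policy` are set up correctly.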


##### include the `role` `arn` (e.g. `arn:aws:iam::YOUR_ACCT_NUMBER_HERE:role/zachs_iam_listing_role`) in the body of your submission email

<div style="border: 1px solid lightgrey;">

# exercise 2: using `aws boto3` on your local laptop

in class, we created an `iam role` for our `ec2` servers, and the permissions which are granted to that `iam role` are the permissions we have when using `boto3` on that server.

if you want to use `boto3` from your local laptop, that `iam role` is not an option -- we need to add our auth credentials to do that.

in order to use `boto3` from your local laptop, you will need to install it, and then authenticate with the access keys associated with your `iam` account.

1. on your local laptop: install `boto3` via `conda install boto3`
1. get your `iam` `user`'s access key id and secret access key
    1. you can get these from the `csv` file you saved when you created the key, or from the plain-text credentials file on your `ec2` instance (`~/.aws/credentials`)
1. use those credentials when authenticating (see below)

as we discussed in lecture, assuming you have your access key id and secret access key, there are two ways you can provide them when using `boto3`. first, you could create a profile using `aws configure`, and then you can reference that profile when starting a `boto3` session:

```sh
aws configure --profile your_profile_name
# enter the id
# enter the secret
# enter us-east-1
# just press enter (don't write anything)
```

and then in a `python` session

```python
import boto3

session = boto3.session.Session(profile_name='your_profile_name')
```

**alternatively**, you could directly pass your access key and secret information to the session when you create it (that is, never create a profile). in a `python` session this would look like


```python
import boto3

session = boto3.session.Session(
    aws_access_key_id='YOUR AWS ACCESS KEY ID GOES HERE',
    aws_secret_access_key='YOUR AWS SECRET ACCESS KEY GOES HERE',
    region_name='us-east-1'
)
```

which of these two you want to do is up to you!

after creating `session` one of the two ways listed above, you can verify this worked by running

```python
s3 = session.resource('s3')

for bucket in s3.buckets.all():
    print(bucket.name)
```
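if the bucket listing fails with a credentials or profile error and you went the `aws configure` route, it can help to check which profile names actually exist in your credentials file. the sketch below reads that file with the standard library only. `list_profiles` is a name invented for this example, and it assumes the file lives at the default `~/.aws/credentials` location unless you pass another path.

```python
import configparser
from pathlib import Path


def list_profiles(credentials_path=None) -> list:
    """return the profile names found in an aws credentials file"""
    if credentials_path is None:
        # default location used by `aws configure`
        credentials_path = Path.home() / '.aws' / 'credentials'

    parser = configparser.ConfigParser()
    # `read` quietly skips files that do not exist, leaving no sections
    parser.read(credentials_path)

    # each `[section]` header in the file is one profile name
    return parser.sections()
```

running `list_profiles()` in a `python` session should show `your_profile_name` (or whatever name you chose) if the profile was saved; an empty list suggests the file is missing or empty.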


##### nothing to submit here

<div style="border: 1px solid lightgrey;">

# exercise 3: using `aws` `boto3` to acquire basic information

in this exercise we will fill in the missing parts of the python file `myiam.py` below and run the resulting code on your personal `ec2` instance. it is up to you how you'd like to edit that file (e.g. using `nano` on your `ec2`, using your editor on your laptop and `scp`-ing the file up, etc) -- just edit the file wherever you'd like and test it on your `ec2` instance.

## 3.1: saying "pretty pretty please?" with `iam` policies

in order to be able to run this code on your `ec2` instance, you will need to have new permissions: you will need either the `IAMReadOnlyAccess` or `IAMFullAccess` policy attached to *whatever* account you are authenticating with.

in class we authenticated in two ways:

+ "config": we created a profile `gu511` and added our credentials to that
+ "role": we attached an `iam` `role` to our `ec2` instance

choose one of those two (config or role) as the way you will authenticate for this problem (I recommend the `iam` `role`). you then need to add one of the two policies above (I suggest `IAMFullAccess`) to your account. open the `aws` web console in your browser, navigate to the `iam` dashboard, and then:

+ "config": the `gu511` profile we created was associated with a `gu511` `iam` `user`, so attach your chosen policy to that `user`
+ "role": find the `role` you attached to your `ec2` instance and attach your chosen policy to that `role`

finally, the choice you make may change how you create your session (the `session = boto3.session.Session()` line in the `get_users` function below).

+ "config": if you use the `gu511` profile, you need to update that line to read `session = boto3.session.Session(profile_name='gu511')`
+ "role": no change is needed.

## 3.2: editing the file

once you have the necessary permissions, you should be able to start writing commands to try and fill out the empty blocks in the `myiam.py` template below.


```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
Module: myiam.py

Description:
    generate lists of users, roles, and groups from the `iam` service
    
"""


import boto3


def get_users():
    # create a boto3 session object
    session = boto3.session.Session()
    
    # create an iam resource object
    iam = session.resource('iam')
    
    # iterate over all `iam users` and extract the 
    # `name` member into a list
    names = [user.name
             for user in iam.users.all()]
    
    return names
    
    
def get_roles():
    # create a boto3 session object
    # --------------- #
    # FILL ME IN !!!! #
    # --------------- #
    
    # create an iam resource object
    # --------------- #
    # FILL ME IN !!!! #
    # --------------- #
    
    # iterate over all `iam roles` and extract the 
    # `name` member into a list
    # --------------- #
    # FILL ME IN !!!! #
    # --------------- #
    
    return roles
    
    
def get_groups():
    # create a boto3 session object
    # --------------- #
    # FILL ME IN !!!! #
    # --------------- #
    
    # create an iam resource object
    # --------------- #
    # FILL ME IN !!!! #
    # --------------- #
    
    # iterate over all `iam groups` and extract the 
    # `name` member into a list
    # --------------- #
    # FILL ME IN !!!! #
    # --------------- #
    
    return groups  
```

## 3.3: testing your code

to test that your script as written works

1. `cd` into the directory that contains `myiam.py`
1. start a `python` interpreter by typing `python`
1. in that `python` session, run

```python
from myiam import get_users, get_roles, get_groups

print(f'our users are: {get_users()}')
# should see a list of user names printed here

print(f'our roles are: {get_roles()}')
# should see a list of role names printed here

print(f'our groups are: {get_groups()}')
# should see a list of group names printed here
```

##### attach `myiam.py` to your homework submission email

<div style="border: 1px solid lightgrey;">

# exercise 4: using `boto3` to get spot price history

it is possible to pull spot price history for various types of machines, in various regions, and between arbitrary start and end times. in particular, it's possible to pull an entire day's worth of spot prices, all using pretty straightforward functions in the `boto3` library.
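as a rough illustration of the "one day's worth" idea, the sketch below builds the time window and filter arguments for a single UTC day. the keyword names (`StartTime`, `EndTime`, `InstanceTypes`, `ProductDescriptions`) match parameters of the `ec2` client's `describe_spot_price_history` method; everything else (the function name, the `t2.micro` instance type, the `Linux/UNIX` default) is an assumption chosen for the example, and the notebook's own approach may differ.

```python
from datetime import date, datetime, time, timedelta, timezone


def one_day_request(day: date, instance_type: str,
                    product: str = 'Linux/UNIX') -> dict:
    """keyword arguments covering one full UTC day of spot prices"""
    # midnight UTC at the start of the requested day
    start = datetime.combine(day, time.min, tzinfo=timezone.utc)
    # exactly 24 hours later
    end = start + timedelta(days=1)
    return {
        'StartTime': start,
        'EndTime': end,
        'InstanceTypes': [instance_type],
        'ProductDescriptions': [product],
    }


# hypothetical request: one day of prices for a single instance type
request = one_day_request(date(2020, 10, 1), 't2.micro')
print(request)
```

with a `session` created as in exercise 2, these arguments could be passed along as `session.client('ec2').describe_spot_price_history(**request)`.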

in the MAST 511 homework repository on `github`, there is now a neighboring notebook called `spot_price_history.ipynb`. your task for this exercise is to fill in the details of this skeleton notebook.

## 4.1: running the notebook locally

because we will be using `boto3` and your `aws` access key credentials, we **will not** be filling this in or running it in `colab`. follow these steps to get this notebook running on your local laptop.

1. make sure that you have `aws` credentials on your local laptop
    1. verify this by looking for a file `~/.aws/credentials`
    1. if it doesn't exist, install `awscli` (into any `python` environment) and run `aws configure`
    1. if you forgot where you saved the `csv` with your access key info, look for them on your `ec2` instance (in `~/.aws/credentials` on **that** server), or refer to the lecture to figure out how to acquire new keys
1. create a working `conda` environment: using `conda` on your laptop's command line,
    1. `conda create -n spot python=3`
    1. `conda activate spot`
    1. `conda install jupyter boto3 pandas botocore plotly`
1. download the neighboring `spot_price_history.ipynb` `jupyter` notebook file to your local laptop
    1. note: you should download the *raw* version of this file on `github`, not the rendered `html` version
    1. the url you want to download is [this one](https://raw.githubusercontent.com/rzl-ds/gu511_hw/master/spot_price_history.ipynb), e.g. `wget https://raw.githubusercontent.com/rzl-ds/gu511_hw/master/spot_price_history.ipynb`
1. run the notebook
    1. be in a `bash` session with the `spot` `conda` environment activated
    1. `cd` into the directory containing `spot_price_history.ipynb`
    1. execute `jupyter notebook`
    1. this should launch your web browser pointing to http://localhost:8888/ and you should be able to click on `spot_price_history.ipynb` on this home page

## 4.2: filling in the notebook

the notebook contains an outline of a simple `python` process which uses `boto3` functions to download spot price information, load it into a `pandas` dataframe, and display that information using `plotly`.

it also includes several code cells which simply read

```python
# --------------- #
# FILL ME IN !!!! #
# --------------- #
```

you should... you know... fill them in.

if you have done everything correctly, the `assert` statements in that notebook should all pass without throwing `AssertionError`s.

once you have filled them all in, please **save** that file ("file > save as...") with a name `[YOUR_GU_ID].spot_price_history.ipynb`


##### attach `[YOUR_GU_ID].spot_price_history.ipynb` to your homework submission email

<div style="border: 1px solid lightgrey;">

# exercise 5: adding a `requirements.txt` and `environment.linux.yml` file

that `dspipeline.py` file we just added imports several non-standard-library packages:

+ `numpy`
+ `pandas`
+ `plotly`
+ `sklearn`

in situations such as this it is good practice to keep a `requirements.txt` file (for `pip` installs) and / or an `environment.yml` file (for `conda` installs) in the repository with those `python` files. these two files list the packages a user should install.

in addition, because the `environment.yml` file is OS-dependent (there is a different version for each of `linux`, `mac`, and `windows`), I tend to create different ones for different operating systems. the file name is updated below to indicate this is an environment description for a `linux` OS only

I have put suitable versions of these files up as `gist`s:

+ [`requirements.txt`](https://gist.github.com/RZachLamberty/7492d97c753d086148312a2898a35080)
+ [`environment.linux.yml`](https://gist.github.com/RZachLamberty/5284e01f016af5480af9178fde21c351)

to complete this exercise:

1. make sure you have checked out the `pipeline` branch
1. copy the contents of those two `gist`s into files in your `gu511_git_hw` repository (using file names `requirements.txt` and `environment.linux.yml`)
1. `commit` them **to the `pipeline` branch** with `commit` message `python package lists`
1. `push` this `commit` to `origin pipeline`
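one wrinkle worth knowing when keeping these files up to date: the name you `import` is not always the name you install. `sklearn`, for example, is installed as `scikit-learn`. the sketch below is a simplified check that a requirements file covers a list of imports. the function names are invented for this example, the requirements text is a made-up sample (not the contents of the `gist` above), and the parsing ignores `pip` options such as `-r` or `-e` lines.

```python
import re

# import names whose installable package name differs
IMPORT_TO_PACKAGE = {'sklearn': 'scikit-learn'}


def requirement_names(requirements_text: str) -> set:
    """extract bare package names from the text of a requirements file"""
    names = set()
    for line in requirements_text.splitlines():
        # drop comments and surrounding whitespace
        line = line.split('#', 1)[0].strip()
        if not line:
            continue
        # the name ends at the first version specifier, extras bracket,
        # environment marker, or space
        name = re.split(r'[<>=!~\[; ]', line, maxsplit=1)[0]
        names.add(name.lower())
    return names


def missing_packages(imports, requirements_text: str) -> list:
    """installable package names that the requirements text does not cover"""
    have = requirement_names(requirements_text)
    wanted = [IMPORT_TO_PACKAGE.get(name, name) for name in imports]
    return sorted(pkg for pkg in wanted if pkg not in have)


# made-up sample file that forgets one package
sample = """
numpy
pandas>=1.0  # hypothetical pin
plotly
"""

print(missing_packages(['numpy', 'pandas', 'plotly', 'sklearn'], sample))
```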

<div style="border: 1px solid lightgrey;">