# aws programmatic interfaces

up until this point, all of the work we've done with `aws` services has been via their point-and-click web console. the console is pretty good (inconsistently formatted, but generally self-explanatory), but it is not the only way to interact with `aws` services.

that's good, too -- imagine you learned that the project you were about to start required you to spin up 50 `ec2` servers. think of all the fun you had clicking check boxes, scrolling through options lists, and hammering "Next" buttons when you started up your first server, and then multiply it by 50.

a good programmer knows: anything worth doing once is worth doing *only* once. spend your time automating things, not doing them!

`aws` exposes a number of different programmatic ways to work with their services, allowing us to script menial or complicated tasks like that described above. at the current time, these tools include

+ the `aws cli` (command line interface)
    + a set of `python` scripts using the `python sdk` below to implement common `aws` service tasks in a bash or powershell script setting
+ `SDK`s (software development kits) 
    + includes the following languages: `python`, blah, blah, blah (full list [here](http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html))
    + doesn't include the following languages: `R`
        + just sayin :troll:

let's talk about the `cli` and the `python sdk`

## `aws` command line interface (`cli`)

### what is the `aws cli`?

as I wrote above, amazon has created a command line interface (`cli`) tool which can be used to perform many of the most typical actions one might perform on the console. It does this by leveraging a `python` library which we will discuss below.

it can be used out of the box in `bash` scripts, `PowerShell` scripts, or plain-ol' windows `cmd` scripts. Obviously, both `python` and `botocore` (the library underneath `boto3`) are dependencies.

### why use the `cli`

there are many reasons you may wish to use the `cli`, but the primary motivations are that

1. it is possible to script and automate actions with the `cli`, and not so easy (or possible) via the web console
2. it can be called from within `shell` scripts
3. the authentication process is different and -- depending on your perspective -- less onerous
4. the interface from one service to the next is actually more consistent than the web console
5. you can plug it in to other command line tools (e.g. your data science pipeline process!)
6. I'm telling you you should and I have great authority
7. it's in the next homework

#### an example

maybe it would help to understand the sorts of things you might want to do with the `cli`. One project we are working on for "fun" at ERI is predictive modeling of power outages. we found out that a Connecticut power company (Eversource) posted [their reported outages](https://www.eversource.com/clp/outage/outagemap.aspx) on a webpage as a `json` request.

we started downloading those files every 15 minutes, and we used an `aws` `ec2` instance to do that download. We actually saved those files to that machine, but we *could* have pushed them to `s3` instead -- it would have been as easy as

```bash
aws s3 cp outage.json s3://data.eri.com/eversource/outages/
```

at the terminal prompt. the `cli` exposes every service as if it were a simple linux command -- that's pretty cool.
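since the whole point is automation, here is a minimal `python` sketch of scripting that same upload for a batch of files. the helper names (`build_s3_cp_command`, `upload_all`), the `dry_run` flag, and the two file names are made up for illustration; the bucket path is the one from the example above. it assumes the `aws` executable is on your `PATH` once `dry_run` is turned off.

```python
import subprocess


def build_s3_cp_command(local_path, s3_prefix):
    """build the argument list for `aws s3 cp` (hypothetical helper)"""
    return ['aws', 's3', 'cp', local_path, s3_prefix]


def upload_all(local_paths, s3_prefix, dry_run=True):
    """run one `aws s3 cp` per file; with dry_run=True, only return the commands"""
    commands = [build_s3_cp_command(p, s3_prefix) for p in local_paths]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if the cli exits non-zero
            subprocess.run(cmd, check=True)
    return commands


# hypothetical file names standing in for the 15-minute downloads
commands = upload_all(
    ['outage_0000.json', 'outage_0015.json'],
    's3://data.eri.com/eversource/outages/',
)
for cmd in commands:
    print(' '.join(cmd))
```

passing the command as a list (rather than one big string) means no shell quoting headaches if a file name ever has a space in it.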

### installing the `cli`

the `cli` is available for windows (download page [here](https://aws.amazon.com/cli/)), but we will all be using it from our beautiful new `ec2` instances.

in the linux and mac world, you install the `awscli` python package. let's do that!

first things first: log in to your `ec2` instance now.

you may already have the `cli` installed (given that we picked the aws-maintained free tier ubuntu `ami`). let's check:

```bash
which aws
```

<div align="center">**mini exercise: everyone installs `aws`**</div>

assuming the result of the above was a null response, we need to install the `cli`. we will do that using `pip`:

```bash
conda install pip

# awscli not available via conda :(
pip install awscli --upgrade --user
```

### using the `cli`

the `cli` acts as a standalone service when interfacing with all the different `aws` services we may own and operate.

because `aws` considers it to be on the same level as a "user," the `cli` needs to "sign in" to those services the same way we would. the authentication method and credentials we use to sign in to our console account are not fit for sharing in this way, so we use an alternative (and standard) authentication method: "access keys."

oftentimes, when you sign up for a REST api (application programming interface), you are given an API key or authentication key -- this is generally a unique public and private key pair that allows the api to know that the "owner" (in this case: you) "knows about" the requests that are being made to that api in their (your) name.

back in the very very very first lecture so many days ago, you *downloaded a csv* with a bunch of information in it, including an access key id and access key value for your account.

you still have that right?

**right?**

<div align="center">**mini-exercise: access key creation**</div>
<div align="center">**https://console.aws.amazon.com/iam/home?region=us-east-1#/users/**</div>
<div align="center">*make sure to save the access key value somewhere secure -- you can't ever get it again*</div>

each `iam` user has the ability to have up to two access keys. to create one for yourself:

1. head over to the [`iam` `users` dashboard](https://console.aws.amazon.com/iam/home?region=us-east-1#/users/).
2. select the `gu511` account we created in the `iam` section.
3. click on the "Security credentials" tab
4. click on the "Create access key" button
5. SAVE THIS ACCESS KEY VALUE!!!
    1. **you can't ever get this again**
    2. you can create other access keys, so it's not the literal end of the world
    3. ok, even if you couldn't, it's still not the literal end of the world. no need for hyperbole.
6. click "ok"

ok. at this point we should all have:

1. an access key ID (recoverable at any time from the `iam` console)
2. an access key *value* (if you don't have it, you done goofed, and have to do it again)

so how do we *use* this access key?

let's log in to our `ec2` server and try it out!

<div align="center">**log in to your ec2 instance if you haven't already**</div>

the first thing we do is run the `aws configure` command to add our access key information.

**best practice note**: `cli` supports a concept called "profiles" -- you could add more than one set of access keys to a single `ec2` instance and user account. 

for example, suppose you have one `ec2` server for generic web-scraping or etl work, but several different client projects. you could create a separate "profile" for each project.

I recommend using profiles from the beginning. let's create one with the same name as your user account for now:

```bash
# enter your access key id, access key value, and
# set the region to "us-east-1"
# set the profile name to gu511, or whatever you want
aws configure --profile gu511
```

the prompts that follow will request the access key id and access key value, so I hope you've gotten the message at this point that you should save them somewhere you won't lose them...

just kidding, after you've done this, they're actually saved in plain text on your file system: check out

```bash
# access keys live here; region settings live in ~/.aws/config
less ~/.aws/credentials
```

what do you think of this? how secure is or isn't this? who can see this?
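one way to start answering those questions: a small `python` sketch that lists the profiles in a credentials file and checks whether anyone other than the owner can read it. the function names are made up for illustration, and the path assumes the default `~/.aws/credentials` location that `aws configure` writes to.

```python
import configparser
import os
import stat


def list_profiles(path):
    """return the profile names (section headers) found in an ini-style aws file"""
    parser = configparser.ConfigParser()
    parser.read(path)
    return parser.sections()


def readable_by_others(path):
    """True if the group or other permission bits allow reading the file"""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))


credentials_path = os.path.expanduser('~/.aws/credentials')
if os.path.exists(credentials_path):
    print(list_profiles(credentials_path))
    print(readable_by_others(credentials_path))
```

if the second line prints `True`, any other user on the machine can read your access keys.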

<div align="center">**mini-exercise: verify `cli` permissions are correct**</div>

run the following from your `ec2` server after creating your profile

```bash
aws ec2 describe-instances --profile [whatever you used as your profile name]
```

you should see a big, sloppy mess of info from this -- if not, let's debug!

### the good news and the bad news

the good news is that we just

1. installed the `aws cli`
2. figured out how to authenticate it using access keys
3. successfully executed a test command

the bad news is that step 2 -- using access keys to authenticate -- is neither the easiest nor best (depending on your use case) way of authenticating.

the reason this is so is that it is possible to [assign an `iam role` to your `ec2` server](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html) and have all `aws cli` calls on that server authenticate through that role.

so why did I drag you all through this? sadism?

that statement above -- "depending on your use case" -- should be a tell-tale sign.

access keys are a complication, so they must either be solving a specific problem, or they are totally worthless. in this case, it's the former. you can only grant one role -- and one set of permissions -- to a single `ec2` instance. for you right now, that might make more sense: you're one person with one server and you feel you should have the same admin-level access to everything you own.

in the real world, you may need better, more granular security settings, and this is where the profiles, separate users, and separate authentication methods come in.

### let's set up that sweet `iam role` thing, then!

this is relatively straightforward, so let's walk through it together:

+ create the `iam role`
    1. navigate to your `iam` dashboard in the `aws` web console.
    2. click on `roles` on the left hand menu
    3. click "Create new role"
    4. on the "AWS Service Role > Amazon EC2" line, click the "Select" button
    5. for now let's just give it full access to our `ec2` instances and `s3` buckets
        1. in the Filter search bar, type "S3"
        2. select "AmazonS3FullAccess"
        3. click "Next Step" at the bottom
        4. repeat with "`ec2`"
    6. choose whatever name and description you want
    7. click "Create Role"

+ assign it to our `ec2` instance
    1. navigate to your `ec2` dashboard in the `aws` web console
    2. click on `instances` on the left hand menu
    3. right click on your `ec2` instance
    4. select "instance settings > attach/replace iam role"
    5. select the role you just created
    6. you're done!

now, when we make `aws` calls, we shouldn't need to authenticate *at all*, which is pretty great.

we probably still want one piece of our config file though -- the default region:

```
[default]
region=us-east-1
```

let's update it!

a standard act of paranoia is to never delete files you might want back, but rather move them to a backup location (often as simple as naming something `original_name.ext.bak`).

let's move our credential file somewhere for a moment so that our client can't possibly rely on it:

```bash
mv ~/.aws/credentials /tmp/aws.credentials.bak
mv ~/.aws/config /tmp/aws.config.bak
```

now, go add back in that default region -- do this either by creating a new `~/.aws/config` file that has that as its contents, or by re-running the `aws configure` command from the terminal and *only* entering the region this time.

now, what happens when we try to execute our `ec2` query again:

```bash
aws ec2 describe-instances
```

that's pretty *NOT BAD*
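why does that still work with the access keys moved out of the way? the `cli` and `boto3` look for credentials in several places, and the instance's `iam role` is near the end of the list. the following is a deliberately simplified sketch of that lookup (the real chain has more steps than these three); `guess_credential_source` is a hypothetical helper, not part of any `aws` library.

```python
import os


def guess_credential_source(environ, credentials_path):
    """simplified guess at where credentials would come from (hypothetical helper)"""
    # 1. environment variables win if present
    if 'AWS_ACCESS_KEY_ID' in environ and 'AWS_SECRET_ACCESS_KEY' in environ:
        return 'environment variables'
    # 2. next, the shared credentials file written by `aws configure`
    if os.path.exists(credentials_path):
        return 'shared credentials file'
    # 3. otherwise, on an ec2 instance, fall back to the attached iam role
    return 'iam role (instance metadata), if one is attached'


print(guess_credential_source(os.environ, os.path.expanduser('~/.aws/credentials')))
```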

## `python sdk`, *aka* `boto`

let's look at what's going on under the hood of that `aws` command. try the following:

```bash
less $(which aws)
```

but first -- what does that command actually do?

the `aws` command is, itself, a `python` script importing and using the `awscli` `python` library.

*that* library is itself built on `botocore`, the same low-level `python` library that `boto3` is built on

```bash
less $(python -c "import awscli.clidriver; print(awscli.clidriver.__file__)")
```

but first -- what does that command actually do?

`boto3` is a re-implementation of the long-running and popular amazon `python` adapter library `boto`, which itself was named for a species of fresh-water dolphin that commonly swims in rivers in portuguese-speaking South American countries (such as the Amazon river, in Brazil).

there is some overlap in the features and behavior of the two libraries (`boto` and `boto3`). the "3" in `boto3` is not strictly related to the 2 vs 3 holy war (though that is why the number exists). `aws` created the `boto3` package to normalize the way they interfaced with services (using `botocore`).

we will practice using the `boto3` library up close in a later lecture (on `s3`), but for now, let's just replicate the `ec2` information request we made with the `cli`

### install `boto3`

the first step, as usual, is installing the software. let's start by attempting to `conda` install `boto3`:

```bash
conda install boto3
```

now, if we *hadn't* just gone through the process of authentication, we would need to do that all over again. as it stands, though, we are all set with that.

see how I just stacked the deck to make it seem like `python` was sooooooo easy? how unfair! how biased!

### `boto3` organization

`aws` services are basically REST apps. any software which interacts with them will do so via `http` requests, and will implement some functions that replicate some subset of the various REST endpoints.

the `aws cli` `describe-instances` method is basically sending a `curl` request to the [DescribeInstances](http://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeInstances.html) endpoint
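to make that concrete, here is a rough sketch of what the underlying request looks like as a url. it assumes the `ec2` query api's `Action` and `Version` parameters and the usual regional endpoint naming pattern; the version string is an assumption you should check against the api reference linked above. a real request must also be signed with your credentials, which this sketch leaves out entirely, so the url will be rejected if you paste it into `curl` as-is.

```python
from urllib.parse import urlencode


def describe_instances_url(region='us-east-1', api_version='2016-11-15'):
    """build the (unsigned) url for an ec2 DescribeInstances request"""
    # api_version is an assumed value; confirm it in the ec2 api reference
    params = {'Action': 'DescribeInstances', 'Version': api_version}
    return 'https://ec2.{}.amazonaws.com/?{}'.format(region, urlencode(params))


print(describe_instances_url())
```

the signing step is most of what the `cli` and `boto3` are doing for you behind the scenes.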

`boto3` is no different, but is also a bit more powerful. there are three types of objects we should know about.

#### `session`

a `session` is basically an object encapsulating your configuration parameters and credentials. **you don't have to create a session**, but you may want to: it allows you to skip specifying configuration parameters every time you create a `client` or `resource` (more on those below).

```python
session = boto3.session.Session(region_name='us-east-1')
```

*note*: if you have created a configuration file at `~/.aws/config`, this may be redundant. Or, this may be an easy way of stipulating a profile name once and only once per script -- it's up to you!

#### `client`

the `client` objects available in `boto3` are objects created as direct, one-to-one implementations of the REST api endpoints for most services. They are auto-generated and fully functional, but a little bit more low-level and clunky than their object-oriented counterparts (`resource`s, below).

basically, you should use `client` objects only if you can't figure out how to do what you want with the `resource` objects. I wouldn't say "never", but that is clearly the intention of the module architects.

<div align="center">**mini-exercise: obtain our `ec2` information using `boto3 clients`**</div>

let's duplicate our `ec2` example from before:

```python
import boto3

session = boto3.session.Session(region_name='us-east-1')

ec2 = session.client('ec2')

insts = ec2.describe_instances()

print(insts)
print(insts['Reservations'][0]['Instances'][0]['IamInstanceProfile'])
```
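the nested indexing in that last line is fragile: it assumes there is at least one reservation, at least one instance in it, and a role attached to that instance. a minimal sketch of a more defensive version follows. `iter_instances` is a made-up helper, and `sample_response` is a hand-written stand-in (with placeholder ids) for whatever `ec2.describe_instances()` returns on your account.

```python
def iter_instances(response):
    """yield every instance dict from a describe_instances response"""
    for reservation in response.get('Reservations', []):
        for instance in reservation.get('Instances', []):
            yield instance


# hand-written stand-in for the dict returned by ec2.describe_instances();
# the ids and arn are placeholders, not real resources
sample_response = {
    'Reservations': [
        {'Instances': [{
            'InstanceId': 'i-aaaa',
            'IamInstanceProfile': {
                'Arn': 'arn:aws:iam::123456789012:instance-profile/example',
            },
        }]},
        {'Instances': [{'InstanceId': 'i-bbbb'}]},  # no role attached
    ]
}

for inst in iter_instances(sample_response):
    # .get returns None instead of raising KeyError when no role is attached
    print(inst['InstanceId'], inst.get('IamInstanceProfile'))
```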

#### `resource`

the `boto3` architects have created more object-oriented interfaces to the different service apis. the motivation, as best I can tell, is to try and create a normalized, shared interface for all services (even though the apis themselves are not necessarily similar).

in any case, `resource` implementations are recommended, and will often be "more pythonic" in syntax (e.g. favoring iteration and dot notation).

currently, the following `aws` services are supported with `resource` object interfaces:

+ cloudformation
+ cloudwatch
+ dynamodb
+ ec2
+ glacier
+ iam
+ opsworks
+ s3
+ sns
+ sqs

<div align="center">**mini-exercise: obtain our `ec2` information using `boto3 resources`**</div>

let's duplicate our `ec2` example from before:

```python
import boto3

session = boto3.session.Session(region_name='us-east-1')

ec2 = session.resource('ec2')

insts = list(ec2.instances.all())

i0 = insts[0]
print(i0.iam_instance_profile)
```

note the "pythonic" differences here: via the `aws cli` and `boto3 client` methods, we get a structured `json` object. as one example, the `iam` instance profile is accessible as

```python
insts['Reservations'][0]['Instances'][0]['IamInstanceProfile']
```

in the `boto3 resource` paradigm, we instead get

1. an `ec2` service
2. an iterator over `ec2` instances
3. individual instance objects
4. instance objects have `iam_instance_profile` member properties

### general workflow

the general workflow when using the `boto3` library is as follows:

+ *optional*: create a "session" object to encapsulate connection and credential info
+ create a `resource` object
    + will likely have, as members, `collection` iterables
    + may need to authenticate on the spot (if no `config` file exists) or specify region (if no `session`)
    + example service types: `ec2`, `iam`, `s3`, `rds`
+ use the member functions of that service object to perform standard tasks
    + call `help(serviceobject)` to find out more
+ when all else fails, try a direct `client` object

<!--div align="center">***all aboard the terminal train***</div>
<img align="middle" src="https://i.stack.imgur.com/KbxXW.png"></img-->

# END OF LECTURE

next lecture: [AWS `s3` (simple storage service)](008_s3.ipynb)