# CS 207 : Lab 11

### Amazon Web Services / Project Software Stack

#### A Distributed System

Thus far Rahul has had you use Python mostly to explore advanced programming concepts and techniques. You have used some external services and data sources, but for the most part you have been designing, developing, testing, and implementing stand-alone programs on your laptop computers. 

But stand-alone programs on a single computer are rare. Your final project is not a program; it is a <strong>system</strong>, and current scientific computing systems are increasingly <strong>distributed</strong>. 

<strong>Distributed systems</strong> consist of multiple computers or processors connected at least intermittently by a communications link. There is a sense in which even a laptop computer is a distributed system, but for our purposes it's simpler to think of networked computers collaborating on a task. <br>

Distributed systems present a host of challenges, ranging from basic questions of communications protocols, data interchange formats, and hardware reliability to subtle and complex problems of failure, asynchrony, state, concurrency, encryption, and security. <br>

<strong>Amazon Web Services (AWS)</strong> is a distributed system environment - a collection of cloud-based computing infrastructure services. It has become economical, effective, and fairly safe to lease and use these services in great part because AWS has addressed the basic practical problems of distributed computing. AWS services range from small virtual PCs running Windows or Linux to virtual private clouds (VPCs), from servers optimized for computation, memory, or network throughput, to massive data warehouses with column-store petabyte-sized databases.  <br>

Using AWS, all we have to do to build the infrastructure of your first CS207 distributed system is set up an account, get an access key, select, configure, and start the services we need, and start using them remotely. <br>

In this lab we'll use two AWS services: EC2 (Elastic Compute Cloud), for "elastic" (i.e. scalable) computing, and S3 (Simple Storage Service), for storage. We'll create a remote instance of Ubuntu 16.04 server (a flavor of Linux), and we'll provision it with the software stack you'll use on your final project. We'll also set up a new PostgreSQL database and table.

(<i>If we have time, we'll run a simple Python program on the EC2 instance that populates the PostgreSQL table with random numbers, then saves them as a browser accessible HTML document.</i>)<br>

This remote EC2 instance will become the deployment target platform for your project, so it's important that you follow and complete the various steps I'll be showing you.

### Create an Amazon Web Services account

You should already have created an Amazon Web Services account per the notice on Piazza. If you haven't, please create one now at https://aws.amazon.com/<br><br>
Once all the accounts have been created, we'll start setting up your security credentials.

### AWS security credentials: your Access Key ID and Secret Key

1) Click "MyAccount" on the AWS menu bar and select "Security Credentials"<br>
2) Expand the "Access Keys (Access Key ID and Secret Access Key)" tab<br>
3) Click "Create new access key"<br>
4) Download the new access key<br>

### Set up an EC2 instance

Now that you've created your AWS account and security credentials, we'll configure and launch an EC2 instance.<br>

<strong>[PLEASE FOLLOW THE INSTANCE LAUNCH DEMO STEP BY STEP ON YOUR MACHINE. AND PLEASE DO NOT EXECUTE THE CODE CELLS BELOW - YOU WILL CUT AND PASTE THEM INTO A TERMINAL WINDOW CONNECTED TO THE EC2 INSTANCE]</strong>

### A word of warning about the state of your EC2 instance

Your instance is now launched and running.<br>

You can use the AWS EC2 dashboard 'running instances' page to change the state of your EC2 instance, but please be aware of the following:<br>

* if you STOP the instance, its public IP and DNS address are released, and it receives new ones when you start it again (unless you have attached an Elastic IP). This can break software that depends on a hard-coded public address, but the work you did inside the instance itself is preserved <br>
* you can REBOOT an instance without losing the public IP / DNS address (and your work is preserved)
* if you TERMINATE an instance, all work inside the instance is lost
* remember to STOP the instance after the course is over to avoid incurring compute charges; a stopped instance's storage volume can still be billed, so TERMINATE the instance once you no longer need the work inside it
* even though we are using a "free tier" service, you should periodically monitor billing and charges via your AWS account page to ensure you aren't being charged



### Connect to the new EC2 instance using SSH

From your local (i.e. laptop) working folder, use ssh to connect to the EC2 instance, for example:

In [None]:
# Substitute your key pair name and the public IP or DNS address of your EC2 instance in the commands below

# Make the private key readable only by you (ssh refuses keys that other users can read)
chmod 0400 cs207pair.pem

# Open a secure terminal session on your EC2 instance
ssh -i "cs207pair.pem" ubuntu@107.22.137.53
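
If ssh complains about an unprotected private key file, the key's permissions are the usual cause. The following is a minimal Python sketch for checking them before you connect. The helper names `key_is_private` and `build_ssh_command` are illustrative only and are not part of any course tooling; the key name and address are the example values used above.

```python
import os
import stat
import tempfile


def key_is_private(path):
    """Return True if the file grants no permissions to group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def build_ssh_command(key_path, host, user="ubuntu"):
    """Build the argument list for the ssh command used in this lab."""
    return ["ssh", "-i", key_path, f"{user}@{host}"]


if __name__ == "__main__":
    # Demonstrate on a throwaway file rather than on a real key.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        demo_key = tmp.name
    os.chmod(demo_key, 0o644)          # readable by everyone
    print(key_is_private(demo_key))
    os.chmod(demo_key, 0o400)          # readable by the owner only
    print(key_is_private(demo_key))
    os.remove(demo_key)
    print(" ".join(build_ssh_command("cs207pair.pem", "107.22.137.53")))
```

The argument list returned by `build_ssh_command` could be passed to `subprocess.run` if you want to start the session from a script instead of typing the command.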


(Instead of the command-line ssh client you can also use a terminal program such as PuTTY on Windows, or SecureCRT on Windows or a Mac)

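You should now be logged into the remote EC2 instance - if you're using SSH, your prompt should now begin with the user name specified in the ssh command, followed by the instance's host name (typically of the form "ip-" plus the instance's private IP address, not the public address you connected to). If not, please let us know!<br>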

In your SSH terminal session, enter the following to refresh the list of available software packages:

In [None]:
sudo apt-get update

### About the final project software stack ...

The final project stack includes:<br><br> <strong> Python requirements (including numpy) not included with Ubuntu 16.04<br>Flask - web development framework<br>SQLAlchemy - database toolkit (used with Flask)<br>nginx - web server<br>PostgreSQL - relational database management system<br></strong>

### Get the provisioning script from an S3 bucket and run it

Now we need the provisioning shell script that installs the final project software stack on your EC2 instance.<br>

The provisioning script is in an AWS S3 bucket we set up for CS207 in another part of the AWS cloud.<br>

In the ssh session connected to your EC2 instance, get the script and two Python test programs:


In [None]:
wget http://s3.amazonaws.com/cs207-bucket/cs207_aws_ec2_stack.sh
wget http://s3.amazonaws.com/cs207-bucket/cs207_aws_ec2_stack_test.py
wget http://s3.amazonaws.com/cs207-bucket/cs207_aws_ec2_postgres_test.py

After the script has been downloaded, make it executable:

In [None]:
chmod a+x cs207_aws_ec2_stack.sh

And execute it:

In [None]:
sudo ./cs207_aws_ec2_stack.sh

While the script is running you may occasionally see the message "sudo: unable to resolve host". For now you can ignore this message.<br>

When the script has finished, the software stack should be installed on your EC2 instance. To be sure, <strong>review the installation and services report displayed at the end</strong>, which should report that a Python3 program successfully imported Flask, SQLAlchemy, etc - also, the nginx web server should be reported as "active".
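
If you want to repeat the import check yourself later, a sketch along the following lines may help. It is not the course's test program (the contents of `cs207_aws_ec2_stack_test.py` are not reproduced here); the function name `check_imports` and the module list are assumptions based on the stack described above.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and map its name to True or False."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    # Module names assumed from the stack list in this lab; adjust as needed.
    for name, ok in check_imports(["numpy", "flask", "sqlalchemy"]).items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Run it with `python3` on the EC2 instance; any module reported as missing suggests the provisioning script did not finish cleanly.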

### PostgreSQL - set up new user, database, and table

Before using PostgreSQL from Python, you need to create a user, a database, and a table.<br>

Still in the SSH terminal session on your EC2 instance, enter the following: 

In [None]:
sudo -u postgres psql

At the postgres prompt, change the postgres user password (don't forget the closing ';' !!):

In [None]:
alter user postgres password 'password';

Still at the postgres prompt, we'll create a new PostgreSQL user called <strong>ubuntu</strong>.

In [None]:
create user ubuntu createdb createuser password 'cs207password';

And create a database (also called 'ubuntu') owned by the user 'ubuntu':

In [None]:
-- after executing the command below, type \q to exit back to your EC2 instance command prompt
create database ubuntu owner ubuntu;

Now log back into the new PostgreSQL 'ubuntu' database:

In [None]:
psql -d ubuntu -U ubuntu

Now we'll create a new table called <strong>random_numbers</strong>. Enter the following:

In [None]:
-- after executing the command below, type \q to exit back to your EC2 instance command prompt
CREATE TABLE random_numbers (
    rn_id serial PRIMARY KEY,
    rn float(8) NOT NULL
);

Now we can run a test to exercise the table. At the EC2 instance prompt type:

In [None]:
python3 cs207_aws_ec2_postgres_test.py
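
To give a sense of what exercising the table from Python can look like, here is a minimal sketch. It is not the course's test script. It assumes the `psycopg2` driver is installed (an assumption; the stack may reach PostgreSQL through SQLAlchemy instead) and that the database, user, and password are the ones created above. The names `insert_random_numbers` and `main` are illustrative.

```python
import random

INSERT_SQL = "INSERT INTO random_numbers (rn) VALUES (%s)"


def insert_random_numbers(conn, count, rng=random):
    """Insert `count` random floats into random_numbers; return the values."""
    values = [rng.random() for _ in range(count)]
    cur = conn.cursor()
    try:
        # One parameter tuple per row; the driver handles quoting.
        cur.executemany(INSERT_SQL, [(v,) for v in values])
        conn.commit()
    finally:
        cur.close()
    return values


def main():
    # Assumption: psycopg2 is available and the connection settings match
    # the database, user, and password created earlier in this lab.
    import psycopg2

    conn = psycopg2.connect(
        dbname="ubuntu", user="ubuntu",
        password="cs207password", host="localhost",
    )
    try:
        inserted = insert_random_numbers(conn, 10)
        print(f"inserted {len(inserted)} rows")
    finally:
        conn.close()

# On the EC2 instance, call main() to insert ten rows.
```

Because `insert_random_numbers` accepts any DB-API style connection, it can be tried out with a stand-in object before pointing it at the real database.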

### Check that nginx is accessible from an external browser

Enter the public IP or DNS address of your EC2 instance into a browser - it should display an nginx default 'welcome' page.
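
The same check can be scripted. The sketch below uses only the Python standard library; the function name `looks_like_nginx` is illustrative, and matching on the word "nginx" in the page body is an assumption about what the default welcome page contains.

```python
import urllib.request


def looks_like_nginx(url, timeout=5):
    """Return True if `url` answers HTTP 200 and the page mentions nginx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return resp.status == 200 and "nginx" in body.lower()
    except OSError:
        # Covers connection failures, timeouts, and HTTP error responses.
        return False
```

For example, `looks_like_nginx("http://107.22.137.53/")` should return `True` once you substitute your own instance's public address. A `False` result often means that the instance's security group does not allow inbound traffic on port 80.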

### You're all set! The final project software stack is loaded and ready to use!

[will continue with html page demo if there's time]
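
The HTML step mentioned at the start of this lab (saving the random numbers as a browser-accessible page) could look roughly like the sketch below. The function name `render_html` and the output file name are hypothetical, and the numbers here are generated on the spot rather than read back from PostgreSQL.

```python
import html
import random


def render_html(numbers, title="Random numbers"):
    """Render a list of numbers as a small standalone HTML page."""
    rows = "\n".join(
        f"    <tr><td>{i}</td><td>{n:.6f}</td></tr>"
        for i, n in enumerate(numbers, start=1)
    )
    safe_title = html.escape(title)
    return (
        "<!DOCTYPE html>\n"
        f"<html>\n<head><title>{safe_title}</title></head>\n<body>\n"
        f"  <h1>{safe_title}</h1>\n"
        "  <table>\n"
        "    <tr><th>rn_id</th><th>rn</th></tr>\n"
        f"{rows}\n"
        "  </table>\n</body>\n</html>\n"
    )


if __name__ == "__main__":
    page = render_html([random.random() for _ in range(5)])
    # Hypothetical output file; copy it into the nginx web root to serve it.
    with open("random_numbers.html", "w") as fh:
        fh.write(page)
```

On Ubuntu the nginx web root is typically `/var/www/html`, though the provisioning script may configure a different location.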