# How do I set up a database locally and in the cloud?

The next set of cases involves working with databases and writing queries. To follow along, you will need to install PostgreSQL on your local machine and also create an RDS instance of PostgreSQL in AWS.

## Setting up PostgreSQL (20 min)

We are going to use PostgreSQL because it is easy to install and because it is open-source. As you become more proficient with SQL, you can try the non-commercial edition of SQL Server. It offers many advanced features as well as an excellent tool for managing the database (SQL Server Management Studio), and it is very likely that your organization already has a SQL Server database lying around. Later, we will also use Amazon RDS to work with databases in the cloud. (Don't worry if you don't understand these concepts yet; we will cover them in the upcoming cases!)

PostgreSQL was originally developed for UNIX-like platforms, but it was also designed to be portable, so it runs on platforms such as macOS, Solaris, and Windows. To download PostgreSQL, go to the download page for the PostgreSQL installers, https://www.enterprisedb.com/downloads/postgres-postgresql-downloads, and select the installer version for your platform. Once the installer has downloaded, follow these steps:

1. Double-click the installer file. An installation wizard will appear and guide you through several steps where you can choose the options you would like to have in PostgreSQL.

<table class="tab">
   
  <tr>
    <td class="second" width="60%"><div align="left">2. Click the "Next" button</div></td>
    <td class="second"><img src="images/pg2.png" width="400"></td>
  </tr>
  
  <tr>
    <td class="second" width="60%"><div align="left">3. Specify the installation folder; choose your own or keep the default folder suggested by PostgreSQL installer and click the "Next" button</div></td>
    <td class="second"><img src="images/pg3.png" width="400"></td>
  </tr>
  
  <tr>
    <td class="second" width="60%"><div align="left">4. Select the components to install and click the "Next" button. Note: we only need the server, but if you already know the pgAdmin tool, feel free to install and use it as well.</div></td>
    <td class="second"><img src="images/pg4.png" width="400"></td>
  </tr>  
  
  <tr>
    <td class="second" width="60%"><div align="left">5. Select the database directory to store the data in. Just leave it as the default or choose your own and click the "Next" button</div></td>
    <td class="second"><img src="images/pg5.png" width="400"></td>
  </tr>  
  
  <tr>
    <td class="second" width="60%"><div align="left">6. Enter the password for the database superuser (postgres)</div></td>
    <td class="second"><img src="images/pg6.png" width="400"></td>
  </tr>  
  
  <tr>
    <td class="second" width="60%"><div align="left">7. Enter the port for PostgreSQL. Make sure that no other applications are using this port. Leave it as the default if you are unsure</div></td>
    <td class="second"><img src="images/pg7.png" width="400"></td>
  </tr>  
  
  <tr>
    <td class="second" width="60%"><div align="left">8. Choose the default locale used by the database and click the "Next" button</div></td>
    <td class="second"><img src="images/pg8.png" width="400"></td>
  </tr>  
 
  <tr>
    <td class="second" width="60%"><div align="left">9. You are now ready to install PostgreSQL! Click the "Next" button to start installing. (The installation may take a few minutes to complete.)</div></td>
    <td class="second"><img src="images/pg9.png" width="400"></td>
  </tr>  
  
  <tr>
    <td class="second" width="60%"><div align="left">10. Click the "Finish" button to complete the PostgreSQL installation</div></td>
    <td class="second"><img src="images/pg10.png" width="400"></td>
  </tr>  
</table>

<style>
.tab {border-collapse:collapse;}
.tab .first {border-bottom:1px solid #EEE;}
.tab .second {border-top:1px solid #CCC;box-shadow: inset 0 1px 0 #CCC}
</style>
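Step 7 asks you to make sure that no other application is using the chosen port. One way to check this before (or after) running the installer is the minimal sketch below, which uses only the Python standard library. The helper name `port_is_free` is hypothetical, and the sketch assumes the default PostgreSQL port of 5432 and that the check runs on the same machine as the installer.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to (host, port).

    The check tries to bind a throwaway socket to the address. If another
    application already holds the port, the bind raises OSError.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True
```

Calling `port_is_free(5432)` before installation should return `True` if the default port is available. Once PostgreSQL is installed and running on that port, the same call is expected to return `False`.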

### Connecting to ```PostgreSQL``` 

After installing the database, let's connect to it to confirm that the installation works.

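```python
from sqlalchemy import create_engine, text

DB_USERNAME = 'postgres'
DB_PASSWORD = 'admin'  # replace with the superuser password you chose in step 6

engine = create_engine(f'postgresql://{DB_USERNAME}:{DB_PASSWORD}@localhost/postgres', max_overflow=20)

# Open a connection, run a trivial query, and close the connection on exit
with engine.connect() as connection:
    print(list(connection.execute(text("SELECT 'Hello world';"))))
```

```
[('Hello world',)]
```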
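The cell above embeds the password directly in the connection URL. That works for a simple password, but it can break if the password contains characters such as `@`, `/`, or `:`, and it leaves the password in the notebook. The sketch below shows one possible alternative: read the password from an environment variable and percent-encode it before building the URL. The helper names `build_postgres_url` and `local_url_from_env` are hypothetical, and reading from `PGPASSWORD` is an assumption made for illustration (any variable name would do).

```python
import os
from urllib.parse import quote


def build_postgres_url(username: str, password: str, host: str = "localhost",
                       port: int = 5432, database: str = "postgres") -> str:
    """Build a postgresql:// URL with the credentials percent-encoded."""
    # safe="" makes quote() also encode "/" so that no reserved character survives
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"postgresql://{user}:{secret}@{host}:{port}/{database}"


def local_url_from_env(var_name: str = "PGPASSWORD") -> str:
    """Build the local connection URL using a password taken from the environment."""
    password = os.environ.get(var_name, "")
    return build_postgres_url("postgres", password)
```

The resulting string can be passed to `create_engine(...)` in place of the f-string used above.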

## Setting up a cloud database using RDS and importing data (45 min)

Let's now set up a real database so that we can see the different design considerations. To do this, we'll be using the ```RDS``` product from Amazon Web Services (AWS). Once we have created the database, we will connect to it using the `psql` command-line client, which you should have just finished installing along with PostgreSQL:

1. Log into your AWS account and select "RDS" from the service list. You should see a screen like the one below, where you can hit the "Create database" button:

![Create Database](images/create_db.png)

2. The next option you'll see asks you if you want to use "standard create" or "easy create". Easy might sound tempting, but **choose "standard"** as we'll have to set up our database for public use so we can connect to it locally.

3. Choose "PostgreSQL" as the database type, leave the version at the default AWS has chosen for you (11.6-R1 at the time of writing), and choose "Free Tier".

4. Under "Storage" turn off "Storage autoscaling". This will prevent any unexpected future charges.

![Turn off autoscaling](images/autoscale.png)

5. Under the next section, choose a name for your database instance. Remember this is the machine that is hosting the database software, not the database itself (one RDS instance can host many databases), so I'm calling mine `ds4a-demo-instance` to reflect this, although we'll only be creating a single database for now. 

6. You can leave the master username as `postgres` and ask RDS to autogenerate a password (we'll be able to see this password at the next step):

![Set DB password](images/set_db_password.png)

7. You can leave the next settings as their defaults until you get to the "Connectivity" section. Usually, you'll set up an RDS instance to play with other infrastructure within your AWS account, such as EC2 servers. In our case, we want to push data in and out of the database directly from our local machine as the client, so we'll have to set our database up for "public access". This is generally less secure, but we'll add some firewall rules in a bit to make sure that only we can access it:

      * Expand the "Additional connectivity configuration" section

      * Set "publicly accessible" to "Yes"

      * Under "VPC security group", choose to "Create new", and give it a name like `allow-local-access`. This will create a firewall rule that will allow you to connect to your database on port 5432 (the default for PostgreSQL) using your current IP address. If you are using public WiFi, a hotspot, or if you think your IP address is likely to change soon for any reason, note that you'll have to modify this security group any time your IP address changes:

![Create Security Group](images/create-sec-group.png)

8. Press the "Create database" button in the bottom right, and you'll be taken back to the overview page where you can see your database being created. At the top, there'll be a notification where you can press "View credential details" to access the master password that was automatically generated. **Take note of this as you can only see it once.** Note: this creates a database in the default VPC. If your default VPC is not configured for DNS connections, you will need to create a new VPC. Please see 'Appendix 1: Troubleshooting RDS creation' for instructions on how to achieve this.

![View credentials](images/view_creds.png)

9. Once your database becomes "available" (you might need to press the "refresh" button indicated below to see the change), you can connect to it. Click on the name of the database (`ds4a-demo-instance` in our example), to find out the connection details:

![DB available](images/db-available.png)

10. Once you click on the database, you should see the endpoint that you need on a screen similar to the one shown below. You need this endpoint to connect to the database from your local machine.

![DB Endpoint](images/db-endpoint.png)

11. Locally, open a terminal and run the following command, substituting [endpoint] with the one that you noted from the RDS console above.

```bash
psql -h [endpoint] -U postgres
```

This will connect to our instance's default database using the master username. It will prompt you for the password and you can enter the autogenerated password from above. You should now see a SQL prompt, similar to the image below:

![PSQL prompt](images/psql-prompt.png)

We've successfully created a cloud database and connected to it!
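If the `psql` command hangs or times out instead of prompting for a password, one likely cause is the firewall rule from step 7: the security group only allows your IP address at the time of creation, so the connection stops working once your IP changes. The sketch below is a rough way to test whether the database port is reachable at all from your machine, using only the Python standard library. The helper name `can_reach` is hypothetical, and the 5-second timeout is an arbitrary choice.

```python
import socket


def can_reach(host: str, port: int = 5432, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    This only tests network reachability. It does not log in to the
    database, so a True result says nothing about the username or password.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS lookup failures
        return False
```

Calling `can_reach("[endpoint]")`, with `[endpoint]` replaced by the endpoint from the RDS console, should return `True` when the security group allows your current IP address. A `False` result suggests checking the security group rule or the "publicly accessible" setting before looking at credentials.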