# Fullstack Terraform Lab

### Introduction

In this lesson, we'll use Terraform to automatically deploy our flask application to AWS.  To do this, we'll need to set up an RDS instance as well as an EC2 instance.  We'll also need to automatically set up our EC2 instance to download the images for the flask backend and the streamlit frontend, and of course start up the containers.

Let's get started.

### Building our backend

If you look at our llm-scraper codebase, you'll see that we currently have folders for `api` and `frontend`.  These folders are for holding our frontend streamlit application and our backend flask application.

Let's start with our backend application.  We want the `api` folder to look like the following.

```bash
.
├── Dockerfile
└── app
    ├── .env
    ├── .flaskenv
    ├── __init__.py
    ├── data
    ├── models
    ├── requirements.txt
    ├── server.py
    ├── settings.py
    └── setup.py
```

* Notice that we moved the `.env`, `.flaskenv`, and `server.py` files into the `api/app` folder, as everything here is specific to the api.
* We also changed variables like `dev_db` to `db_conn` in our `server.py` file, and changed `DEV_DB` to `DB_CONN` in the `.env` file.  This makes sense as we will not always be connecting to the development database.

Now, we cannot just build our codebase into a Docker image right away, as there is some initial setup that we'll need to complete.  Namely, we'll want to add code that allows our flask application to set up our database.

* Database setup

To do this, we'll add two command line functions to the `server.py` file.

```python
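# Registers `flask init-db` as a command line command.
@app.cli.command("init-db")
def init_db_command():
    """Create the database tables defined by our SQLAlchemy models."""
    db.create_all()

# seed-db command (you'll add this one yourself in the next step)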
```

Ok, so the first function adds a cli command called `init-db` that will create all of the tables, which are derived from our SQLAlchemy models.

> Give it a shot by setting the environmental variables to connect to a local database.  You can just create a new database, and replace the `db_conn` variable with the connection string for that database.

For example, if you connect to postgres and create a database called `sample_scraper`, you can comment out the original `db_conn` and update `db_conn` to be:

`db_conn = 'postgresql://localhost/sample_scraper'`

Then from the folder that contains `server.py`, run `flask init-db`, and then connect to the `sample_scraper` database to confirm that both the positions and scrapings tables have been created.

> <img src="./sample_tables.png" width="40%">

Next, it's your turn to add a command line function.  In the `server.py` file, add a command called `seed-db`, which should decorate a function called `seed_db` that does the following:

* Counts the number of scrapings
* Counts the number of positions
* Prints the number of scrapings and positions with some text like, "`Will seed scrapings and positions if there are none in the db.  Currently there are ... scrapings and ... positions`"
* Seeds scrapings only if there are zero in the database, and seeds positions only if there are zero in the database
* Uses the `seed_scrapings_from_csv` and `seed_positions_from_csv` functions, which are already defined in the `setup.py` file

Test out your function by calling `flask seed-db` from the command line, and confirm that there are scrapings and positions in the database.

Also, call `flask run`, and visit `localhost:5000/positions` to confirm that our flask api is serving our seeded positions.

* **Reset the db_conn:** Ok, so now we'll want to go back to the `settings.py` file, and make sure we are back to using our original `db_conn` string.

```python
db_conn = f'postgresql://{username}:{password}@{host}/{database}'
```

This is because we want to make sure that our `db_conn` string references the environmental variables, as docker lets us pass environmental variables when we boot up our container.

> Note: Even if we have environmental variables in the `.env` file, any environmental variables we specify with `docker run -e` will take precedence over those in the `.env` file.  This is a good thing, as it allows us to change those variables more easily.
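
To see why the runtime value wins, here is a minimal shell sketch of that precedence rule. It assumes the `.env` file is loaded the way python-dotenv loads it by default, which only fills in variables that are not already set. The variable names `DB_PASSWORD` and `DB_HOST` are hypothetical, so use whatever names your `settings.py` actually reads.

```bash
# Hypothetical variable names, for illustration only.
unset DB_PASSWORD DB_HOST

# 1. A value supplied at runtime, as if by:
#    docker run -e DB_PASSWORD=from_docker_run ...
DB_PASSWORD="from_docker_run"

# 2. Values from the .env file are applied only where nothing is set yet.
#    ${VAR:=value} assigns the value only if VAR is unset or empty.
: "${DB_PASSWORD:=from_dotenv_file}"
: "${DB_HOST:=host_from_dotenv_file}"

echo "DB_PASSWORD=$DB_PASSWORD"
echo "DB_HOST=$DB_HOST"
```

The password keeps its runtime value, while the host falls back to the value from the file because nothing was passed in for it.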

* Set up an AWS database

At this point, it's probably good to set up an RDS instance, and record the values of `username`, `password`, `host`, and `database`.  You can place them in the `.env` file if you like, or pass them when booting up the container (i.e., at runtime).

### Setting up docker

Ok, so now let's build the docker image.  The Dockerfile is a little tricky, so we have done this for you.  The key issue is that we want multiple things to occur when we boot up our docker container (aka "at runtime").  When we create a docker container we want to:

* Create our database tables (if they do not already exist)
* Seed our `positions` and `scrapings` tables if they do not already have data in them.
* Run our flask application by default.  

Ok, so to achieve this we do a couple of things:

1. Using entrypoint and command in our Dockerfile

If you look at the Dockerfile, you'll see the following towards the bottom.

```Dockerfile
ENTRYPOINT ["sh", "./setup.sh"]

CMD ["flask", "run", "--host=0.0.0.0"]
```

The `ENTRYPOINT` always runs when the container starts, and `CMD` supplies the default arguments passed to whatever is specified in the entrypoint.  So in this case, it's as if we are doing:

`sh ./setup.sh flask run --host=0.0.0.0`

So this will run `./setup.sh` and pass `flask run --host=0.0.0.0` to that script as arguments.

What will the `setup.sh` file do with the `flask run --host=0.0.0.0` argument?

2. `setup.sh` file

If you look at the setup.sh file you'll see the following:

```bash
flask init-db
flask seed-db

exec "$@"
```

So this will call our `init-db` and `seed-db` functions to create and seed our tables.  In the last line, `"$@"` expands to all of the arguments that were passed to the script, and `exec` runs them as a command in place of the shell.  So when we set up our Dockerfile to run the script with:

`sh ./setup.sh flask run --host=0.0.0.0`

The arguments of `flask run --host=0.0.0.0` will be run in that last line.

We can play around with this.  For example, if we run `sh setup.sh echo hello world`, then we will have created and seeded our tables, and displayed `hello world` at the end.

So in this scenario, the `CMD ["flask", "run", "--host=0.0.0.0"]` says to pass `flask run --host=0.0.0.0` to our entrypoint `sh setup.sh`, and then the `setup.sh` file executes the `flask run` command after first creating the tables and seeding the database.

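And remember we can override that default command at run time with something like:

`docker run image_name flask run --host=0.0.0.0 --debug`

That means the `setup.sh` script will receive those arguments and run them instead of the default command.  Because the override replaces the whole `CMD`, we have to repeat `--host=0.0.0.0` ourselves (and the `--debug` flag on `flask run` assumes a recent version of Flask).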
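
The pass-through behavior can be tried without docker or a database. The following is a minimal sketch, where `demo_setup.sh` is a hypothetical stand-in for `setup.sh` and the two flask commands are replaced with `echo` so that it runs anywhere.

```bash
# demo_setup.sh is a hypothetical stand-in for setup.sh. The two flask
# commands are replaced with echo so the sketch runs without a database.
demo_dir=$(mktemp -d)
cat > "$demo_dir/demo_setup.sh" <<'EOF'
echo "init-db would run here"
echo "seed-db would run here"

# "$@" expands to every argument the script received, and exec replaces
# the shell process with that command.
exec "$@"
EOF

# Default case: same shape as `sh ./setup.sh flask run --host=0.0.0.0`,
# with echo standing in for the real flask command.
default_output=$(sh "$demo_dir/demo_setup.sh" echo flask run --host=0.0.0.0)

# Override case: different arguments, but the setup steps still run first.
override_output=$(sh "$demo_dir/demo_setup.sh" echo hello world)

printf '%s\n' "$default_output" "---" "$override_output"
rm -rf "$demo_dir"
```

In both cases the two setup lines are printed first, and only the final line changes with the arguments. That is the same relationship as between `ENTRYPOINT` and `CMD`: the entrypoint is fixed, and the arguments after it can be swapped at run time.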

* Building our image

Ok, so back to the show.  Now build the image, tagging it with your dockerhub username first.  Here's an example, so swap out `jek2141` for your username.

```bash
docker build -t jek2141/scraper_backend .
```

So now we'll want to boot up a container from our image locally before trying it on our ec2 instance, but doing so will take a fairly long line.  So you may want to write it out in the `ec2-setup.sh` file, and then copy and paste it into your terminal.

Ok, so boot up your container, but make sure you pass through all of the database environmental variables with `docker run -e`.

> Note: These database variables should point to an rds instance on Amazon (that is publicly available).  Connecting to your laptop's database from Docker is much more difficult.

If it works (and you published the port with `-p 5000:5000`), you should be able to go to `localhost:5000/positions` and see the positions in the flask application.

* Make sure that your environmental variables are properly getting passed through by passing in some incorrect information (like a wrong password) that should cause your application to break.  If the application still works, it means you are likely reading from the `.env` file but not from your `docker run -e` arguments.

Once you have a docker run command that is properly working, copy it into your `ec2-setup.sh` file, as you'll want this (or something like it) later on.

* One more thing

Now we're about to move on to Terraform, but there is one issue with our docker image that we'll likely run into.  There may be a mismatch between the CPU architecture of the laptop we built our image on and the ec2 machine we ultimately use (for example, an ARM-based M1 Mac versus an x86-64 ec2 instance).  So before moving on, let's rebuild the image for the target platform and tag it.

> Just replace `jek2141` with your username.

`docker build -t jek2141/scraper_backend:amd_v2 --platform=linux/amd64/v2 .`

This is the image we'll ultimately want to use on our ec2 machine, so let's push it up to dockerhub.

`docker push jek2141/scraper_backend:amd_v2`

### Resources

[Terraform template v2.2.0 package not available on Mac M1 (HashiCorp Discuss)](https://discuss.hashicorp.com/t/template-v2-2-0-does-not-have-a-package-available-mac-m1/35099/3)