## Solution

**Task 1. Start the Postgres server.**

```sh
start_postgres
```

**Task 2. Create the table.**

Create a table named `access_log` to store the timestamp, latitude, longitude, and visitor ID. First connect to the server with `psql`, then switch to the `postgres` database:

```sh
psql --username=<username> --host=<hostname>
\c postgres;
```

Once connected to the database, create the `access_log` table:

```sql
CREATE TABLE access_log(timestamp TIMESTAMP, latitude float, longitude float, visitor_id char(37));
```

When psql prints the confirmation message `CREATE TABLE`, quit psql:

```sh
\q
```

**Task 3. Unzip the gzip file.**

Run `gunzip` to decompress the `.gz` file and recover the `.txt` file.

```sh
# Unzip the file to extract the .txt file.
cd data
gunzip -f web-server-access-log.txt.gz
```

The **-f** option forces `gunzip` to overwrite the output file if it already exists.
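The effect of `-f` can be seen on a throwaway file. The sketch below is illustrative only: it works in a temporary directory with an invented `demo.txt`, so the lab data is not touched.

```sh
# Work in a temporary directory so the real data is untouched.
tmpdir=$(mktemp -d)
printf 'a#b#c\n' > "$tmpdir/demo.txt"
gzip "$tmpdir/demo.txt"             # creates demo.txt.gz and removes demo.txt

# Leave a stale copy of the output file in place.
printf 'stale\n' > "$tmpdir/demo.txt"

# -f overwrites the stale demo.txt without asking for confirmation.
gunzip -f "$tmpdir/demo.txt.gz"
restored=$(cat "$tmpdir/demo.txt")
echo "$restored"

rm -r "$tmpdir"
```

After the last step, `demo.txt` holds the decompressed content rather than the stale line.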

**Task 4. Extract required fields from the file.**

Use the `cut` command to extract the timestamp, latitude, longitude, and visitorid, which are the first four fields in the file.

The columns in the `web-server-access-log.txt` file are delimited by `#`.

```sh
# Extract phase
# Assumes a fresh shell; skip this cd if you are still in data/ from Task 3.
cd data
echo "Extracting data"
cut -d"#" -f1-4 web-server-access-log.txt > extracted-data.txt
```
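To see what these `cut` options do to a single record, the sketch below runs them on one made-up line. The sample values and the two trailing fields are invented for illustration; the real file's contents may differ.

```sh
# Hypothetical record with six '#'-separated fields (values invented).
sample='2019-11-18 16:35:50#47.46#-36.14#50f78df1#eu#200'

# -d sets the field delimiter; -f1-4 keeps fields 1 through 4.
first_four=$(printf '%s\n' "$sample" | cut -d"#" -f1-4)
echo "$first_four"
```

The output keeps the `#` separators between the four retained fields, which is why a separate transform step is still needed.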

**Task 5. Transform the data into CSV format.**

The extracted columns are still separated by the original `#` delimiter.

Convert the file to comma-delimited form with the `tr` command.

```sh
# Transform phase
echo "Transforming data"

# Read the extracted data and replace each '#' with a comma.
tr "#" "," < extracted-data.txt > transformed-data.csv
```
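As a quick illustration, the same `tr` call can be applied to one made-up line. The values are invented. Note that `tr` replaces every `#` it sees, so this approach assumes the data fields themselves never contain `#` or `,`.

```sh
# Hypothetical extracted record (values invented for illustration).
extracted='2019-11-18 16:35:50#47.46#-36.14#50f78df1'

# tr maps each '#' character to ',' and leaves everything else unchanged.
csv_line=$(printf '%s\n' "$extracted" | tr "#" ",")
echo "$csv_line"
```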

**Task 6. Load the data into the table `access_log` in PostgreSQL.**

The PostgreSQL command for copying data from a CSV file into a table is `COPY`.

Its basic structure is:

```sh
COPY table_name FROM 'filename' DELIMITERS 'delimiter_character' FORMAT;
```

The file starts with a header row, so add the `HEADER` option to the `COPY` command.

Invoke the command from the shell script by piping it to `psql` as input.

```sh
# Load phase
echo "Loading data"

# Pipe the instructions to psql: connect to 'postgres', then
# copy the file into the table 'access_log'.
# \COPY reads the file on the client, so the path is relative to the
# directory psql is started from.
echo "\c postgres;\COPY access_log FROM 'data/transformed-data.csv' DELIMITERS ',' CSV HEADER;" | psql --username=<username> --host=<hostname>
```
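As an alternative sketch, the copy meta-command can be held in a variable and passed through psql's `--command` option, which accepts a single backslash command. Printing it with `printf` avoids relying on `echo`, since some `sh` implementations interpret sequences such as `\c` inside `echo` arguments. The variable names are illustrative, and the psql call is left commented out because it needs a running server.

```sh
# Illustrative variable names; adjust the table and path to your setup.
table='access_log'
csv_path='data/transformed-data.csv'

# "\\" inside double quotes yields one literal backslash for psql's \copy.
copy_cmd="\\copy $table FROM '$csv_path' DELIMITERS ',' CSV HEADER"
printf '%s\n' "$copy_cmd"

# With a running server (placeholders as elsewhere in this solution):
#   psql --username=<username> --host=<hostname> --dbname=postgres --command="$copy_cmd"
```

Naming the database with `--dbname` replaces the separate `\c postgres;` step in this variant.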

Run the command below at the shell prompt to verify that the `access_log` table is populated with the data. Substitute your own username and host if they differ from `postgres` and `localhost`.

```sh
echo '\c postgres; \\SELECT * from access_log;'  | psql --username=postgres --host=localhost
```

You should see the records displayed on screen.
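One further check, sketched here under assumptions, is to compare the number of data rows in the CSV with the row count in the table. The snippet counts rows in a small invented CSV so that it runs without the lab data; point `csv_file` at the real `transformed-data.csv` to use it in practice. The psql query is shown as a comment because it needs a running server.

```sh
# Stand-in CSV with a header and two invented rows.
csv_file=$(mktemp)
printf '%s\n' 'timestamp,latitude,longitude,visitorid' \
  '2019-11-18 16:35:50,47.46,-36.14,50f78df1' \
  '2019-11-18 16:35:51,12.30,45.60,60a12bc3' > "$csv_file"

# Subtract the header line to get the number of data rows.
data_rows=$(( $(wc -l < "$csv_file") - 1 ))
echo "data rows in CSV: $data_rows"

# With a running server, the table count could be fetched like this:
#   echo 'SELECT count(*) FROM access_log;' | psql --username=postgres --host=localhost --dbname=postgres

rm "$csv_file"
```

If the load succeeded on an empty table, the two counts should match.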