### <p style="text-align: center;"> GamesBattling - Documentation of architecture and test run of a streaming game</p>

#### Adam Sohn notebook for Fall'19 W205 Project3
https://github.com/mids-w205-schioberg/project-3-adamsohn<br>
Note: 
* This document includes all commands, scripts, and links to supporting documents (Google Slides, view-only, restricted to UC Berkeley accounts).
* The following files are duplicated in the repo:
   * docker-compose.yml
   * game_api_sohn.py
   * ab.sh
   * stream_filtered_writes_sohn.py
* Citations shown as superscript.

#### 1. High Level Project Pipeline
[Link to High Level Pipeline](https://docs.google.com/presentation/d/1Uu4vafYO2MlwttxE7PO-hTXIU6fpC33LFKId9Jg_D9M/edit?usp=sharing)

#### 2. Create firewall rule to allow external http calls

	1: Google Cloud Platform > View Network Details
	2: VPC network > Firewall Rules
	3: Create Firewall Rules
	4: Configure firewall:
		Name: midsw205rule
		Source IP ranges 0.0.0.0/0 - Network wildcard (allow all IP addresses)
		TCP: 8888 for Jupyter nb (Note: Jupyter nb not used in this project for code execution)
		TCP: 5000 for web server external to VM
        
See [Firewall Rule](https://docs.google.com/presentation/d/1fqNpJ4HfRsm-FBSFzXGcbgbJdi1HF0LskuApdH8q6Ns/edit?usp=sharing) for graphical representation of firewall rule flow.
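
Whether the rule is in effect can be checked by attempting a TCP connection to the opened port from a machine outside the VM. The following is a minimal sketch using only the Python standard library; `port_is_open` is a hypothetical helper that is not part of the project files, and the host and port passed to it are placeholders to be substituted.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or unreachable host.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("<VM external IP>", 5000)` would be expected to return `True` only when both the firewall rule and the Flask server are in place.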

#### 3. Set docker-compose.yml to define services and their configurations
```bash
adam_sohn@myw205tools:~/w205/P3$ cp ~/w205/course-content/13-Understanding-Data/docker-compose.yml .
adam_sohn@myw205tools:~/w205/P3$
```

Notes:
* Week 13 docker-compose.yml was chosen as it has all service configuration necessary for execution of Project3 (Zookeeper/Kafka, Spark, Cloudera (Hadoop), Presto). 
* **docker-compose.yml** is a configuration file for a multi-container application.
* **Zookeeper** is a coordination service for Kafka: it tracks brokers and topic metadata and manages failover across the distributed Kafka architecture. This project is a minimal Kafka application with a single Kafka broker and a replication factor of one.[<sup>1</sup>](#cite)<br>
* **Cloudera** is a curated collection of open-source Apache Hadoop software designed for turnkey implementation. [<sup>2</sup>](#cite)<br>
* **extra_hosts: - "moby:127.0.0.1"** adds a hosts entry inside the container that maps the hostname `moby` to `127.0.0.1`. It is an alternative connection method that enables access from a Windows CLI. It is not needed on Linux and is not used in this project.
* **Expose vs. Ports** - `ports` publishes container ports on the host, allowing traffic between the container and the host (and any system that can reach the host). `expose` opens ports only to other containers in the same Docker environment.[<sup>3</sup>](#cite)<br>
* **Presto** is a SQL query engine that is fast relative to legacy Apache Hive and can query large, distributed Hadoop data without writing intermediate results to disk.[<sup>4</sup>](#cite)<br>
* **Hive** is not used (in this project) as a query tool (Presto is used for that purpose), but for registering table schemas.<br>
* **Spark** is a unified analytics engine used in this project to manage data streams, creating linkages between Kafka, Hadoop, and Hive.[<sup>5</sup>](#cite)<br>
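
The expose/ports distinction can be made concrete by listing which services publish ports to the host. This is a sketch under the assumption that the port-related keys of the Week 13 `docker-compose.yml` have been copied by hand into a dictionary; `published_ports` is a hypothetical helper, not part of the project.

```python
# Hand-copied subset of the Week 13 docker-compose.yml (port-related keys only).
services = {
    "kafka": {"expose": ["9092", "29092"]},
    "cloudera": {"expose": ["8020", "8888", "9083", "10000", "50070"], "ports": ["8888:8888"]},
    "presto": {"expose": ["8080"]},
    "mids": {"expose": ["5000"], "ports": ["5000:5000"]},
}


def published_ports(services):
    """Map each service to the host ports it publishes (left side of 'host:container')."""
    return {
        name: [mapping.split(":")[0] for mapping in config.get("ports", [])]
        for name, config in services.items()
    }


host_ports = published_ports(services)
for name, ports in host_ports.items():
    reachability = "host ports " + ", ".join(ports) if ports else "container network only"
    print(name + ": " + reachability)
```

Only `mids` (5000) and `cloudera` (8888) publish ports to the host; the other services are reachable only from within the container network.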




Week13 docker-compose.yml (For Reference)
```yml
---
version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 32181
      ZOOKEEPER_TICK_TIME: 2000
    expose:
      - "2181"
      - "2888"
      - "32181"
      - "3888"
    extra_hosts:
      - "moby:127.0.0.1"

  kafka:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:32181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    expose:
      - "9092"
      - "29092"
    extra_hosts:
      - "moby:127.0.0.1"

  cloudera:
    image: midsw205/hadoop:0.0.2
    hostname: cloudera
    expose:
      - "8020" # nn
      - "8888" # hue
      - "9083" # hive thrift
      - "10000" # hive jdbc
      - "50070" # nn http
    ports:
      - "8888:8888"
    extra_hosts:
      - "moby:127.0.0.1"

  spark:
    image: midsw205/spark-python:0.0.6
    stdin_open: true
    tty: true
    volumes:
      - ~/w205:/w205
    expose:
      - "8888"
    #ports:
    #  - "8888:8888"
    depends_on:
      - cloudera
    environment:
      HADOOP_NAMENODE: cloudera
      HIVE_THRIFTSERVER: cloudera:9083
    extra_hosts:
      - "moby:127.0.0.1"
    command: bash

  presto:
    image: midsw205/presto:0.0.1
    hostname: presto
    volumes:
      - ~/w205:/w205
    expose:
      - "8080"
    environment:
      HIVE_THRIFTSERVER: cloudera:9083
    extra_hosts:
      - "moby:127.0.0.1"

  mids:
    image: midsw205/base:0.1.9
    stdin_open: true
    tty: true
    volumes:
      - ~/w205:/w205
    expose:
      - "5000"
    ports:
      - "5000:5000"
    extra_hosts:
      - "moby:127.0.0.1"
```

#### 4. Spin up the cluster

Note: 
* The `-d` option runs the containers in detached mode, i.e. in the background of the current terminal.
***

```bash
adam_sohn@myw205tools:~/w205/P3$ docker-compose up -d
Creating network "p3_default" with the default driver
Creating p3_mids_1 ...
Creating p3_cloudera_1 ...
Creating p3_zookeeper_1 ...
Creating p3_presto_1 ...
Creating p3_cloudera_1
Creating p3_mids_1
Creating p3_zookeeper_1
Creating p3_zookeeper_1 ... done
Creating p3_kafka_1 ...
Creating p3_cloudera_1 ... done
Creating p3_spark_1 ...
Creating p3_spark_1 ... done
adam_sohn@myw205tools:~/w205/P3$
```
***

Validating docker status
```bash
adam_sohn@myw205tools:~/w205/P3$ docker-compose ps
     Name                   Command               State                                Ports
--------------------------------------------------------------------------------------------------------------------------
p3_cloudera_1    /usr/bin/docker-entrypoint ...   Up      10000/tcp, 50070/tcp, 8020/tcp, 0.0.0.0:8888->8888/tcp, 9083/tcp
p3_kafka_1       /etc/confluent/docker/run        Up      29092/tcp, 9092/tcp
p3_mids_1        /bin/bash                        Up      0.0.0.0:5000->5000/tcp, 8888/tcp
p3_presto_1      /usr/bin/docker-entrypoint ...   Up      8080/tcp
p3_spark_1       docker-entrypoint.sh bash        Up      8888/tcp
p3_zookeeper_1   /etc/confluent/docker/run        Up      2181/tcp, 2888/tcp, 32181/tcp, 3888/tcp
```



#### 5. Create Kafka topic: 'events'
```bash
adam_sohn@myw205tools:~/w205/P3$ docker-compose exec kafka \
>   kafka-topics \
>     --create \
>     --topic events \
>     --partitions 1 \
>     --replication-factor 1 \
>     --if-not-exists --zookeeper zookeeper:32181
Created topic events.
adam_sohn@myw205tools:~/w205/P3$
```

#### 6. Customizing Python/Flask template to create `game_api.py` 
Notes:
* Using the Week13 `game_api.py` as the starting template.
* `#!/usr/bin/env python` instructs the OS to execute the script with the `python` interpreter found in the environment.
* After the host and port, the URL format is `/<event_type>/<metadata_characteristic>`, with metadata_characteristic required for every route in `[join_guild, purchase_a_sword, take_nap, consume_fermented_beverage]`.
* The `event` variable includes both the event_type and the metadata_characteristic.
* `<string:name>` accepts free-form user input for metadata_characteristic and converts it to a string.
* Easter egg if `/funtime` is entered.
* `funtime` is not logged to Kafka. All other routes, including `default_response`, are logged to Kafka.
* All event_type(s) are logged to a single Kafka topic, `events`, since all data written to Kafka fits a single schema. This does not hinder filtering later in the Spark script and simplifies writing to a single table.
***
`game_api_sohn.py`, as stored in `adam_sohn@myw205tools:~/w205/P3`:
```py
#!/usr/bin/env python
import json
from kafka import KafkaProducer
from flask import Flask, request

app = Flask(__name__)
producer = KafkaProducer(bootstrap_servers='kafka:29092')

def log_to_kafka(topic, event):
    event.update(request.headers)
    producer.send(topic, json.dumps(event).encode())

@app.route("/")
def default_response():
    event = {'event_type': 'default', 'metadata_characteristic': ''}
    log_to_kafka('events', event)
    return "This is the default response!\n"

@app.route("/join_guild/<string:name>")
def join_guild(name):
    event = {'event_type': 'joined_guild', 'metadata_characteristic': name}
    log_to_kafka('events', event)
    return 'Congratulations. You joined the ' + name + ' guild.\n'

@app.route("/purchase_a_sword/<string:name>")
def purchase_a_sword(name):
    event = {'event_type': 'purchd_sword', 'metadata_characteristic': name}
    log_to_kafka('events', event)
    return 'Congratulations. You purchased a ' + name + ' sword.\n'

@app.route("/take_nap/<string:name>")
def take_nap(name):
    event = {'event_type': 'took_nap', 'metadata_characteristic': name}
    log_to_kafka('events', event)
    return 'Congratulations. You took a ' + name + ' nap.\n'

@app.route("/consume_fermented_beverage/<name>")
def consume_fermented_beverage(name):
    event = {'event_type': 'consumed_fermented_beverage', 'metadata_characteristic': name}
    log_to_kafka('events', event)
    return 'Congratulations. You drank ' + name + '.\n'

@app.route("/funtime")
def funtime():
    return "I can not stand sitting. Rofl.\n"
```
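
What reaches Kafka is the event dictionary merged with the request headers and serialized as JSON. The following is a minimal sketch of that merge that runs without Flask or Kafka; `build_kafka_payload` is a hypothetical stand-in for the body of `log_to_kafka`, and the header values are illustrative ones of the kind a curl request sends.

```python
import json


def build_kafka_payload(event, headers):
    """Merge request headers into the event and serialize the result as JSON bytes."""
    merged = dict(event)    # copy so the caller's dict is not mutated
    merged.update(headers)  # header keys are added alongside the event keys
    return json.dumps(merged).encode()


event = {"event_type": "joined_guild", "metadata_characteristic": "lollipop"}
headers = {"Accept": "*/*", "Host": "localhost:5000", "User-Agent": "curl/7.47.0"}

payload = build_kafka_payload(event, headers)
decoded = json.loads(payload.decode())
print(sorted(decoded))
```

Because the headers are merged last, a request header named `event_type` would overwrite the event's own key; the same holds for `event.update(request.headers)` in the script.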

#### 7. Serve Flask app game_api_sohn.py 
Note: 
* CLI serving Flask app must stay open for service to continue.
* `--host 0.0.0.0` tells Flask to listen on all network interfaces. When sending requests, substitute `localhost` (if inside the VM) or the `VM external IP` (if accessing from outside the VM).
***
```bash
adam_sohn@myw205tools:~/w205/P3$ docker-compose exec mids   env FLASK_APP=/w205/P3/game_api_sohn.py   flask run --host 0.0.0.0
 * Serving Flask app "game_api_sohn"
 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
 ```

#### 8. Generate Events: Single events using http call (Get) on external browser
Note:
* Assuming the firewall rule opening port 5000 is in place, use a browser external to the VM and enter `<VM external IP>:5000/<event_type>/<metadata_characteristic>`.
* If the firewall rule is NOT in place to open port 5000, the browser connection will fail.
* This is a preliminary test, run before the entire pipeline is built, to validate that the Flask app and Kafka are working.
* The first field in each Flask log line is the client IP.
***
Below are the commands & in-browser response as entered from an external browser
```bash
    http://35.247.54.185:5000/join_guild/lollipop
        Congratulations. You joined the lollipop guild.
    http://35.247.54.185:5000/purchase_a_sword/wood
        Congratulations. You purchased a wood sword.
    http://35.247.54.185:5000/
        This is the default response!
    http://35.247.54.185:5000/funtime
        I can not stand sitting. Rofl
```
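
The request URLs follow a fixed pattern, so they can be generated rather than typed. A small sketch follows; `event_url` is a hypothetical helper, and percent-encoding the metadata with `urllib.parse.quote` is an addition that is not part of the original commands.

```python
from urllib.parse import quote


def event_url(host, event_type, metadata, port=5000):
    """Build http://<host>:<port>/<event_type>/<metadata> with the metadata percent-encoded."""
    return "http://{}:{}/{}/{}".format(host, port, event_type, quote(metadata, safe=""))


print(event_url("localhost", "join_guild", "lollipop"))
print(event_url("localhost", "purchase_a_sword", "rusty iron"))
```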
***
Below is response to the above commands as seen on the terminal serving `game_api_sohn.py`
```bash
adam_sohn@myw205tools:~/w205/P3$ docker-compose exec mids   env FLASK_APP=/w205/P3/game_api_sohn.py   flask run --host 0.0.0.0
 * Serving Flask app "game_api_sohn"
 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
205.153.50.97 - - [09/Dec/2019 22:23:59] "GET /join_guild/lollipop HTTP/1.1" 200 -
205.153.50.97 - - [09/Dec/2019 22:24:00] "GET /favicon.ico HTTP/1.1" 404 -
205.153.50.97 - - [09/Dec/2019 22:25:16] "GET /purchase_a_sword/wood HTTP/1.1" 200 -
205.153.50.97 - - [09/Dec/2019 22:25:35] "GET / HTTP/1.1" 200 -
205.153.50.97 - - [09/Dec/2019 22:25:47] "GET /funtime HTTP/1.1" 200 -
```

***
Below is response to the above commands as seen in Kafka messages
```bash
adam_sohn@myw205tools:~/w205/P3$ docker-compose exec mids   kafkacat -C -b kafka:29092 -t events -o beginning
{"Accept-Language": "en-US,en;q=0.9", "event_type": "joined_guild", "Host": "35.247.54.185:5000", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "Connection": "keep-alive", "Cookie": "jupyterhub-session-id=738e9a4e7efa4874b3155678caa22b1b; _xsrf=2|7ee3e6cf|4275e10258eea4aa1bd7a57c620e968c|1575928522", "Upgrade-Insecure-Requests": "1", "metadata_characteristic": "lollipop", "Accept-Encoding": "gzip, deflate"}
{"Accept-Language": "en-US,en;q=0.9", "event_type": "purchd_sword", "Host": "35.247.54.185:5000", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "Connection": "keep-alive", "Cookie": "jupyterhub-session-id=738e9a4e7efa4874b3155678caa22b1b; _xsrf=2|7ee3e6cf|4275e10258eea4aa1bd7a57c620e968c|1575928522", "Upgrade-Insecure-Requests": "1", "metadata_characteristic": "wood", "Accept-Encoding": "gzip, deflate"}
{"Accept-Language": "en-US,en;q=0.9", "event_type": "default", "Host": "35.247.54.185:5000", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "Connection": "keep-alive", "Cookie": "jupyterhub-session-id=738e9a4e7efa4874b3155678caa22b1b; _xsrf=2|7ee3e6cf|4275e10258eea4aa1bd7a57c620e968c|1575928522", "Upgrade-Insecure-Requests": "1", "metadata_characteristic": "", "Accept-Encoding": "gzip, deflate"}
```

#### 9. Customizing Spark Streaming template to stream Kafka to hdfs
Note:
* For this script, larger notes are kept in this section, outside the code, to retain file readability. Minor notes are commented in code.
* Using Week13 `write_swords_stream.py` as a starting template.
* `guild_sword_nap_event_schema()` defines the part of the schema taken from the JSON payload, used for loading to Hadoop (both for storage and for the queryable table). The remaining column, `timestamp`, comes from the Kafka record and is added by `.select('timestamp', 'json.*')` when building `filtered_events`.

***
`stream_filtered_writes_sohn.py`, as stored in `adam_sohn@myw205tools:~/w205/P3`:
```py
#!/usr/bin/env python
"""Extract events from kafka and write them to hdfs
"""
#Importing Libraries
import json
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf, from_json
from pyspark.sql.types import StructType, StructField, StringType

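#Defining schema. Fixed User-Agent to User_Agent for universal compatibility.
#Note: the key in the Kafka JSON is "User-Agent", so from_json will likely leave User_Agent null unless the key is renamed first.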
def guild_sword_nap_event_schema():
    """
    root
    |-- Accept: string (nullable = true)
    |-- Host: string (nullable = true)
    |-- User_Agent: string (nullable = true)
    |-- event_type: string (nullable = true)
    |-- metadata_characteristic: string (nullable = true)
    """
    return StructType([
        StructField("Accept", StringType(), True),
        StructField("Host", StringType(), True),
        StructField("User_Agent", StringType(), True),
        StructField("event_type", StringType(), True),
        StructField("metadata_characteristic", StringType(), True),
    ])


@udf('boolean')
def is_guild_sword_nap(event_as_json):
    """
    Boolean function returns True for event_type in ['joined_guild', 'took_nap', 'purchd_sword']. Returns False for
    others ('consumed_fermented_beverage', 'default'). No other event_types expected in Kafka per game_api_sohn.py.
    """
    event = json.loads(event_as_json)
    return event['event_type'] in ('joined_guild', 'purchd_sword', 'took_nap')

def main():
    """main
    """
    # enableHiveSupport() is required. Without this setting, the file will throw exception.
    spark = SparkSession \
        .builder \
        .appName("ExtractEventsJob") \
        .enableHiveSupport() \
        .getOrCreate()

    # Subscribing to Kafka topic 'events' previously created.
    raw_events = spark \
        .readStream \
        .format("kafka") \
        .option("kafka.bootstrap.servers", "kafka:29092") \
        .option("subscribe", "events") \
        .load()

    # Filtering ['joined_guild', 'purchd_sword', 'took_nap'] events.
    # The final select matches the Hive schema.
    filtered_events = raw_events \
        .filter(is_guild_sword_nap(raw_events.value.cast('string'))) \
        .select(raw_events.value.cast('string').alias('raw_event'),
                raw_events.timestamp.cast('string'),
                from_json(raw_events.value.cast('string'),
                          guild_sword_nap_event_schema()).alias('json')) \
        .select('timestamp', 'json.*')

    # Writing any new events to hdfs in 20 second intervals. Generous for low volume to ensure no falling behind.
    sink = filtered_events \
        .writeStream \
        .format("parquet") \
        .option("checkpointLocation", "/tmp/checkpoints_for_filtered_events") \
        .option("path", "/tmp/filterd_evnts") \
        .trigger(processingTime="20 seconds") \
        .start()

    # Registering filtered_events_tbl in Hive according to contents in hdfs /tmp/filterd_evnts.
    # Schema matches filtered_events above.
    # Important to be after sink statement to avoid waiting an additional cycle for loading to filtered_events_tbl.
    sql_string = "drop table if exists default.filtered_events_tbl"
    spark.sql(sql_string)
    sql_string = """
    create external table if not exists default.filtered_events_tbl (
        timestamp string,
        Accept string,
        Host string,
        User_Agent string,
        event_type string,
        metadata_characteristic string
    )
    stored as parquet
    location '/tmp/filterd_evnts'
    tblproperties ("parquet.compress"="SNAPPY")
    """
    spark.sql(sql_string)

    # Streaming script runs until terminated
    sink.awaitTermination()

# Kick-off command for main()
if __name__ == "__main__":
    main()
```
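
The filter logic can be exercised without a Spark session. The sketch below restates the predicate as a plain Python function; `keep_event` and `KEPT_EVENT_TYPES` are hypothetical names, and the sample messages are abbreviated to the two keys set in `game_api_sohn.py`.

```python
import json

# Event types that the streaming job keeps.
KEPT_EVENT_TYPES = {"joined_guild", "purchd_sword", "took_nap"}


def keep_event(event_as_json):
    """Return True only for event types that the filter passes through."""
    event = json.loads(event_as_json)
    return event.get("event_type") in KEPT_EVENT_TYPES


samples = [
    '{"event_type": "joined_guild", "metadata_characteristic": "lollipop"}',
    '{"event_type": "consumed_fermented_beverage", "metadata_characteristic": "mead"}',
    '{"event_type": "default", "metadata_characteristic": ""}',
]
results = [keep_event(sample) for sample in samples]
print(results)
```

Using `.get` means a message without an `event_type` key is dropped instead of raising `KeyError`; that is a choice made in this sketch, not a property of the UDF in the script.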


#### 10. Serve stream_filtered_writes_sohn.py
Note:
* Key items in output are `"timestamp"` and `"numInputRows"`. Also watch that `"triggerExecution"` (reported in milliseconds) is not approaching the processingTime of 20 seconds.
* CLI serving Spark script must stay open for script to continue.

```bash
adam_sohn@myw205tools:~/w205/P3$ docker-compose exec spark spark-submit /w205/P3/stream_filtered_writes_sohn.py
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/12/09 22:42:31 INFO SparkContext: Running Spark version 2.2.0

... log break ...

19/12/09 22:45:30 INFO StreamExecution: Streaming query made progress: {
  "id" : "06514a3e-034f-4483-b6b9-b3018fed2aa2",
  "runId" : "6e13af80-f95d-4bf1-8d6a-44c931d80809",
  "name" : null,
  "timestamp" : "2019-12-09T22:45:30.000Z",
  "numInputRows" : 0,
  "inputRowsPerSecond" : 0.0,
  "processedRowsPerSecond" : 0.0,
  "durationMs" : {
    "getOffset" : 6,
    "triggerExecution" : 6
  },
  "stateOperators" : [ ],
  "sources" : [ {
    "description" : "KafkaSource[Subscribe[events]]",
    "startOffset" : {
      "events" : {
        "0" : 3
      }
    },
    "endOffset" : {
      "events" : {
        "0" : 3
      }
    },
    "numInputRows" : 0,
    "inputRowsPerSecond" : 0.0,
    "processedRowsPerSecond" : 0.0
  } ],
  "sink" : {
    "description" : "FileSink[/tmp/purchase_events]"
  }
}

```
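
The check described in the notes for this step (comparing `triggerExecution` with the 20-second trigger) can be scripted. This is a sketch that assumes the JSON body of a progress message has been copied out of the log by hand; `trigger_headroom_ms` is a hypothetical helper, and the sample is trimmed from the progress message shown in the log.

```python
import json

TRIGGER_INTERVAL_MS = 20 * 1000  # processingTime="20 seconds" in the streaming script

# Trimmed copy of one "Streaming query made progress" message.
progress_json = """
{
  "timestamp": "2019-12-09T22:45:30.000Z",
  "numInputRows": 0,
  "durationMs": {"getOffset": 6, "triggerExecution": 6}
}
"""


def trigger_headroom_ms(progress, interval_ms=TRIGGER_INTERVAL_MS):
    """Milliseconds left in the trigger interval after the batch finished."""
    return interval_ms - progress["durationMs"]["triggerExecution"]


progress = json.loads(progress_json)
headroom = trigger_headroom_ms(progress)
print(progress["numInputRows"], headroom)
```

A headroom that shrinks toward zero would indicate that batches are taking nearly as long as the interval between triggers.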

#### 11. Generate events: Single events using curl cmd in VM CLI
Notes:
* **Curl** stands for 'Client URL'. Curl is a CLI tool for transferring data, with support for HTTP.[<sup>6</sup>](#cite)<br>
* **localhost** is used instead of the VM external IP address as the VM's CLI is being used.
***
Below are the commands as entered from a CLI on the VM
```bash
adam_sohn@myw205tools:~/w205/P3$ docker-compose exec mids curl http://localhost:5000/join_guild/lollipop
Congratulations. You joined the lollipop guild.
adam_sohn@myw205tools:~/w205/P3$ docker-compose exec mids curl http://localhost:5000/purchase_a_sword/wood
Congratulations. You purchased a wood sword.
adam_sohn@myw205tools:~/w205/P3$ docker-compose exec mids curl http://localhost:5000/
This is the default response!
adam_sohn@myw205tools:~/w205/P3$ docker-compose exec mids curl http://localhost:5000/funtime
I can not stand sitting. Rofl.
adam_sohn@myw205tools:~/w205/P3$
```

***
Below is response to the above commands as seen on the terminal serving game_api_sohn.py
```bash
127.0.0.1 - - [09/Dec/2019 22:47:21] "GET /join_guild/lollipop HTTP/1.1" 200 -
127.0.0.1 - - [09/Dec/2019 22:47:32] "GET /purchase_a_sword/wood HTTP/1.1" 200 -
127.0.0.1 - - [09/Dec/2019 22:47:45] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [09/Dec/2019 22:47:56] "GET /funtime/ HTTP/1.1" 404 -
127.0.0.1 - - [09/Dec/2019 22:48:06] "GET /funtime HTTP/1.1" 200 -
```
***

Below is response to the above commands as seen in Kafka messages
```bash
{"Accept": "*/*", "Host": "localhost:5000", "event_type": "joined_guild", "metadata_characteristic": "lollipop", "User-Agent": "curl/7.47.0"}
{"Accept": "*/*", "Host": "localhost:5000", "event_type": "purchd_sword", "metadata_characteristic": "wood", "User-Agent": "curl/7.47.0"}
{"Accept": "*/*", "Host": "localhost:5000", "event_type": "default", "metadata_characteristic": "", "User-Agent": "curl/7.47.0"}
```
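
To summarize what is in the topic, the messages can be tallied by `event_type`. The sketch below uses the three messages above as input; `count_event_types` is a hypothetical helper, and feeding it lines captured from the kafkacat output is an assumption about how the data would be supplied in practice.

```python
import json
from collections import Counter

# The three Kafka messages produced by the curl requests in this step.
kafka_lines = [
    '{"Accept": "*/*", "Host": "localhost:5000", "event_type": "joined_guild", "metadata_characteristic": "lollipop", "User-Agent": "curl/7.47.0"}',
    '{"Accept": "*/*", "Host": "localhost:5000", "event_type": "purchd_sword", "metadata_characteristic": "wood", "User-Agent": "curl/7.47.0"}',
    '{"Accept": "*/*", "Host": "localhost:5000", "event_type": "default", "metadata_characteristic": "", "User-Agent": "curl/7.47.0"}',
]


def count_event_types(lines):
    """Count messages per event_type, skipping blank lines."""
    return Counter(json.loads(line)["event_type"] for line in lines if line.strip())


counts = count_event_types(kafka_lines)
print(dict(counts))
```

`funtime` does not appear in the tally because that route never calls `log_to_kafka`.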

#### 12. Generate Events: Multiple events using Apache Bench
Note:
* **Apache Bench** is included in mids container as part of Apache Utils.
* **Apache Bench** is a tool designed for the purposes of benchmark-testing an http server by sending a custom volume of requests as configured by user. [<sup>7</sup>](#cite)<br>
* Only a single Apache Bench command (one URL, requested twice) is tested in this demonstration. The next demonstration will show multiple Apache Bench commands in a shell script.
* `-n` precedes the integer number of requests to send.
* `-H` precedes a custom header for the request. In this case, a simulated `Host`.
***
Below are the commands as entered from a CLI on the VM
```bash
adam_sohn@myw205tools:~/w205/P3$ docker-compose exec mids ab -n 2 -H "Host: user2.att.com" http://localhost:5000/purchase_a_sword/gold
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient).....done


Server Software:        Werkzeug/0.14.1
Server Hostname:        localhost
Server Port:            5000

Document Path:          /purchase_a_sword/gold
Document Length:        45 bytes

Concurrency Level:      1
Time taken for tests:   0.011 seconds
Complete requests:      2
Failed requests:        0
Total transferred:      400 bytes
HTML transferred:       90 bytes
Requests per second:    178.02 [#/sec] (mean)
Time per request:       5.617 [ms] (mean)
Time per request:       5.617 [ms] (mean, across all concurrent requests)
Transfer rate:          34.77 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      1       1
Processing:     5    5   0.0      5       5
Waiting:        0    1   1.7      2       2
Total:          5    6   0.1      6       6
ERROR: The median and mean for the initial connection time are more than twice the standard
       deviation apart. These results are NOT reliable.

Percentage of the requests served within a certain time (ms)
  50%      6
  66%      6
  75%      6
  80%      6
  90%      6
  95%      6
  98%      6
  99%      6
 100%      6 (longest request)
adam_sohn@myw205tools:~/w205/P3$
```
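
Two figures in the report can be cross-checked by hand. The sketch below recomputes `Document Length` and `HTML transferred` from the response string in `game_api_sohn.py`; the variable names are illustrative.

```python
# Response body returned by /purchase_a_sword/gold, per game_api_sohn.py.
response_body = "Congratulations. You purchased a " + "gold" + " sword.\n"
requests_sent = 2  # the -n value passed to ab

document_length = len(response_body.encode())       # bytes in one response body
html_transferred = document_length * requests_sent  # body bytes across all requests

print(document_length, html_transferred)
```

Both values agree with the `Document Length: 45 bytes` and `HTML transferred: 90 bytes` lines of the report; the rest of `Total transferred` is response headers.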
***
Below is response to the above commands as seen on the terminal serving game_api_sohn.py
```bash
127.0.0.1 - - [09/Dec/2019 22:50:08] "GET /purchase_a_sword/gold HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:50:08] "GET /purchase_a_sword/gold HTTP/1.0" 200 -
```
***
Below is response to the above commands as seen in Kafka messages
```bash
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "purchd_sword", "metadata_characteristic": "gold", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "purchd_sword", "metadata_characteristic": "gold", "User-Agent": "ApacheBench/2.3"}
```


#### 13. Generate Events: Multiple events using Apache Bench in VM CLI (shell script ab.sh)
Note:
* `chmod +x ab.sh` is a command that changes the access permissions (chmod) of the file (shell script ab.sh) to make it executable (+x).[<sup>8</sup>](#cite)<br>
* `#!/bin/sh` tells the OS to run the script in a Bourne shell.[<sup>9</sup>](#cite)<br>

`ab.sh`, as stored in `adam_sohn@myw205tools:~/w205/P3`:
```bash
#!/bin/sh
docker-compose exec mids ab -n 2 -H "Host: user2.att.com" http://localhost:5000/purchase_a_sword/foam
docker-compose exec mids ab -n 3 -H "Host: user2.att.com" http://localhost:5000/purchase_a_sword/strongium
docker-compose exec mids ab -n 4 -H "Host: user1.comcast.com" http://localhost:5000/join_guild/guildilocks
docker-compose exec mids ab -n 5 -H "Host: user1.comcast.com" http://localhost:5000/join_guild/mids_united
docker-compose exec mids ab -n 6 -H "Host: user1.comcast.com" http://localhost:5000/take_nap/cat
docker-compose exec mids ab -n 7 -H "Host: user1.comcast.com" http://localhost:5000/take_nap/power
docker-compose exec mids ab -n 8 -H "Host: user2.att.com" http://localhost:5000/consume_fermented_beverage/beer
docker-compose exec mids ab -n 9 -H "Host: user2.att.com" http://localhost:5000/consume_fermented_beverage/mead
docker-compose exec mids ab -n 10 -H "Host: user2.att.com" http://localhost:5000/
docker-compose exec mids ab -n 11 -H "Host: user1.comcast.com" http://localhost:5000/funtime
```
***
Below are the commands as entered from a CLI on the VM

```bash
adam_sohn@myw205tools:~/w205/P3$ chmod +x ab.sh
adam_sohn@myw205tools:~/w205/P3$ ./ab.sh
```
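
The number of events this script should produce at each stage can be worked out ahead of time. This is a sketch with the repetition counts copied from `ab.sh`; which routes are logged and which pass the filter is derived from `game_api_sohn.py` and `stream_filtered_writes_sohn.py`, and the names used here are illustrative.

```python
# (repetitions, route) pairs copied from ab.sh; "default" stands for the "/" route.
ab_plan = [
    (2, "purchase_a_sword"), (3, "purchase_a_sword"),
    (4, "join_guild"), (5, "join_guild"),
    (6, "take_nap"), (7, "take_nap"),
    (8, "consume_fermented_beverage"), (9, "consume_fermented_beverage"),
    (10, "default"), (11, "funtime"),
]

# Routes that call log_to_kafka in game_api_sohn.py.
LOGGED_TO_KAFKA = {"purchase_a_sword", "join_guild", "take_nap",
                   "consume_fermented_beverage", "default"}
# Routes whose event_type passes the Spark filter.
KEPT_BY_SPARK_FILTER = {"purchase_a_sword", "join_guild", "take_nap"}

requests_total = sum(n for n, _ in ab_plan)
kafka_total = sum(n for n, route in ab_plan if route in LOGGED_TO_KAFKA)
filtered_total = sum(n for n, route in ab_plan if route in KEPT_BY_SPARK_FILTER)

print(requests_total, kafka_total, filtered_total)
```

These totals cover only the requests sent by `ab.sh`; events generated in earlier steps are additional.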
***
Below is response to the above commands as seen on the terminal serving game_api_sohn.py

```bash
127.0.0.1 - - [09/Dec/2019 22:52:52] "GET /purchase_a_sword/foam HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:52] "GET /purchase_a_sword/foam HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:54] "GET /purchase_a_sword/strongium HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:54] "GET /purchase_a_sword/strongium HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:54] "GET /purchase_a_sword/strongium HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:55] "GET /join_guild/guildilocks HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:55] "GET /join_guild/guildilocks HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:55] "GET /join_guild/guildilocks HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:55] "GET /join_guild/guildilocks HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:57] "GET /join_guild/mids_united HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:57] "GET /join_guild/mids_united HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:57] "GET /join_guild/mids_united HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:57] "GET /join_guild/mids_united HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:57] "GET /join_guild/mids_united HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:59] "GET /take_nap/cat HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:59] "GET /take_nap/cat HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:59] "GET /take_nap/cat HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:59] "GET /take_nap/cat HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:59] "GET /take_nap/cat HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:52:59] "GET /take_nap/cat HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:02] "GET /take_nap/power HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:02] "GET /take_nap/power HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:02] "GET /take_nap/power HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:02] "GET /take_nap/power HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:02] "GET /take_nap/power HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:02] "GET /take_nap/power HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:02] "GET /take_nap/power HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:04] "GET /consume_fermented_beverage/beer HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:04] "GET /consume_fermented_beverage/beer HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:04] "GET /consume_fermented_beverage/beer HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:04] "GET /consume_fermented_beverage/beer HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:04] "GET /consume_fermented_beverage/beer HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:04] "GET /consume_fermented_beverage/beer HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:04] "GET /consume_fermented_beverage/beer HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:04] "GET /consume_fermented_beverage/beer HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:05] "GET /consume_fermented_beverage/mead HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:05] "GET /consume_fermented_beverage/mead HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:05] "GET /consume_fermented_beverage/mead HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:05] "GET /consume_fermented_beverage/mead HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:05] "GET /consume_fermented_beverage/mead HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:05] "GET /consume_fermented_beverage/mead HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:05] "GET /consume_fermented_beverage/mead HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:05] "GET /consume_fermented_beverage/mead HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:05] "GET /consume_fermented_beverage/mead HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:07] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:07] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:07] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:07] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:07] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:07] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:07] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:07] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:07] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:07] "GET / HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:09] "GET /funtime HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:09] "GET /funtime HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:09] "GET /funtime HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:09] "GET /funtime HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:09] "GET /funtime HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:09] "GET /funtime HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:09] "GET /funtime HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:09] "GET /funtime HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:09] "GET /funtime HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:09] "GET /funtime HTTP/1.0" 200 -
127.0.0.1 - - [09/Dec/2019 22:53:09] "GET /funtime HTTP/1.0" 200 -
```
***
Below is response to the above commands as seen in Kafka messages
```bash
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "purchd_sword", "metadata_characteristic": "foam", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "purchd_sword", "metadata_characteristic": "foam", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "purchd_sword", "metadata_characteristic": "strongium", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "purchd_sword", "metadata_characteristic": "strongium", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "purchd_sword", "metadata_characteristic": "strongium", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "joined_guild", "metadata_characteristic": "guildilocks", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "joined_guild", "metadata_characteristic": "guildilocks", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "joined_guild", "metadata_characteristic": "guildilocks", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "joined_guild", "metadata_characteristic": "guildilocks", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "joined_guild", "metadata_characteristic": "mids_united", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "joined_guild", "metadata_characteristic": "mids_united", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "joined_guild", "metadata_characteristic": "mids_united", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "joined_guild", "metadata_characteristic": "mids_united", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "joined_guild", "metadata_characteristic": "mids_united", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "cat", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "cat", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "cat", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "cat", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "cat", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "cat", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "power", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "power", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "power", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "power", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "power", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "power", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "power", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "beer", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "beer", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "beer", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "beer", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "beer", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "beer", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "beer", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "beer", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "mead", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "mead", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "mead", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "mead", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "mead", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "mead", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "mead", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "mead", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "consumed_fermented_beverage", "metadata_characteristic": "mead", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "default", "metadata_characteristic": "", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "default", "metadata_characteristic": "", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "default", "metadata_characteristic": "", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "default", "metadata_characteristic": "", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "default", "metadata_characteristic": "", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "default", "metadata_characteristic": "", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "default", "metadata_characteristic": "", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "default", "metadata_characteristic": "", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "default", "metadata_characteristic": "", "User-Agent": "ApacheBench/2.3"}
{"Accept": "*/*", "Host": "user2.att.com", "event_type": "default", "metadata_characteristic": "", "User-Agent": "ApacheBench/2.3"}
```
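Each line the consumer prints is a self-contained JSON object, so the stream is easy to summarize offline. The sketch below is illustrative only: the `tally_event_types` helper is hypothetical (it is not part of `game_api_sohn.py` or `stream_filtered_writes_sohn.py`), and the four sample lines are copied from the output above.

```python
import json
from collections import Counter

# Four sample lines copied from the consumer output above.
SAMPLE_LINES = [
    '{"Accept": "*/*", "Host": "user2.att.com", "event_type": "purchd_sword", "metadata_characteristic": "foam", "User-Agent": "ApacheBench/2.3"}',
    '{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "joined_guild", "metadata_characteristic": "guildilocks", "User-Agent": "ApacheBench/2.3"}',
    '{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "cat", "User-Agent": "ApacheBench/2.3"}',
    '{"Accept": "*/*", "Host": "user1.comcast.com", "event_type": "took_nap", "metadata_characteristic": "power", "User-Agent": "ApacheBench/2.3"}',
]


def tally_event_types(lines):
    """Count events per event_type (hypothetical helper; blank lines are skipped)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)  # one JSON object per line
        counts[event["event_type"]] += 1
    return counts


if __name__ == "__main__":
    for event_type, count in tally_event_types(SAMPLE_LINES).items():
        print(event_type, count)
```

Saving the consumer output to a file and passing its lines to the same helper would give a quick cross-check against the row counts reported by Presto.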

#### 14. Note tables in Presto
Enable Presto prompt
```bash
adam_sohn@myw205tools:~/w205/P3$ docker-compose exec presto presto --server presto:8080 --catalog hive --schema default
presto:default>
```
***
Show available tables
```bash
presto:default> show tables;
        Table
---------------------
 filtered_events_tbl
(1 row)

Query 20191210_060812_00050_caj53, FINISHED, 1 node
Splits: 2 total, 1 done (50.00%)
0:00 [1 rows, 44B] [5 rows/s, 230B/s]
```
***
Describe table
```bash
presto:default> describe filtered_events_tbl;
         Column          |  Type   | Comment
-------------------------+---------+---------
 timestamp               | varchar |
 accept                  | varchar |
 host                    | varchar |
 user_agent              | varchar |
 event_type              | varchar |
 metadata_characteristic | varchar |
(6 rows)

Query 20191210_060710_00049_caj53, FINISHED, 1 node
Splits: 2 total, 1 done (50.00%)
0:00 [6 rows, 484B] [21 rows/s, 1.69KB/s]
```

#### 15. Sample queries in Presto 

Query 1: Show all data
```bash
presto:default> select * from filtered_events_tbl;
        timestamp        | accept |       host        | user_agent |  event_type  | metadata_characteristic
-------------------------+--------+-------------------+------------+--------------+-------------------------
 2019-12-10 05:38:46.789 | */*    | user2.att.com     | NULL       | purchd_sword | foam
 2019-12-10 05:38:46.794 | */*    | user2.att.com     | NULL       | purchd_sword | foam
 2019-12-10 05:38:48.437 | */*    | user2.att.com     | NULL       | purchd_sword | strongium
 2019-12-10 05:38:48.447 | */*    | user2.att.com     | NULL       | purchd_sword | strongium
 2019-12-10 05:38:48.458 | */*    | user2.att.com     | NULL       | purchd_sword | strongium
 2019-12-10 05:38:50.131 | */*    | user1.comcast.com | NULL       | joined_guild | guildilocks
 2019-12-10 05:38:50.134 | */*    | user1.comcast.com | NULL       | joined_guild | guildilocks
 2019-12-10 05:38:50.138 | */*    | user1.comcast.com | NULL       | joined_guild | guildilocks
 2019-12-10 05:38:50.14  | */*    | user1.comcast.com | NULL       | joined_guild | guildilocks
 2019-12-10 05:38:51.75  | */*    | user1.comcast.com | NULL       | joined_guild | mids_united
 2019-12-10 05:38:51.753 | */*    | user1.comcast.com | NULL       | joined_guild | mids_united
 2019-12-10 05:38:51.756 | */*    | user1.comcast.com | NULL       | joined_guild | mids_united
 2019-12-10 05:38:51.763 | */*    | user1.comcast.com | NULL       | joined_guild | mids_united
 2019-12-10 05:38:51.765 | */*    | user1.comcast.com | NULL       | joined_guild | mids_united
 2019-12-10 05:38:53.424 | */*    | user1.comcast.com | NULL       | took_nap     | cat
 2019-12-10 05:38:53.428 | */*    | user1.comcast.com | NULL       | took_nap     | cat
 2019-12-10 05:38:53.432 | */*    | user1.comcast.com | NULL       | took_nap     | cat
 2019-12-10 05:38:53.437 | */*    | user1.comcast.com | NULL       | took_nap     | cat
 2019-12-10 05:38:53.444 | */*    | user1.comcast.com | NULL       | took_nap     | cat
 2019-12-10 05:38:53.45  | */*    | user1.comcast.com | NULL       | took_nap     | cat
 2019-12-10 05:38:55.089 | */*    | user1.comcast.com | NULL       | took_nap     | power
 2019-12-10 05:38:55.103 | */*    | user1.comcast.com | NULL       | took_nap     | power
 2019-12-10 05:38:55.105 | */*    | user1.comcast.com | NULL       | took_nap     | power
 2019-12-10 05:38:55.114 | */*    | user1.comcast.com | NULL       | took_nap     | power
 2019-12-10 05:38:55.12  | */*    | user1.comcast.com | NULL       | took_nap     | power
 2019-12-10 05:38:55.122 | */*    | user1.comcast.com | NULL       | took_nap     | power
 2019-12-10 05:38:55.126 | */*    | user1.comcast.com | NULL       | took_nap     | power
 2019-12-10 05:37:29.798 | */*    | user2.att.com     | NULL       | purchd_sword | foam
 2019-12-10 05:37:29.806 | */*    | user2.att.com     | NULL       | purchd_sword | foam
 2019-12-10 05:34:29.321 | */*    | user2.att.com     | NULL       | purchd_sword | foam
 2019-12-10 05:34:29.331 | */*    | user2.att.com     | NULL       | purchd_sword | foam
(31 rows)

Query 20191210_054144_00028_caj53, FINISHED, 1 node
Splits: 5 total, 0 done (0.00%)
0:50 [0 rows, 0B] [0 rows/s, 0B/s]
```
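Note that `user_agent` is NULL in every row, although each raw event carries a `User-Agent` value of `ApacheBench/2.3`. The source does not explain this. One possible cause, stated here as an assumption, is a key-name mismatch: the raw JSON key is `User-Agent`, while the table column is `user_agent`, and a lookup under the wrong name finds nothing. The plain-Python sketch below only illustrates that kind of mismatch; it is not the project's Spark code, and `lookup` is a hypothetical helper.

```python
import json

# One raw event, copied from the Kafka consumer output.
RAW_EVENT = (
    '{"Accept": "*/*", "Host": "user2.att.com", "event_type": "purchd_sword", '
    '"metadata_characteristic": "foam", "User-Agent": "ApacheBench/2.3"}'
)


def lookup(event, key):
    """Return the value stored under key, or None when the key is absent."""
    return event.get(key)


event = json.loads(RAW_EVENT)

# The column name from `describe filtered_events_tbl` does not exist as a JSON key.
mismatched = lookup(event, "user_agent")
# The key exactly as it appears in the raw event.
matched = lookup(event, "User-Agent")

if __name__ == "__main__":
    print("user_agent ->", mismatched)
    print("User-Agent ->", matched)
```

If this is indeed the cause, aligning the schema field name with the raw key (or renaming the key before parsing) would be one way to populate the column.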
***
Query 2: Count all rows
```bash
presto:default> select count(*) as Row_Count from filtered_events_tbl;
 Row_Count
-----------
        31
(1 row)

Query 20191210_054410_00030_caj53, FINISHED, 1 node
Splits: 5 total, 0 done (0.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]
```
***
Query 3: List the distinct metadata_characteristic values for event_type `purchd_sword`.
```
presto:default> select distinct(metadata_characteristic) as Distinct_Sword_Types from filtered_events_tbl where event_type like 'purchd_sword';
 Distinct_Sword_Types
----------------------
 foam
 strongium
(2 rows)

Query 20191210_060455_00048_caj53, FINISHED, 1 node
Splits: 6 total, 0 done (0.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]
```
***
Query 4: Percentage of `took_nap` events (event_type) whose metadata_characteristic is `cat`.
```bash
presto:default> select 100*(select count(*) from filtered_events_tbl where event_type like 'took_nap' and metadata_characteristic like 'cat') / (select count(*) from filtered_events_tbl where event_type like 'took_nap' ) as Percentage_of_naps_that_are_cat_naps;
 Percentage_of_naps_that_are_cat_naps
--------------------------------------
                                   46
(1 row)

Query 20191210_060051_00045_caj53, FINISHED, 1 node
Splits: 10 total, 5 done (50.00%)
0:01 [33 rows, 6.71KB] [48 rows/s, 9.81KB/s]
```
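The result `46` is a truncated value: both subqueries return integer counts, so the division is integer division. Using the counts visible in Query 1 (6 `cat` naps and 7 `power` naps; the table may have changed between the two queries), 100 * 6 / 13 is roughly 46.15. The sketch below reproduces that arithmetic in Python; the `percentage` function is hypothetical and is not part of the project code.

```python
def percentage(part, whole, truncate=True):
    """Return part as a percentage of whole.

    With truncate=True the result mimics integer division;
    otherwise a float is returned.
    """
    if whole == 0:
        raise ValueError("whole must be non-zero")
    if truncate:
        return (100 * part) // whole
    return 100.0 * part / whole


# Counts taken from the Query 1 listing above.
CAT_NAPS = 6
POWER_NAPS = 7
TOTAL_NAPS = CAT_NAPS + POWER_NAPS

if __name__ == "__main__":
    print(percentage(CAT_NAPS, TOTAL_NAPS))                             # 46
    print(round(percentage(CAT_NAPS, TOTAL_NAPS, truncate=False), 2))  # 46.15
```

To keep the fractional part in Presto itself, one of the operands could be made a floating-point value, for example by multiplying by `100.0` instead of `100`.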

#### 16. Stop all services

Presto
```bash
presto:default> exit
adam_sohn@myw205tools:~/w205/P3$
```
***
stream_filtered_writes_sohn.py (Spark streaming job, stopped with Ctrl-C)
```bash

  "sink" : {
    "description" : "FileSink[/tmp/filterd_evnts]"
  }
}
^CTraceback (most recent call last):
  File "/w205/P3/stream_filtered_writes_sohn.py", line 89, in <module>
    main()
  File "/w205/P3/stream_filtered_writes_sohn.py", line 86, in main
    sink.awaitTermination()
  File "/spark-2.2.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/sql/streaming.py", line 106, in awaitTermination
  File "/spark-2.2.0-bin-hadoop2.6/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1131, in __call__
  File "/spark-2.2.0-bin-hadoop2.6/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 883, in send_command
  File "/spark-2.2.0-bin-hadoop2.6/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1028, in send_command
19/12/10 06:12:46 INFO SparkContext: Invoking stop() from shutdown hook
  File "/opt/anaconda3/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
  File "/spark-2.2.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/context.py", line 237, in signal_handler
KeyboardInterrupt
19/12/10 06:12:46 INFO SparkUI: Stopped Spark web UI at http://172.26.0.7:4040
19/12/10 06:12:47 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
19/12/10 06:12:47 INFO MemoryStore: MemoryStore cleared
19/12/10 06:12:47 INFO BlockManager: BlockManager stopped
19/12/10 06:12:47 INFO BlockManagerMaster: BlockManagerMaster stopped
19/12/10 06:12:47 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
19/12/10 06:12:47 INFO SparkContext: Successfully stopped SparkContext
19/12/10 06:12:47 INFO ShutdownHookManager: Shutdown hook called
19/12/10 06:12:47 INFO ShutdownHookManager: Deleting directory /tmp/spark-b323c825-51a7-4968-bd94-a0cb8002ae7e
19/12/10 06:12:47 INFO ShutdownHookManager: Deleting directory /tmp/spark-b323c825-51a7-4968-bd94-a0cb8002ae7e/pyspark-ee7c7155-446e-4244-b978-2ebab6679b9d
adam_sohn@myw205tools:~/w205/P3$
```
***
Kafka consumer (stopped with Ctrl-C)
```bash

{"Accept": "*/*", "Host": "user2.att.com", "event_type": "default", "metadata_characteristic": "", "User-Agent": "ApacheBench/2.3"}
^Cadam_sohn@myw205tools:~/w205/P3$
```
***
game_api_sohn.py (stopped with Ctrl-C)
```
127.0.0.1 - - [10/Dec/2019 05:39:02] "GET /funtime HTTP/1.0" 200 -
^Cadam_sohn@myw205tools:~/w205/P3$
```
***
docker-compose
```bash
adam_sohn@myw205tools:~/w205/P3$ docker-compose down
Stopping p3_spark_1     ... done
Stopping p3_kafka_1     ... done
Stopping p3_cloudera_1  ... done
Stopping p3_presto_1    ... done
Stopping p3_zookeeper_1 ... done
Stopping p3_mids_1      ... done
Removing p3_spark_1     ... done
Removing p3_kafka_1     ... done
Removing p3_cloudera_1  ... done
Removing p3_presto_1    ... done
Removing p3_zookeeper_1 ... done
Removing p3_mids_1      ... done
Removing network p3_default
adam_sohn@myw205tools:~/w205/P3$
```

#### Citations

<a name="cite">
</a>
1. http://cloudurable.com/blog/kafka-architecture/index.html<br>
2. https://www.cloudera.com/products/open-source/apache-hadoop/key-cdh-components.html<br>
3. https://stackoverflow.com/questions/40801772/what-is-the-difference-between-docker-compose-ports-vs-expose<br>
4. https://en.wikipedia.org/wiki/Presto_(SQL_query_engine)<br>
5. https://databricks.com/spark/about<br>
6. https://dev.to/ibmdeveloper/what-is-curl-and-why-is-it-all-over-api-docs-9mh<br>
7. https://httpd.apache.org/docs/2.4/programs/ab.html<br>
8. https://en.wikipedia.org/wiki/Chmod<br>
9. https://stackoverflow.com/questions/8967902/why-do-you-need-to-put-bin-bash-at-the-beginning-of-a-script-file<br>
