# Joe Mirza - Project 3 - W205

<br>  
### **My project has the following files:**  
**1. Project_3_Joe_Mirza_w205.ipynb:** This notebook. It narrates the pipeline steps and runs a few presto queries once the data has landed in hadoop.   <br>  
**2. ab.sh:** A bash script that streams events into the pipeline using apache bench. I run it from within a while loop on the command line.   <br>  
**3. game_api.py:** Receives the events from the previous step in flask and forwards them to kafka. Maps each incoming request to an event dictionary, adds the request header information and serializes the result as json before sending it to kafka.    <br>  
**4. stream_and_hive.py:** Reads the event objects from kafka. Uses pyspark to filter the 4 event types I'm supporting (default, purchase_sword, purchase_armor and join_a_guild) into 4 separate tables which are registered in hive and written to hadoop.     <br>  
**5. docker-compose.yml:** Spins up the cluster and maps/coordinates the interaction between some of the services. I'll pull a few pieces of that file inline into this notebook to demonstrate I understand how they work.   <br>  
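
To make the filtering step in `stream_and_hive.py` a bit more concrete, here is a minimal plain-Python sketch of the idea: each event is routed by its `event_type` into one of four groups. The real script does this with pyspark and writes hive tables; the function name `split_by_event_type`, the sample events and the assumption that every event carries an `event_type` key are mine, for illustration only.

```python
import json
from collections import defaultdict

# Event types named in the file list above.
SUPPORTED_TYPES = ("default", "purchase_sword", "purchase_armor", "join_a_guild")


def split_by_event_type(raw_messages):
    """Group JSON-encoded events by their 'event_type' field.

    Hypothetical helper: a stand-in for the pyspark filters, not the
    actual code in stream_and_hive.py.
    """
    tables = defaultdict(list)
    for raw in raw_messages:
        event = json.loads(raw)
        event_type = event.get("event_type")
        # Events with an unsupported type are simply dropped.
        if event_type in SUPPORTED_TYPES:
            tables[event_type].append(event)
    return dict(tables)


# Made-up sample messages shaped like the ones game_api.py produces.
sample = [
    json.dumps({"event_type": "default", "Host": "user1.comcast.com"}),
    json.dumps({"event_type": "join_a_guild", "Host": "user2.att.com"}),
    json.dumps({"event_type": "join_a_guild", "Host": "user1.comcast.com"}),
    json.dumps({"event_type": "unknown_event", "Host": "user2.att.com"}),
]
tables = split_by_event_type(sample)
print({name: len(rows) for name, rows in tables.items()})
```

Each key of `tables` plays the role of one of the four hive tables; anything that isn't a supported event type is left out.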

I'll spin up the cluster with: `docker-compose up -d`

### **Services spun up:**
- zookeeper
- kafka
- cloudera
- spark
- presto
- mids
<br>  
The only new service here, relative to what we've done in Projects 1 and 2, is presto. Spark is used in this project to filter, transform and ultimately write events to hadoop. We'll use presto in this notebook to run sql queries.  


### **Start up kafkacat to 'consume' or receive events from apache bench:**

#### `docker-compose exec mids kafkacat -C -b kafka:29092 -t events -o beginning`

I followed the approach laid out in unit 13a, which doesn't create the topic 'events' first but instead relies on the topic being created automatically the first time kafkacat asks for it. That requires entering the above command twice. Seems odd, but it works. 

Per the above command and the docker-compose.yml excerpt below, kafka is listening on port 29092:
    ```
      kafka:
        image: confluentinc/cp-kafka:latest
        depends_on:
          - zookeeper
        environment:
          KAFKA_BROKER_ID: 1
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:32181
          KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
        expose:
          - "9092"
          - "29092"
    ```
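
As a small illustration of how that listener string breaks down, the sketch below splits the `KAFKA_ADVERTISED_LISTENERS` value from the compose file into protocol, host and port using Python's standard library. The variable names are mine, and nothing here talks to a running broker.

```python
from urllib.parse import urlsplit

# Value copied from the docker-compose.yml excerpt above.
advertised_listener = "PLAINTEXT://kafka:29092"

parts = urlsplit(advertised_listener)
protocol = parts.scheme.upper()  # urlsplit lowercases the scheme
host = parts.hostname
port = parts.port

# This is the host:port pair passed to kafkacat's -b flag and to
# KafkaProducer(bootstrap_servers=...) in game_api.py.
bootstrap_server = "{}:{}".format(host, port)
print(protocol, bootstrap_server)
```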

### **Before running flask, we'll discuss what game_api.py does and how I modified it. I'll only show the method and flask decorator I created so you don't have to wade through the whole thing. If you need to see all of game_api.py, it's in the repository.**

1. flask is imported to read and, well, 'route' the apache bench messages it receives  <br>    
2. KafkaProducer is imported to send the events that are transformed here into the kafka queue   <br>  
3. request is imported to append header information to the transformed events    <br>  
4. Each of the 4 event types I'm supporting has its own flask decorator and associated python function. The one I created is join_a_guild. When flask's @app.route 'sees' an incoming request from apache bench with the path "/join_a_guild", it creates a join_guild_event dictionary and sends it to the log_to_kafka function. There the header is appended and the whole thing is sent to the `events` topic as json by the kafka producer. <br>  
 

    ```
    #!/usr/bin/env python
    import json
    from kafka import KafkaProducer
    from flask import Flask, request

    app = Flask(__name__)
    producer = KafkaProducer(bootstrap_servers='kafka:29092')

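    def log_to_kafka(topic, event):
        # Add the request headers (Host, User-Agent, ...) to the event
        event.update(request.headers)
        # Serialize to json bytes and send to the given kafka topic
        producer.send(topic, json.dumps(event).encode())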
    ...
    ...
    ...

    @app.route("/join_a_guild")
    def join_a_guild():
        join_guild_event = {'event_type': 'join_a_guild'}
        log_to_kafka('events', join_guild_event)
        return "Guild Joined!\n"
    ```
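
To show what ends up on the `events` topic, the sketch below reproduces the merge-and-encode step of `log_to_kafka` without flask or kafka. A plain dict stands in for `request.headers`, and `build_payload` is a hypothetical helper name; the header values are made up to resemble an apache bench request.

```python
import json


def build_payload(event, headers):
    """Merge header fields into the event and encode it as JSON bytes.

    Mirrors event.update(request.headers) followed by
    json.dumps(event).encode() in log_to_kafka.
    """
    merged = dict(event)      # copy so the caller's dict is untouched
    merged.update(headers)
    return json.dumps(merged).encode()


# Made-up headers standing in for flask's request.headers.
fake_headers = {
    "Host": "user2.att.com",
    "User-Agent": "ApacheBench/2.3",
    "Accept": "*/*",
}

payload = build_payload({"event_type": "join_a_guild"}, fake_headers)
decoded = json.loads(payload.decode())
print(decoded["event_type"], decoded["Host"])
```

Decoding the bytes again is roughly what a downstream consumer has to do before it can filter on `event_type`.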

### **Let's run flask now**

#### `docker-compose exec mids env FLASK_APP=/w205/project-3-FuriousGeorge19/game_api.py flask run --host 0.0.0.0`

### **With flask and kafka running, now is a good point to test what we've built so far by using apache bench to stream some events and checking that flask and kafka pick them up**

Rather than use the while loop from the live session slides, which had the apache bench calls inside it, I thought I'd try using the while loop to run the bash script ab.sh repeatedly instead. I was pleasantly surprised when it worked. 

In [18]:
!cat ab.sh

#!/bin/bash
 
docker-compose exec mids ab -n 10 -H "Host: user1.comcast.com" http://localhost:5000/
docker-compose exec mids ab -n 10 -H "Host: user1.comcast.com" http://localhost:5000/purchase_a_sword
docker-compose exec mids ab -n 10 -H "Host: user2.att.com" http://localhost:5000/join_a_guild
docker-compose exec mids ab -n 10 -H "Host: user2.att.com" http://localhost:5000/purchase_armor

docker-compose exec mids ab -n 10 -H "Host: user2.att.com" http://localhost:5000/
docker-compose exec mids ab -n 10 -H "Host: user2.att.com" http://localhost:5000/purchase_a_sword
docker-compose exec mids ab -n 10 -H "Host: user2.att.com" http://localhost:5000/join_a_guild
docker-compose exec mids ab -n 10 -H "Host: user2.att.com" http://localhost:5000/purchase_armor

docker-compose exec mids ab -n 10 -H "Host: user2.att.com" http://localhost:5000/
docker-compose exec mids ab -n 10 -H "Host: user2.att.com" http://localhost:5000/purchase_a_sword
docker-compose exec mids ab -n 10 -H "Host

The modified loop sends lots and lots of events (about 120 by my count, at 10 requests per `ab` call) each time the script runs. 

    ```
    
    while true; do 
      ./ab.sh; 
      sleep 10;  
    done
    
    ```
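
The number of events per pass can be checked by summing the `-n` values of the `ab` calls in the script. The sketch below does this for the first four lines of `ab.sh` only, since the listing above is cut off and I don't want to guess at the rest; `count_requests` is a hypothetical helper. If the full script has 12 such calls, the same arithmetic gives 12 * 10 = 120.

```python
import shlex

# First four ab calls copied from ab.sh.
ab_lines = """
docker-compose exec mids ab -n 10 -H "Host: user1.comcast.com" http://localhost:5000/
docker-compose exec mids ab -n 10 -H "Host: user1.comcast.com" http://localhost:5000/purchase_a_sword
docker-compose exec mids ab -n 10 -H "Host: user2.att.com" http://localhost:5000/join_a_guild
docker-compose exec mids ab -n 10 -H "Host: user2.att.com" http://localhost:5000/purchase_armor
"""


def count_requests(script_text):
    """Sum the -n (number of requests) argument over every ab call."""
    total = 0
    for line in script_text.splitlines():
        tokens = shlex.split(line)
        if "ab" in tokens and "-n" in tokens:
            total += int(tokens[tokens.index("-n") + 1])
    return total


total = count_requests(ab_lines)
print(total)
```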

In [9]:
!ls

Project_3_Joe_Mirza_w205.ipynb	docker-compose.yml  stream_and_hive.py
README.md			game_api.py
ab.sh				game_api.pyc

