# Overview


This directory contains two major components that can be combined into one pipeline.

- [Video Image Analysis](videoImageAnalysis.jupyter-py36.ipynb) (VIA): Extracts objects from video.
- [Live Wiki Extract](liveWikiExtract.jupyter-py36.ipynb): Analyzes current Wikipedia updates.

Video Image Analysis creates a Streams application that accepts images and extracts objects using open source ML code. The notebook composes and submits the application; images in which the ML module recognizes objects can be rendered in the notebook. The images arrive on the 'image_active' topic.

Two sources of images are provided:
- The 'Live Wiki Extract' notebook monitors [Wikimedia EventStreams](https://stream.wikimedia.org/?doc), where updates to Wikipedia are published as they arrive. The notebook goes through a multi-step process of extracting and characterizing updates, eventually getting to images, which are published to the 'image_active' topic.
- The [Cropped Image Sender](croppedImageSender.ipynb) extracts frames from MPEG videos and pumps them to Streams via Kafka. The receiving side on Streams accepts the frames and publishes them on the 'image_active' topic.

Images that are sourced from Wikipedia or the video stream are processed and rendered by the 'Video Image Analysis' notebook.


## Two Components

The 'Live Wiki Extract' notebook is a 'realistic' processing of a live feed. Phases include:
- Connecting to the Wikipedia feed using SSE (Server-Sent Events, a web standard similar to WebSocket).
- Breaking out requests generated by robots from those generated by non-robots.
- Breaking out the countries that submissions are currently coming from. The majority of submissions arrive during the country's evening.
- Capturing top submitters.
- Extracting web content with Beautiful Soup, a beguilingly named Python package used to shred web content.
- Fetching image content using the Python Requests package.
- Publishing images for further processing on the 'image_active' topic.
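
The robot/non-robot split and the top-submitter count can be sketched roughly as follows. This is a minimal, offline illustration and not the notebook's actual code: the sample lines are made up, the helper names `parse_sse_data` and `summarize` are hypothetical, and the `bot` and `user` field names are assumed to match what the Wikimedia feed delivers.

```python
import json
from collections import Counter


def parse_sse_data(line):
    """Return the decoded JSON payload of an SSE 'data:' line, else None."""
    if not line.startswith("data:"):
        return None  # event names, comments and blank lines are skipped
    try:
        return json.loads(line[len("data:"):].strip())
    except json.JSONDecodeError:
        return None


def summarize(events, top_n=3):
    """Split events into bot / non-bot and count the busiest non-bot users."""
    bots = [e for e in events if e.get("bot")]
    humans = [e for e in events if not e.get("bot")]
    top = Counter(e.get("user", "unknown") for e in humans).most_common(top_n)
    return bots, humans, top


# Made-up sample lines standing in for the live feed (assumption).
SAMPLE_LINES = [
    "event: message",
    'data: {"user": "Alice", "bot": false, "title": "Page A"}',
    'data: {"user": "CleanupBot", "bot": true, "title": "Page B"}',
    'data: {"user": "Alice", "bot": false, "title": "Page C"}',
    'data: {"user": "Bob", "bot": false, "title": "Page D"}',
    ": heartbeat comment",
]

events = [e for e in map(parse_sse_data, SAMPLE_LINES) if e is not None]
bots, humans, top = summarize(events)
print(len(bots), "bot edits,", len(humans), "human edits, top submitters:", top)
```

In the notebook the events arrive continuously, so the counts would be kept over a time window rather than over a fixed list as done here.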

This processing is built as a relatively long notebook composed of multiple applications that communicate via pub/sub. A hard slog for the curious, but practical.

The 'Video Image Analysis' notebook accepts images on the 'image_active' topic and uses open source ML to analyze them. It is a relatively short piece of code that illustrates how to integrate an ML 'BlackBox'. 'BlackBox' in this context means just that: I found the model and moved it over. Links to the description of how it was generated can be found in the notebook.

 




## Prerequisites 

### Imports 
The following lists all the packages used by the notebooks, grouped by the notebook that first requires them.
##### Notebook: imagAna_0
- pip install SSEClient===0.0.22 --upgrade --user
- pip install pillow
- pip install matplotlib
- pip install ipywidgets
- pip install pandas


##### Notebook: liveWikiExtract
- pip install --user --upgrade streamsx
- pip install beautifulsoup4
- pip install opencv-python
- pip install kafka-python

##### Notebook: croppedImageSender
- pip install m3u8
- pip install interactivecrop
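
Before opening the notebooks it can help to check which of these packages are still missing. The sketch below is only a convenience and is not part of the notebooks. The `REQUIRED` mapping from pip name to import name is my best understanding of each package (for example `beautifulsoup4` imports as `bs4`), and `missing_packages` is a hypothetical helper.

```python
import importlib.util

# pip name -> import name (assumed; the two differ for several packages).
REQUIRED = {
    "SSEClient": "sseclient",
    "pillow": "PIL",
    "matplotlib": "matplotlib",
    "ipywidgets": "ipywidgets",
    "pandas": "pandas",
    "streamsx": "streamsx",
    "beautifulsoup4": "bs4",
    "opencv-python": "cv2",
    "kafka-python": "kafka",
    "m3u8": "m3u8",
    "interactivecrop": "interactivecrop",
}


def missing_packages(required):
    """Return the pip names whose import name cannot be found."""
    return sorted(
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    )


missing = missing_packages(REQUIRED)
if missing:
    print("Still to install:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```

The check only confirms that a package can be found; it does not verify versions such as the pinned SSEClient release.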


## Credentials 



The applications use two Cloud resources. This section walks through creating the resources, fetching the credentials, and setting up the scripts/credential.py file. All the notebooks use something from credential.py.


### Streams

Using your IBM Cloud account, create a Lite [Streaming Analytics](https://cloud.ibm.com/catalog/services/streaming-analytics) service.
- When it finishes provisioning, access the newly created resource. Select 'Service credentials' to bring up the 'Service credentials' pane.
- Create the credential by selecting 'New credential'.
- Select the newly created credential's 'copy to clipboard' on the right.
- Paste the copied text into the cell below; it will be used when we build the credentials file below.

**overwrite with Streams credential**

### Event Streams (aka Kafka)
Using your IBM Cloud account, create a Lite [Event Streams](https://cloud.ibm.com/catalog/services/event-streams) service.
- When it finishes provisioning, access the newly created resource. Select 'Manage' to bring up the management pane.
- Select 'Create a topic'.
- Enter 'VideoFrame' for the topic name and select 'Next'.
- Select 'Next' and 'Create topic' with the default values.
- You will be returned to your instance's resource entry.
- Select 'Service credentials' to bring up the 'Service credentials' pane.
- Create the credential by selecting 'New credential'.
- Select the newly created credential's 'copy to clipboard' on the right.
- Paste the copied text into the cell below; it will be used when we build the credentials file below.

**overwrite with Event Streams credential**

## Update and rename credentialSetup.py

- Open the 'script/credentialSet.py' file.
- Paste the Streams and Event Streams credentials from above into the areas specified within the file.
- Save the updated file.
- Rename the file to credential.py. Do not check the file in to git.

The credential.py file is accessed when submitting Streams jobs and when using the Event Streams facility.
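
To give a feel for how the pasted Event Streams credential ends up being used, here is a rough sketch that turns the copied JSON into keyword arguments for kafka-python. It is not the contents of credential.py. The helper `kafka_settings` is hypothetical, the sample values are fake, and the field names `kafka_brokers_sasl`, `user` and `password` are what I would expect in an Event Streams service credential, so check them against your own.

```python
import json

# Fake credential text in the shape I expect from 'copy to clipboard' (assumption).
SAMPLE_CREDENTIAL = """
{
  "kafka_brokers_sasl": ["broker-0.example.invalid:9093",
                         "broker-1.example.invalid:9093"],
  "user": "token",
  "password": "not-a-real-api-key"
}
"""


def kafka_settings(credential_text):
    """Build kafka-python connection keyword arguments from credential JSON."""
    creds = json.loads(credential_text)
    return {
        "bootstrap_servers": creds["kafka_brokers_sasl"],
        "security_protocol": "SASL_SSL",
        "sasl_mechanism": "PLAIN",
        "sasl_plain_username": creds["user"],
        "sasl_plain_password": creds["password"],
    }


settings = kafka_settings(SAMPLE_CREDENTIAL)
print(sorted(settings))
```

With real credentials the resulting dictionary could be passed along the lines of `KafkaProducer(**settings)` from the `kafka` package; that call needs a reachable broker, so it is left out of the sketch.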



## Installing the yolov3.weights

There are two image analysis phases: locate faces and locate objects. I treat the models as a BlackBox: images go in and a score comes out. Links on the development of the models can be found below. You will need to download the yolov3.weights file yourself, since it is beyond the capacity of git.

Download https://pjreddie.com/media/files/yolov3.weights and copy it into the datasets directory with the other yolov3 files.
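
A quick way to confirm the download landed in the right place is sketched below. The helper `weights_ready` is hypothetical, the relative path assumes you run it from this directory, and the size threshold is an assumption based on the full file being a couple of hundred megabytes. A much smaller file usually means an interrupted download.

```python
from pathlib import Path

# Assumption: the complete weights file is well over 200 MB.
MIN_BYTES = 200 * 1024 * 1024


def weights_ready(path, min_bytes=MIN_BYTES):
    """Return True if the weights file exists and is at least min_bytes long."""
    p = Path(path)
    return p.is_file() and p.stat().st_size >= min_bytes


if weights_ready("datasets/yolov3.weights"):
    print("yolov3.weights looks complete.")
else:
    print("yolov3.weights is missing or too small; download it again.")
```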


[Yolo3](https://www.learnopencv.com/deep-learning-based-object-detection-using-yolov3-with-opencv-python-c/)

The code uses the YOLOv3 model. Based on this [note](https://blog.roboflow.ai/yolov5-is-here/), the YOLO family is still evolving.


