Initial commit.
Ganapathy Seshadri Cadungude Aiyer authored and Ganapathy Seshadri Cadungude Aiyer committed Nov 13, 2018
0 parents commit 259d338
Showing 208 changed files with 43,579 additions and 0 deletions.
7 changes: 7 additions & 0 deletions LICENSE.md
Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
13 changes: 13 additions & 0 deletions README.md
This document describes the full end-to-end smart parking application that is available with DeepStream 3.0. The architecture below provides a reference for building distributed and scalable DeepStream applications.

![Architecture](readme-images/architecture.png?raw=true "Architecture")

The perception capabilities of a DeepStream application can now seamlessly be augmented with data analytics capabilities to build complete solutions, offering rich data dashboards for actionable insights. This bridging of DeepStream’s perception capabilities with data analytics frameworks is particularly useful for applications requiring long-term trend analytics, global situational awareness, and forensic analysis. This also allows leveraging major Internet of Things (IoT) services as the infrastructure backbone.

The data analytics backbone is connected to DeepStream applications through a distributed messaging fabric. DeepStream 3.0 offers two new plugins, gstnvmsgconv and gstnvmsgbroker, to transform and connect to various messaging protocols. The protocol supported in this release is Kafka.
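The transform step can be pictured as parsing a raw perception event and pulling out the fields the analytics layer cares about. The sketch below is stdlib-only Python with a hypothetical event schema; the real DeepStream message schema produced by gstnvmsgconv differs.

```python
import json

def transform_event(raw_json):
    """Parse a raw perception event and extract analytics fields.

    The schema here is hypothetical and only illustrates the idea of a
    gstnvmsgconv-style transformation before handing the record to the
    message broker.
    """
    event = json.loads(raw_json)
    return {
        "object_id": event["object"]["id"],
        "event_type": event["event"]["type"],
        "timestamp": event["@timestamp"],
        "place": event["place"]["name"],
    }

# Illustrative input resembling a parking event
raw = json.dumps({
    "@timestamp": "2018-11-13T00:00:00Z",
    "object": {"id": "car-42"},
    "event": {"type": "entry"},
    "place": {"name": "garage-P1"},
})
record = transform_event(raw)
```

In the real pipeline the transformed record is then published to Kafka by gstnvmsgbroker rather than returned to the caller.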



To build an end-to-end implementation of the Analytics layer, DeepStream 3.0 uses open-source tools and frameworks that can easily be reproduced for deployment on an on-premises server or in the cloud.

The framework comprises stream and batch processing capabilities. Every component of the Analytics layer (Message Broker, Streaming, NoSQL, and Search Indexer) can be horizontally scaled. The streaming analytics pipeline can be used for processes like anomaly detection, alerting, and computation of statistics like traffic flow rate. Batch processing can be used to extract patterns in the data, look for anomalies over a period of time, and build machine learning models.

The data is kept in a NoSQL database for state management, e.g. the occupancy of a building, activity in a store, or people movement in a train station. This also provides the capability for forensic analytics, if needed. The data can be indexed for search and time-series analytics. Information generated by streaming and batch processing is exposed through a standard API for visualization. The API can be accessed through REST, WebSocket, or messaging, based on the use case. The user interface allows the user to consume all the relevant information.
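A statistic like traffic flow rate reduces to counting entry events per time window. The following is a toy stdlib-only reduction of that idea; the actual computation runs in the Spark streaming pipeline, and the window size and event shape here are illustrative.

```python
from collections import Counter

def flow_rate(events, window_seconds=60):
    """Count "entry" events per tumbling time window.

    `events` is an iterable of (epoch_seconds, event_type) pairs; the
    result maps window index -> number of entries in that window.
    """
    counts = Counter()
    for ts, kind in events:
        if kind == "entry":
            counts[ts // window_seconds] += 1
    return dict(counts)

events = [(5, "entry"), (30, "entry"), (61, "entry"), (70, "exit")]
rates = flow_rate(events)  # window 0 has 2 entries, window 1 has 1
```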
Deployment is based on an open source technology stack. The modules and technology stack are shown with respect to the Streaming Data pipeline.
164 changes: 164 additions & 0 deletions analytics_server_docker/README.md
# Docker 360

## Introduction

To demonstrate the full end-to-end capabilities and to help developers jump-start their development, DeepStream 3.0 comes with a complete reference
implementation of a smart parking solution. This reference application can be deployed on edge servers or in the cloud.
Developers can leverage this and adapt it to their specific use cases. Docker containers are provided to further simplify deployment, adaptability, and manageability.

The architecture of the application looks as follows:

![Architecture](readme-images/architecture.png?raw=true "Architecture")

**Note**: This application creates Docker containers only for the Analytics Server.

The application can be run in two modes:
+ **Playback**: This mode plays back events from a point in time
+ **Live**: This mode shows events and scenes as and when they are detected

## Getting Started

### Dependencies

The application requires recent versions of [Docker](https://docs.docker.com/install/linux/docker-ce/ubuntu/) and [Docker Compose](https://docs.docker.com/compose/install/#install-compose) to be installed on the machine.

### Environment Variables

Export the following environment variables:
+ **IP_ADDRESS** - IP address of the host machine
+ **GOOGLE_MAP_API_KEY** - API key for Google Maps

Follow the instructions in this [link](https://developers.google.com/maps/documentation/javascript/get-api-key) to get an API key for Google Maps.

### Configurations

Playback is the default mode of the application.

If live mode is to be used, then:

1. Go to `node-apis/config/config.json` and change the following config:

garage.isLive: true

2. Send the data generated by DeepStream 3.0 to the Kafka topic `metromind-raw`
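For reference, the relevant part of `node-apis/config/config.json` would then look roughly like the fragment below. Only the `garage.isLive` key is confirmed by this document; the surrounding structure is illustrative.

```json
{
  "garage": {
    "isLive": true
  }
}
```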

### Installation

1. Install Docker and Docker Compose.

2. Export the environment variables

a) IP address of the host machine

export IP_ADDRESS=xxx.xxx.xx.xx

b) Google Maps API key:

export GOOGLE_MAP_API_KEY=AIzaSyA9nCK3AwnwsWMaFv4Ce_4wymC6ai3JEh0

3. Download this application by either clicking the download button at the top-right corner, or by using the command

git clone https://gitlab-master.nvidia.com/metromind/DS-360-app/docker-360.git

4. Change configurations (Optional)

5. Run the docker containers using the following `docker-compose` command

sudo -E docker-compose up -d

This will start the following containers:

cassandra
kafka
zookeeper
spark-master
spark-worker
elasticsearch
kibana
logstash
api
ui
python-module

6. Start the Spark streaming job. This job does the following:

a) manages the state of the parking garage

b) detects the car "understay" anomaly

c) computes the flow rate


Run the following command to log in to the Spark master

sudo docker exec -it spark-master /bin/bash

The docker container picks up the jar file from spark/data

./bin/spark-submit --class com.nvidia.ds.stream.StreamProcessor --master spark://master:7077 --executor-memory 8G --total-executor-cores 4 /tmp/data/stream-360-1.0-jar-with-dependencies.jar


Note that one can check out the stream-360 project and compile the source code using Maven to create stream-360-1.0-jar-with-dependencies.jar

mvn clean install -Pjar-with-dependencies

7. Start the Spark batch job. This job detects the "overstay" anomaly. (Note: stop the streaming job from step 6 before running this job, or allocate more resources in the Spark cluster.)

In a second shell, run the following command to log in to the Spark master

sudo docker exec -it spark-master /bin/bash

Run the batch job

./bin/spark-submit --class com.nvidia.ds.batch.BatchAnomaly --master local[8] /tmp/data/stream-360-1.0-jar-with-dependencies.jar
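Conceptually, overstay detection flags any object whose parking duration exceeds a threshold. The sketch below is a toy stand-in; the real job is `com.nvidia.ds.batch.BatchAnomaly` running on Spark, and the 24-hour threshold and input shape here are assumptions.

```python
def find_overstays(stays, max_hours=24.0):
    """Return ids of objects parked longer than `max_hours`.

    `stays` maps object id -> (entry_ts, exit_ts) in epoch seconds.
    Illustrative only; not the actual Spark batch implementation.
    """
    limit = max_hours * 3600
    return sorted(
        obj for obj, (entry, exit_) in stays.items()
        if exit_ - entry > limit
    )

stays = {"car-1": (0, 90000), "car-2": (0, 3600)}
overstays = find_overstays(stays)  # 90000 s exceeds 24 h, so only car-1
```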

8. **Generate Data** (Optional). For testing purposes ONLY; normally the DeepStream 360 application reads from cameras and sends metadata to the Analytics Server.

a) sudo apt-get update
b) sudo apt-get install default-jdk
c) sudo apt-get install maven
d) git clone https://gitlab-master.nvidia.com/metromind/DS-360-app/stream-360.git
e) cd ./stream-360
f) sudo mvn clean install exec:java -Dexec.mainClass=com.nvidia.ds.util.Playback -Dexec.args="<KAFKA_BROKER_IP_ADDRESS>:<PORT> --input-file <path to input file>"

**Note**:
+ Replace KAFKA_BROKER_IP_ADDRESS and PORT with the host IP_ADDRESS and the port used by Kafka, respectively.
+ Set the path to the input file as `data/playbackData.json` to view the demo data.
+ The following additional option can be added to the args in step f:
+ **topic-name** - Name of the Kafka topic to which data has to be sent. Set it to `metromind-raw` if the input data is not tracked; if the input data has already gone through the tracking module, send it to `metromind-start`. The default value used in step f is `metromind-start`.<br/>
With this additional option, step f will look as follows:

sudo mvn clean install exec:java -Dexec.mainClass=com.nvidia.ds.util.Playback -Dexec.args="<KAFKA_BROKER_IP_ADDRESS>:<PORT> --input-file <path to input file> --topic-name <kafka topic name>"
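The topic-routing rule in the note above can be summarized as a tiny helper. The topic names come from this document; the function itself is illustrative and not part of the playback tool.

```python
def choose_topic(already_tracked):
    """Pick the Kafka topic for playback data.

    Untracked raw detections go to metromind-raw; data that has
    already passed the tracking module goes to metromind-start
    (the playback tool's default).
    """
    return "metromind-start" if already_tracked else "metromind-raw"
```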

9. **Create Elasticsearch start-Index** (Optional)

Browse to the Kibana URL http://IP_ADDRESS:5601

![Start Index](readme-images/index-creation-1.png?raw=true "Start Index")


10. **Create Elasticsearch anomaly-Index** (Optional)

![Anomaly Index](readme-images/index-creation-2.png?raw=true "Anomaly Index")

11. **Automated Script** (Optional)

The entire process of starting and stopping the Docker containers can be automated using `start.sh` and `stop.sh`.

If `start.sh` is going to be used, make sure that `xxx.xxx.xx.xx` is replaced with the IP address of the host machine. Also replace `<YOUR GOOGLE_API_KEY>` with your own API key (use this key for testing: AIzaSyA9nCK3AwnwsWMaFv4Ce_4wymC6ai3JEh0).

`stop.sh` should only be used when the containers need to be stopped and the Docker images have to be removed from the system.

If you only need to stop the containers, use `sudo docker-compose down` instead; keeping the images significantly reduces the time the containers take to start again when `start.sh` is executed.

**Note**:
+ The DeepStream application should be started only after the Analytics Server is up and running.
+ Remember to shut down the Analytics Server's Docker containers once the DeepStream application is shut down.

12. **Test**

Browse to http://IP_ADDRESS

![UI](readme-images/ui.png?raw=true "UI")

**Note**: The UI shows fewer events than actually occur. If one object generates many events within the refresh interval, events from other objects could be obscured; to avoid this, only a few events per object are displayed.
11 changes: 11 additions & 0 deletions analytics_server_docker/cassandra/Dockerfile
FROM cassandra:3.11.2

WORKDIR /home/cassandra

COPY entrypoint-wrap.sh .

COPY schema.cql .

ENTRYPOINT ["/home/cassandra/entrypoint-wrap.sh"]

CMD ["cassandra", "-f"]
8 changes: 8 additions & 0 deletions analytics_server_docker/cassandra/entrypoint-wrap.sh
#!/bin/bash

# In the background, retry applying the schema until Cassandra is up
# and cqlsh succeeds.
until cqlsh -u cassandra -p cassandra -f /home/cassandra/schema.cql; do
echo "cqlsh: Cassandra is unavailable - retry later"
sleep 2
done &

# Hand off to the stock Cassandra entrypoint, which starts the server.
exec /docker-entrypoint.sh "$@"