From d37b65aca149b3df6f2d789590ad42665eef3b8e Mon Sep 17 00:00:00 2001
From: Clemens Wolff
Date: Wed, 9 Aug 2017 11:52:24 -0700
Subject: [PATCH] Remove outdated Kafka entry

---
 README.md | 77 +------------------------------------------------------
 1 file changed, 1 insertion(+), 76 deletions(-)

diff --git a/README.md b/README.md
index aec77228..80694eb1 100644
--- a/README.md
+++ b/README.md
@@ -13,82 +13,7 @@ This project contains a Spark Streaming job that ingests data into the Fortis sy
 3. Narrow down the stream of events based on user-defined geo-areas, target keywords and blacklisted terms.
 4. Perform trend detection and aggregate the metrics that back Project Fortis.
 
-At the end of the ingestion pipeline, we publish the events to Kafka from where any downstream processors or aggregators
-can consume the data. The schema of the data in Kafka is as follows:
-
-```json
-{
-  "title": "FortisEvent",
-  "type": "object",
-  "properties": {
-    "language": {
-      "type": "string"
-    },
-    "locations": {
-      "description": "The ids of all places mentioned in the event",
-      "type": "array",
-      "items": {
-        "description": "A Who's-On-First id",
-        "type": "string"
-      }
-    },
-    "sentiments": {
-      "type": "array",
-      "items": {
-        "description": "Neutral sentiment is 0.6, 0 is most negative, 1 is most positive.",
-        "type": "number",
-        "minimum": 0,
-        "maximum": 1
-      }
-    },
-    "keywords": {
-      "type": "array",
-      "items": {
-        "type": "string"
-      }
-    },
-    "entities": {
-      "type": "array",
-      "items": {
-        "type": "string"
-      }
-    },
-    "summary": {
-      "type": "string"
-    },
-    "id": {
-      "type": "string"
-    },
-    "createdAtEpoch": {
-      "type": "number"
-    },
-    "body": {
-      "type": "string"
-    },
-    "title": {
-      "type": "string"
-    },
-    "publisher": {
-      "type": "string"
-    },
-    "sourceUrl": {
-      "type": "string"
-    },
-    "sharedLocations": {
-      "description": "The ids of all places explicitly tagged in the event",
-      "type": "array",
-      "items": {
-        "description": "A Who's-On-First id",
-        "type": "string"
-      }
-    }
-  },
-  "required": [
-    "id",
-    "createdAtEpoch"
-  ]
-}
-```
+At the end of the ingestion pipeline, we publish the events and various aggregations to Cassandra.
 
 ## Development setup ##