This project aims to compare different NLP algorithms on the current stream of tweets.
- common: contains the shared, flat generic models and entities
- twitter-indexer: streams tweets from Twitter for the required search terms and publishes them to Kafka
- sentiment: enriches the consumed tweets with a sentiment analysis result
- es-sink: persists the enriched tweets into Elasticsearch
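The enrichment step sits between the Kafka consumer and the es-sink. A minimal, self-contained sketch of it follows; the `Tweet`/`EnrichedTweet` shapes and the keyword-based scoring are assumptions (the real service delegates to an NLP library instead of this stub):

```java
// Sketch of the sentiment enrichment step. Model names and the
// keyword scoring are hypothetical stand-ins for the real pipeline.
public class SentimentEnricher {

    // Simplified stand-ins for the shared models in `common`.
    public static class Tweet {
        public final String id;
        public final String text;

        public Tweet(String id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    public static class EnrichedTweet {
        public final String id;
        public final String text;
        public final String sentiment;

        public EnrichedTweet(String id, String text, String sentiment) {
            this.id = id;
            this.text = text;
            this.sentiment = sentiment;
        }
    }

    // Stub scoring; a real implementation calls an NLP pipeline here.
    static String score(String text) {
        String lower = text.toLowerCase();
        if (lower.contains("love") || lower.contains("great")) {
            return "POSITIVE";
        }
        if (lower.contains("hate") || lower.contains("awful")) {
            return "NEGATIVE";
        }
        return "NEUTRAL";
    }

    public static EnrichedTweet enrich(Tweet tweet) {
        return new EnrichedTweet(tweet.id, tweet.text, score(tweet.text));
    }
}
```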
- Java 11
- Spring Boot Framework
- Apache Kafka
- Elasticsearch
- Docker/Docker-compose
- Stanford CoreNLP library
TODO
TODO
- schema registry?
- fine-grained metrics in Graphite
- ship logs into Kibana via Filebeat
- Elasticsearch indexing based on the sentiment analysis result
- more Kafka brokers in the Docker Compose setup
- Kafka sink to Elasticsearch? (KSQL)
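The "more Kafka brokers" TODO can be sketched as an extra service in docker-compose.yml; the service name, image version, and network layout below are assumptions based on the usual Confluent setup, not the actual compose file:

```yaml
  # Hypothetical second broker; the existing broker keeps id 1.
  kafka-2:
    image: confluentinc/cp-kafka:7.4.0
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka-2:9092
      # With two brokers, internal topics can be replicated.
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 2
```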
Stanford CoreNLP //Note: after double-checking the results of this NLP algorithm, I am not satisfied with them.//
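CoreNLP's sentiment annotator assigns each sentence a class from 0 (very negative) to 4 (very positive), as returned by `RNNCoreAnnotations.getPredictedClass`. A small helper (the class name is an assumption) can map that class to a label suitable for an Elasticsearch field:

```java
// Maps CoreNLP's 0-4 sentiment class to a string label for indexing.
// The helper itself is a sketch; only the 0-4 scale comes from CoreNLP.
public class SentimentLabel {

    private static final String[] LABELS = {
        "VERY_NEGATIVE", "NEGATIVE", "NEUTRAL", "POSITIVE", "VERY_POSITIVE"
    };

    public static String fromClass(int sentimentClass) {
        if (sentimentClass < 0 || sentimentClass >= LABELS.length) {
            throw new IllegalArgumentException(
                "unknown sentiment class: " + sentimentClass);
        }
        return LABELS[sentimentClass];
    }
}
```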
Unfortunately, I have to say that the documentation of all these NLP libraries is messy, unclear, and hard to read.