Stream Processing Analysis Platform (initially with Twitter and CrossRef connectors/workers)

ambalytics/amba-analysis-streams

amba-analysis-streams

Generic Stream Processing Framework for processing of Events related to scientific literature entities.

Setup

  1. Create .env with the following variables:
| Name | Default value | Comment |
|---|---|---|
| KAFKA_BROKER_ID | 1 | |
| KAFKA_CREATE_TOPICS | events_unlinked:1:1,events_unlinked-discusses:3:1,events_unlinked-crossref:3:1,events_linked:1:1,events_linked-discusses:3:1,events_unknown:3:1,events_processed:1:1,events_processed-discusses:3:1,events_aggregated:3:1 | |
| KAFKA_ADVERTISED_HOST_NAME | kafka | |
| KAFKA_ZOOKEEPER_CONNECT | zookeeper:2181 | |
| KAFKA_ADVERTISED_PORT | 9092 | |
| KAFKA_ADVERTISED_LISTENERS_PREFIX | PLAINTEXT:// | |
| KAFKA_BOOTRSTRAP_SERVER | kafka:9092 | |
| POSTGRES_HOST | postgres | |
| POSTGRES_PORT | 5432 | |
| POSTGRES_DB | amba | |
| POSTGRES_USER | streams | |
| POSTGRES_PASSWORD | - | omitted for security |
| TWITTER_BEARER_TOKEN | - | omitted for security; see developer.twitter.com |
| INFLUXDB_PORT | 8086 | |
| INFLUXDB_PASSWORD | - | omitted for security |
| INFLUXDB_USER | streams | |
| INFLUXDB_ORG | ambalytics | |
| INFLUXDB_BUCKET | history | |
| INFLUXDB_TOKEN | - | omitted for security; used to authenticate the aggregator |
| AWS_ACCESS_KEY_ID | - | omitted for security; used for certbot SSL DNS auth with Route53 |
| AWS_SECRET_ACCESS_KEY | - | omitted for security; used for certbot SSL DNS auth with Route53 |
| CONSUMER_KEY_TWITTER_BOT | - | omitted for security; used by the twitterbot |
| CONSUMER_SECRET_TWITTER_BOT | - | omitted for security; used by the twitterbot |
| ACCESS_TOKEN_TWITTER_BOT | - | omitted for security; used by the twitterbot |
| ACCESS_TOKEN_SECRET_TWITTER_BOT | - | omitted for security; used by the twitterbot |
  2. Make sure ports 80 and 443 are free.
  3. Optionally, get SSL certs first: comment out certbot:command in docker-compose.yml, then run docker-compose up --no-deps certbot
  4. Run the stack (comment out certbot:command to prevent log spam): docker-compose up
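
Before starting the stack, it can help to confirm that the .env file actually sets the variables that have no default. The following is only a sketch: the function name missing_env_vars is hypothetical, the checked names are a hand-picked subset of the variables listed above, and it assumes the usual NAME=value line format of a docker-compose .env file.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): print every checked
# variable that is missing from the given env file or has an empty value.
missing_env_vars() {
  env_file="$1"
  if [ ! -f "$env_file" ]; then
    echo "env file not found: $env_file" >&2
    return 1
  fi
  # Assumption: a subset of the variables from the table above.
  for name in KAFKA_BROKER_ID KAFKA_ZOOKEEPER_CONNECT POSTGRES_HOST \
              POSTGRES_DB POSTGRES_USER POSTGRES_PASSWORD \
              INFLUXDB_TOKEN TWITTER_BEARER_TOKEN; do
    # Present means: a line starts with NAME= followed by at least one character.
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      echo "$name"
    fi
  done
}
```

Calling missing_env_vars .env in the repository root prints one variable name per line; empty output means all checked variables are set.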

Update a container while keeping the rest running

  1. Merge into master and make sure the packaging was successful.
  2. docker pull <container source>
  3. docker-compose up -d --no-deps --build <container name>

Example for the perculator worker:

 docker pull ghcr.io/ambalytics/amba-analysis-worker-perculator/amba-analysis-worker-perculator:latest
 docker-compose up -d --no-deps --build twitter-perculator
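
If containers are updated often, the two commands can be wrapped in a small shell function. This is a minimal sketch; the name update_service is hypothetical and not part of this repository. It runs exactly the two commands shown above and stops if the pull fails.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of this repository) around the two update commands.
update_service() {
  image="$1"    # container source, e.g. a ghcr.io image reference
  service="$2"  # service name as defined in docker-compose.yml
  # Stop early if the pull fails, so the running container is left untouched.
  docker pull "$image" || return 1
  docker-compose up -d --no-deps --build "$service"
}
```

For the perculator example this would be called as update_service ghcr.io/ambalytics/amba-analysis-worker-perculator/amba-analysis-worker-perculator:latest twitter-perculator.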

Helpful commands:

docker stats   | live resource usage of running containers
docker ps -a   | list all containers, including stopped ones

df -h          | disk usage
free           | memory usage
