These components are responsible for interacting with online social activity in order to collect data for the Phoros project

OmarZOS/remote-extraction-master-and-worker

Extraction proxy and worker containers

A distributed way to collect social media data

Description

These components depend on packages from my previous repositories.

Deploying

Docker

Use this command to build and deploy the containers:

sudo docker-compose up -d

Development

The env folder contains the environment variables for each supported online social network. Depending on your implementation, some variables are global (like TwitterCredentials) and others are specific to a single service. The variable names are listed in extractors.json and initialised in constants.py, so they can be used generically without editing the code.

  • To use the variables in a large-scale extraction, initialise every variable mentioned in extractors.json after launching the shared context subsystem.
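
The relationship between extractors.json and the environment can be sketched as follows. This is only an illustration under stated assumptions: the JSON layout (a "variables" list per service), the helper names required_variables and resolve, and the names TWITTER_MAX_RESULTS and YOUTUBE_API_KEY are hypothetical and not taken from this repository. Only TwitterCredentials appears in the text above.

```python
import json
import os


def required_variables(extractors_text):
    """Return the variable names listed per service, without duplicates.

    Assumed (hypothetical) layout: {"service": {"variables": ["NAME", ...]}}.
    The real extractors.json may be structured differently.
    """
    config = json.loads(extractors_text)
    names = []
    for spec in config.values():
        for name in spec.get("variables", []):
            if name not in names:
                names.append(name)
    return names


def resolve(names, environ=None):
    """Map each name to its value, failing early if any variable is unset."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if name not in environ]
    if missing:
        raise KeyError("unset variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}


# Hypothetical sample content standing in for extractors.json.
SAMPLE = json.dumps({
    "twitter": {"variables": ["TwitterCredentials", "TWITTER_MAX_RESULTS"]},
    "youtube": {"variables": ["YOUTUBE_API_KEY"]},
})
```

Calling resolve(required_variables(SAMPLE)) before starting an extraction would raise a KeyError naming every variable that has not been initialised, which is one way to catch a missing credential before any worker starts.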

Progress

  • Proxy server.
    • Service choice depending on the query model.
    • Submit tasks to workers.
  • Choreographed extraction using context.
  • Extraction template.
    • Sending data to data transformers.
    • Environment variables to initialise the extraction.
    • Image Extraction.
    • YouTube video downloading.
  • Containerisation.
    • Automation of deployment (docker-compose).
    • Smaller footprint.

NOTES:

  • Proxy and worker components depend on the context component.
  • Workers depend on the proxy.
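
The dependencies in the notes above imply a startup order. A minimal sketch, assuming the three component names context, proxy and worker (labels chosen here for illustration, not identifiers from the repository), derives that order with the standard-library graphlib module (Python 3.9+):

```python
from graphlib import TopologicalSorter

# Each component maps to the set of components it depends on,
# as stated in the notes. The names are illustrative labels.
DEPENDS_ON = {
    "context": set(),
    "proxy": {"context"},
    "worker": {"context", "proxy"},
}


def startup_order(depends_on):
    """Return the components ordered so that dependencies come first."""
    return list(TopologicalSorter(depends_on).static_order())


print(startup_order(DEPENDS_ON))  # ['context', 'proxy', 'worker']
```

The context component therefore has to be reachable first, then the proxy, then the workers.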
