
A platform for universal social media data harvesting.

Description :

This platform is designed as a universal social media dataset generator.
It collects data with the NetworkExtractor according to a global
 schema that is predefined in the Model folder.
The resulting graph is passed through a Transformer class, which applies any cleaning or
 constraints to either the schema or the data.
The final graph is then published through a channel
 to be received by the listening storage services.
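The extract, transform, publish flow described above can be sketched in a few lines. This is only an illustration: the `Graph` container, the `SCHEMA` dictionary and the body of `Transformer.apply` are assumptions made for the example, not the code found in this repository.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """A tiny in-memory graph: nodes keyed by id, edges as (source, relation, target)."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)


# Hypothetical global schema: allowed node labels and their allowed attributes.
SCHEMA = {
    "User": {"name", "followers"},
    "Post": {"text"},
}


class Transformer:
    """Illustrative stand-in for the Transformer class: enforces a schema on a graph."""

    def __init__(self, schema):
        self.schema = schema

    def apply(self, graph):
        cleaned = Graph()
        for node_id, node in graph.nodes.items():
            allowed = self.schema.get(node["label"])
            if allowed is None:
                continue  # drop nodes whose label is not in the schema
            # keep only the attributes the schema allows for this label
            attrs = {k: v for k, v in node["attrs"].items() if k in allowed}
            cleaned.nodes[node_id] = {"label": node["label"], "attrs": attrs}
        # keep only edges whose two endpoints both survived the cleaning
        cleaned.edges = [
            (s, r, t) for (s, r, t) in graph.edges
            if s in cleaned.nodes and t in cleaned.nodes
        ]
        return cleaned


# Stand-in for what an extractor might return.
raw = Graph(
    nodes={
        "u1": {"label": "User", "attrs": {"name": "alice", "email": "a@example.com"}},
        "p1": {"label": "Post", "attrs": {"text": "hello"}},
        "x1": {"label": "Ad", "attrs": {"text": "buy"}},
    },
    edges=[("u1", "WROTE", "p1"), ("u1", "SAW", "x1")],
)
clean = Transformer(SCHEMA).apply(raw)
print(sorted(clean.nodes))  # ['p1', 'u1']
print(clean.edges)          # [('u1', 'WROTE', 'p1')]
```

The cleaned graph is what would then be handed to the publishing step.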

Dependencies :

Pyvis (for test visuals)

rabbitmq-server : (for results publishing/listening functionality)

sudo apt-get update && sudo apt-get upgrade
sudo apt-get install erlang
sudo apt-get install rabbitmq-server
sudo systemctl enable rabbitmq-server
sudo systemctl start rabbitmq-server
sudo rabbitmq-plugins enable rabbitmq_management

Adding an account

sudo rabbitmqctl add_user username password

Giving that user administrative rights

sudo rabbitmqctl set_user_tags username administrator
sudo rabbitmqctl set_permissions -p / username ".*" ".*" ".*"
pip install pika
# start the rabbitmq server and enable it at boot
sudo systemctl start rabbitmq-server
sudo systemctl enable rabbitmq-server
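Once the broker is running and pika is installed, publishing a graph could look roughly like the sketch below. The queue name `graphs`, the JSON wire format and both helper functions are assumptions made for illustration; the `username` / `password` pair is the account created with `rabbitmqctl add_user` above.

```python
import json


def serialize_graph(nodes, edges):
    """Encode a graph as UTF-8 JSON bytes (the wire format is an assumption)."""
    payload = {"nodes": nodes, "edges": edges}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def publish_graph(body, queue="graphs", host="localhost",
                  user="username", password="password"):
    """Publish one message to a durable queue through the default exchange."""
    import pika  # imported here so the serializer works without pika installed

    credentials = pika.PlainCredentials(user, password)
    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host=host, credentials=credentials)
    )
    try:
        channel = connection.channel()
        # declare the queue so publishing works even if no listener created it yet
        channel.queue_declare(queue=queue, durable=True)
        channel.basic_publish(
            exchange="",
            routing_key=queue,
            body=body,
            properties=pika.BasicProperties(delivery_mode=2),  # persistent message
        )
    finally:
        connection.close()


body = serialize_graph([{"id": "u1", "label": "User"}], [["u1", "WROTE", "p1"]])
# publish_graph(body)  # uncomment once rabbitmq-server is running
```

A listening storage service would declare the same queue and read from it, for example with `channel.basic_consume`.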

Neo4j :

You can use the native installation, the containerised one, or both.

native Neo4j :

sudo apt install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL | sudo apt-key add -
sudo add-apt-repository "deb stable 4.1"
sudo apt install neo4j
sudo systemctl enable neo4j.service
sudo systemctl status neo4j.service
pip install neo4j
# to interact with neo4j from CLI

Cypher Shell:

# allow remote connections
sudo nano /etc/neo4j/neo4j.conf
# add this line

Neo4j container :

#getting the image for the first time
docker pull neo4j
docker pull confluentinc/cp-server    

#running the container
docker run \
	--name testneo4j \
	-p7474:7474 -p7687:7687 \
	-d \
	-v $HOME/neo4j/data:/data \
	-v $HOME/neo4j/logs:/logs \
	-v $HOME/neo4j/import:/var/lib/neo4j/import \
	-v $HOME/neo4j/plugins:/plugins \
	--env NEO4J_AUTH=neo4j/test \
	neo4j

#optional : to run neo4j as the current user and group, replace --env NEO4J_AUTH=neo4j/test with the --user option
docker run \
... \
--user="$(id -u):$(id -g)" \

Cypher Shell:

#launch the container in interactive mode
docker exec -it testneo4j bash

#enter credentials (user,password)
cypher-shell -u neo4j -p test
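The same credentials can be used from Python through the `neo4j` driver installed earlier. The sketch below assumes the container above is listening on `bolt://localhost:7687`; the `Entity` label and both helper functions are hypothetical and only show one possible way to store edges.

```python
def edge_to_cypher(source_id, relation, target_id):
    """Build a parameterised MERGE statement for one edge."""
    # Relationship types cannot be passed as Cypher parameters,
    # so the name is validated before being interpolated into the query.
    if not relation.isidentifier():
        raise ValueError(f"unsafe relationship type: {relation!r}")
    query = (
        "MERGE (a:Entity {id: $source}) "
        "MERGE (b:Entity {id: $target}) "
        f"MERGE (a)-[:{relation}]->(b)"
    )
    return query, {"source": source_id, "target": target_id}


def store_edges(edges, uri="bolt://localhost:7687", auth=("neo4j", "test")):
    """Write (source, relation, target) triples to Neo4j, one MERGE per edge."""
    from neo4j import GraphDatabase  # imported here so the query builder has no dependency

    driver = GraphDatabase.driver(uri, auth=auth)
    try:
        with driver.session() as session:
            for source_id, relation, target_id in edges:
                query, params = edge_to_cypher(source_id, relation, target_id)
                session.run(query, params)
    finally:
        driver.close()


query, params = edge_to_cypher("u1", "WROTE", "p1")
# store_edges([("u1", "WROTE", "p1")])  # uncomment once the container is running
```

Using `MERGE` rather than `CREATE` keeps repeated runs from duplicating nodes and relationships.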

Usage :

  • Networking : Go to the publishing service to verify your channels.
  • Schema models : Visit API_Models/ to view/modify schema models.
  • Execution : Go to the file and launch it

Phoros :

This repository has evolved into a cloud service for distributed online social data extraction. It is no longer supported and now serves as an abstract package for the Phoros variants. You can visit our roadmap; the following repositories are linked to the Phoros project:

Progress (for the Phoros project):

  • Current code consistency.
  • Multithreading.
  • Containerisation.
  • Natural language processing.
  • Graph oriented database storage.
  • Document oriented database storage.
  • Infrastructure as code.
  • Database as a service.
  • Communications as a service.
  • Twitter support.
  • YouTube support.
  • LinkedIn support.
  • Facebook support.
  • Instagram support.
  • Graphical user interface.
  • Encrypting network circulating data.