Sift NLP

Natural language processing service of the Sift app.

The other components of the Sift app live in their own repositories.

Setup

Install `pip` and `virtualenv`

Ubuntu

sudo apt-get install virtualenv

OS X

brew install virtualenv

Create a virtual environment, activate it, and install the dependencies.

virtualenv -p python2.7 venv
source ./venv/bin/activate
make init

Alternatively, setup.sh automates all of these steps on *nix systems. After installing virtualenv as described above, run the following from the root sift-nlp directory:

chmod +x setup.sh
./setup.sh

Set up Docker environment

See instructions in Sift Base for setting up a Docker environment to run Sift. If you can't use Docker for some reason, follow the instructions below for installing and running RabbitMQ and Redis manually.

Install RabbitMQ

Ubuntu

sudo apt-get update
sudo apt-get install rabbitmq-server

OS X

brew update
brew install rabbitmq

Add /usr/local/sbin to the path. That's where the RabbitMQ executables are.

export PATH=$PATH:/usr/local/sbin

Start a RabbitMQ server.

rabbitmq-server

Set up a user and vhost.

rabbitmqctl add_user sift sift
rabbitmqctl add_vhost sift
sudo rabbitmqctl set_permissions -p sift sift ".*" ".*" ".*"
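The user and vhost created above are what a client would put in its AMQP connection URL. The following is a minimal sketch, assuming the broker runs on localhost with RabbitMQ's default port 5672; `broker_url` is a hypothetical helper for illustration, not a function from this repository, and it does not percent-encode special characters.

```python
def broker_url(user, password, vhost, host="localhost", port=5672):
    """Build an AMQP URL of the form amqp://user:password@host:port/vhost.

    Hypothetical helper; host and port defaults are assumptions.
    """
    return "amqp://{0}:{1}@{2}:{3}/{4}".format(user, password, host, port, vhost)


# Credentials and vhost match the rabbitmqctl commands above.
print(broker_url("sift", "sift", "sift"))
```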

Install Redis

Ubuntu

sudo apt-get update
sudo apt-get install redis-server

OS X

brew update
brew install redis

Start a Redis server.

redis-server

Testing

For test data, go to the Amazon product review dataset and download one of the datasets to test your code against. One is enough, since these are large files.

We use the pytest testing framework, which should be installed when you run make init. If it is not, run pip install pytest after activating the virtual environment.

All tests must be kept in the tests/ directory. To run them, run make test from the directory that contains the Makefile. See test_nlp.py and test_parse.py for examples of tests.
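As a rough illustration of the shape of a pytest test, here is a minimal sketch. pytest collects files named test_*.py and functions named test_*, and a test passes when its plain assert statements hold. The `tokenize` function is a hypothetical stand-in for code under test; it is not part of Sift NLP.

```python
# Hypothetical contents of a file such as tests/test_example.py.


def tokenize(text):
    """Stand-in for a function under test: lowercase and split on whitespace."""
    return text.lower().split()


def test_tokenize_lowercases_words():
    assert tokenize("Great Product") == ["great", "product"]


def test_tokenize_empty_string():
    assert tokenize("") == []
```

In a real test the function would be imported from the package rather than defined in the test file.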

Running Without Docker

If you can't use Docker for some reason, you can manually run Sift services.

Sift NLP requires Celery, Redis, and RabbitMQ to be running. Start them with make run-celery, make run-redis, and make run-rabbitmq, respectively.
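When running the services by hand, it can help to confirm that RabbitMQ and Redis are accepting connections before starting Celery. The sketch below is one way to check, assuming both run on localhost with their default ports (5672 for RabbitMQ, 6379 for Redis); `port_open` is a hypothetical helper, not part of this repository.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Connection refused, timed out, or host unreachable.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # Default ports; adjust if your configuration differs.
    for name, port in [("RabbitMQ", 5672), ("Redis", 6379)]:
        state = "up" if port_open("localhost", port) else "down"
        print("{0} ({1}): {2}".format(name, port, state))
```

A successful TCP connection only shows that something is listening on the port; it does not verify credentials or the vhost.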

Jobs

Sift NLP is composed of jobs, which are simply functions registered with Celery so they can be run asynchronously and in parallel. You can find a simple example job in jobrunner/jobs/sample.py. Jobs accept 0 or more inputs of any JSON-serializable type, and return a JSON-serializable object.
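To make the input and output contract concrete, here is a minimal sketch of a job body. It assumes nothing about the actual jobs in this repository: `count_words` and `is_json_serializable` are hypothetical names, and the Celery registration is left out so the snippet runs with only the standard library. See jobrunner/jobs/sample.py for how a real job is registered.

```python
import json


def count_words(text):
    """Hypothetical job body: map each lowercase word to its frequency.

    Takes a string (JSON-serializable) and returns a dict of
    str -> int (also JSON-serializable).
    """
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def is_json_serializable(value):
    """Return True if json.dumps can encode the value."""
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        return False
    return True


result = count_words("good good bad")
print(is_json_serializable(result))
```

A return value such as a set or a custom object would fail this check and would need to be converted to lists, dicts, strings, or numbers first.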

