Data-Tag is a system that classifies textual data and web pages using NLP techniques rather than simple keyword-based tagging. It uses NLTK to categorize data tokens into various "Word-Classes" and then applies a Word-Sense Disambiguation algorithm, backed by open data from Wikipedia, to "smartly" tag the input data.
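To give a feel for what word-sense disambiguation means, here is a minimal sketch in the style of a simplified Lesk overlap: it picks the sense whose description shares the most words with the surrounding text. This is not Data-Tag's actual algorithm. The `SENSES` table, the `STOPWORDS` set, and the `tokenize` and `disambiguate` helpers are all hypothetical names made up for illustration, and the hand-written glosses merely stand in for the Wikipedia data the project uses. The NLTK word-class step is not shown.

```python
import re

# Hypothetical sense inventory. In Data-Tag this role would be played by
# open data from Wikipedia, which is not reproduced here.
SENSES = {
    "python": {
        "Python (programming language)":
            "high level programming language used to write software code and scripts",
        "Python (snake)":
            "large non venomous snake found in africa asia and australia",
    },
}

# Small illustrative stopword list (an assumption, not the project's list).
STOPWORDS = {"a", "an", "the", "in", "of", "and", "to", "is", "was", "by", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and return its non-stopword tokens as a set."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def disambiguate(word, context):
    """Return the sense whose gloss shares the most tokens with the context.

    Returns None for words missing from SENSES. On a tie, the sense listed
    first in SENSES wins.
    """
    context_tokens = tokenize(context)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES.get(word.lower(), {}).items():
        overlap = len(context_tokens & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


if __name__ == "__main__":
    print(disambiguate("python", "She wrote the script in Python to parse code"))
    print(disambiguate("python", "The python is a snake found in Asia"))
```

A keyword-based tagger would treat both sentences the same because they contain the same word; the overlap score is what separates them here.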
After Forking the Repo into your account...
Use git clone to clone this repo to your local machine:
$ git clone https://github.com/rishy/data-tag.git
Install all the dependencies using
$ npm install
Install all the Bower components using
$ bower install
The first time, set up a virtual environment in the root directory using install.sh (install.bat for Windows):
$ chmod +x install.sh
$ ./install.sh
Keep your cool, installing all the dependencies will take a while. ;)
Note: First install Fabric to run the commands below:
$ sudo pip install fabric
To install all dependencies:
$ fab installDep
To run the app:
$ fab runapp
To run a worker:
$ fab runworker
The app runs on http://127.0.0.1:5000/
After Cloning the Repo...
Add a remote named upstream that points to this repo.
The easiest way is to use the HTTPS URL:
git remote add upstream https://github.com/rishy/data-tag.git
Or, if you have SSH set up, you can use that URL instead:
git remote add upstream git@github.com:rishy/data-tag.git
The working branch for data-tag will always be the develop branch. Hence, all the latest code will always be on the develop branch.
You should always create a new branch off the develop branch for any new piece of work:
git branch new_branch
NOTE: You must not mess with the master branch or bad things will happen. The master branch contains the latest stable code, so just leave it be.
Before starting any new piece of work, switch to the develop branch:
git checkout develop
Now fetch the latest changes from the main repo:
git fetch upstream
Merge the latest code into the develop branch:
git merge upstream/develop
Check out your newly created branch:
git checkout new_branch
Rebase new_branch onto the develop branch by running the rebase command from your current branch:
git rebase develop
Now all the changes on your current branch will be based on top of the changes in the develop branch.
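The sync steps above can be collected into one small helper. This is only a sketch: `sync_commands` and `run_sync` are hypothetical names that are not part of this repo, and the remote and branch names (upstream, develop, new_branch) are taken from the commands in this README. By default it only prints the commands, so nothing touches your working tree unless you pass `dry_run=False`.

```python
import subprocess


def sync_commands(branch, upstream="upstream", base="develop"):
    """Return the git commands for the sync steps, in the order they run."""
    return [
        ["git", "checkout", base],                # move to the develop branch
        ["git", "fetch", upstream],               # fetch latest changes from the main repo
        ["git", "merge", f"{upstream}/{base}"],   # merge them into develop
        ["git", "checkout", branch],              # go back to the work branch
        ["git", "rebase", base],                  # replay the work on top of develop
    ]


def run_sync(branch, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in sync_commands(branch):
        print("$", " ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing step (e.g. a merge conflict)
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_sync("new_branch")  # dry run: prints the five commands
```

Stopping at the first failure is deliberate: if the merge or rebase hits a conflict, you resolve it by hand before going any further.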
Push your changes to your forked repo:
git push origin new_branch
Now you can send a Pull Request to the parent repo from within GitHub.
Always squash your commits into a single commit before sending the Pull Request. Use git rebase -i for this purpose. For example, to squash the last 3 commits into a single commit, simply run:
git rebase -i HEAD~3
After any change to requirements.txt, you will have to run:
flask/bin/pip install -r requirements.txt
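The path above is the Unix layout of a virtual environment. On Windows, virtual environments keep their executables under Scripts rather than bin, so a cross-platform helper might look like the sketch below. It assumes that install.bat creates the environment under the same flask directory name, which this README does not confirm. `venv_pip` and `install_command` are hypothetical helpers, not part of this repo.

```python
import ntpath
import os
import posixpath


def venv_pip(venv_dir="flask", platform=os.name):
    """Return the path to pip inside the virtual environment.

    Assumes the environment directory is named "flask" on every platform;
    only the Unix path is confirmed by the README.
    """
    if platform == "nt":
        # Windows virtual environments place executables under Scripts
        return ntpath.join(venv_dir, "Scripts", "pip.exe")
    return posixpath.join(venv_dir, "bin", "pip")


def install_command(requirements="requirements.txt", platform=os.name):
    """Build the pip command that installs the listed requirements."""
    return [venv_pip(platform=platform), "install", "-r", requirements]


if __name__ == "__main__":
    print(" ".join(install_command()))
```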