Website: http://forks-insight.com
Poster: Forks Insight: Providing an Overview of GitHub Forks
Full paper: INFOX: Identifying Features in Forks
Language: Python3
Framework: Flask
Database: MongoDB
HTTP server: uWSGI & nginx
More on Wiki Page
1. Set up the environment according to environment.yaml (or requirements.txt).
Here is an example using Anaconda:
- Install conda (Python 3 version): Download Anaconda
- Install the dependencies using environment.yaml:
conda env create -f environment.yaml
source activate p3
(p3 is the environment's name; see environment.yaml)
2. Install MongoDB & Redis
3. Edit the config (see config.py) and set the environment variables
- Check config.py
- Set the environment variables:
export GITHUB_CLIENT_ID=[your_github_oAuth_Client_ID]
export GITHUB_CLIENT_SECRET=[your_github_oAuth_Client_Secret]
export INFOX_LOCAL_DATA_PATH=[local path for storing analyzed result (like /Users/fancycoder/infox_data)]
export INFOX_SECRET_KEY=[a random string (like abcd1234)]
export INFOX_MAIL_USERNAME=[smtp_username]
export INFOX_MAIL_PASSWORD=[smtp_password]
4. Run the HTTP server on localhost:
python manage.py runserver --threaded
5. Run the worker for async crawling on localhost:
celery worker -A celery_worker.celery --loglevel=info
Use flower to monitor the worker:
celery flower --port=5555 --broker=redis://localhost:6379/0 --broker_api=redis://localhost:6379/0
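Before starting the server and the worker, it can help to confirm that the environment variables listed above are set and that MongoDB and Redis are accepting connections. The following is a minimal sketch using only the standard library, not part of the INFOX code base. The function names (`missing_vars`, `port_open`, `preflight`) are hypothetical, and the ports assume the default MongoDB (27017) and Redis (6379) configuration.

```python
import os
import socket

# Variable names taken from the export list in the setup steps.
REQUIRED_VARS = (
    "GITHUB_CLIENT_ID",
    "GITHUB_CLIENT_SECRET",
    "INFOX_LOCAL_DATA_PATH",
    "INFOX_SECRET_KEY",
    "INFOX_MAIL_USERNAME",
    "INFOX_MAIL_PASSWORD",
)

# Assumption: both services run on their default ports.
DEFAULT_SERVICES = {"mongodb": 27017, "redis": 6379}


def missing_vars(environ=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(environ=None, host="localhost", services=None):
    """Return a list of human-readable problems; an empty list means ready."""
    services = DEFAULT_SERVICES if services is None else services
    problems = [
        f"environment variable {name} is not set"
        for name in missing_vars(environ)
    ]
    problems += [
        f"{name} is not reachable on {host}:{port}"
        for name, port in services.items()
        if not port_open(host, port)
    ]
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Running the script prints one warning per problem and nothing when everything is in place. A successful TCP connection only shows that something is listening on the port, not that it is the expected service.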
Deploy on server:
A quick tutorial: A Simple Tutorial for deploying your Flask application with uWSGI + nginx on server without root permission
Another online tutorial: Serve Flask Applications with uWSGI and Nginx on Ubuntu 16.04
Note: when running in Docker containers, there is a problem connecting to Redis that we have not yet solved.
Stack Overflow posts we have looked at that discuss this problem:
- https://stackoverflow.com/questions/54965291/error-99-connecting-to-localhost6379-cannot-assign-requested-address
- https://stackoverflow.com/questions/47272072/celery-workers-unable-to-connect-to-redis-on-docker-instances?rq=1
- https://stackoverflow.com/questions/33142139/error-could-not-connect-to-redis-at-redis6379-name-or-service-not-known
- https://stackoverflow.com/questions/50818146/docker-cant-connect-to-redis-from-another-service
- https://stackoverflow.com/questions/51639652/how-to-configure-docker-to-use-redis-with-celery
- https://stackoverflow.com/questions/57461129/error-connecting-celery-to-redis-when-using-docker
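One thing worth ruling out: inside a container, localhost refers to the container itself, so a broker URL such as redis://localhost:6379/0 cannot reach a Redis instance running in a different container. Whether this is the cause of the problem described above is not confirmed. The sketch below shows one way to make the Redis host configurable instead of hard-coding it. The `REDIS_HOST` and `REDIS_PORT` variables and the `redis_url` helper are hypothetical and do not exist in the INFOX configuration.

```python
import os


def redis_url(environ=None, db=0):
    """Build a Redis URL from REDIS_HOST / REDIS_PORT (hypothetical variables).

    Falls back to localhost:6379, which matches the non-Docker setup.
    """
    env = os.environ if environ is None else environ
    host = env.get("REDIS_HOST", "localhost")
    port = int(env.get("REDIS_PORT", "6379"))
    return f"redis://{host}:{port}/{db}"


if __name__ == "__main__":
    # With REDIS_HOST unset this prints the localhost URL.
    print(redis_url())
```

Under docker-compose, a service is usually reachable from other services by its service name, so setting `REDIS_HOST` to that name (for example `redis`, if the service is called that) in the worker and web containers would yield a URL like redis://redis:6379/0.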
- Log in to the eecg.utoronto.ca server
ssh <username>@anubis.eecg.utoronto.ca
- Log in to the INFOX server
ssh infoxadm@torrent.eecg.utoronto.ca
- Change directory to INFOX folder
cd INFOX/
- Deploy the code changes
docker-compose build
- Restart docker containers
docker-compose up
./app/main - Program entry point
./app/analyse - Crawler
./app/analyse/analyser.py - Starts the crawler, runs the analysis, and loads the results into the database.
./app/analyse/compare_changes_crawler.py - Compares the diff between two repos.
./app/analyse/clone_crawler.py - Downloads a repo's source code in preparation for the keyword calculation.
./models.py - Database Model
./app/auth - Account logic
./app/templates - HTML files (related to ./app/main/views.py)
./app/static - CSS/JavaScript/image resources
./app/tests - Basic tests
./config.py - Config for Flask
./config.ini - Config for using uWSGI
./wsgi.py - Start script for uwsgi
./celery_worker.py - Start script for crawler worker
./manage.py - Start script for testing
./requirements.txt - Dependencies for pip install
./environment.yaml - Environment definition for Anaconda
Under ./app/analyse
./app/analyse/analyser.py is the entry point of the crawler.
The workflow is as follows.