Scraper and website for comparing scores from rally events

AWS Install

  1. Update apt-get

    $ sudo apt-get update

  2. Install compilers and libraries

    $ sudo apt-get install build-essential
    $ sudo apt-get install zlib1g-dev libbz2-dev libreadline6 libreadline6-dev libssl-dev sqlite3 libsqlite3-dev python3 python3-pip

  3. Clone repo and install dependencies

    $ git clone
    $ sudo pip3 install scrapy pymongo scrapyd-client daterangeparser

  4. Install mongodb

    $ sudo apt-key adv --keyserver hkp:// --recv 0C49F3730359A14518585931BC711F9BA15703C6
    $ echo "deb [ arch=amd64,arm64 ] xenial/mongodb-org/3.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.4.list
    $ sudo apt-get update
    $ sudo apt-get install -y mongodb-org
    $ sudo service mongod start
    $ sudo systemctl enable mongod

  5. Install scrapyd

    $ sudo -i
    # useradd -r -s /bin/false scrapy
    # mkdir /var/log/scrapyd
    # mkdir /etc/scrapyd
    # mkdir /var/lib/scrapyd
    # chown -R scrapy /var/lib/scrapyd
    # chown -R scrapy /var/log/scrapyd

Copy this into /etc/systemd/system/scrapyd.service

    [Unit]
    Description=scrapyd

    [Service]
    User=root
    ExecStart=/usr/local/bin/scrapyd -u scrapy -g nogroup -l /var/log/scrapyd/scrapyd.log


Copy this into /etc/scrapyd/scrapyd.conf

    [scrapyd]
    http_port = 6800
    debug = off
    max_proc = 4
    eggs_dir = /var/lib/scrapyd/eggs
    dbs_dir = /var/lib/scrapyd/dbs
    items_dir = /var/lib/scrapyd/items
    logs_dir = /var/log/scrapyd
    jobs_to_keep = 5
    max_proc_per_cpu = 4
    finished_to_keep = 100
    poll_interval = 5.0
    bind_address =
    runner = scrapyd.runner
    application =
    launcher = scrapyd.launcher.Launcher
    webroot =

    [services]
    schedule.json = scrapyd.webservice.Schedule
    cancel.json = scrapyd.webservice.Cancel
    addversion.json = scrapyd.webservice.AddVersion
    listprojects.json = scrapyd.webservice.ListProjects
    listversions.json = scrapyd.webservice.ListVersions
    listspiders.json = scrapyd.webservice.ListSpiders
    delproject.json = scrapyd.webservice.DeleteProject
    delversion.json = scrapyd.webservice.DeleteVersion
    listjobs.json = scrapyd.webservice.ListJobs
    daemonstatus.json = scrapyd.webservice.DaemonStatus
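Before starting the service, it can help to confirm which directories the config expects, since scrapyd needs to write to them as the `scrapy` user. The following is a minimal sketch using only the Python standard library. `SAMPLE_CONF` is a shortened copy of the config above, and `configured_dirs` is a hypothetical helper name, not part of scrapyd.

```python
import configparser

# Shortened copy of the [scrapyd] section shown above (illustration only).
SAMPLE_CONF = """\
[scrapyd]
http_port = 6800
eggs_dir = /var/lib/scrapyd/eggs
dbs_dir = /var/lib/scrapyd/dbs
items_dir = /var/lib/scrapyd/items
logs_dir = /var/log/scrapyd
"""


def configured_dirs(conf_text):
    """Return {option: path} for every *_dir option in the [scrapyd] section."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return {
        key: value
        for key, value in parser["scrapyd"].items()
        if key.endswith("_dir")
    }


# Print each configured directory so it can be compared against the
# mkdir/chown commands from the install step.
for option, path in configured_dirs(SAMPLE_CONF).items():
    print(f"{option}: {path}")
```

To check the real file instead of the sample, pass the contents of /etc/scrapyd/scrapyd.conf to `configured_dirs`.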

Start the service and verify that it works, then enable it so it starts automatically at boot

    # systemctl start scrapyd
    # systemctl status scrapyd
    # systemctl enable scrapyd

Test that you can interact with it:

    $ curl http://localhost:6800/daemonstatus.json

returns:

    {"finished": 0, "node_name": "ip-172-31-0-90", "status": "ok", "running": 0, "pending": 0}
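The same check can be scripted, for example in a health check after a reboot. This is a minimal sketch that assumes scrapyd is listening on localhost:6800 as configured above. The function names `daemon_is_ok` and `fetch_daemon_status` are hypothetical and not part of scrapyd.

```python
import json
import urllib.request


def daemon_is_ok(body):
    """Return True if a daemonstatus.json response body reports status "ok"."""
    return json.loads(body).get("status") == "ok"


def fetch_daemon_status(base_url="http://localhost:6800"):
    """Fetch the raw daemonstatus.json body from a running scrapyd instance."""
    with urllib.request.urlopen(f"{base_url}/daemonstatus.json", timeout=5) as resp:
        return resp.read().decode("utf-8")


# Usage (needs a running scrapyd):
#   print("scrapyd ok:", daemon_is_ok(fetch_daemon_status()))
```

Keeping the parsing separate from the HTTP call makes it possible to test the check against a saved response such as the one shown above.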

  1. Test deploying scrapy project

    $ cd ~/rallyscores/timecontrol
    $ scrapyd-deploy -a

  2. Register domain:

  3. Set up nodejs server

    $ curl -sL | sudo -E bash -
    $ sudo apt-get install -y nodejs
    $ npm install
    $ sudo npm install forever -g

Test running the server (sudo needed for port 80)

    $ PORT=80 sudo npm run start

Build the client on a local checkout and copy it over. (You can also do this on the server, but the build will take a while.)

    $ cd rallyscores/rallyboard-v2/client
    $ npm run build
    $ scp -r build -i /.ssh/aws-key.pem

  1. Set up indexes in mongo database

    $ mongo

    use rally
    db.ra_events.createIndex({year: -1, event_code: 1})
    db.ra_scores.createIndex({year: -1, event_code: 1})

  2. Start the server

    $ cd ~/rallyscores/rallyboard-v2
    $ sudo forever start "PORT=80 NODE_ENV=production npm run server"
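The index setup step above could also be scripted with pymongo, which is already installed as a dependency. This is a minimal sketch that assumes mongod is running on the default local port. The database and collection names are taken from the mongo shell commands above, and `ensure_indexes` is a hypothetical helper name.

```python
# Compound index spec matching {year: -1, event_code: 1}:
# -1 is descending, 1 is ascending.
INDEX_KEYS = [("year", -1), ("event_code", 1)]
COLLECTIONS = ("ra_events", "ra_scores")


def ensure_indexes(db):
    """Create the compound index on each collection and return the index names."""
    return [db[name].create_index(INDEX_KEYS) for name in COLLECTIONS]


# Usage (needs pymongo and a running mongod):
#   from pymongo import MongoClient
#   print(ensure_indexes(MongoClient()["rally"]))
```

Creating an index that already exists with the same spec is a no-op in MongoDB, so a script like this is safe to rerun.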


  • Delete all documents in a mongodb collection: db.collection.deleteMany({})

Local Testing

Run scrapyd

    > scrapyd

Deploy latest spiders

    > scrapyd-deploy -a

Run webserver (in rallyboard-v2)

    > npm start
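Once the spiders are deployed, a crawl can be started through the schedule.json endpoint listed in the scrapyd config. This is a minimal sketch using only the standard library. The project and spider names are placeholders because this README does not name them, and `build_schedule_request` is a hypothetical helper.

```python
import urllib.parse
import urllib.request


def build_schedule_request(project, spider, base_url="http://localhost:6800"):
    """Build (but do not send) a POST request for scrapyd's schedule.json."""
    data = urllib.parse.urlencode({"project": project, "spider": spider}).encode("ascii")
    return urllib.request.Request(f"{base_url}/schedule.json", data=data, method="POST")


# Usage against a running scrapyd (both names are placeholders):
#   req = build_schedule_request("my_project", "my_spider")
#   with urllib.request.urlopen(req, timeout=10) as resp:
#       print(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the URL and form data before anything is scheduled.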