This repository has been archived by the owner on Jun 1, 2022. It is now read-only.

snc-scraper

Scraping the Auckland SNC Hockey website one symbol at a time. (http://www.aucklandsnchockey.com)

Foreword

At the time of writing, all the scraping is done with Beautiful Soup 4. There are plans to move to Scrapy later down the line.

Setup

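This repository uses Python 3, with Beautiful Soup 4 doing the scraping for now (Scrapy is the planned replacement, as noted in the Foreword). To get set up, do the following: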

Install Python 3
Set up a virtual environment with virtualenv venv, or with virtualenv -p python3 venv to select the Python 3 interpreter explicitly
Activate your virtual environment with source venv/bin/activate
cd into this repo
Run pip install -r requirements.txt
Write awesome code

Running

python src/main.py (builds not yet implemented)

Scraping in the REPL

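from bs4 import BeautifulSoup
import requests

# Replace with the page you want to scrape, e.g. the site linked at the top
url = 'http://www.aucklandsnchockey.com'
r = requests.get(url)

# The 'lxml' parser needs the lxml package to be installed
soup = BeautifulSoup(r.text, 'lxml')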
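
Once a soup object exists, most of the work is picking elements out of it. The following is only a minimal sketch of that step, not code from this repo: the sample markup, the extract_links helper, and the choice of the built-in 'html.parser' (used here so the snippet runs without lxml) are all assumptions made for illustration, since the real site's structure isn't described in this README.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real site's HTML will look different.
SAMPLE_HTML = """
<html><body>
  <h1>Fixtures</h1>
  <a href="/draws">Draws</a>
  <a href="/results">Results</a>
  <a>No href here</a>
</body></html>
"""


def extract_links(html, parser="html.parser"):
    """Return (text, href) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, parser)
    # href=True skips anchors that have no href attribute at all
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


links = extract_links(SAMPLE_HTML)
for text, href in links:
    print(text, href)
```

In the REPL, the same function could be pointed at a live page by passing r.text instead of SAMPLE_HTML.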