Scripts to scrape data from the Registro de Deudores Alimentarios Morosos (REDAM)

Install

git clone https://github.com/ventanita/scraper_redam.git
cd scraper_redam
pip install -r requirements.txt

Ubuntu instructions

Install the build dependencies for the cryptography package:

sudo apt-get install build-essential libssl-dev libffi-dev python-dev

Then run the install steps above.

Run the scraper

cd scraper_redam
scrapy crawl redam -a start_id=1 -a end_id=3000
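
For a large ID range it can help to split the crawl into smaller runs. The sketch below only builds the command lines and does not execute them. `build_crawl_commands` and `batch_size` are hypothetical names introduced for illustration, and the code assumes `end_id` is inclusive, which this README does not state.

```python
import shlex


def build_crawl_commands(start_id, end_id, batch_size=1000):
    """Return one `scrapy crawl redam` argument list per batch of IDs.

    Assumption: end_id is inclusive. Check the spider before relying on this.
    """
    if start_id > end_id:
        raise ValueError("start_id must not exceed end_id")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    commands = []
    for first in range(start_id, end_id + 1, batch_size):
        # Clamp the last batch so it never runs past end_id.
        last = min(first + batch_size - 1, end_id)
        commands.append([
            "scrapy", "crawl", "redam",
            "-a", f"start_id={first}",
            "-a", f"end_id={last}",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands so they can be reviewed or pasted into a shell.
    for argv in build_crawl_commands(1, 3000):
        print(shlex.join(argv))
```

Each list can be passed to `subprocess.run` from inside the inner `scraper_redam` directory if you prefer to launch the batches from Python.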

Run Scrapy through another server's IP

  • Install tsocks.
  • Configure /etc/tsocks.conf:
server = 127.0.0.1
server_type = 5
server_port = 9999
  • Open an SSH SOCKS tunnel to the server: ssh -D 9999 user@server
  • Run the scraper through tsocks: tsocks scrapy crawl redam -a start_id=1 -a end_id=3000
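
Before starting a crawl through tsocks, it may be worth confirming that the SSH tunnel is actually listening on the configured port. A minimal sketch, assuming the `127.0.0.1:9999` values from the configuration above; `socks_tunnel_is_up` is a hypothetical helper, and it only checks that the port accepts TCP connections, not that a SOCKS5 handshake succeeds.

```python
import socket


def socks_tunnel_is_up(host="127.0.0.1", port=9999, timeout=2.0):
    """Return True if something accepts TCP connections on host:port.

    This does not perform a SOCKS handshake; it is only a quick liveness check.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    if socks_tunnel_is_up():
        print("Port 9999 is open; tsocks should be able to reach the tunnel.")
    else:
        print("Nothing is listening on 127.0.0.1:9999; start the ssh -D tunnel first.")
```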

For more details, see http://blog.scrapinghub.com/2010/11/12/scrapy-tsocks/