MartiONE/scrapy-proxy-spiders


Description

Simple spiders made using Scrapy to retrieve free proxies from websites publishing them. We use a simple judge to filter them.
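For context, a "judge" here just means a quick liveness check: requesting a known URL through each scraped proxy and keeping the ones that answer. A minimal sketch using the requests library (the judge URL and timeout are illustrative assumptions, not the project's actual judge):

import requests

def is_alive(proxy, judge_url="http://httpbin.org/ip", timeout=5.0):
    # Route a request through the proxy; keep it only if the judge answers.
    proxies = {"http": "http://" + proxy, "https": "http://" + proxy}
    try:
        return requests.get(judge_url, proxies=proxies, timeout=timeout).ok
    except requests.RequestException:
        return False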

Requirements

Python 3 and Scrapy; results are stored in a PostgreSQL database, with an optional Redis instance (see Configuration below).

Configuration

You'll need to configure the general.cfg file using this template:

Postgres database where we will store the results:

[postgres]
drivername = postgres 
host = >Postgres host<
port = >5432 by default<
username = >Postgres username<
password = >Postgres passwd<
database = >Postgres database name<
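The drivername key suggests an SQLAlchemy-style connection URL. A minimal sketch of turning this section into one with configparser (the helper below is an illustration, not the project's actual code; URL.create requires SQLAlchemy 1.4+):

import configparser
from sqlalchemy.engine import URL

def postgres_url(path="general.cfg"):
    # Read the [postgres] section and build a database URL from it.
    cfg = configparser.ConfigParser()
    cfg.read(path)
    pg = cfg["postgres"]
    return URL.create(
        drivername=pg["drivername"],
        username=pg["username"],
        password=pg["password"],
        host=pg["host"],
        port=pg.getint("port"),
        database=pg["database"],
    )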

Optionally, we can also configure a Redis database that will deliver the fresh proxies to our app:

[remote_redis]
host = >Redis host<
port = >Redis port<
password = >Redis passwd<
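For reference, connecting to that Redis instance with redis-py could look like this (the "proxies" key name is an illustrative assumption; use whatever key the pipeline actually writes):

import configparser
import redis

cfg = configparser.ConfigParser()
cfg.read("general.cfg")
rds = cfg["remote_redis"]

r = redis.Redis(host=rds["host"], port=rds.getint("port"), password=rds["password"])
for proxy in r.lrange("proxies", 0, -1):  # hypothetical key written by the pipeline
    print(proxy.decode())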

Also, if you want to run the job periodically, you can set up crontab to work with the virtualenv. Run the following commands with the virtualenv activated:

$ echo "PATH=$PATH" > myserver.cron
$ crontab -l >> myserver.cron
$ crontab myserver.cron

The crontab file will now start with something like:

PATH=/home/me/virtualenv/bin:/usr/bin:/bin:  # [etc...]

Then add your jobs, for example:

1 * * * * sh ~/Your-project-folder/cronscripts/execute_spider.sh vpnhook >>/tmp/cron_debug_log.log 2>&1

Please note that the trailing redirection (>>/tmp/cron_debug_log.log 2>&1) is for logging and debugging, and that execute_spider.sh is a generic script that takes the spider name (here, vpnhook) as its argument.

Basic Usage

This will output the results of the spider for the website proxyorca to the console:

scrapy crawl proxyorca

If we want to save the results to a file:

scrapy crawl proxyorca -o items.json
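Each spider follows Scrapy's standard pattern. A minimal sketch of what a spider like proxyorca might look like (the start URL, selectors, and item fields are illustrative assumptions, not the repository's actual code):

import scrapy

class ProxyOrcaSpider(scrapy.Spider):
    name = "proxyorca"  # the name used by scrapy crawl
    start_urls = ["https://example.com/proxy-list"]  # placeholder URL

    def parse(self, response):
        # Assume each proxy sits in a table row with IP and port cells.
        for row in response.css("table tr"):
            ip = row.css("td:nth-child(1)::text").get()
            port = row.css("td:nth-child(2)::text").get()
            if ip and port:
                yield {"ip": ip.strip(), "port": port.strip()}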

About

Scrapy proxy crawler with Python 3.
