Scrapyd-mongodb

Scrapyd is a fantastic open-source service for deploying and managing crawlers built with the Scrapy framework. However, its built-in queue management is implemented on top of SQLite, which becomes a problem when we need to scale.

This library is designed to replace the SQLite backend with a MongoDB backend. In other words, all queue management is done using MongoDB.

Install

You need to have MongoDB installed before using this library. Installation instructions can be found at: https://docs.mongodb.org/manual/installation/

scrapyd-mongodb is available on PyPI:

$ pip install scrapyd-mongodb

Config

To start using this library, you only need to override the application option in your scrapy.cfg file:

[scrapyd]
application = scrapyd_mongodb.application.get_application
...

If you want to customize access to the database, you can add the following options to your scrapy.cfg file:

[scrapyd]
mongodb_name = scrapyd_mongodb
mongodb_host = 127.0.0.1
mongodb_port = 27017
# Optional credentials
mongodb_user = custom_user
mongodb_pass = custompwd
...
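
To make the meaning of these options concrete, here is a minimal sketch of how they could be read and turned into a MongoDB connection URI. It is not the library's actual implementation: the `build_mongodb_uri` helper is hypothetical, and the fallback values are assumptions that simply mirror the example values above. It uses only the Python standard library, so it does not need a running MongoDB server.

```python
import configparser
from urllib.parse import quote_plus

# Sample configuration using the option names from this README.
SAMPLE_CFG = """
[scrapyd]
application = scrapyd_mongodb.application.get_application
mongodb_name = scrapyd_mongodb
mongodb_host = 127.0.0.1
mongodb_port = 27017
mongodb_user = custom_user
mongodb_pass = custompwd
"""


def build_mongodb_uri(cfg_text):
    """Hypothetical helper: build a MongoDB URI from the [scrapyd] section."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    section = parser["scrapyd"]

    # Fallbacks are assumptions for illustration, not documented defaults.
    name = section.get("mongodb_name", fallback="scrapyd_mongodb")
    host = section.get("mongodb_host", fallback="127.0.0.1")
    port = section.getint("mongodb_port", fallback=27017)

    # Credentials are optional; include them only when both are set.
    user = section.get("mongodb_user", fallback=None)
    password = section.get("mongodb_pass", fallback=None)
    credentials = ""
    if user and password:
        # Percent-encode so special characters do not break the URI.
        credentials = "{}:{}@".format(quote_plus(user), quote_plus(password))

    return "mongodb://{}{}:{}/{}".format(credentials, host, port, name)


uri = build_mongodb_uri(SAMPLE_CFG)
print(uri)
```

A URI built this way could then be passed to a MongoDB client such as `pymongo.MongoClient`. When the user and password options are omitted, the sketch produces a URI without credentials.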

Contributing

Having trouble? Have suggestions?
Report bugs or suggestions on the issue tracker.

