This repository has been archived by the owner on Aug 16, 2023. It is now read-only.

Scrapy doo #3

Merged
merged 6 commits into from
Mar 13, 2016

Conversation

ttowncompiled
Collaborator

Adds a City Spider that crawls whitelisted sites, and adds a Scrapy item to represent each crawled page.

@ttowncompiled
Collaborator Author

From here, we can hook up an item pipeline to pipe pages into Elasticsearch. There is already an outline for an item pipeline in pipelines.py. We just need to modify it to point at the Elasticsearch database.
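
A rough sketch of what that pipeline could look like is below. It is an assumption, not the contents of pipelines.py: the class name `ElasticsearchPipeline`, the settings `ELASTICSEARCH_URL` and `ELASTICSEARCH_INDEX`, and the default index name `pages` are all made up here, and the item is assumed to carry a `url` field. The `index(index=..., id=..., document=...)` call follows the keyword style of recent elasticsearch-py releases; older client versions take `body=` instead of `document=`.

```python
import hashlib


class ElasticsearchPipeline:
    """Hypothetical pipeline that indexes each crawled page."""

    def __init__(self, client, index_name="pages"):
        # `client` is any object exposing index(index=..., id=..., document=...).
        self.client = client
        self.index_name = index_name

    @classmethod
    def from_crawler(cls, crawler):
        # Imported lazily so the class can be loaded without the client installed.
        from elasticsearch import Elasticsearch

        # Both setting names are assumptions for this sketch.
        url = crawler.settings.get("ELASTICSEARCH_URL", "http://localhost:9200")
        index_name = crawler.settings.get("ELASTICSEARCH_INDEX", "pages")
        return cls(Elasticsearch(url), index_name)

    def process_item(self, item, spider):
        doc = dict(item)
        # Derive the document id from the URL so a re-crawl overwrites the
        # existing document instead of creating a duplicate.
        doc_id = hashlib.sha1(doc["url"].encode("utf-8")).hexdigest()
        self.client.index(index=self.index_name, id=doc_id, document=doc)
        # Return the item so any later pipelines still receive it.
        return item
```

Scrapy only runs pipelines listed in the `ITEM_PIPELINES` setting, so a class like this would also need an entry there, keyed by its import path.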

@destos
Member

destos commented Mar 6, 2016

If it's any help, I wrote some Scrapy pipelines a while back that save into Django models: https://github.com/destos/free-audio-books/tree/master/scrapers

@groovecoder
Member

I rebased this on my latest master, removed the *.pyc files that had been committed, and added *.pyc to .gitignore.

groovecoder added a commit that referenced this pull request Mar 13, 2016
@groovecoder merged commit 6f98b66 into master on Mar 13, 2016

3 participants