
Question: IN/ACTIVE status on NewsWebsite? #91

Closed
senoadiw opened this issue Jun 9, 2017 · 4 comments

Comments


senoadiw commented Jun 9, 2017

Hello,

Quick newbie question: I have a use case with three NewsWebsite entries that all scrape the same domain URL, with only the keyword differentiating them, like the following:

NewsWebsite 1 url is "http://www.somewebsite.com/?q=keyword1"
NewsWebsite 2 url is "http://www.somewebsite.com/?q=keyword2"
etc.

This way I can filter by keyword in the Article admin and only need to create one scraper for all of them. However, I notice the IN/ACTIVE status sits on the scraper, so setting the scraper to INACTIVE stops scraping for all NewsWebsite entries when I actually only need to disable scraping for one keyword. Is there a way to accomplish this in DDS?
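
For reference, the setup roughly follows the open_news example from the DDS docs, with all entries pointing at the same scraper (a minimal sketch; names and fields here are just illustrative, not my exact code):

```python
from django.db import models
from dynamic_scraper.models import Scraper, SchedulerRuntime


class NewsWebsite(models.Model):
    # One row per keyword; all rows reference the *same* Scraper instance,
    # only the url (and name) differ.
    name = models.CharField(max_length=200)
    url = models.URLField()
    scraper = models.ForeignKey(
        Scraper, blank=True, null=True, on_delete=models.SET_NULL)
    scraper_runtime = models.ForeignKey(
        SchedulerRuntime, blank=True, null=True, on_delete=models.SET_NULL)

    def __str__(self):
        return self.name
```

So the three entries only differ in their url, while the scraper foreign key is identical for all of them.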

Cheers

holgerd77 (Owner) commented

No, there's no way to achieve this the way you described it. Either you have to use different scrapers (you can use the CLONE action from the Django admin scraper overview page), or you have to live with the fact that scraping stops for all entries on the status change.
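
To illustrate the first option (just a rough sketch, the object and app names are made up): after cloning, each NewsWebsite entry gets its own scraper, so the status can then be toggled per keyword:

```python
# Rough sketch (names are made up, adapt to your project): after using the
# CLONE admin action you have one Scraper per keyword and can attach each
# clone to its corresponding NewsWebsite entry.
from dynamic_scraper.models import Scraper
from myapp.models import NewsWebsite  # your own reference-model app

for kw in ('keyword1', 'keyword2', 'keyword3'):
    scraper = Scraper.objects.get(name='Somewebsite Scraper %s' % kw)
    website = NewsWebsite.objects.get(url__contains='q=%s' % kw)
    website.scraper = scraper
    website.save()

# Switching off a single keyword then only affects its own scraper; normally
# you would just change the status in the admin. ('I' for INACTIVE is an
# assumption here, check the status choices in your DDS version.)
Scraper.objects.filter(name='Somewebsite Scraper keyword2').update(status='I')
```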

P.S.: This kind of question is better suited for the mailing list:
https://groups.google.com/forum/#!forum/django-dynamic-scraper

Cheers
Holger


senoadiw commented Jun 9, 2017

Ah yes, I hadn't noticed there was a CLONE scraper action in the admin. Looks like this will do. I'll make sure to post questions to the list next time. Much appreciated for the answer.

Cheers

holgerd77 (Owner) commented

You might also want to check whether pagination does the job for you (e.g. with a FREE_LIST containing your keywords), but this depends on the specific use case.
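
Roughly along these lines (only a sketch; the field names follow the pagination section of the DDS docs, shown here via the Django shell instead of the admin form, and the exact choice value for FREE_LIST should be checked against your DDS version):

```python
# Sketch: let one NewsWebsite entry cover all keywords via FREE_LIST pagination.
# Normally this is configured on the scraper form in the Django admin.
from dynamic_scraper.models import Scraper

scraper = Scraper.objects.get(name='Somewebsite Scraper')   # name is made up
scraper.pagination_type = 'FREE_LIST'                       # choice value assumed, verify in your version
scraper.pagination_append_str = '/?q={page}'                # '{page}' is replaced by each list entry
scraper.pagination_page_replace = "'keyword1', 'keyword2', 'keyword3'"
scraper.save()
```

The single NewsWebsite entry would then just carry the base URL (http://www.somewebsite.com), and the keyword gets appended per pagination step.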

Cheers
Holger


senoadiw commented Jun 9, 2017

@holgerd77 nice tip, that could come in handy instead of creating multiple NewsWebsite entries. Just put every keyword in the scraper's FREE_LIST.
