-
-
Notifications
You must be signed in to change notification settings - Fork 422
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes to make Scrapy Item class customizable via configuration #145
Conversation
The LOG_ENABLED variable is not honored in the spider (using RecursiveSpider) because it's being unset here.
Stopped the LOG_ENABLED variable from being unset
Hi, thanks for the PR. Could you please elaborate which problem these changes solve, i.e., what would be the use case when one wants to customize the ScrapyItem? Thank you! |
Hi, Sorry I should've explained better. If I want to modify the plugin to include secondary processing - let's say add a word count or something, the pipeline can't add additional fields to the current Scrapy Item object being moved forward in the pipeline. This PR is to allow a sub class of the current Scrapy Item class to be substituted via the config.cfg file. This will allow the pipeline to be customized extensively - without forking the project. |
Thanks for your elaborations. After reviewing the code, I think there are a few things that should be addressed before it can be merged into master.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see my other comment
@fhamborg I didn't think there would be much use for setting up an customized pipeline in the library mode. For me using library mode means you would use existing pipeline elements and add your own customizations programmatically, in which case you have complete control anyway. But I'll defer to you on this and add the configuration to the config_lib.cfg file. Sorry about the f strings - should've realized that would increase the library supported python version. I'll fix it too. |
Thanks for the hint. Do I understand you correctly that the changes you made initially in the PR are for the use case of being in CLI mode and having news-please set up by pypi? |
@fhamborg That is correct. |
…ter into config_lib.cfg file
@fhamborg Changes are done. Please review and let me know if there are additional issues. |
…ed with tag.string
@fhamborg Is there anything else blocking this PR? |
Sorry for the delay. Before I can merge this PR, I need to comprehensively test the changes that come with it, as they are partly in core files of news-please. |
@fhamborg Understood. Let me know if there's anything I can do to help speed things up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few minors, other than that we can merge. Thanks for your effort!
@classmethod | ||
def from_string(cls, class_name): | ||
if "." not in class_name: | ||
raise ImportError("{0} does't look like a module path".format(class_name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- please replace does't with doesn't
- Why this test condition on containing "."? IIRC, is not only
some.path.modulename
a valid path, but for example onlymodulename
is also a valid import path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please rename the file to a more descriptive name, e.g., class_loader.py? Because currently it only contains the ClassLoader class and nothing else, thus we should go with the more descriptive name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fhamborg About no2; the fully qualified name will be module.class at least. The use of a non names-paced class seemed rare enough I didn't see the value in handling that case. Do you disagree?
I'll do the other changes and update the PR soon. Apologies about the long delay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fhamborg Changes are done.
…ing for loading custom news item module and class. Fixed a typo.
Thanks for the PR and the subsequent changes you implemented. I will keep this in the master branch for a while and if no issues arise push a new version to Pypi. |
No description provided.