Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

@handle_urls() with item return type #84

Merged
merged 48 commits into from
Oct 27, 2022
Merged

@handle_urls() with item return type #84

merged 48 commits into from
Oct 27, 2022

Conversation

BurnzZ
Copy link
Member

@BurnzZ BurnzZ commented Sep 28, 2022

Addresses approach 1 of #77

TODO:

  • Documentation overhaul
    • docstrings
    • tutorials
      • overriding to_return item using @handle_urls
  • Consider renaming OverrideRule to something else (candidate: ApplyRule)
  • Consider renaming OverrideRule.instead_of to something else (hopefully to match the @handle_urls parameter). Or the other way around: renaming @handle_urls's overrides parameter.
  • Consider renaming handle_urls's data_type to something else (ensure that it's consistent with OverrideRule as well)
  • Consistent use of the term "item type" vs "data type" vs "item class" in docstrings, variables, and parameters.
  • Handling the case of not needing all fields (recommendation for frameworks like scrapy-poet)

@codecov
Copy link

codecov bot commented Sep 28, 2022

Codecov Report

Merging #84 (383b4f7) into master (551aa3b) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 383b4f7 differs from pull request most recent head 2c570e2. Consider uploading reports for the commit 2c570e2 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##           master      #84   +/-   ##
=======================================
  Coverage   99.81%   99.82%           
=======================================
  Files          17       18    +1     
  Lines         553      583   +30     
=======================================
+ Hits          552      582   +30     
  Misses          1        1           
Impacted Files Coverage Δ
web_poet/_typing.py 100.00% <100.00%> (ø)
web_poet/overrides.py 100.00% <100.00%> (ø)
web_poet/pages.py 100.00% <100.00%> (ø)
web_poet/rules.py 100.00% <100.00%> (ø)
web_poet/utils.py 100.00% <100.00%> (ø)

@kmike
Copy link
Member

kmike commented Sep 29, 2022

hey @BurnzZ! I think we can solve "Handling the case of not needing all fields (recommendation for frameworks like scrapy-poet)" separately. These seem to be two independent features.

web_poet/overrides.py Outdated Show resolved Hide resolved
web_poet/__init__.py Outdated Show resolved Hide resolved
web_poet/overrides.py Outdated Show resolved Hide resolved
web_poet/overrides.py Outdated Show resolved Hide resolved
web_poet/rules.py Outdated Show resolved Hide resolved
web_poet/pages.py Outdated Show resolved Hide resolved
web_poet/rules.py Outdated Show resolved Hide resolved
Copy link
Member

@Gallaecio Gallaecio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed only 11.5/27 files so far, but I think this is enough feedback for a first round of thorough review.

I won’t fight on the capitalization front, but for the record I am still against it 😅

CHANGELOG.rst Outdated Show resolved Hide resolved
docs/advanced/fields.rst Outdated Show resolved Hide resolved
docs/intro/overrides.rst Outdated Show resolved Hide resolved
docs/intro/overrides.rst Outdated Show resolved Hide resolved
docs/intro/overrides.rst Outdated Show resolved Hide resolved
docs/intro/overrides.rst Outdated Show resolved Hide resolved
docs/intro/overrides.rst Outdated Show resolved Hide resolved
docs/intro/overrides.rst Outdated Show resolved Hide resolved
docs/intro/overrides.rst Outdated Show resolved Hide resolved
BurnzZ and others added 2 commits October 25, 2022 14:15
docs/intro/overrides.rst Show resolved Hide resolved
tests/po_lib_to_return/__init__.py Show resolved Hide resolved
tests/test_rules.py Show resolved Hide resolved
tests/test_rules.py Outdated Show resolved Hide resolved
tests/test_rules.py Show resolved Hide resolved
tests/test_rules.py Show resolved Hide resolved
web_poet/rules.py Show resolved Hide resolved
web_poet/rules.py Outdated Show resolved Hide resolved
web_poet/rules.py Show resolved Hide resolved
web_poet/rules.py Show resolved Hide resolved
tests/test_rules.py Outdated Show resolved Hide resolved
Comment on lines 60 to 62
If ``instead_of=None``, this simply means that the Page Object assigned in
the ``use`` parameter will be utilized for all URLs matching the URL pattern
in ``for_patterns``. However, if ``instead_of=ReplacedPageObject``, then it
Copy link
Member

@kmike kmike Oct 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look correct: if instead_of=None, then Page Object won't be used to replace another Page Object, regardless of URL pattern. It might be used on the provided URL pattern if to_return is set, but this has nothing to do with instead_of=None, as it should work regardless of instead_of value.

So, in the example below:

@handle_urls("example.com")
class MyPage(ItemPage):
    # ...

handle_urls is a pure documentation. You can't use it instead of any other page object: instead_of is not set, so it is unknown which POs can MyPage replace. You also can't use it to return a certain data type, because the item class MyPage can return is unknown.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BurnzZ let me send a PR with some doc-related suggestions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#90

kmike and others added 7 commits October 27, 2022 01:21
I've removed the incorrect parts, and also removed some of
the correct parts :) It seems this kind of documentation belongs
somewhere else - it describes how the information stored in ApplyRule
is used by other components.
Co-authored-by: Kevin Lloyd Bernal <kevin@zyte.com>
Documentation improvements for overrides (apply rules)
@kmike
Copy link
Member

kmike commented Oct 27, 2022

Thanks @BurnzZ, great work!

@kmike kmike merged commit 5adaa28 into master Oct 27, 2022
@BurnzZ BurnzZ deleted the handle_urls-with-item branch October 27, 2022 06:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants