Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Redirect codes in meta #3687

Merged
merged 14 commits into from Mar 26, 2019
Merged

[MRG+1] Redirect codes in meta #3687

merged 14 commits into from Mar 26, 2019

Conversation

@maramsumanth
Copy link
Contributor

@maramsumanth maramsumanth commented Mar 15, 2019

According to @kmike's comment in #3581 (comment)

@codecov
Copy link

@codecov codecov bot commented Mar 15, 2019

Codecov Report

Merging #3687 into master will increase coverage by 0.18%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3687      +/-   ##
==========================================
+ Coverage   84.54%   84.72%   +0.18%     
==========================================
  Files         167      168       +1     
  Lines        9420     9461      +41     
  Branches     1402     1407       +5     
==========================================
+ Hits         7964     8016      +52     
+ Misses       1199     1188      -11     
  Partials      257      257
Impacted Files Coverage Δ
scrapy/downloadermiddlewares/redirect.py 96.77% <100%> (+0.05%) ⬆️
scrapy/http/__init__.py 100% <0%> (ø) ⬆️
scrapy/http/request/json_request.py 93.75% <0%> (ø)
scrapy/settings/default_settings.py 98.64% <0%> (ø) ⬆️
scrapy/item.py 98.48% <0%> (+0.07%) ⬆️
scrapy/extensions/feedexport.py 84.9% <0%> (+6.43%) ⬆️

Loading

@@ -34,6 +34,8 @@ def _redirect(self, redirected, request, spider, reason):
redirected.meta['redirect_ttl'] = ttl - 1
redirected.meta['redirect_urls'] = request.meta.get('redirect_urls', []) + \
[request.url]
redirected.meta['redirect_codes'] = request.meta.get('redirect_codes', []) + \
Copy link
Member

@kmike kmike Mar 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in case of MetaRefreshMiddleware instead of an http code a string can be passed as a reason. What do you think about renaming this meta key to e.g. redirect_reasons?

Loading

Copy link
Contributor Author

@maramsumanth maramsumanth Mar 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah nice idea @kmike :)

Loading

@kmike
Copy link
Member

@kmike kmike commented Mar 15, 2019

Implementation looks fine, thanks @maramsumanth! Could you please add tests and docs for this feature? Meta key should be mentioned in https://doc.scrapy.org/en/latest/topics/request-response.html#request-meta-special-keys and described somewhere.

Loading

@maramsumanth
Copy link
Contributor Author

@maramsumanth maramsumanth commented Mar 15, 2019

@kmike I have made the necessary changes. If I have made any mistake in changing .rst files please do let me know.
Thanks :) 👍

Loading

@maramsumanth
Copy link
Contributor Author

@maramsumanth maramsumanth commented Mar 16, 2019

@kmike , can you please look into this :)

Loading

@@ -299,6 +299,7 @@ Those are:
* :reqmeta:`dont_merge_cookies`
* :reqmeta:`cookiejar`
* :reqmeta:`dont_cache`
* :reqmeta:`redirect_reasons`
Copy link
Member

@kmike kmike Mar 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please build docs locally and check that this link works (I think it doesn't)?

Loading

Copy link
Contributor Author

@maramsumanth maramsumanth Mar 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kmike , I just copied redirect_urls from above and replaced url with reason.
Could you please tell me how to check docs locally, because I don't know how to open rst files

Loading

Copy link
Member

@kmike kmike Mar 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loading

Copy link
Contributor Author

@maramsumanth maramsumanth Mar 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please build docs locally and check that this link works (I think it doesn't)?

@kmike could you please help me fix this.

Loading

Copy link
Member

@Gallaecio Gallaecio Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You must define the target first where you added the documentation. In this case, you could add .. reqmeta:: redirect_reasons right below .. reqmeta:: redirect_urls in downloader-middleware.rst.

Loading

@kmike
Copy link
Member

@kmike kmike commented Mar 16, 2019

@maramsumanth could you please add tests for this feature?

Loading

@maramsumanth
Copy link
Contributor Author

@maramsumanth maramsumanth commented Mar 16, 2019

@kmike , how to link redirect_reasons in request-response.rst to a link like https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#std:reqmeta-redirect_urls, so that on clicking this button it will open this link.
I am able to change the generated html file to get the desired link, but I am unable to do the same while changing this rst file.

Loading

@maramsumanth
Copy link
Contributor Author

@maramsumanth maramsumanth commented Mar 20, 2019

@kmike , I've added tests, can you help me in making this link work!
Thanks :)

Loading

@@ -299,6 +299,7 @@ Those are:
* :reqmeta:`dont_merge_cookies`
* :reqmeta:`cookiejar`
* :reqmeta:`dont_cache`
* :reqmeta:`redirect_reasons`
Copy link
Member

@Gallaecio Gallaecio Mar 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You must define the target first where you added the documentation. In this case, you could add .. reqmeta:: redirect_reasons right below .. reqmeta:: redirect_urls in downloader-middleware.rst.

Loading

@maramsumanth
Copy link
Contributor Author

@maramsumanth maramsumanth commented Mar 23, 2019

Thanks @Gallaecio

Loading

@Gallaecio Gallaecio changed the title Redirect codes in meta [MRG+1] Redirect codes in meta Mar 23, 2019
@kmike
Copy link
Member

@kmike kmike commented Mar 25, 2019

Thanks @maramsumanth!
The code and docs looks fine to me. Though could you please make one more change to the documentation, and document what can be in "redirect_reasons"? Currently docs say a possible value is HTTP status code, but it can be "meta refresh" string as well (can it be something else as well?). We should document what kind of data can users expect there (http codes as ints, "meta refresh" string constant, something else?).

Loading

@maramsumanth
Copy link
Contributor Author

@maramsumanth maramsumanth commented Mar 25, 2019

can it be something else as well?

@kmike , any other possible?

Loading

docs/topics/downloader-middleware.rst Outdated Show resolved Hide resolved
Loading
docs/topics/downloader-middleware.rst Show resolved Hide resolved
Loading
@kmike kmike merged commit 2fd8b7c into scrapy:master Mar 26, 2019
3 checks passed
Loading
@kmike
Copy link
Member

@kmike kmike commented Mar 26, 2019

Thanks @maramsumanth for fixing it, and @Gallaecio for the review! Docs can be hard to get right indeed :)

Loading

@kmike kmike added this to the v1.7 milestone Mar 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants