Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating flags in Response.follow and TextResponse.follow #4279

Merged
merged 13 commits into from Feb 10, 2020

Conversation

Prime-5
Copy link
Contributor

@Prime-5 Prime-5 commented Jan 19, 2020

In reference to #4277, added "flags" parameter to Response.follow and TextResponse.follow

(edit) Fixes #4277

@codecov
Copy link

@codecov codecov bot commented Jan 19, 2020

Codecov Report

Merging #4279 into master will decrease coverage by 0.41%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #4279      +/-   ##
==========================================
- Coverage   84.14%   83.72%   -0.42%     
==========================================
  Files         166      165       -1     
  Lines        9761     9893     +132     
  Branches     1462     1469       +7     
==========================================
+ Hits         8213     8283      +70     
- Misses       1295     1354      +59     
- Partials      253      256       +3     
Impacted Files Coverage Δ
scrapy/robotstxt.py 75.30% <0.00%> (-22.23%) ⬇️
scrapy/utils/test.py 49.35% <0.00%> (-8.99%) ⬇️
scrapy/utils/ftp.py 23.80% <0.00%> (-6.20%) ⬇️
scrapy/pipelines/files.py 61.66% <0.00%> (-3.99%) ⬇️
scrapy/utils/defer.py 95.00% <0.00%> (-2.50%) ⬇️
scrapy/core/downloader/__init__.py 89.39% <0.00%> (-1.45%) ⬇️
scrapy/core/downloader/handlers/datauri.py 93.33% <0.00%> (-0.79%) ⬇️
scrapy/crawler.py 89.26% <0.00%> (-0.36%) ⬇️
scrapy/utils/log.py 88.76% <0.00%> (-0.13%) ⬇️
scrapy/signals.py 100.00% <0.00%> (ø) ⬆️
... and 16 more

@elacuesta
Copy link
Member

@elacuesta elacuesta commented Jan 19, 2020

Hello, and many thanks for the contribution. I have two comments:

  1. Could you add the flags parameter to the end? I know it would make Request.__init__ and Response.follow have their parameters sorted differently, but it's possible that some users are already passing them as positional argument and this would break backward compatibility.
  2. Could you add a test case for this change? I'm aware that the current tests only check the URLs for the generated requests, but I believe it shouldn't be too hard to do.

I think the first comment should be addressed before approval. I would be ok with leaving the second one to a separate PR, since it's not a matter of just adding one line or test function to an existing test case.

@elacuesta
Copy link
Member

@elacuesta elacuesta commented Jan 20, 2020

Thanks for the update. One additional comment I forgot to make the first time: flags also need to be passed when creating the Request and when invoking the method from the parent class.

@Prime-5
Copy link
Contributor Author

@Prime-5 Prime-5 commented Jan 20, 2020

Yes I was about to ask the same, on it.

@Prime-5
Copy link
Contributor Author

@Prime-5 Prime-5 commented Jan 20, 2020

I'll add the test case in a separate PR.
For adding the test case, I will have to add a function that tests the flags in test_http_response, right?

@elacuesta
Copy link
Member

@elacuesta elacuesta commented Jan 20, 2020

Thanks. Indeed, test_http_response.py would be the appropriate place for it IMHO.

@kmike
Copy link
Member

@kmike kmike commented Jan 23, 2020

@Prime-5 we usually do tests together with changes/features; if you intend to send a PR with tests anyways, would you mind adding commits to this PR instead?

@Prime-5
Copy link
Contributor Author

@Prime-5 Prime-5 commented Jan 23, 2020

Hello @kmike, I've added a test case but I'm not sure why there's a conflict.

@Prime-5
Copy link
Contributor Author

@Prime-5 Prime-5 commented Jan 23, 2020

The updated __init__.py which has a new function follow_all calls the follow function without the flags parameter. Could this be the reason behind the conflict in the first file?

@elacuesta
Copy link
Member

@elacuesta elacuesta commented Jan 23, 2020

Yes, that is almost certainly what's happening, it should be solved by adding the flags to the follow_all methods as well. When I started #4057 we weren't aware of this issue.

@Prime-5
Copy link
Contributor Author

@Prime-5 Prime-5 commented Jan 23, 2020

I've ran into a little trouble. I'm sorry for asking this here because it's not related to scrapy but to the workflow of github.
I updated my fork by committing the new changes in the original scrapy codebase to my master branch. I can view all the new changes only in my master, and my commits are only there in the update_flags branch. I tried merging them in my fork but that's not working. Is there any way I can have simultaneous access to both parts of code?

@elacuesta
Copy link
Member

@elacuesta elacuesta commented Jan 24, 2020

My personal setup involves having two git remotes, origin pointing to my fork and upstream pointing to the source repo. When I need to get commits from the upstream repo I usually do git fetch upstream followed by git merge upstream/master.

Looking at the commit history from your fork, I'd recommend you reset your history, something like:

git remote add upstream git@github.com:scrapy/scrapy.git
git fetch upstream
git checkout master
git reset --hard upstream/master

Then you can merge the changes from master into your feature branch:

git checkout update_flags
git merge master

At this point, you will have some conflicts, which you can solve by following this guide.

Hope this helps, let us know if you still have problems. Cheers!

@Prime-5
Copy link
Contributor Author

@Prime-5 Prime-5 commented Jan 24, 2020

@elacuesta thanks a lot for the patient reply :)
I've added the flags parameter to the follow_on() function as well.

@Prime-5
Copy link
Contributor Author

@Prime-5 Prime-5 commented Jan 24, 2020

The error is in the test function. For calling the Response.follow() should I be using self.response_class.follow()?

@elacuesta
Copy link
Member

@elacuesta elacuesta commented Jan 24, 2020

follow and follow_all are instance methods, they will fail if you call them directly from the class.
You need to create a response object (you can use self.response_class for this) object and then call follow/follow_all on it.

@Prime-5
Copy link
Contributor Author

@Prime-5 Prime-5 commented Jan 25, 2020

Should I add tests for testing the flags parameter in Response.follow_all() and TextResponse.follow_all()?

Copy link
Member

@elacuesta elacuesta left a comment

Could you update the TextResponse.follow_all method as well? Since the follow_all feature has not been released yet I think it's ok to add the parameter before the css and xpath ones.

@Prime-5
Copy link
Contributor Author

@Prime-5 Prime-5 commented Feb 1, 2020

Hello, sorry for the delay, I was caught up in my university exams. I have now added the parameter in TextResponse.follow_all()

@elacuesta
Copy link
Member

@elacuesta elacuesta commented Feb 5, 2020

Thanks for the update. While we're at it, could you add a test to make sure follow_all methods work as expected?

@Prime-5
Copy link
Contributor Author

@Prime-5 Prime-5 commented Feb 8, 2020

I've added tests to check flags in follow_all() over a list of three urls. Should we add more tests for more comprehensive testing?

@Prime-5
Copy link
Contributor Author

@Prime-5 Prime-5 commented Feb 9, 2020

@elacuesta should there be a feature to allow returning different flag values over succeeding urls when called from follow_all()?
I can't point out any specific scenario right away where it might be useful but I thought it might be worth suggesting.

@elacuesta
Copy link
Member

@elacuesta elacuesta commented Feb 10, 2020

I don't think that is necessary, all other parameters (cookies, headers, etc) are passed to all generated requests.

@Gallaecio Gallaecio merged commit 4626e90 into scrapy:master Feb 10, 2020
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants