Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add json response #4460

Closed
wants to merge 9 commits into from
Closed

Add json response #4460

wants to merge 9 commits into from

Conversation

JochenFromm
Copy link

@JochenFromm JochenFromm commented Mar 27, 2020

My first PR for Scrapy that is supposed to fix GH #2444

Adds a JsonResponse class and a json() function similar to the one for the "requests" library here

Test manually by

# from the Scrapy package itself install Scrapy first  
python setup.py install

# then run the Scrapy shell
scrapy shell https://api.github.com/events
>> response.json

Fixes #2444

@codecov
Copy link

codecov bot commented Mar 27, 2020

Codecov Report

Merging #4460 into master will increase coverage by 0.04%.
The diff coverage is 90.90%.

@@            Coverage Diff             @@
##           master    #4460      +/-   ##
==========================================
+ Coverage   84.75%   84.79%   +0.04%     
==========================================
  Files         164      165       +1     
  Lines        9889     9900      +11     
  Branches     1469     1470       +1     
==========================================
+ Hits         8381     8395      +14     
+ Misses       1250     1248       -2     
+ Partials      258      257       -1     
Impacted Files Coverage Δ
scrapy/responsetypes.py 94.11% <ø> (ø)
scrapy/http/response/json.py 90.00% <90.00%> (ø)
scrapy/http/__init__.py 100.00% <100.00%> (ø)
scrapy/utils/misc.py 97.11% <0.00%> (+1.92%) ⬆️
scrapy/utils/defer.py 95.65% <0.00%> (+2.17%) ⬆️

Copy link
Member

@elacuesta elacuesta left a comment

Thanks for the PR! I think this is in the right direction, but some changes are needed:

  • The new response subclass should be added to the mapping in scrapy.responsetypes.ResponseTypes. I haven't tried it myself, but correctly recognizing JSON responses might not be trivial (see #4240)
  • The json method should live in scrapy.http.response.json.JsonResponse, not in scrapy.http.response.text.TextResponse
  • It probably shouldn't be decorated as a property, so it could be called as a function instead of as an attribute
  • Perhaps it could be good to cache its results? The body of a response is not supposed to change (see the replace and copy methods)

tests/test_http_response.py Outdated Show resolved Hide resolved
@JochenFromm
Copy link
Author

JochenFromm commented Mar 28, 2020

thanks for the good review @elacuesta , feedback has been applied

@JochenFromm JochenFromm mentioned this pull request Mar 28, 2020
docs/topics/request-response.rst Outdated Show resolved Hide resolved
docs/topics/request-response.rst Outdated Show resolved Hide resolved
scrapy/http/response/json.py Outdated Show resolved Hide resolved
scrapy/http/response/json.py Show resolved Hide resolved
scrapy/http/response/json.py Show resolved Hide resolved
scrapy/http/response/json.py Show resolved Hide resolved
JochenFromm added 2 commits Mar 31, 2020
Store data from json() method in cache to 
avoid performing the same operation multiple times
docs/topics/request-response.rst Outdated Show resolved Hide resolved
if self._json is None:
self._json = json.loads(self.text)
Copy link
Member

@Gallaecio Gallaecio Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💄

Because null is a valid JSON document that would be translated into Python as None, such documents would always be reloaded.

It is not that much of a issue, but it may be better to use a custom object() instance (e.g. define _NONE = object() or similar outside the class, and use _NONE instead of None as default value for _json and for the if statement). This approach is a common pattern in Python and it has a name, but I cannot remember it at the moment 🤦‍♂️

'application/json': 'scrapy.http.TextResponse',
'application/x-json': 'scrapy.http.TextResponse',
'application/json': 'scrapy.http.JsonResponse',
'application/x-json': 'scrapy.http.JsonResponse',
Copy link
Member

@kmike kmike May 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I wonder if it is better to add .json() method to TextResponse, which is contrary to what's been discussed here before. A problem with using HTTP headers is that in a real world not all APIs would return a proper Content-Type header, and .json() method wouldn't be available for them. What's the reason to only providing .json() for responses we're sure are json?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants