Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

response.json()? #2444

Closed
pawelmhm opened this issue Dec 12, 2016 · 11 comments · Fixed by #4574
Closed

response.json()? #2444

pawelmhm opened this issue Dec 12, 2016 · 11 comments · Fixed by #4574

Comments

@pawelmhm
Copy link
Contributor

pawelmhm commented Dec 12, 2016

python-requests have response.json() that decodes json body and returns appropriate Python objects. Does it make sense to have something like this in Scrapy?

@kmike
Copy link
Member

kmike commented Dec 12, 2016

@eliasdorneles proposed it here: #1729 (comment). I think that's fine to add it. Not sure if we need JsonResponse - maybe it'd be nice to have, to be able to check if json is available. But maybe it doesn't worths it, I'm not sure.

@kmike
Copy link
Member

kmike commented Dec 12, 2016

The question is whether should response.json raise AttributeError if response is not JSON; if it should (like .text, .css and .xpath) then we need JsonResponse. If it shouldn't then we don't need JsonResponse.

@pawelmhm
Copy link
Contributor Author

pawelmhm commented Dec 12, 2016

python-requests provide it by default and just fail with ValueError if body is not json. Of course their implementation is not necessarily ideal, maybe we can do better. We could check for response headers and check if they have application/json content type, but there's a risk that some responses won't have json content type and they will be json (it's probably not standard compliant but I imagine it might be possible). It would also make code somewhat more complicated, e.g.:

# some spider callback
if isinstance(response, scrapy.response.JSONResponse):
    data = response.json()
else:
   raise ValueError('expected JSON')

instead of:

data = response.json()

@eliasdorneles
Copy link
Member

eliasdorneles commented Dec 12, 2016

@kmike oops, only .text is raising AttributeError: https://github.com/scrapy/scrapy/blob/master/scrapy/http/response/__init__.py#L86-L103 -- should I change that for the case people do hasattr(response, 'css') as well?

@eliasdorneles
Copy link
Member

eliasdorneles commented Dec 12, 2016

About response.json, I'd avoid creating a new subclass (JsonResponse) if we don't have a clear use for it.

I've been thinking that the .json() method could be on the selectors as well, as a shortcut to extract() -> json.loads, but that idea has to mature a little bit. :)

@IAlwaysBeCoding
Copy link
Contributor

IAlwaysBeCoding commented Dec 14, 2016

If we do include the response.json then could we also include a method for JmesQuery. Something like this response.jmes('[::].urls').extract(). Although, I don't think we will need extract_first() and just a simple extract() will do.

@kmike
Copy link
Member

kmike commented Dec 16, 2016

Regarding jmespath - see also scrapy/parsel#27.

As for .json(), is it just a shortcut for json.loads(response.text)?

Supporting .json() it in selectors sounds good, but I'd wait for real use cases / examples for that.

@twinsant
Copy link

twinsant commented Jan 2, 2019

mark

@ejulio
Copy link
Contributor

ejulio commented Jan 30, 2019

I was thinking about .json and .jmespath this week.
I think that something like response.as_json() or response.css('script::text').as_json() would be nice, since sometimes we're handling JSON data and apis.

However, I don't think it should be only a wrapper for json.loads(response.text). Otherwise we would end up writing some code like

data = response.json()
item['field'] = data['property']['property']

But, it would be better to load the JSON as a python object and then write

data = response.json()
item['field'] = data.property.property

@tianhuil
Copy link

tianhuil commented Nov 5, 2019

Would love to have this feature.

@JochenFromm
Copy link

JochenFromm commented Mar 28, 2020

I have added a PR which is being reviewed by @elacuesta at the moment, please let me know if it goes in the right direction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants