Skip to content
This repository has been archived by the owner on Mar 11, 2022. It is now read-only.

Document fetch fails when uses its default json decoder and module simplejson is present #409

Closed
dariosm opened this issue Oct 20, 2018 · 15 comments · Fixed by #412
Closed
Assignees
Milestone

Comments

@dariosm
Copy link

dariosm commented Oct 20, 2018

Bug Description

Document.fetch() fails when it uses its default json decoder and simplejson is in the environment.

Steps to reproduce and code sample to demonstrate the issue

To reproduce, just run a simple document fetch like this having simplejson package in the environment:

from cloudant import cloudant
from cloudant.database import CloudantDatabase
from cloudant.document import Document

def fetch_by_id(usr, pwd,url,db_name,id):
  with cloudant(user=usr, passwd=pwd, url=url) as client:
    db = CloudantDatabase(client, db_name)
    doc = Document(db, id)
    if doc.exists():
        doc.fetch()

Expected behavior

The full document should be retrieved by fetch function, instead when simplejson is reachable in the environment, it will raise the following error:

File \"/usr/local/lib/python3.6/site-packages/simplejson/__init__.py\", line 535, in loads
return cls(encoding=encoding, **kw).decode(s)
TypeError: __init__() got an unexpected keyword argument 'encoding'

What actually happened

Starting in this line: self.update(resp.json(cls=self.decoder)):

  1. json() function from requests.models is called with argument cls and value json.JSONDecoder, as self.decoder was initialized by default when no decoder is passed to Document
  2. a JSON decoder is imported selectively in requests.compat module: when simplejson is present, then the json decoder implementation comes from that package
  3. the JSON decoder from simplejson package is used to load the contents of the response
  4. now, argument cls is a json.JSONDecoder (see point 1), and it does not expect an argument called encoding, thus a TypeError is raised.

Environment details

Version affected: 2.10.0
Note that in version 2.9.0, cls=self.decoder was not passed to resp.json().

@csantanapr
Copy link

This affects users using IBM Cloud Functions (OpenWhisk/Serveless) using the python runtimes, the runtimes include multiple packages for example the requirements.txt looks like:

gevent == 1.3.6
flask == 1.0.2

# default available packages for python3action
beautifulsoup4 == 4.6.3
httplib2 == 0.11.3
kafka_python == 1.4.3
lxml == 4.2.5
python-dateutil == 2.7.3
requests == 2.19.1
scrapy == 1.5.1
simplejson == 3.16.0
virtualenv == 16.0.0
twisted == 18.7.0

# packages for numerics
numpy == 1.15.2
scikit-learn == 0.20.0
scipy == 1.1.0
pandas == 0.23.4

# packages for image processing
Pillow == 5.2.0

# IBM specific python modules
ibm_db == 2.0.9
cloudant == 2.10.0
watson-developer-cloud == 2.1.0
ibm-cos-sdk == 2.3.0
ibmcloudsql == 0.2.13

# Compose Libs
psycopg2 == 2.7.5
pymongo == 3.7.1
redis == 2.10.6
pika == 0.12.0
elasticsearch >=5.0.0,<6.0.0
cassandra-driver == 3.15.1

Like @dariosm I also I agree that the cloudant library needs to handle the 2 cases properyly were the requests library might use simplejson or might use json like @dariosm already pointed out in the request library.

@dariosm
Copy link
Author

dariosm commented Oct 20, 2018

A pretty simple workaround for this issue, that I have working live in IBM Cloud Functions, is:

from simplejson.decoder import JSONDecoder as json_decoder

from cloudant import cloudant
from cloudant.database import CloudantDatabase
from cloudant.document import Document

def fetch_by_id(usr, pwd, url, db_name, id):
  with cloudant(user=usr, passwd=pwd, url=url) as client:
    db = CloudantDatabase(client, db_name)
    doc = Document(db, id, decoder=json_decoder)
    if doc.exists():
        doc.fetch()

def update_doc_with_id(usr, pwd, url, db_name, id, doc_updates):
  with cloudant(user=usr, passwd=pwd, url=url) as client:
    db = CloudantDatabase(client, db_name)
        with Document(db, id, decoder=json_decoder) as doc:
            doc.update(doc_updates)

@ricellis
Copy link
Member

Thanks for reporting this.

The requests documentation for response.json() says:

Parameters: | **kwargs – Optional arguments that json.loads takes.

The documentation for json.loads says:

json.loads(s, *, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

and

To use a custom JSONDecoder subclass, specify it with the cls kwarg; otherwise JSONDecoder is used. Additional keyword arguments will be passed to the constructor of the class.

So what we're doing should be valid. However, I agree with your assessment, requests is preferentially swapping the standard json library for simplejson when it is available and causing the issue because the JSONDecoder classes are incompatible. This is a bug in requests (in documentation if nothing else) - noting also that the issue does not occur when running in Python 2.7.x. I have raised psf/requests#4842 and will await for feedback there before implementing any large scale workarounds here.

For reference it appears that this awkward undocumented preferential import of simplejson is slated for removal in requests version 3 as per:
psf/requests#2516 and psf/requests#3052.

@ricellis ricellis self-assigned this Oct 22, 2018
@dariosm
Copy link
Author

dariosm commented Oct 22, 2018

@ricellis I also thought about reporting to requests, but the issue occurs in a complex situation when certain versions and packages coexist in the same environment, so I decided to start here. As you already explained, may be there is nothing to change in python-cloudant since you are sticking to the documentation available in requests and even json.
Let's wait for the feedback on your issue report.
In the meantime, I believe IBM Cloud Function users can adopt my workaround if you agree. @csantanapr do you think is safe enough to announce it in our channel? Some people started today reporting the same issues in their projects.

@ricellis
Copy link
Member

@dariosm @csantanapr until we have a solution other workarounds could be:

  • Pin to python-cloudant 2.9.0 that predates our decoder change.
  • Use python 2.7.x - I wasn't able to reproduce the problem in 2.7.x (I think possibly requests doesn't use its compat.py in that env).

@csantanapr
Copy link

csantanapr commented Oct 23, 2018

@ricellis will look into reverting back to 2.9.0 the IBM Cloud Functions environment, this will buy you guys some time on releasing a new release of the lib with a fix.
IMHO the python-cloudant at 2.9.0 was working fine for a lot of users/customers, they might need or by another library as forced to have simplejson in their environment. The python-cloudant should handle the presence of simplejson gracefully, the same similar that requests@2.x doesn't brake when simplejson is present.
The python-cloudant library can detect is simplejson is present or other means detect the decoder, and handle gracefully.

The thing is the Cloud Functions environment is fixed for users using the UI interface on bluemix interface, and they don't have an option to select which version of python or python-cloudant lib.

@ricellis
Copy link
Member

will look into reverting back to 2.9.0 the IBM Cloud Functions environment, this will buy you guys some time on releasing a new release of the lib with a fix

I think that is a good plan. We will only consider engineering a workaround if requests are unable/unwilling to fix the issue.

python-cloudant should handle the presence of jsonschema gracefully

I don't understand what jsonschema has to do with anything - is that the package that requires simplejson to be present?

the same similar that requests@2.x doesn't brake when jsonschema is present.

Requests 2.x does break when simplejson is present if you pass a json.JSONDecoder.

The python-cloudant library can detect is jsonschema is present or other means detect the decoder, and handle gracefully.

We will not be special casing for jsonschema or any other JSON imports. If requests does not issue a fix for the problem then we will likely bypass their built in JSON handling entirely (which is a large scale change and hence my reluctance to do it unless requests force us to).

csantanapr added a commit to csantanapr/ibm-functions-runtime-python that referenced this issue Oct 23, 2018
jasonpet pushed a commit to ibm-functions/runtime-python that referenced this issue Oct 23, 2018
* revert back cloudant from 2.10.0 to 2.9.0
workaround with problem with cloudant lib cloudant/python-cloudant#409

* update changelog
@csantanapr
Copy link

Thanks @ricellis for the update.
Sorry I meant simplejson not jsonschema it was brain :fart

I think another option would be to have a check if simplejson AND requests@2.x then handle it special case.

@csantanapr
Copy link

One piece of data, I tried to remove simplejson from the list of requirement.txt for the python environment, but I found that many other dependencies down the dependency tree require simplejson, so the environment ends up with simplejson present.

@csantanapr
Copy link

@ricellis what's the latest status on this?
I need to provide to IBM Functions customers a new version of the cloudant lib but with a fix.

@ricellis
Copy link
Member

ricellis commented Nov 7, 2018

@csantanapr as you can see from psf/requests#4842 I proposed a fix, which they didn't accept. I asked whether they were willing to consider fixing it at all and they went quiet. I've asked again, if we don't hear anything in the next day or two we'll fallback to doing the large-scale changes to workaround in our lib.

@csantanapr
Copy link

Thanks keep us posted

@ricellis ricellis added this to the 2.next milestone Nov 15, 2018
python-cloudant Triage automation moved this from New to Done Nov 15, 2018
@csantanapr
Copy link

This is awsome thank you 🙏🏻

@ricellis
Copy link
Member

@csantanapr yw, we'll do a release with the fix real soon.

@ricellis
Copy link
Member

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

3 participants