This repository has been archived by the owner on Mar 11, 2022. It is now read-only.

Populating _id with a url results in document that cannot be deleted #32

Closed
ewann opened this issue Nov 21, 2015 · 5 comments

@ewann

ewann commented Nov 21, 2015

I don't know if this is a bug or feature request. Either way, to repro:

from cloudant.account import CouchDB
import base64
client = CouchDB('admin', 'admin', url='http://127.0.0.1:5984')
client.connect()

try:
    db = client['some_test_db']
except Exception:
    db = client.create_database('some_test_db')

unencoded = 'http://www.google.com'

encoded = base64.b64encode(unencoded)

#succeeds
data = {
    '_id': encoded,
    'someKey': 'someValue'
    }
db.create_document(data)
db[encoded]
db[encoded].delete()

#failure on document delete:
data = {
    '_id': unencoded,
    'someKey': 'someValue'
    }
db.create_document(data)
db[unencoded]
db[unencoded].delete()

alfinkel added the bug label Nov 23, 2015

@alfinkel
Contributor

@ewann, this is definitely a bug. I added an internal FogBugz case to track it, and it will likely be fixed for the beta release. For now you can work around the problem by constructing the Document yourself with a URL-encoded _id, fetching it, and then deleting it, as in:

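import urllib
from cloudant.document import Document

# Percent-encode the whole _id (including ':' and '/') before building the Document
doc = Document(db, urllib.quote(unencoded, safe=''))
doc.fetch()
doc.delete()
del doc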
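
To see what the safe='' argument changes, here is a small sketch. It assumes Python 3, where the function lives at urllib.parse.quote (the thread itself uses the Python 2 spelling urllib.quote); the variable names are illustrative only.

```python
from urllib.parse import quote  # Python 3 home of Python 2's urllib.quote

unencoded = 'http://www.google.com'

# By default '/' is treated as safe and left alone, so the _id would still
# look like several path segments once it is placed in a URL.
default_quoted = quote(unencoded)

# With safe='' every reserved character, including '/', is percent-encoded,
# so the whole _id stays a single path segment.
fully_quoted = quote(unencoded, safe='')

print(default_quoted)  # http%3A//www.google.com
print(fully_quoted)    # http%3A%2F%2Fwww.google.com
```

Only the second form keeps the slashes out of the request path, which is why the workaround passes safe='' explicitly.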

@ewann
Author

ewann commented Nov 25, 2015

Thanks. In case anyone follows along behind, here is what I suspect is the same bug, manifesting differently:

#succeeds:
import urllib
my_document = my_database[urllib.quote('Mm5kMTQ0ODQxNTY0MC45OA==', safe='')]
#my_document = my_database['Mm5kMTQ0ODQxNTY0MC45OA==']
print my_document
from cloudant.document import Document
my_document['title'] = 'Jules'
my_document['text'] = 6
my_document.save()

#fails:
my_document = my_database['Mm5kMTQ0ODQxNTY0MC45OA==']
print my_document
from cloudant.document import Document
my_document['title'] = 'Jules'
my_document['text'] = 6
my_document.save()

Backtrace:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/cloudant/document.py", line 148, in save
    self.create()
  File "/usr/lib/python2.7/site-packages/cloudant/document.py", line 110, in create
    resp.raise_for_status()
  File "/usr/lib/python2.7/site-packages/requests/models.py", line 837, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 409 Client Error: Conflict for url: http://127.0.0.1:5984/db

@alfinkel
Contributor

I'm not sure that this is exactly the same as your original issue. Here you are getting a conflict originating from the Document save(). It also looks like the conflict stems from Document create() being invoked, which should not happen if the document already exists remotely. While investigating this problem I uncovered another bug, #50, in the iteration of documents within a database object. That may or may not be the cause of your problem. @ewann, can you provide a more complete example of the entire block of code that is causing this most recent issue?

One other thing that may be an issue is my original proposed "work around".

For example when you do:

my_document = my_database[urllib.quote('Mm5kMTQ0ODQxNTY0MC45OA==', safe='')]
...
my_document = my_database['Mm5kMTQ0ODQxNTY0MC45OA==']

You are unknowingly adding two members to the locally cached my_database object: one whose key is 'Mm5kMTQ0ODQxNTY0MC45OA==' and another whose key is 'Mm5kMTQ0ODQxNTY0MC45OA%3D%3D'. Both represent the same document, which could cause problems down the road. To rectify this I am changing my initial workaround to this:

import urllib
from cloudant.document import Document

# Build the Document directly from the percent-encoded _id
doc = Document(db, urllib.quote(unencoded, safe=''))
doc.fetch()
doc.delete()
del doc

This is a safer work around that does not cause any unwanted side effects.
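
The duplicate-key problem described above can be illustrated without a server. In this sketch a plain dict stands in for the database object's local cache (an assumption made for illustration, not the library's actual implementation), and quote is the Python 3 equivalent of urllib.quote.

```python
from urllib.parse import quote  # Python 3 equivalent of urllib.quote

doc_id = 'Mm5kMTQ0ODQxNTY0MC45OA=='
encoded_id = quote(doc_id, safe='')  # '=' becomes '%3D'

# Hypothetical stand-in for the locally cached database contents.
local_cache = {}

# Looking the document up under both spellings creates two entries...
local_cache[encoded_id] = 'same remote document'
local_cache[doc_id] = 'same remote document'

# ...even though both keys refer to the same remote document.
print(encoded_id)        # Mm5kMTQ0ODQxNTY0MC45OA%3D%3D
print(len(local_cache))  # 2
```

Any later iteration over the cache would then visit what looks like two documents, which is the kind of side effect the revised workaround is meant to avoid.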

alfinkel added this to the 2.0.0b1 milestone Dec 8, 2015

@alfinkel
Contributor

alfinkel commented Dec 8, 2015

This comment is for our internal purposes to fix the bug in a future release. Not a work around.

To resolve the issue I propose that we change the Document.document_url property to wrap self._document_id in a call to urllib.quote() with the safe='' argument, so that the whole _id is percent-encoded before it is joined into the URL. As in:

    @property
    def document_url(self):
        """constructs and returns the document URL"""
        if self._document_id is None:
            return None
        return posixpath.join(
            self._database_host,
            urllib.quote_plus(self._database_name),
            urllib.quote(self._document_id, safe='')  # <-- This is the fix!!!
        )

The fix needs to be thoroughly tested and test case(s) added as well.
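
For readers who want to try the proposed URL construction without the library, here is a standalone sketch. It assumes Python 3 (urllib.parse in place of Python 2's urllib); build_document_url is a hypothetical helper written for this example and is not part of the library.

```python
import posixpath
from urllib.parse import quote, quote_plus


def build_document_url(host, database_name, document_id):
    """Join host, database name and percent-encoded document id into a URL."""
    if document_id is None:
        return None
    return posixpath.join(
        host,
        quote_plus(database_name),
        quote(document_id, safe=''),  # encode '/' too, so the id is one segment
    )


url = build_document_url(
    'http://127.0.0.1:5984', 'some_test_db', 'http://www.google.com'
)
print(url)
# http://127.0.0.1:5984/some_test_db/http%3A%2F%2Fwww.google.com
```

Without the quote() call, the slashes in the _id would be joined into the path unescaped, so the request path would contain extra segments instead of a single document id.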

@alfinkel
Contributor

alfinkel commented Jan 8, 2016

This bug is resolved and the fix should be made available with the 2.0.0b1 release.

alfinkel closed this as completed Jan 8, 2016