Treat image attachments as binary blobs; tools/dump fetch attachments #246
dump, since the view won't include the data field even with attachments=true&include_docs=true
-        elif 'charset' not in params:
+        # exclude images from being treated as a string
+        # XXX: is there a better way to do this??
+        elif 'charset' not in params and 'image/' not in ctype:
I think we need to make the reverse check instead: encode any text MIME types (text/*) plus a few common ones like application/json and application/xml, and count everything else as binary.
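The reverse check suggested here could look roughly like the sketch below. The helper name `is_text_mimetype` and the whitelist are illustrative assumptions, not part of couchdb-python, and the list is deliberately not exhaustive.

```python
# Hypothetical helper; the name and the whitelist are assumptions for illustration.
TEXT_LIKE_TYPES = frozenset(['application/json', 'application/xml'])


def is_text_mimetype(mimetype):
    """Return True if content of this MIME type should be handled as text."""
    # Drop parameters such as ";charset=utf-8" and normalise the case.
    ctype = mimetype.split(';', 1)[0].strip().lower()
    # Anything under text/* counts as text, plus a few whitelisted types;
    # everything else is treated as binary.
    return ctype.startswith('text/') or ctype in TEXT_LIKE_TYPES
```

With such a check, `image/png` and `application/octet-stream` would fall through to the binary path without needing an explicit `'image/' not in ctype` test.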
@kxepal did you see my earlier comment about the mimetype check? I think your approach would be an improvement over the current code, but I'm a little bit worried about having a hard-coded list of exceptional mimetypes.
@djc oh, sorry - I didn't receive any notification about it, just this one.
No need for:
They aren't much to worry about. However, a wiser solution may be to always leave attachments as they are, in binary form. Do we have any reason to work with some of them as text?
Ah yeah, so @traxxas, are you on something pre-1.5? Are you in a position to upgrade to 1.5? As for relying on It may make sense to leave attachments in binary data, but of course the MultipartWriter doesn't have a clear notion of whether a content block represents an attachment or not. Actually, I'm not sure it even makes sense to add a
diff --git a/couchdb/multipart.py b/couchdb/multipart.py
index 9d4fc78..392db10 100644
--- a/couchdb/multipart.py
+++ b/couchdb/multipart.py
@@ -150,11 +150,6 @@ class MultipartWriter(object):
             else:
                 content = content.encode('utf-8')
                 mimetype = mimetype + ';charset=utf-8'
-        elif 'charset' not in params:
-            try:
-                content.decode('utf-8')
-            finally:
-                mimetype = mimetype + ';charset=utf-8'
         headers['Content-Type'] = mimetype
         if content:
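To illustrate why the removed branch is awkward for binary attachments, here is a small standalone sketch. The function `old_branch` is a hypothetical stand-in that only mimics the deleted lines; it is not the actual MultipartWriter code. Because of `try`/`finally`, the charset is appended whether or not decoding succeeds, and a decode failure still propagates.

```python
def old_branch(mimetype, content):
    """Mimic the removed branch for byte content that has no charset param."""
    try:
        # Only probes whether the bytes are valid UTF-8; the result is discarded.
        content.decode('utf-8')
    finally:
        # Runs on success and on failure alike.
        mimetype = mimetype + ';charset=utf-8'
    return mimetype


# The first bytes of a PNG file are not valid UTF-8.
png_header = b'\x89PNG\r\n\x1a\n'
try:
    old_branch('image/png', png_header)
    raised = False
except UnicodeDecodeError:
    raised = True
```

Under these assumptions, plain ASCII bytes get `;charset=utf-8` appended, while image bytes raise `UnicodeDecodeError`.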
@traxxas any progress?
The _all_docs view only returns attachment stubs and doesn't accept the attachments=true parameter, so I changed the dump tool to fetch full documents in the main loop. I also added a guard check to the multipart writer so that it does not attempt to decode images and does not bloat them with a utf-8 encoding. There is probably a better way to detect binary mimetypes; if pointed in the right direction, I can improve that check.
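As a rough illustration of the stub-versus-full-document distinction, the sketch below walks a document's `_attachments` mapping and yields raw bytes only for entries that carry inline base64 `data`. The helper `iter_attachment_blobs` is hypothetical and is not what tools/dump does; it assumes the document was already fetched with its attachments inlined.

```python
from base64 import b64decode, b64encode


def iter_attachment_blobs(doc):
    """Yield (name, content_type, raw_bytes) for each inline attachment."""
    for name, info in sorted(doc.get('_attachments', {}).items()):
        if 'data' not in info:
            # Stub only (as returned by a view row); there is nothing to decode.
            continue
        # Inline attachment data is base64-encoded; keep the result as bytes.
        yield name, info.get('content_type'), b64decode(info['data'])


# Example document with one inline image and one stub (made-up data).
example_doc = {
    '_id': 'example',
    '_attachments': {
        'a.png': {'content_type': 'image/png',
                  'data': b64encode(b'\x89PNG').decode('ascii')},
        'b.txt': {'content_type': 'text/plain', 'stub': True},
    },
}
blobs = list(iter_attachment_blobs(example_doc))
```

The image bytes are never decoded as text here, which is the property the guard check in the multipart writer is meant to preserve.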