New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attachment api sucks #1251

Closed
daleharvey opened this Issue Jan 22, 2014 · 19 comments

Comments

Projects
None yet
3 participants
@daleharvey
Copy link
Member

daleharvey commented Jan 22, 2014

db.putAttachment(docId, attachmentId, rev, doc, type, [callback]);

is pretty terrible, in couchdb there has been proposals to change the api to allow

db.put(blob, id, rev) 
// or?
db.put(blob, {id: id, rev: rev});

We also have the ability to store {some: 'json', with: aBlob}, which couchdb doesnt do. We need to have compatibility with couch and I want to be as flexible as possible, but seems like we could improve this


Want to back this issue? Place a bounty on it! We accept bounties via Bountysource.

@daleharvey

This comment has been minimized.

Copy link
Member

daleharvey commented Jan 22, 2014

even removing the magic properties sounds quite nice without requiring blobs

db.put({some: 'json'}, 'mydoc', rev) // current .put
db.put({some: 'json'}, 'mydoc') // probably dissallowed, but may new_edits: false?
db.put({some: 'json'}) // current .post

// {data: 41}
db.get(id, function(obj) {
  obj.id
  obj.rev
  obj.value.data; 
})
@calvinmetcalf

This comment has been minimized.

Copy link
Member

calvinmetcalf commented Jan 29, 2014

since db.put({some: 'json', _id: 'mydoc'}); works fine if mydoc doesn't exist, db.put({some: 'json'}, 'mydoc') should be fine, plus those put versions are backwards compatible, (the first 2 are at least),

not so sure about the magic get stuff, having doc be the response is honestly the best options I think,

as for blobs, what if pouchdb just, as part of the stringifying process, just found blobs and arraybuffers and turned them into attachments, then when retrieving the document put them back? aka transparent attachments.

calvinmetcalf added a commit that referenced this issue Jan 31, 2014

calvinmetcalf added a commit that referenced this issue Feb 4, 2014

nolanlawson added a commit to nolanlawson/pouchdb that referenced this issue Feb 8, 2014

calvinmetcalf added a commit that referenced this issue Feb 8, 2014

calvinmetcalf added a commit that referenced this issue Feb 8, 2014

calvinmetcalf added a commit that referenced this issue Mar 8, 2014

-n
(#1251) - put sugar
@daleharvey

This comment has been minimized.

Copy link
Member

daleharvey commented May 9, 2014

So back on this, I think this is the last of the 'core' api that really sucks,

Problem 1: there are 50 different types of attatchments, base64, arraybuffer, blob, Uint8Array, all with different support between browser

Problem 2: different backing stores support different types of objects, chrome doesnt support blob, thats just landing but we will need to fallback for a while

Problem 3: half the backing stores implementations are broken, they use seperate encoding, utf8 / 16 etc which we patch every time

Problem 4: putattachment / getAttachment sucks

Problem 5: _attachments sucks

Problem 6: We dont store the original type of the object, if someone chucks a blob in pouch in chrome, they should get a blob back

@daleharvey

This comment has been minimized.

Copy link
Member

daleharvey commented May 9, 2014

So I think the first thing, we currently try to store blobs when we can, otherwise we use arraybuffer, can we just use arraybuffer or Uint8Array everywhere?

@daleharvey

This comment has been minimized.

Copy link
Member

daleharvey commented May 9, 2014

for Problem number 5, I think @calvinmetcalf's idea would be sweet #1251 (comment), probably should be the last to be implemented though

@daleharvey

This comment has been minimized.

Copy link
Member

daleharvey commented May 9, 2014

for Problem 4, I am thinking us adding the ability to do some type of json pointer to get (or an additional getField or something would be nice

@calvinmetcalf

This comment has been minimized.

Copy link
Member

calvinmetcalf commented May 9, 2014

so the reason we use blob is that it supports mime-types like couchdb does, we can just default non blob stuff to application/ocelot

edit: I stand by my typo

@daleharvey

This comment has been minimized.

Copy link
Member

daleharvey commented May 9, 2014

So thats related to Problem 6, if you put in an array buffer I think you should be getting an array buffer back, that would mean we need to store stuff about what the original format is, which means we can still store an arraybuffer then reconstruct the correct blob on the way back

This would be trading cleanliness of our internal api (I would like to see us get rid of blobsupport detection) against doing extra work , however we already do so much work on blobs that its likely just gonna be an improvement anyway

@daleharvey

This comment has been minimized.

Copy link
Member

daleharvey commented May 9, 2014

So I think the api issues need sorted first, if not only so we dont have to rewrite all the new tests, but also because put/getAttachment is aimed towards blobs (not sure why the type is in there, should be part of the object)

As calvin suggested:

db.put({
  _id: 'mydata', 
  foo: new Blob(...)
});

Would be nice, we can convert the underlying implementation to use _attachments at least for the http adapter (probably for all?), we also need to ensure we handle the case where the people use the traditional attachments api

db.put({
  _id: 'mydata', 
  _attachments: {
    foo: { 
      data: new Blob(...)
    }
});

If keys conflict (they put a blob in there + _attachments with the same key), we probably error, are there any use cases where people would really want attachments and properties with the same key / different value, because this makes that use case impossible, it also worries me that

  1. we have to traverse the entire object on every put, and make modifications to the object
  2. we would have an api where we do completely different things based on the type of an object
@calvinmetcalf

This comment has been minimized.

Copy link
Member

calvinmetcalf commented May 9, 2014

can you put additional keys besides content-type and data in there, ?

@daleharvey

This comment has been minimized.

Copy link
Member

daleharvey commented May 9, 2014

Couch has a few things that looks fairly tied to the http implementation (follow for multipart requests etc - https://github.com/apache/couchdb/blob/master/src/couchdb/couch_doc.erl#L226), I dont think there is anything we need to support, I was also thinking

db.put({
  _id: 'mydata', 
  _attachments: {
    foo: new Blob(...)
});

would solve our need to traverse + modify the document, and doing stuff based on the type, its not quite as nice, but its pretty reasonable and much less magic

@calvinmetcalf

This comment has been minimized.

Copy link
Member

calvinmetcalf commented May 9, 2014

one thing we might do is separate the 2 concepts

  1. storing binary data
  2. couchdb attachment api

simply going and filtering out blobs and turning them into attachments is going to be tricky especially in cases of things like

{
    foo :{
      [
         3,
         "lala",
         new Blob()
    ]
  }
}
@calvinmetcalf

This comment has been minimized.

Copy link
Member

calvinmetcalf commented May 9, 2014

db.put({
_id: 'mydata',
_attachments: {
foo: string/buffer/blob
});

could be stored in an adapter specific way with a mime type to tell you how to get it back

@daleharvey

This comment has been minimized.

Copy link
Member

daleharvey commented May 9, 2014

agreed, I was hoping to get rid of _attachments completely, but I think there is too many gotchas in that api

So the other part of the api is get/putAttachment, I think we should replace this with some jsonpointer implementation

db.put({
  _id: 'mydata', 
  _attachments: {
    foo: new Blob(...)
  }
});

db.putPointer(new Blob(), '/mydata/foo');
db.putPointer(new Blob(), 'mydata/foo', mydata.rev);

db.getPointer('mydata/foo');

It would be nice to not introduce a new method, I 'think' we could use a hack of representing a jsonpointer with a regex or a prefix, is the trailing slash ever significant in a jsonpointer?

db.put(new Blob(), /mydata/_attachments/foo/);
db.put(new Blob(), 'pointer:/mydata/foo/');

db.get(/mydata/foo/);
@calvinmetcalf

This comment has been minimized.

Copy link
Member

calvinmetcalf commented May 9, 2014

so one idea might be a custom serialization/deserialization function based around the second argument to JSON.parse/stringify

@daleharvey

This comment has been minimized.

Copy link
Member

daleharvey commented May 9, 2014

one idea for what? to get rid of _attachments?

@calvinmetcalf

This comment has been minimized.

Copy link
Member

calvinmetcalf commented May 9, 2014

so we could do put/get with slashes, that would totally work, also lets just talk on irc

@nolanlawson

This comment has been minimized.

Copy link
Member

nolanlawson commented May 9, 2014

Found an example of application/ocelot: https://gist.github.com/nolanlawson/05ca9890b56064e663e0. It's a really lean and muscular format; we should use it to encode all our attachments.

@nolanlawson

This comment has been minimized.

Copy link
Member

nolanlawson commented Apr 22, 2015

Closing as I think the attachment API is actually pretty great these days. blob-util helps iron out some rough edges, and #2858 and #3758 are good follow-up issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment