
add documentation on asynchronous example of putting a file chunk by chunk? #45

Closed
deanhiller opened this issue May 2, 2012 · 2 comments

Comments

@deanhiller

I posted this on Stack Overflow, but it looks like there are no Astyanax questions there yet:

http://stackoverflow.com/questions/10406597/any-way-to-do-an-asynchronous-put-with-astyanax-for-cassandra-or-hector

I saw the ChunkedExample that takes an InputStream, but I am receiving HTTP requests chunk by chunk, and as I receive them I want to stream them into Cassandra *without* waiting for a reply from Cassandra until the very last byte, and then check all responses to make sure the whole file was written successfully.

Also, I'd then need an example of getting it back out. It looks like you may already have this streaming capability, right?

Also, is there a forum for questions? I am building an ORM with indexing (albeit Lucene indexing via Solr/Solandra) on GitHub right now, one that actually has NamedQuery but with Lucene/Solr syntax, and I am trying to find a way to add this file-streaming capability, which is a little outside the ORM. Solr then avoids the whole reverse-indexing hack, which is very custom.

thanks,
Dean

@elandau
Contributor

elandau commented May 2, 2012

I updated the chunked example, which had bad code on the read path.

Regarding the write path: are you reading chunks for a single file from a single HTTP request, or can you get the chunks out of order via multiple requests (similar to S3)? For the former, you could use ChunkedInputStream to bridge the chunked HTTP request and the object store API. You can use a high concurrency level in the ObjectWriter by calling withConcurrencyLevel() to speed up writing to Cassandra. If you combine this with token-aware connection pooling and have a lot of connections in your pool, you'd be able to get through the chunks quickly.

There is no callback in the API. It simply returns the ObjectMetadata if the write was successful, or throws an exception if not. It will do proper cleanup if it fails to write the entire block.
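The bridging idea above — HTTP chunks arriving on one thread while the object-store writer consumes a single InputStream on another — can be sketched with plain JDK piped streams. The consumer below is a hypothetical stand-in for the Astyanax writer (which would pull from the InputStream in the same way); everything else is standard library:

```java
import java.io.ByteArrayOutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ChunkBridge {
    public static void main(String[] args) throws Exception {
        // Chunks as they might arrive from a chunked HTTP request, in order.
        byte[][] httpChunks = {
            "chunk-1;".getBytes(), "chunk-2;".getBytes(), "chunk-3;".getBytes()
        };

        PipedOutputStream source = new PipedOutputStream();
        PipedInputStream sink = new PipedInputStream(source, 64 * 1024);

        ExecutorService pool = Executors.newSingleThreadExecutor();
        // Consumer thread: stands in for the object-store writer, which would
        // read from this InputStream until EOF and write chunks to Cassandra.
        Future<byte[]> stored = pool.submit(() -> {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = sink.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        });

        // Producer: push each HTTP chunk as it arrives, then close to signal EOF.
        for (byte[] chunk : httpChunks) {
            source.write(chunk);
        }
        source.close();

        byte[] result = stored.get();
        pool.shutdown();
        System.out.println(new String(result)); // chunk-1;chunk-2;chunk-3;
    }
}
```

The HTTP handler never waits on the consumer; backpressure only kicks in if the pipe's buffer fills up faster than the writer drains it.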

@deanhiller
Author

sweeeeet, love the cleanup part (was wondering about that). I have a question on the threading model, though.

Aren't the writes to Cassandra asynchronous? (On Stack Overflow you said Thrift can't do asynchronous, though I thought it could these days.) I.e., I want to avoid write (1ms delay for the response), write (another 1ms), 1000 more times, for 1000ms total. Of course, having 10 threads brings this down to 100ms. BUT I heard Thrift has asynchronous support now, so wouldn't it be better to write, write, write, then receive response, receive response, etc. (i.e., do all the writes up front with no waiting for the responses)? That would be perfect.
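The "do all the writes up front, check all responses at the end" pattern can be sketched with a thread pool and futures — this is roughly what withConcurrencyLevel() gets you internally, though the write body here is a hypothetical stand-in, not the Astyanax call:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class FireAllWrites {
    public static void main(String[] args) throws Exception {
        int chunkCount = 1000;
        ExecutorService pool = Executors.newFixedThreadPool(10);
        AtomicInteger written = new AtomicInteger();

        // Submit every chunk write up front; nothing blocks on a reply yet.
        List<Future<Integer>> pending = new ArrayList<>();
        for (int i = 0; i < chunkCount; i++) {
            final int chunkId = i;
            pending.add(pool.submit(() -> {
                // Stand-in for a blocking Cassandra write of one chunk.
                written.incrementAndGet();
                return chunkId;
            }));
        }

        // Only now collect the replies; a failed chunk write would surface
        // here as an ExecutionException from get().
        for (Future<Integer> f : pending) {
            f.get();
        }
        pool.shutdown();
        System.out.println("chunks written: " + written.get()); // chunks written: 1000
    }
}
```

With 10 threads the total wait is roughly the per-write latency times (chunks / threads), which matches the 1000ms-to-100ms arithmetic above.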

@elandau elandau closed this as completed Nov 15, 2012