
add documentation on asynchronous example of putting a file chunk by chunk? #45

Closed
deanhiller opened this issue May 2, 2012 · 2 comments

Comments

@deanhiller

I posted this on Stack Overflow, but it looks like there are no Astyanax questions there yet:

http://stackoverflow.com/questions/10406597/any-way-to-do-an-asynchronous-put-with-astyanax-for-cassandra-or-hector

I saw the ChunkedExample that takes an InputStream, but I am receiving HTTP requests chunk by chunk, and as I receive them I want to stream them into Cassandra *without* waiting for a reply from Cassandra until the very last byte, and then check all responses to make sure the whole file was written successfully.

Also, I'd then need an example of getting it back out. It looks like you may already have this streaming capability, right?

Also, is there a forum for questions? I am building an ORM with indexing (albeit Lucene indexing via Solr/Solandra) on GitHub right now, one that actually has NamedQuery but with Lucene/Solr syntax, and I am trying to find a way to add this file-streaming capability, which is a little outside the ORM. Solr then avoids the whole reverse-indexing hack, which is very custom.

thanks,
Dean

@elandau
Contributor

elandau commented May 2, 2012

I updated the chunked example, which had bad code on the read path.

Regarding the write path: are you reading chunks for a single file from a single HTTP request, or can you get the chunks out of order via multiple requests (similar to S3)? For the former, you could use ChunkedInputStream to bridge the chunked HTTP request and the object store API. You can use a high concurrency level in the ObjectWriter by calling withConcurrencyLevel() to speed up writing to Cassandra. If you combine this with token-aware connection pooling and have a lot of connections in your pool, you'd be able to get through the chunks quickly.

There is no callback in the API. It simply returns the ObjectMetadata if the write was successful, or throws an exception if not. It will do proper cleanup if it fails to write the entire block.
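The bridging idea above — HTTP chunks arriving on one thread while the object-store writer consumes a single InputStream on another — can be sketched with plain JDK piped streams. The consumer below is a hypothetical stand-in for the Astyanax writer (which would pull from the InputStream in the same way); everything else is standard library:

```java
import java.io.ByteArrayOutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ChunkBridge {
    public static void main(String[] args) throws Exception {
        // Chunks as they might arrive from a chunked HTTP request, in order.
        byte[][] httpChunks = {
            "chunk-1;".getBytes(), "chunk-2;".getBytes(), "chunk-3;".getBytes()
        };

        PipedOutputStream source = new PipedOutputStream();
        PipedInputStream sink = new PipedInputStream(source, 64 * 1024);

        ExecutorService pool = Executors.newSingleThreadExecutor();
        // Consumer thread: stands in for the object-store writer, which would
        // read from this InputStream until EOF and write chunks to Cassandra.
        Future<byte[]> stored = pool.submit(() -> {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = sink.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        });

        // Producer: push each HTTP chunk as it arrives, then close to signal EOF.
        for (byte[] chunk : httpChunks) {
            source.write(chunk);
        }
        source.close();

        byte[] result = stored.get();
        pool.shutdown();
        System.out.println(new String(result)); // chunk-1;chunk-2;chunk-3;
    }
}
```

The HTTP handler never waits on the consumer; backpressure only kicks in if the pipe's buffer fills up faster than the writer drains it.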

@deanhiller
Author

sweeeeet, love the cleanup part (was wondering about that). I have a question on the threading model, though.

Aren't the writes to Cassandra asynchronous? (On Stack Overflow you said Thrift can't do asynchronous, though I thought it could these days.) I.e., I want to avoid write (1ms delay for the response), write (another 1ms), 1000 more times, for 1000ms total. Of course, having 10 threads brings this down to 100ms. BUT I heard Thrift has asynchronous support now, so wouldn't it be better to write, write, write, then receive response, receive response, etc. (i.e., do all the writes up front with no waiting for the responses)? That would be perfect.
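The "do all the writes up front, check all responses at the end" pattern can be sketched with a thread pool and futures — this is roughly what withConcurrencyLevel() gets you internally, though the write body here is a hypothetical stand-in, not the Astyanax call:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class FireAllWrites {
    public static void main(String[] args) throws Exception {
        int chunkCount = 1000;
        ExecutorService pool = Executors.newFixedThreadPool(10);
        AtomicInteger written = new AtomicInteger();

        // Submit every chunk write up front; nothing blocks on a reply yet.
        List<Future<Integer>> pending = new ArrayList<>();
        for (int i = 0; i < chunkCount; i++) {
            final int chunkId = i;
            pending.add(pool.submit(() -> {
                // Stand-in for a blocking Cassandra write of one chunk.
                written.incrementAndGet();
                return chunkId;
            }));
        }

        // Only now collect the replies; a failed chunk write would surface
        // here as an ExecutionException from get().
        for (Future<Integer> f : pending) {
            f.get();
        }
        pool.shutdown();
        System.out.println("chunks written: " + written.get()); // chunks written: 1000
    }
}
```

With 10 threads the total wait is roughly the per-write latency times (chunks / threads), which matches the 1000ms-to-100ms arithmetic above.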

@elandau elandau closed this as completed Nov 15, 2012