Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

multi_get doesn't accept column names #136

Closed
mheffner opened this Issue Jan 19, 2012 · 16 comments

Comments

Projects
None yet
5 participants
Contributor

mheffner commented Jan 19, 2012

From lib/cassandra/cassandra.rb, multi_get is documented to support a list of columns:

  # Multi-key version of Cassandra#get.
  #
  # This method allows you to select multiple rows with a single query.
  # If a key that is passed in doesn't exist an empty hash will be
  # returned.
  #
  # Supports the same parameters as Cassandra#get.
  #
  # * column_family   - The column_family that you are inserting into.
  # * keys            - An array of keys to select.
  # * columns         - Either a single super_column or a list of columns.
  # * sub_columns     - The list of sub_columns to select.
  # * options         - Valid options are:
  #   * :count        - The number of columns requested to be returned.
  #   * :start        - The starting value for selecting a range of columns.
  #   * :finish       - The final value for selecting a range of columns.
  #   * :reversed     - If set to true the results will be returned in reverse order.
  #   * :consistency  - Uses the default read consistency if none specified.
  #
  def multi_get(column_family, keys, *columns_and_options)

However, I can't seem to pass a list of columns to multi_get. For example:

@client.multi_get(CF, ['key1', 'key2'], ['col1', 'col2'], :consistency =>  Cassandra::Consistency::QUORUM)

Dies with: #<StandardError: Value should be a string>

/.bundle/gems/ruby/1.9.1/gems/thrift-0.7.0/lib/thrift/client.rb:35:in `write_string'
/.bundle/gems/ruby/1.9.1/gems/thrift-0.7.0/lib/thrift/client.rb:35:in `write'
/.bundle/gems/ruby/1.9.1/gems/thrift-0.7.0/lib/thrift/client.rb:35:in `send_message'
/.bundle/gems/ruby/1.9.1/gems/cassandra-0.12.1/vendor/0.8/gen-rb/cassandra.rb:107:in `send_multiget_slice'
/.bundle/gems/ruby/1.9.1/gems/cassandra-0.12.1/vendor/0.8/gen-rb/cassandra.rb:102:in `multiget_slice'

Running with DEBUG=1 fails while parsing the output with:

#<NoMethodError: undefined method `[]' for nil:NilClass>

/.bundle/gems/ruby/1.9.1/gems/thrift-0.7.0/lib/thrift/struct_union.rb:160:in `inspect_field'
/.bundle/gems/ruby/1.9.1/gems/thrift-0.7.0/lib/thrift/struct_union.rb:187:in `block in inspect_collection'
/.bundle/gems/ruby/1.9.1/gems/thrift-0.7.0/lib/thrift/struct_union.rb:186:in `each'
/.bundle/gems/ruby/1.9.1/gems/thrift-0.7.0/lib/thrift/struct_union.rb:186:in `inspect_collection'
/.bundle/gems/ruby/1.9.1/gems/thrift-0.7.0/lib/thrift/struct_union.rb:174:in `inspect_field'

@ghost ghost assigned natemueller Jul 23, 2012

Contributor

natemueller commented Jul 23, 2012

Doesn't look too hard, I'll take a stab at it. The whole get_columns vs multi_get_columns vs multi_get thing has bugged me for a while.

Contributor

christian-blades-cb commented Oct 19, 2012

If no-one has made any progress here, I'm running into the same issue and I think I see the problem.

Are there some instructions around somewhere about how to set up an environment to run the unit tests? Or better yet, has someone set up a vagrant box with a sane test environment?

Contributor

natemueller commented Oct 19, 2012

No, every time I look at this I get dragged into a black hole of refactoring. Setting up the test environment should be simple. Once you install the bundle there are rake targets to download, install and run a cassandra server for the tests.

Contributor

mheffner commented Oct 19, 2012

@christian-blades-cb I'd recommend Travis CI, they have Cassandra support.

Contributor

natemueller commented Oct 19, 2012

@jarib was trying to get the tests running on Travis but ran into a few issues with their setup. If you could get it working I'd love to merge it in.

Contributor

mheffner commented Oct 19, 2012

@natemueller @jarib Mind elaborating on what the issues were? I've setup a private repo that uses their Cassandra support for running tests.

Member

psanford commented Oct 19, 2012

@mheffner, are you running the tests against all the previous (still supported) versions of Cassandra (0.6, 0.7, 0.8. 1.0)? I believe the default Cassandra support in Travis only provides the latest version.

Contributor

natemueller commented Oct 19, 2012

Last I heard was:

I'm still working on fixes from the Travis people, but got things temporarily running by using "language: java" instead of Ruby.

He filed an issue with Travis so there's a good chance that's fixed. If you've got a config that works submit a pull request and I'll merge it in.

Contributor

jarib commented Oct 19, 2012

I'm waiting for travis-ci/travis-ci#686 to be resolved. My branch is here.

Contributor

mheffner commented Oct 19, 2012

@psanford We do not, we're just testing our code against the version they offer (1.1x) which matches our production base. Not sure what the support is for multiple versions of services on Travis is.

Contributor

jarib commented Oct 19, 2012

Looks like 0.6 and 0.7 are the only versions that require Java 6. That means we can get a Travis build working for 0.8 and 1.x, and enable the older versions when travis-ci/travis-ci#686 is resolved. I'll give that a shot.

Contributor

natemueller commented Oct 19, 2012

Thanks! Got that merged in and turned on the hooks. I'll test it out later tonight.

Contributor

christian-blades-cb commented Oct 20, 2012

@jarib It works great from my fork.

Contributor

christian-blades-cb commented Oct 20, 2012

About this ticket: It breaks down to a disagreement between the comments on get and multi_get and what _multiget actually does.

_multiget seems to be written assuming if you pass in a column or sub_column, you'll only ever want to get a single column back from the set. You can see on protocol:62 that it assumed you just passed in a single column name, and a few lines later at the bottom of the conditional, it forms a Hash, again assuming a single column.

It does the same for sub_columns in the conditional branch above.

I wanted to change that behavior to do what seems "more correct" and go ahead and sanely accept a list of columns, but that's a breaking change to the API, because I'm modifying the structure of the Hash returned.

Instead, I've modified multi_get_columns to do its thing without hitting the Cassandra cluster with a request for each passed-in key. I've also changed the comments around get and multi_get to reflect reality a bit better. twitter/cassandra#169

Contributor

natemueller commented Oct 21, 2012

Good change. Looks like you ran into some of the same problems I did and I think you found the best solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment