[bigtable] InvalidChunk #2980
Comments
It's tough to say without actually seeing the failed chunk. Can you run this with a debugger? It may be easiest to just dump it in a script and run |
Actually the value is an empty string. |
Gotcha, ISTM the chunk is equivalent to:
>>> from google.cloud.bigtable._generated import bigtable_pb2
>>> Chunk = bigtable_pb2.ReadRowsResponse.CellChunk
>>> chunk = Chunk(timestamp_micros=1485886500000000)
>>> chunk
timestamp_micros: 1485886500000000
>>> |
@tseaver The culprit is exactly as described by @junghoahnsc. I don't quite recognize the validation since it was re-worked for
UPDATE: I just verified that writing an empty cell works:
>>> row = table.row(b'row-key')
>>> row.set_cell(u'col-fam-id', b'col-name', b'')
>>> row.commit()
but then reading this row doesn't fail either (because the chunk type must be different):
>>> partial_result = table.read_row(b'row-key')
>>> cells = partial_result[u'col-fam-id'][b'col-name']
>>> cells[0].value
''
>>> cells[0].timestamp
datetime.datetime(2017, 2, 2, 22, 5, 4, 333000, tzinfo=<UTC>)
@junghoahnsc The state of the chunks is also relevant here, so I need a little more info to reproduce. |
Too much water under the bridge since June :). I think we need to ask @garye, @mbrukman, or @sduskis. The context is:
|
Hi, I am trying to run this step by step: |
The error is triggered in _validate_chunk_row_in_progress method: |
Yeah I think the chunk and timestamp check there is invalid. |
Is there any workaround until it is fixed? |
Is there any way you can validate the same read using Go or Java? Just to gain confidence that the fix I suggested will work... And/or can you just remove those checks from your local copy and verify? |
A timestamp of 0 is valid. I think that there needs to be a chunk.value. |
Timestamp of 0 is fine and empty value for chunk.value is fine, but I think "not chunk.value" is true if it's an empty string which is wrong. |
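To make the truthiness point concrete, here is a minimal sketch. `FakeChunk` and `truthiness_check_rejects` are hypothetical stand-ins written for illustration; they are not the library's classes or its actual validation code, and only model a check of the form "not chunk.timestamp_micros or not chunk.value" as discussed in this thread.

```python
class FakeChunk(object):
    """Hypothetical stand-in for a ReadRows cell chunk (not the real protobuf)."""

    def __init__(self, timestamp_micros=0, value=b""):
        self.timestamp_micros = timestamp_micros
        self.value = value


def truthiness_check_rejects(chunk):
    """Model of a truthiness-based check; an assumption, not the library code.

    Returns True when the chunk would be treated as invalid.
    """
    # `not 0` and `not b""` are both True in Python, so a zero timestamp
    # or an empty value trips this check even though the data is present.
    return (not chunk.timestamp_micros) or (not chunk.value)


# A chunk like the one reported above: timestamp set, value empty.
empty_value = FakeChunk(timestamp_micros=1485886500000000, value=b"")
# A chunk with a timestamp of 0 and a real value.
zero_timestamp = FakeChunk(timestamp_micros=0, value=b"payload")
# A chunk with both fields non-empty.
ordinary = FakeChunk(timestamp_micros=1, value=b"payload")

print(truthiness_check_rejects(empty_value))
print(truthiness_check_rejects(zero_timestamp))
print(truthiness_check_rejects(ordinary))
```

Under these assumptions, the first two chunks are rejected purely because of how Python evaluates `0` and `b""` in a boolean context, which matches the behaviour described in the comment above.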
empty values are not fine in these cases. An empty cell isn't a thing in BT. If there's no reset, there needs to be a value, IIRC. |
@junghoahnsc @yuanbaisc Can you consume the response using the low-level API?
>>> # NOT: partial_result = table.read_row(b'row-key')
>>> from google.cloud.bigtable.table import _create_row_request
>>> request_pb = _create_row_request(table.name, row_key=b'row-key')
>>> response_iterator = client._data_stub.ReadRows(request_pb)
>>> all_responses = list(response_iterator)
UPDATE: If there is private data returned, you can redact it or feel free to email me (email in my GitHub profile) |
@garye there is no issue in reading in Go/Java.
|
Regarding my earlier comment, @garye and I spoke offline. While there's no such thing as a Bigtable cell with an empty value, someone can set a filter that strips the value; that's useful in cases where all you want to do is check for the existence of a row. The check for value should be removed as well, as @garye suggested. I authored an implementation guide which may be incorrect. Sorry about that. |
IIUC, it looks like the current validation check is not correct. Is it easy to remove that? |
OS type and version
Ubuntu 14.04
Python version and virtual environment information
python --version
Python 2.7.6
google-cloud-python version
pip show google-cloud, pip show google-<service> or pip freeze
Name: google-cloud-bigtable
Version: 0.22.0
See the code example
This exception is raised every time. Did I use it incorrectly?
Thanks,