Only data for one column is returned when reading row with same qualifier in multiple families #9870

Closed
erifol opened this issue Feb 13, 2023 · 6 comments · Fixed by #9921
Labels: api: bigtable · priority: p2 · status: investigating · type: bug

Comments

@erifol

erifol commented Feb 13, 2023

Environment details

  • OS: macOS Big Sur
  • .NET version: 7
  • Package name and version: Google.Cloud.Bigtable.V2 3.3.0

Steps to reproduce

  1. Create a table with two column families
  2. Create a row with the same column qualifier appearing in both column families
  3. Try to read the row - only data for one of the columns is returned

The issue seems to be related to this check in RowAsyncEnumerator:

    if (chunk.Qualifier != null && chunk.Qualifier != owner._currentCell.Column?.Qualifier)

It seems to me that the cell chunk data is not handled when two cell chunks with different families but equal qualifiers appear consecutively.

@jskeet
Collaborator

jskeet commented Feb 13, 2023

Assigning to Rishabh to investigate. (I'd have a look myself, but I'm off sick at the moment.)

jskeet added the status: investigating and api: bigtable labels on Feb 13, 2023
@Rishabh-V
Contributor

I'll have a look at it first thing tomorrow morning IST and share my investigation. Thanks.

@erifol
Author

erifol commented Feb 13, 2023

I tried to write a simple program to reproduce the problem. Hope it is helpful.

        var tableId = "TestTable";
        var table = new Table
        {
            Granularity = Table.Types.TimestampGranularity.Millis,
            ColumnFamilies =
            {
                { "MyFamily1", new ColumnFamily { GcRule = new GcRule { MaxNumVersions = 10 } } },
                { "MyFamily2", new ColumnFamily { GcRule = new GcRule { MaxNumVersions = 10 } } }
            }
        };

        try
        {
            await _tableAdminClient.CreateTableAsync(
                parent: new InstanceName(_settings.ProjectId, _settings.InstanceId),
                tableId,
                table);
        }
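        catch (RpcException e) when (e.StatusCode == StatusCode.AlreadyExists)
        {
            // Ignored: the table already exists. (RpcException and StatusCode are in Grpc.Core.)
        }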

        var tableName = TableName.FromProjectInstanceTable(_settings.ProjectId, _settings.InstanceId, tableId);
        var rowKey = ByteString.CopyFromUtf8("row-key-0");

        var mutateRowRequest = new MutateRowRequest()
        {
            TableNameAsTableName = tableName,
            RowKey = rowKey,
            Mutations =
            {
                Mutations.SetCell(
                    "MyFamily1", 
                    ByteString.CopyFromUtf8("my-qualifier"),
                    ByteString.CopyFromUtf8("in-family-1")),
                Mutations.SetCell(
                    "MyFamily2", 
                    ByteString.CopyFromUtf8("my-qualifier"),
                    ByteString.CopyFromUtf8("in-family-2"))
            }
        };

        await _bigtableClient.MutateRowAsync(mutateRowRequest);

        var row = await _bigtableClient.ReadRowAsync(tableName, rowKey);

        foreach (var family in row.Families)
        {
            foreach (var column in family.Columns)
            {
                Console.WriteLine(column.Cells.First().Value.ToStringUtf8());
            }
        }

The program only prints

    in-family-1

but I would expect it to also print in-family-2. If I change the name of the qualifier in the second column family to my-qualifier-2, the program prints

    in-family-1
    in-family-2

@Rishabh-V
Contributor

Right. I wrote a similar sample app and was able to reproduce this issue.

Besides the case with different qualifiers, I observed that all the column values are also printed when the first family contains another cell with a different qualifier.

I will further investigate tomorrow and update when I know more about it. Thanks.

jskeet added the type: bug and priority: p2 labels on Feb 14, 2023
@igorbernstein2

It looks like the problem is in how a family change is handled in these lines:

    if (chunk.FamilyName != null)
    {
        if (chunk.FamilyName != owner._currentCell.Family?.Name)
        {
            owner._currentCell.Family = new Family { Name = chunk.FamilyName };
            Debug.Assert(!owner._currentFamilies.ContainsKey(chunk.FamilyName));
            owner._currentFamilies[chunk.FamilyName] = owner._currentCell.Family;
        }
        owner.Assert(chunk.Qualifier != null, "NewCell has a familyName, but no qualifier");
    }

When the family changes, the _currentCell contents are not fully reset; only the family info is. So the specific tweak for this bug would be to reset _currentCell.Column when FamilyName changes. One possible implementation: set a familyChanged flag in the body of the family-change if statement, and use it as an additional "or" condition in the column if statement.
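
To illustrate the familyChanged idea, here is a minimal sketch on a simplified stand-in for the chunk-merging logic. It is not the actual RowAsyncEnumerator code: the Column, Family, Chunk and ChunkMergeSketch types and the resetColumnOnFamilyChange parameter are all hypothetical, and the sketch assumes that a null FamilyName or Qualifier means "unchanged since the previous chunk" and that the first chunk always names a family.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified model; none of these types come from Google.Cloud.Bigtable.V2.
public sealed class Column
{
    public string Qualifier;
    public List<string> Cells = new List<string>();
}

public sealed class Family
{
    public string Name;
    public List<Column> Columns = new List<Column>();
}

// Assumption: a null FamilyName or Qualifier means "unchanged since the previous chunk".
public sealed class Chunk
{
    public string FamilyName;
    public string Qualifier;
    public string Value;
}

public static class ChunkMergeSketch
{
    // Assumption: the first chunk always carries both a family name and a qualifier.
    public static List<Family> Merge(IEnumerable<Chunk> chunks, bool resetColumnOnFamilyChange)
    {
        var families = new List<Family>();
        Family currentFamily = null;
        Column currentColumn = null;

        foreach (var chunk in chunks)
        {
            bool familyChanged = false;
            if (chunk.FamilyName != null && chunk.FamilyName != currentFamily?.Name)
            {
                currentFamily = new Family { Name = chunk.FamilyName };
                families.Add(currentFamily);
                // The suggested tweak: remember that the family changed.
                familyChanged = resetColumnOnFamilyChange;
            }

            // Without the flag, an equal qualifier in a new family keeps the old
            // column, so the cell is appended to the previous family's column.
            if (chunk.Qualifier != null &&
                (familyChanged || chunk.Qualifier != currentColumn?.Qualifier))
            {
                currentColumn = new Column { Qualifier = chunk.Qualifier };
                currentFamily.Columns.Add(currentColumn);
            }

            currentColumn.Cells.Add(chunk.Value);
        }
        return families;
    }

    // Mirrors the repro program: the first cell of every column in every family.
    public static List<string> FirstCells(IEnumerable<Family> families) =>
        families.SelectMany(f => f.Columns).Select(c => c.Cells.First()).ToList();

    // The two cells written by the repro program, in the order they would be read back.
    public static Chunk[] ReproChunks() => new[]
    {
        new Chunk { FamilyName = "MyFamily1", Qualifier = "my-qualifier", Value = "in-family-1" },
        new Chunk { FamilyName = "MyFamily2", Qualifier = "my-qualifier", Value = "in-family-2" },
    };
}
```

In this model, FirstCells(Merge(ReproChunks(), false)) yields only in-family-1, because the second cell lands in MyFamily1's column and MyFamily2 ends up with no columns. Passing true yields both values. The model also matches the observation that an extra cell with a different qualifier in the first family hides the problem, since the qualifier comparison then differs on its own.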

@igorbernstein2

Also, please add a test in the conformance test json file for this issue:
https://github.com/googleapis/conformance-tests/blob/main/bigtable/v2/readrows.json
