Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

older style structured properties issue #129

Closed
alexrwix opened this issue Jul 1, 2019 · 7 comments · Fixed by #176
Closed

older style structured properties issue #129

alexrwix opened this issue Jul 1, 2019 · 7 comments · Fixed by #176
Assignees
Labels
priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. 🚨 This issue needs some love. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@alexrwix
Copy link
Contributor

alexrwix commented Jul 1, 2019

There is an additional issue with py27 ndb models with dotted property names.
Related to #126

In the case of a structured property which has repeated properties of different lengths, only the first two items are loaded.

For example:

{
       "int_prop": 42,
       "struct_prop.items.one": ["first", "first-1"],
       "struct_prop.items.two": ["second", "second-1"],
       "struct_prop.items.three": ["third", "third-1", "third-2"]
}

Current result (py37 python-ndb):
TestRepeatedStructuredModel(key=Key('TestRepeatedStructuredModel', '123456'), int_prop=42, struct_prop=StructuredRepeatedPropModel(key=Key('StructuredRepeatedPropModel', None), items=[StructuredPropModel(key=Key('StructuredPropModel', None), one='first', three='third', two='second'), StructuredPropModel(key=Key('StructuredPropModel', None), one='first-1', three='third-1', two='second-1')]))

Expected result (py27 ndb):
TestRepeatedStructuredModel(key=Key('TestRepeatedStructuredModel', '123456'), int_prop=42, struct_prop=StructuredRepeatedPropModel(key=Key('StructuredRepeatedPropModel', None), items=[StructuredPropModel(key=Key('StructuredPropModel', None), one='first', three='third', two='second'), StructuredPropModel(key=Key('StructuredPropModel', None), one='first-1', three='third-1', two='second-1'), StructuredPropModel(key=Key('StructuredPropModel', None), one=None, three='third-2', two=None)]))

All working as expected, if all dotted repeated properties have the same length

{
        "int_prop": 42,
        "struct_prop.items.one": ["first", "first-1"],
        "struct_prop.items.two": ["second", "second-1"],
        "struct_prop.items.three": ["third", "third-1"]
}

Entity Protobuf:

{
  partition_id {
    project_id: "kuk"
  }
  path {
    kind: "TestRepeatedStructuredModel"
    name: "123456"
  }
}
properties {
  key: "int_prop"
  value {
    integer_value: 42
  }
}
properties {
  key: "struct_prop.items.one"
  value {
    array_value {
      values {
        string_value: "first"
      }
      values {
        string_value: "first-1"
      }
    }
  }
}
properties {
  key: "struct_prop.items.three"
  value {
    array_value {
      values {
        string_value: "third"
      }
      values {
        string_value: "third-1"
      }
      values {
        string_value: "third-2"
      }
    }
  }
}
properties {
  key: "struct_prop.items.two"
  value {
    array_value {
      values {
        string_value: "second"
      }
      values {
        string_value: "second-1"
      }
    }
  }
}

Test example:

@ndb_client_context()
    def test_entity_with_dotted_repeated_structured_property_different_length(self):
        db_driver = DbDriver(kind="TestRepeatedStructuredModel")

        ds_key = '123456'
        data = {
            "int_prop": 42,
            "struct_prop.items.one": ["first", "first-1"],
            "struct_prop.items.two": ["second", "second-1"],
            "struct_prop.items.three": ["third", "third-1", "third-2"]
        }

        db_driver.put(ds_key, data)

        key = ndb.Key(TestRepeatedStructuredModel, ds_key)

        retrieved = key.get()
        print(f'retrieved: {retrieved}')

        assert retrieved.int_prop == 42
        assert len(retrieved.struct_prop.items) == 3   // Error: Returning 2 items

        assert key.delete() is None
        assert key.get() is None

        db_driver.clear()
@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Jul 1, 2019
@chrisrossi chrisrossi self-assigned this Jul 2, 2019
@chrisrossi chrisrossi added 🚨 This issue needs some love. and removed triage me I really want to be triaged. labels Jul 2, 2019
@chrisrossi
Copy link
Contributor

Looking at this. Thanks.

@chrisrossi
Copy link
Contributor

chrisrossi commented Jul 2, 2019

Hi @alexrwix ,

I'm trying to understand how or why this would come up.

In an interactive console, I've used the py27 version of NDB to create and store an entity:

from google.appengine.ext import ndb


class SubKind(ndb.Model):
    one = ndb.StringProperty()
    two = ndb.StringProperty()
    three = ndb.StringProperty()


class MyKind(ndb.Model):
    foo = ndb.StructuredProperty(SubKind, repeated=True)


entity = MyKind(foo=[
    SubKind(one="Hamlet", two="Othello"),
    SubKind(three="Julius Caesar"),
])

entity.put()

The entity it creates in Datastore is:

  | Name/ID | foo.one | foo.three | foo.two |  
  | id=5632891572715520 | ["Hamlet",null] | [null,"Julius Caesar"] | ["Othello",null] |  

As you can see, missing values for subproperties are filled in with null which is what I would intuitively expect. If it were to actually store lists of uneven length, it would be indeterminate which property values are supposed to go in which subentity. For example, let's say it had actually elided the null values and stored this:

  | Name/ID | foo.one | foo.three | foo.two |  
  | id=5632891572715520 | ["Hamlet"] | ["Julius Caesar"] | ["Othello"] |  

That would get interpreted as having a single subentity where one, two, and three are "Hamlet", "Julius Caesar", and "Othello" respectively, which is not what we were trying to store.

Hewing more closely to your example:

entity = MyKind(foo=[
    SubKind(one="first", two="second", three="third"),
    SubKind(one="first-1", two="second-1", three="third-1"),
    SubKind(three="third-2"),
])

entity.put()

We get:

  | Name/ID | foo.one | foo.three | foo.two |  
  | id=5718229955641344 | ["first","first-1",null] | ["third","third-1","third-2"] | ["second","second-1",null] |  

Because the nulls are included, this should work just fine with the code as-is.

Consider this variant:

entity = MyKind(foo=[
    SubKind(one="first", two="second", three="third"),
    SubKind(three="third-2"),
    SubKind(one="first-1", two="second-1", three="third-1"),\
])

entity.put()

Which gets us:

  | Name/ID | foo.one | foo.three | foo.two |  
  | id=5767558594560000 | ["first",null,"first-1"] | ["third","third-2","third-1"] | ["second",null,"second-1"] |  

Here, if we were to elide the nulls such that foo.one and foo.two each only had two values, then we'd get back the wrong subentities when we tried to decode.

Is there a real-world use case I'm missing where NDB would be writing the subentities without null spacers? If so, how does that happen, and what's the argument for assuming null paddings at the end of the shorter sequences when it seems like they could go anywhere in the sequence?

@alexrwix
Copy link
Contributor Author

alexrwix commented Jul 2, 2019

Hi Chris.
Yes of course.
I'm checking python-ndb with real production content.

Our repeated structured property models inherited from ndb.Expando so the same models sometimes have a different set of properties.

Below is an example of real datastore entity

output property is:

class FileOutput(ndb.Expando):
    items        = ndb.StructuredProperty(FileOutputItem, repeated=True)

class FileOutputItem(ndb.Expando):
    key          = ndb.StringProperty()
    type         = ndb.StringProperty()
    tag          = ndb.StringProperty()
    status       = ndb.StringProperty()
    url          = ndb.StringProperty()
    secure       = ndb.BooleanProperty(default=False)

Here is an entitiy example loaded directly from datastore:

<Entity('Site',
'5d3777e1-3c08-460b-9a30-f8ac858713c9',
'SiteFile',
'19fc1a_3bc5448b27144c16bc6a5bf6270ba580'){
  'output.items.quality': [
    '480p',
    None,
    '144p',
    '720p',
    '1080p',
    '720p',
    '1080p',
    '480p'
  ],
  'date_created': datetime.datetime(2019,
  6,
  27,
  9,
  39,
  54,
  169515,
  tzinfo=<UTC>),
  'user_id': '19fc1a9e-4bdd-4018-a6b8-88499c81e7b5',
  'output.items.url': [
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/480p/mp4/file.mp4',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/storyboard/sprite/manifest.json',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/storyboard/144p/mp4/file.mp4',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/720p/mp4/file.mp4',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/1080p/mp4/file.mp4',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/clip/720p/mp4/file.mp4',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/clip/1080p/mp4/file.mp4',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/clip/480p/mp4/file.mp4',
    'media/19fc1a_3bc5448b27144c16bc6a5bf6270ba580f000.jpg',
    'media/19fc1a_3bc5448b27144c16bc6a5bf6270ba580f001.jpg',
    'media/19fc1a_3bc5448b27144c16bc6a5bf6270ba580f002.jpg',
    'media/19fc1a_3bc5448b27144c16bc6a5bf6270ba580f003.jpg'
  ],
  'date_updated': datetime.datetime(2019,
  6,
  27,
  9,
  39,
  54,
  169397,
  tzinfo=<UTC>),
  'hash': '7c353c043ce1ced5ac12dea9f52a4c19',
  'output.items.kind': [
    None,
    'manifest',
    'video',
    None,
    None,
    None,
    None,
    None,
    None,
    None,
    None,
    None
  ],
  'output.items.type': [
    'video',
    'storyboard',
    'storyboard',
    'video',
    'video',
    'clip',
    'clip',
    'clip',
    'image',
    'image',
    'image',
    'image'
  ],
  'input.video_bitrate': 7996143,
  'output.items.fps': [
    25,
    None,
    25.0,
    25,
    25,
    25,
    25,
    25
  ],
  'app_id': None,
  'height': 1080,
  'file_size': 6806321,
  'icon_url': 'media/19fc1a_3bc5448b27144c16bc6a5bf6270ba580f002.jpg',
  'input.display_aspect_ratio': '16:9',
  'output.items.tag': [
    'High',
    None,
    'Low',
    'HD',
    'HD',
    'HD',
    'HD',
    'High',
    None,
    None,
    None,
    None
  ],
  'output.items.height': [
    480,
    None,
    144,
    720,
    1080,
    720,
    1080,
    480,
    1080,
    1080,
    1080,
    1080
  ],
  'file_name': '19fc1a_3bc5448b27144c16bc6a5bf6270ba580',
  'input.duration': 6800,
  'input.audio_bitrate': -1,
  'output.items.status': [
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY'
  ],
  'output.items.key': [
    '480p.mp4',
    'storyboard-sprite.json',
    'storyboard-144p.mp4',
    '720p.mp4',
    '1080p.mp4',
    'clip-720p.mp4',
    'clip-1080p.mp4',
    'clip-480p.mp4',
    'poster.0',
    'poster.1',
    'poster.2',
    'poster.3'
  ],
  'mime_type': 'video/mp4',
  'width': 1920,
  'output.items.video_bitrate': [
    1248686,
    None,
    100000,
    3080326,
    5991112,
    2994678,
    5911071,
    1246376
  ],
  'input.type': 'video',
  'input.sample_aspect_ratio': '1:1',
  'site_id': '5d3777e1-3c08-460b-9a30-f8ac858713c9',
  'file_url': 'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/file',
  'op_status': 'READY',
  'output.items.duration': [
    6760,
    None,
    6800,
    6760,
    6760,
    6760,
    6760,
    6760,
    None,
    None,
    None,
    None
  ],
  'state': '',
  'input.fps': 25.0,
  'output.items.format': [
    'mp4',
    'json',
    'mp4',
    'mp4',
    'mp4',
    'mp4',
    'mp4',
    'mp4',
    'jpg',
    'jpg',
    'jpg',
    'jpg'
  ],
  'output.items.secure': [
    False,
    False,
    False,
    False,
    False,
    False,
    False,
    False,
    False,
    False,
    False,
    False
  ],
  'output.items.audio_bitrate': [
    -1,
    None,
    -1,
    -1,
    -1,
    -1,
    -1,
    -1,
    None,
    None,
    None,
    None
  ],
  'parent_folder_key': <Key('Site',
  '5d3777e1-3c08-460b-9a30-f8ac858713c9',
  'SiteFolder',
  'media-root'),
  project=wixprivatemedia-gptest>,
  'tags': [
    '_fileOrigin_uploaded'
  ],
  'original_file_name': 'BGbreaking.mp4',
  'output.items.display_aspect_ratio': [
    '427:240',
    None,
    '16:9',
    '16:9',
    '16:9',
    '16:9',
    '16:9',
    '427:240'
  ],
  'output.items.width': [
    854,
    None,
    256,
    1280,
    1920,
    1280,
    1920,
    854,
    1920,
    1920,
    1920,
    1920
  ],
  'input.height': 1080,
  'ns': None,
  'media_type': 'video',
  'store': None,
  'input.width': 1920,
  'input.rotation': 0
}>

After loading with py37 python-ndb, output items array has 8 items instead of with 12

But the biggest problem is that this issue is INCONSISTENT!
It depends on properties order in the entity that returned from datastore!

In case of this properties order, all items loaded correctly

<Entity('Site',
'5d3777e1-3c08-460b-9a30-f8ac858713c9',
'SiteFile',
'19fc1a_3bc5448b27144c16bc6a5bf6270ba580'){
  'parent_folder_key': <Key('Site',
  '5d3777e1-3c08-460b-9a30-f8ac858713c9',
  'SiteFolder',
  'media-root'),
  project=wixprivatemedia-gptest>,
  'tags': [
    '_fileOrigin_uploaded'
  ],
  'original_file_name': 'BGbreaking.mp4',
  'output.items.display_aspect_ratio': [
    '427:240',
    None,
    '16:9',
    '16:9',
    '16:9',
    '16:9',
    '16:9',
    '427:240'
  ],
  'output.items.width': [
    854,
    None,
    256,
    1280,
    1920,
    1280,
    1920,
    854,
    1920,
    1920,
    1920,
    1920
  ],
  'input.height': 1080,
  'ns': None,
  'media_type': 'video',
  'store': None,
  'input.width': 1920,
  'input.rotation': 0,
  'output.items.quality': [
    '480p',
    None,
    '144p',
    '720p',
    '1080p',
    '720p',
    '1080p',
    '480p'
  ],
  'date_created': datetime.datetime(2019,
  6,
  27,
  9,
  39,
  54,
  169515,
  tzinfo=<UTC>),
  'user_id': '19fc1a9e-4bdd-4018-a6b8-88499c81e7b5',
  'output.items.url': [
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/480p/mp4/file.mp4',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/storyboard/sprite/manifest.json',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/storyboard/144p/mp4/file.mp4',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/720p/mp4/file.mp4',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/1080p/mp4/file.mp4',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/clip/720p/mp4/file.mp4',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/clip/1080p/mp4/file.mp4',
    'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/clip/480p/mp4/file.mp4',
    'media/19fc1a_3bc5448b27144c16bc6a5bf6270ba580f000.jpg',
    'media/19fc1a_3bc5448b27144c16bc6a5bf6270ba580f001.jpg',
    'media/19fc1a_3bc5448b27144c16bc6a5bf6270ba580f002.jpg',
    'media/19fc1a_3bc5448b27144c16bc6a5bf6270ba580f003.jpg'
  ],
  'date_updated': datetime.datetime(2019,
  6,
  27,
  9,
  39,
  54,
  169397,
  tzinfo=<UTC>),
  'hash': '7c353c043ce1ced5ac12dea9f52a4c19',
  'output.items.kind': [
    None,
    'manifest',
    'video',
    None,
    None,
    None,
    None,
    None,
    None,
    None,
    None,
    None
  ],
  'output.items.type': [
    'video',
    'storyboard',
    'storyboard',
    'video',
    'video',
    'clip',
    'clip',
    'clip',
    'image',
    'image',
    'image',
    'image'
  ],
  'input.video_bitrate': 7996143,
  'output.items.fps': [
    25,
    None,
    25.0,
    25,
    25,
    25,
    25,
    25
  ],
  'app_id': None,
  'height': 1080,
  'file_size': 6806321,
  'icon_url': 'media/19fc1a_3bc5448b27144c16bc6a5bf6270ba580f002.jpg',
  'input.display_aspect_ratio': '16:9',
  'output.items.tag': [
    'High',
    None,
    'Low',
    'HD',
    'HD',
    'HD',
    'HD',
    'High',
    None,
    None,
    None,
    None
  ],
  'file_name': '19fc1a_3bc5448b27144c16bc6a5bf6270ba580',
  'output.items.height': [
    480,
    None,
    144,
    720,
    1080,
    720,
    1080,
    480,
    1080,
    1080,
    1080,
    1080
  ],
  'input.duration': 6800,
  'input.audio_bitrate': -1,
  'output.items.status': [
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY',
    'READY'
  ],
  'output.items.key': [
    '480p.mp4',
    'storyboard-sprite.json',
    'storyboard-144p.mp4',
    '720p.mp4',
    '1080p.mp4',
    'clip-720p.mp4',
    'clip-1080p.mp4',
    'clip-480p.mp4',
    'poster.0',
    'poster.1',
    'poster.2',
    'poster.3'
  ],
  'mime_type': 'video/mp4',
  'width': 1920,
  'input.type': 'video',
  'output.items.video_bitrate': [
    1248686,
    None,
    100000,
    3080326,
    5991112,
    2994678,
    5911071,
    1246376
  ],
  'input.sample_aspect_ratio': '1:1',
  'site_id': '5d3777e1-3c08-460b-9a30-f8ac858713c9',
  'op_status': 'READY',
  'file_url': 'video/19fc1a_3bc5448b27144c16bc6a5bf6270ba580/file',
  'state': '',
  'output.items.duration': [
    6760,
    None,
    6800,
    6760,
    6760,
    6760,
    6760,
    6760,
    None,
    None,
    None,
    None
  ],
  'input.fps': 25.0,
  'output.items.format': [
    'mp4',
    'json',
    'mp4',
    'mp4',
    'mp4',
    'mp4',
    'mp4',
    'mp4',
    'jpg',
    'jpg',
    'jpg',
    'jpg'
  ],
  'output.items.secure': [
    False,
    False,
    False,
    False,
    False,
    False,
    False,
    False,
    False,
    False,
    False,
    False
  ],
  'output.items.audio_bitrate': [
    -1,
    None,
    -1,
    -1,
    -1,
    -1,
    -1,
    -1,
    None,
    None,
    None,
    None
  ]
}

@chrisrossi
Copy link
Contributor

So, I've tweaked my py2.7 test script to use Expando and leave three undeclared, to be handled by the Expando class:

from google.appengine.ext import ndb

class SubKind(ndb.Expando):
    one = ndb.StringProperty()
    two = ndb.StringProperty()


class MyKind(ndb.Expando):
    foo = ndb.StructuredProperty(SubKind, repeated=True)


entity = MyKind(foo=[
    SubKind(one="Hamlet", two="Othello"),
    SubKind(three="Julius Caesar"),
])

entity.put()

Doing this, I do get an entity with uneven repeated property lengths, like you describe:

  | Name/ID | foo.one | foo.three | foo.two |  
  | id=5644755899777024 | ["Hamlet",null] | ["Julius Caesar"] | ["Othello",null] |  

The part that I'm still having trouble with is the fact that the subentities, in effect, are getting scrambled. If we just naively pad the end of the shorter property with nulls, we'll get the following subentities:

SubKind(one="Hamlet", two="Othello", three="Julius Caeser")
SubKind()  # all properties are None

This, obviously, isn't what we tried to store and as far as I can tell is a bug, not a feature to be replicated. Unless I am missing something, the sub-entities in your production database are going to be somewhat mixed up with properties that were saved as being part of one sub-entity, being assigned to a different sub-entity upon retrieval.

Do you see what the problem is here? Is there something I'm missing?

@alexrwix
Copy link
Contributor Author

alexrwix commented Jul 3, 2019

Hi Chris.
Yes. This specific example looks very problematic.
It really looks like py27 ndb bug.
In the current py37 version, we don't have such problem because it's saved as an embedded property.
Am I right?

I have never tried such scenario.
In our case, all "SubKind" constant properties always have values.

What is your suggestion?
How loading problem can be resolved?
We have hundreds of millions of records with such properties.

@chrisrossi
Copy link
Contributor

Hi Chris.
Yes. This specific example looks very problematic.
It really looks like py27 ndb bug.
In the current py37 version, we don't have such problem because it's saved as an embedded property.
Am I right?

Correct.

I have never tried such scenario.

As far as I can tell, your example is such a scenario.

In our case, all "SubKind" constant properties always have values.

In your example, quality has 8 values but url has 12 values. Therefore, there are 4 items that don't have quality values. The issue is that we have no way of knowing which 4 items. We can no longer trust that quality values match up to the correct items upon marshalling from Datastore. We don't have enough data to figure that out. This is what I mean by your item data being "scrambled".

What is your suggestion?
How loading problem can be resolved?
We have hundreds of millions of records with such properties.

That is a good question. I don't know yet. We probably need to have some internal discussion. I believe I understand the problem, but want to discuss with my team what the best approach for a solution might be.

@alexrwix
Copy link
Contributor Author

alexrwix commented Jul 4, 2019

Thank you.
Looking forward to your solution.

@yoshi-automation yoshi-automation added 🚨 This issue needs some love. and removed 🚨 This issue needs some love. 🚨 labels Jul 9, 2019
@yoshi-automation yoshi-automation added 🚨 This issue needs some love. and removed 🚨 This issue needs some love. 🚨 labels Jul 29, 2019
@yoshi-automation yoshi-automation added 🚨 This issue needs some love. and removed 🚨 This issue needs some love. labels Aug 8, 2019
@yoshi-automation yoshi-automation added 🚨 This issue needs some love. and removed 🚨 labels Aug 8, 2019
chrisrossi pushed a commit to chrisrossi/python-ndb that referenced this issue Aug 28, 2019
chrisrossi pushed a commit to chrisrossi/python-ndb that referenced this issue Aug 28, 2019
rwhogg added a commit to rwhogg/python-ndb that referenced this issue Nov 14, 2022
Legacy NDB had a bug where, with repeated Expando properties, it could
end up writing arrays of different length if some entities had missing
values for certain subproperties. When Cloud NDB read this out, it might
have only loaded entities corresponding to the shorter array length.
See issue googleapis#129 .

To fix this, with PR googleapis#176 , we made Cloud NDB check if there are length
differences and pad out the array as necessary.

However, depending on the order properties are returned from Datastore in,
we may have accidentally padded with subentities of the wrong kind,
because it was possible to skip over updating the kind if we alternated
between updating repeated properties.

(Eg: A.a with 2 elements, B.b with 3 elements, A.c with 3 elements ->
A would end up with an element of B's kind)
rwhogg added a commit to rwhogg/python-ndb that referenced this issue Nov 14, 2022
Legacy NDB had a bug where, with repeated Expando properties, it could
end up writing arrays of different length if some entities had missing
values for certain subproperties. When Cloud NDB read this out, it might
have only loaded entities corresponding to the shorter array length.
See issue googleapis#129 .

To fix this, with PR googleapis#176 , we made Cloud NDB check if there are length
differences and pad out the array as necessary.

However, depending on the order properties are returned from Datastore in,
we may have accidentally padded with subentities of the wrong kind,
because it was possible to skip over updating the kind if we alternated
between updating repeated properties.

(Eg: A.a with 2 elements, B.b with 3 elements, A.c with 3 elements ->
A would end up with an element of B's kind)
parthea pushed a commit that referenced this issue Nov 16, 2022
… ds (#824)

Legacy NDB had a bug where, with repeated Expando properties, it could
end up writing arrays of different length if some entities had missing
values for certain subproperties. When Cloud NDB read this out, it might
have only loaded entities corresponding to the shorter array length.
See issue #129 .

To fix this, with PR #176 , we made Cloud NDB check if there are length
differences and pad out the array as necessary.

However, depending on the order properties are returned from Datastore in,
we may have accidentally padded with subentities of the wrong kind,
because it was possible to skip over updating the kind if we alternated
between updating repeated properties.

(Eg: A.a with 2 elements, B.b with 3 elements, A.c with 3 elements ->
A would end up with an element of B's kind)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. 🚨 This issue needs some love. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants