Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/8.0] Reduce memory usage during relationship cycle detection by keeping track of visited properties whenever there are FK forks #33211

Merged
merged 1 commit into from
Mar 5, 2024

Conversation

AndriySvyryd
Copy link
Member

@AndriySvyryd AndriySvyryd commented Mar 1, 2024

Fixes #33176

Description

During model building, when calculating certain aspects of entity properties, we default to the value on the corresponding principal property if the current property is part of a foreign key. This process is repeated until a non-default value is found or we determine that the foreign keys form a cycle.

In 8.0.2 we improved our cycle detection to detect cycles even when starting on a property that itself is not part of the cycle and we also started examining all the branches to increase the chance of finding a non-default value.

This is the algorithm that was used:

  • Consider the properties and foreign keys to be vertices and edges of a directed multigraph correspondingly.
  • We traverse the graph using BFS.
  • In an effort to save memory, we don't store all the visited properties, instead we just pick one and use it as reference when determining whether we've visited a property alredy.
    • Since the one we picked might have not been on the cycle we pick a new one every P traversal, where P is a prime number that is increased each time, we reach it. It needs to be increased so we are able to detect cycles of arbitrary length and it needs to be prime to ensure that when traversing graphs that have multiple overlapping cycles at some point, we'll pick a property that's contained in all of the overlapping cycles and all visitation branches terminate.

AFAIK the algorithm is correct, but for some models it needs a traversal queue that doesn't fit in memory. Consider the model in the added test, where BaseEntity has an owned property OwnedType with a nested owned property NestedOwnedType and it also has 4 derived types in TPT hierarchy. For this type of hierarchy, we create a foreign key between the derived type and the base using the primary key properties, this effectively creates a 1-length cycle. So, if we start traversing this model from the NestedOwnedType when we get to BaseEntity P is 7 and on each level of the traversal it will recursively branch 4 times. And since the starting property is not on any cycle a new reference property needs to be used and P needs to be increased to 11. So it will take 4^(7 - 2 + 11) = 2^17 visits for the method to give up and since it uses BFS, the queue will need to be able to accommodate 2^15 nodes, which is way too large for just 7 entity types.

To fix this we'll start recording visited properties as soon as we hit a branching path. With this we'll still not use any additional memory if there are no branches, otherwise, we'll use at most O(n) where n is the number of entity types in the model.

Customer impact

Users that hit this scenario will get an OutOfMemoryException during model building. Theoretically, they could work around this by explicitly specifying a null ValueConverter on every primary key property, but this isn't practical, and the exception has no information that could help them. There is an AppContext switch that they could use to disable the previous fix, however if they also have cycles in the model that are longer than 2, they'll just hit a different exception.

How found

Customer reported on 8.0.2. So far there's been only 1 report.

The most likely scenario where this issue manifests is in TPT mapping with 4 or more derived types where the base type is the principal of a relationship chain at least 2 entity types long. While TPT mapping is not very common, if it's used it's likely that there will be 4 or more derived types and relationship chains of length 2 or more are also common.

Regression

Yes, from 8.0.1, introduced in #32598

Testing

Tests added.

Risk

Low, the main logic is not changed. Quirk added.

…ack of visited properties whenever there are FK forks

Fixes #33176
@maumar maumar merged commit b0faa52 into release/8.0 Mar 5, 2024
7 checks passed
@maumar maumar deleted the Issue33176 branch March 5, 2024 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants