[Android] App crashing with condition `xref_count == xref_index' not met #106410

Open
kklose23 opened this issue Aug 14, 2024 · 13 comments · May be fixed by #113044
Labels: area-GC-mono, in-pr (there is an active PR which will close this issue when it is merged)


@kklose23

Description

This happens when using Shell navigation to move between many pages. Unfortunately, I was unable to reproduce it in a simpler project that I could post, so I'm not sure exactly what causes it. However, it worked on MAUI version 8.0.40 and broke in subsequent versions.

I was able to reproduce it on the Android emulator.

Others have posted about this as well:
dotnet/maui#23827
dotnet/maui#23634
dotnet/maui#23826

Steps to Reproduce

See description

Link to public reproduction project repository

No response

Version with bug

8.0.60 SR6

Is this a regression from previous behavior?

Yes, this used to work in .NET MAUI

Last version that worked well

8.0.40 SR5

Affected platforms

Android

Affected platform versions

No response

Did you find any workaround?

No

Relevant log output

[namix.mobileapp] * Assertion at /__w/1/s/src/mono/mono/metadata/sgen-tarjan-bridge.c:1174, condition `xref_count == xref_index' not met, function:processing_build_callback_data, xref_count is 1101 but we added 1096 xrefs
[libc] Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 8651 (namix.mobileapp), pid 8651 (namix.mobileapp)

@Redth
Member

Redth commented Aug 14, 2024

@jonathanpeppers could you or someone on Android team have a look?

@jonathanpeppers
Member

I don't think this is related to dotnet/android.

It says it's on this line: src/mono/mono/metadata/sgen-tarjan-bridge.c:1174

Looking at the release/8.0 branch, it might be this g_assertf():

I think we can transfer to runtime.

@jonathanpeppers jonathanpeppers transferred this issue from dotnet/maui Aug 14, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 14, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Aug 14, 2024
Contributor

Tagging subscribers to this area: @BrzVlad
See info in area-owners.md if you want to be subscribed.

@vitek-karas
Member

@BrzVlad could you please take a look?

@vitek-karas vitek-karas added this to the 9.0.0 milestone Aug 15, 2024
@dotnet-policy-service dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label Aug 15, 2024
@BrzVlad
Member

BrzVlad commented Aug 15, 2024

I've seen reports of crashes like this for a few years now, ever since we switched to the Tarjan bridge implementation as the default, but we haven't yet received a repro project that would let us investigate. The recommended solution is to use the new bridge by adding this to the environment variables: MONO_GC_PARAMS=bridge-implementation=new.

@teo-tsirpanis teo-tsirpanis removed the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 16, 2024
@mangod9 mangod9 modified the milestones: 9.0.0, Future Aug 28, 2024
@Pmr-precure

Pmr-precure commented Oct 3, 2024

We are having the same issue at the moment, on version 8.0.91.

@kklose23
Author

kklose23 commented Oct 3, 2024

We were able to fix the issue using the suggestion above. We added an Environment.txt file with the following:
MONO_GC_PARAMS=bridge-implementation=new

And added this to the project file:

  <ItemGroup>
    <AndroidEnvironment Include="Platforms\Android\Environment.txt" />
  </ItemGroup>

@pubudulk

pubudulk commented Oct 8, 2024

We were able to fix the issue using the suggestion above. We added an Environment.txt file with the following: MONO_GC_PARAMS=bridge-implementation=new

And added this to the project file:

  <ItemGroup>
    <AndroidEnvironment Include="Platforms\Android\Environment.txt" />
  </ItemGroup>

Are you suggesting the Environment.txt file approach rather than the code approach shown below? We tried setting the bridge implementation like this, but it didn't work:
Java.Lang.JavaSystem.SetProperty("MONO_GC_PARAMS", "bridge-implementation=new,nursery-size=128m,soft-heap-limit=512m")
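
One possible explanation, offered as an assumption rather than a confirmed diagnosis: a Java system property is a different store from the process environment, and a native runtime typically reads a setting such as MONO_GC_PARAMS from the environment (for example with getenv) during startup, before application code runs. The C sketch below only illustrates that kind of lookup. The helpers find_param and bridge_from_env are hypothetical names and are not the runtime's own parsing code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies the value of `key` from a comma-separated
   "k=v,k=v" list into `out` (which must hold at least 1 byte).
   Returns 1 if the key was found, 0 otherwise. */
static int find_param(const char *params, const char *key,
                      char *out, size_t out_size) {
    size_t key_len = strlen(key);
    const char *p = params;
    while (p && *p) {
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        if (len > key_len && strncmp(p, key, key_len) == 0
            && p[key_len] == '=') {
            size_t val_len = len - key_len - 1;
            if (val_len >= out_size)
                val_len = out_size - 1;   /* truncate to fit the buffer */
            memcpy(out, p + key_len + 1, val_len);
            out[val_len] = '\0';
            return 1;
        }
        p = end ? end + 1 : NULL;         /* move to the next entry */
    }
    return 0;
}

/* Hypothetical helper: looks the option up in the process environment.
   A value stored as a Java system property never shows up here. */
static const char *bridge_from_env(char *buf, size_t size) {
    const char *params = getenv("MONO_GC_PARAMS");
    if (params && find_param(params, "bridge-implementation", buf, size))
        return buf;
    return NULL;
}
```

Under that assumption, the Environment.txt approach works because the value is placed in the process environment before the runtime initializes, whereas a property set from application code is stored elsewhere and arrives too late.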

@andersondamasio

Also having this issue.
Version: .NET 9, MAUI 9.0.40

/__w/1/s/src/mono/mono/metadata/sgen-tarjan-bridge.c:1176, condition `xref_count == xref_index' not met, function:processing_build_callback_data, xref_count is 137 but we added 133 xrefs

The problem arises after browsing some pages of the application for a while.

In the .NET 8 version this problem did not seem to happen.

The application closes without throwing any exception that the debugger can catch.

Which of the suggestions above should we try to resolve the problem?

@filipnavara
Member

I have a pretty good idea why this is happening.

This code path can introduce duplicates into the color->other_colors array:

// Maybe we should make sure we are not adding duplicates here. It is not really a problem
// since we will get rid of duplicates before submitting the SCCs to the client in gather_xrefs
if (color_data)
    add_other_colors (color_data, &other->xrefs);

Then we get to the code that counts xref_count:

// Eliminate non-visible SCCs from the SCC list and redistribute xrefs
for (cur = root_color_bucket; cur; cur = cur->next) {
    ColorData *cd;
    for (cd = &cur->data [0]; cd < cur->next_data; ++cd) {
        if (!color_visible_to_client (cd))
            continue;
        color_merge_array_empty ();
        gather_xrefs (cd);
        reset_xrefs (cd);
        dyn_array_ptr_set_all (&cd->other_colors, &color_merge_array);
        xref_count += dyn_array_ptr_size (&cd->other_colors);
    }
}
}

It uses the color_visible_to_client helper method which in turn calls bridgeless_color_is_heavy:

static gboolean
bridgeless_color_is_heavy (ColorData *data) {
    if (disable_non_bridge_scc)
        return FALSE;
    int fanin = data->incoming_colors;
    int fanout = dyn_array_ptr_size (&data->other_colors);
    return fanin > HEAVY_REFS_MIN && fanout > HEAVY_REFS_MIN
        && fanin*fanout >= HEAVY_COMBINED_REFS_MIN;
}

The fanout metric is computed from the size of other_colors, which can still contain duplicates at that point. Once gather_xrefs collects the references, they are deduplicated. This can reduce the fanout just enough that the color no longer satisfies the "heavy" property. The first loop has already added the color's xrefs to xref_count, but the second loop, which iterates over the same data, skips the color, and the two totals no longer match:

for (cur = root_color_bucket; cur; cur = cur->next) {
    ColorData *src;
    for (src = &cur->data [0]; src < cur->next_data; ++src) {
        if (!color_visible_to_client (src))
            continue;
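
To make the mismatch concrete, here is a minimal, self-contained C sketch of the two-pass pattern described above. It is not the runtime's code: ToyColor, toy_is_heavy, toy_dedup and toy_two_passes are hypothetical stand-ins, the threshold values are illustrative assumptions, and visibility is reduced to the "heavy" test alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative thresholds; the values used by the runtime may differ. */
#define HEAVY_REFS_MIN 2
#define HEAVY_COMBINED_REFS_MIN 60
#define MAX_XREFS 16

/* Simplified stand-in for ColorData: a bridgeless color. */
typedef struct {
    int incoming_colors;          /* fan-in */
    int other_colors[MAX_XREFS];  /* fan-out targets, may contain duplicates */
    size_t other_count;
} ToyColor;

/* Same shape as bridgeless_color_is_heavy, without disable_non_bridge_scc. */
static bool toy_is_heavy(const ToyColor *c) {
    int fanin = c->incoming_colors;
    int fanout = (int)c->other_count;
    return fanin > HEAVY_REFS_MIN && fanout > HEAVY_REFS_MIN
        && fanin * fanout >= HEAVY_COMBINED_REFS_MIN;
}

/* Stand-in for the deduplication step: drops repeated targets in place. */
static void toy_dedup(ToyColor *c) {
    size_t kept = 0;
    assert(c->other_count <= MAX_XREFS);
    for (size_t i = 0; i < c->other_count; ++i) {
        bool seen = false;
        for (size_t j = 0; j < kept; ++j) {
            if (c->other_colors[j] == c->other_colors[i]) {
                seen = true;
                break;
            }
        }
        if (!seen)
            c->other_colors[kept++] = c->other_colors[i];
    }
    c->other_count = kept;
}

/* Pass 1 counts xrefs and pass 2 emits them; each pass re-evaluates
   visibility, so a color can be counted and then skipped. */
static void toy_two_passes(ToyColor *colors, size_t n,
                           int *xref_count, int *xref_index) {
    *xref_count = 0;
    *xref_index = 0;
    for (size_t i = 0; i < n; ++i) {
        if (!toy_is_heavy(&colors[i]))  /* checked with duplicates present */
            continue;
        toy_dedup(&colors[i]);
        *xref_count += (int)colors[i].other_count;
    }
    for (size_t i = 0; i < n; ++i) {
        if (!toy_is_heavy(&colors[i]))  /* checked again after dedup */
            continue;
        *xref_index += (int)colors[i].other_count;
    }
}
```

For a color with incoming_colors = 11 and targets {1, 2, 3, 4, 5, 5}, the first pass sees a fanout of 6 (11 * 6 = 66, heavy) and counts 5 deduplicated xrefs. The second pass sees a fanout of 5 (11 * 5 = 55, not heavy) and skips the color, so xref_count ends at 5 while xref_index stays at 0. A gap of a few xrefs like this matches the shape of the reported assertion messages.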

filipnavara added a commit to filipnavara/runtime that referenced this issue Mar 2, 2025
@filipnavara
Member

filipnavara commented Mar 2, 2025

@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Mar 2, 2025
@filipnavara
Member

Repro:

Archive.zip
