Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sgen] Improve sgen-tarjan-bridge assertions #9314

Closed
kumpera opened this issue Jun 26, 2018 · 26 comments
Closed

[sgen] Improve sgen-tarjan-bridge assertions #9314

kumpera opened this issue Jun 26, 2018 · 26 comments
Assignees

Comments

@kumpera
Copy link
Contributor

kumpera commented Jun 26, 2018

We recently got a report of the following assert:

F/        (13301): * Assertion at /Users/builder/jenkins/workspace/xamarin-android-d15-7/xamarin-android/external/mono/mono/metadata/sgen-tarjan-bridge.c:1140, condition `xref_count == xref_index' not met

There's nothing we can do about this without a repro. What we can do is improve assertion to include details on xref_count and xref_index. Plus do some code review of the code around it.

@lewurm
Copy link
Contributor

lewurm commented Jun 27, 2018

#9341

@davidjohnoliver
Copy link

There's nothing we can do about this without a repro.

In case you're not aware, there's a repro linked in reactiveui/ReactiveUI#1379. I'm not the author of that issue but I tested the repro (Samsung Galaxy A5, Xamarin.Android 9.1.0.29) and can confirm that it crashes with the above assert message.

We've encountered this problem in several large Uno apps, unfortunately only in very narrow circumstances that I haven't been able to repro in a shareable sample, but seemingly associated with creating a lot of managed+native GC pressure.

@deakjahn
Copy link

deakjahn commented Dec 19, 2018

Same here. In a concrete app, after some UI manipulations, certainly not something that can be reproed easily. Considering that it's GC, couldn't it be changed to a warning rather than a crash? Some memory leak is still better than completely bringing down an application...

@claudioredi
Copy link

We're getting hundred of crashes per week related to this assert in a relatively big app.

Is moving from Tarjan to New algorithm something that it's worth to try? It's very hard to understand what triggers this since this happens mostly on the wild in random circumstances.

@BrzVlad
Copy link
Member

BrzVlad commented Jun 19, 2019

@claudioredi Yes, you should be fine. We switched the default from new to tarjan about 2 years ago and the crash is tarjan bridge specific.

@jeromelaban
Copy link
Contributor

The "new" bridge is hurting the GC performance very significantly... this issue is still very active and is impacting somewhat recent devices.

@claudioredi
Copy link

claudioredi commented Jun 19, 2019

@jeromelaban It definitively seems to be affecting new devices (or newer Android versions) more.

So you would not recommend moving to New algorithm even if we have hundred of these crashes per week?

@jeromelaban
Copy link
Contributor

@claudioredi it really depends on the performance impact you're getting... The repro provided here is pretty consistent, so it's a matter of choosing between to bad behaviors for enduser experience :(

@claudioredi
Copy link

claudioredi commented Jun 19, 2019

Thanks @jeromelaban. The only benchmark I found comparing the different GC strategies is this one https://devblogs.microsoft.com/xamarin/xamarin-android-garbage-collection-improvements/
(Blog post created by the reporter of this ticket)

It's old and I'm sure that's not the source of the final and universal truth since different applications will not necessarily behave the same... but the difference in there between New and Tarjan seems to be minimal.

@pmahend1
Copy link

pmahend1 commented Nov 7, 2019

Getting the same issue while using ConcurrentStack along with ParallelTask

@BrzVlad
Copy link
Member

BrzVlad commented Jan 8, 2020

Reliably reproduced this issue using the repro from ReactiveUI. Applying the fix for another tarjan bridge bug (376c46b) fixed the crash for me.

@BrzVlad BrzVlad closed this as completed Jan 8, 2020
@davidjohnoliver
Copy link

Thanks @BrzVlad , great news!

@ryanholden8
Copy link

Wanted to say thank you for finding a fix for this issue! We are experiencing this issue. It is holding us back us back from moving forward with an app release since the crashes can happen several times a day.

Is there someway of knowing when this will be released to Xamarin.Android? It looks like this fix is on the master branch where as Xamarin.Android 10.2 is pulling it's mono runtime from the 2019-10 branch. Is there documentation somewhere on how to understand the release process with these PRs?

@BrzVlad
Copy link
Member

BrzVlad commented Feb 12, 2020

We branch every two months. The first branch containing the fix is 2020-02. Having xamarin android use this branch would take 1-2 months. After that, I'm not sure, but probably a similar amount of time to have the new xamarin android in the preview release.

I'm not sure why this issue is a blocker. Using the new bridge implementation should work fine. Do you have problems also with the new bridge ?

@ryanholden8
Copy link

@BrzVlad - Thanks for the quick reply. That's a good point about switching the CG implementations. From reading the comments above, it sounded like the performance impact was large. Our target devices have low resources. But it's definitely worth testing out! Will report back. Thanks again!

@BrzVlad
Copy link
Member

BrzVlad commented Feb 12, 2020

Of course it would be better to use the tarjan bridge but I wouldn't worry too much about it. It was the default up until a few years ago (the new bridge). It is not some debug, workaround fallback, just the previous, less optimized, version.

@pmahend1
Copy link

Setting GC to new bridge does work fine. I am not sure benchmarking between the two though.

@zoli13
Copy link

zoli13 commented Mar 4, 2020

Had exactly the same issue with a simple scenario (open an rg.popup with a single collectionview with 260 images, close, open, close...and crash with tarjan assert: xref_count == xref_index
Adding MONO_GC_PARAMS=bridge-implementation=new to Android environment fixed it (at least so far it seems)

@Deepfreezed
Copy link

I am still getting this error on tarjan-bridge. Using the latest Xamarin version available (10.2.0.100). I can easily reproduce this by keep switching Xamarin Forms Shell Tabs, eventually it will error out. Current solution is to use 'new bridge' mentioned above.

Assertion at /Users/builder/jenkins/workspace/archive-mono/2019-10/android/release/mono/metadata/sgen-tarjan-bridge.c:1140, condition `xref_count == xref_index' not met, function:processing_build_callback_data, xref_count is 119 but we added 117 xrefs

Fatal signal 6 (SIGABRT), code -6 in tid 27694 (.mobile.android)

@BrzVlad
Copy link
Member

BrzVlad commented Mar 30, 2020

The fix is not in 10.2 nor in the 10.3 preview, it will be in the following version.

@Deepfreezed
Copy link

@BrzVlad Thanks for the info.

@brendanzagaeski
Copy link
Contributor

Release status update

The latest Preview version of Xamarin.Android 10.3.0.33 now uses the newer Mono Framework version that includes this fix. Specifically, 376c46b is now present in Xamarin.Android 10.3.0.33, which uses the Mono runtime and class libraries from d90665a.

The fix is not yet included in a Release version. I will update this item again when a Release version is available that includes the fix.

Fix included in Xamarin.Android 10.3.0.33

Fix included on Windows in Visual Studio 2019 version 16.6 Preview 2. To try the Preview version that includes the fix, check for the latest updates in Visual Studio Preview.

Fix included on macOS in Visual Studio 2019 for Mac version 8.6 Preview 1. To try the Preview version that includes the fix, check for the latest updates on the Preview updater channel.

@brendanzagaeski
Copy link
Contributor

Release status update

A new Release version of Xamarin.Android has now been published that includes the fix for this item.

Fix included in Xamarin.Android 10.3.1.0.

Fix included on Windows in Visual Studio 2019 version 16.6. To get the new version that includes the fix, check for the latest updates or install the latest version from https://visualstudio.microsoft.com/downloads/.

Fix included on macOS in Visual Studio 2019 for Mac version 8.6. To get the new version that includes the fix, check for the latest updates on the Stable updater channel.

@oharkins
Copy link

oharkins commented May 1, 2023

Getting this issue in Xamarin.Android 13.2.0.0
[] * Assertion at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/release/mono/metadata/sgen-tarjan-bridge.c:1172, condition xref_count == xref_index' not met, function:processing_build_callback_data, xref_count is 877 but we added 863 xrefs`

@BrzVlad
Copy link
Member

BrzVlad commented May 2, 2023

@oharkins Is this a reliable crasher ? Any change for a project reproducing this ?

@marcojak
Copy link

Having the same issue again with Visual Studio 17.6 and Xamarin.Android 13.2.0.6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests