Skip to content

Remove is_disjoint=True optimization from merge code #242

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 18, 2025

Conversation

michelletran-codecov
Copy link
Collaborator

This particular optimization is causing some problems with creating a lot of indirect changes. Removing this from the code will fix all these indirect changes caused by incorrect merging. Unfortunately, this change also rolls back the optimization that was introduced.

@Swatinem this optimization was introduced here, so just checking whether there's any downstream implications of reintroducing the less optimal code. For more information, please take a look at the issue.

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

@michelletran-codecov michelletran-codecov requested review from Swatinem and a team June 17, 2025 17:16
Copy link

seer-by-sentry bot commented Jun 17, 2025

Sentry detected 1 potential issue in your recent changes

Suspicion: Removing `finish_merge()` might leave `ReportLine` objects with `coverage=None` from the `is_disjoint=True` optimization, potentially causing `TypeError` when accessed in report processing code like `get_line_totals`.
  • Description: The code change removes the finish_merge() call, which was responsible for calculating coverage for ReportLine objects created with coverage=None when the is_disjoint=True optimization was used in merge_line. The ReportLine.create() method defaults coverage to None. When joined=True and is_disjoint=True, merge_line creates ReportLine objects with coverage=None and only combines sessions. Without finish_merge(), these None coverage values persist. Multiple code paths, such as in libs/shared/shared/reports/totals.py (get_line_totals), apps/worker/services/comparison/changes.py, and apps/codecov-api/api/shared/commit/serializers.py, access line.coverage without null checks. Passing None to functions like coverage_type() or line_type() which expect specific types (e.g., bool, str) will result in a TypeError. For example, match coverage_type(line.coverage) would fail. If this bug isn't fixed, accessing these incomplete ReportLine objects could lead to unexpected behavior or technical issues in report processing and display.
  • Code location: libs/shared/shared/reports/resources.py:322~326
  • Suggested fix: Restore the finish_merge() call after the report merge operation to ensure coverage values are calculated for lines created with coverage=None during the is_disjoint=True optimization.

Did you find this useful? React with a 👍 or 👎

@codecov-notifications
Copy link

codecov-notifications bot commented Jun 17, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!

Copy link

codecov bot commented Jun 17, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.21%. Comparing base (90251ff) to head (57df988).
Report is 5 commits behind head on main.

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #242      +/-   ##
==========================================
- Coverage   94.21%   94.21%   -0.01%     
==========================================
  Files        1214     1214              
  Lines       45022    45021       -1     
  Branches     1435     1435              
==========================================
- Hits        42418    42417       -1     
  Misses       2303     2303              
  Partials      301      301              
Flag Coverage Δ
workerintegration 61.60% <100.00%> (-0.01%) ⬇️
workerunit 90.62% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@michelletran-codecov michelletran-codecov force-pushed the fix_indirect_changes_issue branch from bbb251c to b45b2c9 Compare June 17, 2025 18:44
Copy link
Contributor

@Swatinem Swatinem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not opposed to this, but it would be good to understand what exactly is going on?
why exactly is this causing problems? can you create a regression test that reproduces this?

@@ -54,9 +54,7 @@ def merge_reports(
)

joined = get_joined_flag(commit_yaml, session.flags or [])
master_report.merge(report, joined, is_disjoint=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be a good idea to then revert the whole change, otherwise you will leave this flag and code around unused.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The revert has a bunch of conflicts. To alleviate the customer issues, I would push this out first then clean up the flag in another commit.

This particular optimization is causing some problems with creating a lot of indirect changes.
Removing this from the code will fix all these indirect changes caused by incorrect merging.
Unfortunately, this change also rolls back the optimization that was introduced.
@michelletran-codecov michelletran-codecov force-pushed the fix_indirect_changes_issue branch from b45b2c9 to 57df988 Compare June 18, 2025 16:22
@michelletran-codecov
Copy link
Collaborator Author

not opposed to this, but it would be good to understand what exactly is going on?
why exactly is this causing problems? can you create a regression test that reproduces this?

I don't actually know. I spent a day tracing the code and logging stuff, but can't really find anything extremely incorrect about the code. When logging, the intermediate reports are correct, but something about merging with the master_report is breaking (and unsure if it's related to append or finalize_merge or even some race condition between Upload Finisher tasks). At this point, I feel like we've spent a long time debugging this error (gone through 2 support cycles, and have been broken for over a month), and should probably actually push a fix for customers, as there's a few who are complaining and noticing this error.

@michelletran-codecov michelletran-codecov added this pull request to the merge queue Jun 18, 2025
Merged via the queue into main with commit 80fc2a8 Jun 18, 2025
39 checks passed
@michelletran-codecov michelletran-codecov deleted the fix_indirect_changes_issue branch June 18, 2025 19:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants