@ben-schwen
Member

Closes #7404

Not sure about the test, since what we essentially want to check is the system.time of the test. Maybe it's better to use an atime test? With the regression, the added test bombs the runners by taking 2 hours :(

@codecov

codecov bot commented Dec 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.06%. Comparing base (b0c4ac3) to head (dc00c4e).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7480   +/-   ##
=======================================
  Coverage   99.06%   99.06%           
=======================================
  Files          86       86           
  Lines       16618    16619    +1     
=======================================
+ Hits        16463    16464    +1     
  Misses        155      155           

@github-actions

github-actions bot commented Dec 16, 2025

  • HEAD=memrecycle_factor slower P<0.001 for memrecycle regression fixed in #5463
  • HEAD=memrecycle_factor slower P<0.001 for DT[by,verbose=TRUE] improved in #6296
    Comparison Plot

Generated via commit dc00c4e

Download link for the artifact containing the test results: ↓ atime-results.zip

Task                                   Duration
R setup and installing dependencies    2 minutes and 58 seconds
Installing different package versions  21 seconds
Running and plotting the test cases    2 minutes and 38 seconds

Member

@aitap aitap left a comment

Excellent diagnosis, thank you. Indeed, the costly calls to need2utf8 can be gated by the levels being non-identical (which we need to test anyway). With the regression, profiler shows almost all the time spent in need2utf8charIsASCII.

I think this is more suitable for an atime test than a normal regression test.
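
To illustrate the gating idea discussed above, here is a minimal standalone sketch. It is not the code in src/assign.c: the type `levels_t`, the functions `levels_identical`, `any_non_ascii` and `need_coerce`, and the counter `scanned_strings` are all hypothetical names, and plain C strings stand in for R's character vectors. It only shows the pattern of running the cheap identity check first, so that the per-string scan is skipped when the levels already match.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a factor's levels: an array of C strings. */
typedef struct {
    const char **strs;
    size_t n;
} levels_t;

/* Illustration only: counts how many strings the expensive scan has visited. */
static size_t scanned_strings = 0;

/* Cheap check: same length and equal strings in the same order.
   Pointer equality is tried first, so shared strings cost no comparison. */
static bool levels_identical(const levels_t *a, const levels_t *b) {
    if (a->n != b->n) return false;
    for (size_t i = 0; i < a->n; i++) {
        if (a->strs[i] != b->strs[i] && strcmp(a->strs[i], b->strs[i]) != 0)
            return false;
    }
    return true;
}

/* Expensive check: byte-by-byte scan for any non-ASCII character. */
static bool any_non_ascii(const levels_t *x) {
    for (size_t i = 0; i < x->n; i++) {
        scanned_strings++;
        for (const unsigned char *p = (const unsigned char *)x->strs[i]; *p; p++) {
            if (*p > 127) return true;
        }
    }
    return false;
}

/* Decide whether a UTF-8 coercion is needed; the scan runs only when the
   levels differ, because identical levels need no matching at all. */
static bool need_coerce(const levels_t *source, const levels_t *target) {
    if (levels_identical(source, target)) return false;
    return any_non_ascii(source) || any_non_ascii(target);
}
```

With identical levels, `need_coerce` returns before `any_non_ascii` is ever called, which is the case the regression made slow.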

src/assign.c Outdated
if (needUtf8Coerce) {
  sourceLevels = PROTECT(coerceUtf8IfNeeded(sourceLevels)); protecti++;
  targetLevels = PROTECT(coerceUtf8IfNeeded(targetLevels)); protecti++;
  if (sourceIsFactor && R_compute_identical(sourceLevels, targetLevels, 0)) needUtf8Coerce = false;
Member

@aitap aitap Dec 16, 2025

Is the needUtf8Coerce = false assignment covered? I've tried compiling with -Og and setting a breakpoint on the exact instruction setting the register to 0 and it didn't fire during test.data.table(). I think it might be unreachable.

The results of R_compute_identical() shouldn't change after coerceUtf8IfNeeded() because identical() takes encodings into account:

https://github.com/r-devel/r-svn/blob/96eee1cdda590de914d48fed05d8f0783f921da4/src/main/memory.c#L4978-L4997

This is quite convenient because a factor with levels enc2utf8('ø') and a factor with levels iconv('ø', to = 'latin1') will pass the first R_compute_identical() already, without any other string conversions.
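
A rough standalone sketch of what "takes encodings into account" means for the 'ø' example: two strings with different encoding marks compare equal once both are brought to UTF-8. This is not R's implementation; `enc_t`, `marked_str`, `latin1_to_utf8` and `marked_equal` are hypothetical names, and only Latin-1 and UTF-8 are modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoding mark attached to a string. */
typedef enum { ENC_UTF8, ENC_LATIN1 } enc_t;

typedef struct {
    const char *bytes;
    enc_t enc;
} marked_str;

/* Translate a Latin-1 string into a freshly allocated UTF-8 string.
   Returns NULL if allocation fails; the caller frees the result. */
static char *latin1_to_utf8(const char *s) {
    size_t len = strlen(s);
    char *out = malloc(2 * len + 1);  /* a Latin-1 byte needs at most 2 UTF-8 bytes */
    if (!out) return NULL;
    char *q = out;
    for (const unsigned char *p = (const unsigned char *)s; *p; p++) {
        if (*p < 0x80) {
            *q++ = (char)*p;                    /* ASCII is unchanged */
        } else {
            *q++ = (char)(0xC0 | (*p >> 6));    /* lead byte */
            *q++ = (char)(0x80 | (*p & 0x3F));  /* continuation byte */
        }
    }
    *q = '\0';
    return out;
}

/* Encoding-aware equality: compare bytes directly when the marks agree,
   otherwise translate the Latin-1 side to UTF-8 before comparing. */
static bool marked_equal(marked_str a, marked_str b) {
    if (a.enc == b.enc) return strcmp(a.bytes, b.bytes) == 0;
    const marked_str *l1 = (a.enc == ENC_LATIN1) ? &a : &b;
    const marked_str *u8 = (a.enc == ENC_LATIN1) ? &b : &a;
    char *tmp = latin1_to_utf8(l1->bytes);
    if (!tmp) return false;
    bool eq = strcmp(tmp, u8->bytes) == 0;
    free(tmp);
    return eq;
}
```

Here the Latin-1 byte 0xF8 and the UTF-8 bytes 0xC3 0xB8 both denote 'ø', so `marked_equal` treats them as the same string even though their raw bytes differ.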

Member Author

Good point! I removed the unreachable code and will add an atime test in a separate PR. I guess our current "issue" with atime tests, and the reason we have to keep these branches alive, is that we squash when merging, so the original commits disappear.

@tdhock @Anirban166 does this sound right?

@aitap
Member

aitap commented Dec 17, 2025 via email

@ben-schwen ben-schwen merged commit b6ad1a4 into master Dec 17, 2025
13 checks passed
@ben-schwen ben-schwen deleted the memrecycle_factor branch December 17, 2025 11:10
@ben-schwen
Member Author

@TysonStanley this should probably also be picked for the release, since it is a regression fix for something included in 1.17.2!
