SOLR-18239: Fixed NPE during size estimator calculations #4427

Merged
epugh merged 8 commits into apache:main from jaykay12:SOLR-18239
May 15, 2026

Contributor

@jaykay12 jaykay12 commented May 14, 2026

https://issues.apache.org/jira/browse/SOLR-18239

Description

Fix the NPE thrown in the size estimator when a Solr document contains a field with a null value.

Solution

Added null checks in the two places where the size estimator class is defined.
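
As a rough illustration of the fix described here (not the actual Solr code), the sketch below shows how a null field value triggers the NPE through `obj.getClass()` and how an early null check avoids it. The class name `NullSafeEstimateSketch`, the method names, and the fallback size of 8 are assumptions made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the estimator touched by this PR; not Solr's class. */
final class NullSafeEstimateSketch {

  /** Pre-fix shape: dereferences obj without checking it. */
  static long unguarded(Object obj, long def) {
    Class<?> clazz = obj.getClass(); // throws NullPointerException when obj is null
    if (clazz == String.class) {
      return (long) ((String) obj).length() * Character.BYTES;
    }
    return def;
  }

  /** Post-fix shape: a null value falls back to the default size. */
  static long guarded(Object obj, long def) {
    if (obj == null) return def; // the added null check
    return unguarded(obj, def);
  }

  /** Sums key and value estimates; 8 is an assumed fallback size for values. */
  static long estimate(Map<String, Object> fields) {
    long size = 0;
    for (Map.Entry<String, Object> e : fields.entrySet()) {
      size += guarded(e.getKey(), 0L) + guarded(e.getValue(), 8L);
    }
    return size;
  }

  public static void main(String[] args) {
    Map<String, Object> doc = new HashMap<>();
    doc.put("id", "1");
    doc.put("title", null); // HashMap permits null values, like a field with no value
    System.out.println(estimate(doc)); // prints 24
  }
}
```

With the guard in place, the document containing a null `title` is sized instead of failing; calling `unguarded(null, 8L)` directly would still throw.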

Tests

Previously there were no checks or assertions for null-valued fields; I added those assertions to the existing test. The other file had no test covering this code, so I added a small test there as well, modeled on the existing one, since it lives in an entirely different submodule.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide
  • I have added a changelog entry for my change

@jaykay12 jaykay12 changed the title from "SOLR-18239 : Fixed NPE during size estimator calculations" to "SOLR-18239: Fixed NPE during size estimator calculations" May 14, 2026
Contributor

@epugh epugh left a comment

just a bit of tidying needed, and of course ci checks.

@@ -0,0 +1,31 @@
# (DELETE ALL COMMENTS UP HERE AFTER FILLING THIS IN
Contributor

don't forget to do the clean up!

Contributor Author

cleanup done.

Comment thread changelog/unreleased/SOLR-18239-fix-npe-size-estimator.yml Outdated
Contributor Author

jaykay12 commented May 14, 2026

thanks @epugh for quickly taking a look.

> just a bit of tidying needed, and of course ci checks.

tidying up ✅
Since I raised the PR from a fork, 3 CI checks are awaiting maintainer approval before they can run; please approve those workflows.

@epugh epugh requested a review from sigram May 15, 2026 13:31
Contributor

epugh commented May 15, 2026

This all looks good to me, and I'm inclined to merge it when I get a good test run. The only thing I wonder about, and this is my lack of overall knowledge, is that while we are fixing the NPE, is there any chance that the fact that you can get an NPE implies a bug or issue further up the process? How does a field come in that has a null value? Is that acceptable?

@Aaron-J-Dockter Aaron-J-Dockter left a comment

Looks good to me. Thank you.

Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a NullPointerException in Solr’s “object size estimator” logic when encountering fields with null values (in both core and cross-dc modules), and extends tests + changelog to cover the regression.

Changes:

  • Add a null guard in primitiveEstimate(...) to prevent NPEs during size estimation.
  • Extend existing core tests and add cross-dc tests to cover estimate(null) and maps/documents containing null field values.
  • Add an unreleased changelog entry for the fix.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| solr/modules/cross-dc/src/test/org/apache/solr/crossdc/update/processor/MirroringUpdateProcessorTest.java | Adds a new unit test covering size estimation behavior with null values (including child docs). |
| solr/modules/cross-dc/src/java/org/apache/solr/crossdc/update/processor/MirroringUpdateProcessor.java | Adds a null check in the size estimator to avoid NPE on obj.getClass(). |
| solr/core/src/test/org/apache/solr/update/processor/IgnoreLargeDocumentProcessorFactoryTest.java | Expands existing tests to include estimate(null) and maps with a null entry. |
| solr/core/src/java/org/apache/solr/update/processor/IgnoreLargeDocumentProcessorFactory.java | Adds a null check in the core size estimator to avoid NPE on obj.getClass(). |
| changelog/unreleased/SOLR-18239-fix-npe-size-estimator.yml | Adds release-note fragment for the fix. |

Comments suppressed due to low confidence (2)

solr/core/src/java/org/apache/solr/update/processor/IgnoreLargeDocumentProcessorFactory.java:181

  • primitiveEstimate checks clazz.isPrimitive(), but since obj is an Object this branch is effectively unreachable (primitives can’t be instances) and primitiveSizes entries (including boxed types) will never be used. If the intent is to account for boxed primitives, switch the check to primitiveSizes.containsKey(clazz) (or get/getOrDefault) so Integer, Long, etc. are counted; otherwise remove the unused map entries/branch to avoid misleading dead code.
    private static long primitiveEstimate(Object obj, long def) {
      if (obj == null) return def;
      Class<?> clazz = obj.getClass();
      if (clazz.isPrimitive()) {
        return primitiveSizes.get(clazz);
      }
      if (obj instanceof String) {
        return (long) ((String) obj).length() * Character.BYTES;
      }
      return def;

solr/modules/cross-dc/src/java/org/apache/solr/crossdc/update/processor/MirroringUpdateProcessor.java:527

  • primitiveEstimate uses clazz.isPrimitive(), but because obj is an Object this condition will never be true for numeric/boolean values (they’re always boxed), so primitiveSizes is effectively unused (except for Strings handled later). Consider changing this to a primitiveSizes.containsKey(clazz) lookup so boxed primitives are counted, or remove the dead code to avoid confusion.
    private static long primitiveEstimate(Object obj, long def) {
      if (obj == null) return def;
      Class<?> clazz = obj.getClass();
      if (clazz.isPrimitive()) {
        return primitiveSizes.get(clazz);
      }
      if (obj instanceof String) {
        return ((String) obj).length() * (long) Character.BYTES;
      }
      return def;
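
To make the point in the two suppressed comments concrete, here is a minimal sketch (not the Solr code) contrasting the `isPrimitive()` check with a map lookup on the boxed class. The class name `BoxedLookupSketch` and the contents of the `SIZES` map are assumptions for illustration; the entries of the real `primitiveSizes` map are not shown on this page.

```java
import java.util.Map;

/** Hypothetical illustration of the isPrimitive() dead branch; not Solr's class. */
final class BoxedLookupSketch {

  // Assumed size table keyed by boxed types, standing in for primitiveSizes.
  static final Map<Class<?>, Long> SIZES = Map.of(
      Integer.class, (long) Integer.BYTES,
      Long.class, (long) Long.BYTES,
      Double.class, (long) Double.BYTES);

  /** Shape of the check quoted above: the isPrimitive() branch is never taken. */
  static long viaIsPrimitive(Object obj, long def) {
    if (obj == null) return def;
    Class<?> clazz = obj.getClass();
    if (clazz.isPrimitive()) { // an Object's runtime class is never a primitive type
      return SIZES.get(clazz);
    }
    return def;
  }

  /** Suggested alternative: look the boxed class up directly. */
  static long viaLookup(Object obj, long def) {
    if (obj == null) return def;
    return SIZES.getOrDefault(obj.getClass(), def);
  }

  public static void main(String[] args) {
    Object boxed = 42; // autoboxed to Integer
    System.out.println(boxed.getClass().isPrimitive()); // prints false
    System.out.println(viaIsPrimitive(boxed, -1L));     // prints -1
    System.out.println(viaLookup(boxed, -1L));          // prints 4
  }
}
```

Whether boxed values should be counted this way or the unused branch simply removed is a design decision the comments leave open.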

Comment thread changelog/unreleased/SOLR-18239-fix-npe-size-estimator.yml
Contributor Author

jaykay12 commented May 15, 2026

Not sure why the CI checks are failing for these 4 tests. link

[Screenshot 2026-05-15 at 11 10 06 PM]

I ran these 4 tests individually on my local setup, and they passed.
I am now running ./gradlew :solr:core:test locally to check the entire suite as well.

[Screenshot 2026-05-15 at 11 37 43 PM]

@epugh would you know why this is happening in the Actions CI checks?

Contributor

epugh commented May 15, 2026

Thanks for trying. Solr's tests are pretty dependent on timing and CPU. Crave runs them on a big box, so sometimes things fail. Any number of tests can fail due to variations in CI. Often if they fail, I will test them individually and see if they fail using the seed value; if they do, then that is a true bug.

you ran locally, and they passed, so that is pretty indicative that things are good.

We actually use various tools including https://develocity.apache.org/scans/tests?search.relativeStartTime=P90D&search.rootProjectNames=solr*&search.timeZoneId=America%2FNew_York to detect which tests are flaky, and then try to improve them. It's an ongoing battle!
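
The idea of re-running a failure "using the seed value" can be sketched generically: when a test draws its randomized inputs from a seeded generator, the same seed reproduces the same inputs. This is only an illustration with `java.util.Random`, not Solr's test framework; the class name `SeededReproSketch` and the seed value are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of seed-based reproduction; not Solr's test harness. */
final class SeededReproSketch {

  /** Generates the "random" inputs a run would see for the given seed. */
  static List<Integer> randomInputs(long seed, int count) {
    Random rnd = new Random(seed);
    List<Integer> inputs = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      inputs.add(rnd.nextInt(1000));
    }
    return inputs;
  }

  public static void main(String[] args) {
    long failingSeed = 0xDEADBEEFL; // made-up seed standing in for one reported by CI
    // Two runs with the same seed see identical inputs.
    System.out.println(randomInputs(failingSeed, 5).equals(randomInputs(failingSeed, 5))); // prints true
  }
}
```

A failure that recurs with the same seed points at the code or the test logic, while one that disappears is more likely tied to timing or load on the CI machine.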

Contributor

epugh commented May 15, 2026

> This all looks good to me, and I'm inclined to merge it when I get a good test run. The only thing I wonder about, and this is my lack of overall knowledge, is that while we are fixing the NPE, is there any chance that the fact that you can get an NPE implies a bug or issue further up the process? How does a field come in that has a null value? Is that acceptable?

An LLM helped me confirm that yes, there are lots of places where a field value can be null!

@epugh epugh merged commit b7aeda9 into apache:main May 15, 2026
4 checks passed
epugh added a commit that referenced this pull request May 15, 2026
Co-authored-by: Eric Pugh <epugh@opensourceconnections.com>
epugh added a commit that referenced this pull request May 15, 2026
Co-authored-by: Eric Pugh <epugh@opensourceconnections.com>
(cherry picked from commit b7aeda9)
@jaykay12 jaykay12 deleted the SOLR-18239 branch May 15, 2026 18:56
epugh added a commit that referenced this pull request May 18, 2026
Co-authored-by: Eric Pugh <epugh@opensourceconnections.com>
(cherry picked from commit b7aeda9)
(cherry picked from commit ea61a99)
epugh added a commit that referenced this pull request May 18, 2026
Co-authored-by: Eric Pugh <epugh@opensourceconnections.com>
(cherry picked from commit b7aeda9)
(cherry picked from commit ea61a99)
epugh added a commit that referenced this pull request May 18, 2026
Co-authored-by: Eric Pugh <epugh@opensourceconnections.com>
(cherry picked from commit b7aeda9)
(cherry picked from commit ea61a99)
4 participants