@ThrawnCA
Contributor

Add unit tests demonstrating incorrect behaviour of StringUtils.abbreviate on short strings (between abbrevMarker.length + 1 and abbrevMarker.length * 2).

The documentation states that the offset character will always appear somewhere in the result, but when an offset is applied to a short string, that character may be missing from the result.
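The documented guarantee can be expressed as a small check. The sketch below is only illustrative: `OffsetContract` and `retainsOffsetChar` are hypothetical names, not part of Commons Lang, and the helper assumes the original string does not itself contain the marker and that the retained text occurs only once in the original.

```java
final class OffsetContract {

    /**
     * Returns true if the character at {@code offset} in {@code original}
     * is still present in {@code abbreviated}.
     * Hypothetical helper for illustration; not a Commons Lang API.
     */
    static boolean retainsOffsetChar(String original, String abbreviated, String marker, int offset) {
        if (offset < 0 || offset >= original.length()) {
            return false; // there is no character at this offset to retain
        }
        // Strip a leading and a trailing marker to get the retained text.
        String body = abbreviated;
        if (!marker.isEmpty() && body.startsWith(marker)) {
            body = body.substring(marker.length());
        }
        if (!marker.isEmpty() && body.endsWith(marker)) {
            body = body.substring(0, body.length() - marker.length());
        }
        // Locate the retained text in the original and test whether it covers the offset.
        int start = original.indexOf(body);
        return !body.isEmpty() && start >= 0 && offset >= start && offset < start + body.length();
    }

    public static void main(String[] args) {
        String s = "abcdefghijklmno";
        System.out.println(retainsOffsetChar(s, "...ghij...", "...", 6)); // true
        System.out.println(retainsOffsetChar(s, "abcdefg...", "...", 7)); // false
    }
}
```

A check along these lines could be applied to the result of each abbreviate call in a test, alongside the exact-string assertions.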

- treat null marker as empty string
- ensure offset and maxWidth are applied as usual (simple 'substring' won't cut it)
- Abbreviated strings should always retain the 'offset' character
assertAbbreviateWithOffset("...ijklmno", 15, 10);
assertAbbreviateWithOffset("...ijklmno", 16, 10);
assertAbbreviateWithOffset("...ijklmno", Integer.MAX_VALUE, 10);

Member

No need for extra blank lines; you already have // comments.

@ThrawnCA
Contributor Author

ThrawnCA commented Jan 19, 2026

I can see some possible paths to fixing this, but more clarity is needed on the desired behaviour.

In cases such as abbreviate("abcdefghijklmno", "...", 6, 10), there are multiple ways that the string could be abbreviated while keeping the contract:

  1. We could prioritise retaining as much of the original string as possible, and truncate it to "abcdefg..."
  2. We could prioritise making the offset the leftmost character, and truncate it to "...ghij..."
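To make the two options concrete, here is a rough sketch of each. It is not the Commons Lang implementation: `AbbreviateStrategies`, `keepPrefix`, and `offsetLeftmost` are hypothetical names, and the sketch assumes a non-null marker and a `maxWidth` of at least twice the marker length plus one.

```java
final class AbbreviateStrategies {

    /** Option 2: make the offset character the leftmost character kept. */
    static String offsetLeftmost(String str, String marker, int offset, int maxWidth) {
        int m = marker.length();
        if (maxWidth < 2 * m + 1) {
            throw new IllegalArgumentException("maxWidth too small for this sketch");
        }
        if (str.length() <= maxWidth) {
            return str; // nothing to abbreviate
        }
        int from = Math.max(0, offset);
        if (from == 0) {
            // Offset is the first character: only a trailing marker is needed.
            return str.substring(0, maxWidth - m) + marker;
        }
        if (str.length() - from <= maxWidth - m) {
            // The remainder fits: show the tail of the string after a leading marker.
            return marker + str.substring(str.length() - (maxWidth - m));
        }
        // Window starting at the offset, with a marker on both sides.
        return marker + str.substring(from, from + maxWidth - 2 * m) + marker;
    }

    /** Option 1: keep the start of the string whenever the offset character still fits. */
    static String keepPrefix(String str, String marker, int offset, int maxWidth) {
        int m = marker.length();
        if (maxWidth < 2 * m + 1) {
            throw new IllegalArgumentException("maxWidth too small for this sketch");
        }
        if (str.length() <= maxWidth) {
            return str;
        }
        if (offset < maxWidth - m) {
            // The offset character lies inside the retained prefix.
            return str.substring(0, maxWidth - m) + marker;
        }
        // Otherwise fall back to a window that starts at the offset.
        return offsetLeftmost(str, marker, offset, maxWidth);
    }

    public static void main(String[] args) {
        String s = "abcdefghijklmno";
        System.out.println(keepPrefix(s, "...", 6, 10));     // abcdefg...
        System.out.println(offsetLeftmost(s, "...", 6, 10)); // ...ghij...
        System.out.println(offsetLeftmost(s, "...", 4, 10)); // ...efgh...
    }
}
```

Both outputs for offset 6 contain "g", so either strategy satisfies the documented contract; they differ only in which part of the original they favour.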

If 1) is preferred, then the existing unit test:

assertAbbreviateWithOffset("...ghij...", 6, 10);

is incorrect. It should instead abbreviate to "abcdefg..."

On the other hand, if 2) is preferred, then the existing unit test:

assertAbbreviateWithOffset("abcdefg...", 4, 10);

is incorrect. This case should instead abbreviate to "...efgh..."

Or, if both behaviours are acceptable, then the unit tests are too strict.

The decision on this will inform what direction the fix for short strings will take.

@garydgregory
Member

garydgregory commented Jan 19, 2026

Hello @ThrawnCA

Thank you for your report.

The new (failing) test cases look correct to me. I'll wait for your changes...
