Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#3159 Fix misleading subsequence failure messages in string assertions #3166

Closed
wants to merge 6 commits into from

Conversation

JohnBryte
Copy link
Contributor

@JohnBryte JohnBryte commented Sep 1, 2023

  - Resolved ambiguity in error messages when a subsequence element is repeated in the expectation but not in the actual string.
  - Refactored `assertContainsSubsequence` to handle different subsequence scenarios.
  - Improved error handling by introducing `handleNotFound` to distinctly report missing elements and discrepancies in expected duplicates.
  - Enhanced `getNotFoundSubsequence` to correctly compute and return missing or inadequately represented subsequences.

  This change ensures clearer and more accurate failure messages when asserting string subsequences.

Check List:

Following the contributing guidelines will make it easier for us to review and accept your PR.

Description:

The core assertContainsSubsequence method has undergone a significant overhaul to better cater to a broader range of subsequence scenarios. The driving force behind this refactoring is twofold:

  • Enhanced Subsequence Handling:

    The refactoring ensures that the method can adeptly handle cases where there might be duplicate subsequences present. The essence of this handling lies in distinguishing between two different situations:
    Where the actual string contains the subsequence but not as many times as the subsequence array dictates.
    Where the actual string doesn't contain a particular subsequence at all.

  • Introduction of handleNotFound Method:

    A fresh method, handleNotFound, has been extracted to specifically address the nuances of missing subsequences. A core task of this new method is:
    To scrutinize whether missing subsequences arise because the actual string didn't cater to the repeated instances as specified by the subsequence array.

  • Behavioural Precedence:

In instances where at least one subsequence is completely absent from the actual string, this missing subsequence takes precedence over any discrepancies in the count of other subsequences. As illustrated in the example:

String actual = "Yoda";
String[] subsequence = { "Yo", "da", "Han", "Foo" };

This will yield the same output as:

String actual = "Yoda";
String[] subsequence = { "Yo", "da", "Han", "Foo" , "da"};

Even though "da" appears twice in the latter example, the predominant error message would be about the absence of the strings "Han" and "Foo" in the actual string. This decision was made to ensure that the most pertinent errors (complete absence) are flagged prior to counting discrepancies.

…sertions

      - Resolved ambiguity in error messages when a subsequence element is repeated in the expectation but not in the actual string.
      - Refactored `assertContainsSubsequence` to handle different subsequence scenarios.
      - Improved error handling by introducing `handleNotFound` to distinctly report missing elements and discrepancies in expected duplicates.
      - Enhanced `getNotFoundSubsequence` to correctly compute and return missing or inadequately represented subsequences.

      This change ensures clearer and more accurate failure messages when asserting string subsequences.
…sertions

      - Rewording of error message according to suggestion.
      - new Unit test to assert error message in ShouldContainSubsequenceOfCharSequence_create_Test.java
@JohnBryte
Copy link
Contributor Author

Just a quick question, currently it seems my PR fails the checks because of:

[ERROR] /home/runner/work/assertj/assertj/assertj-core/src/test/java/org/assertj/core/error/ShouldContainSubsequenceOfCharSequence_create_Test.java:[73,87] cannot find symbol
  symbol:   method of(java.lang.String,int)
  location: interface java.util.Map

I was wondering if I need to use ImmutableMap, as I've seen that ImmutableMap is used almost exclusively
Or should I do something like

Map<CharSequence, Integer> notFound = new HashMap<>();
notFound.put(key, value);

Please let me know, so I can try to solve the conflicts

@JohnBryte
Copy link
Contributor Author

I now (hopefully) fixed the formatting issues. If the checks could be run again on occasion I would greatly appreciate it

@joel-costigliola
Copy link
Member

joel-costigliola commented Sep 11, 2023

Thanks @JohnBrytek, the checks are ok, the windows build failure is not related.

Copy link
Member

@joel-costigliola joel-costigliola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JohnBryte thanks for the PR, first round of review done.

+
" [\"{\", \"title\", \"author\", \"title\", \"}\"]%n" +
"But%n" +
"2nd occurrence of \"title\" was not found%n" +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we go the next line ? I feel it would be better like this (unless I'm missing something)

 "But the 2nd occurrence of \"title\" was not found%n" +

I'm adding the before 2nd as it sounds to me to be better English wise

Copy link
Contributor Author

@JohnBryte JohnBryte Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Joel, thanks for all the feedback!
The idea behind the next line was (my latest push has an explicit test to demonstrate) if there are multiple duplicates in they would all be listed underneath each other and form a "block".
E.g.:

    String[] sequenceValues = { "{", "title", "George", "title", "title", "George", "}" };
    String actual = "{ 'title':'A Game of Thrones', 'author':'George Martin', 'title':'The Kingkiller Chronicle', 'author':'Patrick Rothfuss'}";

The expected outcome (with newline) would be something like:

"But"
"the 3rd occurrence of \"title\" was not found%n"
"the 2nd occurrence of \"George\" was not found%n"

Without the new line:

"But the 3rd occurrence of \"title\" was not found%n"
"the 2nd occurrence of \"George\" was not found%n"

But this can be easily adapted to you preference!
Or should this scenario handled in another way?
For example, fail with the first duplicate encountered?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification on the wording, I would say it is better to report as many errors as possible and for the wording my preference would go to this:

But: 
- the 3rd occurrence of \"title\" was not found
- the 2nd occurrence of \"George\" was not found

and if there is only one error:

But the 3rd occurrence of \"title\" was not found

        - Adopted static imports for cleaner code.
        - Modified visibility and added Javadoc to certain methods for clarity.
        - Simplified ordinal determination logic by removing unnecessary else/else if conditions.
        - Expanded test coverage for ordinal method with a parameterized test approach.
@joel-costigliola
Copy link
Member

Thanks for the quick update, I'll do a second review and likely merge the PR unless I find something significant.

@joel-costigliola
Copy link
Member

@JohnBryte only one comment on the PR about the wording on the message.

         - Modify the error message for the case where there's more than one missing subsequence to list them using bullet points.
         - Keep a simple format for the case of a single missing subsequence.
@joel-costigliola
Copy link
Member

Integrated thanks @JohnBryte for the good work!

@JohnBryte
Copy link
Contributor Author

Thank you very much!

@sbrannen
Copy link
Contributor

Thanks for addressing this, @JohnBryte! 👏

@scordio scordio added this to the 3.25.0 milestone Dec 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error message for containsSubsequence is misleading for duplicate subsequence
4 participants