
Codec 317 | Fix. ColognePhonetic: Duplicate code in some cases #424

Closed
Shalujha0907 wants to merge 5 commits into apache:master from Shalujha0907:CODEC-317-ColognePhonetic-Bug-Fix

Conversation

@Shalujha0907
Contributor

ColognePhonetic: Duplicate code in some cases

This PR fixes the above issue.

Summary
This PR fixes duplicate-code handling in ColognePhonetic when processing characters that do not directly produce output (especially H) and adds regression tests for the affected scenarios.

Root Cause
The duplicate filter depends on the previous effective phonetic code, but skipped/intermediate characters were still influencing lastCode, so adjacent-equivalence checks were performed against the wrong value.

Impact
Fixes incorrect duplicate handling around skipped/intermediate characters.
Preserves expected Cologne Phonetic output rules.
Improves confidence via targeted regression tests.

Member

@garydgregory garydgregory left a comment

Hello @Shalujha0907
-1: This PR doesn't fix the bug and adds more broken test cases.
Note the rule "Collapse of all multiple consecutive code digits".
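
A minimal sketch of how that rule plays out, assuming the commonly published Cologne phonetics post-processing order: runs of identical digits are collapsed first, and only afterwards is every 0 except a leading one dropped. This is not the Commons Codec implementation; `CollapseSketch` and its method names are hypothetical, and the input `0336066` is the per-letter code sequence assumed here for "hoffmann" (H contributes no code, O and A map to 0, F to 3, M and N to 6).

```java
public class CollapseSketch {

    // Step 1 (assumed order): drop a digit when it equals the digit kept just before it.
    static String collapseRuns(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (out.length() == 0 || out.charAt(out.length() - 1) != c) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Step 2 (assumed order): remove every '0' except one in the first position.
    static String dropInnerZeros(String collapsed) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < collapsed.length(); i++) {
            char c = collapsed.charAt(i);
            if (c != '0' || i == 0) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical helper: applies both steps to a raw per-letter code string.
    static String finish(String raw) {
        return dropInnerZeros(collapseRuns(raw));
    }

    public static void main(String[] args) {
        // 0336066 -> 03606 after collapsing runs -> 0366 after dropping the inner zero.
        System.out.println(finish("0336066")); // prints 0366
    }
}
```

Under these assumptions the two 6s in the result were never adjacent when the collapse ran, because the 0 from the vowel still separated them; they only end up side by side once the zero is dropped.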

@garydgregory
Member

Hello @Shalujha0907

This is now fixed in git master.

Please verify your use case from git master or a 1.22.0-SNAPSHOT from https://repository.apache.org/content/repositories/snapshots/commons-codec/commons-codec/1.22.0-SNAPSHOT/

If appropriate, then close or update this PR with additional tests or fixes for this duplicate issue.

Thank you!

@garydgregory
Member

The Jira ticket is https://issues.apache.org/jira/browse/CODEC-317

@Shalujha0907
Contributor Author

> This is now fixed in git master.
>
> Please verify your use case from git master or a 1.22.0-SNAPSHOT from https://repository.apache.org/content/repositories/snapshots/commons-codec/commons-codec/1.22.0-SNAPSHOT/
>
> If appropriate, then close or update this PR with additional tests or fixes for this duplicate issue.
>
> Thank you!

Hello @garydgregory

Understood! Maybe we don't need a fix.
But I have one question: how can the output for "hoffmann" be "0366"? Doesn't that also violate the rule?

Thank You!

@garydgregory
Member

Hello @Shalujha0907

Please check git master.

@Shalujha0907
Contributor Author

> Hello @Shalujha0907
>
> Please check git master.

Hello! Sure, I have seen it.
I'll close this PR.
Thank you!

@Shalujha0907 Shalujha0907 deleted the CODEC-317-ColognePhonetic-Bug-Fix branch February 17, 2026 14:08
@Shalujha0907
Contributor Author

Closing it.
