Normalization of unicode characters: allow excluding the symbols in the symbols.dic file from the normalization #16624
Comments
cc: @LeonarddeR I hope this will not be a show stopper for this feature. The symbols defined in the symbols file really must be pronounced as defined there, and not as prescribed by the normalization. This could be tricky to fix. |
Please provide exact steps to reproduce instead of just summing up what's wrong. IMO the bug report template would be more suitable here, since this is definitely not intentional. |
Also, please consider testing with #16622 since I'm pretty sure it is already fixed there. |
Ah thanks, it seems with #16622 it works properly. |
@seanbudd While I understand your reason for closing this, I'm inclined to leave this open and mark this as fixed as soon as #16622 is closed. I think the point raised by @Adriani90 is perfectly valid. It is an expected side effect of the current approach where normalization is only applied to text info and object speech normalization. Character processing and symbol pronunciation are applied thereafter. |
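To illustrate the ordering side effect described above, here is a minimal sketch in plain Python. It assumes the normalization is NFKC and uses a hypothetical `SYMBOL_PRONUNCIATIONS` dictionary as a stand-in for entries in symbols.dic; neither the dictionary nor the helper functions are NVDA's actual code.

```python
import unicodedata

# Hypothetical stand-in for entries in symbols.dic; not NVDA's real data structure.
SYMBOL_PRONUNCIATIONS = {"²": "squared"}


def speak_symbols(text):
    """Replace each character that has a defined pronunciation; keep the rest."""
    return " ".join(SYMBOL_PRONUNCIATIONS.get(ch, ch) for ch in text)


def normalize_then_symbols(text):
    """Normalize first (assumed NFKC), then apply symbol pronunciation."""
    return speak_symbols(unicodedata.normalize("NFKC", text))


def symbols_without_normalization(text):
    """Apply symbol pronunciation to the original, unnormalized text."""
    return speak_symbols(text)


print(normalize_then_symbols("x²"))         # x 2
print(symbols_without_normalization("x²"))  # x squared
```

Once "²" has been normalized to "2", the symbol lookup no longer sees the original character, so the pronunciation defined for it is never used.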
@LeonarddeR - is it possible for the fix for this to be independent of #16622 / #16616 and make it into 2024.3? |
Theoretically yes, that is if we normalize, we need to do symbol processing first, i.e. throw the text through |
2024.3 is still far away, and #16622 seems to fix this. I still think it makes sense to have it enabled by default. This will result in broader community awareness. If side effects appear, the default behavior could be disabled again later on. Having this enabled will definitely not introduce any severe bug, freeze or crash. |
Closing as fixed by #16622 |
Is your feature request related to a problem? Please describe.
The normalization feature for unicode characters takes priority over symbols in the symbols.dic file. This leads to, for example, the following problem:
Many characters in math equations are now reported, but not with the pronunciation defined in symbols.dic.
Describe the solution you'd like
Always exclude symbols added to the symbols.dic file from normalization.
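A minimal sketch of what such an exclusion could look like, assuming NFKC normalization and a hypothetical `protected` set holding the characters defined in symbols.dic. The function name `normalize_except` is made up for illustration and is not part of NVDA.

```python
import unicodedata


def normalize_except(text, protected):
    """NFKC-normalize text, leaving every character in `protected` untouched.

    `protected` is a hypothetical set of characters taken from symbols.dic.
    """
    out = []
    pending = []  # run of unprotected characters waiting to be normalized
    for ch in text:
        if ch in protected:
            # Flush the run collected so far, then copy the protected character as-is.
            out.append(unicodedata.normalize("NFKC", "".join(pending)))
            pending.clear()
            out.append(ch)
        else:
            pending.append(ch)
    # Flush whatever remains after the last protected character.
    out.append(unicodedata.normalize("NFKC", "".join(pending)))
    return "".join(out)


print(normalize_except("ﬁx²", {"²"}))  # fix²
print(normalize_except("ﬁx²", set()))  # fix2
```

Normalizing whole runs rather than single characters keeps combining sequences inside a run intact. A combining mark that directly follows a protected character would be normalized separately from it, which a real implementation would need to consider.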
Describe alternatives you've considered
None
Additional context
None