Skip to content

Conversation

@alexprengere
Copy link
Contributor

@alexprengere alexprengere commented Nov 28, 2025

@alexprengere alexprengere force-pushed the textwrap_without_ansi_escapes branch from 21b7359 to 6517434 Compare November 28, 2025 14:46
@alexprengere
Copy link
Contributor Author

alexprengere commented Nov 28, 2025

This is just a POC, it needs tests, docs, etc.
Note that there are other issues with unicode handling, textwrap will split multiple-code-points glyphs.

>>> textwrap.wrap("Cafe\u0301 is good", width=4)
['Cafe', '́ is', 'good']

@picnixz
Copy link
Member

picnixz commented Nov 29, 2025

Sorry but we should not change textwrap now. It's a major change and this may break more than one code (it's also a new feature so we cannot fix 3.14 directly). I've had this kind of issue with Sphinx recently: sphinx-doc/sphinx#13762 where 0-width glyphs are not handled correctly by textwrap. I will close this PR and suggest that you first focus on fixing argparse itself for the colors if you want.

@picnixz picnixz closed this Nov 29, 2025
@picnixz
Copy link
Member

picnixz commented Nov 29, 2025

Oh actually the patch is quite simple. However as it's a new feature we cannot backport it =/ so we need to think of a solution that can be backported. If it's not possible I would suggest:

  • Leaving it like this in 3.14.
  • Fix it in 3.15 with the new feature you suggested.

If the PR is not ready, I will make it a draft.

@picnixz picnixz reopened this Nov 29, 2025
@picnixz picnixz marked this pull request as ready for review November 29, 2025 09:18
@picnixz picnixz marked this pull request as draft November 29, 2025 09:18
@alexprengere
Copy link
Contributor Author

Thanks for the reply! I was not thinking about backporting the changes to 3.14 to be honest.

The cleanest solution would be to incorporate a complete unicode handling with a technique similar to wcwidth, that would allow for proper wrapping of all glyphs.

That being said, as this would be indeed a major addition to stdlib, a good first step might be to add the new option to ignore_ansi_escape to textwrap, as this is a much smaller change, and backward-compatible as well, if it defaults to False. My initial PR was a draft already, but I can finish it if you feel this is OK.

@picnixz
Copy link
Member

picnixz commented Nov 29, 2025

Well.. looks like the current tests are broken which is not good. I need to think about the changes to textwrap. Ideally, we should be able to specify a len function instead. That's what I did when I wrote my Sphinx' fix. We have a custom _wrap_text. But OTOH, I think we should do something in textwrap to make it extensible or subclassable so that the logic of detecting a word & length is actually overridable. For now, let's hold a bit so that you are not working for naught.

@serhiy-storchaka Do you think it makes sense to make textwrap more customizable in how it detects the different words and splits. We can already do it through a regex but we should also be able to provide a custom length function to have a better tweakable behavior. This is just an idea and this really needs more polishing especially with how we would like to provide such interface if this makes sense.

@alexprengere
Copy link
Contributor Author

Sorry, the draft contained a bug due to a copy paste error, I just pushed a fix.

@serhiy-storchaka
Copy link
Member

See #56708, and also #56777 and #68853. I think that they would also resolve this issue.

@alexprengere
Copy link
Contributor Author

#28136 seems to me the simplest solution to fix this, more extensible than this PR, avoids introducing ANSI escape code knowledge in textwrap module.
So I can close this if #28136, or a similar solution, is merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants