White space U+000C FORM FEED in template not preserved #769

Closed
bignose-debian opened this issue Sep 13, 2017 · 4 comments · Fixed by #1366

bignose-debian commented Sep 13, 2017

Expected Behavior

A template containing the white space character U+000C FORM FEED should render with that white space character preserved in the rendered output.

>>> import pprint
>>> import textwrap
>>> template_content = textwrap.dedent("""\
...     Lorem ipsum, dolor sit amet, consectetur adipiscing elit.
...     \N{FORM FEED}
...     Pellentesque maximus a ligula ut vehicula.
...     """)
>>> pprint.pprint(template_content)
('Lorem ipsum, dolor sit amet, consectetur adipiscing elit.\n'
 '\x0c'
 '\n'
 'Pellentesque maximus a ligula ut vehicula.\n')
>>> import jinja2
>>> jinja2.__version__
'2.9.6'
>>> template = jinja2.Template(template_content)
>>> pprint.pprint(template.render())
('Lorem ipsum, dolor sit amet, consectetur adipiscing elit.\n'
 '\x0c'
 '\n'
 'Pellentesque maximus a ligula ut vehicula.')

Actual Behavior

The U+000C character is removed and replaced with U+000A LINE FEED.

>>> import pprint
>>> import textwrap
>>> template_content = textwrap.dedent("""\
...     Lorem ipsum, dolor sit amet, consectetur adipiscing elit.
...     \N{FORM FEED}
...     Pellentesque maximus a ligula ut vehicula.
...     """)
>>> pprint.pprint(template_content)
('Lorem ipsum, dolor sit amet, consectetur adipiscing elit.\n'
 '\x0c'
 '\n'
 'Pellentesque maximus a ligula ut vehicula.\n')
>>> import jinja2
>>> jinja2.__version__
'2.9.6'
>>> template = jinja2.Template(template_content)
>>> pprint.pprint(template.render())
('Lorem ipsum, dolor sit amet, consectetur adipiscing elit.\n'
 '\n'
 '\n'
 'Pellentesque maximus a ligula ut vehicula.')

Environment

  • Python version: Python 3.5.4
  • Jinja version: 2.9.6

@bignose-debian
Author

This problem still occurs with Jinja 2.10:

>>> pprint.pprint(template_content)
('Lorem ipsum, dolor sit amet, consectetur adipiscing elit.\n'
 '\x0c'
 '\n'
 'Pellentesque maximus a ligula ut vehicula.\n')
>>> import jinja2
>>> jinja2.__version__
'2.10'
>>> template = jinja2.Template(template_content)
>>> pprint.pprint(template.render())
('Lorem ipsum, dolor sit amet, consectetur adipiscing elit.\n'
 '\n'
 '\n'
 'Pellentesque maximus a ligula ut vehicula.')

@davidism
Member

If this affects you and you can track it down, it would be helpful to see a patch or at least where the behavior is happening. I can't really say if it's intended or not without that, and I'd have to dig through the code myself to see.

@bignose-debian
Author

> If this affects you and you can track it down […]

My attempts to track it down in a debugger session lead to code I can't inspect. The ‘Template.root_render_func’ turns out to not exist in the Jinja source code; maybe it's compiled from text at run time? I don't know how to track it further than that.

@davidism
Member

Yes, you need to look at the parser/compiler, not the template code. You can ask Jinja to render the Python module instead of the final output and see what's going on.
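
One way to follow this suggestion, as a minimal sketch: `Environment.compile()` accepts `raw=True`, which returns the generated Python source as a string instead of a code object. The template text mirrors the one in the report; the variable names are illustrative only. The result depends on the installed Jinja version, so no output is shown here.

```python
import jinja2

# Same shape as the template in the report: a form feed on its own line.
template_content = (
    "Lorem ipsum, dolor sit amet, consectetur adipiscing elit.\n"
    "\N{FORM FEED}\n"
    "Pellentesque maximus a ligula ut vehicula.\n"
)

env = jinja2.Environment()

# raw=True asks compile() for the generated Python module source (a str).
# This is where functions such as root() are defined at run time.
generated_source = env.compile(template_content, raw=True)
print(generated_source)

# Render the same template and check whether the form feed survived.
rendered = env.from_string(template_content).render()
print("form feed preserved:", "\N{FORM FEED}" in rendered)
```

If the literal text inside the generated module has already lost the form feed, the change happened before code generation, which points at the lexer or parser rather than at the render step.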

mvolfik added a commit to mvolfik/jinja that referenced this issue Mar 9, 2021
Python's str.splitlines() splits on more characters than the usual newline sequences[1]. That causes problems when those special characters need to be preserved in processed templates, as in these bug reports: fixes pallets#769, pallets#952, pallets#1313

[1] https://docs.python.org/3/library/stdtypes.html#str.splitlines
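
To make the point in the commit message concrete, here is a small standard-library sketch that finds every single code point `str.splitlines()` treats as a line boundary. The helper name `splitlines_boundaries` is made up for this illustration and is not part of Jinja or Python.

```python
import sys


def splitlines_boundaries(limit=sys.maxunicode + 1):
    """Return the single characters that str.splitlines() splits on."""
    found = []
    for code_point in range(limit):
        ch = chr(code_point)
        # If ch is a line boundary, "a" + ch + "b" splits into two pieces.
        if len(("a" + ch + "b").splitlines()) == 2:
            found.append(ch)
    return found


boundaries = splitlines_boundaries()
print([f"U+{ord(ch):04X}" for ch in boundaries])
print("FORM FEED is a boundary:", "\N{FORM FEED}" in boundaries)
```

U+000C FORM FEED is among the characters found, which is consistent with the form feed in the reported template being treated as a line break and then written back as U+000A.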
mvolfik added a commit to mvolfik/jinja that referenced this issue Mar 9, 2021
Python's str.splitlines() splits on more characters than the usual newline
sequences[1]. That causes problems when those special characters need to be
preserved in processed templates, as in these bug reports: pallets#769,
pallets#952, pallets#1313.

The keep_trailing_newline logic is reworked: splitlines() already drops the
trailing newline (so it had to be added back when requested), while re.split
preserves it (so it has to be removed when not requested).

[1] https://docs.python.org/3/library/stdtypes.html#str.splitlines
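
The difference between the two approaches described in this commit message can be sketched with the standard library alone. This is an illustration under assumptions, not the actual patch: the function names and the `NEWLINE_RE` pattern are hypothetical, and only the `keep_trailing_newline` idea is taken from the message.

```python
import re

# Assumption for this sketch: only CRLF, CR and LF count as newlines.
NEWLINE_RE = re.compile(r"\r\n|\r|\n")


def normalize_with_splitlines(source, keep_trailing_newline=False):
    """Old-style normalization: splitlines() also splits on form feed etc."""
    lines = source.splitlines()
    # splitlines() drops a trailing newline, so add it back when requested.
    if keep_trailing_newline and source.endswith(("\r\n", "\r", "\n")):
        lines.append("")
    return "\n".join(lines)


def normalize_with_regex(source, keep_trailing_newline=False):
    """Regex-based normalization: other control characters are left alone."""
    lines = NEWLINE_RE.split(source)
    # re.split() keeps a trailing empty piece, so drop it unless requested.
    if not keep_trailing_newline and lines[-1] == "":
        del lines[-1]
    return "\n".join(lines)


source = "Lorem ipsum\n\N{FORM FEED}\nPellentesque\n"
print(repr(normalize_with_splitlines(source)))
print(repr(normalize_with_regex(source)))
```

With the `splitlines()` variant the form feed is turned into an extra `\n`, as in the report; with the regex variant it passes through unchanged, and the trailing-newline handling is inverted as the message describes.
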
mvolfik mentioned this issue Mar 9, 2021
davidism added this to the 3.0.0 milestone Mar 9, 2021
davidism pushed a commit to mvolfik/jinja that referenced this issue Apr 5, 2021
github-actions bot locked as resolved and limited conversation to collaborators Apr 20, 2021