Meta extension: Empty line in YAML is considered as `End of Document` #390

Closed
alixedi opened this issue Feb 25, 2015 · 14 comments

Comments

@alixedi

commented Feb 25, 2015

For:

PyYAML==3.11

This fails:

doc = """---
First name: John
Last name: Doe

email: john@doe.com

---

Hello World!
"""

import markdown
md = markdown.Markdown(extensions=[markdown.extensions.meta.MetaExtension(yaml=True)])
assert md.convert(doc) == '<p>Hello World!</p>'

I understand that YAML can have blank lines.

@alixedi

Author

commented Feb 25, 2015

Looking at meta.py, line 80, it appears this is deliberate?

if line.strip() == '' or have_yaml and YAML_END_RE.match(line):
    break # blank line or end of YAML header - done
@waylan

Member

commented Feb 25, 2015

I am very much inclined to reject this. I was hesitant to add support for YAML to begin with. Previously, we didn't even support the YAML style delimiters. At that time, a blank line was the only way to determine the end of the meta data and the start of the document. As you note, the implementation is still very much dependent on that blank line. And I intend to continue to support meta-data without YAML style delimiters for the foreseeable future. However, as stated in the documentation, this feature (meta-data) was inspired by MultiMarkdown, which also includes support for YAML style delimiters (but not YAML data). Therefore, in the last release, we added support for the delimiters. However, it is the blank line that really ends the data section (the delimiters are just ignored as long as they are on the first or last line). Yes, this means that a blank line gets precedence over the closing delimiter in determining the end of the data section. This was an intentional design decision which gives us full conformity with MultiMarkdown's implementation.
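To make the precedence described above concrete, here is a minimal sketch of a header splitter in which a blank line ends the data section before any closing delimiter is seen. It is not the code from meta.py: the function name `split_header` and both regular expressions are assumptions made for illustration.

```python
import re

# Hypothetical delimiter patterns; the names echo the thread but the
# expressions are assumptions, not copied from meta.py.
YAML_BEGIN_RE = re.compile(r'^-{3}\s*$')
YAML_END_RE = re.compile(r'^(-{3}|\.{3})\s*$')


def split_header(text):
    """Return (header_lines, remaining_lines); a blank line always ends the header."""
    lines = text.split('\n')
    have_yaml = YAML_BEGIN_RE.match(lines[0]) is not None
    if have_yaml:
        lines.pop(0)  # the opening delimiter carries no data
    header = []
    while lines:
        line = lines.pop(0)
        if line.strip() == '' or (have_yaml and YAML_END_RE.match(line)):
            break  # blank line or closing delimiter: the header is done
        header.append(line)
    return header, lines


doc = ("---\nFirst name: John\nLast name: Doe\n\n"
       "email: john@doe.com\n\n---\n\nHello World!\n")
header, rest = split_header(doc)
```

With the document from the report, `header` stops at `Last name: Doe`, and both the `email` line and the closing `---` end up in `rest`, which is then treated as ordinary document text.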

In the process of adding the above mentioned support, someone requested that we also support YAML data. I was hesitant, but saw the value in users being able to have more control over the structure of their data, so I included it as an optional feature. That said, I am more inclined to remove it than to expand it. It seems to me that if you want full-featured YAML support, a third party implementation would be better suited to your needs.

@facelessuser

Collaborator

commented Feb 25, 2015

I kinda feel like YAML support could be its own plugin...maybe even should. When people see YAML support, they are going to think they can use the traditional YAML frontmatter (which is what they really want). But the truth is you can't, which I think is fine.

I kind of thought YAML was already implemented as a 3rd party extension listed in the wiki https://github.com/teoric/python-markdown-yaml-meta-data, so I was kind of surprised to see it added officially. I haven't used the 3rd party or the official one because I strip out YAML via a wrapper around Markdown to use the data as actual HTML meta data, or use special keywords to set settings or dynamically control which extensions get loaded for a specific file.

@waylan

Member

commented Feb 26, 2015

@facelessuser in retrospect I agree. I suspect I will be removing the support for YAML data. I would have never implemented it myself, but when the pull-request came in with it already implemented, I accepted it without thinking it through fully. At the time, it didn't occur to me that YAML permits blank lines and that this would not offer full support. Taking a step back, it is now clear to me that the existing meta-data extension and YAML support are two very different things which should not exist together in the same extension.

> I haven't used the 3rd party or the official one because I strip out YAML via a wrapper around Markdown to use the data as actual HTML meta data, or use special keywords to set settings or dynamically control which extensions get loaded for a specific file.

Personally, this is how I think it should be done. I'm always surprised how many people want to use the extension and continue to insist when I suggest they use a wrapper instead. I suppose it is because no one wants to reinvent the wheel. However, if someone wrote a good implementation which was available as a third party package (perhaps via PyPI), I might be inclined to deprecate the extension altogether.

As an aside, I wrote the original meta-data extension many, many years ago as an experiment. At the time I was a relatively new user of Python-Markdown only. However, when I announced the meta-data extension (which I believe was the third I had written) it caught the attention of the maintainer of Python-Markdown (Yuri at the time) and he asked me to join the project. At the time, no extensions were being shipped with Python-Markdown and when we decided to start doing that, a number of my extensions were all added in at once. I was never particularly happy that meta-data was included because once I built it and started using it, I realized that a wrapper would make for a much more useful tool. I just never developed the need or found the time to write it myself.

@waylan

Member

commented Feb 26, 2015

Perhaps I should explain why I think this would be better as a wrapper than an extension. Consider two projects which make heavy use of the existing Meta-Data extension: mkdocs and Pelican. Both of those projects have a list of default extensions that their users get out-of-the-box without changing any settings. In each instance, Meta-Data is one of those extensions (and it is required for things to work in each of those tools). However, each project offers a setting where the user can define their own list of extensions. Naturally, that setting can be used to remove/replace the default list. What happens when the user removes the meta-data extension? The list needs to be checked to ensure that the extension is included and, if not, it must be inserted before passing the list on to Markdown. Otherwise things will break.

Wouldn't it make more sense for those projects to just run the input file through some code which extracts the meta-data first, then passes the rest of the document on to Markdown without needing to juggle the extension list? And remember, the list of extensions can contain class instances or string names. You can't just do a simple `if not 'meta' in extensions`. You need to loop through the list and check the type of each item. It is true that mkdocs is simpler as their settings file is YAML (however, PyYAML does allow Python objects to be specified in YAML files, so I wouldn't rule the possibility out), but Pelican uses a Python file for settings. You can totally import an extension and create an instance to add to the list of extensions (which is great for letting the user configure each extension however s/he desires). The point is, I would much rather see these projects doing something like this:

data, md = get_data(source)
html = markdown.markdown(md, **get_md_kwargs(config, data))
Template(body=html, context=get_context(config, data))

While that is mostly pseudo code which leaves the implementation details out, I think the general concept is a much better way of working with meta data. As an alternate example, the above functions could all be methods on a class. The important point is that the meta-data is extracted before passing the document on to Markdown.
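One way the "extract first, render second" idea could look in runnable form is sketched below. Everything here is an assumption for illustration: `get_data` is a hypothetical stand-in for the function named in the pseudo code, the `META_RE` pattern only handles simple `Key: value` lines, and `render` is any callable that turns text into HTML.

```python
import re

# "Key: value" lines in the MultiMarkdown style; this pattern is an assumption.
META_RE = re.compile(r'^(?P<key>[A-Za-z0-9_ -]+):\s*(?P<value>.*)$')


def get_data(source):
    """Split leading "Key: value" lines from the document body."""
    data = {}
    lines = source.split('\n')
    while lines:
        m = META_RE.match(lines[0])
        if not m:
            break
        data[m.group('key').strip().lower()] = m.group('value').strip()
        lines.pop(0)
    # Drop the single blank separator line, if present.
    if lines and lines[0].strip() == '':
        lines.pop(0)
    return data, '\n'.join(lines)


def build_page(source, render):
    """Extract the meta-data first, then hand only the body to the renderer."""
    data, body = get_data(source)
    return {'meta': data, 'html': render(body)}


page = build_page('Title: Demo\nAuthor: Jane\n\nHello *World*!',
                  render=lambda text: '<p>%s</p>' % text)
```

In a real project `render` would presumably be `markdown.markdown` (or a configured `Markdown` instance's `convert` method), and the extension list passed to it would no longer need to contain a meta-data extension at all.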

@facelessuser

Collaborator

commented Feb 26, 2015

With stuff like PyYAML, getting the frontmatter and parsing it is trivial. Something like this can be added to a wrapper real easy. Heck, you could use whatever content you wanted: JSON, YAML, etc.

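import re
import yaml


def get_frontmatter(string):
    """Split YAML frontmatter from string; return (frontmatter, remainder)."""
    frontmatter = {}

    if string.startswith("---"):
        # Group 2 captures the YAML between the opening and closing '---'.
        m = re.search(r'^(---(.*?)---[ \t]*\r?\n)', string, re.DOTALL)
        if m:
            try:
                # safe_load avoids constructing arbitrary Python objects.
                frontmatter = yaml.safe_load(m.group(2)) or {}
            except yaml.YAMLError:
                pass  # malformed YAML: fall back to empty frontmatter
            string = string[m.end(1):]

    return frontmatter, string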

I struggled to find a reason that required Python Markdown to parse my frontmatter. When I realized there wasn't really any reason to have Python Markdown do it for me (it did absolutely nothing with it once it stripped it out), I decided to just parse it out myself and treat it as exactly what it was: extra data that Python Markdown doesn't care about. It worked out better this way because, quite often, I wanted to do something with the info before I handed the actual text to Python Markdown.

@waylan

Member

commented Feb 26, 2015

Right. It occurs to me that when I first created the meta-data extension, JSON and YAML were, at most, in their very early stages (definitely not popular or standard library material), so the benefit of having a parser for a key/value data format which was easy to write and required no additional dependencies was very attractive to users (the only standard library option was INI files). Today, given the great strides that have been made in Python package management, third party dependencies are mostly a non-issue and the more fully developed data formats should be more desirable (PyYAML's C dependency is optional, so that's not a blocker either). In fact, the maintainers of those projects are better equipped to maintain that sort of thing than a Markdown project would be. That being the case, I think maybe meta-data will be deprecated in the near future. We should leave data parsing to data parsers.

@waylan

Member

commented Feb 26, 2015

To be clear, if I deprecate meta-data, the extension would be broken out into its own package which would get no further development by me. If others wanted to use it or fork it, they certainly could, although, as stated above, there are better options.

@waylan

Member

commented Feb 26, 2015

@tomchristie or @d0ugal (from mkdocs) and @getpelican (from Pelican), do you guys have any input on the future of Python-Markdown's Meta-Data extension? I'd like to hear from you (presumably Python-Markdown's biggest users now that Django removed the markup contrib app) before doing anything drastic like deprecating an extension that both projects rely heavily on. See the earlier comments in this discussion for the details, but the current idea is to remove the broken YAML support immediately (in 2.6.1 as a bug-fix), as it has not even been a week, and then start the deprecation process for the entire extension with the next major release, which would be months out at the earliest.

Oh and look for a plan for a 3.0 published sometime in the near future. I'm ruminating on it now and will start to put something together soon. I don't expect much more in the 2.x series (3.0 might be next or if not, after 2.7).

@kernc

Contributor

commented Feb 26, 2015

Could something like

if not have_yaml and line.strip() == '' or have_yaml and YAML_END_RE.match(line):

be used for ~~removing~~ fixing broken YAML support?
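Because `and` binds tighter than `or`, the condition above groups as `(not have_yaml and blank) or (have_yaml and matches_end)`. A small predicate makes that grouping explicit; the name `ends_header` and the `YAML_END_RE` pattern below are assumptions for illustration, not code from meta.py.

```python
import re

# Assumed closing-delimiter pattern ('---' or '...').
YAML_END_RE = re.compile(r'^(-{3}|\.{3})\s*$')


def ends_header(line, have_yaml):
    """The suggested condition with its grouping written out."""
    if have_yaml:
        # Inside a YAML block only the closing delimiter ends the header,
        # so blank lines are left for the YAML parser to handle.
        return YAML_END_RE.match(line) is not None
    # Without delimiters a blank line still ends the header.
    return line.strip() == ''
```

Under this rule a blank line between `Last name` and `email` would no longer cut a delimited block short, while documents without delimiters keep the blank-line behavior.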

@waylan

Member

commented Feb 27, 2015

@kernc that could work, but I'm not sure we should. One thing is certain, I'm not interested in maintaining YAML support long-term. So every time I get a bug report from here on out, I'll wish I had ripped it out now. I'm not sure why people are so insistent that the built-in extensions must support their desired feature. Why can't this be in a third party extension? Especially now that the built-in extensions get no preferential treatment.

@tomchristie


commented Feb 27, 2015

@waylan I've no problem with it being moved into a third party package - certainly sounds like the right approach for your project, and it's probably not much work for us - if you wanted to handle opening a ticket on MkDocs pointing us at the right place etc for the new package, then I can't see any objection.

@d0ugal

Contributor

commented Mar 3, 2015

+1, I agree with @tomchristie.

@waylan waylan closed this in 4f9d4ff Mar 9, 2015

@waylan

Member

commented Mar 9, 2015

For anyone who is interested, I threw together a package that parses meta-data as I would want it to work. It is a standalone package (not an extension) and it can work with any lightweight markup language (not just Markdown). I haven't written the YAML implementation yet, but the MultiMarkdown style meta-data is feature complete. You can even define types for specific keys. Any feedback is welcome (please send feedback through that project not here). Thanks.

https://github.com/waylan/docdata

kernc added a commit to kernc/pelican that referenced this issue May 27, 2015
Add support for YAML metadata in Markdown files
~~Python-Markdown>=2.6 and its meta extension supports YAML headers
and optional `yaml` switch which, when used, parses data with
PyYAML and hence a wee bit different metadata object gets provided.~~
Not anymore.

YAML is supported by python-markdown-yaml-meta-data extension
which uses PyYAML, which returns parsed lists of strings instead
of raw strings, datetime objects instead of string date
representations etc. Pelican needed only slight adjusting, and
now supports Jekyll-like YAML headers with aforementioned Markdown
extension.

Related:
* https://github.com/teoric/python-markdown-yaml-meta-data
* getpelican/pelican-plugins#382
* Python-Markdown/markdown#390 (comment)
jsonn pushed a commit to jsonn/pkgsrc that referenced this issue Jul 19, 2015
wiz
Update to 2.6.2:
Python-Markdown 2.3 Release Notes
=================================

We are pleased to release Python-Markdown 2.3 which adds one new extension,
removes a few old (obsolete) extensions, and now runs on both Python 2 and
Python 3 without running the 2to3 conversion tool. See the list of changes
below for details.

Python-Markdown supports Python versions 2.6, 2.7, 3.1, 3.2, and 3.3.

Backwards-incompatible Changes
------------------------------

* Support has been dropped for Python 2.5. No guarantees are made that the
library will work in any version of Python lower than 2.6. As all supported
Python versions include the ElementTree library, Python-Markdown will no
longer try to import a third-party installation of ElementTree.

* All classes are now "new-style" classes. In other words, all classes
subclass from 'object'. While this is not likely to affect most users,
extension authors may need to make a few minor adjustments to their code.

* "safe_mode" has been further restricted. Markdown formatted links must be
of a known white-listed scheme when in "safe_mode" or the URL is discarded.
The white-listed schemes are: 'HTTP', 'HTTPS', 'FTP', 'FTPS', 'MAILTO', and
'news'. Schemeless URLs are also permitted, but are checked in other ways -
as they have been for some time.

* The ids assigned to footnotes now contain a dash (`-`) rather than a colon
(`:`) when `output_format` is set to `"html5"` or `"xhtml5"`. If you are making
reference to those ids in your JavaScript or CSS and using the HTML5 output,
you will need to update your code accordingly. No changes are necessary if
you are outputting XHTML (the default) or HTML4.

* The `force_linenos` configuration setting of the CodeHilite extension has been
marked as **Pending Deprecation** and a new setting `linenums` has been added to
replace it. See documentation for the [CodeHilite Extension] for an explanation
of the new `linenums` setting. The new setting will honor the old
`force_linenos` if it is set, but it will raise a `PendingDeprecationWarning`
and will likely be removed in a future version of Python-Markdown.

[CodeHilite Extension]: extensions/codehilite.html

* The "RSS" extension has been removed and no longer ships with Python-Markdown.
If you would like to continue using the extension (not recommended), it is
archived on [GitHub](https://gist.github.com/waylan/4773365).

* The "HTML Tidy" Extension has been removed and no longer ships with Python-Markdown.
If you would like to continue using the extension (not recommended), it is
archived on [GitHub](https://gist.github.com/waylan/5152650). Note that the
underlying library, uTidylib, is not Python 3 compatible. Instead, it is
recommended that the newer [PyTidyLib] (version 0.2.2+ for Python 3
compatibility - install from GitHub not PyPI) be used. As the API for that
library is rather simple, it is recommended that the output of Markdown be
wrapped in a call to PyTidyLib rather than using an extension (for example:
`tidylib.tidy_fragment(markdown.markdown(source), options={...})`).

[PyTidyLib]: http://countergram.com/open-source/pytidylib

What's New in Python-Markdown 2.3
---------------------------------

* The entire code base now universally runs in Python 2 and Python 3 without
any need for running the 2to3 conversion tool. This not only simplifies testing,
but by using `unicode_literals`, results in more consistent behavior across
Python versions. Additionally, the relative imports (made possible in Python 2
via `absolute_import`) allow the entire library to more easily be embedded in a
sub-directory of another project. The various files within the library will
still import each other properly even though 'markdown' may not be in Python's
root namespace.

* The [Admonition Extension] has been added, which implements [rST-style][rST]
admonitions in the Markdown syntax. However, be warned that this extension
is experimental and the syntax and behavior are still subject to change. Please
try it out and report bugs and/or improvements.

[Admonition Extension]: extensions/admonition.html
[rST]: http://docutils.sourceforge.net/docs/ref/rst/directives.html#specific-admonitions

* Various bug fixes have been made.  See the
[commit log](https://github.com/waylan/Python-Markdown/commits/master)
for a complete history of the changes.

Python-Markdown 2.4 Release Notes
=================================

We are pleased to release Python-Markdown 2.4 which adds one new extension
and fixes various bugs. See the list of changes below for details.

Python-Markdown supports Python versions 2.6, 2.7, 3.1, 3.2, and 3.3.

Backwards-incompatible Changes
------------------------------

* The `force_linenos` configuration setting of the CodeHilite extension has been
marked as **Deprecated**. It had previously been marked as "Pending Deprecation"
in version 2.3 when a new setting `linenums` was added to replace it. See
documentation for the [CodeHilite Extension] for an explanation of the new
`linenums` setting. The new setting will honor the old `force_linenos` if it
is set, but `force_linenos` will raise a `DeprecationWarning` and will likely
be removed in a future version of Python-Markdown.

[CodeHilite Extension]: extensions/code_hilite.html

* URLs are no longer percent-encoded. This improves compatibility with the
original (written in Perl) Markdown implementation. Please percent-encode
your URLs manually when needed.
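As a rough illustration of encoding a URL by hand before it goes into a Markdown link, the sketch below uses the standard library's `urllib.parse.quote` (Python 3). The helper name `markdown_link` and the choice of characters left unescaped are assumptions, not part of Python-Markdown.

```python
from urllib.parse import quote


def markdown_link(text, url):
    """Build an inline Markdown link with a percent-encoded destination."""
    # Characters with structural meaning in a URL are left alone; '%' is
    # kept so that already-encoded sequences are not encoded twice.
    return '[%s](%s)' % (text, quote(url, safe="/:?#&=%"))


link = markdown_link('report', 'http://example.com/my report (final).html')
# link == '[report](http://example.com/my%20report%20%28final%29.html)'
```

Encoding the parentheses also avoids them being mistaken for the end of the link destination.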

What's New in Python-Markdown 2.4
---------------------------------

* Thanks to the hard work of [Dmitry Shachnev] the [Smarty Extension] has been
added, which implements [SmartyPants] using Python-Markdown's Extension API.
This offers a few benefits over a third party script. The HTML does not need
to be "tokenized" twice, no hacks are required to combine SmartyPants and
code highlighting, and we get markdown's escaping feature for free. Please try
it out and report bugs and/or improvements.

[Dmitry Shachnev]: https://github.com/mitya57
[Smarty Extension]: extensions/smarty.html
[SmartyPants]: http://daringfireball.net/projects/smartypants/

* The [Table of Contents Extension] now supports a new `permalink` option
for creating [Sphinx]-style anchor links.

[Table of Contents Extension]: extensions/toc.html
[Sphinx]: http://sphinx-doc.org/

* It is now possible to enable Markdown formatting inside HTML blocks by
appending `markdown=1` to opening tag attributes. See [Markdown Inside HTML
Blocks] section for details. Thanks to [ryneeverett] for implementing this
feature.

[Markdown Inside HTML Blocks]: extensions/extra.html#nested-markdown-inside-html-blocks
[ryneeverett]: https://github.com/ryneeverett

* The code blocks now support emphasizing some of the code lines. To use this
feature, specify `hl_lines` option after language name, for example (using
the [Fenced Code Extension]):

        ```.python hl_lines="1 3"
        # This line will be emphasized.
        # This one won't.
        # This one will also be emphasized.
        ```

    Thanks to [A. Jesse Jiryu Davis] for implementing this feature.

[Fenced Code Extension]: extensions/fenced_code_blocks.html
[A. Jesse Jiryu Davis]: https://github.com/ajdavis

* Various bug fixes have been made.  See the
[commit log](https://github.com/waylan/Python-Markdown/commits/master)
for a complete history of the changes.

Python-Markdown 2.5 Release Notes
=================================

We are pleased to release Python-Markdown 2.5 which adds a few new features
and fixes various bugs. See the list of changes below for details.

Python-Markdown version 2.5 supports Python versions 2.7, 3.2, 3.3, and 3.4.

Backwards-incompatible Changes
------------------------------

* Python-Markdown no longer supports Python version 2.6. You must be using Python
  versions 2.7, 3.2, 3.3, or 3.4.

[importlib]: https://pypi.python.org/pypi/importlib

* The `force_linenos` configuration key on the [CodeHilite Extension] has been **deprecated**
  and will raise a `KeyError` if provided. In the previous release (2.4), it was
  issuing a `DeprecationWarning`. The [`linenums`][linenums] keyword should be used
  instead, which provides more control of the output.

[CodeHilite Extension]: extensions/code_hilite.html
[linenums]: extensions/code_hilite.html#usage

* Both `safe_mode` and the associated `html_replacement_text` keywords will be deprecated
  in version 2.6 and will raise a **`PendingDeprecationWarning`** in 2.5. The so-called
  "safe mode" was never actually "safe" which has resulted in many people having a false
  sense of security when using it. As an alternative, the developers of Python-Markdown
  recommend that any untrusted content be passed through an HTML sanitizer (like [Bleach])
  after being converted to HTML by markdown.

    If your code previously looked like this:

        html = markdown.markdown(text, safe_mode=True)

    Then it is recommended that you change your code to read something like this:

        import bleach
        html = bleach.clean(markdown.markdown(text))

	If you are not interested in sanitizing untrusted text, but simply desire to escape
	raw HTML, then that can be accomplished through an extension which removes HTML parsing:

		from markdown.extensions import Extension

		class EscapeHtml(Extension):
			def extendMarkdown(self, md, md_globals):
				del md.preprocessors['html_block']
				del md.inlinePatterns['html']

		html = markdown.markdown(text, extensions=[EscapeHtml()])

	As the HTML would not be parsed with the above Extension, then the serializer will
	escape the raw HTML, which is exactly what happens now when `safe_mode="escape"`.

[Bleach]: http://bleach.readthedocs.org/

* Positional arguments on the `markdown.Markdown()` class are pending deprecation as are
  all except the `text` argument on the `markdown.markdown()` wrapper function.
  Only keyword arguments should be used. For example, if your code previously
  looked like this:

         html = markdown.markdown(text, ['extra'])

	Then it is recommended that you change it to read something like this:

	    html = markdown.markdown(text, extensions=['extra'])

	!!! Note
	    This change is being made as a result of deprecating `"safe_mode"` as the
		`safe_mode` argument was one of the positional arguments. When that argument
		is removed, the two arguments following it will no longer be at the correct
		position. It is recommended that you always use keywords when they are supported
		for this reason.

* In previous versions of Python-Markdown, the built-in extensions received
  special status and did not require the full path to be provided. Additionally,
  third party extensions whose name started with `"mdx_"` received the same
  special treatment. This behavior will be deprecated in version 2.6 and will
  raise a **`PendingDeprecationWarning`** in 2.5. Ensure that you always use the full
  path to your extensions. For example, if you previously did the following:

        markdown.markdown(text, extensions=['extra'])

    You should change your code to the following:

	    markdown.markdown(text, extensions=['markdown.extensions.extra'])

    The same applies to the command line:

        $ python -m markdown -x markdown.extensions.extra input.txt

    See the [documentation](reference.html#extensions) for a full explanation
    of the current behavior.

* The previously documented method of appending the extension configuration as
  a string to the extension name will be deprecated in Python-Markdown
  version 2.6 and will raise a **`PendingDeprecationWarning`** in 2.5. The
  [`extension_configs`](reference.html#extension_configs) keyword should
  be used instead. See the [documentation](reference.html#extension-configs)
  for a full explanation of the current behavior.

What's New in Python-Markdown 2.5
---------------------------------

*   The [Smarty Extension] has had a number of additional configuration settings
    added, which allows one to define their own substitutions to better support
    languages other than English. Thanks to [Martin Altmayer] for implementing this
	feature.

[Smarty Extension]: extensions/smarty.html
[Martin Altmayer]:https://github.com/MartinAltmayer

*   Named Extensions (strings passed to the [`extensions`][ex] keyword of
    `markdown.Markdown`) can now point to any module and/or Class on your PYTHONPATH.
	While dot notation was previously supported, a module could not be at the root of
	your PYTHONPATH. The name had to contain at least one dot (requiring it to be a
	sub-module). This restriction no longer exists.

	Additionally, a Class may be specified in the name. The class must be at the end of
	the name (which uses dot notation from PYTHONPATH) and be separated by a colon from
	the module.

	Therefore, if you were to import the class like this:

		from path.to.module import SomeExtensionClass

	Then the named extension would comprise this string:

		"path.to.module:SomeExtensionClass"

	This allows multiple extensions to be implemented within the same module and still
	accessible when the user is not able to import the extension directly (perhaps from
	a template filter or the command line).

	This also means that extension modules are no longer required to include the
	`makeExtension`	function which returns an instance of the extension class. However,
	if the user does not specify the class name (she only provides `"path.to.module"`)
	the extension will fail to load without the `makeExtension` function included in
	the module. Extension authors will want to document carefully what is required to
	load their extensions.

[ex]: reference.html#extensions
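The `"path.to.module:SomeExtensionClass"` naming scheme described above can be sketched with `importlib`. This is only an illustration of the idea, not Python-Markdown's actual loader: `resolve_extension` is a hypothetical name, and `collections:OrderedDict` stands in for a real extension class so the example runs without any third-party package.

```python
import importlib


def resolve_extension(name):
    """Resolve 'pkg.module:ClassName', or fall back to module.makeExtension()."""
    module_name, _, class_name = name.partition(':')
    module = importlib.import_module(module_name)
    if class_name:
        # A class was named explicitly: instantiate it directly.
        return getattr(module, class_name)()
    # No class given: the module itself must provide makeExtension().
    return module.makeExtension()


# Stand-in for an extension class, resolved from a "module:Class" string.
ext = resolve_extension('collections:OrderedDict')
```

When only the module is named and it lacks `makeExtension`, the sketch raises `AttributeError`, which mirrors the failure mode the notes warn extension authors about.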

*   The Extension Configuration code has been refactored to make it a little easier
    for extension authors to work with configuration settings. As a result, the
    [`extension_configs`][ec] keyword now accepts a dictionary rather than requiring
    a list of tuples. A list of tuples is still supported so no one needs to change
    their existing code. This should also simplify the learning curve for new users.

    Extension authors are encouraged to review the new methods available on the
    `markdown.extensions.Extension` class for handling configuration and adjust their
    code going forward. The included extensions provide a model for best practices.
    See the [API] documentation for a full explanation.

[ec]: reference.html#extension_configs
[API]: extensions/api.html#configsettings

*   The [Command Line Interface][cli] now accepts a `--extensions_config` (or `-c`)
    option which accepts a file name and passes the parsed content of a [YAML] or
    [JSON] file to the [`extension_configs`][ec] keyword of the `markdown.Markdown`
    class. The contents of the YAML or JSON must map to a Python Dictionary which
    matches the format required by the `extension_configs` keyword. Note that
    [PyYAML] is required to parse YAML files.

[cli]: cli.html#using-extensions
[YAML]: http://yaml.org/
[JSON]: http://json.org/
[PyYAML]: http://pyyaml.org/
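A minimal sketch of what such a configuration could look like when written as JSON and parsed into the dictionary form: the option names `permalink` and `linenums` come from these release notes, while the helper `load_extension_configs` and the inline `config_text` are assumptions for illustration.

```python
import json

# Example JSON content, as it might appear in a file given to -c.
config_text = '''
{
    "markdown.extensions.toc": {"permalink": true},
    "markdown.extensions.codehilite": {"linenums": true}
}
'''


def load_extension_configs(text):
    """Parse JSON text into a dict mapping extension names to option dicts."""
    configs = json.loads(text)
    if not isinstance(configs, dict):
        raise ValueError('extension configs must map names to option dicts')
    return configs


extension_configs = load_extension_configs(config_text)
```

The resulting dictionary is in the shape the `extension_configs` keyword expects, so it could presumably be passed straight through, for example as `markdown.Markdown(extensions=list(extension_configs), extension_configs=extension_configs)`.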

*   The [admonition extension][ae] is no longer considered "experimental."

[ae]: extensions/admonition.html

*   There have been various refactors of the testing framework. While those changes
    will not directly affect end users, the code is being better tested which will
    benefit everyone.

*   Various bug fixes have been made.  See the
    [commit log](https://github.com/waylan/Python-Markdown/commits/master)
    for a complete history of the changes.

Python-Markdown 2.6 Release Notes
=================================

We are pleased to release Python-Markdown 2.6 which adds a few new features
and fixes various bugs. See the list of changes below for details.

Python-Markdown version 2.6 supports Python versions 2.7, 3.2, 3.3, and 3.4 as well as PyPy.

Backwards-incompatible Changes
------------------------------

### `safe_mode` Deprecated

Both `safe_mode` and the associated `html_replacement_text` keywords are deprecated
in version 2.6 and will raise a **`DeprecationWarning`**. The `safe_mode` and
`html_replacement_text` keywords will be ignored in version 2.7. The so-called
"safe mode" was never actually "safe" which has resulted in many people having a false
sense of security when using it. As an alternative, the developers of Python-Markdown
recommend that any untrusted content be passed through an HTML sanitizer (like [Bleach])
after being converted to HTML by markdown.

If your code previously looked like this:

    html = markdown.markdown(text, safe_mode=True)

Then it is recommended that you change your code to read something like this:

    import bleach
    html = bleach.clean(markdown.markdown(text))

If you are not interested in sanitizing untrusted text, but simply desire to escape
raw HTML, then that can be accomplished through an extension which removes HTML parsing:

    from markdown.extensions import Extension

    class EscapeHtml(Extension):
        def extendMarkdown(self, md, md_globals):
            del md.preprocessors['html_block']
            del md.inlinePatterns['html']

    html = markdown.markdown(text, extensions=[EscapeHtml()])

As the HTML would not be parsed with the above Extension, then the serializer will
escape the raw HTML, which is exactly what happens now when `safe_mode="escape"`.

[Bleach]: http://bleach.readthedocs.org/

### Positional Arguments Deprecated

Positional arguments on the `markdown.Markdown()` class are deprecated as are
all except the `text` argument on the `markdown.markdown()` wrapper function.
Using positional arguments will raise a **`DeprecationWarning`** in 2.6 and an error
in version 2.7. Only keyword arguments should be used. For example, if your code
previously looked like this:

    html = markdown.markdown(text, [SomeExtension()])

Then it is recommended that you change it to read something like this:

    html = markdown.markdown(text, extensions=[SomeExtension()])

!!! Note
    This change is being made as a result of deprecating `"safe_mode"` as the
    `safe_mode` argument was one of the positional arguments. When that argument
    is removed, the two arguments following it will no longer be at the correct
    position. It is recommended that you always use keywords when they are supported
    for this reason.

### "Shortened" Extension Names Deprecated

In previous versions of Python-Markdown, the built-in extensions received
special status and did not require the full path to be provided. Additionally,
third party extensions whose name started with `"mdx_"` received the same
special treatment. This behavior is deprecated and will raise a
**`DeprecationWarning`** in version 2.6 and an error in 2.7. Ensure that you
always use the full path to your extensions. For example, if you previously
did the following:

    markdown.markdown(text, extensions=['extra'])

You should change your code to the following:

    markdown.markdown(text, extensions=['markdown.extensions.extra'])

The same applies to the command line:

    $ python -m markdown -x markdown.extensions.extra input.txt

Similarly, if you have used a third party extension (for example `mdx_math`), previously
you might have called it like this:

    markdown.markdown(text, extensions=['math'])

As the `"mdx_"` prefix will no longer be prepended, you will need to change your code
as follows (assuming the file `mdx_math.py` is installed at the root of your PYTHONPATH):

    markdown.markdown(text, extensions=['mdx_math'])

Extension authors will want to update their documentation to reflect the new behavior.

See the [documentation](reference.html#extensions) for a full explanation
of the current behavior.
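
For projects that store extension names in configuration files, the renaming
described above could be automated. The sketch below is a hypothetical migration
helper, not part of the library. The `BUILTIN_EXTENSIONS` set is an assumption
based on the extensions documented for 2.6 and should be checked against your
installation.

```python
# Built-in extension modules in the markdown.extensions package
# (assumed list; adjust it for your installation).
BUILTIN_EXTENSIONS = {
    "abbr", "admonition", "attr_list", "codehilite", "def_list", "extra",
    "fenced_code", "footnotes", "headerid", "meta", "nl2br", "sane_lists",
    "smart_strong", "smarty", "tables", "toc", "wikilinks",
}


def expand_extension_name(name):
    """Return the full dotted path for a legacy short extension name."""
    if "." in name or name.startswith("mdx_"):
        return name  # already a full path
    if name in BUILTIN_EXTENSIONS:
        return "markdown.extensions." + name
    # Legacy behavior: third party modules were looked up as mdx_<name>.
    return "mdx_" + name


old = ["extra", "math", "markdown.extensions.toc"]
new = [expand_extension_name(name) for name in old]
```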

### Extension Configuration as Part of Extension Name Deprecated

The previously documented method of appending the extension configuration options as
a string to the extension name is deprecated and will raise a
**`DeprecationWarning`** in version 2.6 and an error in 2.7.
The [`extension_configs`](reference.html#extension_configs) keyword should
be used instead. See the [documentation](reference.html#extension-configs)
for a full explanation of the current behavior.
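
As an illustration, the sketch below converts legacy names into values suitable
for the `extensions` and `extension_configs` keywords. It assumes the legacy form
looked like `name(key=value,key2=value2)`. Both helper functions are hypothetical
and not part of the library, and all option values are kept as strings.

```python
def split_legacy_extension(spec):
    """Split 'name(key=value,key2=value2)' into (name, config dict)."""
    name, paren, rest = spec.partition("(")
    if not paren:
        return spec, {}
    config = {}
    for pair in rest.rstrip(")").split(","):
        if not pair.strip():
            continue  # tolerate empty segments such as 'name()'
        key, _, value = pair.partition("=")
        config[key.strip()] = value.strip()
    return name.strip(), config


def to_keywords(specs):
    """Build `extensions` and `extension_configs` values from legacy specs."""
    extensions, extension_configs = [], {}
    for spec in specs:
        name, config = split_legacy_extension(spec)
        extensions.append(name)
        if config:
            extension_configs[name] = config
    return extensions, extension_configs


extensions, extension_configs = to_keywords(
    ["markdown.extensions.toc(title=Contents)", "markdown.extensions.extra"]
)
```

The two results could then be passed on as
`markdown.markdown(text, extensions=extensions, extension_configs=extension_configs)`.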

### HeaderId Extension Pending Deprecation

The [HeaderId][hid] Extension is pending deprecation and will raise a
**`PendingDeprecationWarning`** in version 2.6. The extension will be
deprecated in version 2.7 and raise an error in version 2.8. Use the
[Table of Contents][TOC] Extension instead, which offers most of the
features of the HeaderId Extension and more (support for meta data is missing).

Extension authors who have been using the `slugify` and `unique` functions
defined in the HeaderId Extension should note that those functions are now
defined in the Table of Contents extension and should adjust their import
statements accordingly (`from markdown.extensions.toc import slugify, unique`).

[hid]: extensions/header_id.html

### The `configs` Keyword is Deprecated

Positional arguments and the `configs` keyword on the `markdown.extensions.Extension` class
(and its subclasses) are deprecated. Each individual configuration option should be passed
to the class as a keyword/value pair. For example, one might have previously instantiated
an extension subclass like this:

    ext = SomeExtension(configs={'somekey': 'somevalue'})

That code should be updated to pass in the options directly:

    ext = SomeExtension(somekey='somevalue')

Extension authors will want to note that this affects the `makeExtension` function as well.
Previously it was common for the function to be defined as follows:

    def makeExtension(configs=None):
        return SomeExtension(configs=configs)

Extension authors will want to update their code to the following instead:

    def makeExtension(**kwargs):
        return SomeExtension(**kwargs)

Failing to do so will result in a **`DeprecationWarning`** and will raise an error in the next
release. See the [Extension API][mext] documentation for more information.

In the event that a `markdown.extensions.Extension` subclass overrides the `__init__` method
and implements its own configuration handling, then the above may not apply. However, it is
recommended that the subclass still calls the parent `__init__` method to handle configuration
options like so:

    import markdown

    class SomeExtension(markdown.extensions.Extension):
        def __init__(self, **kwargs):
            # Do pre-config stuff here
            # Set config defaults
            self.config = {
                'option1' : ['value1', 'description1'],
                'option2' : ['value2', 'description2']
            }
            # Set user defined configs
            super(SomeExtension, self).__init__(**kwargs)
            # Do post-config stuff here

Note the call to `super` to get the benefits of configuration handling from the parent class.
See the [documentation][config] for more information.
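
The order of operations matters here. The sketch below uses `BaseExtension`, a
hypothetical stand-in for the library's base class, on the assumption that the
parent `__init__` stores each keyword over the matching default in `self.config`.
It is meant only to show why the defaults are set before calling `super`.

```python
class BaseExtension(object):
    """Hypothetical stand-in for the library's Extension base class."""

    def __init__(self, **kwargs):
        # Overwrite the default value of every option the user passed in.
        for key, value in kwargs.items():
            self.config[key][0] = value  # KeyError if the option is unknown

    def getConfig(self, key):
        return self.config[key][0]


class SomeExtension(BaseExtension):
    def __init__(self, **kwargs):
        # Defaults must exist before the parent __init__ runs,
        # because the parent looks each keyword up in self.config.
        self.config = {
            'option1': ['value1', 'description1'],
            'option2': ['value2', 'description2'],
        }
        super(SomeExtension, self).__init__(**kwargs)


ext = SomeExtension(option2='custom')
```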

[config]: extensions/api.html#configsettings
[mext]: extensions/api.html#makeextension

What's New in Python-Markdown 2.6
---------------------------------

### Official Support for PyPy

Official support for [PyPy] has been added. While Python-Markdown has most likely
worked on PyPy for some time, it is now officially supported and tested on PyPy.

[PyPy]: http://pypy.org/

### YAML Style Meta-Data

The [Meta-Data] Extension now includes optional support for [YAML] style
meta-data. By default, the YAML delimiters are recognized; however, the
actual data is parsed as before. This follows the syntax of
[MultiMarkdown], which inspired this extension.

<del>Alternatively, if the `yaml` option is set, then the data is parsed as YAML.</del>
<ins>As the `yaml` option was buggy, it was removed in 2.6.1. It is suggested that a third
party extension be used if you want true YAML support. See [Issue #390][#390] for a full
explanation.</ins>
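
The default behavior (delimiters recognized, data parsed as before) can be
sketched roughly as follows. This is a simplified stand-in rather than the
extension's code: the regular expressions and the `split_meta` name are
assumptions, and multi-line values are not handled. It does show the rule
discussed in this issue, namely that a blank line ends the header even when a
closing delimiter follows later.

```python
import re

BEGIN_RE = re.compile(r"^-{3}(\s.*)?$")        # opening '---'
END_RE = re.compile(r"^(-{3}|\.{3})(\s.*)?$")  # closing '---' or '...'
META_RE = re.compile(r"^\s{0,3}(?P<key>[A-Za-z0-9_-]+):\s*(?P<value>.*)$")


def split_meta(text):
    """Return (meta dict, remaining lines) for a document with a header."""
    lines = text.split("\n")
    meta = {}
    if lines and BEGIN_RE.match(lines[0]):
        lines.pop(0)  # the opening delimiter is simply skipped
    while lines:
        line = lines.pop(0)
        if line.strip() == "" or END_RE.match(line):
            break  # a blank line or closing delimiter ends the header
        match = META_RE.match(line)
        if match is None:
            lines.insert(0, line)  # not meta-data: give the line back
            break
        key = match.group("key").lower()
        meta.setdefault(key, []).append(match.group("value").strip())
    return meta, lines


doc = "---\ntitle: Demo\nauthor: John Doe\n\nemail: john@doe.com\n---\n\nHello World!"
meta, body = split_meta(doc)
```

With this input, `email` never reaches `meta`, because the blank line before it
has already closed the header.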

[MultiMarkdown]: http://fletcherpenney.net/MultiMarkdown_Syntax_Guide#metadata
[Meta-Data]: extensions/meta_data.html
[YAML]: http://yaml.org/
[#390]: Python-Markdown/markdown#390

### Table of Contents Extension Refactored

The [Table of Contents][TOC] Extension has been refactored and some new features
have been added.  See the documentation for a full explanation of each feature
listed below:

*   The extension now assigns the Table of Contents to the `toc` attribute of
    the Markdown class regardless of whether a "marker" was found in the document.
    Third party frameworks no longer need to insert a "marker," run the document
    through Markdown, then extract the Table of Contents from the document.

*   The Table of Contents Extension is now a "registered extension." Therefore, when the `reset`
    method of the Markdown class is called, the `toc` attribute on the Markdown
    class is cleared (set to an empty string).

*   When the `marker` configuration option is set to an empty string, the parser completely
    skips the process of searching the document for markers. This should save parsing
    time when the Table of Contents Extension is being used only to assign ids to headers.

*   A `separator` configuration option has been added allowing users to override the
    separator character used by the slugify function.

*   A `baselevel` configuration option has been added allowing users to set the base level
    of headers in their documents (h1-h6). This allows the header levels to be
    automatically adjusted to fit within the hierarchy of an HTML template.
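
To make the last two options concrete, here is a small sketch of what a
separator-aware slug function and a base level adjustment might look like.
Both `slugify_sketch` and `shift_level` are hypothetical illustrations under
simplifying assumptions. They are not the extension's implementation.

```python
import re
import unicodedata


def slugify_sketch(value, separator="-"):
    """Turn a header's text into an id, joining words with `separator`."""
    # Drop accents and any other non-ASCII characters.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove punctuation, then trim and lowercase.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and existing separators into one separator.
    pattern = r"[%s\s]+" % re.escape(separator)
    return re.sub(pattern, lambda match: separator, value)


def shift_level(level, baselevel=1):
    """Map header level 1-6 onto a hierarchy starting at `baselevel`."""
    return min(6, level + baselevel - 1)  # never go deeper than h6


slug = slugify_sketch("Héllo, World!", separator="_")
levels = [shift_level(n, baselevel=3) for n in (1, 2, 5)]
```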

[TOC]: extensions/toc.html

### Pygments can now be disabled

The [CodeHilite][ch] Extension has gained a new configuration option: `use_pygments`.
The option is `True` by default. Setting it to `False` turns off Pygments code
highlighting while preserving the language detection features of the extension.
Note that Pygments language guessing is not used in that case, as that would 'use
Pygments'. If a language is defined for a code block, it will be assigned to the
`<code>` tag as a class in the manner suggested by the [HTML5 spec][spec]
(alternate output will not be entertained) and could potentially be used by a JavaScript
library in the browser to highlight the code block.
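
As a rough illustration of that output shape, the sketch below builds an
unhighlighted code block whose `<code>` tag carries a `language-` prefixed class,
which is the convention the HTML5 spec suggests. `plain_code_block` is a
hypothetical helper written for Python 3. The extension's actual markup may
differ in detail (for example, in the attributes on the `<pre>` tag).

```python
from html import escape


def plain_code_block(source, lang=None):
    """Render code without highlighting, keeping the language as a class."""
    attr = ' class="language-%s"' % escape(lang, quote=True) if lang else ""
    # Escape the code itself so it is displayed rather than interpreted.
    return "<pre><code%s>%s</code></pre>" % (attr, escape(source, quote=False))


html_out = plain_code_block("print(1 < 2)", lang="python")
```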

[ch]: extensions/code_hilite.html
[spec]: http://www.w3.org/TR/html5/text-level-semantics.html#the-code-element

### Miscellaneous

Test coverage has been improved, including running [flake8]. While those changes
will not directly affect end users, the code is being better tested, which will
benefit everyone.

[flake8]: http://flake8.readthedocs.org/en/latest/

Various bug fixes have been made.  See the
[commit log](https://github.com/waylan/Python-Markdown/commits/master)
for a complete history of the changes.