meta-data feature? #77

Closed
slomo opened this issue Mar 23, 2012 · 9 comments

Comments

@slomo
Contributor

slomo commented Mar 23, 2012

Are there any plans to implement a metadata feature (maybe like the one linked)? If not, would you accept a patch for that? And if so, what should the API look like?

@trentm
Owner

trentm commented Mar 23, 2012

Definitely a reasonable feature. In my experience I've used this: https://github.com/mojombo/jekyll/wiki/YAML-Front-Matter

At first blush I like the required explicit --- fencing because it (a) sticks out more and (b) avoids possible accidental interpretation of non-metadata as metadata. What do you think? Would your current usage for this require NOT having the '---' fencing?

I definitely wouldn't want to get into parsing the metadata as YAML (don't want a YAML parser dependency). Definitely for starters I'd just want the metadata values to be strings.

While I understand the case for http://freewisdom.org/projects/python-markdown/Meta-Data using lists for metadata values (to allow multiple values), I'm not a fan.

Also, I'd be inclined to NOT support the multi-line form:

Authors: Waylan Limberg
         John Doe

If supporting multi-line values at all, I'd say parsing according to email RFC 822 headers would be closest in spirit to Markdown (though I realize that the same argument argues against having the --- fences). If parsing as RFC 822 headers, I'd consider using Python's "email" package:

From http://docs.python.org/library/email-examples.html:

# Import the email modules we'll need
from email.parser import Parser

#  Or for parsing headers in a string, use:
headers = Parser().parsestr('From: <user@example.com>\n'
        'To: <someone_else@example.com>\n'
        'Subject: Test message\n'
        '\n'
        'Body would go here\n')
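
To illustrate how the multi-line form above might be handled with that package, here is a minimal sketch. The sample text, the lower-casing of keys, and the whitespace collapsing are assumptions made for illustration, not something proposed in this thread:

```python
from email.parser import Parser

# Hypothetical document: an unfenced header block, a blank line, then Markdown.
# The indented second "Authors" line is a folded continuation line.
text = (
    "Title: My Document\n"
    "Authors: Waylan Limberg\n"
    "         John Doe\n"
    "\n"
    "*hi* there\n"
)

msg = Parser().parsestr(text)

# Collapse any whitespace left over from folded continuation lines so that
# every metadata value ends up as a single-line string.
metadata = {key.lower(): " ".join(value.split()) for key, value in msg.items()}

# Everything after the first blank line is treated as the message body.
body = msg.get_payload()
```

Note that this approach relies on the blank line to end the header block, so it does not combine directly with the --- fencing.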

I'd definitely accept a patch for this.

I think the API should be something like this:

>>> import markdown2
>>> s = """---
Foo: bar
---

*hi* there"""
>>> html = markdown2.markdown(s, extras=["metadata"])
>>> html
'<p><em>hi</em> there</p>'
>>> html.metadata
{'foo': 'bar'}

That "metadata" property would look like the current "UnicodeWithAttrs.toc_html" property in current markdown2.py: https://github.com/trentm/python-markdown2/blob/master/lib/markdown2.py#L1823.
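
A rough sketch of the idea, assuming a plain str subclass that carries an extra attribute. The names `HtmlWithMetadata` and `render` are hypothetical and are not taken from markdown2.py:

```python
class HtmlWithMetadata(str):
    """Hypothetical str subclass that carries parsed metadata alongside the HTML."""

    # Default for instances that never had metadata attached.
    metadata = None


def render(html, metadata):
    """Wrap already-converted HTML and attach the metadata dict to it."""
    result = HtmlWithMetadata(html)
    result.metadata = metadata
    return result


html = render("<p><em>hi</em> there</p>", {"foo": "bar"})
```

Because the result is still a string, existing callers that only use the HTML would keep working, and callers that want the metadata read `html.metadata`.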

Thoughts?

@trentm
Owner

trentm commented Mar 23, 2012

Note that I already have some prior art in my own code. My "restdown" processor accepts Markdown files with a leading metadata block, strips that off, then processes the rest of the file with markdown2.py. Here is the full metadata handling:

https://github.com/trentm/restdown/blob/master/bin/restdown#L178-187

That is the simple implementation of single-line-only key-value pairs. Single-line has proved good enough in my usage.
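
For readers who do not follow the link, a minimal sketch of single-line key-value parsing with --- fences could look like the following. This is not the restdown code; the function name `split_metadata`, the regular expression, and the lower-casing of keys are assumptions:

```python
import re

# A leading block fenced by "---" lines, e.g. "---\nFoo: bar\n---\n".
_fenced_block = re.compile(r"\A---[ \t]*\n(.*?\n)?---[ \t]*\n", re.DOTALL)


def split_metadata(text):
    """Return (metadata_dict, remaining_text) for a document.

    Only single-line "key: value" pairs are recognized; all values are strings.
    Text without a leading fenced block is returned unchanged.
    """
    match = _fenced_block.match(text)
    if not match:
        return {}, text
    metadata = {}
    for line in (match.group(1) or "").splitlines():
        if ":" not in line:
            continue  # ignore lines that are not key-value pairs
        key, value = line.split(":", 1)
        metadata[key.strip().lower()] = value.strip()
    return metadata, text[match.end():]


metadata, rest = split_metadata("---\nFoo: bar\n---\n\n*hi* there")
```

One caveat of this sketch: a document that opens with a --- horizontal rule and contains a second --- line later would be misread as having a metadata block.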

@slomo
Contributor Author

slomo commented Mar 23, 2012

Fencing works for my use case. I think it is also very reasonable.

If I have time next week, I'd like to take a first step and write some code.

@slomo
Contributor Author

slomo commented Mar 23, 2012

I have written a few lines implementing the feature: a35149e174778226218278fedd22c6fa8a4e03c8. Anything I should add or change?

I also tried to integrate this into the remaining code (03790bcbf3b507812779bd7ebdda341a4bb0eded). It feels very hacky. Wouldn't it be more elegant to put the code for extras in separate classes (each with convert, reset, ... methods) and then pass the existing class as state?

@trentm
Owner

trentm commented Mar 24, 2012

Uh, crazy:

That first link shows your commit. It sure made me think you'd hacked my repo and pushed a commit to it already. :) Feels like a GitHub bug.

@trentm
Owner

trentm commented Mar 24, 2012

Comments added on the commits. Generally looks great.

A test case would be much appreciated. Here is an example commit adding a test case to give you an idea how: 8d1ec6c
The test driver doesn't provide a way to test the parsed out metadata. You could just add a "test/tm-cases/metadata.metadata" file (a repr of the metadata dict) and I can add the metadata checking to the test driver.
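
As a sketch of what that check could look like, assuming the expected-metadata file holds the repr of a dict as described (the helper name `load_expected_metadata` is hypothetical and not part of the existing test driver):

```python
import ast


def load_expected_metadata(repr_text):
    """Parse the repr of a metadata dict without resorting to eval()."""
    expected = ast.literal_eval(repr_text.strip())
    if not isinstance(expected, dict):
        raise ValueError("expected the repr of a dict")
    return expected


# Contents that a hypothetical test/tm-cases/metadata.metadata file might hold.
expected = load_expected_metadata("{'foo': 'bar'}\n")

# The driver would then compare this against the parsed metadata of the test case.
actual = {"foo": "bar"}
metadata_matches = (actual == expected)
```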

Yes, granted the handling of "extras" isn't the cleanest separation in markdown2.py. I don't want to change that here, though. The support for extras grew organically. Having the extras spread throughout the other markdown2 processing is messy, but also means the Markdown class doesn't need to support a large number of specific hooks for all the places you see self.extras being used currently. I haven't found they've gotten in the way for maintenance so far.

@slomo
Contributor Author

slomo commented Mar 31, 2012

It took a while, but I finally had some time and tried to integrate all the suggestions. Pull request: #78

@trentm
Owner

trentm commented Apr 4, 2012

Added. See issue #78 for details. Thanks!

trentm closed this as completed Apr 4, 2012
@trentm
Owner

trentm commented Apr 4, 2012

Added a wiki page for this extra: https://github.com/trentm/python-markdown2/wiki/metadata
