export/manage as XML #8

dret · 2013-09-11T01:25:59Z

being an XML dinosaur, i'd love to get all of this as XML. but then again, there probably is not enough regularity in the current markdown, so maybe that's too much to ask for? in whichever way it's implemented, wouldn't it be great to have a machine-readable version of all of this information? it would be simple to generate markdown from it, and just as simple to generate other formats: whatever people want. and out of curiosity: how are the emacs and js versions generated currently?

andreineculau · 2013-09-11T09:26:27Z

Generating XML should be fine and rather easy - generation is a low-level hack atm (look in the dev branch). I guess you can go either md -> xml or json -> xml, but I'd prefer the former since md is the source.

PS: I went for having the markdown as the primary source - it will be humans making additions, and I don't plan on adding more information than the "title, description, link" rule-of-thumb, so the markdown will be rather easy to parse with brute regexps.
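To make the json -> xml route mentioned above concrete, here is a minimal sketch in Python. It assumes the structured data is a list of flat records following the "title, description, link" rule-of-thumb; the element names `entries` and `entry`, the function name, and the sample record are all hypothetical and not taken from the project.

```python
import xml.etree.ElementTree as ET


def records_to_xml(records, root_tag="entries", item_tag="entry"):
    """Serialize a list of flat dicts into an XML string.

    The tag names are illustrative assumptions, not the project's schema.
    """
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for key, value in record.items():
            # One child element per field; ElementTree escapes the text.
            ET.SubElement(item, key).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample record, for illustration only.
records = [
    {
        "title": "200 OK",
        "description": "Standard success & nothing else",
        "link": "https://example.org/200",
    },
]
xml_text = records_to_xml(records)
print(xml_text)
```

Because the records are flat, the same function would work whether they come from the published JSON or from rows parsed out of the markdown.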

andreineculau · 2013-09-11T11:13:26Z

I refactored the master & dev branches a bit to better reflect the intention.

dret · 2013-09-12T22:53:58Z

"let people do what they want and some regexes will parse that into robust structures" is among the more famous last words before something went down in flames. of course it's entirely your decision, but i think i'd rather stay away from writing regexes that probably break every now and then.
over at https://github.com/dret/HTML5-overview i have decided to go the opposite route and start from XML and drive MD from that (still need to work on that... :-), but of course that's also because i am an XML guy and have no issues with editing XML, which is something that maybe many people just don't want to do.
anyway, great initiative, and good luck!

andreineculau · 2013-09-13T14:19:58Z

Shame on me for expecting a boring reference to "Now you have two problems" :)

FWIW I have obviously started with the same reasoning locally (YML actually, not JSON; no visible commits) but I quickly switched to this "primitive" alternative. Just to lay down some thoughts leading to this outcome:

- a project switching from structured data to MD&regexes -> never (say never). It just feels stupid. But if I ever sense that the current setup is creating grief, be sure I will switch to structured data. Not sure if I will go through the trouble of trying a Markdown2AST parser first.
- I wanted to make use of github's MD rendering
- I wanted the (github's rendered) MD to always be the-latest-version because that's what people will read
- it's ok for the structured data to be out-of-date because it is for machines and they will be targeting a tag/hash, so they'll be out-of-date anyways.

Nice project you have as well, and
repeat after me: This data is "beautifully, unapologetically XML" :)

dret · 2013-09-13T16:05:05Z

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 would be the most appropriate reference here. i'll be happily managing my XML over at https://github.com/dret/HTML5-overview and contribute to HTTP in cthulhu markup ;-)
if you're still mildly interested: for HTML5, the XML is the master, but refreshing really is nothing more than running the xml2ms.xslt XSLT, which takes around 0.1 sec on my machine. done, all MDs refreshed, and no brittle regex magic required for anything. and i think it's more the other way around: if you provide an easily consumable starting point, you might find others (such as myself) using it to do interesting things. if you don't, these things are simply less likely to happen. so waiting for them to happen and then making the switch is kind of backwards.
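As a rough illustration of the "XML is the master, markdown is generated" workflow described above, the sketch below renders a markdown table from XML. It deliberately swaps XSLT for Python's standard `xml.etree.ElementTree`, so it is not the stylesheet referred to in the comment, and the element names (`entry`, `title`, `description`, `link`) are assumptions made only for this example.

```python
import xml.etree.ElementTree as ET

FIELDS = ("title", "description", "link")  # assumed child element names


def xml_to_markdown(xml_text):
    """Render every <entry> element as one row of a pipe-delimited table."""
    root = ET.fromstring(xml_text)
    lines = ["| Title | Description | Link |", "| --- | --- | --- |"]
    for entry in root.iter("entry"):
        cells = []
        for name in FIELDS:
            text = (entry.findtext(name) or "").strip()
            # Escape pipes so a cell cannot break the table layout.
            cells.append(text.replace("|", "\\|"))
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines) + "\n"


# Hypothetical input document, for illustration only.
SAMPLE_XML = """
<entries>
  <entry>
    <title>200 OK</title>
    <description>Standard success response</description>
    <link>https://example.org/200</link>
  </entry>
</entries>
"""
print(xml_to_markdown(SAMPLE_XML))
```

The point of the design is that the generated markdown is disposable: regenerating it is a single pass over the structured source, with no parsing of free-form text.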

andreineculau · 2013-09-15T13:22:14Z

FWIW

> the most appropriate reference here

I'm not using regex to parse HTML. I'm using regex to parse some very simple MD (specifically rows only, meaning column=pipe delimited tokens).
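A minimal sketch of this kind of row-only parsing, in Python: it assumes each data row holds exactly three pipe-delimited cells (title, description, link) and that a header row is followed by a `| --- |` separator. The function name, the regexes, and the sample table are hypothetical; this is not the project's actual generation script.

```python
import json
import re

# Matches the separator row under a table header, e.g. "| --- | :---: |".
SEPARATOR_RE = re.compile(r"^\s*\|?(\s*:?-+:?\s*\|)*\s*:?-+:?\s*\|?\s*$")
# Splits on pipes that are not escaped with a backslash.
CELL_SPLIT_RE = re.compile(r"(?<!\\)\|")


def parse_rows(markdown, fields=("title", "description", "link")):
    """Turn pipe-delimited markdown rows into a list of dicts."""
    records = []
    prev_was_row = False
    for line in markdown.splitlines():
        line = line.strip()
        if SEPARATOR_RE.match(line):
            if prev_was_row:
                records.pop()  # the row just before a separator is the header
            prev_was_row = False
            continue
        cells = [c.strip() for c in CELL_SPLIT_RE.split(line.strip("|"))]
        if "|" not in line or len(cells) != len(fields):
            prev_was_row = False  # not a table row of the expected shape
            continue
        records.append(dict(zip(fields, cells)))
        prev_was_row = True
    return records


# Hypothetical sample table, for illustration only.
SAMPLE = """
| Title | Description | Link |
| --- | --- | --- |
| 200 OK | Standard success response | https://example.org/200 |
| 404 Not Found | Resource does not exist | https://example.org/404 |
"""
records = parse_rows(SAMPLE)
print(json.dumps(records, indent=2))
```

Rows with an unexpected number of cells are skipped silently here; a stricter variant could raise an error instead, so that a malformed addition is caught as soon as it is parsed.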

> if you provide an easily consumable starting point

But I do - it's not MD, it's JSON atm. That's what is intended as a published package. I don't expect anyone else to consume MD.

dret closed this as completed Sep 12, 2013
