Try parsing semantic elements instead of line-based #11
Best of luck with this approach, the described rationale seems absolutely reasonable to me! 👍
Note to self: I thought tests passed before rebase at bf7104c ...
@schoettl The tests were crashing on this branch because of (what I think is) a typo. I took the liberty of fixing the typo in this commit. If it was not accidental, I apologize for stepping on your toes^^
@@ -151,7 +151,7 @@ keyword-value = anything-but-newline

 (* TODO allow empty properties with or without trailing space *)
 (* TODO looks like node-property-line also parses :END: *)
-node-property-line = ! <':END:> <':'> node-property-name [node-property-plus] <':'> ( <' '> node-property-value | [<' '>] ) <eol>
+node-property-line = ! <':END:'> <':'> node-property-name [node-property-plus] <':'> ( <' '> node-property-value | [<' '>] ) <eol>
Nice, this was a typo. Thanks!
You're welcome. Good work on the PR so far!
NB: I've asked my business partner for feedback on how to work with semantic elements and he's writing something up as I type this^^
They cannot work as long as the grammar defines line-based parsing
I have some changes I'd like to merge:
My TODOs after merge:
@schoettl This looks great! I was initially wary of this PR, mainly because of its title. A previous attempt of mine to move from line-based parsing to semantic blocks failed miserably. I just read through your changes and I'm totally happy with where this is going. 😄 Definitely LGTM! 👍
Thanks! Right, the title of this PR is wrong. I didn't switch to parsing of semantic blocks yet, but I prepared parts of it :) I'll try again in a new PR.
@schoettl I'm late to the party, but I don't want to miss the chance to say that this PR looks great. Good work and solid merge! 🙏
In this branch, I am working on the higher-level syntax according to https://orgmode.org/worg/dev/org-syntax.html
Specifically, I want to check whether we can move away from line-based parsing towards more semantic blocks, called "elements". The Org mode parser used for export is also called org-element.el.
The spec says that most elements of the syntax are not context-free and the categories for these elements are
Greater elements are, e.g., #+BEGIN_EXAMPLE blocks. Some of these blocks contain raw text (EXAMPLE, SRC, COMMENT, ...), others can contain formatted text (CENTER, QUOTE, ...). Hence, it's better to parse with awareness of context: parse the multi-line content of an EXAMPLE block as raw text, but the content of a CENTER block as formatted text.
Also, paragraphs, multi-line footnote definitions, lists, tables, and property drawers are maybe better parsed as units instead of line by line.