👌 IMPROVE: Use regular __init__ to create SyntaxTreeNodes #132
Merged
17 commits
aa57244 Attempt to make SyntaxTreeNode type annotations work when subclassing (hukkinj1)
fce3d21 Make _parent and _children properties (hukkinj1)
aea99ce Move _NesterTokens to module level now that SyntaxTreeNode has its ow… (hukkinj1)
1b4e817 Add setters for `parent` and `children` (hukkinj1)
54188a3 👌 IMPROVE: Use regular __init__ to create SyntaxTreeNodes (hukkinj1)
50ae360 Improve error message (hukkinj1)
300e539 Minor tweaks (hukkinj1)
9f6107b Merge branch 'master' into tree-types (chrisjsewell)
7153766 Merge remote-tracking branch 'upstream/master' into tree-types (hukkinj1)
e5f709b Address PR review (hukkinj1)
2f87a05 Merge branch 'tree-types' into tree-init (hukkinj1)
68585af Fix `from_tokens` references (hukkinj1)
c0b698a Fix __getitem__ type annotation (hukkinj1)
ec0f98d Merge branch 'tree-types' into tree-init (hukkinj1)
3954652 Merge remote-tracking branch 'upstream/master' into tree-init (hukkinj1)
4870159 Address PR review (hukkinj1)
c44f026 Merge branch 'master' into tree-init (chrisjsewell)
@@ -12,7 +12,6 @@
     Optional,
     Any,
     TypeVar,
-    Type,
     overload,
     Union,
 )
@@ -33,8 +32,7 @@ class SyntaxTreeNode:
     """A Markdown syntax tree node.

     A class that can be used to construct a tree representation of a linear
-    `markdown-it-py` token stream. Use `SyntaxTreeNode.from_tokens` to
-    initialize instead of the `__init__` method.
+    `markdown-it-py` token stream.

     Each node in the tree represents either:
       - root of the Markdown document
@@ -43,10 +41,12 @@ class SyntaxTreeNode:
           between
     """

-    def __init__(self) -> None:
-        """Initialize a root node with no children.
+    def __init__(
+        self, tokens: Sequence[Token] = (), *, create_root: bool = True
+    ) -> None:
+        """Initialize a `SyntaxTreeNode` from a token stream.

-        You probably need `SyntaxTreeNode.from_tokens` instead.
+        If `create_root` is True, create a root node for the document.
         """
         # Only nodes representing an unnested token have self.token
         self.token: Optional[Token] = None
@@ -61,6 +61,28 @@ def __init__(self) -> None:
         # children (i.e. inline or img)
         self._children: list = []

+        if create_root:
+            self._set_children_from_tokens(tokens)
+            return
+
+        if not tokens:
+            raise ValueError(
+                "Can only create root from empty token sequence."
+                " Set `create_root=True`."
+            )
+        elif len(tokens) == 1:
+            inline_token = tokens[0]
+            if inline_token.nesting:
+                raise ValueError(
+                    "Unequal nesting level at the start and end of token stream."
+                )
+            self.token = inline_token
+            if inline_token.children:
+                self._set_children_from_tokens(inline_token.children)
+        else:
+            self.nester_tokens = _NesterTokens(tokens[0], tokens[-1])
+            self._set_children_from_tokens(tokens[1:-1])
+
     def __repr__(self) -> str:
         return f"{type(self).__name__}({self.type})"
@@ -77,16 +99,6 @@ def __getitem__(
     ) -> Union[_NodeType, List[_NodeType]]:
         return self.children[item]

-    @classmethod
-    def from_tokens(cls: Type[_NodeType], tokens: Sequence[Token]) -> _NodeType:
-        """Instantiate a `SyntaxTreeNode` from a token stream.
-
-        This is the standard method for instantiating `SyntaxTreeNode`.
-        """
-        root = cls()
-        root._set_children_from_tokens(tokens)
-        return root
-
     def to_tokens(self: _NodeType) -> List[Token]:
         """Recover the linear token stream."""
@@ -186,23 +198,14 @@ def previous_sibling(self: _NodeType) -> Optional[_NodeType]:
             return self.siblings[self_index - 1]
         return None

-    def _make_child(
-        self: _NodeType,
-        *,
-        token: Optional[Token] = None,
-        nester_tokens: Optional[_NesterTokens] = None,
-    ) -> _NodeType:
-        """Make and return a child node for `self`."""
-        if token and nester_tokens or not token and not nester_tokens:
-            raise ValueError("must specify either `token` or `nester_tokens`")
-        child = type(self)()
-        if token:
-            child.token = token
-        else:
-            child.nester_tokens = nester_tokens
+    def _add_child(
+        self,
+        tokens: Sequence[Token],
+    ) -> None:
+        """Make a child node for `self`."""
+        child = type(self)(tokens, create_root=False)
         child.parent = self
         self.children.append(child)
-        return child

     def _set_children_from_tokens(self, tokens: Sequence[Token]) -> None:
         """Convert the token stream to a tree structure and set the resulting
@@ -211,27 +214,22 @@ def _set_children_from_tokens(self, tokens: Sequence[Token]) -> None:
         while reversed_tokens:
             token = reversed_tokens.pop()

-            if token.nesting == 0:
-                child = self._make_child(token=token)
-                if token.children:
-                    child._set_children_from_tokens(token.children)
+            if not token.nesting:
+                self._add_child([token])
                 continue

-            assert token.nesting == 1
+            if token.nesting != 1:
+                raise ValueError("Invalid token nesting")

             nested_tokens = [token]
             nesting = 1
-            while reversed_tokens and nesting != 0:
+            while reversed_tokens and nesting:
                 token = reversed_tokens.pop()
                 nested_tokens.append(token)
                 nesting += token.nesting
-            if nesting != 0:
+            if nesting:
                 raise ValueError(f"unclosed tokens starting {nested_tokens[0]}")
Member
maybe same here as above for noting the index of the problematic token in the token stream

-            child = self._make_child(
-                nester_tokens=_NesterTokens(nested_tokens[0], nested_tokens[-1])
-            )
-            child._set_children_from_tokens(nested_tokens[1:-1])
+            self._add_child(nested_tokens)

     def pretty(
         self, *, indent: int = 2, show_text: bool = False, _current: int = 0
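
A minimal, self-contained sketch of the behaviour this diff introduces, for tracing it without the library installed. `Token` and `Node` are simplified stand-ins (assumptions, not the library's classes): `Node` mirrors only the constructor, `_add_child`, and `_set_children_from_tokens` logic from the diff, the `reversed_tokens` setup is assumed because the diff does not show it, and a plain tuple stands in for `_NesterTokens`.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Token:
    """Hypothetical stand-in for a markdown-it-py token (only the fields used here)."""

    type: str
    nesting: int  # 1 = opening, 0 = self-contained, -1 = closing
    children: Optional[List["Token"]] = None


class Node:
    """Simplified stand-in for SyntaxTreeNode, following the diff."""

    def __init__(self, tokens: Sequence[Token] = (), *, create_root: bool = True) -> None:
        self.token: Optional[Token] = None
        self.nester_tokens: Optional[Tuple[Token, Token]] = None
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = []

        if create_root:
            # Root node: every top-level token (or token pair) becomes a child.
            self._set_children_from_tokens(tokens)
            return
        if not tokens:
            raise ValueError("Can only create root from empty token sequence.")
        if len(tokens) == 1:
            # A single token must be unnested (nesting == 0).
            if tokens[0].nesting:
                raise ValueError("Unequal nesting level at the start and end of token stream.")
            self.token = tokens[0]
            if tokens[0].children:
                self._set_children_from_tokens(tokens[0].children)
        else:
            # First and last tokens are the opening/closing pair.
            self.nester_tokens = (tokens[0], tokens[-1])
            self._set_children_from_tokens(tokens[1:-1])

    def _add_child(self, tokens: Sequence[Token]) -> None:
        child = type(self)(tokens, create_root=False)
        child.parent = self
        self.children.append(child)

    def _set_children_from_tokens(self, tokens: Sequence[Token]) -> None:
        reversed_tokens = list(reversed(tokens))  # assumed setup, not shown in the diff
        while reversed_tokens:
            token = reversed_tokens.pop()
            if not token.nesting:
                self._add_child([token])
                continue
            if token.nesting != 1:
                raise ValueError("Invalid token nesting")
            nested_tokens = [token]
            nesting = 1
            while reversed_tokens and nesting:
                token = reversed_tokens.pop()
                nested_tokens.append(token)
                nesting += token.nesting
            if nesting:
                raise ValueError(f"unclosed tokens starting {nested_tokens[0]}")
            self._add_child(nested_tokens)


stream = [
    Token("paragraph_open", 1),
    Token("inline", 0, children=[Token("text", 0)]),
    Token("paragraph_close", -1),
]
root = Node(stream)  # before this change the equivalent call was from_tokens(stream)
paragraph = root.children[0]
inline = paragraph.children[0]
```

With this change the constructor itself builds the tree, so calling code that previously used `from_tokens(tokens)` would pass the tokens to the class directly.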

it would be nice here to identify where in the token stream the issue is.
Maybe at the start of this function set `length = len(tokens)`, then here do `index = length - len(reversed_tokens)` and add this to the exception message
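
The suggestion above could look roughly like the following sketch. It works on a bare sequence of nesting values (1, 0, -1) rather than real tokens, and `check_nesting` and `error_message` are hypothetical names, not part of the library. Note that `length - len(reversed_tokens)`, evaluated after the pop, gives the one-based position of the token just taken, so the sketch subtracts 1 to report a zero-based index.

```python
from typing import Optional, Sequence


def check_nesting(nestings: Sequence[int]) -> None:
    """Walk nesting values like the parsing loop does, reporting indices on error."""
    length = len(nestings)
    remaining = list(reversed(nestings))
    while remaining:
        nesting = remaining.pop()
        # Zero-based index of the value just popped.
        index = length - len(remaining) - 1
        if not nesting:
            continue
        if nesting != 1:
            raise ValueError(f"Invalid token nesting at index {index}")
        start = index
        depth = 1
        # Consume values until the opening token at `start` is closed.
        while remaining and depth:
            depth += remaining.pop()
        if depth:
            raise ValueError(f"unclosed tokens starting at index {start}")


def error_message(nestings: Sequence[int]) -> Optional[str]:
    """Return the error text for a stream, or None if it is well nested."""
    try:
        check_nesting(nestings)
    except ValueError as exc:
        return str(exc)
    return None
```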

Note that due to the nested nature of the tree, there's a very good chance `tokens` won't be the same stream that the programmer input to `SyntaxTreeNode` init but rather a nested sub-stream. In this case showing the index could potentially be more confusing than helpful. I'm also a bit hesitant to add logic just for the sake of an exception message that, in software working as expected, should never be raised.
Can add this if you still think it's a good idea.
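
A small illustration of this point, using plain strings as stand-ins for tokens (the names are only illustrative). Each nesting level parses `tokens[1:-1]` of its parent's span, so the index a nested call would see differs from the index in the stream the caller originally passed in.

```python
# Illustrative token types only; strings stand in for Token objects.
tokens = [
    "bullet_list_open",
    "list_item_open",
    "paragraph_close",  # stray closing token with no matching opener
    "list_item_close",
    "bullet_list_close",
]

# Each child node parses only what lies between its opening and closing tokens.
list_body = tokens[1:-1]
item_body = list_body[1:-1]

index_in_input = tokens.index("paragraph_close")
index_in_substream = item_body.index("paragraph_close")
print(index_in_input, index_in_substream)  # 2 0
```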

ah yeh fair. I think, if only for myst-parser, I want to have a new exception (i.e. one that inherits from `ValueError`) that stores the token, so I can report maybe the line number (from `token.map`) or the whole token, but I can add that in a separate PR
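
One possible shape for such an exception, as a sketch only: `TokenNestingError` is a hypothetical name, and `Token` below is a stand-in dataclass carrying just the fields used here. It assumes `token.map` holds `[line_begin, line_end]` when present and is `None` otherwise.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """Hypothetical stand-in for a markdown-it-py token."""

    type: str
    nesting: int
    map: Optional[List[int]] = None  # assumed: [line_begin, line_end], if known


class TokenNestingError(ValueError):
    """Hypothetical ValueError subclass that keeps the offending token."""

    def __init__(self, message: str, token: Token) -> None:
        super().__init__(message)
        self.token = token

    @property
    def line(self) -> Optional[int]:
        """First source line of the token, or None when no map is available."""
        return self.token.map[0] if self.token.map else None


unclosed = Token("paragraph_open", 1, map=[4, 5])
try:
    raise TokenNestingError("unclosed tokens", unclosed)
except ValueError as exc:  # existing `except ValueError` handlers still catch it
    caught = exc
```

Because the class inherits from `ValueError`, callers that already catch `ValueError` would keep working, while a downstream parser could inspect `caught.token` or `caught.line` for reporting.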