‼️ BREAKING: Change `Token.attrs` to a dict #144

chrisjsewell · 2021-03-17T02:35:18Z

As discussed in #102, upstream the list format is only used to guarantee order: markdown-it/markdown-it#142, but in Python 3.7+ dictionary order is now guaranteed by the specification (in Python 3.6 is also preserved as an implementation detail).
This change improves typing and performance.

To minimize how breaking this change is, auto-conversion is done on Token initiation, i.e. you can still use Token("type", "tag", 0, attrs=[["key", "value"]]),
and also Token.as_dict(as_upstream=True) converts the dict back to null/list, so that they can still be directly compared to those produced in the debug tab of https://markdown-it.github.io/.

One should anyhow generally use the attrGet, attrSet, attrPush and attrJoin methods
to manipulate Token.attrs, which all have an identical signature to those upstream.

I also added the meta_serializer option to Token.as_dict, which now ensures that this method is always able to produce valid JSON.

codecov · 2021-03-17T02:36:00Z

Codecov Report

Merging #144 (cc75081) into master (00a28a6) will increase coverage by 0.03%.
The diff coverage is 93.22%.

@@            Coverage Diff             @@
##           master     #144      +/-   ##
==========================================
+ Coverage   96.38%   96.41%   +0.03%     
==========================================
  Files          72       72              
  Lines        3205     3204       -1     
==========================================
  Hits         3089     3089              
+ Misses        116      115       -1

Flag	Coverage Δ
pytests	`96.41% <93.22%> (+0.03%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
markdown_it/tree.py	`92.35% <75.00%> (-0.63%)`	⬇️
markdown_it/token.py	`92.55% <92.30%> (+0.59%)`	⬆️
markdown_it/renderer.py	`100.00% <100.00%> (+1.80%)`	⬆️
markdown_it/rules_block/list.py	`98.84% <100.00%> (ø)`
markdown_it/rules_block/table.py	`100.00% <100.00%> (ø)`
markdown_it/rules_core/linkify.py	`98.87% <100.00%> (ø)`
markdown_it/rules_inline/autolink.py	`96.22% <100.00%> (ø)`
markdown_it/rules_inline/image.py	`98.86% <100.00%> (ø)`
markdown_it/rules_inline/link.py	`98.83% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00a28a6...cc75081. Read the comment docs.

chrisjsewell · 2021-03-17T03:07:58Z

Should finally be the last thing before a v1.0.0a1 release @choldgraf!

hukkin

Nice work!

I added some mainly type annotation related suggestions.

hukkin · 2021-03-17T03:36:41Z

markdown_it/renderer.py

            # Fake token just to render attributes
-            tmpToken = Token(type="", tag="", nesting=0, attrs=tmpAttrs)
+            tmpToken = Token(type="", tag="", nesting=0, attrs=copy.copy(token.attrs))


dict also has dict.copy() method if you want to avoid import copy 😄

markdown_it/renderer.py

hukkin · 2021-03-17T03:48:50Z

markdown_it/token.py

 import warnings

 import attr


+def convert_attrs(value: Any) -> Any:


Wouldn't the typing be something like

Suggested change

def convert_attrs(value: Any) -> Any:

def convert_attrs(value: Union[List[list], Dict[str, Union[str, float]], None]) -> Dict[str, Union[str, float]]:

I'd say no, because conversion happens before validation in attrs, so technically it should be able to take anything and not fail

hukkin · 2021-03-17T03:52:02Z

markdown_it/token.py

-    # Html attributes. Format: `[ [ name1, value1 ], [ name2, value2 ] ]`
-    attrs: Optional[List[list]] = attr.ib(default=None)
+    # Html attributes. Note this differs from the upstream "list of lists" format
+    attrs: Dict[str, Union[str, int, float]] = attr.ib(


int is implied by float so is not required. Is float used anywhere however? I know int is.

Suggested change

attrs: Dict[str, Union[str, int, float]] = attr.ib(

attrs: Dict[str, Union[str, float]] = attr.ib(

Actually would be less painful to just use Any instead of the Union[str, float]] in case someone gets an item directly without attrGet. I've added suggestions also elsewhere to annotate read API with Any and write API with Union[str, float] because of this.

I'm a bit confused why you suggest this, because isinstance(1, float) equates to False so int are certainly not a subclass of float.
As you mention, int is used in core otherwise really I was thinking that perhaps it could just be str only, since all these values always anyway have to be converted to strings in HTML attributes <div key="value">.
I can't think of any reason, or HTML tag, that would require it not to be one of str, int, or float and this restriction ensures that Token is always "JSONable"

This is in PEP 484 (https://www.python.org/dev/peps/pep-0484/#the-numeric-tower):

when an argument is annotated as having type float, an argument of type int is acceptable

The argument typing system is not one to one with isinstance, it is more "duck typed".

it could just be str

I like this idea but only if we take the liberty to diverge from upstream and actually always set str values only, i.e. convert before assigning.

The argument typing system is not one to one with isinstance, it is more "duck typed".

Fair, although I think it is better for human readability/understanding to include int specifically. Bit weird IMO, that typing should diverge from isinstance

but only if we take the liberty to diverge from upstream

yeh thats why I ultimately decided against it, just not worth the hassle

markdown_it/token.py

hukkin · 2021-03-17T04:09:00Z

markdown_it/tree.py

@@ -272,13 +272,13 @@ def tag(self) -> str:
        return self._attribute_token().tag

    @property
-    def attrs(self) -> Dict[str, Any]:
+    def attrs(self) -> Dict[str, Union[str, int, float]]:


Suggested change

def attrs(self) -> Dict[str, Union[str, int, float]]:

def attrs(self) -> Dict[str, Any]:

hukkin · 2021-03-17T04:14:40Z

markdown_it/tree.py

-        return dict(token_attrs)  # type: ignore
+        return self._attribute_token().attrs
+
+    def attrGet(self, name: str) -> Any:


My preference would be to not add this method, but use attrs.get(name) or attrs[name] instead. Much more pythonic, and since this class is new API, there is nothing forcing us to copy the Token classes attrs setter and getter methods (which only exist because attrs was not a mapping in JS upstream).

hukkin · 2021-03-17T04:28:56Z

Should finally be the last thing before a v1.0.0a1 release

If you planning v1 already, could you have a look at #66 before releasing?

Out of all the issues we have, I think that is the only one including anything breaking, so if we decide to do it would be nice to do so before v1 I think

Co-authored-by: Taneli Hukkinen <hukkinj1@users.noreply.github.com>

chrisjsewell added 3 commits March 17, 2021 03:21

‼️ BREAKING: Change Token.attrs to a dict

dcaca4a

Update token.py

bfca4f9

Update port.yaml

1a3eb7d

chrisjsewell requested a review from hukkin March 17, 2021 02:35

chrisjsewell mentioned this pull request Mar 17, 2021

👌 IMPROVE: Make Token.attrs a List[Tuple[str, Any]] #102

Closed

hukkin requested changes Mar 17, 2021

View reviewed changes

chrisjsewell and others added 3 commits March 17, 2021 12:02

Apply suggestions from code review

69e6743

Co-authored-by: Taneli Hukkinen <hukkinj1@users.noreply.github.com>

Update renderer.py

b64e16f

remove Any types

cc75081

chrisjsewell merged commit 6204674 into master Mar 17, 2021

chrisjsewell deleted the Token.attrs branch March 17, 2021 11:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

‼️ BREAKING: Change `Token.attrs` to a dict #144

‼️ BREAKING: Change `Token.attrs` to a dict #144

chrisjsewell commented Mar 17, 2021 •

edited

codecov bot commented Mar 17, 2021 •

edited

chrisjsewell commented Mar 17, 2021

hukkin left a comment

hukkin Mar 17, 2021

chrisjsewell Mar 17, 2021

hukkin Mar 17, 2021

chrisjsewell Mar 17, 2021

hukkin Mar 17, 2021

hukkin Mar 17, 2021

chrisjsewell Mar 17, 2021 •

edited

hukkin Mar 17, 2021

chrisjsewell Mar 17, 2021

hukkin Mar 17, 2021

hukkin Mar 17, 2021 •

edited

hukkin commented Mar 17, 2021

	def convert_attrs(value: Any) -> Any:
	def convert_attrs(value: Union[List[list], Dict[str, Union[str, float]], None]) -> Dict[str, Union[str, float]]:

	attrs: Dict[str, Union[str, int, float]] = attr.ib(
	attrs: Dict[str, Union[str, float]] = attr.ib(

	def attrs(self) -> Dict[str, Union[str, int, float]]:
	def attrs(self) -> Dict[str, Any]:

‼️ BREAKING: Change Token.attrs to a dict #144

‼️ BREAKING: Change Token.attrs to a dict #144

Conversation

chrisjsewell commented Mar 17, 2021 • edited

codecov bot commented Mar 17, 2021 • edited

Codecov Report

chrisjsewell commented Mar 17, 2021

hukkin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chrisjsewell Mar 17, 2021 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hukkin Mar 17, 2021 • edited

Choose a reason for hiding this comment

hukkin commented Mar 17, 2021

‼️ BREAKING: Change `Token.attrs` to a dict #144

‼️ BREAKING: Change `Token.attrs` to a dict #144

chrisjsewell commented Mar 17, 2021 •

edited

codecov bot commented Mar 17, 2021 •

edited

chrisjsewell Mar 17, 2021 •

edited

hukkin Mar 17, 2021 •

edited