
TagList.tagify() doesn't normalize child.tagify() returns — None/float/Sequence slip past the boundary check #117

@schloerke

Summary

TagList.tagify() flattens a child's .tagify() return only when it's a TagList (or TagifiedTagList post-#116). For every other shape, it does a direct slot assignment (cp[i] = tagified_child) with no normalization. This means a child whose .tagify() returns None, an int/float, or a raw Sequence gets silently inserted as-is into the tagified tree. The post-tagify boundary check only watches for stray Tagifiable instances and lets these other shapes pass. They then break (or worse, silently misrender) at get_html_string time.

This is a pre-existing latent bug, exposed when reasoning about widening Tagified (the proposed Route A in the #116 follow-up discussion). It's independent of #116 itself — the bug exists on main today.

Reproducible example

from htmltools import TagList

class ReturnsNone:
    # Note: no `-> Tagified` annotation. Pyright can't refuse what it can't see.
    def tagify(self):
        return None

class ReturnsFloat:
    def tagify(self):
        return 3.14

class ReturnsList:
    def tagify(self):
        return ["a", "b"]   # raw list, not TagList

# Each of these stores the offending value as a child slot without complaint:
tl1 = TagList(ReturnsNone(), "after").tagify()
print(list(tl1))            # -> [None, 'after']

tl2 = TagList(ReturnsFloat(), "after").tagify()
print(list(tl2))            # -> [3.14, 'after']

tl3 = TagList(ReturnsList(), "after").tagify()
print(list(tl3))            # -> [['a', 'b'], 'after']  (nested list as a single slot)

# Rendering then fails far from the cause. Each call falls into the
# "must be a string" else branch of get_html_string with a non-str child
# (None, 3.14, ['a', 'b']). The try/except lets all three run in one script.
for tl in (tl1, tl2, tl3):
    try:
        print(repr(tl.get_html_string()))
    except Exception as err:
        print(type(err).__name__, err)

By contrast, passing None directly as a child, as in TagList(None, "after"), drops it correctly, because TagList.__init__ routes through _tagchilds_to_tagnodes (flatten in htmltools/_util.py drops None). The same value arriving via ReturnsNone().tagify() is kept. The asymmetry is the bug.

Where it goes wrong

htmltools/_core.py, inside TagList.tagify():

if isinstance(child, Tagifiable):
    tagified_child = child.tagify()
    if isinstance(tagified_child, TagList):
        # Flatten the returned TagList into this one.
        cp[i : i + 1] = _tagchilds_to_tagnodes(
            cast("TagList[TagNode]", tagified_child)
        )
    else:
        cp[i] = tagified_child     # <-- direct assignment, no normalization

The else branch should also pass the return through _tagchilds_to_tagnodes (or an equivalent single-item normalizer) so that:

  • None is dropped (slot becomes empty / item is removed)
  • int/float is str-ified
  • Sequence is flattened
  • Bare Tagifiable (the case that's already caught by the boundary check) raises with a clear error at the offending site

The post-tagify boundary check at the bottom of TagList.tagify:

for i, child in enumerate(cast("TagList[TagNode]", cp)):
    if isinstance(child, Tagifiable) and not isinstance(child, (Tag, TagList)):
        raise TypeError(...)

This check catches a stray Tagifiable (the case where .tagify() returned another Tagifiable), but not the None/float/Sequence cases.

Render path in TagList.get_html_string (around _core.py:551):

elif isinstance(child, Tagifiable):
    raise RuntimeError(...)
else:
    # If we get here, x must be a string.
    ...

The else branch assumes str. None/float/Sequence items fall here and either trip on string operations or stringify in unexpected ways.
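
As a rough illustration of both failure modes, here is a sketch that does not use htmltools at all. It uses the standard library's html.escape as a stand-in for whatever string handling the else branch performs, which is an assumption; the actual code in _core.py may differ. The names render_assuming_str and render_via_format are hypothetical.

```python
import html


def render_assuming_str(child):
    # Stand-in for the else branch: treats `child` as a str without checking.
    return html.escape(child)


def render_via_format(child):
    # Alternative stand-in: string formatting never fails, it just stringifies.
    return f"{child}"


for bad in (None, 3.14, ["a", "b"]):
    try:
        render_assuming_str(bad)
    except AttributeError as err:
        # html.escape calls .replace(), which none of these values have.
        print("trips:", err)
    print("misrenders as:", render_via_format(bad))
```

Either outcome shows up at render time, well after the .tagify() call that produced the value.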

Why static types don't fully catch it

The Tagified return type of Tagifiable.tagify() is narrow today (excludes None/float/Sequence), so a properly-annotated implementation gets a static error. However:

  1. Tagifiable is @runtime_checkable. Any class with a tagify method is structurally a Tagifiable at runtime, regardless of what .tagify() returns.
  2. A class with def tagify(self): return None (no annotation) has inferred return type None, but is still seen as Tagifiable at runtime.
  3. Users who route through Any or upcast to object lose the protocol return-type check.

The framework already defends against the analogous problem on the input side (passing None/float/Sequence to a constructor or mutator is normalized via _tagchilds_to_tagnodes). The output side of .tagify() should match.
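
Points 1 and 2 can be checked with a stand-in protocol. This is a minimal sketch using only typing; TagifiableSketch and its str return type are hypothetical and are not htmltools' actual definitions.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class TagifiableSketch(Protocol):
    # Stand-in for htmltools' Tagifiable; the real return type is Tagified.
    def tagify(self) -> str: ...


class ReturnsNone:
    # Unannotated on purpose, and not declared as implementing the protocol.
    def tagify(self):
        return None


obj = ReturnsNone()
# A runtime isinstance() check only looks for a `tagify` attribute;
# it never inspects what the method returns.
print(isinstance(obj, TagifiableSketch))  # True
print(obj.tagify() is None)               # True
```

The runtime check and the static return type are therefore two separate guarantees, and only the static one says anything about the returned value.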

Suggested fix

Replace the else: cp[i] = tagified_child branch with a single-item normalization step:

if isinstance(child, Tagifiable):
    tagified_child = child.tagify()
    # Normalize: drop None, str-ify float, flatten Sequence, validate the rest.
    # `_tagchilds_to_tagnodes` already does all of this for a single-element
    # iterable; wrapping in a 1-list and slice-assigning lets it handle the
    # 0-result (None), 1-result (passthrough/coerced), and many-result
    # (flattened Sequence) cases uniformly.
    cp[i : i + 1] = _tagchilds_to_tagnodes([tagified_child])

This collapses the two if isinstance(tagified_child, TagList): ... else: ... branches into one, since _tagchilds_to_tagnodes already flattens nested TagLists correctly.

The post-condition Tagifiable boundary check stays — it catches the case where a .tagify() returned a TagList whose contents include a stray Tagifiable.
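
To show why one slice assignment covers the 0-result, 1-result, and many-result cases, here is a self-contained sketch. normalize_children is a hypothetical stand-in for _tagchilds_to_tagnodes that only approximates the behavior described above (drop None, str-ify numbers, flatten sequences); splice is likewise illustrative.

```python
from collections.abc import Sequence


def normalize_children(items):
    """Hypothetical stand-in for _tagchilds_to_tagnodes."""
    out = []
    for item in items:
        if item is None:
            continue  # None is dropped
        if isinstance(item, str):
            out.append(item)  # strings pass through (checked before Sequence)
        elif isinstance(item, (int, float)):
            out.append(str(item))  # numbers are str-ified
        elif isinstance(item, Sequence):
            out.extend(normalize_children(item))  # sequences are flattened
        else:
            out.append(item)  # anything else is left for later validation
    return out


def splice(children, i, tagified_child):
    """Replace slot i with the normalized result (0, 1, or many items)."""
    cp = list(children)
    cp[i : i + 1] = normalize_children([tagified_child])
    return cp


print(splice(["slot", "x"], 0, None))        # ['x']
print(splice(["slot", "x"], 0, 3.14))        # ['3.14', 'x']
print(splice(["slot", "x"], 0, ["a", "b"]))  # ['a', 'b', 'x']
```

Because a slice assignment can change the list's length, any loop around it has to tolerate shifting indices. The existing TagList branch already slice-assigns, so the proposed change adds no new requirement there.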

Notes on scope

  • This is independent of #116 (Investigate: make TagifiedTagList a real subclass instead of a type alias) and lands cleanly on its own.
  • It IS, however, a prerequisite for any future widening of Tagified to match TagifiedChild (i.e., make Tagifiable.tagify() -> Tagified legally able to return float/None/Sequence). Without this fix, widening the static type would just turn a static-prevented bug into a runtime-silent one.
  • The fix doesn't change the static-type contract of Tagifiable.tagify(); it just makes the runtime contract honest for implementations that bypass the static check.

Suggested test cases

from htmltools import TagList

def test_tagify_returning_None_is_dropped():
    class N:
        def tagify(self):  # intentionally unannotated
            return None
    assert list(TagList(N(), "x").tagify()) == ["x"]

def test_tagify_returning_float_is_strified():
    class F:
        def tagify(self):
            return 3.14
    assert list(TagList(F(), "x").tagify()) == ["3.14", "x"]

def test_tagify_returning_list_is_flattened():
    class L:
        def tagify(self):
            return ["a", "b"]
    assert list(TagList(L(), "x").tagify()) == ["a", "b", "x"]
