Optimize HTMLTree.to_tuple_list/1 using tail recursion #670

Merged: philss merged 1 commit into philss:main from preciz:optimize-to-tuple on Apr 10, 2026

@preciz commented Apr 9, 2026

~15% faster and ~30% less memory usage

Replace `Enum.reduce/3` with a custom private tail-recursive function
(`do_to_tuple_list/3`) in `Floki.HTMLTree.to_tuple_list/1` and
`to_tuple/2`. This avoids anonymous function allocation for every
children list and skips the `Enum` protocol dispatch overhead.
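To illustrate the kind of rewrite described above, here is a minimal sketch on toy data. The module `ReduceVsRecursion`, its functions, and the `nodes` map are hypothetical names chosen for illustration and are not Floki's actual code; the sketch only contrasts an `Enum.reduce/3` call that builds a closure with an equivalent private tail-recursive helper.

```elixir
defmodule ReduceVsRecursion do
  # Hypothetical toy module for illustration; not taken from Floki.

  # Enum.reduce/3 version: every call creates an anonymous function
  # that captures `nodes`, and the work goes through the Enum module.
  def with_reduce(ids, nodes) do
    Enum.reduce(ids, [], fn id, acc -> [Map.fetch!(nodes, id) | acc] end)
  end

  # Explicit version: the same loop as a private tail-recursive function,
  # with `nodes` and the accumulator passed as plain arguments.
  def with_recursion(ids, nodes), do: do_collect(ids, nodes, [])

  defp do_collect([], _nodes, acc), do: acc

  defp do_collect([id | rest], nodes, acc) do
    do_collect(rest, nodes, [Map.fetch!(nodes, id) | acc])
  end
end

# Assumed sample data: ids listed in reverse order.
nodes = %{1 => :a, 2 => :b, 3 => :c}
ids = [3, 2, 1]

reduce_result = ReduceVsRecursion.with_reduce(ids, nodes)
recursion_result = ReduceVsRecursion.with_recursion(ids, nodes)
```

Both functions return the same list, so the change is behavior-preserving in this sketch; any speed or memory difference would need to be measured rather than assumed.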

Because the internal IDs (`root_nodes_ids` and `children_nodes_ids`)
are stored natively in reverse document order, iterating over them and
prepending to an accumulator (`[to_tuple(...) | acc]`) inherently builds
the resulting list of tuples in the correct document order without needing
any explicit `Enum.reverse/1` calls.
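The ordering argument above can be sketched with a simplified tree. Everything here is an assumption made for illustration: `ToyTree`, the plain map layout, and the `{tag, children_ids}` node shape stand in for Floki's real structs, and only the function name `do_to_tuple_list/3` mirrors the one mentioned in the description. The point is that when ids are stored in reverse document order, prepending each converted node yields document order with no reversal step.

```elixir
defmodule ToyTree do
  # Hypothetical simplified tree; not Floki's actual data structures.
  # `nodes` maps id => {tag, children_ids} or {:text, content},
  # with every id list kept in reverse document order.

  def to_tuple_list(%{root_ids: root_ids, nodes: nodes}) do
    do_to_tuple_list(root_ids, nodes, [])
  end

  # Walks the reversed id list and prepends each converted node,
  # so the accumulator ends up in document order.
  defp do_to_tuple_list([], _nodes, acc), do: acc

  defp do_to_tuple_list([id | rest], nodes, acc) do
    do_to_tuple_list(rest, nodes, [to_tuple(Map.fetch!(nodes, id), nodes) | acc])
  end

  defp to_tuple({:text, content}, _nodes), do: content

  defp to_tuple({tag, children_ids}, nodes) do
    {tag, [], do_to_tuple_list(children_ids, nodes, [])}
  end
end

# Document order is <div><p>hello</p><span></span></div>,
# so the div's children are stored as [4, 2] (span first, then p).
tree = %{
  root_ids: [1],
  nodes: %{
    1 => {"div", [4, 2]},
    2 => {"p", [3]},
    3 => {:text, "hello"},
    4 => {"span", []}
  }
}

result = ToyTree.to_tuple_list(tree)
```

In this sketch the recursion is a tail call across siblings, while descending into children is an ordinary nested call, so stack depth follows tree depth rather than the number of siblings.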

Performance improvements in extracting the raw HTML tuples from the tree:
- Small HTML: ~26% faster, ~29% less memory
- Medium HTML: ~14% faster, ~29% less memory
- Big HTML: ~13% faster, ~28% less memory
@philss merged commit 12c7549 into philss:main on Apr 10, 2026
6 checks passed