GodotDOMParser

Fetch a URL, parse its HTML, and query the DOM with CSS-like selectors, all in pure GDScript. There are no native dependencies, so it works on every platform Godot supports.

Engine: Godot 4.2+
License: MIT
Status: 0.1.0 — usable, forgiving HTML parser, subset of CSS selectors.

Install

1. Copy the addons/godot_dom_parser/ folder into your project's addons/ directory. (Or install via the AssetLib tab in the editor.)
2. Open Project → Project Settings → Plugins and enable GodotDOMParser.

All public classes register their class_name globally, so you can use DOMParser, DOMDocument, DOMNode, HTMLParser, and CSSSelector from anywhere without preload.

Quick start

extends Node

func _ready() -> void:
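    # DOMParser is a Node, so add it to the scene tree before calling fetch().
    var parser := DOMParser.new()
    add_child(parser)

    # fetch() is awaitable and returns null on a network or HTTP error.
    var doc: DOMDocument = await parser.fetch("https://example.com")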
    if doc == null:
        push_error("fetch failed")
        return

    print("Title: ", doc.get_title())

    for link in doc.query_selector_all("a[href]"):
        print(link.get_attribute("href"), " -> ", link.get_text_content())

Parsing a raw HTML string

var html := "<html><body><p class='hi'>hello <b>world</b></p></body></html>"
var doc := DOMParser.parse_html(html)
print(doc.query_selector("p.hi").get_text_content())  # "hello world"

API

`DOMParser` (Node)

Member	Description
`fetch(url: String) -> DOMDocument`	Awaitable. GETs the URL and returns a parsed document, or `null` on error.
`static parse_html(html: String) -> DOMDocument`	Parse an HTML string directly.
`user_agent: String`	UA string sent with requests.
`extra_headers: PackedStringArray`	Extra request headers, `"Name: value"` format.
`timeout_seconds: float`	Request timeout.
`max_redirects: int`	Redirects to follow.
signal `document_loaded(document)`	Emitted after a successful fetch.
signal `fetch_failed(error, response_code)`	Emitted on network or HTTP error.
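The request settings and signals in the table above can be combined as in the following sketch. The header value, user-agent string, timeout, and redirect limit are illustrative choices, not the addon's defaults, and the handler parameters are left untyped because the README does not state the signal argument types.

```gdscript
extends Node

# Build a parser with non-default request settings.
# Every value below is an example, not a documented default.
func make_parser() -> DOMParser:
    var parser := DOMParser.new()
    parser.user_agent = "MyGame/1.0"
    parser.extra_headers = PackedStringArray(["Accept-Language: en"])
    parser.timeout_seconds = 10.0
    parser.max_redirects = 3
    parser.document_loaded.connect(_on_document_loaded)
    parser.fetch_failed.connect(_on_fetch_failed)
    return parser

func _ready() -> void:
    var parser := make_parser()
    add_child(parser)
    # fetch() returns null on error, so the return value can be checked
    # here even though the signals are also connected.
    var doc: DOMDocument = await parser.fetch("https://example.com")
    if doc != null:
        print("Title: ", doc.get_title())

# Argument assumed to be the DOMDocument that was just fetched.
func _on_document_loaded(document) -> void:
    print("Loaded ", document.source_url)

func _on_fetch_failed(error, response_code) -> void:
    push_warning("Fetch failed: error=%s, response_code=%s" % [error, response_code])
```

Connecting the signals is optional when you await fetch() directly; they are mostly useful when other nodes need to react to a finished load.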

`DOMDocument` (extends `DOMNode`)

Member	Description
`source_url: String`	URL this document was fetched from (if any).
`raw_html: String`	The original HTML text.
`get_document_element()`	The `<html>` element (or first element child).
`get_head()` / `get_body()`	Convenience accessors.
`get_title() -> String`	Text of the `<title>` element.
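As a small illustration of the document-level members, the helper below (describe_document is a hypothetical name, not part of the addon) gathers a few of them into a Dictionary. It assumes raw_html is also populated for documents created with parse_html().

```gdscript
extends Node

# Summarize a document using only the DOMDocument members listed above.
func describe_document(doc: DOMDocument) -> Dictionary:
    return {
        "title": doc.get_title(),
        "source_url": doc.source_url,        # set only for fetched documents
        "html_chars": doc.raw_html.length(), # size of the original HTML text
    }

func _ready() -> void:
    var doc: DOMDocument = DOMParser.parse_html(
        "<html><head><title>Demo</title></head><body><p>hi</p></body></html>"
    )
    print(describe_document(doc))
```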

`DOMNode`

Member	Description
`tag_name: String`	Lowercase tag (e.g. `"div"`). Empty for text/comment.
`attributes: Dictionary`	Attribute map (keys lowercased).
`children: Array[DOMNode]`	Child nodes.
`parent: DOMNode`	Parent (may be `null`).
`text: String`	Text content for text/comment nodes.
`is_element()` / `is_text()` / `is_void()`	Type predicates.
`get_attribute(name, default="")`	Read attribute.
`has_attribute(name)` / `set_attribute(name, value)` / `remove_attribute(name)`	Attribute CRUD.
`get_id()` / `get_classes()` / `has_class(cls)`	Shortcuts.
`get_text_content()`	Concatenated text of this node and descendants.
`get_inner_html()` / `get_outer_html()`	Serialize back to HTML.
`append_child(n)` / `remove_child(n)` / `remove()`	Tree mutation.
`get_element_by_id(id)`	First descendant element with that `id`.
`get_elements_by_tag_name(tag)`	All descendant elements with that tag (`"*"` for all).
`get_elements_by_class_name(cls)`	All descendant elements with that class.
`query_selector(sel)`	First descendant matching the selector.
`query_selector_all(sel)`	All descendants matching the selector.
`matches(sel)`	Does this node match the selector?
`walk()` / `walk_elements()`	Pre-order traversal helpers.
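The lookup and attribute helpers compose naturally. The sketch below collects href/text pairs for every link under a node; collect_links and the "external" class name are made up for the example, and the loop assumes get_elements_by_tag_name() returns something iterable, as query_selector_all() does in the quick start.

```gdscript
extends Node

# Collect {href, text} entries for each <a> under `root`, skipping anchors
# without an href and anchors carrying the (hypothetical) "external" class.
func collect_links(root: DOMNode) -> Array:
    var links := []
    for a in root.get_elements_by_tag_name("a"):
        if not a.has_attribute("href") or a.has_class("external"):
            continue
        links.append({
            "href": a.get_attribute("href"),
            "text": a.get_text_content().strip_edges(),
        })
    return links

func _ready() -> void:
    var doc: DOMDocument = DOMParser.parse_html(
        "<div><a href='/x'>X</a><a class='external' href='/y'>Y</a><a>Z</a></div>"
    )
    for link in collect_links(doc):
        print(link["href"], " -> ", link["text"])
```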

Supported CSS selectors

Type / universal: div, *
ID: #main
Class: .title, .a.b (multiple)
Attribute:
- [disabled] — present
- [type="text"] — exact
- [class~="hero"] — whitespace-separated word
- [href^="https"] — prefix
- [href$=".pdf"] — suffix
- [href*="foo"] — substring
- [lang|="en"] — exact or "en-" prefix
Combinators: descendant (space), child (>), adjacent sibling (+), general sibling (~)
Selector lists: a, b, c
Pseudo-classes: :first-child, :last-child, :only-child, :first-of-type, :last-of-type, :not(<simple>)
Positional pseudo-classes: :nth-child(<an+b>), :nth-last-child(<an+b>), :nth-of-type(<an+b>), :nth-last-of-type(<an+b>). The argument may be an integer, odd, even, or full an+b notation such as 2n+1 or -n+3.

Examples:

doc.query_selector_all("article.post > h2 a[href^='https']")
doc.query_selector_all("ul.nav li:first-child")
doc.query_selector_all("p:not(.muted)")
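If the an+b notation is unfamiliar, the following standalone sketch shows the rule CSS defines for it: a 1-based position matches when it equals a*n + b for some integer n >= 0. This is an illustration of the semantics only, not the addon's implementation, and matches_an_plus_b is a hypothetical helper.

```gdscript
extends Node

# True if the 1-based position `index` equals a*n + b for some integer n >= 0.
static func matches_an_plus_b(a: int, b: int, index: int) -> bool:
    if a == 0:
        # No step: only the single position b can match.
        return index == b
    var diff := index - b
    # diff must be a whole multiple of a, and n = diff / a must not be
    # negative, which holds when diff is zero or has the same sign as a.
    return diff % a == 0 and diff * a >= 0

func _ready() -> void:
    # 2n+1 (equivalent to "odd") selects positions 1, 3, 5, ...
    print(matches_an_plus_b(2, 1, 3))   # true
    print(matches_an_plus_b(2, 1, 4))   # false
    # -n+3 selects only the first three positions.
    print(matches_an_plus_b(-1, 3, 2))  # true
    print(matches_an_plus_b(-1, 3, 4))  # false
```

Under this rule, "li:nth-child(-n+3)" reads as "the first three list items" and "tr:nth-child(2n)" as "every even row".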

Interacting with the DOM

The tree is fully mutable. Changes are reflected by get_outer_html().

var body := doc.get_body()
var new_p := DOMNode.create_element("p")
new_p.set_attribute("class", "added")
new_p.append_child(DOMNode.create_text("injected from Godot"))
body.append_child(new_p)

for node in doc.query_selector_all(".advert"):
    node.remove()

print(doc.get_outer_html())

Limitations

Not a spec-compliant HTML5 parser. It's forgiving enough for typical pages (void elements, unquoted attributes, implicit <p>/<li> closing, raw-text for <script>/<style>), but edge cases in table foster-parenting, <template>, and malformed markup are handled heuristically.
Entity decoding covers the numeric (&#...;, &#x...;) forms plus a small named-entity table. Uncommon named entities pass through as-is.
Selectors do not (yet) support namespaces or case-insensitive attribute matching ([attr=val i]).
JavaScript is not executed. If a page renders its content client-side, you'll only see the initial HTML.
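Until the [attr=val i] form is supported, one possible workaround is to select on attribute presence and compare values by hand. The helper below is a sketch under that assumption; query_attr_ci is a hypothetical name, and it relies only on the documented [attr] presence selector and get_attribute().

```gdscript
extends Node

# Return elements under `root` whose attribute `attr` equals `value`,
# ignoring letter case. A manual stand-in for the unsupported [attr=val i].
func query_attr_ci(root: DOMNode, attr: String, value: String) -> Array:
    var wanted := value.to_lower()
    var hits := []
    # "[attr]" matches any element that has the attribute at all;
    # the case-insensitive comparison is then done in GDScript.
    for el in root.query_selector_all("[%s]" % attr):
        if str(el.get_attribute(attr)).to_lower() == wanted:
            hits.append(el)
    return hits

func _ready() -> void:
    var doc: DOMDocument = DOMParser.parse_html(
        "<form><input type='TEXT'><input type='text'><input type='radio'></form>"
    )
    print(query_attr_ci(doc, "type", "text").size())
```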

Contributing

Bug reports and PRs welcome. If you hit HTML that parses incorrectly, a minimal reproducing snippet is the most useful thing you can send.

Tests

The test suite lives in its own repository: codeWonderland/godot-dom-parser-tests. It's kept separate from the addon so the AssetLib download stays small and clutter-free — users who just want to drop the addon into their project shouldn't have to pull test fixtures, a test runner, and extra scenes.

If you're submitting a PR against this addon, please clone the tests repo alongside it, add/update tests for your change, and confirm the full suite still passes. The tests repo uses this repo as a git submodule and explains its setup in its own README.
