SimdXml

SIMD-accelerated XML parsing with full XPath 1.0 support for Elixir.

SimdXml parses XML into a flat structural index (~16 bytes per tag) using SIMD instructions, then evaluates XPath expressions against it using array operations. No DOM tree, no atom creation from untrusted input, no XXE vulnerabilities.

Wraps the simdxml Rust crate via Rustler NIFs with precompiled binaries for all major platforms.
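
To make the "flat index plus array operations" idea above more concrete, here is a toy sketch in plain Elixir. It is not SimdXml's actual layout or algorithm: the `FlatIndexSketch` module, its tuple fields, and the hand-written index are all assumptions made for illustration. It only shows how a query such as //title can reduce to a scan over a flat array of tag records instead of a walk over a tree of nodes.

```elixir
defmodule FlatIndexSketch do
  # Hypothetical record per tag: {name, depth, parent_position}.
  # The real index is built by the Rust crate and is laid out differently.

  # Hand-written index for:
  #   <library><book><title/></book><book><title/></book></library>
  def sample_index do
    [
      {"library", 0, nil},
      {"book", 1, 0},
      {"title", 2, 1},
      {"book", 1, 0},
      {"title", 2, 3}
    ]
  end

  # //name: positions of every tag with this name, found by one linear scan.
  def descendants(index, name) do
    index
    |> Enum.with_index()
    |> Enum.filter(fn {{tag, _depth, _parent}, _pos} -> tag == name end)
    |> Enum.map(fn {_record, pos} -> pos end)
  end

  # Children of the tag at `parent_pos`: a scan over the parent column.
  def children(index, parent_pos) do
    index
    |> Enum.with_index()
    |> Enum.filter(fn {{_tag, _depth, parent}, _pos} -> parent == parent_pos end)
    |> Enum.map(fn {_record, pos} -> pos end)
  end
end

index = FlatIndexSketch.sample_index()

FlatIndexSketch.descendants(index, "title")
#=> [2, 4]

FlatIndexSketch.children(index, 0)
#=> [1, 3]
```

Both lookups touch only a list of small fixed-shape records, which is the property the per-tag memory figure above refers to.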

Installation

def deps do
  [{:simdxml, "~> 0.1.0"}]
end

Precompiled NIF binaries are provided for macOS (Apple Silicon, Intel), Linux (x86_64, aarch64, musl), and Windows. Set SIMDXML_BUILD=1 to compile from source if needed.

Quick start

# Parse
doc = SimdXml.parse!("<library><book lang='en'><title>Elixir</title></book></library>")

# Query with XPath
SimdXml.xpath_text!(doc, "//title")
#=> ["Elixir"]

# Navigate elements (Enumerable)
root = SimdXml.Document.root(doc)
Enum.map(root, & &1.tag)
#=> ["book"]

# Attributes
[book] = SimdXml.Element.children(root)
SimdXml.Element.get(book, "lang")
#=> "en"

Query combinators

Build XPath queries with Elixir pipes instead of strings:

import SimdXml.Query

query = descendant("book") |> where_attr("lang", "en") |> child("title") |> text()

SimdXml.query!(doc, query)
#=> ["Elixir"]

# Inspect the generated XPath
SimdXml.Query.to_xpath(query)
#=> "//book[@lang='en']/title/text()"

Queries are composable data structures — extract common fragments and reuse them:

books = descendant("book")
english = books |> where_attr("lang", "en")
titles = english |> child("title") |> text()
authors = english |> child("author") |> text()
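
As a short end-to-end sketch, the fragments above can be run against the Quick start document and inspected. It uses only functions already shown in this README and assumes the `:simdxml` dependency is available (for example in `iex -S mix`). The sample document has no author element, so no output is claimed for that query.

```elixir
import SimdXml.Query

doc = SimdXml.parse!("<library><book lang='en'><title>Elixir</title></book></library>")

# Shared prefix, defined once.
english = descendant("book") |> where_attr("lang", "en")

# Two queries built from the same prefix.
titles = english |> child("title") |> text()
authors = english |> child("author") |> text()

# Same query as in the Quick start, so the same result.
SimdXml.query!(doc, titles)
#=> ["Elixir"]

# Print the XPath string each composed query expands to.
IO.puts(SimdXml.Query.to_xpath(titles))
IO.puts(SimdXml.Query.to_xpath(authors))
```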

Compiled queries

Compile once, evaluate against many documents:

query = SimdXml.compile!("//title")

SimdXml.eval_text!(doc1, query)
SimdXml.eval_text!(doc2, query)

# Optimized short-circuit operations
SimdXml.eval_count!(doc, query)     #=> 1
SimdXml.eval_exists?(doc, query)    #=> {:ok, true}

Compiled queries are NIF resources — safe to share across processes, store in ETS, or hold in module attributes.
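
A minimal sketch of sharing one compiled query across processes, assuming the `:simdxml` dependency is available. `Task.async_stream/2` is from the Elixir standard library; the two sample documents are made up for illustration. Each document is parsed inside its own task, and only the compiled query is shared.

```elixir
# Compile once in the parent process.
query = SimdXml.compile!("//title")

# Illustrative input documents.
xml_docs = [
  "<library><book><title>Elixir</title></book></library>",
  "<library><book><title>Erlang</title></book></library>"
]

# Each task runs in its own process and reuses the same compiled query.
results =
  xml_docs
  |> Task.async_stream(fn xml ->
    doc = SimdXml.parse!(xml)
    SimdXml.eval_text!(doc, query)
  end)
  |> Enum.map(fn {:ok, titles} -> titles end)
```

`results` holds one entry per input document, in input order, because `Task.async_stream/2` is ordered by default.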

Batch processing

Process thousands of documents with bloom filter prescanning:

query = SimdXml.compile!("//claim")
{:ok, results} = SimdXml.Batch.eval_text_bloom(xml_binaries, query)

Documents that cannot contain the target tags are skipped without parsing.
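
The prescanning idea can be sketched in plain Elixir. This is not how SimdXml implements it: the `PrescanSketch` module is hypothetical, and a plain substring test stands in for the library's bloom filter. The sketch only illustrates the principle of a cheap byte-level check that rules documents out before any parsing happens.

```elixir
defmodule PrescanSketch do
  # Hypothetical helper, not part of SimdXml. A substring test is
  # swapped in for the bloom filter to keep the sketch short.

  # True if the raw bytes contain "<tag", i.e. the document might hold
  # the element. False positives (such as "<claims") only cost a parse.
  # Namespace-prefixed tags are not handled by this simplification.
  def may_contain?(xml, tag) when is_binary(xml) and is_binary(tag) do
    :binary.match(xml, "<" <> tag) != :nomatch
  end

  # Split documents into those worth parsing and those that can be skipped.
  def partition(xml_binaries, tag) do
    Enum.split_with(xml_binaries, &may_contain?(&1, tag))
  end
end

docs = [
  "<patent><claim>A widget.</claim></patent>",
  "<patent><abstract>No claims here.</abstract></patent>"
]

{to_parse, skipped} = PrescanSketch.partition(docs, "claim")

length(to_parse)
#=> 1

length(skipped)
#=> 1
```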

Quick grep mode

For simple //tagname extraction at memory-bandwidth speed, without building a structural index:

scanner = SimdXml.Quick.new("claim")
SimdXml.Quick.extract_first(scanner, xml)    #=> "First claim text"
SimdXml.Quick.exists?(scanner, xml)          #=> true
SimdXml.Quick.count(scanner, xml)            #=> 42
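
A scanner can presumably be reused across many documents, as in this short sketch. It uses only the functions shown above and assumes, based on the example outputs, that `exists?/2` returns a boolean and `count/2` an integer. The `xml_binaries` list is made up for illustration, and `:simdxml` is assumed to be available.

```elixir
scanner = SimdXml.Quick.new("claim")

# Illustrative input documents.
xml_binaries = [
  "<patent><claim>First</claim><claim>Second</claim></patent>",
  "<patent><abstract>Nothing to see</abstract></patent>"
]

# Keep only the documents that contain at least one <claim>.
with_claims = Enum.filter(xml_binaries, &SimdXml.Quick.exists?(scanner, &1))

# Total number of <claim> elements across all documents.
total =
  xml_binaries
  |> Enum.map(&SimdXml.Quick.count(scanner, &1))
  |> Enum.sum()
```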

Result helpers

SimdXml.Result.one(doc, "//title")           #=> "Elixir"
SimdXml.Result.fetch(doc, "//title")         #=> {:ok, "Elixir"}
SimdXml.Result.all(doc, "//title")           #=> ["Elixir"]
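
The tagged tuple returned by `fetch` fits naturally into a `case` expression, sketched below. The value returned when nothing matches is not shown in this README, so the sketch handles it with a catch-all clause instead of assuming a specific shape. `:simdxml` is assumed to be available.

```elixir
doc = SimdXml.parse!("<library><book lang='en'><title>Elixir</title></book></library>")

# Branch on the tagged tuple instead of assuming a match exists.
message =
  case SimdXml.Result.fetch(doc, "//title") do
    {:ok, title} -> "Found: " <> title
    # Catch-all: the no-match return shape is an unknown here.
    other -> "No single title (got #{inspect(other)})"
  end

IO.puts(message)
```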

Why SimdXml?

|             | SimdXml             | SweetXml              | Saxy              |
|-------------|---------------------|-----------------------|-------------------|
| Parser      | SIMD Rust NIF       | xmerl (Erlang)        | Pure Elixir SAX   |
| XPath       | Full 1.0            | Full 1.0 (via xmerl)  | None              |
| Memory      | ~16 bytes/tag       | ~350 bytes/node       | Streaming         |
| Atom safety | Strings only        | Creates atoms         | Strings only      |
| XXE safe    | No DTD processing   | Vulnerable by default | No DTD processing |
| API         | Combinators + XPath | `~x` sigil            | SAX handlers      |
| Batch       | Bloom-filtered      | No                    | No                |

Documentation

Full API docs and interactive Livebook guides:

License

MIT
