logue/umd-core

Universal Markdown (UMD)

A next-generation Markdown parser built with Rust, combining CommonMark compliance (~75%+), Bootstrap 5 integration, semantic HTML generation, and an extensible plugin system. It maintains backward compatibility with legacy UMD syntax.

Status: Production-ready | Latest Update: 2026-03-03 | License: MIT

🧩 Philosophy

  1. Semantic-First: Markdown is not just a shorthand for HTML. It is a structured document. Universal Markdown ensures every element is wrapped in semantically correct tags (e.g., using <figure> for code blocks) to enhance SEO and accessibility.

  2. Empowerment without Complexity: Inspired by the PukiWiki legacy, we provide rich formatting (alignment, coloring, etc.) without forcing users to write raw HTML. We believe in "Expressive Markdown."

  3. Universal Media Handling: The standard image tag is redefined as a versatile "Media Tag." Whether the target is an image, video, or audio file, the parser determines the best output from the file extension.


Features

Core Markdown

  • ✅ CommonMark Compliant (~75%+ specification compliance)
  • ✅ GFM Extensions (tables, strikethrough, task lists, footnotes)
  • ✅ HTML5 Semantic Tags (optimized for accessibility and SEO)
  • ✅ Bootstrap 5 Integration (automatic utility class generation)

Media & Content

  • ✅ Auto-detect Media Files: ![alt](url) intelligently becomes <video>, <audio>, <picture>, or download link based on file extension
  • ✅ Semantic HTML Elements: &badge(), &ruby(), &sup(), &time(), etc.
  • ✅ Definition Lists: :term|definition syntax with block-level support
  • ✅ Code Blocks with Bootstrap Integration: Class-based language output (<code class="language-*">) and syntect highlighting
  • ✅ Mermaid SSR: ```mermaid blocks are rendered server-side as <figure class="code-block code-block-mermaid mermaid-diagram">...<svg>...</svg></figure>

Tables & Layout

  • ✅ Markdown Tables: Standard GFM tables with sorting capability
  • ✅ UMD Tables: Extended tables with cell spanning (|> colspan, |^ rowspan)
  • ✅ Cell Decoration: alignment (LEFT/CENTER/RIGHT/JUSTIFY), color, size control
  • ✅ Block Decorations: SIZE, COLOR, positioning with Bootstrap prefix syntax

Interactivity & Data

  • ✅ Plugin System: Inline (&function(args){content};) and block (@function(args){{ content }}) modes
  • ✅ Frontmatter: YAML/TOML metadata (separate from HTML output)
  • ✅ Footnotes: Structured data output (processed server-side by Nuxt/Laravel)
  • ✅ Custom Header IDs: # Header {#custom-id} syntax
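
The custom header ID syntax can be illustrated with a small, std-only sketch. `split_header_id` is a hypothetical helper rather than part of the umd API, and the accepted ID character set (ASCII letters, digits, `-`, `_`) is an assumption made for this example.

```rust
/// Splits a trailing `{#id}` marker off a heading line.
/// Hypothetical helper for illustration; the allowed ID characters are an assumption.
fn split_header_id(line: &str) -> (&str, Option<&str>) {
    let trimmed = line.trim_end();
    if let Some(body) = trimmed.strip_suffix('}') {
        if let Some((text, id)) = body.rsplit_once("{#") {
            let valid = !id.is_empty()
                && id.chars().all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
            if valid {
                // Heading text without the marker, plus the extracted ID.
                return (text.trim_end(), Some(id));
            }
        }
    }
    // No well-formed marker: the line is returned unchanged.
    (trimmed, None)
}

fn main() {
    let (text, id) = split_header_id("# Header {#custom-id}");
    println!("{text:?} {id:?}");
}
```

Lines without a well-formed marker fall through unchanged, so ordinary headings keep whatever ID the renderer would generate by default.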

Advanced Features

  • ✅ UMD Backward Compatibility: Legacy PHP implementation syntax support
  • ✅ Block Quotes: UMD format > ... < + Markdown > prefix
  • ✅ Discord-style Spoilers: ||hidden text|| syntax
  • ✅ Underline & Emphasis Variants: Both semantic (**bold**, *italic*) and visual (''bold'', '''italic''')

Security

  • ✅ XSS Protection: Input HTML fully escaped, user input never directly embedded
  • ✅ URL Sanitization: Blocks dangerous schemes (javascript:, data:, vbscript:, file:)
  • ✅ Safe Link Handling: <URL> explicit markup only (bare URLs not auto-linked)
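
To make the scheme blocklist concrete, here is a minimal sketch using only the standard library. `is_blocked_url` is a hypothetical name, not a umd function, and the whitespace/control-character stripping is an assumption about how such a check is usually hardened; the crate's actual normalization may differ.

```rust
/// Schemes listed as blocked in this README.
const BLOCKED_SCHEMES: [&str; 4] = ["javascript", "data", "vbscript", "file"];

/// Returns true when `url` starts with a blocked scheme.
/// Hypothetical helper for illustration; not part of the umd API.
fn is_blocked_url(url: &str) -> bool {
    // Browsers tolerate whitespace and control characters around and inside
    // the scheme, so drop them before comparing (assumed hardening step).
    let cleaned: String = url
        .chars()
        .filter(|c| !c.is_ascii_whitespace() && !c.is_ascii_control())
        .collect();
    match cleaned.split_once(':') {
        Some((scheme, _rest)) => BLOCKED_SCHEMES
            .iter()
            .any(|blocked| scheme.eq_ignore_ascii_case(blocked)),
        // No colon means a relative URL with no scheme to block.
        None => false,
    }
}

fn main() {
    for url in ["javascript:alert(1)", "https://example.com/", "images/photo.png"] {
        println!("{url} blocked: {}", is_blocked_url(url));
    }
}
```

Comparing case-insensitively matters because URL schemes are case-insensitive, so `JaVaScRiPt:` has to be rejected as well.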

Platform Support

  • ✅ WebAssembly (WASM): Browser-side rendering via wasm-bindgen
  • ✅ Server-side Rendering: Rust library for backend integration (Nuxt, Laravel, etc.)

Mermaid Example

Input:

```mermaid
flowchart TD
    A[Start] --> B[End]
```

Output (excerpt):

<figure
  class="code-block code-block-mermaid mermaid-diagram"
  data-mermaid-source="flowchart TD..."
>
  <svg><!-- rendered by mermaid-rs-renderer --></svg>
</figure>

Syntax Highlight Example

Input:

```rust
fn main() {
        println!("hello");
}
```

Output (excerpt):

<pre><code class="language-rust syntect-highlight" data-highlighted="true"><span class="syntect-source syntect-rust">...</span></code></pre>

Code Block Specification

UMD code blocks use a Rust-first hybrid strategy with frontend fallback.

Output Rules

  • pre never gets a lang attribute
  • Language is represented as class="language-xxx" on <code>
  • If Syntect highlights on server side:
    • class="language-xxx syntect-highlight"
    • data-highlighted="true" is added
  • If language is not supported by Syntect:
    • Keep class="language-xxx" and let frontend highlighter process it
  • mermaid is handled separately and rendered as SVG <figure class="... mermaid-diagram">
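
The output rules above can be sketched as a small decision function. This is an illustration only: `CodeOutput` and `code_output` are hypothetical names, and the `syntect_supported` flag stands in for a real Syntect syntax lookup.

```rust
#[derive(Debug, PartialEq)]
enum CodeOutput {
    /// Rendered server-side as an SVG figure; no `<code>` tag is produced.
    MermaidFigure,
    /// Opening `<code>` tag for the block.
    CodeTag(String),
}

/// Picks the output form for a fenced block with language `lang`.
/// `syntect_supported` is a stand-in for an actual Syntect lookup (assumption).
/// `lang` is assumed to be already validated and escaped.
fn code_output(lang: &str, syntect_supported: bool) -> CodeOutput {
    if lang == "mermaid" {
        return CodeOutput::MermaidFigure;
    }
    let tag = if syntect_supported {
        // Highlighted on the server: flag it so the client skips it.
        format!(r#"<code class="language-{lang} syntect-highlight" data-highlighted="true">"#)
    } else {
        // Left for the frontend highlighter.
        format!(r#"<code class="language-{lang}">"#)
    };
    CodeOutput::CodeTag(tag)
}

fn main() {
    println!("{:?}", code_output("rust", true));
    println!("{:?}", code_output("brainfuck", false));
    println!("{:?}", code_output("mermaid", false));
}
```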

Processing Flow

flowchart TD
  A[Fenced code block] --> B[comrak parses code block]
  B --> C{Mermaid language}
  C -->|Yes| D[Rust renders Mermaid SVG]
  D --> E[Output mermaid-diagram figure]
  C -->|No| F{Syntect supported}
  F -->|Yes| G[Rust applies syntax highlight]
  G --> H[code with syntect and highlighted flag]
  F -->|No| I[code keeps language class]
  H --> J[Skip client rehighlight]
  I --> K[Client highlighter can process]
Frontend Integration Rule

Use selectors that exclude server-highlighted code blocks:

document
  .querySelectorAll(
    'pre code[class*="language-"]:not([data-highlighted="true"])',
  )
  .forEach((el) => Prism.highlightElement(el));

This prevents double-highlighting and keeps Mermaid processing isolated.


Getting Started

Rust Library

Add to your Cargo.toml:

[dependencies]
umd = { path = "./umd", version = "0.1" }

Basic Usage

use umd::parse;

fn main() {
    let input = "# Hello World\n\nThis is **bold** text.";
    let html = parse(input);
    println!("{}", html);
    // Output: <h1>Hello World</h1><p>This is <strong>bold</strong> text.</p>
}

With Frontmatter

use umd::parse_with_frontmatter;

fn main() {
    let input = r#"---
title: My Document
author: Jane Doe
---

# Content starts here"#;

    let result = parse_with_frontmatter(input);
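    // Print the raw frontmatter block, or an empty string when none is present.
    let frontmatter = result.frontmatter.as_ref().map(|fm| fm.content.as_str()).unwrap_or("");
    println!("Frontmatter: {}", frontmatter);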
    println!("HTML: {}", result.html);
}

WebAssembly (Browser)

Build WASM module:

./build.sh release
# Output: pkg/umd.js, pkg/umd_bg.wasm

Use in JavaScript:

import init, { parse_markdown } from "./pkg/umd.js";

async function main() {
  await init();
  const html = parse_markdown("# Hello from WASM");
  console.log(html);
}

main();

Syntax Examples

Media Auto-detection

![Video Demo](demo.mp4) → <video controls><source src="demo.mp4" type="video/mp4" />...</video>
![Background Music](bg.mp3) → <audio controls><source src="bg.mp3" type="audio/mpeg" />...</audio>
![Screenshot](screen.png) → <picture><source srcset="screen.png" type="image/png" /><img src="screen.png" alt="Screenshot" loading="lazy" /></picture>
![Download](file.pdf) → <a href="file.pdf" download>📄 file.pdf</a>
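
The extension-based dispatch behind these examples might look roughly like the sketch below. `MediaKind` and `media_kind` are hypothetical names. Only `mp4`, `mp3`, `png`, and `pdf` come from the examples above; the other extensions, and treating every unknown extension as a download, are assumptions.

```rust
#[derive(Debug, PartialEq)]
enum MediaKind {
    Video,
    Audio,
    Image,
    Download,
}

/// Classifies a media URL by file extension.
/// Hypothetical helper; the extension lists are partly assumed.
fn media_kind(url: &str) -> MediaKind {
    // Ignore any query string or fragment when reading the extension.
    let path = url.split(['?', '#']).next().unwrap_or(url);
    let ext = path
        .rsplit_once('.')
        .map(|(_, e)| e.to_ascii_lowercase())
        .unwrap_or_default();
    match ext.as_str() {
        "mp4" | "webm" => MediaKind::Video,
        "mp3" | "ogg" | "wav" => MediaKind::Audio,
        "png" | "jpg" | "jpeg" | "gif" | "webp" => MediaKind::Image,
        // Assumption: anything else becomes a download link.
        _ => MediaKind::Download,
    }
}

fn main() {
    for url in ["demo.mp4", "bg.mp3", "screen.png", "file.pdf"] {
        println!("{url} -> {:?}", media_kind(url));
    }
}
```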

Block Decorations

COLOR(red): Error message → <p class="text-danger">Error message</p>
SIZE(1.5): Larger text → <p class="fs-4">Larger text</p>
RIGHT: Right-aligned content → <p class="text-end">Right-aligned content</p>
CENTER: Centered paragraph → <p class="text-center">Centered paragraph</p>
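
As a rough sketch of how an alignment prefix could map to a Bootstrap class, consider the function below. `render_aligned` and `escape_html` are hypothetical helpers, not umd functions. The `RIGHT` and `CENTER` mappings follow the examples above, while `LEFT` mapping to `text-start` is an assumption; `COLOR` and `SIZE` are left out because their full mapping tables are not given here.

```rust
/// Minimal HTML text escaping for the example.
fn escape_html(s: &str) -> String {
    s.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}

/// Renders `PREFIX: text` as a paragraph with a Bootstrap alignment class.
/// Returns None when the line has no recognized alignment prefix.
fn render_aligned(line: &str) -> Option<String> {
    let (prefix, text) = line.split_once(':')?;
    let class = match prefix.trim() {
        "LEFT" => "text-start", // assumption: not shown in the examples above
        "CENTER" => "text-center",
        "RIGHT" => "text-end",
        _ => return None,
    };
    Some(format!("<p class=\"{class}\">{}</p>", escape_html(text.trim())))
}

fn main() {
    println!("{:?}", render_aligned("RIGHT: Right-aligned content"));
    println!("{:?}", render_aligned("Note: not a decoration"));
}
```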

Inline Semantic Elements

&badge(success){Active}; → <span class="badge bg-success">Active</span>
&ruby(reading){漢字}; → <ruby>漢字<rp>(</rp><rt>reading</rt><rp>)</rp></ruby>
&sup(superscript); → <sup>superscript</sup>
&time(2026-02-25){Today}; → <time datetime="2026-02-25">Today</time>

Inline Code Color Swatch

`#ffce44`
`rgb(255,0,0)`
`rgba(0,255,0,0.4)`
`hsl(100, 10%, 10%)`
`hsla(100, 24%, 40%, 0.5)`
<code
  >#ffce44<span
    class="inline-code-color"
    style="background-color: #ffce44;"
  ></span
></code>
<code
  >rgb(255,0,0)<span
    class="inline-code-color"
    style="background-color: rgb(255,0,0);"
  ></span
></code>
<code
  >rgba(0,255,0,0.4)<span
    class="inline-code-color"
    style="background-color: rgba(0,255,0,0.4);"
  ></span
></code>
<code
  >hsl(100, 10%, 10%)<span
    class="inline-code-color"
    style="background-color: hsl(100, 10%, 10%);"
  ></span
></code>
<code
  >hsla(100, 24%, 40%, 0.5)<span
    class="inline-code-color"
    style="background-color: hsla(100, 24%, 40%, 0.5);"
  ></span
></code>

Recommended CSS:

code .inline-code-color {
  display: inline-block;
  width: 0.75em;
  height: 0.75em;
  margin-left: 0.4em;
  border-radius: 0.2em;
  border: 1px solid var(--bs-border-color, rgba(0, 0, 0, 0.2));
  vertical-align: middle;
}
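
A minimal sketch of the swatch logic, assuming a deliberately strict notion of "color literal": `is_color_literal` and `render_inline_code` are hypothetical names, and the accepted forms (3/4/6/8-digit hex, plus `rgb`/`rgba`/`hsl`/`hsla` with only digits, dots, commas, percent signs, and spaces inside) are an assumption chosen so that nothing unsafe can reach the `style` attribute.

```rust
/// Returns true when `code` looks like a CSS color literal.
/// Hypothetical helper; the accepted grammar is an assumption.
fn is_color_literal(code: &str) -> bool {
    let s = code.trim();
    if let Some(hex) = s.strip_prefix('#') {
        return matches!(hex.len(), 3 | 4 | 6 | 8)
            && hex.chars().all(|c| c.is_ascii_hexdigit());
    }
    for prefix in ["rgb(", "rgba(", "hsl(", "hsla("] {
        if let Some(inner) = s.strip_prefix(prefix).and_then(|r| r.strip_suffix(')')) {
            // Only numeric argument characters are allowed inside the parentheses.
            return !inner.is_empty()
                && inner
                    .chars()
                    .all(|c| c.is_ascii_digit() || matches!(c, '.' | ',' | '%' | ' '));
        }
    }
    false
}

/// Renders inline code, appending a swatch span for color literals.
/// Non-color content would need HTML escaping in real use; omitted here.
fn render_inline_code(code: &str) -> String {
    if is_color_literal(code) {
        format!(
            r#"<code>{code}<span class="inline-code-color" style="background-color: {code};"></span></code>"#
        )
    } else {
        format!("<code>{code}</code>")
    }
}

fn main() {
    println!("{}", render_inline_code("#ffce44"));
    println!("{}", render_inline_code("rgb(255,0,0)"));
}
```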

Plugins

&highlight(yellow){Important text}; → <template class="umd-plugin umd-plugin-highlight">
<data value="0">yellow</data>
Important text
</template>

@detail(Click to expand){{Hidden}} → <template class="umd-plugin umd-plugin-detail">
<data value="0">Click to expand</data>
Hidden
</template>
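
The `<template>` output shape shown above can be reproduced with a short sketch. `render_plugin` and `escape_html` are hypothetical helpers, not the crate's API. Numbering additional arguments as `value="1"`, `value="2"`, and so on is an assumption (the examples only show index 0), as is escaping the content.

```rust
/// Minimal HTML escaping for text and attribute-free contexts.
fn escape_html(s: &str) -> String {
    s.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

/// Emits the plugin placeholder in the format shown in the examples above.
/// `name` is assumed to be a validated plugin identifier.
/// Argument indexing beyond 0 is an assumption.
fn render_plugin(name: &str, args: &[&str], content: &str) -> String {
    let mut out = format!("<template class=\"umd-plugin umd-plugin-{name}\">\n");
    for (i, arg) in args.iter().enumerate() {
        out.push_str(&format!("<data value=\"{i}\">{}</data>\n", escape_html(arg)));
    }
    out.push_str(&escape_html(content));
    out.push_str("\n</template>");
    out
}

fn main() {
    println!("{}", render_plugin("highlight", &["yellow"], "Important text"));
}
```

Because the result is an inert `<template>`, the consuming application decides how each plugin is rendered.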

Tables with Cell Spanning

UMD Table (with colspan/rowspan):

| Header1 |>      | Header3 |
| Cell1   | Cell2 | Cell3 |
|^        | Cell4 | Cell5 |

RIGHT:
| Left Cell | Right Cell |

CENTER:
| Centered Table |

Documentation

Publishing & Maintenance


Architecture Overview

Input Text
    ↓
[Frontmatter Extractor] ← Extract YAML/TOML metadata
    ↓
[Nested Blocks Preprocess] ← Normalize list-item nested blocks
    ↓
[Tasklist Preprocess]   ← Convert indeterminate markers
    ↓
[Underline Preprocess]  ← Protect Discord-style __text__
    ↓
[Conflict Resolver]     ← Protect UMD syntax with markers
    ↓
[HTML Sanitizer]        ← Escape user input, preserve entities
    ↓
[comrak Parser]         ← CommonMark + GFM AST generation
    ↓
[Underline Postprocess] ← Restore <u> tags
    ↓
[UMD Extensions]        ← Apply inline/block decorations, plugins, tables, media
    ↓
[Footnotes Extractor]   ← Split body HTML and footnotes section
    ↓
Output: HTML + Frontmatter + Footnotes
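
The protect, sanitize, restore ordering in this pipeline can be illustrated with a toy version containing three stages. Everything here is a sketch under stated assumptions: the function names are hypothetical, the private-use placeholder characters are made up for the example (the real conflict resolver's markers are not documented here), and the comrak stage is omitted.

```rust
// Placeholder markers from the Unicode private-use area (assumption).
const OPEN: &str = "\u{E000}";
const CLOSE: &str = "\u{E001}";

/// Preprocess: swap paired `__` delimiters for placeholders.
fn protect_underline(text: &str) -> String {
    let parts: Vec<&str> = text.split("__").collect();
    // Balanced delimiters produce an odd number of parts (at least 3).
    if parts.len() < 3 || parts.len() % 2 == 0 {
        return text.to_string();
    }
    let mut out = String::new();
    for (i, part) in parts.iter().enumerate() {
        if i > 0 {
            out.push_str(if i % 2 == 1 { OPEN } else { CLOSE });
        }
        out.push_str(part);
    }
    out
}

/// Sanitize: escape raw HTML from user input.
fn escape_html(text: &str) -> String {
    text.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}

/// Postprocess: turn placeholders into real `<u>` tags.
fn restore_underline(text: &str) -> String {
    text.replace(OPEN, "<u>").replace(CLOSE, "</u>")
}

/// Runs the stages in order, feeding each output into the next stage.
fn run_pipeline(input: &str) -> String {
    let stages: [fn(&str) -> String; 3] = [protect_underline, escape_html, restore_underline];
    stages
        .iter()
        .fold(input.to_string(), |text, stage| stage(&text))
}

fn main() {
    println!("{}", run_pipeline("__hi__ <b>"));
}
```

Running the restore step after sanitization is what lets generated `<u>` tags survive while user-supplied `<b>` is escaped.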

Key Components

  • src/lib.rs - Main entry point (parse(), parse_with_frontmatter())
  • src/parser.rs - CommonMark + GFM parsing (comrak wrapper)
  • src/sanitizer.rs - HTML escaping & XSS protection
  • src/frontmatter.rs - YAML/TOML metadata extraction
  • src/extensions/ - UMD syntax implementations
    • conflict_resolver.rs - Marker-based pre/post-processing
    • block_decorations.rs - COLOR, SIZE, alignment prefixes
    • inline_decorations.rs - Semantic element functions
    • plugins.rs - Plugin rendering system
    • table/ - Table parsing & decoration
    • media.rs - Media auto-detection

Test Coverage

284 tests passing ✅

196 unit tests (core modules)
 24 bootstrap integration tests (CSS class generation)
 18 commonmark compliance tests (specification adherence)
 13 conflict resolution tests (syntax collision handling)
  1 semantic integration test

Run tests:

cargo test --verbose              # All tests
cargo test --test bootstrap_integration  # Integration tests only

Performance

  • Small documents (1KB): < 1ms
  • Medium documents (10KB): < 10ms
  • Large documents (100KB): < 100ms

(Benchmarks on modern hardware)


Security Considerations

  • ✅ Input Sanitization: All user input HTML-escaped before parsing
  • ✅ Scheme Blocklist: Dangerous URL schemes blocked (javascript:, data:, etc.)
  • ✅ Plugin Safety: Plugins output to <template> for server-side processing (no direct HTML execution)
  • ⚠️ XSS Risk Mitigation: Recommend server-side validation of plugin content before rendering

Compatibility

  • Rust: 1.93.1+ (Edition 2024)
  • WASM: wasm32-unknown-unknown target
  • Node.js: Via WASM bindings
  • Browser: Chrome, Firefox, Safari, Edge (ES2020+)

Built With

  • comrak 0.50.0 - CommonMark + GFM parser
  • ammonia 4.1.2 - HTML sanitization
  • maud 0.27.0 - Type-safe HTML generation
  • regex 1.12.2 - Pattern matching
  • wasm-bindgen 0.2.108 - WASM integration

Contributing

Contributions welcome! Please:

  1. Read docs/architecture.md for system design
  2. Check PLAN.md for current priorities
  3. Write tests for new features
  4. Ensure all tests pass: cargo test --verbose
  5. Follow Rust conventions and document your changes

License

MIT License - see LICENSE for details

🎨 Crafted for Developers

This template is built with a focus on UI/UX excellence and modern developer experience. Maintaining it involves constant testing and updates to ensure everything works seamlessly.

If you appreciate the attention to detail in this project, a small sponsorship would go a long way in supporting my work across the Vue.js and Metaverse ecosystems.

GitHub Sponsors
