BBCode parser → HTML transformer

Ruby BBCode Parser

  • Work in progress
  • Never wrote a parser before so I'm figuring it out as I go
  • Live demo:
  • PEG Parser built with Parslet
  • Intended to reasonably emulate vBulletin's BBCode

Supported BBCode (so far)

For an intuitive inventory, view the cheatsheet on the live demo.

  • [b]bold[/b]
  • [i]italicize[/i]
  • [u]underline[/u]
  • [s]strikethrough[/s]
  • [quote]text[/quote]
  • [quote=Barack Obama]text with username reference[/quote]
  • [quote=Barack Obama;2342]text with username and post ID reference[/quote]
  • [left]left-align[/left]
  • [right]right-align[/right]
  • [center]center-align[/center]
  • [color=blue]colored text[/color]
  • [url][/url]
  • [url=]Click Me![/url]
  • [img][/img]
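
The [quote=...] forms above pack a username and an optional post ID into one attribute. As a rough sketch of how that attribute could be split apart (parse_quote_ref and QuoteRef are hypothetical names, not taken from this project's source, and the rule that the post ID is the all-digit part after the last ";" is an assumption):

```ruby
# Hypothetical helper, not part of lib/: splits the text after "quote="
# into a username and an optional numeric post ID.
QuoteRef = Struct.new(:username, :post_id)

def parse_quote_ref(attr)
  # rpartition splits on the LAST ";" so a username containing ";" survives.
  name, separator, id = attr.rpartition(";")
  if separator.empty? || id !~ /\A\d+\z/
    # No ";" at all, or the trailing part is not a number: treat it all as the name.
    QuoteRef.new(attr.strip, nil)
  else
    QuoteRef.new(name.strip, Integer(id, 10))
  end
end

ref = parse_quote_ref("Barack Obama;2342")
puts "#{ref.username} / #{ref.post_id.inspect}"
```

With only a username, as in [quote=Barack Obama], post_id comes back as nil.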

How it works

  • The parser (lib/parser.rb), where the grammar rules are defined, reads a string of user text and outputs a syntax tree.

    For example, [b]hello[/b]goodbye produces a tree like this:

      open: "b"
      inner: [
        text: "hello"
      ]
      close: "b"
      text: "goodbye"
  • The transformer (lib/transformer.rb) applies transformation rules to the tree, starting at its leaves and collapsing subtrees into strings of HTML.

    For example, if this is the tree sent to the transformer:

      open: "b"
      inner: [
        text: "hello"
      ]
      close: "b"

    …first, inner: [{ text: "hello" }] is collapsed into inner: ["hello"] by the rule that turns text nodes into strings:

      open: "b"
      inner: ["hello"]
      close: "b"

    …which now matches another transformation rule pattern: open:string, inner:sequence, close:string.

  • Whenever the transformer finds an open/inner/close subtree, it instantiates a Tag (lib/tag.rb) object, which holds the actual BBCode-to-HTML transformation.

    open: "b"
    inner: ["hello"]
    close: "b"

    …instantiates a Tag like this: Tag.new("b").wrap(inner.join)

    …which outputs <strong>hello</strong>

  • It keeps collapsing all the nodes into strings until it has one final string to return.
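
The bottom-up collapse described above can be imitated in plain Ruby without Parslet. This is only a sketch of the idea, not the project's code: the Tag class here is a hypothetical stand-in for lib/tag.rb, collapse is an invented name, and the real transformer uses pattern-matching rules rather than a recursive method. Only the b → strong and i → em mappings come from this README; the others are guesses.

```ruby
# Hypothetical stand-in for lib/tag.rb.
class Tag
  # b and i are taken from the README's examples; u and s are assumptions.
  ELEMENTS = { "b" => "strong", "i" => "em", "u" => "u", "s" => "s" }.freeze

  def initialize(name)
    @name = name
  end

  # Wraps already-collapsed content in the matching HTML element.
  def wrap(content)
    element = ELEMENTS[@name]
    # Unknown tags are echoed back as BBCode rather than guessed at.
    return "[#{@name}]#{content}[/#{@name}]" if element.nil?

    "<#{element}>#{content}</#{element}>"
  end
end

# Collapses a node into a string, handling children before parents.
def collapse(node)
  case node
  when String then node
  when Array  then node.map { |child| collapse(child) }.join
  when Hash
    if node.key?(:text)
      node[:text]                                       # text node -> string
    else
      Tag.new(node[:open]).wrap(collapse(node[:inner])) # open/inner/close -> Tag
    end
  else
    raise ArgumentError, "unexpected node: #{node.inspect}"
  end
end

tree = [
  { open: "b", inner: [{ text: "hello" }], close: "b" },
  { text: "goodbye" }
]
puts collapse(tree) # prints "<strong>hello</strong>goodbye"
```

The recursion reaches the text leaves first, so by the time a Tag is built its inner content is already a plain string, which mirrors the order of the steps above.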

TODO/Caveats/Stuff I can't figure out yet

  • Sanitize user input so that raw HTML and scripts in a post can't end up in the output.
  • Limit nesting depth so users can't go too crazy with deeply nested tags.
  • Handle mismatched or overlapped tags. For instance,

    [b]abc [i]def[/b] ghi[/i]

    Should become:

    <strong>abc <em>def</em></strong> <em>ghi</em>

    But at the moment the parser produces a tree like:

    open: "b"
      text: "abc"
      open: "i"
          text: "def"
      text: "ghi"
      close: "b"
    close: "i"
  • The parser is too simplistic and not robust enough.

    This works:

    [b]no close tag

    …it becomes a 'text' node, since text only fails when lookahead finds a [/closing tag].

    But that also means this fails:

    [/b]no open tag

    Instead, orphan tags that don't form blocks should just be output as text.

  • Write tests. :'(
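
One possible direction for the orphan-tag and overlapped-tag items above is a two-pass, stack-based renderer. This is a sketch only and does not use Parslet or any code from this project: render, pair_tags, HTML_FOR, TAG and ESCAPES are all hypothetical names, and only [b] and [i] are handled. The first pass pairs each close tag with the nearest unmatched open tag of the same name; anything left unpaired is output as text. The second pass closes and reopens inner tags when they overlap.

```ruby
HTML_FOR = { "b" => "strong", "i" => "em" }.freeze # assumed subset of tags
TAG      = /\A\[(\/?)([a-z]+)\]\z/
ESCAPES  = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

# Pass 1: mark the token indexes of open/close tags that form a pair.
def pair_tags(tokens)
  open_positions = Hash.new { |hash, name| hash[name] = [] }
  paired = {}
  tokens.each_with_index do |token, index|
    match = TAG.match(token)
    next if match.nil? || !HTML_FOR.key?(match[2])

    if match[1].empty?
      open_positions[match[2]].push(index)
    elsif (opener = open_positions[match[2]].pop)
      paired[opener] = paired[index] = true
    end
  end
  paired
end

# Pass 2: emit HTML, treating unpaired tags and everything else as escaped text.
def render(bbcode)
  tokens = bbcode.split(/(\[\/?[a-z]+\])/) # capture group keeps the tags
  paired = pair_tags(tokens)
  out = +""
  stack = [] # names of currently open tags, outermost first
  tokens.each_with_index do |token, index|
    unless paired[index]
      out << token.gsub(/[&<>"]/, ESCAPES)
      next
    end
    match = TAG.match(token)
    name = match[2]
    if match[1].empty?
      stack.push(name)
      out << "<#{HTML_FOR[name]}>"
    else
      # Close tags opened after `name`, close `name`, then reopen the others.
      reopen = []
      while (top = stack.pop) != name
        out << "</#{HTML_FOR[top]}>"
        reopen.unshift(top)
      end
      out << "</#{HTML_FOR[name]}>"
      reopen.each do |tag|
        stack.push(tag)
        out << "<#{HTML_FOR[tag]}>"
      end
    end
  end
  out
end

puts render("[b]abc [i]def[/b] ghi[/i]")
# prints "<strong>abc <em>def</em></strong><em> ghi</em>"
puts render("[/b]no open tag") # prints "[/b]no open tag"
```

Note that the space before "ghi" lands inside the reopened <em> rather than between the two elements, so the output differs slightly from the target shown above. The sketch also does nothing about nesting limits.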