The tiniest little HTML/XML parser...using regex
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Type Name Latest commit message Commit time
Failed to load latest commit information.



The tiniest little HTML/XML parser…using regex

gem install taggie --pre

WTF, why regex?!?

Curiosity, regex practice, and to understand the limitations of regular expressions

Examples (these all may not work yet - work in progress)

html = '<div id="header"><img src="logo.png" /><h1>Your Company</h1></div><div id="body"><p class="content">some <span>content</span> here</p></div>'.to_taggie
puts html.type                                # div
puts html.tag                                 # <div id="header">
puts html.inner_html                          # <img src="logo.png" /><h1>Your Company</h1>

puts html.children.first.src                  # logo.png
html.children.first.src = '/images/logo.png'
puts html.inner_html                          # <img src="/images/logo.png" /><h1>Your Company</h1>

p = html.siblings.first.children.first
puts p.tag                                    # <p class="content"> = 'content'
puts html.siblings.first.children.first       # <p class="content" id="content">Blah blah blah</p>

p.class = nil
puts html.siblings.first.children.first       # <p id="content">Blah blah blah</p>

p.class = ''
puts html.siblings.first.children.first       # <p id="content" class="">Blah blah blah</p>


  • siblings

  • tests

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don't break it in a future version unintentionally.

  • Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but

    bump version in a commit by itself I can ignore when I pull)
  • Send me a pull request. Bonus points for topic branches.