GitHub - gpestana/htmlizer: Parses only human readable content from HTML DOM.

htmlizer

Extracts only the human-readable content from an HTML DOM.

Example

package main

import (
  "fmt"

  "github.com/gpestana/htmlizer"
)

func main() {
  html := `
    <html>
     <body>
       <h1>Heading H1</h1>
       <p>This is the first text</p>
       <h2>heading h2</h2>
       <p>This is the second text</p>
     </body>
     <script>console.log("scripts are discarded")</script>
   </html>`

  // strip all tab characters from the extracted text
  ignore := []rune{'\t'}
  hizer := htmlizer.New(ignore)
  hizer.Load(html)

  fmt.Println(">> Struct:")
  fmt.Println(hizer)

  fmt.Println(">> Human readable content:")
  fmt.Println(hizer.HumanReadable())
}

Output:

>> Struct:
{[Heading H1 heading h2], [This is the first text This is the second text]}
>> Human readable content:
Heading H1
This is the first text
heading h2
This is the second text
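
The `ignore` slice in the example lists runes to strip from the extracted text. As a rough illustration of that idea only, the sketch below removes a set of runes from a string with the standard library's `strings.Map`. The helper name `stripRunes` is hypothetical and is not part of the htmlizer API; the library may implement this step differently.

```go
package main

import (
	"fmt"
	"strings"
)

// stripRunes returns s with every rune listed in ignore removed.
// Hypothetical helper for illustration; not part of the htmlizer API.
func stripRunes(s string, ignore []rune) string {
	// Build a set so each rune lookup is constant time.
	skip := make(map[rune]struct{}, len(ignore))
	for _, r := range ignore {
		skip[r] = struct{}{}
	}
	return strings.Map(func(r rune) rune {
		if _, ok := skip[r]; ok {
			return -1 // a negative value tells strings.Map to drop the rune
		}
		return r
	}, s)
}

func main() {
	ignore := []rune{'\t'}
	// Prints "Heading H1" (quoted, via %q) with the tabs removed.
	fmt.Printf("%q\n", stripRunes("\t\tHeading H1\t", ignore))
}
```

Using a set keeps the filter cheap when the ignore list grows, for example when newlines and carriage returns are added alongside tabs.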

Contribute

Fork the repository and open a pull request. Use issues for bug reports, feature requests, and general comments.

gpestana © MIT

