Extract is an HTML extractor. Its extraction rules are based on wedata.
- items.json originally comes from http://wedata.net/databases/LDRFullFeed/items.json.
- Currently, Extract only works for URLs that are registered in wedata.
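To illustrate why only URLs known to wedata can be handled, here is a minimal sketch of rule matching. It is not the code of this package. It assumes that each entry of items.json carries a `data` object with a `url` field (a regular expression) and an `xpath` field (the content selector). The names `Item`, `Rule`, `loadItems`, `matchRule` and the sample entries are hypothetical and exist only for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"regexp"
)

// Rule mirrors the fields assumed to live under "data" in an items.json entry.
type Rule struct {
	URL   string `json:"url"`   // regular expression matched against the page URL
	XPath string `json:"xpath"` // XPath expression selecting the main content
}

// Item is one entry of the assumed items.json layout.
type Item struct {
	Name string `json:"name"`
	Data Rule   `json:"data"`
}

// sampleItems is made-up data for illustration, not copied from wedata.
const sampleItems = `[
  {"name": "Example blog", "data": {"url": "^https?://blog\\.example\\.com/", "xpath": "//div[@class=\"entry\"]"}},
  {"name": "Example news", "data": {"url": "^https?://news\\.example\\.com/", "xpath": "//article"}}
]`

// loadItems decodes a JSON array of items.
func loadItems(raw string) ([]Item, error) {
	var items []Item
	if err := json.Unmarshal([]byte(raw), &items); err != nil {
		return nil, err
	}
	return items, nil
}

// matchRule returns the first item whose URL pattern matches rawurl, or nil.
func matchRule(items []Item, rawurl string) *Item {
	for i := range items {
		re, err := regexp.Compile(items[i].Data.URL)
		if err != nil {
			continue // skip patterns that Go's regexp package cannot compile
		}
		if re.MatchString(rawurl) {
			return &items[i]
		}
	}
	return nil
}

func main() {
	items, err := loadItems(sampleItems)
	if err != nil {
		log.Fatalf("load failed: %s", err)
	}
	for _, u := range []string{"http://news.example.com/story/1", "http://example.com"} {
		if it := matchRule(items, u); it != nil {
			fmt.Printf("%s -> %s (%s)\n", u, it.Name, it.Data.XPath)
		} else {
			fmt.Printf("%s -> no rule\n", u)
		}
	}
}
```

In this sketch a URL with no matching pattern simply yields no rule, which corresponds to the limitation described in the list above.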
Example usage, from _example:
package main

import (
	"flag"
	"fmt"
	"log"
	"os"

	"github.com/suzuken/extract"
)

func main() {
	var (
		rawurl = flag.String("url", "http://example.com", "url for extract")
	)
	flag.Parse()

	ex := extract.New()

	// Stop early when no rule matches the given URL.
	if rule := ex.Match(*rawurl); rule == nil {
		log.Printf("%s doesn't match any rule", *rawurl)
		os.Exit(0)
	}

	// Fetch the page and extract its content.
	c, err := ex.ExtractURL(*rawurl)
	if err != nil {
		log.Fatalf("extract failed: %s", err)
	}
	fmt.Printf("content: %v\n", c)
}
License: MIT
All data in wedata is in the public domain. See also: http://wedata.net/help/about
- Thanks to the Wedata project and its members.
Kenta Suzuki (a.k.a. suzuken)