Antch, a fast, powerful and extensible web crawling & scraping framework for Go


Antch


Antch is inspired by Scrapy. If you're familiar with Scrapy, you can get started quickly.

Antch is a fast, powerful and extensible web crawling & scraping framework for Go, used to crawl websites and extract structured data from their pages.

Get Started

Getting Started

Follow the Getting Started instructions to start your first spider.

Features

  • Polite, highly concurrent web crawler.
  • Powerful and customizable HTTP middleware.
  • Item data pipeline for the web spider.
  • Built-in proxy support (HTTP, HTTPS, SOCKS5).
  • Built-in XPath query support for HTML/XML documents.
  • Easy to use and integrate with your project.

Examples

BingWallpaper - Bing daily wallpaper.

Documentation

See https://github.com/antchfx/antch/wiki
