HTML query helpers for Go
  • utility functions like int, float, regexExtract
  • http
    • client with get/post
    • proxy rotator
    • cookies, headers, user agent (a task's Cookies, Headers, and UserAgent override the client settings)

    results is a chan; it is returned from Start()
    task is {url, headers, cookies, proxy, throttler}
    httpclient has headers, cookies, proxy, throttler
    modify httpclient temporarily via a task, or globally via the client
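A minimal sketch of the "results is returned from Start()" idea, to show how a results channel could be handed back and drained. The signature of Start, the fetch callback, and the trimmed-down Task are assumptions made for illustration; the notes only say that Start() returns the channel.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is reduced to a URL here; the other fields are left out of this sketch.
type Task struct{ URL string }

// Result wraps whatever a task produced.
type Result struct{ Value interface{} }

// Start runs fetch for every task on its own goroutine and returns the
// results channel. The channel is closed once all tasks have reported.
// Passing tasks and fetch as arguments is an assumption of this sketch.
func Start(tasks []Task, fetch func(Task) Result) <-chan Result {
	results := make(chan Result)
	var wg sync.WaitGroup
	for _, t := range tasks {
		wg.Add(1)
		go func(t Task) {
			defer wg.Done()
			results <- fetch(t)
		}(t)
	}
	go func() {
		wg.Wait()
		close(results)
	}()
	return results
}

func main() {
	tasks := []Task{{URL: "https://example.com/a"}, {URL: "https://example.com/b"}}
	// Stand-in fetch so the sketch runs without network access.
	fetch := func(t Task) Result { return Result{Value: len(t.URL)} }
	for r := range Start(tasks, fetch) {
		fmt.Println(r.Value)
	}
}
```

Closing the channel from a separate goroutine lets the caller use a plain range loop without knowing how many tasks were started. Results arrive in completion order, not task order.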

Client{Headers, UserAgent, Proxy{Check, List}, Cache, RetryCount, Throttle{n, time}}
Document{URL, Node, Tasks, Results}
Task{Method, URL, Headers, Cookies, Cache, Proxy…} // nil means default to what Client does, soup.NoCache ?
Result{Value interface{}}
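One way the shapes above might look as Go types, with the "nil means default" rule written out. All field types are assumptions, since the notes give only field names. NoCache stands in for the soup.NoCache idea, and useCache and headersFor are hypothetical helpers. Document is left out because the type of Node is not specified, and the Task fields elided after Proxy are not filled in.

```go
package main

import (
	"fmt"
	"time"
)

// Throttle allows N requests per Per (assumed meaning of Throttle{n, time}).
type Throttle struct {
	N   int
	Per time.Duration
}

// Proxy holds candidate proxy addresses and a health check.
type Proxy struct {
	Check func(addr string) bool
	List  []string
}

// Client carries the global defaults.
type Client struct {
	Headers    map[string]string
	UserAgent  string
	Proxy      Proxy
	Cache      bool
	RetryCount int
	Throttle   Throttle
}

// Task describes one request. Nil fields fall back to the Client.
type Task struct {
	Method  string
	URL     string
	Headers map[string]string // nil: use Client.Headers unchanged
	Cookies map[string]string
	Cache   *bool // nil: use Client.Cache
}

// Result wraps one extracted value.
type Result struct{ Value interface{} }

// NoCache is a hypothetical sentinel: a non-nil pointer to false.
var NoCache = new(bool)

// useCache resolves the cache setting for one task.
func (c *Client) useCache(t Task) bool {
	if t.Cache == nil {
		return c.Cache
	}
	return *t.Cache
}

// headersFor layers task headers over client headers without
// modifying the client, so the override lasts for one task only.
func (c *Client) headersFor(t Task) map[string]string {
	out := make(map[string]string, len(c.Headers)+len(t.Headers))
	for k, v := range c.Headers {
		out[k] = v
	}
	for k, v := range t.Headers {
		out[k] = v
	}
	return out
}

func main() {
	c := &Client{
		Headers:  map[string]string{"Accept": "text/html"},
		Cache:    true,
		Throttle: Throttle{N: 2, Per: time.Second},
	}
	plain := Task{Method: "GET", URL: "https://example.com/"}
	fresh := Task{
		Method:  "GET",
		URL:     "https://example.com/feed",
		Cache:   NoCache,
		Headers: map[string]string{"Accept": "application/xml"},
	}
	fmt.Println(c.useCache(plain), c.headersFor(plain)["Accept"]) // true text/html
	fmt.Println(c.useCache(fresh), c.headersFor(fresh)["Accept"]) // false application/xml
}
```

A pointer for Cache is used so that "not set" and "explicitly off" stay distinguishable; a plain bool could not express both.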

Client.Get(url, func(d soup.Document) error) ([]Result, error)
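A rough sketch of how the Get signature above could behave, using only the standard library. Several things here are assumptions: Document carries the raw Body as a string instead of a parsed Node, the callback reports values through a hypothetical Add method (the callback receives Document by value, so some shared storage is needed), and demo serves a page from a local test server so nothing leaves the machine.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Result wraps one extracted value.
type Result struct{ Value interface{} }

// Document is what the callback receives.
type Document struct {
	URL     string
	Body    string    // stand-in for the parsed Node
	results *[]Result // shared with Get so Add survives pass-by-value
}

// Add records one value as a Result (hypothetical helper).
func (d Document) Add(v interface{}) {
	*d.results = append(*d.results, Result{Value: v})
}

// Client wraps a standard http.Client.
type Client struct{ HTTP *http.Client }

// Get fetches url, hands the document to fn, and returns what fn collected.
func (c *Client) Get(url string, fn func(d Document) error) ([]Result, error) {
	resp, err := c.HTTP.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %s: %s", url, resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	var results []Result
	d := Document{URL: url, Body: string(body), results: &results}
	if err := fn(d); err != nil {
		return nil, err
	}
	return results, nil
}

// demo serves one page from a local test server and extracts its title.
func demo() ([]Result, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "<html><title>hello</title></html>")
	}))
	defer srv.Close()

	c := &Client{HTTP: srv.Client()}
	return c.Get(srv.URL, func(d Document) error {
		start := strings.Index(d.Body, "<title>")
		end := strings.Index(d.Body, "</title>")
		if start < 0 || end < start {
			return fmt.Errorf("no title in %s", d.URL)
		}
		d.Add(d.Body[start+len("<title>") : end])
		return nil
	})
}

func main() {
	res, err := demo()
	fmt.Println(res, err)
}
```

The string search for the title is only there to keep the sketch dependency-free; a real Document would query a parsed node tree instead.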

for starters it would be nice to have a proxy-rotating client that caches to the filesystem
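A possible starting point for that, sketched with the standard library only. Rotator, FSCache, and demoCache are hypothetical names, round-robin is just one rotation policy, and the proxy addresses are placeholders. The proxy health check (Proxy.Check) and cache expiry are not covered.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/url"
	"os"
	"path/filepath"
	"sync/atomic"
)

// Rotator hands out proxies round-robin and is safe for concurrent use.
type Rotator struct {
	proxies []*url.URL
	next    uint64
}

// NewRotator parses the given proxy addresses.
func NewRotator(addrs ...string) (*Rotator, error) {
	r := &Rotator{}
	for _, a := range addrs {
		u, err := url.Parse(a)
		if err != nil {
			return nil, err
		}
		r.proxies = append(r.proxies, u)
	}
	if len(r.proxies) == 0 {
		return nil, fmt.Errorf("no proxies given")
	}
	return r, nil
}

// Proxy has the signature expected by http.Transport.Proxy.
func (r *Rotator) Proxy(*http.Request) (*url.URL, error) {
	i := atomic.AddUint64(&r.next, 1) - 1
	return r.proxies[i%uint64(len(r.proxies))], nil
}

// FSCache stores bodies as files named by the SHA-256 of the URL.
type FSCache struct{ Dir string }

func (c FSCache) path(rawURL string) string {
	sum := sha256.Sum256([]byte(rawURL))
	return filepath.Join(c.Dir, hex.EncodeToString(sum[:]))
}

// Get returns the cached body, if any.
func (c FSCache) Get(rawURL string) ([]byte, bool) {
	b, err := os.ReadFile(c.path(rawURL))
	return b, err == nil
}

// Put writes the body to the cache directory, creating it if needed.
func (c FSCache) Put(rawURL string, body []byte) error {
	if err := os.MkdirAll(c.Dir, 0o755); err != nil {
		return err
	}
	return os.WriteFile(c.path(rawURL), body, 0o644)
}

// demoCache round-trips one body through a temporary cache directory.
func demoCache() (string, error) {
	dir, err := os.MkdirTemp("", "soup-cache")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	cache := FSCache{Dir: dir}
	if err := cache.Put("https://example.com/", []byte("<html></html>")); err != nil {
		return "", err
	}
	body, ok := cache.Get("https://example.com/")
	if !ok {
		return "", fmt.Errorf("cache miss after Put")
	}
	return string(body), nil
}

func main() {
	rot, err := NewRotator("http://10.0.0.1:8080", "http://10.0.0.2:8080")
	if err != nil {
		panic(err)
	}
	// The transport asks rot.Proxy which proxy to use for outgoing requests.
	client := &http.Client{Transport: &http.Transport{Proxy: rot.Proxy}}
	_ = client // no request is sent here; the addresses above are placeholders

	body, err := demoCache()
	fmt.Println(body, err)
}
```

Hashing the URL for the file name sidesteps characters that are not valid in paths. A fetch function would check FSCache.Get first and only go through the rotating client on a miss.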

