Colly v2 #5

Open
atlet opened this issue Jul 1, 2022 · 1 comment

atlet commented Jul 1, 2022

I tried a simple example using colly v2, but I get the following error:

cannot use storage (variable of type *redisstorage.Storage) as storage.Storage value in argument to c.SetStorage: wrong type for method Cookies (have func(u *net/url.URL) string, want func(u *net/url.URL) string)
compiler [InvalidIfaceAssign](https://pkg.go.dev/golang.org/x/tools/internal/typesinternal?utm_source%3Dgopls#InvalidIfaceAssign)

Example

package main

import (
	"github.com/gocolly/colly/v2"
	"github.com/gocolly/redisstorage"
)

func main() {
	c := colly.NewCollector()

	// Redis-backed storage for the collector
	storage := &redisstorage.Storage{
		Address:  "127.0.0.1:6379",
		Password: "",
		DB:       0,
		Prefix:   "job01",
	}

	// The error above is reported on this call
	err := c.SetStorage(storage)
	if err != nil {
		panic(err)
	}
}

Any suggestions on how to fix this issue?

satvik007 commented Aug 6, 2022

This worked perfectly fine for me with Go 1.18.
Could you share some repro code, for example a GitHub repo?

package main

import (
  "fmt"

  "github.com/gocolly/colly/v2"
  "github.com/gocolly/redisstorage"
)

func main() {
  c := colly.NewCollector()

  storage := &redisstorage.Storage{
    Address: "localhost:6379",
    DB:      0,
    Prefix:  "job01",
  }

  err := c.SetStorage(storage)
  if err != nil {
    panic(err)
  }

  // Call the callback on every <a> element that has an href attribute
  c.OnHTML("a[href]", func(e *colly.HTMLElement) {
    link := e.Attr("href")
    // Print link
    fmt.Printf("Link found: %q -> %s\n", e.Text, link)
    // Visit link found on page
    // AllowedDomains is not set here, so links on any domain are followed
    c.Visit(e.Request.AbsoluteURL(link))
  })

  // Before making a request print "Visiting ..."
  c.OnRequest(func(r *colly.Request) {
    fmt.Println("Visiting", r.URL.String())
  })

  // Start scraping on https://hackerspaces.org
  c.Visit("https://hackerspaces.org/")
}
