Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tour: Add tests for web crawler task #14480

Open
amorgun opened this issue Feb 23, 2016 · 5 comments

Comments

@amorgun
Copy link

commented Feb 23, 2016

Context: https://tour.golang.org/concurrency/10

I think web crawler task is harder than it looks. I saw a lot of solutions and all of them are wrong, even the official one.
In order to prove that I have written some tests and script to run them on multiple solutions. You can see results here.
I suggest adding these tests to the task the same way as in the word count task.

@bradfitz bradfitz added this to the Unreleased milestone Feb 23, 2016

@adg

This comment has been minimized.

Copy link
Contributor

commented Feb 26, 2016

We should probably rework the example to be something that is more achievable in the context of a web browser.

We should also fix the official solution if it is broken.

@adg

This comment has been minimized.

Copy link
Contributor

commented Feb 26, 2016

@amorgun have you tried running your solution against Go 1.6? On my machine your test cases trigger a concurrent map write panic:

My solution: OK

Test case #3
found: a "node a"
found: b "node b"
found: c "node c"
My solution: OK

Test case #4
found: 0 "node #0"
found: 3 "node #3"
found: 1 "node #1 with very slow fetching"
found: 2 "node #2"
found: 4 "node #4"
found: 6 "node #6"
found: 5 "node #5"
My solution: OK

Test case #5
found: 0 "node #0"
found: 3 "node #3"
found: 1 "node #1"
found: 4 "node #4"
found: 5 "node #5"
found: 2 "node #2"
fatal error: concurrent map read and map writefound: 13 "node #13"

Your test harness has a bug:

type warningFetcher struct {
        fetcher        crawler.Fetcher
        alreadyFetched map[string]bool
        lock           sync.Mutex
}

func (f warningFetcher) Fetch(url string) (string, []string, error) {
        f.lock.Lock()
        defer f.lock.Unlock()
        if f.alreadyFetched[url] {
                fmt.Printf("WARNING: Url %s has been fetched multiple times\n", url)
        }
...

The Fetch method is declared on warningFetcher which has an embedded sync.Mutex. So when you're passing around a warningFetcher you are copying the mutex (defeating its purpose).

The fix is to declare the Fetch method on *warningFetcher and pass the warningFetcher around by pointer, not by value.

-func (f warningFetcher) Fetch(url string) (string, []string, error) {
+func (f *warningFetcher) Fetch(url string) (string, []string, error) {
        f.lock.Lock()
        defer f.lock.Unlock()
        if f.alreadyFetched[url] {
@@ -40,5 +41,5 @@ func (f warningFetcher) Fetch(url string) (string, []string, error) {
 }

 func newWarningFetcher(fetcher crawler.Fetcher) crawler.Fetcher {
-       return warningFetcher{fetcher: fetcher, alreadyFetched: make(map[string]bool)}
+       return &warningFetcher{fetcher: fetcher, alreadyFetched: make(map[string]bool)}
@amorgun

This comment has been minimized.

Copy link
Author

commented Feb 26, 2016

@adg Thank you for pointing out the bug. I have fixed it as you suggested. I also have enabled build against Go 1.6 in travis so at least it works there. Let me know if you still have problems running it.

@adg adg removed their assignment Feb 14, 2018

@adg adg added the help wanted label Feb 14, 2018

@Windsooon

This comment has been minimized.

Copy link

commented May 23, 2019

I will try to work on it from tomorrow.

@Windsooon

This comment has been minimized.

Copy link

commented May 24, 2019

I'm not sure we should add more tests for the example (since all the examples don't have tests). But I do think we should update the example to make it correct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.