Cancellation is not working properly and integration test is wrong #240

Open
kpiekara opened this issue Sep 11, 2023 · 1 comment
I believe this is the same issue as #206, which was closed on the grounds that the "integration unittest is passing".

This unit test passes:

[Test]
public async Task Crawl_Synchronous_CancellationTokenCancelled_StopsCrawl()
{
    var cancellationTokenSource = new CancellationTokenSource();
    var timer = new System.Timers.Timer(800);
    timer.Elapsed += (o, e) =>
    {
        cancellationTokenSource.Cancel();
        timer.Stop();
        timer.Dispose();
    };
    timer.Start();

    var crawler = new PoliteWebCrawler();
    var result = await crawler.CrawlAsync(new Uri("https://github.com/"), cancellationTokenSource);

    Assert.IsTrue(result.ErrorOccurred);
    Assert.IsTrue(result.ErrorException is OperationCanceledException);
}

But if we increase the timer from 800 ms to 3 s, so that the crawler has actually started working before it is cancelled:

[Test]
public async Task Crawl_Synchronous_CancellationTokenCancelled_StopsCrawl()
{
    var cancellationTokenSource = new CancellationTokenSource();
    var timer = new System.Timers.Timer(3000);
    timer.Elapsed += (o, e) =>
    {
        cancellationTokenSource.Cancel();
        timer.Stop();
        timer.Dispose();
    };
    timer.Start();

    var crawler = new PoliteWebCrawler();
    var result = await crawler.CrawlAsync(new Uri("https://github.com/"), cancellationTokenSource);

    Assert.IsTrue(result.ErrorOccurred);
    Assert.IsTrue(result.ErrorException is OperationCanceledException);
}

We get a failure that crashes the application with an unhandled exception:

Exit code is -532462766 (Output is too long. Showing the last 100 lines:

   at System.Threading.CancellationToken.ThrowIfCancellationRequested()
   at Abot2.Crawler.WebCrawler.ThrowIfCancellationRequested()
   at Abot2.Crawler.WebCrawler.ProcessPage(PageToCrawl pageToCrawl)
   at Abot2.Crawler.WebCrawler.<CrawlSite>b__64_0()
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
   at System.Threading.QueueUserWorkItemCallback.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()

Issue: there is no way to cancel the crawler.
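
The `Task.ThrowAsync` frame in the trace is, as far as I know, what .NET uses to rethrow an exception that escaped an `async void` method onto the thread pool. That would be consistent with the page-processing lambda being scheduled as an `Action`, though I have not confirmed this against the Abot2 source. A minimal sketch of the difference, using hypothetical names (`CancellationDemo`, `ProcessPageAsync`, `RunObservedAsync`) that are not part of Abot2:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // Hypothetical stand-in for the crawler's per-page work (not Abot2 code).
    private static async Task ProcessPageAsync(CancellationToken token)
    {
        await Task.Yield();                    // force an asynchronous hop
        token.ThrowIfCancellationRequested();  // throws OperationCanceledException if cancelled
    }

    // Scheduling the work as Func<Task> keeps the exception inside the returned Task,
    // so the caller can await it and catch the cancellation.
    public static async Task<bool> RunObservedAsync(CancellationToken token)
    {
        Func<Task> work = async () => await ProcessPageAsync(token);
        try
        {
            await work();
            return false; // completed without cancellation
        }
        catch (OperationCanceledException)
        {
            return true;  // cancellation was observed by the caller
        }
    }

    public static async Task Main()
    {
        using var cts = new CancellationTokenSource();
        cts.Cancel();

        bool caught = await RunObservedAsync(cts.Token);
        Console.WriteLine($"Cancellation observed by caller: {caught}");

        // By contrast, assigning the same lambda to Action makes it async void:
        //   Action work = async () => await ProcessPageAsync(cts.Token);
        //   work(); // nothing to await; the exception is rethrown on the thread pool
        // An exception rethrown that way cannot be caught by the caller and
        // terminates the process, which matches the crash reported here.
    }
}
```

If that assumption holds, a caller-side `try`/`catch` around `CrawlAsync` cannot help, because the exception never reaches the awaited task.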

ynnob commented Nov 1, 2023

Same here. The entire website crashes when the crawler gets canceled.
