Obeying the robots.txt file #1989

@TechnologyClassroom

Description

I recently found this project while reading server logs. Someone is scraping one of the sites I help administer, apparently using AHC/2.1, and they are not obeying the robots.txt file. There should be several seconds of delay between requests, but the client appears to be making about 1 request per second. Is this normal behavior for AHC, or is this a user misconfiguration of some kind? If it is normal, could support for robots.txt Crawl-delay values be added by default?
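For context, here is a minimal sketch of what caller-side Crawl-delay handling could look like, assuming the HTTP client itself does not read robots.txt. The `parseCrawlDelay` helper below is hypothetical, not part of the AHC API; a crawler would fetch the site's robots.txt once, parse the directive, and sleep that long between requests:

```java
import java.time.Duration;

public class CrawlDelay {

    // Parse the first Crawl-delay directive (in seconds) from robots.txt text.
    // Returns the fallback when no valid directive is present.
    static Duration parseCrawlDelay(String robotsTxt, Duration fallback) {
        for (String line : robotsTxt.split("\\R")) {
            String[] parts = line.split(":", 2);
            if (parts.length == 2
                    && parts[0].trim().equalsIgnoreCase("crawl-delay")) {
                try {
                    double seconds = Double.parseDouble(parts[1].trim());
                    return Duration.ofMillis((long) (seconds * 1000));
                } catch (NumberFormatException ignored) {
                    // Malformed value: fall through and keep scanning.
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) throws InterruptedException {
        String robots = "User-agent: *\nCrawl-delay: 5\nDisallow: /private/";
        Duration delay = parseCrawlDelay(robots, Duration.ofSeconds(1));
        // A polite crawl loop would then do:
        //   makeRequest(url);               // hypothetical request call
        //   Thread.sleep(delay.toMillis()); // honor the Crawl-delay
        System.out.println(delay.getSeconds());
    }
}
```

Note that Crawl-delay is a de facto convention, not part of the robots.txt standard (RFC 9309), so even crawlers that parse robots.txt may ignore it unless the operator opts in.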
