robotsparser deny all with some rules #80388

Open
Wats0ns mannequin opened this issue Mar 6, 2019 · 6 comments
Labels
3.11 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@Wats0ns
Mannequin

Wats0ns mannequin commented Mar 6, 2019

BPO 36207
Nosy @vstinner, @Wats0ns, @iritkatriel

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2019-03-06.09:42:01.201>
labels = ['type-feature', 'library', '3.11']
title = 'robotsparser deny all with some rules'
updated_at = <Date 2022-04-06.10:21:58.234>
user = 'https://github.com/Wats0ns'

bugs.python.org fields:

activity = <Date 2022-04-06.10:21:58.234>
actor = 'vstinner'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2019-03-06.09:42:01.201>
creator = 'quentin-maire'
dependencies = []
files = []
hgrepos = []
issue_num = 36207
keywords = []
message_count = 6.0
messages = ['337285', '338293', '338298', '390073', '408351', '416852']
nosy_count = 6.0
nosy_names = ['vstinner', 'quentin-maire', 'iritkatriel', 'EricG', 'nico.bonefato', 'adiboo67']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue36207'
versions = ['Python 3.11']

@Wats0ns
Mannequin Author

Wats0ns mannequin commented Mar 6, 2019

RobotsParser parses a "Disallow: ?" rule as a deny-all, but this is a valid rule that should be interpreted as "Disallow: /?" or "Disallow: /?*".
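
The report can be checked with a short script against the standard library's `urllib.robotparser.RobotFileParser`. This is only a sketch: the `example.com` URLs and the two-line robots.txt are made up for illustration, and no expected output is given because the result may differ between Python versions. The issue reports that every URL comes back blocked.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt containing only the rule discussed in this issue.
robots_txt = """\
User-agent: *
Disallow: ?
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network access is needed.
parser.parse(robots_txt.splitlines())

# Illustrative URLs (assumptions): one without a query string, one with.
results = {
    url: parser.can_fetch("*", url)
    for url in ("http://example.com/page", "http://example.com/?q=1")
}

for url, allowed in results.items():
    print(url, "->", "allowed" if allowed else "blocked")
```

If the first URL is reported as blocked, the rule is being treated as a deny-all, which is the behavior described here.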

@Wats0ns Wats0ns mannequin added type-bug An unexpected behavior, bug, or error stdlib Python modules in the Lib dir labels Mar 6, 2019
@csabella
Contributor

Can you provide a link to documentation showing that "Disallow: ?" shouldn't be the same as deny all? Thanks!

@Wats0ns
Mannequin Author

Wats0ns mannequin commented Mar 18, 2019

I can't find any documentation about it, but all of the robots.txt checkers I have found behave like this. You can test on this site: http://www.eskimoz.fr/robots.txt. I believe this is how it's implemented in most parsers now?
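
The interpretation attributed to those checkers can be sketched as a plain prefix match in which a rule starting with "?" is read as "/?". The helper names `normalize_rule` and `is_disallowed` are hypothetical and are not part of `urllib.robotparser`. Wildcards such as `*` are not handled, and this is an illustration of the requested reading, not a description of how any particular checker is implemented.

```python
def normalize_rule(rule: str) -> str:
    """Read a rule that starts with '?' as if it were rooted at '/' (assumption)."""
    return "/" + rule if rule.startswith("?") else rule


def is_disallowed(rule: str, target: str) -> bool:
    """Prefix-match a Disallow rule against a path-plus-query string."""
    rule = normalize_rule(rule)
    if not rule:
        # An empty Disallow value conventionally blocks nothing.
        return False
    return target.startswith(rule)


# Illustrative targets (assumptions): path only, root with query, path with query.
checks = {
    target: is_disallowed("?", target)
    for target in ("/page", "/?q=1", "/page?q=1")
}
print(checks)
```

Under this reading, "Disallow: ?" blocks only targets that begin with "/?", so ordinary pages stay fetchable.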

@EricG EricG mannequin changed the title robotsparser deny all with some rules référencement naturel Apr 2, 2021
@vstinner vstinner changed the title référencement naturel robotsparser deny all with some rules Apr 2, 2021
@vstinner
Member

vstinner commented Apr 2, 2021

I removed almost all messages from this issue since most of them looked like spam. I also blocked the user accounts that posted spam. If this was a mistake, contact me.

This is the Python bug tracker, not a forum for asking how to use Python or for reporting bugs in your website.

Multiple comments were written in French, whereas this bug tracker is in English.

I am even considering closing the issue, since it has received so many spam comments.

@iritkatriel
Member

I restored one non-spam message from the OP that was deleted.

Changing to enhancement because this is not a bug (i.e., deviation from documentation).

I don't know enough about this to have a view on whether this enhancement request should be accepted.

@iritkatriel iritkatriel added 3.11 only security fixes type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Dec 12, 2021
@vstinner
Member

vstinner commented Apr 6, 2022

I removed two comments: none of the mentioned URLs contains a "Disallow: ?" rule, and the comments didn't add any value to this issue. They look like regular SEO spam.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022