Implement Robots Exclusion Protocol (REP) IETF Draft: port unit tests #360
Merged: sebastian-nagel merged 5 commits into crawler-commons:master from sebastian-nagel:cc-245-google-robotstxt-tests on Jul 12, 2023
sebastian-nagel force-pushed the cc-245-google-robotstxt-tests branch from 849b67e to 00ee91f (October 20, 2022 13:18)
Rebased on top of #362; five unit tests still fail.
sebastian-nagel force-pushed the cc-245-google-robotstxt-tests branch from 00ee91f to 5baeb88 (April 22, 2023 18:49)
sebastian-nagel force-pushed the cc-245-google-robotstxt-tests branch from 5baeb88 to 28a0322 (May 12, 2023 12:37)
sebastian-nagel force-pushed the cc-245-google-robotstxt-tests branch from 28a0322 to 7bfadca (July 3, 2023 19:21)
Unit tests should succeed once rebased on top of #430.
- port unit tests from https://github.com/google/robotstxt
- port unit tests from https://github.com/google/robotstxt: adapt "Google-only" unit tests dealing with overlong lines and non-standard user-agent names
- port unit tests from https://github.com/google/robotstxt: adapt unit tests dealing with overlong lines and percent-encoded URL paths where the behavior of SimpleRobotRulesParser is not wrong and may even be seen as an improvement over the restrictions the Google robots.txt parser puts on API input parameters
- avoid locale-sensitive methods
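
The last commit message does not say which calls were changed. As an illustration only, a common locale-sensitive pitfall in Java is `String.toLowerCase()` without an explicit locale: under a Turkish default locale, `I` lowercases to the dotless `ı` (U+0131), so a directive such as `DISALLOW` would no longer match `disallow`. The sketch below uses a hypothetical class and helper name (`LocaleSafeLowercase`, `lowerRoot`); it is not code from this pull request.

```java
import java.util.Locale;

// Hypothetical example class, not part of crawler-commons.
public class LocaleSafeLowercase {

    /** Lowercases a token independently of the JVM default locale. */
    public static String lowerRoot(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        String directive = "DISALLOW";

        // Turkish casing rules map 'I' to dotless 'ı' (U+0131).
        String localeSensitive = directive.toLowerCase(turkish);
        // Locale.ROOT applies locale-neutral casing rules.
        String localeNeutral = lowerRoot(directive);

        System.out.println(localeSensitive.equals("disallow")); // false
        System.out.println(localeNeutral.equals("disallow"));   // true
    }
}
```

Passing `Locale.ROOT` explicitly keeps case-insensitive matching of directives and user-agent names stable regardless of the machine the parser or its tests run on.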
sebastian-nagel force-pushed the cc-245-google-robotstxt-tests branch from 7bfadca to 83454bd (July 12, 2023 08:39)
rzo1 approved these changes on Jul 12, 2023
Thanks for the review, @rzo1!
Port unit tests from https://github.com/google/robotstxt (https://github.com/google/robotstxt/blob/master/robots_test.cc)