Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the AllowAll and DisallowAll instances public #36

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

thewizardplusplus
Copy link

It may be useful in the following cases:

  • Unit tests. Let's assume we are testing a function that filters a list of links with a previously parsed robots.txt file. How do we check the case when a robots.txt file disallows all links? Now it is possible only with actual parsing of the corresponding robots.txt file.
  • Errors not related to the network. Let's assume we are collecting a cache of parsed robots.txt files. What should we put in it in case of an error? Yes, the library has the useful functions analyze a status of a response, but not all errors are due to network problems (for example, let the cache accepts links as strings and parses them to construct a link to robots.txt files; this parsing may result in an error).
  • Support for temporary ignoring robots.txt files. Let's assume we want to allow a user to ignore robots.txt files through a special option. That is, allow all links. Now again, it is possible only with actual parsing of the corresponding robots.txt file.

I suppose it would be more comfortable to use the prepared public instances for allowing and disallowing all links.

@temoto
Copy link
Owner

temoto commented Jul 17, 2022 via email

@thewizardplusplus
Copy link
Author

You can't just make a variable public, because that allows accidental or
malicious mutation outside of package.

This is common practice in Go, recall io.EOF or http.DefaultClient. The latter is the full equivalent of our case.

But you are right. I can suggest using functions that return these constants, such as NewAllowAll() and NewDisallowAll() (the names are up to you).

Current unit tests check that some agent is allowed on some path. I'm not
sure that extracting (dis)allowall is beneficial. You would repeat part of
group.test function in your code to check against new value. How is that
better?

Other arguments seem to want marshalling to save result of parsing and
reuse it later. But what's wrong with just reparsing robots file again?
It's pretty fast.

You are right, I can parse prepared strings every time I need those special cases. But I find it non-ideomatic: I need the special value, the library has that value, but private, and I have to parse the same string over and over again.

After all, why does the library have these constants? You could also parse prepared strings each time to get AllowAll and DisallowAll. That's fast. But you chose to use constants. Because it's obviously more comfortable (and faster).

@temoto
Copy link
Owner

temoto commented Jul 18, 2022 via email

@temoto
Copy link
Owner

temoto commented Jul 18, 2022

I can parse prepared strings every time I need those special cases. But I find it non-ideomatic: I need the special value

This is where we don't agree yet.

Library doesn't use (dis)allowAll constants as special value. Maybe RobotsData.(Dis)AllowAll bool should be exported. Maybe via method RobotsData.SetAll(bool).

Let's imagine RobotsData supports Marshaler interface and result is stable between Go versions, like protobuf. Will that be ok or would you still want SetAll ?

@codecov
Copy link

codecov bot commented Jul 18, 2022

Codecov Report

Merging #36 (60a6ab8) into master (a68aeca) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master      #36   +/-   ##
=======================================
  Coverage   92.59%   92.59%           
=======================================
  Files           3        3           
  Lines         405      405           
=======================================
  Hits          375      375           
  Misses         19       19           
  Partials       11       11           
Impacted Files Coverage Δ
robotstxt.go 92.85% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a68aeca...60a6ab8. Read the comment docs.

@temoto temoto added the waiting-for-feedback Waiting for response from original poster or community. label Nov 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting-for-feedback Waiting for response from original poster or community.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants