
added entry for AxelSpringerSE #23

Merged

Conversation

renebaudisch
Contributor

No description provided.

@google-cla

google-cla bot commented May 2, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up-to-date status, view the checks section at the bottom of the pull request.

@sjledoux
Collaborator

sjledoux commented May 2, 2023

Hi @renebaudisch, thank you for your submission. In addition to the CLA check, it appears you're failing a number of the FPS submission checks, which you can view by clicking on the "Details" link on the right side of the tab where it says "FPS submission checks / PR-Actions (pull_request)".

To summarize, a number of sites in your set are lacking the required .well-known/first-party-set.json file. Additionally, your service domain asadcdn.com does not meet our requirements for service domains with regard to robots.txt and ads.txt.
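As an illustration of the .well-known requirement (this is a sketch, not the actual checker code, and the assumed JSON structure with a "primary" field is my reading of the FPS format), each listed site must serve a parseable JSON file at /.well-known/first-party-set.json:

```python
# Illustrative sketch only, not the real FPS checker. It assumes the
# .well-known file is a JSON object with a "primary" field, which
# non-primary members use to point back at the set's primary site.
import json
import urllib.request


def fetch_well_known(domain: str) -> str:
    """Fetch the raw body of the domain's first-party-set.json."""
    url = f"https://{domain}/.well-known/first-party-set.json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


def is_valid_well_known_body(body: str) -> bool:
    """Body must parse as a JSON object; "primary" is an assumed field."""
    try:
        data = json.loads(body)
    except ValueError:
        return False
    return isinstance(data, dict) and "primary" in data
```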

@renebaudisch
Contributor Author

Yes, thanks, I've seen that...
But those checks are new: we needed some time to resolve the issues from #10, and in the meantime you updated some things, so we also need to do another round in-house...

@renebaudisch
Contributor Author

renebaudisch commented May 3, 2023

What does that mean?
The error says: "The service site https://www.asadcdn.com/ has an ads.txt file; this violates the policies for service sites."

Where can I access those policies?

And I also don't understand what's wrong with our robots.txt, the header is set, and I cannot adjust them being case-sensitive:

[screenshot: response headers for robots.txt, including the X-Robots-Tag header]

@sjledoux
Collaborator

sjledoux commented May 3, 2023

What does that mean?
The service site https://www.asadcdn.com/ has an ads.txt file, this violates the policies for service sites

We expect that no service domain serves an ads.txt page. The error your PR received for that check indicates that a request to https://asadcdn.com/ads.txt returned a response code of 200, which constitutes a failure of the check.
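In other words, the rule can be sketched like this (illustrative names only, not the checker's actual code): request /ads.txt on the service domain and fail the submission if the response status is 200.

```python
# Minimal sketch of the ads.txt rule described above (illustrative,
# not the real checker): a service domain passes only when /ads.txt
# does NOT come back with HTTP 200.
import urllib.error
import urllib.request


def ads_txt_check_passes(status_code: int) -> bool:
    """A 200 response means an ads.txt page exists, which fails the check."""
    return status_code != 200


def fetch_ads_txt_status(domain: str) -> int:
    """Return the HTTP status for https://<domain>/ads.txt."""
    req = urllib.request.Request(f"https://{domain}/ads.txt", method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 when no ads.txt is served
```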

Where can I access those policies?

That policy is listed under the Subset Level Technical Validation section of the submission guide.

And I also don't understand what's wrong with our robots.txt, the header is set, and I cannot adjust them being case-sensitive

Thank you for pointing this out - our check here should not be case sensitive. I will create an issue in the repo with reference to this and fix it shortly.

@renebaudisch
Contributor Author

I'm pretty sure that I only added this ads.txt because of the report in #10, since I added it on April 14th, 2023...
But OK, I'll remove it.

@sjledoux
Collaborator

sjledoux commented May 4, 2023

Hi, after some investigation into the code, I believe the robots.txt issue your PR is experiencing is not actually related to case sensitivity. Our check handles response headers through a case-insensitive dictionary (the behavior the requests library provides), so header capitalization should not be an issue with how our check functions.

Instead, the issue appears to come from which page carries the "X-Robots-Tag". Our check looks for the tag on your base domain, https://asadcdn.com, but the tag is instead set on https://asadcdn.com/robots.txt, which is why the check is failing. The wording of the policy in the submission guide is ambiguous about where we expect the tag, so we are rewording the requirement as part of the PR here. Apologies for causing any confusion there.
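The distinction can be sketched as follows (illustrative only, not the repo's actual code): the checker inspects the response headers of the base domain itself, and header names compare case-insensitively.

```python
# Illustrative sketch, not the actual checker: fetch the BASE domain's
# response headers and look for X-Robots-Tag there, comparing header
# names case-insensitively (mirroring the behavior discussed above).
import urllib.request


def fetch_headers(url: str) -> dict:
    """Return the response headers for a GET request to url."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return dict(resp.getheaders())


def has_x_robots_tag(headers: dict) -> bool:
    """True if any X-Robots-Tag header is present, ignoring name case."""
    lowered = {name.lower() for name in headers}
    return "x-robots-tag" in lowered
```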

@renebaudisch renebaudisch force-pushed the AxelSpringerSE_bild.de_init_submit branch 2 times, most recently from 45e82d7 to a8b1d7f Compare May 5, 2023 09:36
@renebaudisch
Contributor Author

Hi there, I made some changes, and I see that our next to-do is fixing the contents of the JSON files. But regarding the robots.txt error, I think there is a misspelling: shouldn't it search for noindex, without the -? Sure, I can also add that hyphenated value on my side, but won't the standard value you encounter elsewhere be noindex as well?

@sjledoux
Collaborator

sjledoux commented May 5, 2023

Hi Renee, it looks like that's a misspelling in the error message; the code itself was looking for noindex without the -, as you can see here. The error appeared because that version of the code looked for exactly the noindex tag, while your site serves noindex, nofollow. Your tags should be acceptable, so we are fixing that with PR 28, along with the wording of the error message and the policy description in the submission guide.

The fix will be merged shortly, so you should be passing the robots check soon. Hope this clears everything up in that regard. Apologies for the previous issues with that check.
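The difference between the old and the fixed matching can be sketched like this (illustrative, not the repo's actual code): an exact comparison rejects "noindex, nofollow", while splitting the header value into comma-separated directives accepts it.

```python
# Illustrative comparison of the two matching strategies discussed
# above (not the repo's actual code).

def old_exact_match(tag_value: str) -> bool:
    """Old behavior: only the bare value "noindex" passes."""
    return tag_value.strip().lower() == "noindex"


def token_match(tag_value: str) -> bool:
    """Fixed behavior: any comma-separated directive equal to "noindex"."""
    tokens = [t.strip().lower() for t in tag_value.split(",")]
    return "noindex" in tokens
```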

@renebaudisch
Contributor Author

No problem; we will hopefully have our JSON files updated by Monday as well.
Thanks for your fast responses and for caring so much!

@sjledoux
Collaborator

sjledoux commented May 5, 2023

Glad to have helped!

By the way, the changes you made in commits 45e82d7 and a8b1d7f look to be trivial. Was there any reason for them, or were you just making minor changes to trigger the workflow again?

@renebaudisch
Contributor Author

We still need to update the JSON files, but the txt files should already be fixed. I purged the caches on the server, hoping that will get rid of those warnings, since the fixes should already be live...

@sjledoux
Collaborator

I believe you will need to rebase your branch to pass the robots check, at least. When I run the current version of the code locally against your list, it fails only the well-known checks.

@renebaudisch renebaudisch force-pushed the AxelSpringerSE_bild.de_init_submit branch from a8b1d7f to 7a45587 Compare May 15, 2023 15:42
@github-actions

Looks like you've passed all of the checks!

@sjledoux sjledoux self-requested a review May 16, 2023 15:48
Collaborator

@sjledoux sjledoux left a comment


Looks good to me. Ready to merge.

@sjledoux sjledoux merged commit 47e3775 into GoogleChrome:main May 16, 2023
@renebaudisch renebaudisch deleted the AxelSpringerSE_bild.de_init_submit branch May 16, 2023 16:16