Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add associated warning #373

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 8 additions & 3 deletions .github/workflows/rws-submissions-checks.yml
Original file line number Diff line number Diff line change
Expand Up @@ -46,20 +46,25 @@ jobs:
CONTENTS: ${{ steps.read_results.outputs.contents }}
run: echo "$CONTENTS"
- name: Create Comment
if: steps.read_results.outputs.contents != 'success'
if: steps.read_results.outputs.contents != 'success' && ${{ !startsWith(steps.read_results.outputs.contents, 'Warning')}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this support emitting a warning alongside errors? What if the warning happens to be the first line that was emitted, but it's followed by errors?

(Maybe we ought to write separate files for warnings/errors.)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want help me?

run: |
echo "It appears you have failed some tests. Here are your results:" > message.txt
cat results.txt >> message.txt
- name: Comment if sucess
- name: Comment if success
if: steps.read_results.outputs.contents == 'success'
run: echo "Looks like you've passed all of the checks!" >> message.txt
- name: Comment if warning
if: startsWith(steps.read_results.outputs.contents, 'Warning')
run: |
echo "You have passed all of the checks, but there are one or more warnings associated with your submission." > message.txt
cat results.txt >> message.txt
- name: Write Comment on PR
uses: mshick/add-pr-comment@v2
with:
message-path: "message.txt"
refresh-message-position: true
- name: Fail or Succeed
if: steps.read_results.outputs.contents != 'success'
if: startsWith(steps.read_results.outputs.contents, 'It appears you have failed')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can all of this be moved into the "Create Comment" step, so that we don't have to duplicate the "It appears you have failed" substring? Duplicating that makes this brittle, it'll break if someone changes that string and doesn't realize there's a dependency here.

uses: actions/github-script@v3
with:
script: |
Expand Down
19 changes: 19 additions & 0 deletions RwsCheck.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
from publicsuffix2 import PublicSuffixList

WELL_KNOWN = "/.well-known/related-website-set.json"
ASSOCIATED_LIMIT = 5

class RwsCheck:

Expand All @@ -45,6 +46,7 @@ def __init__(self, rws_sites: json, etlds: PublicSuffixList, icanns: set):
self.etlds = etlds
self.icanns = icanns
self.error_list = []
self.warning_texts = []

def validate_schema(self, schema_file):
"""Validates the canonical sites list
Expand Down Expand Up @@ -170,6 +172,23 @@ def check_exclusivity(self, check_sets):
else:
site_list.update(aliases)

def check_associated_count(self, check_sets):
"""This method checks for RwsSets that exceed the associated limit

Creates a warning for each set passed in the check_sets list that has
more associatedSites than the ASSOCIATED_LIMIT

Args:
check_sets: Dict[string, RwsSet]
Returns:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future refactoring: IMO it'd be better design to make this a "pure" function, i.e. it takes all its inputs as arguments, and returns its result explicitly (rather than just mutating the instance). That makes it easier to test, and easier for a reader to understand how to use the method.

Doing that in this CL would require the caller to know whether the results should be considered warnings or errors, though, which isn't great. What'd be nice is if each check returned a list of errors and and list of warnings, and the caller just appended each to its running lists. That's a larger change, so it shouldn't be part of this PR.

None
"""
for primary, rws in check_sets.items():
if len(rws.associated_sites) > ASSOCIATED_LIMIT:
self.warning_texts.append(
f"Warning: the set for {primary} contains more than {ASSOCIATED_LIMIT} associated sites."
)

def url_is_https(self, site):
"""A function that checks for https://

Expand Down
9 changes: 6 additions & 3 deletions check_sites.py
Original file line number Diff line number Diff line change
Expand Up @@ -130,7 +130,8 @@ def main():
# Run rest of checks
check_list = [
rws_checker.has_all_rationales,
rws_checker.find_non_https_urls,
rws_checker.find_non_https_urls,
rws_checker.check_associated_count,
rws_checker.find_invalid_eTLD_Plus1,
rws_checker.find_invalid_well_known,
rws_checker.find_invalid_alias_eSLDs,
Expand All @@ -145,12 +146,14 @@ def main():
except Exception as inst:
error_texts.append(inst)
# This message allows us to check the succes of our action
if rws_checker.error_list or error_texts:
if rws_checker.error_list or error_texts or rws_checker.warning_texts:
for checker_error in rws_checker.error_list:
print(checker_error)
for error_text in error_texts:
print(error_text)
else:
for warning in rws_checker.warning_texts:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment here that indicates it's important for the warnings to be after errors? Unless you choose to remove that limitation, per my other comment.

print(warning)
Comment on lines +149 to +155
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting complicated/repetitive; consider extracting the things we want to print:

outputs = rws_checker.error_list + error_texts + rws_checker.warning_texts
if outputs:
  for output in outputs:
    print(output)
else:
  print("success", end="")

Note that this no longer repeats each list twice.

else:
print("success", end='')


Expand Down
78 changes: 78 additions & 0 deletions tests/rws_tests.py
Original file line number Diff line number Diff line change
Expand Up @@ -276,6 +276,84 @@ def test_expected_rationales_case(self):
loaded_sets = rws_check.load_sets()
self.assertEqual(rws_check.error_list, [])

class TestCheckAssociatedCount(unittest.TestCase):
def test_within_limit(self):
json_dict = {
"sets":
[
{
"primary": "https://primary.com",
"associatedSites": ["https://associated1.com"],
"rationaleBySite": {}
}
]
}
rws_check = RwsCheck(rws_sites=json_dict,
etlds=None,
icanns=set())
loaded_sets = rws_check.load_sets()
rws_check.check_associated_count(loaded_sets)
self.assertEqual(rws_check.associated_warning, [])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion is more specific than it needs to be, so it is a change-detector assertion. (If we add another kind of warning in the future, this assertion may fail even though the condition that it's meant to check -- that the >5 associated sites warning is not present -- is still true.)

Please check for the specific warning that this test cares about, not that there are 0 warnings overall.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is rws_check.associated_warning defined? It looks like you meant this to be rws_check.warning_texts instead, here and below. If so, please investigate why the tests passed despite this typo?


def test_over_limit(self):
json_dict = {
"sets":
[
{
"primary": "https://primary.com",
"associatedSites": ["https://associated1.com",
"https://associated2.com",
"https://associated3.com",
"https://associated4.com",
"https://associated5.com",
"https://associated6.com"],
"rationaleBySite": {}
}
]
}
rws_check = RwsCheck(rws_sites=json_dict,
etlds=None,
icanns=set())
loaded_sets = rws_check.load_sets()
rws_check.check_associated_count(loaded_sets)
self.assertEqual(rws_check.associated_warning,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check that the warning we want is among the actual warnings; not that it is the only warning. (You could do this with self.assertIn or self.assertCountEqual.)

Same comment for the test case below.

["Warning: the set for https://primary.com contains more than 5 associated sites."])

def test_multi_over_limit(self):
json_dict = {
"sets":
[
{
"primary": "https://primary.com",
"associatedSites": ["https://associated1.com",
"https://associated2.com",
"https://associated3.com",
"https://associated4.com",
"https://associated5.com",
"https://associated6.com"],
"rationaleBySite": {}
},
{
"primary": "https://primary2.com",
"associatedSites": ["https://associated7.com",
"https://associated8.com",
"https://associated9.com",
"https://associated10.com",
"https://associated11.com",
"https://associated12.com"],
"rationaleBySite": {}
}
]
}
rws_check = RwsCheck(rws_sites=json_dict,
etlds=None,
icanns=set())
loaded_sets = rws_check.load_sets()
rws_check.check_associated_count(loaded_sets)
self.assertEqual(rws_check.associated_warning,
["Warning: the set for https://primary.com contains more than 5 associated sites.",
"Warning: the set for https://primary2.com contains more than 5 associated sites."])

class TestCheckExclusivity(unittest.TestCase):
def test_servicesets_overlap(self):
json_dict = {
Expand Down
Loading