Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check all repos in Homu's cfg.toml are valid #606

Merged
merged 1 commit into from May 3, 2017
Merged

Conversation

@rwaweber
Copy link
Contributor

rwaweber commented Feb 18, 2017

Enumerates the inconsistencies between repositories configured in homu
vs those configured in the servo github organization.


This change is Reviewable

@highfive
Copy link

highfive commented Feb 18, 2017

Thanks for the pull request, and welcome! The Servo team is excited to review your changes, and you should hear from @aneeshusa (or someone else) soon.

@jdm
Copy link
Member

jdm commented Mar 15, 2017

@aneeshusa or @edunham Could you take a look at this?

@edunham
Copy link
Contributor

edunham commented Mar 20, 2017

The logic looks correct to me, and the whole thing superficially looks like it should work. However, both the master tests fail on Travis.

Could you please add a test in the repo_pager to bail out with a descriptive error message if the auth credentials passed into it didn't work?

(the following is my notes from debugging it -- I might want them later, but you're free to ignore them)

So as you mentioned in your comment on #581, we're sticking a string into the gh-access-token here where we should actually be reading an env var named TEST_HOMU_GH_ACCESS_TOKEN after having set the TEST_HOMU_GH_ACCESS_TOKEN to a working token at https://travis-ci.org/servo/saltfs/settings . So then the string which is not in fact a token at all gets blindly pulled in to the config without being validated as a token here, and is then (i think) successfully read out as if it were the token on line 38 of the new test.

So then the actual error we get is string indices must be integers. I think that means that the req_list is getting filled with strings instead of dicts that you can sanely ask for a html_url out of, and that's happening up in the repo_pager.

recursively consumes an endpoint and an empty (self-populating) list
it will exit when it fails to find the 'next' key in the links dict
"""
req = requests.get(endpoint, headers=auth)

This comment has been minimized.

@edunham

edunham Mar 20, 2017

Contributor

If the endpoint returns anything about failed auth, please bail out right here with a descriptive error message.

# exist in the req dicts
gh_repos = [req['html_url'] for req in req_list]
except TypeError as e:
return Failure('Unable to authorize to github API', str(e))

This comment has been minimized.

@rwaweber

rwaweber Mar 22, 2017

Author Contributor

After d83bedf I feel that the need to catch this exception is significantly diminished, since the authorization check is now handled in the repo_pager function.

edunham added a commit to edunham/saltfs that referenced this pull request Mar 23, 2017
edunham added a commit to edunham/saltfs that referenced this pull request Mar 23, 2017
edunham added a commit to edunham/saltfs that referenced this pull request Mar 23, 2017
@rwaweber rwaweber changed the title addressing 581 Check all repos in Homu's cfg.toml are valid Mar 26, 2017
Copy link
Member

aneeshusa left a comment

Sorry about the review delay here, and thanks for the PR! The current approach of enumerating all repos under the servo org has a few problems: it doesn't handle the case of repos that aren't under the servo org, it does more work than necessary, and it requires an authentication token. I think an easier approach would be to just make a HEAD request for each repo to see if it exists, e.g.requests.head('https://github.com/servo/servo').status_code == 200. We only have ~50 repos configured, so this shouldn't be too slow, and is simpler to deploy and understand.

Also, please rebase on top of the latest master to get rid of the merge commit.



def run():
# these try and except blocks are largely to compensate for potential

This comment has been minimized.

@aneeshusa

aneeshusa Apr 11, 2017

Member

The test harness has a blanket catch-all for exceptions raised - these try/except blocks are fine while debugging, but I think they add noise once we get it working.

@rwaweber
Copy link
Contributor Author

rwaweber commented Apr 11, 2017

@aneeshusa Thanks for the review!

just make a HEAD request for each repo to see if it exists, e.g.requests.head('https://github.com/servo/servo').status_code == 200

Agreed! I think that would be a great deal simpler. Would also allow me to remove the hackery around the repo_pager function necessary to get through the github api, I must've misunderstood the original request in the issue statement.

Just for the sake of clarity, we only want to make sure that all of the repositories in the homu configuration are configured on github, correct?

Not sure about the rate limiting risk here, as there would now be a request for every repo configured. But, github /probably/ wouldn't rate limit on there http frontend and with ~50 requests. This is something to figure out with testing.

Would it also be preferred to consider using urllib instead of introducing an extra dependency in the form of requests? I'm perfectly happy to rewrite it that way.

@aneeshusa
Copy link
Member

aneeshusa commented Apr 11, 2017

Yes, the intent is to ensure all the repositories in the Homu config exist on GitHub.

I didn't think about rate limiting earlier, I thought the token was to get read access to the list of repos under the organization. I do think we'll need a token to work around rate limiting since IIRC there's a limit of 60 requests an hour to the API. However, we can use a separate token for the test suite and just read it from the environment instead of gathering it from the Homu config (see my comment on the PR for the token). Of course, if it works without a token, that's even better!

It would be nice to use urllib and avoid the dependency (especially since it's more ergonomic in Python 3), but I'm also ok with using requests.

@rwaweber rwaweber force-pushed the rwaweber:master branch from 4567ba0 to 4acee0a Apr 13, 2017
@rwaweber
Copy link
Contributor Author

rwaweber commented Apr 13, 2017

Lemme know if there's anything you'd want me to change with the above!

From what I can tell(by checking the https://api.github.com/rate_limit endpoint), these requests are not affected by the 60 unauthenticated requests an hour limit imposed by the github api. Downside is, that it makes the homu.sls tests run a good 10-15 seconds slower, which is sort of annoying. Though, getting around that limitation may end up being more trouble than it's worth.

@jdm
Copy link
Member

jdm commented Apr 20, 2017

@aneeshusa review ping!

@@ -0,0 +1,33 @@
import toml

This comment has been minimized.

@aneeshusa

aneeshusa Apr 20, 2017

Member

Split up the imports into three groups: Python modules (the urllib imports), external modules (toml) and internal modules (tests.util). Separate each group of imports by a blank line, and alphabetize in each group.



def run():
# repository configuration dictionary from homu

This comment has been minimized.

@aneeshusa

aneeshusa Apr 20, 2017

Member

nit: Remove comment

# repository configuration dictionary from homu
repo_cfg = toml.load('/home/servo/homu/cfg.toml')['repo']
VCS = "https://github.com/"
# extracting owner and repo from the configuration dict

This comment has been minimized.

@aneeshusa

aneeshusa Apr 20, 2017

Member

nit: Remove comment

Copy link
Member

aneeshusa left a comment

Overall right approach! I left a bunch of Python style nits. Also, prefer block indent for indentation, and capitalize the Success/Failure messages and commit message.

from tests.util import Failure, Success


def getStatus(url):

This comment has been minimized.

@aneeshusa

aneeshusa Apr 20, 2017

Member

Let's make this a repo_exists method, which takes a repo identifier string in the form owner/name, e.g. servo/saltfs, and returns a boolean. Update the doc string as appropriate.

Consumes a url string and returns the status code of a GET request
'''
try:
response = urllib.request.urlopen(url).status

This comment has been minimized.

@aneeshusa

aneeshusa Apr 20, 2017

Member

Use HEAD requests to reduce bandwidth usage - you can create a urllib.request.Request object with method='HEAD', and pass that to urlopen.
Use a with statement for urlopen to automatically close the response.

# extracting owner and repo from the configuration dict
# and formatting it to more easily form a url to submit a request to
homu_repos = [repo_cfg[repo_title]['owner']+'/'+repo_title
for repo_title in repo_cfg.keys()]

This comment has been minimized.

@aneeshusa

aneeshusa Apr 20, 2017

Member

Iterating over repo_cfg.values() and using the 'owner' and 'name' keys for each repo will make this a little cleaner.

VCS = "https://github.com/"
# extracting owner and repo from the configuration dict
# and formatting it to more easily form a url to submit a request to
homu_repos = [repo_cfg[repo_title]['owner']+'/'+repo_title

This comment has been minimized.

@aneeshusa

aneeshusa Apr 20, 2017

Member

I prefer to use string formatting, e.g. '{}/{}'.format(repo['owner'], repo['name'])

This comment has been minimized.

@aneeshusa

aneeshusa Apr 20, 2017

Member

Also, this can be a generator expression instead of making a temporary list.

This comment has been minimized.

@rwaweber

rwaweber Apr 21, 2017

Author Contributor

++ on the generator recommendation, I've never used those before so that was neat to play around with them!

for repo_title in repo_cfg.keys()]
failed_responses = [repository for repository in homu_repos
if getStatus(VCS+repository) != 200]
failed_resp_str = " \n".join(failed_responses)

This comment has been minimized.

@aneeshusa

aneeshusa Apr 20, 2017

Member

Inline this inside the Failureconstructor to avoid constructing it if not necessary.
Use '\n' to join the responses (no preceding space), and precede each entry with a - , which makes a nice list.

# and formatting it to more easily form a url to submit a request to
homu_repos = [repo_cfg[repo_title]['owner']+'/'+repo_title
for repo_title in repo_cfg.keys()]
failed_responses = [repository for repository in homu_repos

This comment has been minimized.

@aneeshusa

aneeshusa Apr 20, 2017

Member

Rename this to missing_repos.

def run():
# repository configuration dictionary from homu
repo_cfg = toml.load('/home/servo/homu/cfg.toml')['repo']
VCS = "https://github.com/"

This comment has been minimized.

@aneeshusa

aneeshusa Apr 20, 2017

Member

Inline this the urllib.request.Request constructor in repo_exists.

@rwaweber
Copy link
Contributor Author

rwaweber commented Apr 28, 2017

Hey @aneeshusa, just lemme know if you want me to look at anything else. If not, I'm happy to get down to squashing!

Copy link
Member

aneeshusa left a comment

One last nit here! Go ahead and squash when you address this.

if not repo_exists(repository)
]
if len(missing_repos) > 0:
return Failure('Some repos set up for Homu do not exist on GitHub:',

This comment has been minimized.

@aneeshusa

aneeshusa Apr 29, 2017

Member

one last nit: use block indent for this. This also means the second argument will all fit on one line, i.e.:

    return Failure(
        'Some repos...on GitHub:',
        '\n'.join(' - {}'.format(repo) for repo in missing_repos)
    )
@edunham
Copy link
Contributor

edunham commented May 2, 2017

@aneeshusa review ping :)

@aneeshusa
Copy link
Member

aneeshusa commented May 2, 2017

@rwaweber, all the code looks good! Please update the commit message to have a descriptive title for the change and include some motivation/details as a body. I like https://chris.beams.io/posts/git-commit/ as a guide to good commit messages.

This test will read the deployed cfg.toml file for Homu, and report if
any repositories configured there are not available on github.

This will help to prevent typos in repo names.
@rwaweber rwaweber force-pushed the rwaweber:master branch from 6c20908 to 0acc281 May 2, 2017
@aneeshusa
Copy link
Member

aneeshusa commented May 3, 2017

Looks great, thanks for the PR and your patience, @rwaweber!

@bors-servo r+

@bors-servo
Copy link
Contributor

bors-servo commented May 3, 2017

📌 Commit 0acc281 has been approved by aneeshusa

@bors-servo
Copy link
Contributor

bors-servo commented May 3, 2017

Testing commit 0acc281 with merge d1a5302...

bors-servo added a commit that referenced this pull request May 3, 2017
Check all repos in Homu's cfg.toml are valid

Enumerates the inconsistencies between repositories configured in homu
vs those configured in the servo github organization.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/saltfs/606)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented May 3, 2017

☀️ Test successful - status-travis
Approved by: aneeshusa
Pushing d1a5302 to master...

@bors-servo bors-servo merged commit 0acc281 into servo:master May 3, 2017
2 checks passed
2 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
@rwaweber
Copy link
Contributor Author

rwaweber commented May 3, 2017

Thanks for taking the time to teach, @aneeshusa. Looking forward to the next one!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

6 participants
You can’t perform that action at this time.