Check all repos in Homu's cfg.toml are valid #606
highfive
commented
Feb 18, 2017
|
Thanks for the pull request, and welcome! The Servo team is excited to review your changes, and you should hear from @aneeshusa (or someone else) soon. |
|
@aneeshusa or @edunham Could you take a look at this? |
|
The logic looks correct to me, and the whole thing superficially looks like it should work. However, both the master tests fail on Travis. Could you please add a test in the repo_pager to bail out with a descriptive error message if the auth credentials passed into it didn't work? (The following are my notes from debugging it -- I might want them later, but you're free to ignore them.) As you mentioned in your comment on #581, we're sticking a string into the gh-access-token here where we should actually be reading an env var named So then the actual error we get is string indices must be integers. I think that means the req_list is getting filled with strings instead of dicts that you can sanely ask for an html_url out of, and that's happening up in the repo_pager. |
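The `string indices must be integers` error described in these notes is consistent with an error payload being folded into the result list. The following is only a sketch of that failure mode: it assumes the pager extends its list with whatever JSON came back (the body of `repo_pager` is not shown in this thread) and assumes a failed-auth response is a single dict with a `message` key. The names `collect` and `html_urls` are hypothetical.

```python
# Hypothetical payloads: a successful page is a list of repo dicts, while a
# failed-auth response is assumed to be a single dict describing the error.
good_page = [{"html_url": "https://github.com/servo/saltfs"}]
bad_page = {"message": "Bad credentials"}


def collect(pages):
    """Fold every page into one list, the way a naive pager might."""
    req_list = []
    for page in pages:
        # Extending a list with a dict adds the dict's keys, i.e. strings.
        req_list.extend(page)
    return req_list


def html_urls(req_list):
    """Same shape as the list comprehension under review."""
    return [req["html_url"] for req in req_list]


print(html_urls(collect([good_page])))

try:
    html_urls(collect([bad_page]))
except TypeError as exc:
    # Indexing the string "message" with another string raises TypeError.
    print("TypeError:", exc)
```

If this is what happens, catching the `TypeError` downstream only hides the real problem, which is why checking the response inside the pager is the cleaner fix.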
| recursively consumes an endpoint and an empty (self-populating) list | ||
| it will exit when it fails to find the 'next' key in the links dict | ||
| """ | ||
| req = requests.get(endpoint, headers=auth) |
edunham
Mar 20, 2017
Contributor
If the endpoint returns anything about failed auth, please bail out right here with a descriptive error message.
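One way to bail out early is sketched below. It is not the code from this PR: `Page`, `AuthError`, and the `fetch` callable are hypothetical stand-ins so the example runs without network access, and treating HTTP 401 and 403 as auth failures is an assumption.

```python
from typing import Callable, List, NamedTuple, Optional


class Page(NamedTuple):
    """Hypothetical stand-in for one HTTP response from the API."""
    status: int
    payload: object          # decoded JSON body
    next_url: Optional[str]  # URL of the 'next' link, if any


class AuthError(Exception):
    """Raised when the API rejects the supplied credentials."""


def repo_pager(endpoint: str, fetch: Callable[[str], Page],
               req_list: Optional[List[dict]] = None) -> List[dict]:
    """Recursively follow 'next' links, failing fast on auth errors."""
    if req_list is None:
        req_list = []
    page = fetch(endpoint)
    # Assumption: 401 means bad credentials; 403 means forbidden, which can
    # also indicate rate limiting. Either way the payload is not a repo list.
    if page.status in (401, 403):
        raise AuthError(
            'GitHub rejected the credentials for {} (HTTP {}): {}'.format(
                endpoint, page.status, page.payload))
    req_list.extend(page.payload)
    if page.next_url is None:
        return req_list
    return repo_pager(page.next_url, fetch, req_list)
```

With `requests`, a `fetch` wrapper could call `requests.get(endpoint, headers=auth)` and read the next URL from `response.links`, but that wiring is left out here.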
| # exist in the req dicts | ||
| gh_repos = [req['html_url'] for req in req_list] | ||
| except TypeError as e: | ||
| return Failure('Unable to authorize to github API', str(e)) |
rwaweber
Mar 22, 2017
Author
Contributor
After d83bedf I feel that the need to catch this exception is significantly diminished, since the authorization check is now handled in the repo_pager function.
|
Sorry about the review delay here, and thanks for the PR! The current approach of enumerating all repos under the servo org has a few problems: it doesn't handle the case of repos that aren't under the servo org, it does more work than necessary, and it requires an authentication token. I think an easier approach would be to just make a HEAD request for each repo to see if it exists, e.g. Also, please rebase on top of the latest master to get rid of the merge commit. |
|
|
||
|
|
||
| def run(): | ||
| # these try and except blocks are largely to compensate for potential |
aneeshusa
Apr 11, 2017
Member
The test harness has a blanket catch-all for exceptions raised - these try/except blocks are fine while debugging, but I think they add noise once we get it working.
|
@aneeshusa Thanks for the review!
Agreed! I think that would be a great deal simpler. Would also allow me to remove the hackery around the Just for the sake of clarity, we only want to make sure that all of the repositories in the homu configuration are configured on github, correct? Not sure about the rate limiting risk here, as there would now be a request for every repo configured. But github /probably/ wouldn't rate limit on their HTTP frontend with only ~50 requests. This is something to figure out with testing. Would it also be preferred to consider using |
|
Yes, the intent is to ensure all the repositories in the Homu config exist on GitHub. I didn't think about rate limiting earlier, I thought the token was to get read access to the list of repos under the organization. I do think we'll need a token to work around rate limiting since IIRC there's a limit of 60 requests an hour to the API. However, we can use a separate token for the test suite and just read it from the environment instead of gathering it from the Homu config (see my comment on the PR for the token). Of course, if it works without a token, that's even better! It would be nice to use urllib and avoid the dependency (especially since it's more ergonomic in Python 3), but I'm also ok with using requests. |
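Reading an optional token from the environment could look roughly like the sketch below. The variable name `SALTFS_GITHUB_TOKEN` and the helper `api_request` are hypothetical (the thread does not name either), and the `environ` parameter exists only so the function can be exercised without touching the real environment.

```python
import os
import urllib.request

# Hypothetical variable name; the thread does not say what the suite uses.
TOKEN_ENV_VAR = 'SALTFS_GITHUB_TOKEN'


def api_request(repo, environ=os.environ):
    """Build a HEAD request for `owner/name` against the GitHub REST API."""
    request = urllib.request.Request(
        'https://api.github.com/repos/{}'.format(repo),
        method='HEAD'
    )
    token = environ.get(TOKEN_ENV_VAR)
    if token:
        # Only send credentials when a token was provided, so the check
        # still works (at the unauthenticated rate limit) without one.
        request.add_header('Authorization', 'token {}'.format(token))
    return request
```

Keeping the token optional matches the "if it works without a token, that's even better" preference: the same code path serves both cases.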
|
Lemme know if there's anything you'd want me to change with the above! From what I can tell (by checking the |
|
@aneeshusa review ping! |
| @@ -0,0 +1,33 @@ | |||
| import toml | |||
aneeshusa
Apr 20, 2017
Member
Split up the imports into three groups: Python modules (the urllib imports), external modules (toml) and internal modules (tests.util). Separate each group of imports by a blank line, and alphabetize in each group.
|
|
||
|
|
||
| def run(): | ||
| # repository configuration dictionary from homu |
| # repository configuration dictionary from homu | ||
| repo_cfg = toml.load('/home/servo/homu/cfg.toml')['repo'] | ||
| VCS = "https://github.com/" | ||
| # extracting owner and repo from the configuration dict |
|
Overall right approach! I left a bunch of Python style nits. Also, prefer block indent for indentation, and capitalize the Success/Failure messages and commit message. |
| from tests.util import Failure, Success | ||
|
|
||
|
|
||
| def getStatus(url): |
aneeshusa
Apr 20, 2017
Member
Let's make this a repo_exists method, which takes a repo identifier string in the form owner/name, e.g. servo/saltfs, and returns a boolean. Update the doc string as appropriate.
| Consumes a url string and returns the status code of a GET request | ||
| ''' | ||
| try: | ||
| response = urllib.request.urlopen(url).status |
aneeshusa
Apr 20, 2017
Member
Use HEAD requests to reduce bandwidth usage - you can create a urllib.request.Request object with method='HEAD', and pass that to urlopen.
Use a with statement for urlopen to automatically close the response.
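Taken together, the `repo_exists` suggestion and the two points above might be sketched as follows. The `opener` parameter is an assumption added so the function can be tested without network access, and treating only 404 as "does not exist" is a design choice of this sketch, not something the review specifies.

```python
import urllib.error
import urllib.request


def repo_exists(repo, opener=urllib.request.urlopen):
    """Return True if `repo` (in `owner/name` form) exists on GitHub.

    `opener` is a hypothetical hook for tests; it defaults to urlopen.
    """
    request = urllib.request.Request(
        'https://github.com/{}'.format(repo),
        method='HEAD'  # Headers only, so no response body is downloaded.
    )
    try:
        # The with statement closes the response automatically.
        with opener(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        # Other errors (rate limiting, server faults) say nothing about
        # whether the repo exists, so let them propagate.
        raise
```

Calling `repo_exists('servo/saltfs')` with the default opener performs a real HEAD request against github.com.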
| # extracting owner and repo from the configuration dict | ||
| # and formatting it to more easily form a url to submit a request to | ||
| homu_repos = [repo_cfg[repo_title]['owner']+'/'+repo_title | ||
| for repo_title in repo_cfg.keys()] |
aneeshusa
Apr 20, 2017
Member
Iterating over repo_cfg.values() and using the 'owner' and 'name' keys for each repo will make this a little cleaner.
| VCS = "https://github.com/" | ||
| # extracting owner and repo from the configuration dict | ||
| # and formatting it to more easily form a url to submit a request to | ||
| homu_repos = [repo_cfg[repo_title]['owner']+'/'+repo_title |
aneeshusa
Apr 20, 2017
Member
I prefer to use string formatting, e.g. '{}/{}'.format(repo['owner'], repo['name'])
aneeshusa
Apr 20, 2017
Member
Also, this can be a generator expression instead of making a temporary list.
rwaweber
Apr 21, 2017
Author
Contributor
++ on the generator recommendation, I've never used those before so that was neat to play around with them!
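For readers who have not used generator expressions either, the three suggestions in this thread (iterate over `values()`, use string formatting, avoid the temporary list) combine roughly as below. The sample `repo_cfg` is a made-up excerpt, and wrapping the expression in a `homu_repos` function is a choice of this sketch.

```python
# Hypothetical excerpt of the parsed `repo` table from cfg.toml.
repo_cfg = {
    'saltfs': {'owner': 'servo', 'name': 'saltfs'},
    'servo': {'owner': 'servo', 'name': 'servo'},
}


def homu_repos(repo_cfg):
    """Lazily yield `owner/name` identifiers for every configured repo."""
    # Parentheses instead of brackets make this a generator expression:
    # items are produced one at a time and no intermediate list is built.
    return (
        '{}/{}'.format(repo['owner'], repo['name'])
        for repo in repo_cfg.values()
    )


for identifier in homu_repos(repo_cfg):
    print(identifier)
```

A generator can only be consumed once, so call `homu_repos` again (or build a list) if the identifiers are needed a second time.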
| for repo_title in repo_cfg.keys()] | ||
| failed_responses = [repository for repository in homu_repos | ||
| if getStatus(VCS+repository) != 200] | ||
| failed_resp_str = " \n".join(failed_responses) |
aneeshusa
Apr 20, 2017
Member
Inline this inside the Failure constructor to avoid constructing it if not necessary.
Use '\n' to join the responses (no preceding space), and precede each entry with ' - ', which makes a nice list.
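The difference between the two separators is easiest to see with `repr`. This is a small sketch with a hypothetical helper name, `format_missing`, and made-up repo names:

```python
def format_missing(missing_repos):
    """Render repo identifiers as a dash-prefixed list, one per line."""
    return '\n'.join(' - {}'.format(repo) for repo in missing_repos)


repos = ['servo/saltfs', 'servo/typo']

# The original separator leaves a trailing space before every newline.
print(repr(' \n'.join(repos)))

# The suggested form has no trailing spaces and reads as a list.
print(repr(format_missing(repos)))
print(format_missing(repos))
```

Building the string only on the failure path, for example directly in the `Failure(...)` call, avoids the work when every repo exists.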
| # and formatting it to more easily form a url to submit a request to | ||
| homu_repos = [repo_cfg[repo_title]['owner']+'/'+repo_title | ||
| for repo_title in repo_cfg.keys()] | ||
| failed_responses = [repository for repository in homu_repos |
| def run(): | ||
| # repository configuration dictionary from homu | ||
| repo_cfg = toml.load('/home/servo/homu/cfg.toml')['repo'] | ||
| VCS = "https://github.com/" |
|
Hey @aneeshusa, just lemme know if you want me to look at anything else. If not, I'm happy to get down to squashing! |
|
One last nit here! Go ahead and squash when you address this. |
| if not repo_exists(repository) | ||
| ] | ||
| if len(missing_repos) > 0: | ||
| return Failure('Some repos set up for Homu do not exist on GitHub:', |
aneeshusa
Apr 29, 2017
Member
one last nit: use block indent for this. This also means the second argument will all fit on one line, i.e.:
    return Failure(
        'Some repos...on GitHub:',
        '\n'.join(' - {}'.format(repo) for repo in missing_repos)
    )
|
@aneeshusa review ping :) |
|
@rwaweber, all the code looks good! Please update the commit message to have a descriptive title for the change and include some motivation/details as a body. I like https://chris.beams.io/posts/git-commit/ as a guide to good commit messages. |
This test will read the deployed cfg.toml file for Homu, and report if any repositories configured there are not available on github. This will help to prevent typos in repo names.
|
Looks great, thanks for the PR and your patience, @rwaweber! @bors-servo r+ |
|
|
Check all repos in Homu's cfg.toml are valid. Enumerates the inconsistencies between repositories configured in homu vs those configured in the servo github organization.
|
|
|
Thanks for taking the time to teach, @aneeshusa. Looking forward to the next one! |
rwaweber
commented
Feb 18, 2017
edited by larsbergstrom
Enumerates the inconsistencies between repositories configured in homu
vs those configured in the servo github organization.