-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor app doc classes & build task #2918
Conversation
The method does actually return an array of elements of type `App`, so `pages` is quite a disingenuous name, all the more problematic now that we have a concept of imported app 'docs' (synonymous with pages). Renaming 'pages' to 'apps' makes it clearer what object types we are dealing with.
We're already checking if a repo is private in the 'consume_docs_folder' method in AppDocs, but this will be removed in the next commit. It's safer to move the check to a lower level in the code, to ensure that there's no possible way we can accidentally retrieve private information through these methods. I had to stop using `app_name` and start using `github_repo_name` for `govuk-content-store-examples`, which is actually called `govuk-content-schemas` on GitHub. I also had to explicitly reject private repos from ProxyPages and api_layout, as otherwise the build would error with `RuntimeError: alphagov/govuk-knowledge-graph not found` (which is a private repository). Hopefully we can continue to refactor govuk-developer-docs to be less brittle around this.
Since #2910, _all_ apps have their `docs/` folder imported. We have not identified a reason why any of them should be opted out, so to simplify the code & tests, we've now removed the relevant code.
- Move govuk-content-schemas check later on in the task, as it's quite slow. - Suppress the very noisy `HTTP.get` output from the `verify_deployable_apps` and `check_puppet_names` tasks, so it's easier to figure out where in the task we are. - Remove the `/tmp/govuk-content-schemas` before cloning, to a) stop the task from borking when this dir already exists, and b) ensure it is always up to date. - Clarify that you do now NEED a GitHub auth token to build or test govuk-developer-docs, given that there are so many API requests made now that we import external documentation. Documents a useful shortcut for retrieving your auth token without exposing it in your bash history.
These tests would often fail if the cache exists already (from an earlier test run or build). In the case of document_types_spec.rb, the keys would sometimes return in a different order, so I've now arranged them alphabetically before asserting them. And in the case of the CSV tests, I've bypassed having to worry about the cache and simply mock the method calls rather than stub the network requests. This has led to easier to read tests!
The way these tests were written meant that the cache got cleared when the tests were run, so if we'd just gone through the expensive (many requests, multiple minutes) build process, we'd have to go through it allllll again to get requests in the cache. We can now build the project from scratch (i.e. make lots of HTTP requests), then run the test suite, then build the project again (this time with no HTTP requests, as the requests are cached), and it all works nicely.
There's quite a lot of mocking going on, but I think this is an improvement on what we had previously (you no longer need to scroll in order to read the entire test, for a start!). This test is mainly about how the responses from network requests and method calls are combined into an array result, so rather than do all the complicated setup to get those method calls returning something we can test against, it's better to simply mock the return values of those method calls.
d55bed1
to
6f53c71
Compare
spec/app/github_repo_fetcher_spec.rb
Outdated
def public_repo | ||
double("Public repo", private_repo?: false, default_branch: "master") | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really following the explanation in 5acf5ad, aren't let
s reinitialised for every test which uses them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just tried switching it back to a let
, and it worked! I found it did work when running github_repo_fetcher_spec.rb
directly, but would fail when running the entire test suite. However, I've just tried a couple of runs and it seems to pass every time, so some wires must have got crossed somewhere.
I've now fixed up the commit to use a let
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm I think it just happens intermittently. An unrelated change just failed:
Failures:
1) GitHubRepoFetcher#repo fetches a repo from cache if it exists
Failure/Error: all_alphagov_repos.find { |repo| repo.name == repo_name } || raise("alphagov/#{repo_name} not found")
#<Double "Public repo"> received unexpected message :name with (no args)
# ./app/github_repo_fetcher.rb:8:in `block in repo'
# ./app/github_repo_fetcher.rb:8:in `each'
# ./app/github_repo_fetcher.rb:8:in `find'
# ./app/github_repo_fetcher.rb:8:in `repo'
# ./spec/app/github_repo_fetcher_spec.rb:19:in `block (3 levels) in <top (required)>'
2) GitHubRepoFetcher#repo raises error if no repo is found
Failure/Error:
expect {
GitHubRepoFetcher.instance.repo("something-not-here")
}.to raise_error(StandardError)
expected StandardError, got #<RSpec::Mocks::ExpiredTestDoubleError: #<Double "Public repo"> was originally created in one example...only last for one example, and you need to create a new one in each example you wish to use it for.> with backtrace:
# ./app/github_repo_fetcher.rb:8:in `block in repo'
# ./app/github_repo_fetcher.rb:8:in `each'
# ./app/github_repo_fetcher.rb:8:in `find'
# ./app/github_repo_fetcher.rb:8:in `repo'
# ./spec/app/github_repo_fetcher_spec.rb:34:in `block (4 levels) in <top (required)>'
# ./spec/app/github_repo_fetcher_spec.rb:33:in `block (3 levels) in <top (required)>'
# ./spec/app/github_repo_fetcher_spec.rb:33:in `block (3 levels) in <top (required)>'
3) GitHubRepoFetcher#repo fetches a repo from GitHub if it doesn't exist in the cache
Failure/Error: all_alphagov_repos.find { |repo| repo.name == repo_name } || raise("alphagov/#{repo_name} not found")
#<Double "Public repo"> was originally created in one example but has leaked into another example and can no longer be used. rspec-mocks' doubles are designed to only last for one example, and you need to create a new one in each example you wish to use it for.
# ./app/github_repo_fetcher.rb:8:in `block in repo'
# ./app/github_repo_fetcher.rb:8:in `each'
# ./app/github_repo_fetcher.rb:8:in `find'
# ./app/github_repo_fetcher.rb:8:in `repo'
# ./spec/app/github_repo_fetcher_spec.rb:27:in `block (3 levels) in <top (required)>'
4) GitHubRepoFetcher#readme retrieves the README content from the GitHub CDN
Failure/Error: all_alphagov_repos.find { |repo| repo.name == repo_name } || raise("alphagov/#{repo_name} not found")
#<Double "Public repo"> was originally created in one example but has leaked into another example and can no longer be used. rspec-mocks' doubles are designed to only last for one example, and you need to create a new one in each example you wish to use it for.
# ./app/github_repo_fetcher.rb:8:in `block in repo'
# ./app/github_repo_fetcher.rb:8:in `each'
# ./app/github_repo_fetcher.rb:8:in `find'
# ./app/github_repo_fetcher.rb:8:in `repo'
# ./app/github_repo_fetcher.rb:14:in `readme'
# ./spec/app/github_repo_fetcher_spec.rb:70:in `block (3 levels) in <top (required)>'
5) GitHubRepoFetcher#readme returns nil if no README exists
Failure/Error: all_alphagov_repos.find { |repo| repo.name == repo_name } || raise("alphagov/#{repo_name} not found")
#<Double "Public repo"> was originally created in one example but has leaked into another example and can no longer be used. rspec-mocks' doubles are designed to only last for one example, and you need to create a new one in each example you wish to use it for.
# ./app/github_repo_fetcher.rb:8:in `block in repo'
# ./app/github_repo_fetcher.rb:8:in `each'
# ./app/github_repo_fetcher.rb:8:in `find'
# ./app/github_repo_fetcher.rb:8:in `repo'
# ./app/github_repo_fetcher.rb:14:in `readme'
# ./spec/app/github_repo_fetcher_spec.rb:90:in `block (3 levels) in <top (required)>'
Finished in 1.06 seconds (files took 2.33 seconds to load)
1488 examples, 5 failures
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's unfortunate. Well I'm happy to approve a PR restoring the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly I just tried it and the issue still persists: https://github.com/alphagov/govuk-developer-docs/runs/1733868114
I'll have to rearchitect the tests I think.
We had some doubles that were duplicated in places, and an unnecessary `let(:repo)` that was only used by one test.
Having the `App` class nested inside `AppDocs` made it harder to test, and a much longer file to read and understand. The functionality of both the `AppDocs` and `App` classes has now grown to the point where it makes sense to treat them as disparate classes in their own right.
Now that we have a concept of 'documentation from a given app's repo', the term 'AppDocs' is misleading - it suggests that it is that. When actually it is a collection of what we now call `App`s (as per the class in `app.rb`), and docs are separate. To attempt to remove the confusion, I've renamed the class. I'll rename some of its methods next.
`.apps` made sense when this was called `AppDocs`, but now is just a repetition of the class name. `.all` is more appropriate, and follows traditional Railsy naming conventions.
Duplication of the word 'app' is unnecessary.
In several places, we needed to just retrieve the public repos. So I've encapsulated this in its own method.
We had an unnecessary `active_app_pages` helper method, where a simple scope on the `Applications` class would do. We also had duplicated filtering logic in several places, all of which just wanted to retrieve the non-retired apps.
We do have distinct concepts of `app_name` vs `repo_name`; often they are the same thing, but can be overridden in config, e.g. https://github.com/alphagov/govuk-developer-docs/blob/91185cc86710ccf39f170f3743a2e3f9b5c12f7d/data/applications.yml#L341-L342 GitHubRepoFetcher does indeed rely on the repo name rather than the app name, so we should make that clearer in the code.
53c779a
to
2a7db13
Compare
In theory, a `let` should create a new variable each time, but we're seeing errors like: ``` GitHubRepoFetcher#repo fetches a repo from GitHub if it doesn't exist in the cache Failure/Error: all_alphagov_repos.find { |repo| repo.name == repo_name } || raise("alphagov/#{repo_name} not found") #<Double "Public repo"> was originally created in one example but has leaked into another example and can no longer be used. rspec-mocks' doubles are designed to only last for one example, and you need to create a new one in each example you wish to use it for. # ./app/github_repo_fetcher.rb:8:in `block in repo' # ./app/github_repo_fetcher.rb:8:in `each' # ./app/github_repo_fetcher.rb:8:in `find' # ./app/github_repo_fetcher.rb:8:in `repo' # ./spec/app/github_repo_fetcher_spec.rb:27:in `block (3 levels) in <top (required)>' ``` We'll try switching to a method call so that it creates a new one every time. See discussion: #2918 (comment)
In theory, a `let` should create a new variable each time, but we're seeing errors like: ``` GitHubRepoFetcher#repo fetches a repo from GitHub if it doesn't exist in the cache Failure/Error: all_alphagov_repos.find { |repo| repo.name == repo_name } || raise("alphagov/#{repo_name} not found") #<Double "Public repo"> was originally created in one example but has leaked into another example and can no longer be used. rspec-mocks' doubles are designed to only last for one example, and you need to create a new one in each example you wish to use it for. # ./app/github_repo_fetcher.rb:8:in `block in repo' # ./app/github_repo_fetcher.rb:8:in `each' # ./app/github_repo_fetcher.rb:8:in `find' # ./app/github_repo_fetcher.rb:8:in `repo' # ./spec/app/github_repo_fetcher_spec.rb:27:in `block (3 levels) in <top (required)>' ``` We'll try switching to a method call so that it creates a new one every time. See discussion: #2918 (comment)
We're intermittently seeing errors like: ``` GitHubRepoFetcher#repo fetches a repo from GitHub if it doesn't exist in the cache Failure/Error: all_alphagov_repos.find { |repo| repo.name == repo_name } || raise("alphagov/#{repo_name} not found") #<Double "Public repo"> was originally created in one example but has leaked into another example and can no longer be used. rspec-mocks' doubles are designed to only last for one example, and you need to create a new one in each example you wish to use it for. # ./app/github_repo_fetcher.rb:8:in `block in repo' # ./app/github_repo_fetcher.rb:8:in `each' # ./app/github_repo_fetcher.rb:8:in `find' # ./app/github_repo_fetcher.rb:8:in `repo' # ./spec/app/github_repo_fetcher_spec.rb:27:in `block (3 levels) in <top (required)>' ``` I've tried switching `let(:public_repo)` for a method (see discussion): #2918 (comment) ...but that did not have the intended effect. I believe we need to stop manipulating the `CACHE` global itself to fix our tests.
We're intermittently seeing errors like: ``` GitHubRepoFetcher#repo fetches a repo from GitHub if it doesn't exist in the cache Failure/Error: all_alphagov_repos.find { |repo| repo.name == repo_name } || raise("alphagov/#{repo_name} not found") #<Double "Public repo"> was originally created in one example but has leaked into another example and can no longer be used. rspec-mocks' doubles are designed to only last for one example, and you need to create a new one in each example you wish to use it for. # ./app/github_repo_fetcher.rb:8:in `block in repo' # ./app/github_repo_fetcher.rb:8:in `each' # ./app/github_repo_fetcher.rb:8:in `find' # ./app/github_repo_fetcher.rb:8:in `repo' # ./spec/app/github_repo_fetcher_spec.rb:27:in `block (3 levels) in <top (required)>' ``` I've tried switching `let(:public_repo)` for a method (see discussion): #2918 (comment) ...but that did not have the intended effect. I believe we need to stop manipulating the `CACHE` global itself to fix our tests.
Fixes #2940. Bug introduced in #2918. Tests were sensitive to their order of execution. State from a previous run of tests can remain on the `GitHubRepoFetcher` singleton object. The Singleton pattern is somewhat discouraged [1] [2], though it can be useful to refer to a common instance of a class where its fetch methods are computationally expensive and we rely heavily on caching. This commit follows the guidance in the second referenced article, to retain the Singleton presentation of the class but also to allow it to be instantiated (making it much easier to test). I've also added a test to capture and verify the intended `.instance` singleton behaviour. [1]: https://www.rubyguides.com/2018/05/singleton-pattern-in-ruby/ [2]: https://8thlight.com/blog/josh-cheek/2012/10/20/implementing-and-testing-the-singleton-pattern-in-ruby.html
This PR tackles some of the technical debt that has built up over time. See commits for details.