Commit

Update documentation for republishing Whitehall content
In alphagov/whitehall#6715, we have consolidated
a number of rake tasks into one.

This updates the documentation to reflect that and also makes minor
changes to make the documentation more readable.
brucebolt committed Aug 8, 2022
1 parent 3ed990c commit 332cee3
Showing 1 changed file with 49 additions and 15 deletions.
64 changes: 49 additions & 15 deletions source/manual/republishing-content.html.md.erb
@@ -6,13 +6,17 @@ layout: manual_layout
parent: "/manual.html"
---

Sometimes it may be necessary to republish content to make it show up on the website. For example if we make an update to [govspeak][govspeak-repo] that would require us to re-render and save new HTML for content.
Sometimes it may be necessary to republish content to the Publishing API. This will refresh the content on the website.

For example if we make an update to [govspeak][govspeak-repo] and a publishing application pre-renders that content prior to its submission to Publishing API, that would require us to re-render and save new HTML for content.

This process varies per app and requires

- Connection to the [VPN][vpn] and
- [Production access][production-access]

You may wish to test first on integration, prior to carrying out the republish in production.

## Whitehall

If the documents are in Whitehall, there are Rake tasks you can run as outlined below. Try to pick the one most focused to the scope of what you need to republish to avoid unnecessary load. You can monitor the effect on the publishing queue via these Grafana dashboards:
@@ -21,35 +25,65 @@ If the documents are in Whitehall, there are Rake tasks you can run as outlined
- [staging](https://grafana.blue.staging.govuk.digital/dashboard/file/sidekiq.json?refresh=1m&orgId=1&var-Application=whitehall&var-Interval=$__auto_interval)
- [production](https://grafana.blue.production.govuk.digital/dashboard/file/sidekiq.json?refresh=1m&orgId=1&var-Application=whitehall&var-Interval=$__auto_interval)

<%= RunRakeTask.links("whitehall", "publishing_api:republish:document_by_slug[slug]") %>

For organisations, use the `organisation_by_slug` rake task:
### To republish a single document by slug

<%= RunRakeTask.links("whitehall", "publishing_api:republish:organisation_by_slug[slug]") %>
<%= RunRakeTask.links("whitehall", "publishing_api:republish:document_by_slug[slug]") %>

For all of a single document type, use the `bulk_republish` rake task:
> Replace `slug` with the document's slug, but not the full base path. For example the slug for `https://www.gov.uk/government/important-news` would be `important-news`.
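
As an illustration of the slug rule above, here is a minimal Ruby sketch that takes the last path segment of a URL. The helper name `slug_from_url` is hypothetical and is not part of Whitehall. It assumes the slug is always the final segment of the base path, which matches the example given but may not hold for every document type.

```ruby
require "uri"

# Hypothetical helper (not part of Whitehall): returns the final path
# segment of a URL, assuming that segment is the document's slug.
def slug_from_url(url)
  URI.parse(url).path.split("/").reject(&:empty?).last
end

puts slug_from_url("https://www.gov.uk/government/important-news")
# prints "important-news"
```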

### To republish all documents of a specific type

To republish all instances of the following document types, run the following rake task.

- CaseStudy
- Consultation
- Contact
- CorporateInformationPage
- DetailedGuide
- DocumentCollection
- FatalityNotice
- Government
- NewsArticle
- OperationalField
- Organisation
- Person
- PolicyGroup
- Publication
- Role
- RoleAppointment
- Speech
- StatisticalDataSet
- StatisticsAnnouncement
- TakePartPage
- TopicalEvent
- TopicalEventAboutPage
- WorldLocation
- WorldwideOrganisation

<%= RunRakeTask.links("whitehall", "publishing_api:bulk_republish:document_type[DocumentClass]") %>

For example:
`/government/case-studies/alexander-dennis-maximum-capacity`
`publishing_api:bulk_republish:document_type[CaseStudies]`
> Replace `DocumentClass` with the camelized (i.e. as it is written above) class name.
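
To make "camelized" concrete, the sketch below turns a snake-case or space-separated name into the form used in the list above and builds the task string from it. The helpers `camelize_name` and `bulk_republish_task` are hypothetical names written in plain Ruby for illustration. The sketch does not check that the result is one of the supported document types.

```ruby
# Hypothetical helper: converts "case_study" or "case study" to "CaseStudy".
def camelize_name(name)
  name.split(/[\s_-]+/).map(&:capitalize).join
end

# Builds the rake task string for a given document type name.
def bulk_republish_task(name)
  "publishing_api:bulk_republish:document_type[#{camelize_name(name)}]"
end

puts bulk_republish_task("case_study")
# prints "publishing_api:bulk_republish:document_type[CaseStudy]"
```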

You may wish to test first on Integration.
### To republish multiple documents

For a short list of Content IDs, use the `documents_by_content_ids` rake task:
For a small number of documents, use the following rake task:

<%= RunRakeTask.links("whitehall", "publishing_api:bulk_republish:documents_by_content_ids[content_id_1 content_id_2]") %>

For a significant number of Content IDs, some preparation is needed for this as a CSV file needs to be in place.
> Replace `content_id_1` and `content_id_2` with the content ID (i.e. UUID) for the documents to republish. You can add more than 2 content IDs.

For a significant number of documents, a CSV file should be added to the repository:

1. The CSV should have a column called content_id that contains all the relevant IDS. This should be added to the whitehall repository at `lib/tasks/{FILENAME}.csv`.
1. Create a CSV file that contains a single column headed `content_id`. Put the content ID for each document on a separate line below this. The file should be saved in `lib/tasks/{FILENAME}.csv` and a PR raised.
1. Merge and deploy the PR to the relevant environment.
1. Run the `documents_by_content_ids_from_csv` rake task:
<%= RunRakeTask.links("whitehall", "publishing_api:bulk_republish:documents_by_content_ids_from_csv[csv_file_name]") %>
> Replace `csv_file_name` with the filename of the CSV, including the `.csv` extension.
1. After the job has completed, remove the CSV from the repository.
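
The CSV described in the first step could be generated with a short Ruby script such as the sketch below. The helper name `content_ids_csv`, the UUID check and the example content IDs are all assumptions made for illustration; only the `content_id` header and the one-ID-per-line layout come from the steps above.

```ruby
# Matches the usual 8-4-4-4-12 hexadecimal UUID layout.
UUID_PATTERN = /\A\h{8}-\h{4}-\h{4}-\h{4}-\h{12}\z/

# Hypothetical helper: returns CSV text with a `content_id` header and one
# content ID per line. Raises if any value does not look like a UUID.
def content_ids_csv(content_ids)
  ids = content_ids.map(&:strip).uniq
  invalid = ids.reject { |id| id.match?(UUID_PATTERN) }
  raise ArgumentError, "Not UUIDs: #{invalid.join(', ')}" unless invalid.empty?

  (["content_id"] + ids).join("\n") + "\n"
end

# Placeholder content IDs, not real documents.
example_ids = [
  "00000000-0000-4000-8000-000000000001",
  "00000000-0000-4000-8000-000000000002",
]

puts content_ids_csv(example_ids)
```

The returned string can then be written to the file, for example with `File.write("lib/tasks/{FILENAME}.csv", content_ids_csv(example_ids))`, replacing `{FILENAME}` with your chosen name.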

To republish all documents:
Caution: this is a lot of content and will take hours to complete. If it is possible to scope the republish do so and use a different task, but if you have made a change such as something in govspeak that will affect the majority of content, this is available. Before running this job confirm with Technical 2nd Line that they are happy for you to proceed as it could cause backed up publishing queues and alerts.
### To republish all documents

> Caution: this is a lot of content and will take hours to complete. If it is possible to scope the republish do so and use a different task, but if you have made a change such as something in govspeak that will affect the majority of content, this is available. Before running this job confirm with Technical 2nd Line that they are happy for you to proceed as it could cause backed up publishing queues and alerts.

<%= RunRakeTask.links("whitehall", "publishing_api:bulk_republish:all") %>
