PDFs and other non-HTML documents #161

Open
sarawilcox opened this issue Sep 9, 2019 · 27 comments
Labels
content: Goes into the 'Content' section of the service manual
inclusion: Makes our products and services more inclusive

Comments

@sarawilcox
Collaborator

sarawilcox commented Sep 9, 2019

Use this issue to discuss the PDFs and other non-HTML documents page in the service manual

This issue started with a proposal that teams share their experience and best practice guidance at the September 2019 Style Council and that we should consider whether our current service manual page needs changing: https://service-manual.nhs.uk/content/pdfs and https://service-manual.nhs.uk/content/links

@sarawilcox
Collaborator Author

Discussed at September Style Council meeting - notes available. To be discussed further at October meeting.

@JackMatthams
Collaborator

Some draft guidance on PDFs, covering why we avoid them and when they may be used, was proposed at the October content style council. It is a work in progress and subject to change.

Some possible actions:

  • Draft a clearer explanation for the style guide about why we avoid PDFs
  • Draft criteria for the style guide to help decide when we will and won't publish PDFs
  • Draft a checklist for what makes an accessible PDF – should we?
  • Agree and publish a PDF/A template as a short-term option – should we?
  • Check that our current guidance on linking to PDFs is sufficient

Also:

  • Research online PDF checkers - is there anything we can recommend?
  • Independent assessment of PDFs?

On why we should avoid PDFs:
"We avoid PDFs because:

  • they're not able to meet the range of users' accessibility needs
  • they give people a poor user experience, especially on mobile
  • many browsers, tools and extensions do not work with them – they often have problems with zoom, scroll, audio, image and keyboard navigation
  • they take users away from the website, opening in a new tab, window or software – and not all users have the right software
  • they are hard to maintain and update, so users may get out of date and unreliable content
  • if users find PDFs in search results, they get them without any context or supporting material
  • search engines do not rank PDFs high in search results
  • it’s difficult to collect analytics data on how people interact with PDFs and that makes it difficult to identify problems and improve them"

On where we may use PDFs:

"New PDFs
Wherever possible, we avoid PDFs. Instead we create content as structured web pages in HTML.

There are a few cases where we need to publish a PDF, for example:

  • downloads of reports or publications, where a web page version is also available
  • where there's a legal or regulatory requirement to have a formal, signed document
  • for downloads designed for printing, such as posters
  • for niche audiences where there is a clear user need for special formats like Easy Read

If you do create a new PDF, it must meet the PDF/A standard, which is more accessible. (Follow the GOV.UK guidance on publishing accessible documents.)

Old PDFs
Under the new accessibility regulations, you don't need to fix PDFs or other documents published before 23 September 2018."

@JackMatthams
Collaborator

The NHS Digital Website Team provided some useful feedback on their own approach and guidance to PDFs and PDF/A templates:

"PDF/A is a PDF format that you can save in, rather than a template.

It saves information about formatting as well as content in a way that other programs can read - for example, coding header structure correctly. This means it is the best PDF format for accessibility.

To check PDFs, we save them as PDF/A and run the Adobe Acrobat accessibility report to identify any issues. We can then fix lots of these in the checker. For example, you can step through the document adding either 'decorative image' labels or alt-text so that images in the document are treated correctly by screen readers."

"We're wary of making templates to help people make accessible PDFs, as it might encourage them to use that format when we know PDFs will never be as accessible as HTML pages.

However, we are planning to work on some corporate document templates in Word, which use basic, properly marked up headings etc. These should be easier to:

  • bring into the CMS
  • make accessible PDFs from
  • use in Word format

The work should start in the native program that the doc was created in, and there are accessibility checkers in Word now."

@sarawilcox
Collaborator Author

New PDF page to be published Monday 18 November.

@sarawilcox
Collaborator Author

Published.

@sarawilcox added the "content" label (Goes into the 'Content' section of the service manual) on Oct 21, 2020
@torchboxjen

I love this page and share it when explaining why I don’t recommend publishing content in PDF form. However, there is one point that I don’t believe is accurate, and could be improved:

search engines do not rank PDFs high in search results

Search engines crawl, index and rank PDFs in the same way as an HTML page, and PDFs do rank highly when they contain information relevant to the user’s search query, e.g.:

nhs providers march 2020 budget

There are also thousands of clicks on PDFs in organic search for the NHS website.

I would suggest changing it to (something like) these two bullet points:

search engines can rank PDF content highly in search results, when the PDF is the most relevant result for a user's search query. However, often other HTML webpages rank higher than PDFs because they offer a better landing page experience for the user, and they are written and optimised for users seeking the page through search.

PDFs are not ideal landing pages because users cannot see information about who the publisher is, and they cannot navigate the rest of the site. It is also difficult to implement analytics tracking on PDFs, which makes them difficult to measure as part of your users' journeys.

Another reason not to use PDFs: many organisations aren’t tracking PDF clicks through search, because they are not tracked in Google Analytics by default (not sure if it’s the same for Adobe). So, PDFs don’t count in pageviews and sessions data. Many sites only track button clicks, not PDF page loads, which excludes users arriving from outside the site (e.g. from organic search and direct traffic). (NB: you can still see this information in Google Search Console.)

@sarawilcox
Collaborator Author

I'm afraid we'll have to hold this one over till January @torchboxjen. Hope that's OK.

@sarawilcox
Collaborator Author

Latest GDS guidance on PDFs

From a GOV.UK Basecamp post, 4 December 2020

Hi everyone,

We have now updated GOV.UK guidance to make it clear how to publish accessible documents on GOV.UK.

We have updated the guidance to say:

  • you must publish HTML if the document is designed to be viewed on a screen
  • if the document is designed to be edited, you must publish in an open document format

If the document is designed to be printed, for example, a flyer, you can publish a PDF. However, you must publish an accessible version with it - either in HTML or OpenDocument.
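As a rough illustration only, the three rules above could be written as a small lookup. This is a sketch, not part of the GOV.UK guidance: the names `Purpose` and `formats_to_publish` are hypothetical, and the format labels are just strings chosen for the example.

```python
from enum import Enum
from typing import List


class Purpose(Enum):
    """What the document is designed for (hypothetical names for this sketch)."""
    SCREEN = "viewed on a screen"
    EDIT = "edited"
    PRINT = "printed"


def formats_to_publish(purpose: Purpose) -> List[str]:
    """Return the formats the quoted guidance asks for, as plain labels."""
    if purpose is Purpose.SCREEN:
        # Designed to be viewed on a screen: publish HTML.
        return ["HTML"]
    if purpose is Purpose.EDIT:
        # Designed to be edited: publish in an open document format.
        return ["OpenDocument"]
    # Designed to be printed: a PDF is allowed, but an accessible
    # version (HTML or OpenDocument) must be published with it.
    return ["PDF", "HTML or OpenDocument"]


print(formats_to_publish(Purpose.PRINT))
```

The point of the sketch is that the printed case always returns two entries: the PDF never stands alone.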

The guidance explains this in more detail:

There has been a lot of conversation about publishing accessible documents on GOV.UK, especially around publishing PDFs. While it is possible to make PDFs somewhat accessible, we’ve seen lots of PDFs published on GOV.UK which are not as accessible as possible.

It takes a lot of work to make a PDF accessible and you may be breaking the law if you publish a PDF or other non-HTML document without an accessible version.

Why PDFs may not be accessible for everyone
If you must publish a PDF, it must have:

  • a logical structure based on tags and headings
  • meaningful document properties
  • readable body text
  • good colour contrast
  • text alternative for images

Even with all those, a PDF may not be accessible for every user. As we shared in a blog post, there’s still no guarantee that PDF content will meet the accessibility needs of users and their technology.

Some users need to change browser settings such as colours and text size to make web content easier to read. It’s difficult to do this for content in PDFs. You can magnify the file, but the words might not wrap and the font might pixelate, making for a poor user experience.

Some users need to view information on mobile devices.

Locking content into a PDF limits the ability for people to make these kinds of accessibility customisations.

Creating open documents
HTML attachments can be created from source documents like Word or ODT files using the govspeak preview app, and you can use a table generator to convert an Excel document, Google sheet or web-page table to Markdown.

Also, HTML attachments can now be printed by users if needed.

When it’s not technically possible to publish in open format
We also recognise that there may be some scenarios where technical constraints of our publishing tools make it impossible to publish the document in HTML or OpenDocument. If you identify such a scenario, please let us know so we can prioritise the updating of the publishing tools.

We understand the pressures of publishing at pace
We understand that some departments may not have the capacity to do this immediately. However, we hope that this change to guidance will help you push for changes to your publishing workflows as each department is responsible for the accessibility of their own content.

Tobi Ogunsina
GOV.UK Accessibility Team

@sarawilcox
Collaborator Author

sarawilcox commented Dec 7, 2020

We need to review this at an upcoming Style Council meeting, along with @torchboxjen's comments above.

We've taken on board some comments from the NHS Digital website team, NHS.UK product lead and Alistair Duggin (re accessibility and PDFs).

@sarawilcox
Collaborator Author

January Style Council meeting

We reviewed an updated version of our PDF guidance at the meeting and agreed to publish it.

Since the meeting, we've had some further comments.

The latest version of the page is available for review in the NHS.UK Slack content channel and we've sent it to 2 comms teams for feedback. We'll then send it off for clinical approval before updating the guidance in the service manual.

@sarawilcox
Collaborator Author

Sent to clinical team for sign off.

@sarawilcox
Collaborator Author

sarawilcox commented Feb 25, 2021

Guidance updated. Move to Done for now.

@sarawilcox
Collaborator Author

Message from @cjforms on this issue: nhsuk/nhsuk-service-manual#928.

Exactly. I only recently learned about the problems of using ODFs rather than PDFs or HTML for forms.

Of course I'm in favour of using HTML when that's feasible, but I'm definitely afraid that colleagues will interpret this guidance as 'you can't publish a .PDF form' and therefore decide not to publish, or to restrict their choice to paper (which is highly inaccessible for some people, although I still defend paper as an acceptable choice for others).

I agree with Alistair. In fact, I think we all agree. Only I don't think the advice makes it clear.

Here's what we agree on:

  • If your form is currently only on paper, then you must make an accessible version as well.
  • If you can, use an HTML form because this is the easiest to make accessible.
  • If you can't yet make an HTML form, then an accessible PDF is an acceptable option in the short term.

Can we make the guidance have a specific section on forms that aligns with that?

@cjforms

cjforms commented Feb 26, 2021

Thanks @sarawilcox

I'll keep an eye on this thread too, and will ask my Defra colleague Martin Glancy to have a look. He's the person who has been investigating this problem recently and who brought it to my attention.

@sarawilcox changed the title from "PDFs - challenges and recommended practice" to "PDFs and other non-HTML documents" on Apr 9, 2021
@sarawilcox
Collaborator Author

sarawilcox commented May 5, 2021

Comment from Alistair Duggin:
We offer 2 options:

  • create an HTML alternative
  • delete the PDF

There is a 3rd option, which is to make sure that the PDF is compliant. It is technically possible to create a PDF that meets WCAG 2.0 and it is legal to publish a PDF if it meets WCAG (though it may not be fully inclusive). Would we consider offering this option, given that it is hard to check and may involve paying for the services of a PDF specialist?

Note: re changing fonts and colours, it depends how the PDF is created. See blog post on GOV.UK re pros and cons of PDFs. It is possible to provide support for high contrast and alternative foreground and background colours, though most people won't be able to change them. It is possible to zoom into a PDF and to reflow content into a single column but not to resize text.

Sara to discuss with NHSD website team.

https://www.adobe.com/accessibility/pdf/pdf-accessibility-overview.html

@cjforms

cjforms commented May 5, 2021

Ref Alistair Duggin's comment:

If the document is a form, then the 3rd option is especially important and may be the best one.

If you delete the form, then you may be pushing the user back to using paper which is a useful format but definitely not the most convenient for most people.

@sarawilcox
Collaborator Author

Note also this comment on the PDFs section in the accessibility guidance: #347 (comment)

@sarawilcox added the "inclusion" label (Makes our products and services more inclusive) on May 27, 2021
@sarawilcox
Collaborator Author

@mcheung-nhs has suggested clarifying that open document format means the OpenDocument (.odt) format.

@sarawilcox
Collaborator Author

We get some questions about Mb or MB. The content style guide says MB.

A comment on NHS.UK Slack from @Ross-Clark:
"Mb and MB have different meanings to some
Mb being a Megabit and MB being a MegaByte.
Megabits are used typically for one of the units in things like file transfer & internet speed
MegaBytes are typically used to measure file sizes."
So MB is correct for file sizes.
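To make the difference concrete, here is a small worked example. It relies only on the fact that 1 byte is 8 bits; the function names are hypothetical and the figures are illustrative, not taken from the style guide.

```python
BITS_PER_BYTE = 8


def megabytes_to_megabits(megabytes: float) -> float:
    """Convert a file size in megabytes (MB) to megabits (Mb)."""
    return megabytes * BITS_PER_BYTE


def transfer_seconds(file_size_mb: float, speed_mbps: float) -> float:
    """Time to move a file of `file_size_mb` MB over a link of `speed_mbps` Mb per second."""
    return megabytes_to_megabits(file_size_mb) / speed_mbps


# A 5MB file is 40 megabits, so on an 8 Mb per second connection
# it takes 5 seconds, not the 0.625 seconds you would get by
# treating MB and Mb as the same unit.
print(megabytes_to_megabits(5))   # 40
print(transfer_seconds(5, 8))     # 5.0
```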

@sarawilcox
Collaborator Author

Current guidance on linking to PDFs is as follows.

If we need to link to a PDF, we:

  • open it in the same tab
  • add "PDF, [file size in MB or KB]" in brackets to the end of the link text, for example - "weight loss progress chart (PDF, 545KB)"

Proposal to Style Council, December 2021: Given that our guidance is to avoid linking to PDFs and we would only link if the information is only available as PDF, we suggest adding the word “only”:

PDF only, [file size in MB or KB]

Action: Sara to check that this would read as “PDF only, 5MB” not “PDF only 5MB”.

Approved by Style Council, subject to above check. Sara Wilcox to update the content style guide.
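For teams that generate link text in code, the format discussed above could be sketched like this. It is a minimal sketch under stated assumptions: the helper names `format_file_size` and `pdf_link_text` are hypothetical, 1KB is taken to be 1024 bytes (the thread does not say which convention to use), and sizes of 1MB or more are shown to at most one decimal place.

```python
def format_file_size(size_bytes: int) -> str:
    """Return a size such as '545KB' or '5MB' (assumes 1KB = 1024 bytes)."""
    kilobytes = size_bytes / 1024
    if kilobytes < 1024:
        return f"{round(kilobytes)}KB"
    megabytes = kilobytes / 1024
    # At most one decimal place, with a trailing ".0" dropped: 5.0 -> "5"
    return f"{megabytes:.1f}".rstrip("0").rstrip(".") + "MB"


def pdf_link_text(link_text: str, size_bytes: int, only: bool = False) -> str:
    """Append '(PDF, size)' or '(PDF only, size)' to the link text."""
    label = "PDF only" if only else "PDF"
    return f"{link_text} ({label}, {format_file_size(size_bytes)})"


print(pdf_link_text("weight loss progress chart", 545 * 1024))
# weight loss progress chart (PDF, 545KB)
print(pdf_link_text("annual report", 5 * 1024 * 1024, only=True))
# annual report (PDF only, 5MB)
```

Note the comma after "PDF only", which is the detail the action above asks to be checked.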

Note also @mcheung-nhs's comment about ODT - to do.

@sarawilcox
Collaborator Author

@mcheung-nhs confirmed that VoiceOver, NVDA and JAWS all read it with a slight pause, although it would not be much of a pause if the speaking rate was set to fast. Also note that NVDA/Firefox read the 5MB as “5 megabytes” whereas the others said “5 M B”.

@sarawilcox
Collaborator Author

Added the word "only" to guidance on avoiding linking to PDFs on the Links page.

@sarawilcox
Collaborator Author

@mcheung-nhs has suggested clarifying that open document format means the OpenDocument (.odt) format.

Hi @mcheung-nhs, I wonder if we should specify Open Document Formats (ODFs) rather than .odt specifically? https://www.gov.uk/guidance/using-open-document-formats-odf-in-your-organisation

@mcheung-nhs

Hi @sarawilcox - yes you're probably right, as the .odt is an instance of an ODF but there are others such as .ods and .odp
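As a small illustration of the point that .odt is only one of several ODF file types, the three extensions mentioned here could be recognised like this. The function name `odf_kind` is hypothetical and the mapping covers only the extensions named in this thread.

```python
from pathlib import PurePath
from typing import Optional

# The three ODF extensions mentioned above and the kind of document each holds.
ODF_EXTENSIONS = {
    ".odt": "text document",
    ".ods": "spreadsheet",
    ".odp": "presentation",
}


def odf_kind(filename: str) -> Optional[str]:
    """Return the kind of ODF document, or None if the file is not one of these."""
    return ODF_EXTENSIONS.get(PurePath(filename).suffix.lower())


print(odf_kind("annual-report.odt"))  # text document
print(odf_kind("leaflet.pdf"))        # None
```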

@sarawilcox
Collaborator Author

Latest from GOV.UK

Dear GOV.UK publishers,

If you publish annual reports or any other documents in PDFs, you also need to publish them in HTML. If you only upload the PDF version, you are breaking the law.

GOV.UK’s policy is HTML first. You should only publish a PDF if the document is designed to be printed, such as a leaflet or booklet.

Tips and advice

Here are some ways to help you convert your PDFs to HTML:

There is more information in the publishing accessible documents guidance.

GOV.UK Find and View Accessibility team

@sarawilcox
Collaborator Author

sarawilcox commented May 16, 2022

More from GOV.UK

We have found that no matter what you do with PDFs, there are certain things that cannot be done to make them as accessible as possible.

For example, you cannot change the background or font colours in a PDF which some low vision users need to do to access the document.

PDFs also make it very hard for magnification users to access the document.

The best way to make sure your document includes everyone is to publish it in HTML.

The WCAG guidelines fall under the Public Sector Bodies (Websites and Mobile Applications) (No. 2) Accessibility Regulations 2018.

However, we also have a legal obligation to provide equal access to people with disabilities under the Equality Act 2010.

See Government Content Community Basecamp for comments on this guidance (for people with access to Basecamp: https://3.basecamp.com/4322319/buckets/15005645/messages/4938632689).

@cjforms

cjforms commented May 17, 2022

I'm greatly in favour of an html-first policy, but I'm also bewildered by some of the anti-.pdf rhetoric. It's just a technology and it has its place.

There are lots of health reasons why printable documents are convenient or important, and .pdf is a helpful way to get the document to a place where it can be printed on demand - or in bulk, for that matter. For example:

  • as a clinician, it's helpful to be able to give a leaflet to a patient there and then in clinic.
  • as the supporter/carer for a patient, I have access to the internet but the patient doesn't. I want to be able to print a copy so that I can read it to the patient or so that the patient can read it at their convenience.

Yes, it is possible to print from html. But often, it's not all that nice when it is printed.

Also .pdf is often more convenient for preserving a document - a need that is common amongst researchers.

Also, why are we saying 'no decorative images'? Why not? Obviously they need to be tested like any other images but what's wrong with alleviating the gloom a bit? Why can't something be cheered up a bit by an appropriate decorative image? Let's get some emotion back into this.


5 participants