This repository has been archived by the owner. It is now read-only.

<speak> tag #1682

briannewbold opened this issue Feb 28, 2019 · 2 comments


commented Feb 28, 2019

Premise: Current HTML tags are geared toward text-based VISUAL consumption. Audio or a/v content requires embedded media files, which may not accurately match the text typed on the HTML page. The text may also not be enunciated as typed, or may otherwise be inaccessible in a precise manner to visually impaired individuals. Although other methods are available to invoke spoken text from HTML pages, they are not consistent with one another. Furthermore, when using global “speak” or page-based TTS solutions, context or even content may be lost due to markup added later, such as interstitial marketing ads or inline links to suggested additional reading. A specific tag would provide precise delineation of what is to be spoken.

Suggestion: I suggest introducing a new tag called “speak”.
This tag would instruct the browser to speak the enclosed text aloud, using TTS (text to speech) or other functionality determined by the browser itself.

Additional suggestions:
text=“” to define alternate text; if present, it is the text that is spoken. The visible text between the tags remains the same.
type=onclick, onload, etc. to define the browser interaction that triggers speech.
language=English, French, etc. to define the spoken language if it is not the browser default.
highlight=word, letter, wordunderline, letterunderline, etc. to define how the word currently being spoken is highlighted.
pip=short, long to define the length of the audio pip cue when the tag is in the onhover state.
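
To make the intended behaviour of the text, type and language attributes concrete, here is a rough TypeScript sketch of how a browser or script might resolve them into a speech request. Everything in it is hypothetical: the interface names, the `resolveSpeechRequest` function, the small language table and the default trigger of "onclick" are assumptions for illustration, not part of any specification.

```typescript
// Hypothetical model of the proposed attributes; none of these names
// exist in any HTML specification.
interface SpeakAttributes {
  text?: string;     // alternate text to speak instead of the visible text
  type?: string;     // proposed trigger, e.g. "onclick" or "onload"
  language?: string; // proposed language name, e.g. "French"
}

interface SpeechRequest {
  text: string;             // the text that would actually be spoken
  lang: string | undefined; // language tag, or undefined for the browser default
  trigger: string;          // interaction that starts speech
}

// Assumed mapping from the proposal's language names to BCP 47 language tags.
const LANGUAGE_TAGS: Record<string, string> = { english: "en", french: "fr" };

function resolveSpeechRequest(
  visibleText: string,
  attrs: SpeakAttributes
): SpeechRequest {
  // A non-empty text attribute overrides the visible text for speech only.
  const alternate = attrs.text?.trim();
  const text = alternate ? alternate : visibleText.trim();
  // Unknown or missing language names fall back to the browser default.
  const lang = attrs.language
    ? LANGUAGE_TAGS[attrs.language.toLowerCase()]
    : undefined;
  // Defaulting to "onclick" is an assumption; the proposal names no default.
  return { text, lang, trigger: attrs.type ?? "onclick" };
}
```

For example, `resolveSpeechRequest("Dr.", { text: "Doctor", language: "French" })` would request that "Doctor" be spoken in French while "Dr." stays on screen.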

Example use case(s) and intended audience:
Speaking page contents:
Blind individuals, who otherwise cannot use browser gestures to locate and, more precisely, select and play content (text).

Click to speak, for direct consumption:
Children learning to read.
Foreign language learning.

Click to speak, for communication to others:
Individuals with autism and selective mutism
Other disabilities requiring selection of words to speak

Additional uses: spoken notation, citation or pronunciation of selected visible text.

Thanks for your consideration!
Brian A. Newbold


commented Mar 1, 2019

Thanks for suggesting this. Some initial thoughts...

The Web Speech API is able to do this already, so a new HTML element may not be needed.
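
As a rough illustration, speaking a piece of text with the Web Speech API might look like the TypeScript sketch below. It relies on the API's `speechSynthesis.speak()` method and `SpeechSynthesisUtterance` constructor; the `speakText` helper and the two small interfaces are hypothetical names, added so the sketch does not depend on DOM typings and returns false where speech synthesis is unavailable.

```typescript
// Minimal structural types (assumed for this sketch) describing the parts
// of the Web Speech API that are used below.
interface UtteranceLike {
  text: string;
  lang: string;
}

interface SpeechEnv {
  speechSynthesis?: { speak(utterance: UtteranceLike): void };
  SpeechSynthesisUtterance?: new (text: string) => UtteranceLike;
}

// Speaks `text` if speech synthesis is available in `env`.
// Returns true if an utterance was queued, false otherwise.
function speakText(
  text: string,
  lang?: string,
  env: SpeechEnv = globalThis as unknown as SpeechEnv
): boolean {
  const synth = env.speechSynthesis;
  const Utterance = env.SpeechSynthesisUtterance;
  if (!synth || !Utterance) {
    return false; // e.g. outside a browser, or in a browser without support
  }
  const utterance = new Utterance(text);
  if (lang) {
    utterance.lang = lang; // BCP 47 language tag such as "en" or "fr"
  }
  synth.speak(utterance);
  return true;
}
```

In a page, this could be called from a click handler, for example `speakText(element.textContent ?? "", "en")`. Tying it to a click is a reasonable default, since some browsers restrict speech that is not started by a user action.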

If it doesn't fulfill the use cases you have in mind, then the place to suggest and discuss new elements is the Web Incubator Community Group (WICG).

One use case probably doesn't hold up though. Blind people who need web content to be spoken will also need to have that capability at the OS level. If you can't get to the browser in the first place, having web content spoken is usually a moot point.


commented Jul 29, 2019

We're closing this issue on the W3C HTML specification because the W3C and WHATWG are now working together on HTML, and all issues are being discussed on the WHATWG repository.

If you filed this issue and you still think it is relevant, please open a new issue on the WHATWG repository and reference this issue (if there is useful information here). Before you open a new issue, please check for existing issues on the WHATWG repository to avoid duplication.

If you have questions about this, please open an issue on the W3C HTML WG repository or send an email to

@siusin siusin closed this Jul 29, 2019
