allow defining characters to ignore in speech output
Florian Gyger committed Apr 7, 2020
1 parent c32441a commit d502ce7
Showing 5 changed files with 44 additions and 4 deletions.
5 changes: 5 additions & 0 deletions README.md
@@ -93,6 +93,7 @@ There are two ways to configure your AWS credentials:
| `awsCredentials` | No | `{ "accessKeyId": process.env.GATSBY_AWS_ACCESS_KEY_ID, "secretAccessKey": process.env.GATSBY_AWS_SECRET_ACCESS_KEY }` |
| `defaultSsmlTags` | No | `"<prosody rate='70%'>$SPEECH_OUTPUT_TEXT</prosody>"` |
| `defaultLexiconNames` | No | `["LexA", "LexB"]` |
| `ignoredCharactersRegex` | No | `/·/` |
| `speechOutputComponentName` | No | `"CustomComponent"` |

##### About `defaultSsmlTags`:
@@ -101,6 +102,10 @@ There are two ways to configure your AWS credentials:
- The surrounding `<speak>` tag is added automatically.
- The variable `$SPEECH_OUTPUT_TEXT` will be replaced with the speech output text.

##### About `ignoredCharactersRegex`:

If your text contains special characters that should be ignored while reading (e.g. `fear·ful` should be read as `fearful`), you can use the `ignoredCharactersRegex` option to define the characters to be ignored. Every match of the regex is removed from the text of each speech output block.
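
As a rough illustration of the option described above, the snippet below sets `ignoredCharactersRegex` to the middle dot and applies it to a sample word. The names `pluginOptions` and `applyIgnoredCharacters` are hypothetical and exist only for this sketch; the other plugin options and the wiring into `gatsby-config.js` are left out, and the helper only approximates the plugin's filtering by rebuilding the pattern with the global flag.

```typescript
// Hypothetical options object: only the option discussed here is shown.
const pluginOptions: { ignoredCharactersRegex?: RegExp } = {
  ignoredCharactersRegex: /·/,
};

// Hypothetical helper approximating the plugin's filtering step:
// the pattern is rebuilt with the global flag so every match is removed.
const applyIgnoredCharacters = (text: string, regex?: RegExp): string =>
  regex ? text.replace(new RegExp(regex.source, "g"), "") : text;

const spoken = applyIgnoredCharacters(
  "fear·ful",
  pluginOptions.ignoredCharactersRegex
);
```

Leaving the option unset returns the text unchanged, which matches the option being listed as not required.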

##### About `speechOutputComponentName`:

If you want to use your own component to handle the generated speech output, you can specify its name using the `speechOutputComponentName` option. The plugin will then use this component instead of `SpeechOutput` to extract the text to be used for TTS generation. This lets you customize the way speech output is handled. Find more information about this in the [customization chapter](#customize).
12 changes: 12 additions & 0 deletions src/__tests__/getSsmlFromMdxAst.ts
@@ -13,6 +13,18 @@ it("should correctly extract speech output blocks from MDX AST", async () => {
   );
 });

+it("should remove ignored special characters", async () => {
+  const mdxAst = loadMdxAstFromFile("single-block-with-special-characters.mdx");
+  const speechOutputBlock = extractSpeechOutputBlocks(
+    mdxAst,
+    "SpeechOutput",
+    /·/
+  )[0];
+  expect(speechOutputBlock.text).toEqual(
+    "Inside<break time='1s'/>I am a bit fearful that this dot is vocalized. The dot in the word fearful should be filtered out.<break time='1s'/>"
+  );
+});
+
 it("heading should end with a break SSML tag", () => {
   const headingAst = {
     type: "heading",
17 changes: 17 additions & 0 deletions src/__tests__/single-block-with-special-characters.mdx
@@ -0,0 +1,17 @@
import SpeechOutput from "gatsby-mdx-tts"

# Outside

This text is outside a single speech output block.

<SpeechOutput id="single-block">

# Inside

I am a bit fear·ful that this dot is vocalized. The dot in the word fear·ful should be filtered out.

<CustomComponent />

</SpeechOutput>

This is again outside the speech output block.
4 changes: 3 additions & 1 deletion src/index.ts
@@ -170,6 +170,7 @@ interface PluginOptions {
   };
   defaultSsmlTags?: string;
   defaultLexiconNames?: LexiconNameList;
+  ignoredCharactersRegex?: RegExp;
   speechOutputComponentName?: string;
 }

@@ -179,7 +180,8 @@ module.exports = async (
 ) => {
   const speechOutputBlocks = extractSpeechOutputBlocks(
     parameters.markdownAST,
-    pluginOptions.speechOutputComponentName || "SpeechOutput"
+    pluginOptions.speechOutputComponentName || "SpeechOutput",
+    pluginOptions.ignoredCharactersRegex
   );

   if (speechOutputBlocks.length > 0) {
10 changes: 7 additions & 3 deletions src/internals/utils/extractSpeechOutputBlocks.ts
@@ -70,7 +70,8 @@ const buildSpeechOutputBlock = (

 const extractSpeechOutputBlocks = (
   mdxAst: Node,
-  speechOutputComponentName: string
+  speechOutputComponentName: string,
+  ignoredCharactersRegex?: RegExp
 ): SpeechOutputBlock[] => {
   const speechOutputBlocks: SpeechOutputBlock[] = [];

@@ -93,9 +94,12 @@ const extractSpeechOutputBlocks = (
     (startNode: Node, startNodeIndex: number, parent: Node) => {
       const relatedEndNode = findAfter(parent, startNode, isEndNode);
       const nodesToGetTextFrom = between(parent, startNode, relatedEndNode);
-      const text = nodesToGetTextFrom.map(getSsmlFromMdxAst).join("");
+      const unfilteredText = nodesToGetTextFrom.map(getSsmlFromMdxAst).join("");
+      const filteredText = ignoredCharactersRegex
+        ? unfilteredText.replace(new RegExp(ignoredCharactersRegex, "g"), "")
+        : unfilteredText;
       speechOutputBlocks.push(
-        buildSpeechOutputBlock(startNode, text, relatedEndNode)
+        buildSpeechOutputBlock(startNode, filteredText, relatedEndNode)
       );
     }
   );
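
One property of the filtering step is worth keeping in mind: when `new RegExp(...)` receives a `RegExp` object together with a flags argument, the new flags replace the original ones, so an option such as `/x/i` would lose its `i` flag once `"g"` is passed. The sketch below shows one possible way to keep the case-insensitive and multiline flags while still removing every match. The helper `removeIgnoredCharacters` is hypothetical and not part of the plugin, and other flags are deliberately not carried over.

```typescript
// Hypothetical helper, not part of the plugin: removes every match of
// `ignored` from `text` while keeping the case-insensitive and multiline
// flags of the supplied regex (other flags are not carried over here).
function removeIgnoredCharacters(text: string, ignored?: RegExp): string {
  if (!ignored) {
    return text; // no option configured: leave the text untouched
  }
  const flags =
    "g" + (ignored.ignoreCase ? "i" : "") + (ignored.multiline ? "m" : "");
  return text.replace(new RegExp(ignored.source, flags), "");
}

const sample = "I am a bit fear·ful. The dot in FEAR·FUL is removed too.";
const withoutDots = removeIgnoredCharacters(sample, /·/);
const caseInsensitive = removeIgnoredCharacters("aXbxc", /x/i);
```

For a plain character pattern like the middle dot used in the test above, both approaches give the same result; the difference only shows up when the configured regex carries flags of its own.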