Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supply phonetic road names to AVSpeechSynthesizer #624

Merged
merged 1 commit into from
Dec 7, 2017
Merged

Conversation

1ec5
Copy link
Contributor

@1ec5 1ec5 commented Sep 18, 2017

This change provides AVSpeechSynthesizer with the same phonetic names that #622 provides Polly with. This way the AVSpeechSynthesizer fallback remains consistent with Polly in terms of its pronunciation of difficult names. The closer the two voices’ pronunciation, the less jarring it is when the fallback kicks in.

AVSpeechSynthesizer doesn’t support SSML; instead, on iOS 10 and above, it uses the native NSAttributedString type to hold attributes such as pronunciations. It turns out that the Alex voice (#612) does not support attributed strings, but it manages to correctly pronounce just about all the roads that are currently tagged with pronunciations anyways.

Depends on #621, mapbox/mapbox-directions-swift#174, and Project-OSRM/osrm-text-instructions.swift#39.

/cc @frederoni @bsudekum

@1ec5 1ec5 added feature New feature request. op-ex Refactoring, Tech Debt or any other operational excellence work. topic: instructions topic: voice ⚠️ DO NOT MERGE PR should not be merged! labels Sep 18, 2017
@1ec5 1ec5 self-assigned this Sep 18, 2017
@1ec5 1ec5 requested a review from frederoni September 18, 2017 11:54
@bsudekum
Copy link
Contributor

@1ec5 does it make sense to add this just for AVSpeechSynthesizer if in theory all users will be using polly via #617?

@1ec5
Copy link
Contributor Author

1ec5 commented Sep 18, 2017

in theory all users will be using polly via #617

The fallback exists for a few reasons, including:

  1. The API may not be reachable at the time it’s needed. Since instructions are time-critical, a fallback needs to kick in locally. Google Maps takes a different approach of preloading some common utterances locally and omitting details like road names that are more variable. I think it’s more helpful to have a different voice kick in than to leave the user hanging.
  2. Some of the languages this library is localized into are supported by AVSpeechSynthesizer but not Polly.

@1ec5 1ec5 added blocked: upstream and removed ⚠️ DO NOT MERGE PR should not be merged! labels Sep 18, 2017
@1ec5 1ec5 force-pushed the 1ec5-ipa-attribute branch 2 times, most recently from e5e710b to 9b8ad92 Compare September 19, 2017 18:28
return
}

guard let url = awsTask.result else {
super.speak(fallbackText, error: "No polly response")
print("No polly response")
speakWithoutPolly(for: routeProgress, userDistance: userDistance)
return
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we cancel the task before creating a new one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, but I think we should address this issue in #661. Once that PR lands, I’ll rebase and do likewise here.

@@ -150,13 +152,14 @@ public class PollyVoiceController: RouteVoiceController {
audioPlayer.volume = strongSelf.volume
audioPlayer.play()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the return value of .play() is discardable but I think we should observe it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#661 takes care of responding to the return value.

@bsudekum
Copy link
Contributor

@1ec5 is this still actionable?

@1ec5
Copy link
Contributor Author

1ec5 commented Nov 16, 2017

It would still be desirable to ensure that the AVSpeechSynthesizer fallback can handle phonetic names. AVSpeechSynthesizer isn’t going away; after all, Polly is optional, and it doesn’t support most of the languages we do, like Chinese.

For AVSpeechSynthesizer to apply the phonetic names as string attributes, this library currently depends on OSRMTextInstructions.swift, hence all the merge conflicts. Relying on OSRMTextInstructions.swift would create some inconsistencies between the instructions spoken by Polly and those spoken by AVSpeechSynthesizer, due to extra processing we’re doing server-side outside of OSRM Text Instructions. A better solution would be for the Directions API to indicate the ranges within the plain-text instruction that should be pronounced a certain way. We can track that issue internally.

I’ve also been leaving this PR around in order to track a refactoring of PollyVoiceController and RouteVoiceController, but I had you carry it out in #800.

@1ec5
Copy link
Contributor Author

1ec5 commented Nov 17, 2017

A better solution would be for the Directions API to indicate the ranges within the plain-text instruction that should be pronounced a certain way. We can track that issue internally.

In the meantime, RouteVoiceController can search for the current route step’s names inside the plain-text instruction and apply the step’s phoneticNames to those substrings inside an attributed string. The assumption would be that we wouldn’t encounter a single instruction that contains the same name pronounced two different ways. Hopefully the road name is always distinguishable from other instruction text.

@1ec5
Copy link
Contributor Author

1ec5 commented Dec 1, 2017

Ready for review.

@frederoni
Copy link
Contributor

This implementation currently relies on RouteStepFormatter which is getting removed in #767.
Would it make sense to wait for 767 to land and adapt this PR afterward?

@bsudekum
Copy link
Contributor

bsudekum commented Dec 4, 2017

Would it make sense to wait for 767 to land and adapt this PR afterward?

Yeah let's do that.

@1ec5
Copy link
Contributor Author

1ec5 commented Dec 5, 2017

Yes, we should wait until after #767 lands. This PR still includes the modifications to RouteStepFormatter mainly so that the project can still build. But the cruxt of the change is in RouteVoiceController.

Build speech strings as attributed strings in order to apply an IPA notation attribute to road names when spoken by AVSpeechSynthesizer.
@1ec5
Copy link
Contributor Author

1ec5 commented Dec 6, 2017

All set.

if Locale.preferredLocalLanguageCountryCode == "en-US" {
utterance.voice = AVSpeechSynthesisVoice(identifier: AVSpeechSynthesisVoiceIdentifierAlex)
// Alex can’t handle attributed text.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😢

@1ec5 1ec5 merged commit 9047a41 into master Dec 7, 2017
@1ec5 1ec5 deleted the 1ec5-ipa-attribute branch December 7, 2017 06:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature request. op-ex Refactoring, Tech Debt or any other operational excellence work. topic: instructions topic: voice
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants