Support Kurdish TTS #5

Closed
3 tasks
willwade opened this issue Aug 3, 2023 · 0 comments

willwade commented Aug 3, 2023

Kurdish TTS is available from https://tts.asosoft.com

** UPDATE: See https://github.com/AceCentre/TranslateAndTTS/blob/main/KurdishTTS/kurdishTTS.py - This now needs to become an engine option in the code. **

I can't see a documented API, but the site appears to make a POST request and get back the ID of a generated MP3 file.

NB:

  • l = chkLatin = "If it contains Latin words, read them as Kurdish"
  • p = chkPct = "Read punctuation symbols"

e.g. the request made by the site's JavaScript:

$.post("https://tts.kurdishspeech.com/", {
		t: txt,
		l: $('#chkKuLatin').is(':checked'),
		p: $('#chkPunct').is(':checked')
	},
	function(data) {
		var path = `https://tts.kurdishspeech.com/static/TTS/${data}.mp3`;
		$('.loading').hide();
		$('#results').show();
		$('.audio').html(`<audio controls autoplay><source src="${path}" type="audio/mpeg"></audio>`);
	}
);

e.g. the same request with curl:

curl -X POST -F 't=سڵاو' -F 'l=true' -F 'p=false' https://tts.kurdishspeech.com

The response is a number, which is the name of the MP3 file to fetch: https://tts.kurdishspeech.com/static/TTS/<number>.mp3

So to use Kurdish TTS we need to build a wrapper around this service.

NB: on the site it looks like punctuation and characters are checked before the request is sent. We would have to repeat those functions client side.
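A minimal sketch of what that client-side clean-up might look like in Python. This is not a port of the site's Normalizer.js: the helper name `sanitise_text`, the choice of NFC normalisation, and the whitespace handling are all assumptions for illustration. Only the 2000 character limit and the "remove HTML, normalise Unicode" goals come from this issue.

```python
import html
import re
import unicodedata

MAX_CHARS = 2000  # limit mentioned in this issue


def sanitise_text(text: str, max_chars: int = MAX_CHARS) -> str:
    """Hypothetical helper: clean up text before sending it to the TTS service."""
    # Drop anything that looks like an HTML tag, then decode entities such as &amp;
    text = re.sub(r"<[^>]+>", "", text)
    text = html.unescape(text)
    # Unicode normalisation. NFC is an assumption; Normalizer.js may do something different.
    text = unicodedata.normalize("NFC", text)
    # Collapse runs of whitespace into single spaces
    text = re.sub(r"\s+", " ", text).strip()
    # Truncate to the maximum length
    return text[:max_chars]
```

Calling `sanitise_text(PastedText)` before building the POST payload would be one place to hook this in. Whether it matches what the service expects still needs checking against the JS.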

Here is a code sample. NOTE: it does no error checking or input sanitising.

import os

import requests
from playsound import playsound

# The text should be sanitised first, as the site does in
# https://tts.kurdishspeech.com/static/js/Normalizer.js
#
# The input should be:
# - Less than 2000 characters
# - Free of any HTML characters
# - Unicode-normalised, see https://stackoverflow.com/questions/16467479/normalizing-unicode
#
# NB: for the most part this is fine in AAC use cases where people are typing in pure Sorani only

BASE_URL = 'https://tts.kurdishspeech.com'
OUT_FILE = 'snd.mp3'

# Remove any audio left over from a previous run
if os.path.exists(OUT_FILE):
    os.remove(OUT_FILE)

pasted_text = 'سڵاو'
# t = text, l = read Latin words as Kurdish, p = read punctuation symbols
payload = {'t': pasted_text, 'l': 'true', 'p': 'false'}
response = requests.post(BASE_URL, data=payload)
# The response body is the number that names the generated MP3
snd = requests.get(f'{BASE_URL}/static/TTS/{response.text}.mp3')
with open(OUT_FILE, 'wb') as f:
    f.write(snd.content)
playsound(OUT_FILE)

  • Add "KurdishTTS" as an option to use for TTS
  • Make a Python library/file to call KurdishTTS. This would take a Unicode string, download the MP3 and then play it
  • Do some error checking. Truncate strings over 2000 chars. Look into the JS code to see what it's doing and whether we need to sanitise the text
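
As a starting point for the second and third tasks, here is a rough sketch of a wrapper with basic error checking. The function name `fetch_kurdish_mp3`, the `KurdishTTSError` class, the `http` parameter and the "response must be all digits" check are assumptions for illustration; the endpoint, the `t`/`l`/`p` fields and the `/static/TTS/<number>.mp3` path are taken from the examples in this issue. It returns the MP3 bytes and leaves playback to the caller.

```python
BASE_URL = "https://tts.kurdishspeech.com"
MAX_CHARS = 2000  # limit mentioned in this issue


class KurdishTTSError(Exception):
    """Raised when the service does not return a usable audio ID."""


def fetch_kurdish_mp3(text, read_latin=True, read_punctuation=False,
                      http=None, timeout=30):
    """Hypothetical wrapper: send text to the service and return MP3 bytes."""
    if not text or not text.strip():
        raise ValueError("text must not be empty")
    if http is None:
        # Same third-party library as the sample in this issue
        import requests
        http = requests.Session()

    payload = {
        "t": text[:MAX_CHARS],               # truncate over-long input
        "l": str(read_latin).lower(),        # read Latin words as Kurdish
        "p": str(read_punctuation).lower(),  # read punctuation symbols
    }
    resp = http.post(BASE_URL, data=payload, timeout=timeout)
    resp.raise_for_status()

    # The issue reports that the response is a number naming the MP3 file;
    # treating anything else as an error is an assumption.
    audio_id = resp.text.strip()
    if not audio_id.isdigit():
        raise KurdishTTSError(f"unexpected response: {audio_id[:80]!r}")

    audio = http.get(f"{BASE_URL}/static/TTS/{audio_id}.mp3", timeout=timeout)
    audio.raise_for_status()
    return audio.content
```

The `http` argument accepts anything with `post` and `get` methods shaped like a `requests.Session`, which lets the wrapper be exercised with a stub instead of the live service. Writing the returned bytes to a file and passing it to `playsound`, as in the sample above, would complete the "download and play" task.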