# Check and prepare your project environment

## Check your environment

In [1]:
!gcloud config get-value core/account

pascalr@google.com


In [2]:
!gcloud config get-value core/project

pascal-demo


## Check that the APIs are enabled

In [3]:
!gcloud services list --enabled --filter speech

NAME                         TITLE
speech.googleapis.com        Cloud Speech-to-Text API
texttospeech.googleapis.com  Cloud Text-to-Speech API


## Check that the Speech-to-Text and Text-to-Speech APIs are enabled

In [4]:
!gcloud services list --enabled --filter=speech

NAME                         TITLE
speech.googleapis.com        Cloud Speech-to-Text API
texttospeech.googleapis.com  Cloud Text-to-Speech API


## Check that the Translation API is enabled

In [5]:
!gcloud services list --enabled --filter=translate

NAME                      TITLE
automl.googleapis.com     Cloud AutoML API
translate.googleapis.com  Cloud Translation API
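
The checks above are read by eye. As a minimal sketch, the same table output could be parsed in Python to report which required services are missing. The helper names `enabled_services` and `missing_services` are hypothetical, and the sample text is copied from the Speech listing shown earlier.

```python
def enabled_services(listing: str) -> set:
    """Return the service names found in the NAME column of the table."""
    names = set()
    for line in listing.splitlines():
        parts = line.split()
        # Skip blank lines and the header row (NAME TITLE).
        if not parts or parts[0] == "NAME":
            continue
        names.add(parts[0])
    return names


def missing_services(listing: str, required) -> list:
    """Return the required services absent from the listing, sorted by name."""
    return sorted(set(required) - enabled_services(listing))


# Sample text copied from the Speech listing in this notebook.
sample = """NAME                         TITLE
speech.googleapis.com        Cloud Speech-to-Text API
texttospeech.googleapis.com  Cloud Text-to-Speech API
"""
required = [
    "speech.googleapis.com",
    "texttospeech.googleapis.com",
    "translate.googleapis.com",
]
print(missing_services(sample, required))  # ['translate.googleapis.com']
```

In a notebook, the listing could be captured with `listing = !gcloud services list --enabled` and joined with `"\n".join(listing)` before being passed to the helper.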


## Enable any APIs that are not yet enabled
If some APIs are not yet enabled, go back to the GCP console web interface and click __Navigation Menu__ (the "burger menu" icon) > __APIs & Services__ > __Library__.

Search for each API that is not yet enabled and click __Enable__.

## Declare your API key
Retrieve the value of your API key:
- from the GCP console web interface:
click __Navigation Menu__ (the "burger menu" icon) > __APIs & Services__ > __Credentials__

- from the file where you saved it:
in __Cloud Shell__, run the command:

```cat My-API-key.txt```

Then copy and paste the value below:

In [None]:
%env API_KEY <value>

In [3]:
PROJECT = !gcloud config get-value core/project
%env GOOGLE_CLOUD_PROJECT {PROJECT[0]}

env: GOOGLE_CLOUD_PROJECT=pascal-demo


In [None]:
!echo $GOOGLE_CLOUD_PROJECT
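
Later cells depend on these environment variables being set. A minimal sketch of a guard that fails early with a clear message is shown here. `require_env` is a hypothetical helper, and the `setdefault` line only supplies a placeholder so the sketch runs outside the notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Placeholder so the sketch runs standalone; %env sets the real value in the notebook.
os.environ.setdefault("GOOGLE_CLOUD_PROJECT", "pascal-demo")
print(require_env("GOOGLE_CLOUD_PROJECT"))
```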

# Transcribe a voice audio file to text with the Speech-to-Text API

## Create a service account and download the key

In [5]:
!gcloud iam service-accounts create my-lab-sa --display-name "my lab service account"

Created service account [my-lab-sa].


In [7]:
!gcloud iam service-accounts keys create ~/key.json --iam-account my-lab-sa@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com

created key [1fb39498cbfd0cca2662a6460a31a1509a776f78] of type [json] as [/home/jupyter/key.json] for [my-lab-sa@pascal-demo.iam.gserviceaccount.com]


In [8]:
%env GOOGLE_APPLICATION_CREDENTIALS="/home/jupyter/key.json"

env: GOOGLE_APPLICATION_CREDENTIALS="/home/jupyter/key.json"
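
The `%env` output above echoes the path with its double quotes, which suggests the quotes are stored as part of the value. A minimal sketch of a sanity check on the key file follows. `check_key_file` is a hypothetical helper, the field names are those normally found in a service account key file, and the demonstration writes a fake key to a temporary directory rather than touching the real one.

```python
import json
import os
import tempfile

# Fields that a service account key file is expected to contain.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(path: str) -> list:
    """Return a list of problems found in a key file; an empty list means none."""
    if not os.path.isfile(path):
        return [f"file not found: {path}"]
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except json.JSONDecodeError:
        return ["file is not valid JSON"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if data.get("type") != "service_account":
        problems.append("type is not 'service_account'")
    return problems


# Demonstration with a fake key (placeholder values only).
with tempfile.TemporaryDirectory() as tmp:
    fake_key = os.path.join(tmp, "key.json")
    with open(fake_key, "w", encoding="utf-8") as handle:
        json.dump(
            {
                "type": "service_account",
                "project_id": "pascal-demo",
                "private_key": "PLACEHOLDER",
                "client_email": "my-lab-sa@pascal-demo.iam.gserviceaccount.com",
            },
            handle,
        )
    print(check_key_file(fake_key))              # no problems reported
    print(check_key_file('"' + fake_key + '"'))  # quoted path is reported as not found
```

If the quoted value causes trouble, stripping the quotes with `os.environ["GOOGLE_APPLICATION_CREDENTIALS"].strip('"')` before the check is one option. The cells below sidestep the issue by passing the key path explicitly to each client.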


## Use the Python client library

You will transcribe one of two voice audio files (click a link to listen to it): a [French sample](https://storage.googleapis.com/speech-language-samples/fr-sample.flac) or a [Turkish sample](https://storage.googleapis.com/ml-api-codelab/tr-ostrich.wav).


### First, check that the google-cloud-speech module is installed:

In [10]:
!pip freeze | grep speech
!echo $GOOGLE_APPLICATION_CREDENTIALS

google-cloud-speech==1.3.2
"/home/jupyter/key.json"


### Next, prepare the API call with the Python client:

In [39]:
from google.cloud import speech

# Authenticate explicitly with the service account key created above.
client = speech.SpeechClient.from_service_account_json('/home/jupyter/key.json')

# Audio to transcribe: the Turkish sample. To transcribe the French sample
# instead, swap in the commented-out URI and language code.
#gcs_uri = "gs://speech-language-samples/fr-sample.flac"
gcs_uri = "gs://ml-api-codelab/tr-ostrich.wav"

audio = speech.types.RecognitionAudio(uri=gcs_uri)
config = speech.types.RecognitionConfig(
    # encoding and sample_rate_hertz are left unset in this run.
    #encoding=speech.types.RecognitionConfig.AudioEncoding.FLAC,
    #sample_rate_hertz=44100,
    #language_code="fr",
    language_code="tr-TR"
)

### Run the operation synchronously and print the best result:

In [40]:
response = client.recognize(config=config, audio=audio)
for result in response.results:
    # The first alternative is the most likely one for this portion.
    print(u"Transcript: {}".format(result.alternatives[0].transcript))
    print("Confidence: {}".format(result.alternatives[0].confidence))

Transcript: çok fazla deve kuşu var
Confidence: 0.9382978081703186
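
The same result loop is used for both the synchronous and asynchronous calls. As a minimal sketch, it could be factored into a helper that returns the top alternative of each result. `best_alternatives` is a hypothetical name, and the `SimpleNamespace` objects are stand-ins that only mimic the attributes read in the loop above, with values copied from the output shown.

```python
from types import SimpleNamespace


def best_alternatives(response):
    """Return (transcript, confidence) for the top alternative of each result."""
    pairs = []
    for result in response.results:
        if not result.alternatives:
            continue  # no hypothesis for this portion of audio
        top = result.alternatives[0]
        pairs.append((top.transcript, top.confidence))
    return pairs


# Stand-in mimicking the response shape; not a real API object.
fake_response = SimpleNamespace(results=[
    SimpleNamespace(alternatives=[
        SimpleNamespace(transcript="çok fazla deve kuşu var",
                        confidence=0.9382978081703186),
    ]),
])
for transcript, confidence in best_alternatives(fake_response):
    print(f"Transcript: {transcript}")
    print(f"Confidence: {confidence:.2f}")
```

In the notebook, passing the real `response` object to `best_alternatives` should work the same way, assuming it exposes `results` and `alternatives` as in the cell above.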


### Run the operation asynchronously and print the best result:

In [29]:
operation = client.long_running_recognize(
    config=config, audio=audio
)
print("Waiting for operation to complete...")
response = operation.result(timeout=90)

Waiting for operation to complete...


In [30]:
for result in response.results:
    # The first alternative is the most likely one for this portion.
    print(u"Transcript: {}".format(result.alternatives[0].transcript))
    print("Confidence: {}".format(result.alternatives[0].confidence))

Transcript: maître corbeau sur un arbre perché tenait en son bec un fromage
Confidence: 0.9385557770729065


# Translate text with the Translation API

## Store the previous transcript in a variable

In [47]:
text = response.results[0].alternatives[0].transcript

## Initialize the Translation API client

In [48]:
from google.cloud import translate_v2 as translate

translate_client = translate.Client.from_service_account_json('/home/jupyter/key.json')

## Prepare the request and specify the target language

In [55]:
"""Translates text into the target language.

Target must be an ISO 639-1 language code.
See https://cloud.google.com/translate/docs/reference/rest/v3/SupportedLanguages
"""
target = "fr"

## Run the API call

In [56]:
# Text can also be a sequence of strings, in which case this method
# will return a sequence of results for each text.
result = translate_client.translate(text, target_language=target)

## Display the result

In [57]:
print(u"Text: {}".format(result["input"]))
print(u"Translation: {}".format(result["translatedText"]))
print(u"Detected source language: {}".format(result["detectedSourceLanguage"]))

Text: çok fazla deve kuşu var
Translation: il y a trop d&#39;autruches
Detected source language: tr
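
The translation above contains the HTML entity `&#39;` in place of an apostrophe. A minimal sketch, using only Python's standard `html` module, decodes such entities before the text is displayed or passed to another API. The input string is copied from the output above.

```python
import html

raw_translation = "il y a trop d&#39;autruches"

# Decode HTML character references such as &#39; into plain characters.
clean_translation = html.unescape(raw_translation)
print(clean_translation)  # il y a trop d'autruches
```

The v2 client's `translate` method also appears to accept a `format_` argument that may avoid the escaping at the source; check the library documentation before relying on it.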


# Speech synthesis with the Text-to-Speech API

## Prepare the Python environment

In [58]:
!pip install --upgrade google-cloud-texttospeech

Collecting google-cloud-texttospeech
  Downloading google_cloud_texttospeech-2.2.0-py2.py3-none-any.whl (57 kB)
Collecting proto-plus>=1.4.0
  Downloading proto-plus-1.10.0.tar.gz (24 kB)
Collecting libcst>=0.2.5
  Downloading libcst-0.3.12-py3-none-any.whl (501 kB)
Collecting typing-inspect>=0.4.0
  Downloading typing_inspect-0.6.0-py3-none-any.whl (8.1 kB)
Building wheels for collected packages: proto-plus
  Building wheel for proto-plus (setup.py) ... done
  Created wheel for proto-plus: filename=proto_plus-1.10.0-py3-none-any.whl size=36995 sha256=9a4c720b383fe27579eb63123d3ceec377b420c83b6520e204b2c68d2a56b818
  Stored in directory: /home/jupyter/.cache/pip/wheels/11/33/fc/104a428f03e59037ac73931b71b719ba559c37a3683ec39391
Successfully built proto-plus
Installing collected packages: proto-plus, typing-inspect, libcst, google-clou

## Initialize the Python client

In [61]:
from google.cloud import texttospeech

tts_client = texttospeech.TextToSpeechClient.from_service_account_json('/home/jupyter/key.json')

## Print the list of available voices

In [None]:
voices = tts_client.list_voices()

for voice in voices.voices:
    # Display the voice's name. Example: tpc-vocoded
    print(f"Name: {voice.name}")

    # Display the supported language codes for this voice. Example: "en-US"
    for language_code in voice.language_codes:
        print(f"Supported language: {language_code}")

    ssml_gender = texttospeech.SsmlVoiceGender(voice.ssml_gender)

    # Display the SSML Voice Gender
    print(f"SSML Voice Gender: {ssml_gender.name}")

    # Display the natural sample rate hertz for this voice. Example: 24000
    print(f"Natural Sample Rate Hertz: {voice.natural_sample_rate_hertz}\n")
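
The loop above prints every available voice, which can be a long list. A minimal sketch, relying only on the `name` and `language_codes` attributes used in that loop, filters voice names by language prefix. `voices_for_language` is a hypothetical helper, and the `SimpleNamespace` objects are illustrative stand-ins for the API's voice objects.

```python
from types import SimpleNamespace


def voices_for_language(voices, language_prefix):
    """Return names of voices that support a language code starting with the prefix."""
    prefix = language_prefix.lower()
    return [
        voice.name
        for voice in voices
        if any(code.lower().startswith(prefix) for code in voice.language_codes)
    ]


# Stand-ins for the objects in voices.voices (illustrative values).
sample_voices = [
    SimpleNamespace(name="en-GB-Standard-A", language_codes=["en-GB"]),
    SimpleNamespace(name="en-US-Standard-C", language_codes=["en-US"]),
    SimpleNamespace(name="fr-FR-Standard-A", language_codes=["fr-FR"]),
]
print(voices_for_language(sample_voices, "en"))  # ['en-GB-Standard-A', 'en-US-Standard-C']
```

In the notebook, `voices_for_language(voices.voices, "fr")` would be the corresponding call on the real listing.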

## Store the request in a Python dictionary

In [63]:
my_text = 'Cloud Text-to-Speech API allows developers to include natural-sounding, synthetic human speech as playable audio in their applications. The Text-to-Speech API converts text or Speech Synthesis Markup Language (SSML) input into audio data like MP3 or LINEAR16 (the encoding used in WAV files).'

# Reuse my_text; the name tts_request avoids shadowing Python's built-in input().
tts_request = {
    'input':{
        'text': my_text
    },
    'voice':{
        'languageCode':'en-gb',
        'name':'en-GB-Standard-A',
        'ssmlGender':'FEMALE'
    },
    'audioConfig':{
        'audioEncoding':'MP3'
    }
}
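
The camelCase keys in this dictionary appear to follow the JSON body of the REST `text:synthesize` method. A minimal sketch, under that assumption, builds the corresponding HTTP request with the standard library without sending it. `API_KEY` is a hypothetical placeholder and the input text is shortened for illustration.

```python
import json
import urllib.request

# Hypothetical placeholder; substitute the API key declared earlier.
API_KEY = "YOUR_API_KEY"
ENDPOINT = "https://texttospeech.googleapis.com/v1/text:synthesize"

body = {
    "input": {"text": "Hello from the Text-to-Speech API."},
    "voice": {
        "languageCode": "en-GB",
        "name": "en-GB-Standard-A",
        "ssmlGender": "FEMALE",
    },
    "audioConfig": {"audioEncoding": "MP3"},
}

# Build the POST request; nothing is sent over the network here.
http_request = urllib.request.Request(
    url=f"{ENDPOINT}?key={API_KEY}",
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json; charset=utf-8"},
    method="POST",
)
print(http_request.get_method(), http_request.full_url)
print(json.dumps(body, indent=2))
```

Sending it with `urllib.request.urlopen(http_request)` should return a JSON document whose `audioContent` field holds the base64-encoded audio. The client library call in the next section is the route this notebook actually takes.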

## Prepare and run the request

In [64]:
input_text = texttospeech.SynthesisInput(text=my_text)

voice = texttospeech.VoiceSelectionParams(
    language_code="en-US",
    name="en-US-Standard-C",
    ssml_gender=texttospeech.SsmlVoiceGender.FEMALE,
)

audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3
)

response = tts_client.synthesize_speech(
    request={"input": input_text, "voice": voice, "audio_config": audio_config}
)
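
The synthesized audio is returned as bytes in `response.audio_content`. A minimal sketch of saving it to a file follows. `save_audio` is a hypothetical helper, and placeholder bytes stand in for the real audio so the sketch runs without calling the API.

```python
import os
import tempfile


def save_audio(audio_content: bytes, path: str) -> int:
    """Write binary audio data to `path` and return the number of bytes written."""
    with open(path, "wb") as out:
        return out.write(audio_content)


# Placeholder bytes stand in for response.audio_content (not playable audio).
placeholder_audio = b"\xff\xfb\x90\x00" * 4
output_path = os.path.join(tempfile.gettempdir(), "output.mp3")
written = save_audio(placeholder_audio, output_path)
print(f"Wrote {written} bytes to {output_path}")
```

In the notebook, `save_audio(response.audio_content, "output.mp3")` followed by `IPython.display.Audio("output.mp3")` would let you listen to the result inline.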

