# GCP Speech-to-Text (STT) API example
## This tutorial follows these steps:
1. Install client libraries
2. Enable the APIs
3. Export credentials after downloading a service account key
4. Create a bucket
5. Upload MP3 file
6. Transcribe using the Speech-to-Text API

### See the documentation links for further details
TODO add the documentation links

### Replace every value marked `# <--- CHANGE THIS` throughout the notebook

### 1. Install client libraries

In [None]:
!pip install --upgrade google-cloud-storage
!pip install --upgrade google-cloud-speech
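
After installing, it can help to confirm that both packages are importable from the notebook kernel (a kernel restart is sometimes needed after `pip install`). The following is a minimal sketch using only the standard library; the helper name `check_installed` is hypothetical and not part of either client library.

```python
import importlib.util


def check_installed(module_names):
    """Return a dict mapping each module name to True if it can be located."""
    status = {}
    for name in module_names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised when a parent package (e.g. `google`) is missing entirely.
            status[name] = False
    return status


for name, ok in check_installed(
    ["google.cloud.storage", "google.cloud.speech"]
).items():
    print(f"{name}: {'installed' if ok else 'MISSING'}")
```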

### 2. Enable the APIs

In [114]:
PROJECT_ID = 'YOUR-PROJECT-ID'  # <--- CHANGE THIS

link_to_enable_API = "https://console.developers.google.com/apis/api/speech.googleapis.com/overview?project="+PROJECT_ID

print("If this is the first time you are using the API in the project please enable the API via this link:\n\n"+ link_to_enable_API)

If this is the first time you are using the API in the project please enable the API via this link:

https://console.developers.google.com/apis/api/speech.googleapis.com/overview?project=YOUR-PROJECT-ID
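
The link above is built by plain string concatenation. As a small sketch, the same URL can be assembled with `urllib.parse.urlencode`, which escapes the project ID safely. The helper name `enable_api_link` and its `api` parameter are assumptions made for illustration; only the Speech-to-Text URL shown above comes from this notebook.

```python
from urllib.parse import urlencode


def enable_api_link(project_id, api="speech.googleapis.com"):
    """Build the console link used above to enable an API for a project."""
    base = f"https://console.developers.google.com/apis/api/{api}/overview"
    # urlencode escapes any characters that are not URL-safe.
    return f"{base}?{urlencode({'project': project_id})}"


print(enable_api_link("YOUR-PROJECT-ID"))
```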


### Make sure to enable the API
<img src="enable-api.png" width="400">

### 3. Export credentials after downloading a service account key
#### 3.1 Create and download Service Account
https://cloud.google.com/iam/docs/creating-managing-service-account-keys#iam-service-account-keys-create-console

Follow the instructions in the link above to create a **service account key** with the correct permissions (Project Owner), then download the key as `key.json`.

#### 3.2 Add permissions to service account
TODO: add link to documentation here
#### 3.3 Export service account as env variable
Export the path to the JSON key as an environment variable.

In [103]:
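# `!export` only sets the variable in a temporary subshell, so it is lost immediately.
# The %env magic sets it for the notebook kernel itself.
# <--- CHANGE THIS (the path on the next line)
%env GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json

In [104]:
# Check that the variable has been set correctly
import os
os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")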

'AutoMLdemo-64ae368c71e2.json'

In [105]:
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="/path/to/key.json" # <--- CHANGE THIS
print(os.environ['GOOGLE_APPLICATION_CREDENTIALS'])

AutoMLdemo-64ae368c71e2.json
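
Setting the variable does not verify that the file it points to is usable. The sketch below checks that the path exists and that the JSON looks like a service account key. It uses only the standard library; the helper name `check_key_file` is hypothetical, and the field list is an assumption about what a downloaded key normally contains, not an exhaustive validation.

```python
import json
import os


def check_key_file(path):
    """Return a list of problems found with a service account key file."""
    if not os.path.isfile(path):
        return [f"file not found: {path}"]
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError) as exc:
        # json.JSONDecodeError is a subclass of ValueError.
        return [f"could not read JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    problems = []
    if data.get("type") != "service_account":
        problems.append('"type" is not "service_account"')
    for field in ("project_id", "private_key", "client_email"):
        if not data.get(field):
            problems.append(f"missing field: {field}")
    return problems


key_path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", "")
if key_path:
    issues = check_key_file(key_path)
else:
    issues = ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
print(issues or "key file looks plausible")
```

An empty list means no problems were detected; it does not prove the key has the permissions needed in the later steps.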


### 4. Create a bucket

In [106]:
from google.cloud import storage

BUCKET_NAME = "test-bucket-speech-n" # <--- CHANGE THIS

def create_bucket(bucket_name): 
    storage_client = storage.Client()
    bucket = storage_client.create_bucket(bucket_name)

    print("Bucket {} created.".format(bucket.name))
create_bucket(BUCKET_NAME)

Conflict: 409 POST https://storage.googleapis.com/storage/v1/b?project=automl-demo-198411&prettyPrint=false: You already own this bucket. Please select another name.
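
The 409 Conflict above appears because the bucket already existed when the cell was re-run. One way to make the cell safe to re-run is to look the bucket up first and create it only when it is missing. This is a minimal sketch, assuming the service account may read bucket metadata; `get_or_create_bucket` is a hypothetical helper, while `lookup_bucket` (which returns `None` for a missing bucket) and `create_bucket` are methods of `storage.Client`.

```python
def get_or_create_bucket(storage_client, bucket_name):
    """Return the named bucket, creating it only if it does not exist yet."""
    bucket = storage_client.lookup_bucket(bucket_name)
    if bucket is None:
        bucket = storage_client.create_bucket(bucket_name)
        print("Bucket {} created.".format(bucket.name))
    else:
        print("Bucket {} already exists, reusing it.".format(bucket.name))
    return bucket


# Usage in this notebook (needs the credentials from step 3):
# from google.cloud import storage
# get_or_create_bucket(storage.Client(), BUCKET_NAME)
```

Bucket names are global, so a name owned by another project will still fail; in that case choose a different `BUCKET_NAME`.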

### 5. Upload MP3 file

In [107]:
SOURCE_FILE_NAME = "Armstrong_Small_Step.ogg.mp3"
DEST_FILE_NAME = "Armstrong_Small_Step2.ogg.mp3"
def upload_blob(bucket_name, source_file_name, destination_blob_name):
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_filename(source_file_name)

    print(
        "File {} uploaded to {}.".format(
            source_file_name, destination_blob_name
        )
    )
    
upload_blob(BUCKET_NAME, SOURCE_FILE_NAME, DEST_FILE_NAME)

File Armstrong_Small_Step.ogg.mp3 uploaded to Armstrong_Small_Step2.ogg.mp3.


In [117]:
# Validate that the file has been uploaded
!gsutil ls gs://$BUCKET_NAME

gs://test-bucket-speech-n/2021020713284401905bff2636dcbb7e-partner-F3c69Fj6-972527143684.mp3
gs://test-bucket-speech-n/20210207132906019075a793a86e9da4-partner-htCGaDFf-972506009909.mp3
gs://test-bucket-speech-n/Armstrong_Small_Step.ogg.mp3
gs://test-bucket-speech-n/Armstrong_Small_Step2.ogg.mp3
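
The `gsutil ls` check above needs the Cloud SDK on the machine running the notebook. The same check can be sketched in Python with the storage client that is already installed. `blob_uploaded` is a hypothetical helper; `list_blobs` and its `prefix` argument belong to `storage.Client`.

```python
def blob_uploaded(storage_client, bucket_name, blob_name):
    """Return True if an object with exactly this name exists in the bucket."""
    # `prefix` narrows the listing so the whole bucket is not scanned.
    blobs = storage_client.list_blobs(bucket_name, prefix=blob_name)
    return any(blob.name == blob_name for blob in blobs)


# Usage in this notebook (needs the credentials from step 3):
# from google.cloud import storage
# print(blob_uploaded(storage.Client(), BUCKET_NAME, DEST_FILE_NAME))
```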


### 6. Transcribe using the Speech-to-Text API.


In [109]:
from google.cloud import speech_v1p1beta1 as speech

GCS_URI = 'gs://' + BUCKET_NAME + '/' + DEST_FILE_NAME
print(GCS_URI)

def transcribe_sync(storage_uri):
    """
    Performs synchronous speech recognition on an audio file

    Args:
      storage_uri: URI of the audio file in Cloud Storage, e.g. gs://[BUCKET]/[FILE]
    """

    client = speech.SpeechClient()


    # The language of the supplied audio
    language_code = "en-US"

    # Sample rate in Hertz of the audio data sent
    sample_rate_hertz = 44100

    # Encoding of audio data sent. This sample sets this explicitly.
    # This field is optional for FLAC and WAV audio formats.
    encoding = speech.RecognitionConfig.AudioEncoding.MP3
    config = {
        "language_code": language_code,
        "sample_rate_hertz": sample_rate_hertz,
        "encoding": encoding,
        "model": "video"
    }
    audio = {"uri": storage_uri}

    response = client.recognize(config=config, audio=audio)

    for result in response.results:
        # First alternative is the most probable result
        alternative = result.alternatives[0]
        print(u"Transcript: {}".format(alternative.transcript))
        
transcribe_sync(GCS_URI)

gs://test-bucket-speech-n/Armstrong_Small_Step2.ogg.mp3
Transcript: step off the Lem now
Transcript:  that's one small step for man
Transcript:  one giant leap for mankind
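
Each result covers one consecutive portion of the audio, which is why the transcript arrives in pieces, some with a leading space. A small sketch of stitching the pieces into one string follows. It assumes `transcribe_sync` is changed to `return response`, which the cell above does not do; `join_transcripts` is a hypothetical helper, and the stand-in response below only mimics the attributes used here.

```python
from types import SimpleNamespace


def join_transcripts(response):
    """Concatenate the top alternative of every result into one string."""
    parts = []
    for result in response.results:
        if not result.alternatives:
            continue  # skip portions with no recognized speech
        parts.append(result.alternatives[0].transcript.strip())
    return " ".join(parts)


# Stand-in object shaped like the recognize() response printed above.
fake_response = SimpleNamespace(results=[
    SimpleNamespace(alternatives=[SimpleNamespace(transcript=t)])
    for t in ("step off the Lem now",
              " that's one small step for man",
              " one giant leap for mankind")
])
print(join_transcripts(fake_response))
```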


In [112]:
from google.cloud import speech_v1p1beta1 as speech

GCS_URI = 'gs://' + BUCKET_NAME + '/' + DEST_FILE_NAME
print(GCS_URI)

def transcribe_async(gcs_uri):
    """Asynchronously transcribes the audio file specified by the gcs_uri."""

    client = speech.SpeechClient()

    audio = speech.RecognitionAudio(uri=gcs_uri)
    encoding = speech.RecognitionConfig.AudioEncoding.MP3
    language_code = "en-US"  # or another supported language code, e.g. "iw-IL"
    sample_rate_hertz = 44100
    
    config = {
        "language_code": language_code,
        "sample_rate_hertz": sample_rate_hertz,
        "encoding": encoding,
        "use_enhanced": True,
        "model": "video"
    }

    operation = client.long_running_recognize(config=config, audio=audio)

    print("Waiting for operation to complete...")
    response = operation.result(timeout=90)

    # Each result is for a consecutive portion of the audio. Iterate through
    # them to get the transcripts for the entire audio file.
    for result in response.results:
        # The first alternative is the most likely one for this portion.
        print(u"Transcript: {}".format(result.alternatives[0].transcript))
        #print("Confidence: {}".format(result.alternatives[0].confidence))
transcribe_async(GCS_URI)        


gs://test-bucket-speech-n/Armstrong_Small_Step2.ogg.mp3
Waiting for operation to complete...
Transcript: step off the Lem now
Transcript:  that's one small step for man
Transcript:  one giant leap for mankind
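
Both calls give the same transcript for this short clip. The difference matters for longer audio: synchronous `recognize` is intended for short recordings, while `long_running_recognize` handles longer files stored in Cloud Storage. The sketch below picks a method from the clip length. The 60-second threshold is an assumption based on the documented limit for synchronous requests and should be checked against the current quotas; `choose_method` is a hypothetical helper.

```python
SYNC_LIMIT_SECONDS = 60  # assumption: documented limit for synchronous requests


def choose_method(duration_seconds):
    """Return the name of the SpeechClient method suited to the clip length."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    if duration_seconds <= SYNC_LIMIT_SECONDS:
        return "recognize"
    return "long_running_recognize"


print(choose_method(12))
print(choose_method(300))
```

For long files, the `timeout=90` passed to `operation.result` above may also need to be raised.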
