Transcribe audio to text using API #19

dylman123 · 2020-04-26T05:13:33Z

Going to stick with Google Cloud Speech to Text API for now.

dylman123 · 2020-04-26T06:42:43Z

To integrate the Google Speech API into a Swift app, follow this sample:
https://github.com/GoogleCloudPlatform/ios-docs-samples/tree/master/speech/Swift/Speech-gRPC-Streaming

In the interest of time, I will first use the Rev AI API for development (note that Rev doesn't do diarization well): https://rapidapi.com/Rev.AI/api/rev-ai

dylman123 · 2020-04-26T08:02:26Z

Either need to:

1. Learn how to use the Rev AI API to accept an upload in the primary API call, rather than a reference to some public URL, or
2. Swap the Rev AI API for the Google Speech to Text API (using a similar backbone/approach to what has been set up for the Rev AI API)

Either way, will first need to learn how to pass a file object (e.g. an audio file) in a POST request in Swift.
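For reference, a file upload in a POST request is usually sent as a multipart/form-data body, and that wire format is the same whichever client language builds it. The sketch below is in TypeScript rather than Swift, purely to show the byte layout; the function name, field name, file name, and boundary are all hypothetical placeholders, and nothing here is taken from the Rev AI or Google documentation.

```typescript
// Sketch only: names and values are illustrative assumptions, not a real API contract.
// A body part is either header/footer text or the raw file bytes.
type BodyPart = string | Uint8Array;

interface MultipartBody {
  contentType: string; // value for the request's Content-Type header
  parts: BodyPart[];   // send these back to back, in order, as the request body
}

// Builds a single-file multipart/form-data body.
// The boundary must be a string that does not occur inside the file data.
// Note: fieldName and fileName are inserted as-is, so they must not contain quotes.
function buildMultipartBody(
  fieldName: string,
  fileName: string,
  mimeType: string,
  data: Uint8Array,
  boundary: string
): MultipartBody {
  // Each part starts with "--boundary", then its headers, then a blank line.
  const header =
    `--${boundary}\r\n` +
    `Content-Disposition: form-data; name="${fieldName}"; filename="${fileName}"\r\n` +
    `Content-Type: ${mimeType}\r\n\r\n`;
  // The closing delimiter is the boundary with a trailing "--".
  const footer = `\r\n--${boundary}--\r\n`;
  return {
    contentType: `multipart/form-data; boundary=${boundary}`,
    parts: [header, data, footer],
  };
}
```

In a Swift client the same three pieces would presumably be appended to one `Data` value and assigned to the request's `httpBody`, with the `Content-Type` header set to the `contentType` string above.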

dylman123 · 2020-04-26T13:49:04Z

Watch this tutorial on how to upload files via API in Swift:

https://youtu.be/UMgApUhg7ic

dylman123 · 2020-04-27T09:01:52Z

On point 2 above, the answer is to use Google Firebase, a suite of APIs for iOS (and, I assume, macOS) development. It interfaces with Google Cloud Storage, so the original Speech to Text engine will be available for use.

See: https://firebase.google.com/docs/storage

Need to install Firebase with CocoaPods.

dylman123 · 2020-04-27T09:24:38Z

Once an audio track is uploaded to Google Cloud Storage via Firebase, the next step will be to process that audio file "serverlessly" in GCP.
We will use Cloud Functions to do this: https://cloud.google.com/functions
In other words, the plan is to set up an event-based rule that runs code to transcribe audio to text (and possibly generate captions too), all in the cloud.
Finally, the text (or captions) will be sent back to the client via some sort of asynchronous GET request to a "transcribed" bucket through Firebase. Need to look into this.
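The decision logic behind such an event-based rule could look roughly like the sketch below: given the metadata of a newly uploaded object, decide whether it is audio, build a Speech to Text request that points at the file's `gs://` URI, and pick where the result should be written. The event shape is a simplified stand-in for the real storage event, and `planTranscription`, the "transcribed" default, and the `.json` naming are assumptions for illustration. The Cloud Functions trigger wiring and the actual API call are left out.

```typescript
// Sketch only: a simplified stand-in for the metadata a storage upload event carries.
interface StorageObjectEvent {
  bucket: string;
  name: string;          // object path inside the bucket, e.g. "uploads/clip.wav"
  contentType?: string;  // e.g. "audio/wav"
}

// Minimal request body for Speech to Text with audio referenced by Cloud Storage URI.
// encoding and sampleRateHertz are left out of this sketch; depending on the
// audio format they may need to be set as well.
interface RecognizeRequest {
  config: { languageCode: string; enableWordTimeOffsets: boolean };
  audio: { uri: string };
}

interface TranscriptionJob {
  request: RecognizeRequest;
  outputBucket: string; // hypothetical bucket the client later reads from
  outputName: string;   // same path as the upload, with a .json extension
}

// Returns a job description for audio uploads, or null for anything else.
function planTranscription(
  event: StorageObjectEvent,
  outputBucket: string = "transcribed",
  languageCode: string = "en-US"
): TranscriptionJob | null {
  // Ignore uploads that are not audio (or have no content type at all).
  if (!event.contentType || event.contentType.indexOf("audio/") !== 0) {
    return null;
  }
  // Swap the final file extension (if any) for ".json".
  const outputName = event.name.replace(/\.[^./]+$/, "") + ".json";
  return {
    request: {
      // Word time offsets are requested so captions could be derived later.
      config: { languageCode, enableWordTimeOffsets: true },
      audio: { uri: `gs://${event.bucket}/${event.name}` },
    },
    outputBucket,
    outputName,
  };
}
```

Keeping this as a pure function means it can be checked locally before it is wrapped in a storage trigger.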

dylman123 · 2020-04-28T08:35:38Z

Will need to write the Functions code in JavaScript or TypeScript:
https://firebase.google.com/docs/functions/?authuser=0#implementation_paths
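As an illustration of the kind of logic that Functions code could hold, here is a TypeScript sketch of the caption idea mentioned earlier in this thread: it groups timed words into SRT caption blocks. The `TimedWord` shape (times as plain seconds) and the fixed words-per-caption grouping are assumptions made for the example; the real transcription output may encode times differently and would need converting first.

```typescript
// Sketch only: TimedWord is a hypothetical, simplified shape for one transcribed word.
interface TimedWord {
  word: string;
  startSec: number;
  endSec: number;
}

// Left-pads a non-negative integer with zeros to the given width.
function zeroPad(n: number, width: number): string {
  let out = String(n);
  while (out.length < width) {
    out = "0" + out;
  }
  return out;
}

// Formats seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${zeroPad(h, 2)}:${zeroPad(m, 2)}:${zeroPad(s, 2)},${zeroPad(ms, 3)}`;
}

// Groups words into captions of at most wordsPerCaption words each and
// renders them as SRT blocks (index, time range, text), separated by blank lines.
function wordsToSrt(words: TimedWord[], wordsPerCaption: number = 7): string {
  const size = Math.max(1, Math.floor(wordsPerCaption)); // guard against 0 or negatives
  const blocks: string[] = [];
  for (let i = 0; i < words.length; i += size) {
    const group = words.slice(i, i + size);
    const start = toSrtTimestamp(group[0].startSec);
    const end = toSrtTimestamp(group[group.length - 1].endSec);
    const text = group.map((w) => w.word).join(" ");
    blocks.push(`${blocks.length + 1}\n${start} --> ${end}\n${text}\n`);
  }
  return blocks.join("\n");
}
```

Splitting on a fixed word count is the simplest possible grouping; splitting on pauses or punctuation would likely read better but needs more information than this sketch assumes.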

dylman123 · 2020-04-30T02:41:24Z

Done! Using Node.js with Google Firebase Functions! Big win.

dylman123 added the "refactor" label (Fundamentally change code structure) Apr 26, 2020

dylman123 added a commit that referenced this issue Apr 26, 2020

first attempt at solving issue #19 - hitting DNS error

f610f69

dylman123 added a commit that referenced this issue Apr 26, 2020

enabled network client. started testing API calls to Rev API - issue #19

1c7dc86

dylman123 mentioned this issue Apr 28, 2020

Upload audio file to the cloud + download captions JSON #23

Closed

dylman123 added a commit that referenced this issue Apr 30, 2020

finally got a decent JSON output in GCS! issue #19

751d6df

dylman123 closed this as completed Apr 30, 2020
