<a href="https://colab.research.google.com/github/coneill000/French-Flashcards-with-Anki/blob/main/French_Speech_Recognition_and_Translation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# French Speech Recognition and Translation
Welcome to my speech recognition and translation program. I designed it to speed up making French Anki flashcards from the audio that comes with the textbook. The program first recognizes the French word spoken in each audio file and renames the file accordingly. It then translates the recognized word into English. All of this information is stored in a Google Sheets document, where you can manually check for errors, missing vocab words, etc. Yeah, it's kind of overly complicated, but who cares.

## Imports
Run this first so you have the necessary imports. The imports are split into two cells: one for pip installs and one for library imports.

In [None]:
!pip install pydub
!pip install SpeechRecognition
!pip3 install googletrans

Collecting pydub
  Downloading https://files.pythonhosted.org/packages/7b/d1/fbfa79371a8cd9bb15c2e3c480d7e6e340ed5cc55005174e16f48418333a/pydub-0.24.1-py2.py3-none-any.whl
Installing collected packages: pydub
Successfully installed pydub-0.24.1
Collecting SpeechRecognition
  Downloading https://files.pythonhosted.org/packages/26/e1/7f5678cd94ec1234269d23756dbdaa4c8cfaed973412f88ae8adf7893a50/SpeechRecognition-3.8.1-py2.py3-none-any.whl (32.8MB)
Installing collected packages: SpeechRecognition
Successfully installed SpeechRecognition-3.8.1
Collecting googletrans
  Downloading https://files.pythonhosted.org/packages/71/3a/3b19effdd4c03958b90f40fe01c93de6d5280e03843cc5adf6956bfc9512/googletrans-3.0.0.tar.gz
Collecting httpx==0.13.3
  Downloading https://files.pythonhosted.org/packages/54/b4/698b284c6aed4d7c2b4fe3ba5df1fcf6093612423797e76fbb24890dd22f/httpx-0.13.3-py3-none-any.whl (55kB)

In [None]:
import os
import re
import gspread
import speech_recognition as sr
from pydub import AudioSegment
from googletrans import Translator, constants
from google.colab import auth
from oauth2client.client import GoogleCredentials

## Google Authorization
Run this to give the program access to your Google Sheets. Please make sure that your Google Drive is mounted in Google Colab first, since the audio files are read from Drive.
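If your Drive is not mounted yet, something like the following should do it. This is only a minimal sketch: `mount_drive` is a helper name made up here, and the `/content/drive` mount point is an assumption chosen to match the `directory` path used further down. Outside of Colab the helper does nothing and returns `False`.

```python
# Minimal sketch: mount Google Drive when running inside Google Colab.
# Assumption: '/content/drive' is the mount point, matching the `directory` path used later.
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive in Colab; return False when not running in Colab."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return False
    drive.mount(mount_point)  # prompts for authorization in the notebook
    return True


mounted = mount_drive()
print("Drive mounted:", mounted)
```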

In [None]:
auth.authenticate_user()
gc = gspread.authorize(GoogleCredentials.get_application_default())

## User-Set Variables


In [None]:
directory = '/content/drive/My Drive/Ch2Vocab'  # change this to the directory where the audio files are stored
sheetname = 'En avant Ch2Vocab'  # change this to the desired name of the Google Sheets document
tag = 'EnAvantCh2'  # change this to the tag written in the Tags column for every word

## Main Program (aka Under the Hood)

In [None]:
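# Folders for clips that were recognized ('success') and clips that were not ('error')
successpath = os.path.join(directory, 'success')
errorpath = os.path.join(directory, 'error')
os.makedirs(successpath, exist_ok=True)  # no error if the folder already exists
os.makedirs(errorpath, exist_ok=True)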

In [None]:
r = sr.Recognizer()
translator = Translator()
frenchlist = []

In [None]:
for filename in os.listdir(directory):
    # Only process clips named S.mp3 or S-<1 to 3 digits>.mp3
    if re.fullmatch(r"S(-\d{1,3})?\.mp3", filename):
        src = os.path.join(directory, filename)
        dst = os.path.join(directory, f'{filename[:-4]}.wav')
        # SpeechRecognition cannot read MP3, so convert to WAV first
        sound = AudioSegment.from_mp3(src)
        sound.export(dst, format="wav")
        with sr.AudioFile(dst) as source:
            audio = r.record(source)
        try:
            french = r.recognize_google(audio, language='fr-FR')
            os.rename(src, os.path.join(successpath, f'{french}.mp3'))
            frenchlist.append(french)
        except (sr.UnknownValueError, sr.RequestError, OSError):
            # Unrecognized speech, API failure, or a failed rename: set the file aside
            os.rename(src, os.path.join(errorpath, filename))
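
To see which filenames the filter in the loop is meant to accept, here is a small standalone check. It assumes the clips are named `S.mp3` or `S-` followed by one to three digits; `CLIP_PATTERN` and `is_clip` are names made up for this sketch. Using `fullmatch` with an escaped dot rejects near misses such as `Sxmp3` or `S-7.mp3.bak`, which a start-anchored `re.match` with a bare `.` would let through.

```python
import re

# Assumed naming scheme: "S.mp3" or "S-<1 to 3 digits>.mp3"
CLIP_PATTERN = re.compile(r"S(-\d{1,3})?\.mp3")


def is_clip(filename):
    """Return True only if the whole filename fits the assumed clip pattern."""
    return CLIP_PATTERN.fullmatch(filename) is not None


# Hypothetical filenames, just to exercise the pattern
for name in ["S.mp3", "S-7.mp3", "S-123.mp3", "S-1234.mp3", "Sxmp3", "S-7.mp3.bak"]:
    print(name, is_clip(name))
```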

In [None]:
# Open the sheet if it already exists; otherwise create it
try:
    sh = gc.open(sheetname)
except gspread.SpreadsheetNotFound:
    sh = gc.create(sheetname)
worksheet = sh.sheet1

In [None]:
#sets up basic worksheet information
worksheet.update_cell(1, 1, "French")
worksheet.update_cell(1, 2, "English")
worksheet.update_cell(1, 3, "Audio")
worksheet.update_cell(1, 4, "Tags")

num = len(frenchlist) + 1
french_cells = worksheet.range(f'A2:A{num}')
meaning_cells = worksheet.range(f'B2:B{num}')
audio_cells = worksheet.range(f'C2:C{num}')
tag_cells = worksheet.range(f'D2:D{num}')

In [None]:
#fills in Google Sheet with information
for i, cell in enumerate(french_cells):
    cell.value = frenchlist[i]

for i, cell in enumerate(meaning_cells):
    translation = translator.translate(frenchlist[i], src="fr")
    cell.value = translation.text

for i, cell in enumerate(audio_cells):
    cell.value = f"[sound:{frenchlist[i]}.mp3]"

for i, cell in enumerate(tag_cells):
    cell.value = tag

worksheet.update_cells(french_cells)
worksheet.update_cells(meaning_cells)
worksheet.update_cells(audio_cells)
worksheet.update_cells(tag_cells)

{'spreadsheetId': '1ndHSdxtLKCH_txTcJjCKmwqZ22KMjZyW1oNacYnKGL8',
 'updatedCells': 155,
 'updatedColumns': 1,
 'updatedRange': 'Sheet1!D2:D156',
 'updatedRows': 155}
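
For reference, the sheet ends up with one header row plus one row per recognized word, in the column order French, English, Audio, Tags. The sketch below rebuilds that layout as a plain list of rows without touching Google Sheets. `build_rows` is a hypothetical helper and the two sample words are made-up data, not output from the run above.

```python
def build_rows(french_words, english_words, tag):
    """Return the sheet contents as a list of rows: header first, then one row per word."""
    rows = [["French", "English", "Audio", "Tags"]]
    for french, english in zip(french_words, english_words):
        # The Audio column uses Anki's [sound:...] syntax and the renamed mp3 file
        rows.append([french, english, f"[sound:{french}.mp3]", tag])
    return rows


# Made-up sample data for illustration only
rows = build_rows(["bonjour", "merci"], ["hello", "thank you"], "EnAvantCh2")
for row in rows:
    print(row)
```

A list like this is also handy for eyeballing the data before anything is written to the sheet.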