<a href="https://colab.research.google.com/github/arnold402/NLP_project/blob/main/Alina%2C_your_new_assistant.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Meet Alina, your new assistant


A NLP project for DSI Africa 2022 module 3 by Antsa, Arnold and Ulrich, tutored by Edoardo.

Why would you miss your important meeting/seminars/lecture if you can be updated on all events of your DSI calendar by asking a chatbot ?

Alina is a simple chatbot that interacts with google calendar. The code consists of 3 main parts: a chatbot section, an nlp section and a google calendar query API section.

The chatbot setion using google speech recogniton to listen the to users input and also to speak to the user (by converting text to speech). The chatbot currently supports English and French. The chatbot, which is called Alina, can be awakened using any of the following phrases "Hi Alina", "Hey Alina" or "Bonjour Alina" (for french). Based on the wake-up phrase, we choose the chatbot sets the default language it expects to be either English or French. The chatbot can ended be using any phrase that involves "bye" or "Au Revoir".

The chatbot listens to any user and converts any input phrase to a text (string). The spoken text is then passed to the machine learning model, which tries to identify what is the user's query. For this purpose, we used a multilingual model pre-trained on 15 languages to encode the input sentence from the user. This sentence is then compared against our database's English and French encodings of various sentences. Each of these sentences represents a query that the assistant can perform. We use a cosine distance as a similarity metric between the new sentence encoding and all the sentences in our database. The query for the sentence with the maximum distance is selected. We also set a threshold only to validate the query if the maximum distance is at aleast 0.5 (as the cosine similarity ranges from 0 to 1).

Finally, the selected query is implemented using functionalities from google calendar API. The API requires the user to provide a credentials file to allow the app to access their google calendar. Furthermore, a little bit of string manipulations and regular expressions (handle by the dateutil package) is required to capture dates and time information from the input sentence. The queried information from the google calendar is then passed to the text-to-speech functionality of the chatbot to respond to the user.

The chatbot has two different version, the one running in a local machine which can be implemented via following the read.me file of this repository and a one that is running in Colab Notebook, which you are currently reading.

Some changes were made to adapt to the fact that the speach recognition for the chatbot itself could not access the local microphone of our computer.


The chatbot was inspired by [this Medium article](https://towardsdatascience.com/ai-chatbot-with-nlp-speech-recognition-transformers-583716a299e9), the Google API codes from [this tutorial](https://www.codespeedy.com/access-google-calendar-data-with-python/). 


To access the microphone, we used code from [this notebook](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples.ipynb#scrollTo=jc7ZqfooYZnD).

### Getting ready...let us prepare ourself

First of all, I will be mounting my google drive where I stored my credential for my calendar.

In [1]:
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

Mounted at /content/gdrive


In [2]:
cd /content/gdrive/MyDrive/Calendar

/content/gdrive/MyDrive/Calendar


Then let us install the requirements. 

In [6]:
# !pip install google-api-core==2.5.0
# !pip install google-api-python-client==1.12.10
# !pip install google-auth==2.3.3
# !pip install google-auth-httplib2==0.1.0
# !pip install google-auth-oauthlib==0.4.6
# !pip install google-images-download==2.8.0
# !pip install google-pasta==0.2.0
# !pip install googleapis-common-protos==1.54.0
# !pip install sentence-transformers
# !pip install SpeechRecognition
# !pip install gTTS
# !pip install transformers
# !apt install libasound2-dev portaudio19-dev libportaudio2 libportaudiocpp0 ffmpeg
# !pip install PyAudio
# !pip install fasttext
#!pip install playsound
#!pip install -q omegaconf torchaudio pydub




Now, we are going to get the pretrained model we will be using.

In [None]:
# !wget -P /content/gdrive/MyDrive/Calendar/model/ https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin


Now that we have all our requirements and downloaded the model, we are ready to go.

### Google calendar API

I pickled the credential and I putted it in the folder, you can see how to do this with the  tutorial listed above.

In [3]:
from __future__ import print_function
from calendar import calendar

import datetime
import os.path

from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

from dateutil.parser import parse as dtparse
from datetime import datetime as dt
import pandas as pd
import pytz

import pickle

# If modifying these scopes, delete the file token.json.
SCOPES = ['https://www.googleapis.com/auth/calendar.readonly']
calendar_id = 'antsa@dsi-program.com'
now = datetime.datetime.utcnow().isoformat() + 'Z'
end_DSI = '2022-05-31T23:59:0.0Z'
LANGUAGE = "en"


def load_calendar():
    """
    Load the calendar and return the events_results.
    This is basic showcase, we can pass options later
    """
    creds = None
    # The file token.json stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    
    # I have already the pickled token on my folder, if you do not have such,
    # no worry, you will be able to login, but having a pickled file is time saving
    if os.path.exists('/content/gdrive/MyDrive/Calendar/token.pkl') :
        creds = pickle.load(open("/content/gdrive/MyDrive/Calendar/token.pkl", "rb"))
    elif os.path.exists('api/token.json'):
        creds = Credentials.from_authorized_user_file('api/token.json', SCOPES)
    # If there are no (valid) credentials available, let the user log in.
    else:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'api/client_secret.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('api/token.json', 'w') as token:
            token.write(creds.to_json())

    try:
        service = build('calendar', 'v3', credentials=creds)
    except HttpError as error:
        print('An error occurred: %s' % error)
    
    return service


# now we should implements all the queries here each as a function
def next_event(service = load_calendar(), taskdate = now):
    # we might not need the taskdate for every function
    try:
        now = datetime.datetime.utcnow().isoformat() + 'Z'
        events_result = service.events().list(
            calendarId=calendar_id, timeMin=now,
            maxResults=1, singleEvents=True,
            orderBy='startTime').execute()

        events = events_result.get('items', [])
        #tmfmt = '%d %B, %H:%M %p'
        tf = '%H:%M %p'
        response = [(event['summary'], 
                    event['start'].get('dateTime', event['start'].get('date')), 
                    event['end'].get('dateTime', event['end'].get('date'))) for event in events][0]
        
        if LANGUAGE == "en":
            response = "Your next meeting is " + response[0] + ". It starts at " + dt.strftime(dtparse(response[1]), format=tf)
        else:
            response = "Votre prochaine reunion est " + response[0] + ". Elle debute a " + dt.strftime(dtparse(response[1]), format=tf)           
        
    except:
        response = "You dont have meetings left today"  if LANGUAGE == "en" else "Vous n'avez plus de reunion prevu pour aujourd'hui" 
    return response

def action_time(service = load_calendar(), taskdate= now):
    return datetime.datetime.now().time().strftime('%H:%M')

def repeat_question(service = load_calendar(), taskdate = now):
    return "sorry can you please repeat your question"

def repeat_question_fr(service = load_calendar(), taskdate = now):
    return "s'il vous plait pouvez-vous repeter votre question"

def tomorrow_meeting(service = load_calendar()):
  tomorrow_start = (datetime.date.today() + datetime.timedelta(days=1)).isoformat() + 'T00:00:0.0Z'
  tomorrow_end = (datetime.date.today() + datetime.timedelta(days=1)).isoformat() + 'T23:59:0.0Z'

  events_result = service.events().list(
      calendarId=calendar_id, timeMin=tomorrow_start, timeMax=tomorrow_end,
      maxResults=10, singleEvents=True,
      orderBy='startTime').execute()

  events = events_result.get('items', [])

  tf = '%H:%M %p'
  response = [(event['summary'], 
              event['start'].get('dateTime', event['start'].get('date')), 
              event['end'].get('dateTime', event['end'].get('date'))) for event in events]

  if LANGUAGE == "en":
    response = "Your tomorow's events are as follows: ", [(response[0][0],'at', dt.strftime(dtparse(response[0][1]), format=tf)) for item in response]
  else:
    response = "Vos reunions de demain sont les suivantes: ", [(response[0][0],'a parti de', dt.strftime(dtparse(response[0][1]), format=tf)) for item in response]
  return response

def get_all_event(service = load_calendar()) :
  """ taking all event in a panda dataframe to access it easily with its rows and column"""
  source_time_zone = pytz.timezone('Africa/Johannesburg')
  target_time_zone = pytz.timezone('America/New_York')
  dsi_start_date = datetime.datetime(2022, 1, 24,0,0,0, tzinfo=None) 
  # As we are working in different timezone, we better localize the time 
  dsi_start_date_SAST = source_time_zone.localize(dsi_start_date)
  # Get events from Google Calendar API
  now = datetime.datetime.utcnow().isoformat() + 'Z'
  events_result = service.events().list(
      calendarId=calendar_id).execute()
  events_result['items']
  import pandas as pd
  all_events = pd.DataFrame(events_result['items'])
  return all_events

def format(mydate):
  """Format the date in a nice format for the reader""" 
  time = dt.strftime(dtparse(mydate['dateTime']), format= '%B %d, %Y, %r')
  timezone = mydate['timeZone']
  if timezone == 'Africa/Johannesburg' :
    timezone = "SAST"
  if timezone == 'America/New_York' :
    timezone = "EST"    
  return(time+' '+ timezone  )

def is_on(event, topic = 'NLP'):
  try :
     return(topic in event)
  except :
    return False

def readytoread(eventdf):
  s = 'The list of concerned event are : '
  i = 0
  for row in eventdf.itertuples():
      i += 1
      s = s+ ' Event number ' + str(i) +":"
      s = s+ (row.summary + " starting at " +  row.start + " and ending at "+ row.end)
      
  s =s+ " You have " + str(i)  +" such event, Thank you for asking."
  return(s) 

def readytoread_fr(eventdf):
  s = 'Votre liste de reunions est la suivante : '
  i = 0
  for row in eventdf.itertuples():
      i += 1
      s = s+ ' Reunion numero ' + str(i) +":"
      s = s+ (row.summary + " de " +  row.start + " a "+ row.end)
      
  s =s+ " Vous avez " + str(i)  +" de ce genre d'evenements."
  return(s)   

def list_event_on(service = load_calendar(),topic= 'NLP') :
  
  all_events = get_all_event()
  
  try :  
    idx = [i for i in range(len(all_events['summary'])) if is_on(all_events['summary'][i], topic)]
    if len(idx)==0 :
      return('There is no such event' if LANGUAGE=="en" else "Vous n'avez aucun evenements de ce genre")
      
    on_topic = all_events.loc[idx, ['summary', 'start','end']]
    on_topic["start"] =  (on_topic["start"]).apply(format) 
    on_topic["end"] = (on_topic["end"]).apply(format) 
    return(readytoread(on_topic) if LANGUAGE=="en" else readytoread_fr(on_topic))
  except : repeat_question if LANGUAGE == "en" else repeat_question_fr


def is_ons(event, keywords):

  try :
     mybool =True
     for keyword in keywords :
       mybool = mybool and (keyword in event)
     return(mybool)
  except :
    return False

def list_event_ons(service = load_calendar(),keywords = ['Lecture','Bruce'], source_data = get_all_event(), features =  ['summary', 'start','end'] ) :
    try :  
      idx = [i for i in range(len(source_data['summary'])) if is_ons(source_data['summary'][i], keywords)]
      if len(idx)==0 :
        return('Nope! We are not having such event' if LANGUAGE=="en" else "Nous n'avons pas ce type d'evenements")
        
      on_topic = source_data.loc[idx,features]
      on_topic["start"] =  (on_topic["start"]).apply(format) 
      on_topic["end"] = (on_topic["end"]).apply(format) 

      return(readytoread(on_topic) if LANGUAGE=="en" else readytoread_fr(on_topic))
    except : repeat_question if LANGUAGE == "en" else repeat_question_fr

def get_week(date):
  """Return the full week (Sunday first) of the week containing the given date.
  'date' may be a datetime or date instance (the same type is returned).
  """
  one_day = datetime.timedelta(days=1)
  day_idx = (date.weekday() + 1) % 7  # turn sunday into 0, monday into 1, etc.
  sunday = date - datetime.timedelta(days=day_idx)
  date = sunday
  for n in range(7):
    yield date
    date += one_day

def give_start_end(timeframe):
  '''given a timeframe keyword, return the start and end in a list'''
  try :
    daynear = ['today', 'tomorrow', 'yesterday']
    for i in range(-1,2) : 
      if daynear[i] in timeframe :
        the_day_start = (datetime.date.today() + datetime.timedelta(days=i)).isoformat() + 'T00:00:0.0Z'
        the_day_end = (datetime.date.today() + datetime.timedelta(days=i)).isoformat() + 'T23:59:0.0Z'
        return([the_day_start,the_day_end ])
    
    weeknear = ['this','next','last']
    for i in range(-1,2) : 
      if (weeknear[i] in timeframe) and ('week' in timeframe) :
        the_day  = (datetime.date.today() + datetime.timedelta(days=i*7) )
        the_week = [d.isoformat() for d in get_week(the_day)]

        week_start_date = the_week[0] + 'T00:00:0.0Z'
        week_end_date = the_week[-1] + 'T23:59:0.0Z'
        return([week_start_date,week_end_date ])
    number_day_in_month = [31, 28, 31, 30, 31, 30, 31, 30, 31, 30, 31, 30] 
    month_near = ['this','next','last']
    currentMonth = datetime.datetime.now().month
    currentYear = datetime.datetime.now().year
    for i in range(-1,2) :
      if (month_near[i] in timeframe and 'month' in timeframe  ) :
        month = str(currentMonth +i)

        if len(month) == 1 :
          month = '0'+month # so that March will be 03 and so on

        month_start = str(currentYear) +'-'+(month) +'-01T00:00:0.0Z'
        month_end = str(currentYear) +'-'+(month)+'-' +str(number_day_in_month[currentMonth +i -1] )+ 'T23:59:0.0Z'
        return([month_start, month_end ])
  except :
    return('Can you give a valid timeframe' if LANGUAGE=="en" else "Donne une zones horraire valide")   

def now() : 
  return datetime.datetime.utcnow().isoformat() + 'Z'

def querries( service =  load_calendar(), keywords= ['Stand-up'], timeframe = [now(), end_DSI ], maxResults = 10, features =  ['summary', 'start','end'] ) :
  """" Given keywords and timeframe, return the event corresponding to this timeframe on the keywords"""
  try :
    if len(timeframe) == 1 : # that is, the timeframe is a list of string and we will define start and end
        timeframe = give_start_end(timeframe[0])
    
    events_result = service.events().list(
      calendarId=calendar_id, timeMin = timeframe[0], timeMax=timeframe[1],
      maxResults=maxResults, singleEvents=True,
      orderBy='startTime').execute()
    # changed
    if events_result['items']==[]: return 'No such event' if LANGUAGE == "en" else "Aucun evenement"
    ##

    events = pd.DataFrame(events_result['items'])
    
    return( list_event_ons(service,keywords, source_data = events, features=features ))
  except : (repeat_question if LANGUAGE == "en" else repeat_question_fr)


## Since all element in the Queries dictionary need to call on function each, we provide the following function based on the question
def next_seminar(service,now):
  return querries(keywords = ['Seminar'])
def next_stand_ups(service= load_calendar(), now = now()):
  return list_event_ons(keywords = ['Stand-up'])
def next_stand_up(service= load_calendar(), now = now()):
  return querries(keywords=  ['Stand-up'] ,maxResults=1)
def week_stand_up(service= load_calendar(), now = now()):
  return querries(keywords=  ['Stand-up'] , timeframe = ['this week'], maxResults=100)
def lecture_on_dashboard(service= load_calendar(), now = now()):
  return querries(keywords=  ['ashboard'] , timeframe = ['next week'], maxResults=100)

def seminar_this_week(service= load_calendar(), now = now()):
  return querries(keywords=  ['eminar'] , timeframe = ['this week'], maxResults=100)
def seminar_last_week(service= load_calendar(), now = now()):
  return querries(keywords=  ['eminar'] , timeframe = ['last week'], maxResults=100)  
def seminar_next_week(service= load_calendar(), now = now()):
  return querries(keywords=  ['eminar'] , timeframe = ['next week'], maxResults=100)
def lecture_this_week(service= load_calendar(), now = now()):
  return querries(keywords=  ['ecture'] , timeframe = ['this week'], maxResults=100)
def lecture_last_week(service= load_calendar(), now = now()):
  return querries(keywords=  ['ecture'] , timeframe = ['last week'], maxResults=100)
def lecture_next_week(service= load_calendar(), now = now()):
  return querries(keywords=  ['ecture'] , timeframe = ['next week'], maxResults=100)
def talk_next_week(service= load_calendar(), now = now()):
  return querries(keywords=  ['Talk'] , timeframe = ['next week'], maxResults=100)
def talk_this_week(service= load_calendar(), now = now()):
  return querries(keywords=  ['Talk'] , timeframe = ['this week'], maxResults=100)
def talk_last_week(service= load_calendar(), now = now()):
  return querries(keywords=  ['Talk'] , timeframe = ['last week'], maxResults=100)

def seminar_this_month(service= load_calendar(), now = now()):
  return querries(keywords=  ['eminar'] , timeframe = ['this month'], maxResults=100)
def seminar_last_month(service= load_calendar(), now = now()):
  return querries(keywords=  ['eminar'] , timeframe = ['last month'], maxResults=100)  
def seminar_next_month(service= load_calendar(), now = now()):
  return querries(keywords=  ['eminar'] , timeframe = ['next month'], maxResults=100)
def lecture_this_month(service= load_calendar(), now = now()):
  return querries(keywords=  ['ecture'] , timeframe = ['this month'], maxResults=100)
def lecture_last_month(service= load_calendar(), now = now()):
  return querries(keywords=  ['ecture'] , timeframe = ['last month'], maxResults=100)
def lecture_next_month(service= load_calendar(), now = now()):
  return querries(keywords=  ['ecture'] , timeframe = ['next month'], maxResults=100)
def talk_next_month(service= load_calendar(), now = now()):
  return querries(keywords=  ['Talk'] , timeframe = ['next month'], maxResults=100)
def talk_this_month(service= load_calendar(), now = now()):
  return querries(keywords=  ['Talk'] , timeframe = ['this month'], maxResults=100)
def talk_last_month(service= load_calendar(), now = now()):
  return querries(keywords=  ['Talk'] , timeframe = ['last month'], maxResults=100)

# now let have a general function query that should pick which querry we run
# Queries is a dictionary with all the relevant functions for each task in our questions-task.csv file
Queries = {}
Queries["Get next event"] = next_event
Queries["Get time"] = action_time
Queries["Repeat"]  = repeat_question
Queries["Repeat_fr"]  = repeat_question_fr
Queries['Tommorow meeting'] = tomorrow_meeting
Queries["Event on NLP"] = list_event_on
# here we can extend this dictionary with as many topic as we can
Queries["Lecture by Bruce"] = list_event_ons
Queries['next stand ups'] = next_stand_ups
Queries['next stand up'] = next_stand_up
Queries['stand-up this week'] = week_stand_up
Queries['lecture on Dashboard'] = lecture_on_dashboard

Queries['seminars this week'] = seminar_this_week
Queries['seminars last week'] = seminar_last_week
Queries['seminars next week'] = seminar_next_week
Queries['lectures this week'] = lecture_this_week
Queries['lectures last week'] = lecture_last_week
Queries['lectures next week'] = lecture_next_week
Queries['Talks last week'] = talk_last_week
Queries['Talks next week'] = talk_next_week
Queries['Talks this week'] = talk_this_week
Queries['seminars this month'] = seminar_this_month
Queries['seminars last month'] = seminar_last_month
Queries['seminars next month'] = seminar_next_month
Queries['lectures this month'] = lecture_this_month
Queries['lectures last month'] = lecture_last_month
Queries['lectures next month'] = lecture_next_month
Queries['Talks last month'] = talk_last_month
Queries['Talks next month'] = talk_next_month
Queries['Talks this month'] = talk_this_month


def run_query(service, query):
    """
    finds the relevant query to run and calls the right query function for it,
    then returns the text to be displayed by the chatbot
    Args:
        service (google calendar we have intialised)
        query (tuple)
            (sentence and date for the task (note that this maybe empty))
    """

    task, taskdate = query
    res = Queries[task](service, taskdate)

    return res

### Model.py


In [4]:
from asyncio import tasks
import pandas as pd
import numpy as np
from dateutil.parser import parse

from sentence_transformers import SentenceTransformer

from sklearn.metrics.pairwise import cosine_similarity

import fasttext

from datetime import *; from dateutil.relativedelta import *

mycsvfile = "model/questions-task.csv"

PRETRAINED_MODEL_PATH = 'model/lid.176.bin'


def get_period(string):
    """Checks if sentence contains any period of the day
    """
    if "morning" in string.lower():
        return "morning" 
    elif "afternoon" in string.lower():
        return "afternoon" 
    elif "evening" in string.lower():
        return "evening"
    else:
        return None

def get_date(string, fuzzy=False):
    """
    Returns the data in a string can be interpreted as a date.
    Args:
       string (str) 
           -string to check for date
       fuzzy: bool 
           -ignore unknown tokens in string if True
    Returns:
        Date if any
    """
    try: 
        mydate = parse(string, fuzzy=fuzzy)
        return mydate
    except ValueError:
        TODAY = date.today()
        period = get_period(string)

        if "today" in string.lower():
            return TODAY, period
        elif "tomorrow" in string.lower():
            return TODAY + relativedelta(days=+1), period
        elif "this week" in string.lower():
            return "this week"
        elif "next week" in string.lower():
            return "next week"
        else:
            return None 

class DistanceModel(object):
    """Implementation of simple model for computing the similarity between
       sentences using the cosine similarity between the embedded vectors
    """

    def __init__(self, mycsvfile=mycsvfile):
        """
        Initialise the model for sentence encording using the sentenceTransform library
        and apply the encording to all the sentences in our database
        Args:
            ---
        """

        df = pd.read_csv(mycsvfile)
        sentences_en = df["Questions_en"].to_list()
        sentences_fr = df["Questions_fr"].to_list()
        self.tasks = df["Tasks"].to_list()

        self.model = SentenceTransformer('distiluse-base-multilingual-cased-v1')
        self.sentence_embeddings_en = self.model.encode(sentences_en)
        self.sentence_embeddings_fr = self.model.encode(sentences_fr)

        self.lang_detector = fasttext.load_model(PRETRAINED_MODEL_PATH)


    def predict(self, sentence):
        """
        Return the sentence predicted by the picking the sentence from out database
        with the highest cosine similarity with the input sentence
        Args:
            sentence (sting)
              -input sentence
        Returns
            task (string)
              -Action to be perfomed by the API
            date (datatime.date)
              - a date time, a string or none if it can't read it
        """

        one_embedding = self.model.encode(sentence)
        distances_en = cosine_similarity([one_embedding], self.sentence_embeddings_en)
        distances_fr = cosine_similarity([one_embedding], self.sentence_embeddings_fr)
        distances = (distances_en + distances_fr)/2

        max_dist = np.max(distances)

        lang = self.lang(sentence)

        thresh_hold = 0.5 #if lang == "en" else 0.2

        if max_dist < thresh_hold:
            if lang == "en":
                return "Repeat", None
            else:
                return "Repeat_fr", None

        task = self.tasks[np.argmax(distances)]

        task_date = get_date(sentence)
        
        return task, task_date
    
    def lang(self, sentence):
        """
        Use a pretrained model to detect the language, for now we only want french and english but this could literally be any language
        Args:
            sentence (string)
        Returns:
            language code (string)
             - en for english or fr for french
        """
        
        lr, _ = self.lang_detector.predict(sentence)
        
        return "en" if "en" in lr[0] else "fr"

### Accessing my microphone on Colab

####   Dependencies and Imports

In [5]:
import os
from os.path import exists

if not exists('silero-models'):
  !git clone -q --depth 1 https://github.com/snakers4/silero-models

%cd silero-models

# silero imports
import torch
import random
from glob import glob
from omegaconf import OmegaConf
from src.silero.utils import (init_jit_model, 
                       split_into_batches,
                       read_audio,
                       read_batch,
                       prepare_model_input)
from colab_utils import (record_audio,
                         audio_bytes_to_np,
                         upload_audio)

device = torch.device('cpu')   # you can use any pytorch device
models = OmegaConf.load('models.yml')

# imports for uploading/recording
import numpy as np
import ipywidgets as widgets
from scipy.io import wavfile
from IPython.display import Audio, display, clear_output
from torchaudio.functional import vad


# wav to text method
def wav_to_text(f='test.wav'):
  batch = read_batch([f])
  input = prepare_model_input(batch, device=device)
  output = model(input)
  return decoder(output[0].cpu())

/content/gdrive/MyDrive/Calendar/silero-models


There I recopy my model folder on Google Drive to this silero-models folder.

#### Transcribe

You can definitely understand more of these [here](https://aveysov.medium.com/modern-google-level-stt-models-released-c6491019e30c).

In [8]:
from threading import Event
# need to apply a time delay while waiting for the user

In [9]:
#@markdown { run: "auto" }

language = "English" #@param ["English", "German", "Spanish"]

print(language)
if language == 'German':
  model, decoder = init_jit_model(models.stt_models.de.latest.jit, device=device)
elif language == "Spanish":
  model, decoder = init_jit_model(models.stt_models.es.latest.jit, device=device) 
else:
  model, decoder = init_jit_model(models.stt_models.en.latest.jit, device=device)

English


In [10]:
#@markdown { run: "auto" }

use_VAD = "No" #@param ["Yes", "No"]

In [11]:
#@markdown Either record audio from microphone or upload audio from file (.mp3 or .wav) { run: "auto" }

record_or_upload = "Record" #@param ["Record", "Upload (.mp3 or .wav)"]
record_seconds =   10 #@param {type:"number", min:1, max:10, step:1}
sample_rate = 16000

def _apply_vad(audio, boot_time=0, trigger_level=9, **kwargs):
  print('\nVAD applied\n')
  vad_kwargs = dict(locals().copy(), **kwargs)
  vad_kwargs['sample_rate'] = sample_rate
  del vad_kwargs['kwargs'], vad_kwargs['audio']
  audio = vad(torch.flip(audio, ([0])), **vad_kwargs)
  return vad(torch.flip(audio, ([0])), **vad_kwargs)

def _recognize(audio):
  display(Audio(audio, rate=sample_rate, autoplay=True))
  if use_VAD == "Yes":
    audio = _apply_vad(audio)
  wavfile.write('test.wav', sample_rate, (32767*audio).numpy().astype(np.int16))
  transcription = wav_to_text()
  print('\n\nTRANSCRIPTION:\n')
  print(transcription)
  return(transcription)

def _record_audio(b):
  clear_output()
  audio = record_audio(record_seconds)
  wavfile.write('recorded.wav', sample_rate, (32767*audio).numpy().astype(np.int16))
  _recognize(audio)

def _upload_audio(b):
  clear_output()
  audio = upload_audio()
  _recognize(audio)
  return audio

if record_or_upload == "Record":
  button = widgets.Button(description="Record Speech")
  button.on_click(_record_audio)
  display(button)
else:
  audio = _upload_audio("")

Button(description='Record Speech', style=ButtonStyle())

In [12]:

def get_audio() :
   print("You can speak in 3 second !")
   # Need to push button before being able to speak 
   for i in range(3):
     print(i+1)
    Event().wait(1)
    
   Event().wait(2)
   
   audio = record_audio(record_seconds)
   transcript = _recognize(audio)
   return transcript


In [13]:
# I wanted this  code to allow the user to press a button before recording but it did but it did not work out right in the chatbot
transcription =''
def start_record(b):
  print("You can speak!")
  global transcription
  audio = record_audio(record_seconds)
  wavfile.write('recorded.wav', sample_rate, (32767*audio).numpy().astype(np.int16))
  transcription = wav_to_text('recorded.wav')
  print('\n\nTRANSCRIPTION:\n')
  print(transcription)

def aud( ):
    
    button = widgets.Button(description="Record Speech")
    button.on_click(start_record)
    display(button)    
    
    

Now, I changed the function on Bot.py so taht I can access it from my colab. 

### Bot.py

In [None]:
import numpy as np
import speech_recognition as sr
from gtts import gTTS
import os
import platform
from IPython.display import  Audio, display
import playsound # to play the sound of mp3

#import api.calendar_api as calendar_api 
#from model.model import DistanceModel

# check platform 
if "Linux" in platform.system():
    player = "vlc --play-and-exit" # this is only for my local computer
elif "Darwin" in platform.system():
    player = "afplay"
elif "Windows" in platform.system():
    player = "start"
else:
    player = "start"  # not sure how to handle this. I assume we must definitely have one of these

# load the service
service = load_calendar()
 
# Build the assistant
class ChatBot():
    def __init__(self, name, lang="en-US"):
        print("--- starting up", name, "---")
        self.name = name
        self.lang = lang

    def speech_to_text(self):
        """This function allow users to record their voice and return the transcript"""
        # recognizer = sr.Recognizer()
        # with sr.Microphone() as mic:
        #     print("listening...")
        #     audio = recognizer.listen(mic) # changed

        # try:
        #     self.text = recognizer.recognize_google(audio, language=self.lang)
        #     print("me --> ", self.text)
        try :
            self.text = get_audio()
        except:
            print("me -->  ERROR")
    


    def set_lang(self, lang):
        self.lang = lang

    @staticmethod
    def text_to_speech(text, lang="en"):
        print("assistant --> ", text)
        speaker = gTTS(text=text, lang=lang, slow=False)
        speaker.save("res.mp3")
        # I changed it so that it can be read on colab

        #playsound.playsound('res.mp3', True)
        #os.system(f"{player} res.mp3")  #mac->afplay | windows->start
        display(Audio("res.mp3"))
        
        os.remove("res.mp3") # trying to see if I will be able to play
        #Event.wait(5)

    def wake_up(self, text):
        wakeup_hey = "hey %s"%self.name
        wakeup_hi = "hi %s"%self.name
        wakeup_bj = "bonjour %s"%self.name 

        if wakeup_hey in text.lower() or wakeup_hi in text.lower() or wakeup_bj in text.lower():
            return True
        else:
            return False
    
    def good_bye(self, text):
        bye_txt = "bye"
        
        if bye_txt in text.lower():
            return True
        else:
            return False
    
    def aurevoir(self, text):
        bye_txt = "au revoir"

        if bye_txt in text.lower():
            return True
        else:
            return False

# Run the assistant
if __name__ == "__main__":
    
    assistant = ChatBot(name="alina")
    nlp = DistanceModel()
    
    os.environ["TOKENIZERS_PARALLELISM"] = "true"
    
    while True:
        assistant.speech_to_text()

        ## wake up
        if assistant.wake_up(assistant.text) is True:
            if nlp.lang(assistant.text) == "en":
                res = "Hello I am Alina the assistant, what can I do for you?"
                assistant.set_lang("en-US")
                calendar_api.LANGUAGE = "en"
            else:
                res = "Bonjour, je suis Alina l'assistante, que puis-je faire pour vous?"
                assistant.set_lang("fr-FR")
                calendar_api.LANGUAGE = "fr"

        ## respond politely
        elif any(i in assistant.text.lower() for i in ["thank","thanks"]):
            res = np.random.choice(["you're welcome!","anytime!","no problem!","cool!","I'm here if you need me!","peace out!"])
        elif any (i in assistant.text.lower() for i in ["merci"]):
            res = np.random.choice(["je t'en prie", "de rien"])
        
        ## goodbye
        elif assistant.good_bye(assistant.text) is True:
            assistant.text_to_speech("Good bye")
            break

        ##aurevoir
        elif assistant.aurevoir(assistant.text) is True:
            assistant.text_to_speech("Aurevoir")
            break
        ## conversation
        else:   
            query = nlp.predict(assistant.text)
            res = run_query(service, query) 

        lang = nlp.lang(res)
         
        assistant.text_to_speech(res, lang)

playsound is relying on another python subprocess. Please use `pip install pygobject` if you want playsound to run more efficiently.


--- starting up alina ---




You can speak!
Starting recording for 7 seconds...


<IPython.core.display.Javascript object>

Finished recording!




TRANSCRIPTION:

haalena
assistant -->  s'il vous plait pouvez-vous repeter votre question


You can speak!
Starting recording for 7 seconds...


<IPython.core.display.Javascript object>

Finished recording!




TRANSCRIPTION:

who rebebit but for gi jo next meeting
assistant -->  sorry can you please repeat your question


You can speak!
Starting recording for 7 seconds...


<IPython.core.display.Javascript object>

Finished recording!




TRANSCRIPTION:


assistant -->  sorry can you please repeat your question


You can speak!
Starting recording for 7 seconds...


<IPython.core.display.Javascript object>

Finished recording!




TRANSCRIPTION:

next to meet
assistant -->  sorry can you please repeat your question


You can speak!
Starting recording for 7 seconds...


<IPython.core.display.Javascript object>

Finished recording!




TRANSCRIPTION:

next meeting next the meet
assistant -->  You dont have meetings left today


You can speak!
Starting recording for 7 seconds...


<IPython.core.display.Javascript object>

Finished recording!




TRANSCRIPTION:

meitting now i think the alois
assistant -->  sorry can you please repeat your question


You can speak!
Starting recording for 7 seconds...


<IPython.core.display.Javascript object>

'/content/gdrive/MyDrive/Calendar/silero-models'