**Meeting Summarizer**
> About Project: 
>>This project demonstrates a method for generating a summary of a meeting chat using the ***TextRank Algorithm***. The summarizing function improves the quality of later summaries by incorporating the user's feedback.

> Pipeline Used:
>>The various steps of the pipeline are:
>>1. Reading Meeting Chat
>>2. Preprocessing Text
>>3. Word Tokenization
>>4. Word Lemmatization
>>5. Generating Word Frequency Vector
>>6. Sentence Tokenization
>>7. Sentence Ranking
>>8. Finding Threshold Rank
>>9. Generating Summary
>>10. Taking User's Feedback 
>>11. Updating Feedback Table
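
Steps 3 to 9 of the pipeline can be sketched without NLTK as follows. This is a minimal illustration under stated assumptions, not the notebook's implementation: `toy_summarize`, `STOP_WORDS`, and the regex-based tokenization are hypothetical stand-ins, and lemmatization (step 4) is omitted.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "is", "a", "of", "to", "and", "we", "it", "on", "will"}

def toy_summarize(text, factor=1.2):
    """Return the sentences whose score exceeds factor * average score."""
    # Step 6: naive sentence split after '.', '!' or '?'.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []
    # Steps 3 and 5: lowercase word tokens, relative frequency of non-stop words.
    words = re.findall(r"[a-z0-9']+", text.lower())
    kept = [w for w in words if w not in STOP_WORDS]
    freq = {w: c / len(words) for w, c in Counter(kept).items()}
    # Step 7: a sentence's score is the summed frequency of its distinct words.
    scores = {}
    for s in sentences:
        tokens = set(re.findall(r"[a-z0-9']+", s.lower()))
        scores[s] = sum(freq.get(t, 0.0) for t in tokens)
    # Step 8: the threshold is a multiple of the average sentence score.
    threshold = factor * sum(scores.values()) / len(scores)
    # Step 9: keep qualifying sentences in their original order.
    return [s for s in sentences if scores[s] > threshold]

text = ("The project deadline is Friday. "
        "We will ship the project on Friday. Lunch is at noon.")
print(toy_summarize(text, factor=1.0))
# ['The project deadline is Friday.', 'We will ship the project on Friday.']
```

With a factor of 1.2, no sentence in this tiny example clears the threshold, which shows how a short chat with evenly scored sentences can produce an empty summary.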








In [None]:
import nltk
nltk.download('stopwords')
nltk.download('punkt')
nltk.download('wordnet')
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.stem import WordNetLemmatizer

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


In [None]:
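# A few important global variables
stopWords = set(stopwords.words("english"))  # English stop words to skip
lemmatizer = WordNetLemmatizer()
ImpTable = dict()  # word -> importance weight learned from user feedback
meetings = 0  # number of meetings for which feedback has been given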

In [None]:
def preprocess(text,participants):
  # Remove the "Name: " speaker prefix of every participant from the chat
  protext=text
  for part in participants:
    protext = protext.replace(part+": ","")
  return protext

In [None]:
def freq_vectorize(words):
  # Relative frequency of each lowercased, lemmatized, non-stop-word token.
  # Note: totalwords counts every token, including stop words and punctuation.
  freqTable = dict()
  totalwords = len(words)
  for word in words:
    word = word.lower()
    word = lemmatizer.lemmatize(word)
    if word in stopWords:
      continue
    if word in freqTable:
      freqTable[word] += 1/totalwords
    else:
      freqTable[word] = 1/totalwords

  # Boost each word by the importance weight learned from the user's feedback
  for word in freqTable:
    if word in ImpTable:
      freqTable[word] += ImpTable[word]
  return freqTable

In [None]:
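def sent_rank(sentences, freqTable):
  # Score each sentence as the sum of the frequencies of the words it contains.
  # Note: tokens are lemmatized but not lowercased here, while freqTable keys
  # are lowercased, so capitalized words do not contribute to the score.
  sentenceValue = dict()
  for sentence in sentences:
    sentence_list = word_tokenize(sentence)
    sentence_list = list(map(lemmatizer.lemmatize,sentence_list))
    for word, freq in freqTable.items():
      if word in sentence_list:
        if sentence in sentenceValue:
          sentenceValue[sentence] += freq
        else:
          sentenceValue[sentence] = freq
  return sentenceValue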
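
The case sensitivity of the scoring step can be illustrated with a small standalone sketch. It assumes pre-tokenized sentences and skips lemmatization; `score_sentences` and its `lowercase` flag are hypothetical names introduced here, not part of the notebook.

```python
def score_sentences(tokenized_sentences, freq_table, lowercase=True):
    """Sum the table frequency of every distinct known word in each sentence."""
    scores = []
    for tokens in tokenized_sentences:
        if lowercase:
            # Match the lowercased keys of the frequency table.
            tokens = [t.lower() for t in tokens]
        present = set(tokens)
        scores.append(sum(f for w, f in freq_table.items() if w in present))
    return scores

freq_table = {"asl": 0.5, "gesture": 0.25}
sents = [["ASL", "gesture", "images"], ["asl", "dataset"]]

print(score_sentences(sents, freq_table, lowercase=False))  # [0.25, 0.5]
print(score_sentences(sents, freq_table))                   # [0.75, 0.5]
```

Without lowercasing, the token `ASL` misses the key `asl`, so the first sentence loses most of its score. Whether to lowercase is a design choice; enabling it in the notebook would change the recorded summaries.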

In [None]:
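def find_average(sentenceValue):
  # Average sentence score; returns 0 when no sentence was scored,
  # which avoids a ZeroDivisionError on empty input
  if not sentenceValue:
    return 0
  sumValues = 0
  for sentence in sentenceValue:
    sumValues += sentenceValue[sentence]
  average = (sumValues / len(sentenceValue))
  return average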

In [None]:
def extract_summary(sentences,sentenceValue,average):
  # Keep, in original order, the sentences scoring above 1.2 times the average
  summary = ''
  for sentence in sentences:
    if (sentence in sentenceValue) and (sentenceValue[sentence] > (1.2*average)):
      summary += " " + sentence
  return summary

In [None]:
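def update_ImpTable(hsummary):
  # Add the relative word frequencies of the user-approved summary to ImpTable
  fwords = word_tokenize(hsummary)
  totalwords = len(fwords)
  for word in fwords:
    word = word.lower()
    word = lemmatizer.lemmatize(word)
    if word in stopWords:
      continue
    if word in ImpTable:
      ImpTable[word] += 1/totalwords
    else:
      ImpTable[word] = 1/totalwords

  # Scale every stored weight by the number of meetings with feedback so far.
  # summarize() increments `meetings` before calling this, so it is never 0 here.
  for word in ImpTable:
    ImpTable[word] /= meetings
  print("\n\nFeedback Vector Updated!!!")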
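
The effect of dividing the whole table by `meetings` on every call can be traced with a simplified mirror of the cell above. This sketch assumes pre-tokenized feedback words and skips stop-word removal and lemmatization; `update_importance` is a hypothetical helper used only for illustration.

```python
def update_importance(table, feedback_words, meetings):
    """Mirror of the update rule on already-tokenized feedback words."""
    total = len(feedback_words)
    for w in feedback_words:
        # Add the word's relative frequency within this feedback summary.
        table[w] = table.get(w, 0.0) + 1 / total
    for w in table:
        # Every stored weight, old or new, is divided by the meeting count.
        table[w] /= meetings
    return table

table = {}
update_importance(table, ["asl", "client", "asl", "gesture"], meetings=1)
print(table)  # {'asl': 0.5, 'client': 0.25, 'gesture': 0.25}

update_importance(table, ["client", "manual"], meetings=2)
print(table)  # {'asl': 0.25, 'client': 0.375, 'gesture': 0.125, 'manual': 0.25}
```

Words that are not repeated in later feedback, such as `asl` here, shrink with each meeting, while repeated words such as `client` keep a higher weight.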


In [None]:
def summarize(text, participants):
  #Generating summary
  protext = preprocess(text, participants) #removes the speakers' names
  words = word_tokenize(protext) #word tokenization
  freqTable = freq_vectorize(words) #lemmatization and word frequency vector
  sentences = sent_tokenize(protext) #sentence tokenization
  sentenceValue = sent_rank(sentences, freqTable) #ranking sentences
  average = find_average(sentenceValue) #average rank, used for the threshold
  summary = extract_summary(sentences, sentenceValue, average) #generating summary

  #printing summary
  print("Meeting Summary:")
  print(summary)
  print("\n\n")

  #Asking for the user's feedback
  c = input("Do you want to give feedback for the summary?(Y/N)")
  if c.strip().lower() == 'y':
    global meetings 
    meetings += 1
    for i in range(len(sentences)):
      print(str(i) + ". " + sentences[i])
    IS = list(map(int, input("Enter the numbers of the sentences you want to include in the summary (comma-separated): ").split(",")))
    hsummary=""
    for i in range(len(sentences)):
      if i in IS:
        hsummary = hsummary+" " + sentences[i]
    print("\n\nThe summary generated after your feedback is:")
    print(hsummary) #printing user given summary
    update_ImpTable(hsummary)
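
The feedback prompt converts the raw input with `int` directly, so a stray space-only entry, a trailing comma, or a non-numeric token raises a `ValueError`. A more forgiving parser could look like the sketch below; `parse_indices` is a hypothetical helper, not part of the notebook.

```python
def parse_indices(raw, n_sentences):
    """Parse input such as '0, 2,5' into sorted, unique, in-range indices."""
    picked = set()
    for piece in raw.split(","):
        try:
            idx = int(piece.strip())
        except ValueError:
            # Skip empty or non-numeric entries instead of failing.
            continue
        if 0 <= idx < n_sentences:
            picked.add(idx)
    return sorted(picked)

print(parse_indices("0, 2,abc,,9,2", 5))  # [0, 2]
```

Silently dropping invalid entries keeps the session going; reporting them back to the user would be an equally reasonable choice.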

In [None]:
text = """Henil: Good Morning one and all. I will be presenting todays meeting. The agenda of todays meeting is our newly assigned project HICS. HICS stands for Hearing Impaired Communication System. The main objective of the project is to create an application convert speech audio signals into American sign Language Gestures also know as ASL, which is understood by the hearing-impaired user. The modules of the application includes designing the front-end, speech to text conversion, followed by text to corresponding ASL gesture converstion and finally displaying the gesture images. The important thing is this should all happen in realtime as the client wants the project to fill the communication gap between normal people and hearing-impaired users. So any doubts regarding the project goals?
Roni: So sir doest the client want a web application or desktop or mobile app?
Henil: It should be a Webapplication.
Atul: Sir what is the deadline of the project?
Henil: Its 21st of the next month. Anything else guys?. So, lets move to the work distribution. Roni, you will handle the speech to text and text to ASL conversion. Atul, you have to work on the front-end and dealing with the client for further updates on the project. You guys should should update me periodically regarding client's requirement and completion of work. So that was all for todays meeting. You can get back to your work."""

participants = ['Henil', 'Roni', 'Atul']

summarize(text,participants)

Meeting Summary:
 The main objective of the project is to create an application convert speech audio signals into American sign Language Gestures also know as ASL, which is understood by the hearing-impaired user. The modules of the application includes designing the front-end, speech to text conversion, followed by text to corresponding ASL gesture converstion and finally displaying the gesture images. The important thing is this should all happen in realtime as the client wants the project to fill the communication gap between normal people and hearing-impaired users. Atul, you have to work on the front-end and dealing with the client for further updates on the project. You guys should should update me periodically regarding client's requirement and completion of work.



Do you want to give feedback for the summary?(Y/N)Y
0. Good Morning one and all.
1. I will be presenting todays meeting.
2. The agenda of todays meeting is our newly assigned project HICS.
3. HICS stands for Hearing

In [None]:
ImpTable

{"'s": 0.005681818181818182,
 ',': 0.02840909090909091,
 '.': 0.05681818181818183,
 '21st': 0.005681818181818182,
 'agenda': 0.005681818181818182,
 'also': 0.005681818181818182,
 'american': 0.005681818181818182,
 'application': 0.011363636363636364,
 'asl': 0.017045454545454544,
 'assigned': 0.005681818181818182,
 'atul': 0.005681818181818182,
 'audio': 0.005681818181818182,
 'client': 0.017045454545454544,
 'communication': 0.011363636363636364,
 'completion': 0.005681818181818182,
 'conversion': 0.011363636363636364,
 'converstion': 0.005681818181818182,
 'convert': 0.005681818181818182,
 'corresponding': 0.005681818181818182,
 'create': 0.005681818181818182,
 'dealing': 0.005681818181818182,
 'designing': 0.005681818181818182,
 'displaying': 0.005681818181818182,
 'fill': 0.005681818181818182,
 'finally': 0.005681818181818182,
 'followed': 0.005681818181818182,
 'front-end': 0.011363636363636364,
 'gap': 0.005681818181818182,
 'gesture': 0.017045454545454544,
 'guy': 0.005681818181

In [None]:
text = """Henil: Hello Everyone. The agenda of todays meeting is discussing updates on HICS project and client's requirements. I want you guys to update me on how many assigned tasks have you completed.
Roni: Sir, me and my team have finished the speech recognition module and it has an accuracy of 91%. We are currently working with the subject matter expert for generating dataset for ASL gestures which we will be using for conversion of text to gesture images.
Henil: Ok. Good job. You can go ahead with your work. Atul what is you status?
Atul: Sir, me and my team are ready with the UI design for the front-end. My team is collabrating with the backend team to make the final template.
Henil: Ok, that absolutely fine. Did you contact the client for any update for the project requirement?
Atul: Yes sir, the client was quite satisfied with the projects progress and they also want a documentation page that displays the user manual for this application and the ASL manual.
Henil: So, you can coordinate with Roni and start working on that once the ASL dataset is ready. So, anything else? Alright then that was all for todays meeting. You can get back to your work now."""

participants = ['Henil', 'Roni', 'Atul']

summarize(text,participants)

Meeting Summary:
 The agenda of todays meeting is discussing updates on HICS project and client's requirements. Sir, me and my team have finished the speech recognition module and it has an accuracy of 91%. We are currently working with the subject matter expert for generating dataset for ASL gestures which we will be using for conversion of text to gesture images. Sir, me and my team are ready with the UI design for the front-end. Yes sir, the client was quite satisfied with the projects progress and they also want a documentation page that displays the user manual for this application and the ASL manual.



Do you want to give feedback for the summary?(Y/N)Y
0. Hello Everyone.
1. The agenda of todays meeting is discussing updates on HICS project and client's requirements.
2. I want you guys to update me on how many assigned tasks have you completed.
3. Sir, me and my team have finished the speech recognition module and it has an accuracy of 91%.
4. We are currently working with the s