In [1]:
transcript = """
Interviewer: Hello, thank you for joining us today. We're excited to get your feedback on our app.
Interviewee: Hi, happy to be here. Overall, the app's been good.
Interviewer: That's great to hear! Could you tell us more about your experience using the app?
Interviewee: Sure. It's user-friendly, but more personalized features would enhance it.
Interviewer: Personalization noted. What specific features would you like to see for a more fulfilling experience?
Interviewee: Customizable notifications and tailored content based on my preferences would be fantastic.
Interviewer: Noted. Now, are there any areas you feel could use improvement within the app?
Interviewee: Occasionally, the app lags during peak hours. Improving its speed would be beneficial.
Interviewer: Thank you for sharing that. We'll look into optimizing the app's performance. Any other areas?
Interviewee: The search function could be more accurate. It sometimes misses relevant results.
Interviewer: Understood. We'll work on refining the search algorithm. Any final thoughts or suggestions?
Interviewee: Overall, I'm satisfied. Just a few tweaks would make the app even better.
Interviewer: We appreciate your feedback. It's invaluable for us to enhance the app. Thank you for your time today.
"""

In [2]:
import re

def extract_interviewee_transcript(transcript):
    """Return the interviewee's responses, one string per line, without the speaker label."""
    # Capture everything after the "Interviewee: " label on a line.
    interviewee_regex = re.compile(r'Interviewee: (.*)')

    interviewee = []
    for line in transcript.split('\n'):
        interviewee_match = interviewee_regex.search(line)
        if interviewee_match:
            interviewee.append(interviewee_match.group(1))

    return interviewee

interviewee_feedback = extract_interviewee_transcript(transcript)
print(interviewee_feedback)

["Hi, happy to be here. Overall, the app's been good.", "Sure. It's user-friendly, but more personalized features would enhance it.", 'Customizable notifications and tailored content based on my preferences would be fantastic.', 'Occasionally, the app lags during peak hours. Improving its speed would be beneficial.', 'The search function could be more accurate. It sometimes misses relevant results.', "Overall, I'm satisfied. Just a few tweaks would make the app even better."]
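The same extraction can also be done in a single pass over the text. The following is a minimal sketch, not the notebook's own method: `extract_speaker_lines` and `sample` are hypothetical names, and the sample reuses four lines from the transcript above. Anchoring the pattern at the start of each line (via `re.MULTILINE`) avoids matching a speaker label that happens to appear in the middle of a sentence.

```python
import re

# Four lines copied from the transcript above, used as a small test input.
sample = """
Interviewer: That's great to hear! Could you tell us more about your experience using the app?
Interviewee: Sure. It's user-friendly, but more personalized features would enhance it.
Interviewer: Thank you for sharing that. We'll look into optimizing the app's performance. Any other areas?
Interviewee: The search function could be more accurate. It sometimes misses relevant results.
"""

def extract_speaker_lines(text, speaker):
    """Return every line spoken by `speaker`, with the label removed."""
    # ^ and $ match at line boundaries because of re.MULTILINE.
    pattern = re.compile(rf"^{re.escape(speaker)}:[ \t]*(.*)$", re.MULTILINE)
    return [match.strip() for match in pattern.findall(text)]

interviewee_lines = extract_speaker_lines(sample, "Interviewee")
interviewer_lines = extract_speaker_lines(sample, "Interviewer")
print(interviewee_lines)
```

Passing the speaker as an argument means the interviewer's questions can be pulled out with the same function if they are needed later.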


In [3]:
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from collections import Counter
from nltk.sentiment.vader import SentimentIntensityAnalyzer
import spacy
import pandas as pd

# Download the NLTK resources used below.
nltk.download('punkt')
nltk.download('stopwords')
# VADER needs its lexicon before the analyzer is created; quiet=True skips the download log.
nltk.download('vader_lexicon', quiet=True)

sia = SentimentIntensityAnalyzer()
stop_words = set(stopwords.words('english'))

# Tokenize the full transcript, keeping alphanumeric tokens that are not stop words.
# NLTK's stop-word list is lowercase, so compare the lowercased token.
tokens = word_tokenize(transcript)
tokens = [word for word in tokens if word.isalnum() and word.lower() not in stop_words]

# Sentiment analysis: one VADER score dict per interviewee response.
# The "Interviewee:" label was already removed during extraction.
sentiment_scores = [sia.polarity_scores(sentence.strip()) for sentence in interviewee_feedback]

# Information extraction
nlp = spacy.load("en_core_web_sm")
doc = nlp(" ".join(interviewee_feedback))

# Extracting entities and noun phrases
entities = [ent.text for ent in doc.ents]
noun_phrases = [chunk.text for chunk in doc.noun_chunks]

# Displaying results
print("Sentiment Analysis Results:")
for sentence, score in zip(interviewee_feedback, sentiment_scores):
    print(f"{sentence}: {score}")

print("\nInformation Extraction Results:")
print("Entities:", entities)
print("Noun Phrases:", noun_phrases)


[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\niyat\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\niyat\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Sentiment Analysis Results:
Hi, happy to be here. Overall, the app's been good.: {'neg': 0.0, 'neu': 0.548, 'pos': 0.452, 'compound': 0.765}
Sure. It's user-friendly, but more personalized features would enhance it.: {'neg': 0.0, 'neu': 0.845, 'pos': 0.155, 'compound': 0.1655}
Customizable notifications and tailored content based on my preferences would be fantastic.: {'neg': 0.0, 'neu': 0.753, 'pos': 0.247, 'compound': 0.5574}
Occasionally, the app lags during peak hours. Improving its speed would be beneficial.: {'neg': 0.125, 'neu': 0.558, 'pos': 0.318, 'compound': 0.5367}
The search function could be more accurate. It sometimes misses relevant results.: {'neg': 0.147, 'neu': 0.853, 'pos': 0.0, 'compound': -0.2263}
Overall, I'm satisfied. Just a few tweaks would make the app even better.: {'neg': 0.0, 'neu': 0.637, 'pos': 0.363, 'compound': 0.6908}

Information Extraction Results:
Entities: ['peak hours']
Noun Phrases: ['Hi', 'the app', 'It', 'more personalized features', 'it', 'Cus
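The compound scores above are easier to read once they are turned into labels. The sketch below is an illustration rather than part of the original analysis: `label_compound` is a hypothetical helper, the scores are copied by hand from the output above, and the ±0.05 cutoff is the conventional threshold suggested in VADER's documentation, which may need tuning for other data.

```python
from collections import Counter

# Compound scores copied from the sentiment output above, in response order.
compound_scores = [0.765, 0.1655, 0.5574, 0.5367, -0.2263, 0.6908]

def label_compound(score, threshold=0.05):
    """Map a VADER compound score to a coarse label (threshold is an assumed convention)."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"

labels = [label_compound(score) for score in compound_scores]
label_counts = Counter(labels)

for score, label in zip(compound_scores, labels):
    print(f"{score:+.4f} -> {label}")
print(dict(label_counts))
```

Under these assumptions the fourth response, which reports lag during peak hours, is still labeled positive, most likely because of words such as "beneficial". Lexicon-based scores are therefore best read alongside the text rather than on their own.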

In [4]:
from transformers import pipeline, AutoTokenizer, TFAutoModelForSeq2SeqLM

def generate_summary(sentences, model_name_or_path):
    """Summarize a list of sentences with a pretrained seq2seq model."""
    # Join the responses into a single input text.
    text = " ".join(sentences)

    # Load the tokenizer and the TensorFlow version of the model.
    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(model_name_or_path)

    summarization_pipeline = pipeline("summarization", model=model, tokenizer=tokenizer, framework="tf")

    # do_sample=False gives deterministic decoding; the length limits are in tokens.
    summary = summarization_pipeline(text, max_length=50, min_length=10, do_sample=False)

    return summary[0]['summary_text']


model_name_or_path = "t5-small"

summary = generate_summary(interviewee_feedback, model_name_or_path)

print("Generated Summary:")
print(summary)


All PyTorch model weights were used when initializing TFT5ForConditionalGeneration.

All the weights of TFT5ForConditionalGeneration were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFT5ForConditionalGeneration for predictions without further training.


Generated Summary:
the app's user-friendly, but more personalized features would enhance it . customizable notifications and tailored content based on my preferences would be fantastic .
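The generated text comes back lowercased and with a space before each period. A small post-processing step can tidy this for display. This is a minimal sketch using only the standard library; `tidy_summary` is a hypothetical helper and not part of `transformers`, and the input string is the summary printed above.

```python
import re

def tidy_summary(text):
    """Remove spaces before punctuation and capitalize the start of each sentence."""
    # "it ." -> "it."
    text = re.sub(r"\s+([.,!?;:])", r"\1", text.strip())
    # Uppercase the first letter of the text and the first letter after ., ! or ?
    return re.sub(
        r"(^|[.!?]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )

raw_summary = (
    "the app's user-friendly, but more personalized features would enhance it . "
    "customizable notifications and tailored content based on my preferences would be fantastic ."
)
clean_summary = tidy_summary(raw_summary)
print(clean_summary)
```

The cleanup only changes spacing and capitalization; it does not alter the wording the model produced.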
