# PyConverse: Basic Usage

Import the necessary classes from `pyconverse`.

_**Note: the first time you install pyconverse and run these imports, it downloads a few transformers and sentence-transformers models in the background, so the first import might take a few minutes.**_

In [None]:
import pandas as pd
from pprint import pprint

from pyconverse import Callyzer, SpeakerStats
from pyconverse import SemanticTextSegmention, ZeroShotTopicFinder

## Load sample dataset

_**Note: to load your own transcript dataset (for example from aws-transcribe, google-cloud, azure or any other service), convert your transcripts into a pandas dataframe. When you initialise the `Callyzer` class, point it to the columns holding the speaker, utterance, start-time and end-time of each utterance.**_
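As a minimal sketch of that conversion, the snippet below builds a dataframe from a list of records. The record keys (`spk`, `text`, `start`, `end`) and the sample rows are hypothetical; real transcription services use their own field names, so the mapping would need to be adapted.

```python
import pandas as pd

# Hypothetical transcript records; field names vary by transcription service.
raw_segments = [
    {"spk": "agent", "text": "Hello, how can I help you?", "start": 0.0, "end": 2.1},
    {"spk": "customer", "text": "Hi, I have a billing question.", "start": 2.4, "end": 4.8},
]

# Rename the service-specific keys to the column names used in this notebook.
own_transcript_df = pd.DataFrame(raw_segments).rename(
    columns={
        "spk": "speaker",
        "text": "utterance",
        "start": "start_time",
        "end": "end_time",
    }
)
print(own_transcript_df)
```

The resulting dataframe can then be passed to `Callyzer` with the same column arguments used in the rest of this notebook.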

In [None]:
transcript_df = pd.read_csv("sample_transcript_data.csv")  # read sample data
transcript_df.head()

## Analyse the Call transcript

Initialise the core call analysis class `Callyzer` with your dataset represented as a pandas dataframe and point towards utterance, speaker, start-time & end-time columns in it.

In [None]:
transcript_analysis = Callyzer(data=transcript_df, utterance="utterance", speaker="speaker", starttime="start_time", endtime="end_time")

Compute and access various attributes of the call as follows:

## Find Interruptions and periods of silence in a call.

In [None]:
interruptions = transcript_analysis.get_interruption() #interruption periods in a call
silence = transcript_analysis.get_silence() #periods of silence in a call

print("1. INTERRUPTIONS:\n")
pprint(interruptions)

print("\n2. PERIODS OF SILENCE:\n")
pprint(silence)
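To make the idea concrete, here is a small sketch of how silences and interruptions could be derived from start and end times alone. This is an illustration under simplifying assumptions (toy data, a gap counts as silence, an overlap with a speaker change counts as an interruption), not necessarily how `pyconverse` computes them.

```python
import pandas as pd

# Toy data: B starts 1.0s after A ends; A then starts 0.5s before B ends.
toy = pd.DataFrame({
    "speaker": ["A", "B", "A"],
    "start_time": [0.0, 4.0, 5.5],
    "end_time": [3.0, 6.0, 8.0],
})

# Gap between each utterance's start and the previous utterance's end.
gap = toy["start_time"] - toy["end_time"].shift(1)
speaker_changed = toy["speaker"] != toy["speaker"].shift(1)

# Positive gap: nobody is speaking.
silence_mask = gap > 0
silence_sketch = toy[silence_mask].assign(duration=gap[silence_mask])

# Negative gap with a speaker change: overlapping speech.
overlap_mask = (gap < 0) & speaker_changed
overlap_sketch = toy[overlap_mask].assign(duration=-gap[overlap_mask])

print(silence_sketch)
print(overlap_sketch)
```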

## Find the Backchannel utterances in a call transcript.


Backchannels can be verbal, non-verbal (visual) or both. Vocalisations like 'hmm' or 'uh-huh', gestures such as head nods or head shakes, and a combination of verbal and non-verbal responses are common examples of backchannels. `pyconverse` identifies verbal backchannels using two different methods: 

1. default: via a dictionary of commonly used backchannel keywords; fast, with slightly lower accuracy.
2. nlp: via sentence similarity with sentence-transformers; slower, with higher accuracy.
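The keyword approach can be pictured with a short sketch. The keyword set and the helper `is_backchannel_keyword` below are hypothetical and exist only for illustration; the dictionary shipped with `pyconverse` may differ.

```python
import re

# Hypothetical keyword set; pyconverse's own dictionary may differ.
BACKCHANNEL_KEYWORDS = {"okay", "ok", "hmm", "uh-huh", "yeah", "right", "i see"}


def is_backchannel_keyword(utterance: str) -> bool:
    """Return True when the whole utterance is a known backchannel keyword.

    The utterance is lowercased and stripped of leading/trailing punctuation
    before the lookup.
    """
    normalized = re.sub(r"^[^\w]+|[^\w]+$", "", utterance.lower().strip())
    return normalized in BACKCHANNEL_KEYWORDS


for text in ["Uh-huh.", "Okay!", "Okay, let me check your account."]:
    print(text, "->", is_backchannel_keyword(text))
```

An exact-match lookup like this is fast but misses variants that are not in the set, which is the accuracy trade-off noted in the list above.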

_**Note: the backchannel identification with sentence similarity is heavily inspired by Facebook's [Unsupervised Topic Segmentation of Meetings with BERT Embeddings](https://arxiv.org/abs/2106.12978) paper.**_

It works by taking common backchannel phrases such as "okay", "that's it" and "ummhhh" as backchannel samples, max-pooling their representations, and then computing the sentence similarity between the pooled result and every utterance in the transcript.
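The max-pool and similarity steps can be sketched with plain NumPy. The 3-dimensional vectors below are toy stand-ins for real sentence embeddings, and the threshold of 0.8 is an assumed value chosen for the example, not a `pyconverse` setting.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-d vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy vectors standing in for embeddings of backchannel samples.
backchannel_sample_embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.3, 0.1],
    [1.0, 0.0, 0.2],
])

# Max-pool: element-wise maximum over all samples gives one reference vector.
backchannel_reference = backchannel_sample_embeddings.max(axis=0)

# Toy vectors standing in for embeddings of transcript utterances.
utterance_embeddings = {
    "uh-huh": np.array([0.9, 0.2, 0.1]),
    "my invoice is wrong": np.array([0.0, 0.2, 0.9]),
}

SIMILARITY_THRESHOLD = 0.8  # assumed value, for illustration only

tags = {
    text: cosine_similarity(vec, backchannel_reference) >= SIMILARITY_THRESHOLD
    for text, vec in utterance_embeddings.items()
}
print(tags)
```

In practice the vectors would come from a sentence-embedding model, and the threshold would need tuning on real transcripts.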

In [None]:
backchannels_via_keywords = transcript_analysis.tag_backchannel().query("is_backchannel == True") #identify backchannel utterances via keywords
backchannels_via_transformers = transcript_analysis.tag_backchannel(type='nlp').query("is_backchannel == True") #identify backchannel utterances with sentence-transformers

In [None]:
backchannels_via_keywords

In [None]:
backchannels_via_transformers

Backchannel detection with keywords returned **39 utterances**, whereas detection with sentence-transformers returned **68 utterances**.

## Find the utterances that are questions

In [None]:
questions = transcript_analysis.tag_questions().query("is_question == True") #identify utterances which are questions
questions

## Identify the emotions of the utterances

Note: this might take some time, as it uses a MiniLM language model.

In [None]:
transcript_analysis_ = Callyzer(transcript_df.tail(), utterance="utterance", speaker="speaker", starttime="start_time", endtime="end_time")

emotions = transcript_analysis_.tag_emotion()
emotions[["speaker", "utterance", "emotion"]]
# if no emotion is identified, it returns 'not found'.

## Identify if a given utterance is empathetic or not

In [None]:
empathy = transcript_analysis_.tag_empathy()
empathy[["speaker", "utterance", "is_empathy"]]
# an utterance is classified as 'empathetic' if empathy is identified, otherwise as 'non_empathetic'.

## Collapse utterances into turn-level text chunks

In [None]:
# convert the data at speaker level to turn level
df = transcript_analysis.convert_at_turn()

print(f"1. Original Utterance count: {transcript_df.shape[0]}\n2. After collapsing the utterance to turn level: {df.shape[0]}")
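One way to picture the collapse is to merge consecutive utterances by the same speaker into a single row. The pandas sketch below uses toy data and is only an illustration of the idea; `convert_at_turn` may handle details differently.

```python
import pandas as pd

utterances = pd.DataFrame({
    "speaker": ["A", "A", "B", "A"],
    "utterance": ["Hi.", "How are you?", "Good, thanks.", "Great."],
    "start_time": [0.0, 1.0, 2.5, 4.0],
    "end_time": [0.8, 2.2, 3.6, 4.5],
})

# A new turn starts whenever the speaker differs from the previous row.
turn_id = (utterances["speaker"] != utterances["speaker"].shift()).cumsum().rename("turn_id")

# Join the text of each turn and keep its overall time span.
turns = (
    utterances.groupby(turn_id)
    .agg(
        speaker=("speaker", "first"),
        utterance=("utterance", " ".join),
        start_time=("start_time", "min"),
        end_time=("end_time", "max"),
    )
    .reset_index(drop=True)
)
print(turns)
```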

## Identify the overall psychological correlatedness of the speakers

In [None]:
ss = SpeakerStats(df, speaker='speaker')
pprint(ss.get_stats())

## Call segmentation

Let's segment the call into larger chunks of text using semantic sentence similarity and the text tiling algorithm.

In [None]:
sts = SemanticTextSegmention(df)
segments = sts.get_segments()

for segment in segments[0:4]:
    pprint(segment)
    print("-"*50)
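The intuition behind similarity-based segmentation can be shown with a simplified sketch: place a boundary wherever the cosine similarity between adjacent items drops below a threshold. This is a reduced, threshold-only variant written for illustration (full text tiling scores the depth of similarity valleys instead), and the 2-dimensional vectors and the threshold are toy assumptions.

```python
import numpy as np


def segment_by_similarity(embeddings, threshold):
    """Group consecutive row indices, splitting where adjacent similarity < threshold."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Cosine similarity between each row and the next one.
    adjacent_sim = np.sum(normed[:-1] * normed[1:], axis=1)

    segments, current = [], [0]
    for i, sim in enumerate(adjacent_sim, start=1):
        if sim < threshold:
            # Similarity dropped: close the current segment and start a new one.
            segments.append(current)
            current = []
        current.append(i)
    segments.append(current)
    return segments


# Toy vectors: the first two point one way, the last two another.
toy_embeddings = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.1, 0.9],
    [0.0, 1.0],
])
toy_segments = segment_by_similarity(toy_embeddings, threshold=0.5)
print(toy_segments)
```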

## ZeroShot topic identification

Identify the topics being discussed in a call via zero-shot topic inference at the utterance or segment level (works best on segments).

In [None]:
zst = ZeroShotTopicFinder()

In [None]:
for text in segments[0:2]:
    print(f"Text: {text}\n")
    print(f"Topics: {zst.find_topic(text)}\n")
    print("-"*50)