In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from src.data_preparation import load_dataset
from src.analysis.qualitative import analyze_conversation_patterns, identify_themes

dataset_path = "../data/dataset.jsonl"
# Load the JSONL data into a DataFrame
conversations_df = load_dataset(dataset_path)
# Drop the first row; only the remaining rows are analyzed below
conversations_df = conversations_df.iloc[1:]

# Analyze individual conversations
analyses_df = analyze_conversation_patterns(conversations_df)

# Identify overall themes
themes = identify_themes(analyses_df)

# Create summary statistics
summary_stats = {
    "total_conversations": len(analyses_df),
    "avg_sentiment": analyses_df["analysis"].apply(lambda x: x.sentiment_score).mean().item(),
    "avg_user_engagement": analyses_df["analysis"].apply(lambda x: x.user_engagement).mean().item(),
    "avg_dialog_success": analyses_df["analysis"].apply(lambda x: x.dialog_success).mean().item(),
    "common_topics": analyses_df["analysis"].apply(lambda x: x.main_topic).value_counts().head(),
    "themes": themes,
}
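
The summary above repeats one `.apply(lambda ...)` per field. A minimal sketch of an alternative follows, assuming each entry in the `analysis` column exposes `sentiment_score`, `user_engagement`, `dialog_success`, and `main_topic` as attributes. The `Analysis` dataclass and the `summarize` helper are hypothetical stand-ins, not part of `src.analysis.qualitative`. The sketch expands the objects into scalar columns once and returns plain Python values, so the `common_topics` entry prints as a dict instead of a pandas Series.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class Analysis:
    # Hypothetical stand-in for the objects stored in the "analysis" column.
    sentiment_score: int
    user_engagement: int
    dialog_success: int
    main_topic: str


# Summary key -> attribute name on the analysis object (names taken from the cell above).
AVERAGED_FIELDS = {
    "avg_sentiment": "sentiment_score",
    "avg_user_engagement": "user_engagement",
    "avg_dialog_success": "dialog_success",
}


def summarize(analyses_df: pd.DataFrame) -> dict:
    """Build plain-Python summary statistics from the "analysis" column."""
    attribute_names = list(AVERAGED_FIELDS.values()) + ["main_topic"]
    # Expand each analysis object into one row of scalar columns.
    expanded = pd.DataFrame(
        [{name: getattr(a, name) for name in attribute_names} for a in analyses_df["analysis"]]
    )
    stats = {"total_conversations": len(expanded)}
    for key, attribute in AVERAGED_FIELDS.items():
        stats[key] = float(expanded[attribute].mean())
    # Convert the five most frequent topics to a plain dict.
    stats["common_topics"] = expanded["main_topic"].value_counts().head().to_dict()
    return stats


demo_df = pd.DataFrame(
    {
        "analysis": [
            Analysis(4, 5, 1, "Topic A"),
            Analysis(3, 4, 1, "Topic B"),
            Analysis(5, 5, 0, "Topic A"),
        ]
    }
)
demo_stats = summarize(demo_df)
print(demo_stats)
```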

In [10]:
print(themes)

Based on the provided conversation analyses, here are the identified findings:

### 1. Most Common Topics and Subtopics
- **Difficult Conversations**: Many entries focus on preparing for or reviewing difficult conversations (e.g., entries 0, 1, 2, 3, 7, 9, 14).
- **Employee Concerns**: Discussions on employee dissatisfaction, such as bonus concerns (entry 8) and promotion concerns (entry 13).
- **Process Adherence**: Addressing issues related to adherence to processes within the workplace (entry 4).
- **Conflict or Issue Management**: Several entries focus on managing difficult scenarios, including client escalations (entry 11) and negotiation strategies (entry 15).

### 2. Patterns in Successful Conversations
- **Preparation and Planning**: Successful conversations often involve thorough preparation (entries 0, 2, 4, 7, 9, 14).
- **Review and Reflection**: Analyzing past conversations (entries 1, 3, 5) to learn from outcomes improves the chances for future success.
- **Clarity and Tra

In [8]:
summary_stats

{'total_conversations': 18,
 'avg_sentiment': 3.9444444444444446,
 'avg_user_engagement': 4.888888888888889,
 'avg_dialog_success': 1.0,
 'common_topics': analysis
 Preparing for a difficult conversation with unresponsive managers                1
 Review of a difficult conversation regarding performance issues                  1
 Addressing employee punctuality issues                                           1
 Negotiation strategy regarding severance payments in relation to job offers      1
 Preparing for a difficult conversation about an AI tool for language training    1
 Name: count, dtype: int64,
 'themes': "Based on the provided conversation analyses, here are the identified findings:\n\n### 1. Most Common Topics and Subtopics\n- **Difficult Conversations**: Many entries focus on preparing for or reviewing difficult conversations (e.g., entries 0, 1, 2, 3, 7, 9, 14).\n- **Employee Concerns**: Discussions on employee dissatisfaction, such as bonus concerns (entry 8) and promoti

In [4]:
analyses_df.drop(columns=["analysis"])

Unnamed: 0,conversation_id,main_topic,subtopics,communication_style,key_challenges,success_factors,sentiment_score,user_engagement,dialog_success,user_feedback_score
0,1,Preparing for a difficult conversation with un...,"[Importance of training, Challenges with manag...",Supportive and exploratory,[Low response and enrollment rates from manage...,"[Empathetic listening by the assistant, Struct...",4,5,1,5
1,2,Review of a difficult conversation regarding p...,"[Performance issues during on-call rotations, ...",Empathetic and supportive inquiry,[Senior developer's feelings of being attacked...,"[Planned conversation structure, Tech lead's i...",4,5,1,5
2,3,Preparing for a difficult conversation with a ...,"[Discussion of team structure and roles, Balan...",Collaborative and supportive,"[Client's focus on cost-cutting, Building trus...",[Clear articulation of the situation and conce...,4,5,1,5
3,4,Reviewing a difficult conversation about low p...,"[Feelings of being attacked, Empathy and under...",Supportive and reflective,[Perceived defensiveness from the senior devel...,"[Open-ended questions encouraging dialogue, Te...",4,5,1,5
4,5,Addressing process adherence within a team,"[Impact of bypassing processes on workload, Te...",Supportive and exploratory,[Team members not following critical processes...,[Effective listening and understanding by the ...,4,5,1,5
5,6,Review of a difficult conversation regarding a...,"[Promotion and salary increase request, Manage...",Supportive and exploratory,"[Manager's limitations in decision-making, Pot...","[Active listening by the assistant, Encouragem...",4,5,1,5
6,7,Addressing a teammate's shift schedule change,"[Concerns about fairness and transparency, Imp...","Supportive, reflective, collaborative",[User's anxiety about addressing management de...,"[Open-ended questions from the assistant, Empa...",4,5,1,5
7,8,Preparing for a difficult conversation with a ...,"[Communication challenges, Emotional connectio...",Supportive and inquiry-based,[Cousin's interruptions and belittling behavio...,"[Identifying and articulating core values, Bre...",4,5,1,5
8,9,Employee dissatisfaction with bonus component ...,"[Context of the employee's role, Employee's un...",Supportive and exploratory,[Employee's feelings of unfairness regarding b...,[Clarification of the situation through open-e...,3,4,1,4
9,10,Preparing for a difficult conversation regardi...,[Frustration with reassessment of feature desi...,Collaborative and reflective,"[Defending a compromised idea, Navigating frus...","[Identifying and articulating feelings, Separa...",4,5,1,5


In [5]:
from src.analysis.clustering import analyze_conversations_with_llm_clustering


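# Group topics, challenges, and success factors into LLM-derived clusters.
# Returns the annotated DataFrame plus one cluster mapping per category.
output_df, topic_clusters, challenge_clusters, success_key_clusters = analyze_conversations_with_llm_clustering(
    analyses_df
)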

output_df.drop(columns=["analysis"])

Unnamed: 0,conversation_id,main_topic,subtopics,communication_style,key_challenges,success_factors,sentiment_score,user_engagement,dialog_success,user_feedback_score,topic_cluster,challenge_clusters,success_factor_clusters
0,1,Preparing for a difficult conversation with un...,"[Importance of training, Challenges with manag...",Supportive and exploratory,[Low response and enrollment rates from manage...,"[Empathetic listening by the assistant, Struct...",4,5,1,5,[Difficult Conversations with Leadership and C...,"[[Team Dynamics and Performance Concerns], [Em...","[Empathetic listening by the assistant, Struct..."
1,2,Review of a difficult conversation regarding p...,"[Performance issues during on-call rotations, ...",Empathetic and supportive inquiry,[Senior developer's feelings of being attacked...,"[Planned conversation structure, Tech lead's i...",4,5,1,5,[Difficult Conversations with Leadership and C...,[[Interpersonal Conflicts and Emotional Respon...,"[Planned conversation structure, Tech lead's i..."
2,3,Preparing for a difficult conversation with a ...,"[Discussion of team structure and roles, Balan...",Collaborative and supportive,"[Client's focus on cost-cutting, Building trus...",[Clear articulation of the situation and conce...,4,5,1,5,[Difficult Conversations with Leadership and C...,"[Client's focus on cost-cutting, Building trus...","[[Structured Conversations and Presentations],..."
3,4,Reviewing a difficult conversation about low p...,"[Feelings of being attacked, Empathy and under...",Supportive and reflective,[Perceived defensiveness from the senior devel...,"[Open-ended questions encouraging dialogue, Te...",4,5,1,5,[Difficult Conversations with Leadership and C...,[[Interpersonal Conflicts and Emotional Respon...,"[Open-ended questions encouraging dialogue, Te..."
4,5,Addressing process adherence within a team,"[Impact of bypassing processes on workload, Te...",Supportive and exploratory,[Team members not following critical processes...,[Effective listening and understanding by the ...,4,5,1,5,[Team and Employee Performance Management],"[[Team Dynamics and Performance Concerns], New...",[[Effective Listening and Empathetic Engagemen...
5,6,Review of a difficult conversation regarding a...,"[Promotion and salary increase request, Manage...",Supportive and exploratory,"[Manager's limitations in decision-making, Pot...","[Active listening by the assistant, Encouragem...",4,5,1,5,[Difficult Conversations with Leadership and C...,"[[Management and Decision-Making Struggles], [...",[[Effective Listening and Empathetic Engagemen...
6,7,Addressing a teammate's shift schedule change,"[Concerns about fairness and transparency, Imp...","Supportive, reflective, collaborative",[User's anxiety about addressing management de...,"[Open-ended questions from the assistant, Empa...",4,5,1,5,[Personal Conversations and Conflicts],"[[Employee Well-being and Job Security], [Team...","[Open-ended questions from the assistant, [Eff..."
7,8,Preparing for a difficult conversation with a ...,"[Communication challenges, Emotional connectio...",Supportive and inquiry-based,[Cousin's interruptions and belittling behavio...,"[Identifying and articulating core values, Bre...",4,5,1,5,[Personal Conversations and Conflicts],[[Interpersonal Conflicts and Emotional Respon...,"[Identifying and articulating core values, Bre..."
8,9,Employee dissatisfaction with bonus component ...,"[Context of the employee's role, Employee's un...",Supportive and exploratory,[Employee's feelings of unfairness regarding b...,[Clarification of the situation through open-e...,3,4,1,4,[Client and Customer Management Issues],"[[Employee Engagement and Feelings of Value], ...",[Clarification of the situation through open-e...
9,10,Preparing for a difficult conversation regardi...,[Frustration with reassessment of feature desi...,Collaborative and reflective,"[Defending a compromised idea, Navigating frus...","[Identifying and articulating feelings, Separa...",4,5,1,5,[Difficult Conversations with Leadership and C...,"[[Management and Decision-Making Struggles], [...",[[Exploring Feelings and Unexpressed Thoughts]...
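
Because every free-text `main_topic` is unique, counting topics per conversation gives a count of 1 for each. The `topic_cluster` column is likely a better basis for frequency counts. The sketch below assumes each `topic_cluster` cell holds a list of label strings, as the bracketed display suggests. The `demo_output_df` frame and its labels are made up for illustration.

```python
import pandas as pd

# Hypothetical miniature of output_df; real labels come from the clustering step.
demo_output_df = pd.DataFrame(
    {
        "conversation_id": [1, 2, 3, 4],
        "topic_cluster": [
            ["Cluster A"],
            ["Cluster A"],
            ["Cluster B"],
            ["Cluster A", "Cluster B"],
        ],
    }
)

# explode() gives each list element its own row, so value_counts()
# counts individual labels rather than whole lists.
cluster_counts = demo_output_df["topic_cluster"].explode().value_counts()
print(cluster_counts)
```

A conversation assigned to several clusters contributes one count to each of them, so the counts can sum to more than the number of conversations.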


In [6]:
challenge_clusters

{'[Employee Engagement and Feelings of Value]': ['Addressing potential feelings of unfairness among employees',
  'Feeling of being undervalued and unheard',
  "User's sense of frustration and feeling undervalued",
  "Employee's feelings of unfairness regarding bonus timing",
  'User feels undervalued compared to peers',
  "Concerns about recognition of the user's contributions",
  'Feelings of being undervalued'],
 '[Communication and Miscommunication Issues]': ['Unspoken feelings and thoughts not being addressed',
  'Miscommunication regarding the importance of documentation',
  'Navigating communication without complete information',
  "Tech lead's silence possibly leading to misunderstandings",
  'Unspoken issues affecting performance were not addressed',
  'Navigating frustrations with decision-making',
  'Difficulty in effectively expressing feelings'],
 '[Team Dynamics and Performance Concerns]': ['Inexperienced team members affecting performance',
  'Low response and enrollment
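
The mapping printed above goes from a cluster label to the list of challenges assigned to it. A small sketch of two derived views follows: the size of each cluster and a reverse lookup from challenge to label. It assumes the dict-of-lists shape shown in the output; `demo_challenge_clusters` is placeholder data, not the real result.

```python
# Placeholder data with the same shape as challenge_clusters: label -> list of challenges.
demo_challenge_clusters = {
    "[Cluster X]": ["challenge 1", "challenge 2"],
    "[Cluster Y]": ["challenge 3"],
}

# Number of challenges assigned to each cluster.
cluster_sizes = {label: len(items) for label, items in demo_challenge_clusters.items()}

# Reverse lookup: challenge text -> cluster label.
# If a challenge appears under several labels, the last one wins.
challenge_to_cluster = {
    item: label
    for label, items in demo_challenge_clusters.items()
    for item in items
}

print(cluster_sizes)
print(challenge_to_cluster["challenge 3"])
```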

In [7]:
# Persist the clustered results without writing the DataFrame index
output_df_path = "../data/output_df.csv"
output_df.to_csv(output_df_path, index=False)
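
One caveat when saving list-valued columns such as `subtopics` or `key_challenges` to CSV: pandas writes each list as its string representation, so reading the file back yields strings, not lists. The sketch below uses a made-up two-row frame to show the effect and two possible ways around it: parsing the strings with `ast.literal_eval`, or writing JSON Lines instead. It does not cover the `analysis` column, whose objects would need their own serialization.

```python
import ast
import io

import pandas as pd

# Made-up frame with one list-valued column, similar in shape to output_df.
demo_df = pd.DataFrame({"conversation_id": [1, 2], "subtopics": [["a", "b"], ["c"]]})

# Round trip through CSV: the lists come back as strings like "['a', 'b']".
csv_buffer = io.StringIO()
demo_df.to_csv(csv_buffer, index=False)
csv_buffer.seek(0)
from_csv = pd.read_csv(csv_buffer)

# Option 1: parse the string representation back into Python lists.
restored = from_csv["subtopics"].apply(ast.literal_eval)

# Option 2: JSON Lines keeps lists as lists without extra parsing.
json_text = demo_df.to_json(orient="records", lines=True)
from_json = pd.read_json(io.StringIO(json_text), lines=True)

print(type(from_csv["subtopics"][0]).__name__)
print(restored[0], from_json["subtopics"][0])
```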