JContro/chatbotmanip_analysis

Chatbot Manipulation Analysis

This repository contains analysis scripts for studying manipulation tactics in AI chatbot conversations. The scripts examine manipulation tactics, persuasion strategies, and inter-annotator agreement on the annotated conversation data.

Overview

The project analyzes chatbot conversations from multiple perspectives:

  • Manipulation Detection: Identifying and categorizing manipulation tactics (peer pressure, gaslighting, guilt-tripping, etc.)
  • Persuasion Analysis: Examining persuasion strategies and their effectiveness
  • Inter-Annotator Agreement: Measuring consistency between human annotators
  • Predictive Modeling: Machine learning models to classify manipulative content

Data Structure

The analysis uses three primary data files located in the data/ directory:

  • conversations.json: Raw conversation data with metadata about prompts, models, and conversation types
  • survey_responses.json: Human annotations rating manipulation tactics on various dimensions
  • all_data.json: Combined dataset with conversation content and ratings

Installation

  1. Clone this repository
  2. Install Python dependencies:
pip install -r requirements.txt
  3. (Optional) For NLP-specific analyses, download required NLTK data:
python -c "import nltk; nltk.download('punkt')"

Analysis Modules

Plots and Visualizations (analysis/plots/)

  • firestore_analysis.py: Main analysis pipeline generating correlation matrices, confusion matrices, and manipulation tactic heatmaps
  • manipulation_scores_plot.py: Visualizes distribution of manipulation scores
  • persuasion_helpful_stacked_by_model.py: Compares persuasion effectiveness across different AI models

Inter-Annotator Agreement (analysis/inter_annotator_agreement/)

  • iaa_analysis.py: Calculates agreement metrics (Cohen's Kappa, Fleiss' Kappa, etc.)
  • krippendorff_alpha_calculator.py: Computes Krippendorff's Alpha for reliability
  • disagreement_analyzer.py: Identifies and analyzes cases where annotators disagree
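
For readers unfamiliar with the agreement metrics named above, the following is a minimal sketch of Cohen's Kappa for two annotators. It is not taken from iaa_analysis.py; the function name `cohens_kappa` and the sample labels are assumptions made for illustration. Kappa is computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each annotator's label frequencies.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's Kappa for two annotators rating the same items (hypothetical helper)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both annotators must rate the same non-empty set of items")

    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the two annotators' marginal label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Kappa is undefined when both annotators use one identical label throughout.
        # This sketch returns 1.0 in that case; other tools may return NaN instead.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up binary labels (1 = manipulative, 0 = not manipulative).
annotator_a = [1, 1, 0, 0, 1, 0]
annotator_b = [1, 0, 0, 0, 1, 1]
print(round(cohens_kappa(annotator_a, annotator_b), 3))  # prints 0.333
```

In the example the annotators agree on 4 of 6 items (p_o = 0.667) while chance agreement is 0.5, which gives a kappa of 0.333.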

Predictive Modeling (analysis/predictive_modelling/)

  • bert_bilstm.py: BERT-based BiLSTM model for manipulation classification
  • zero_shot_analysis.py: Zero-shot classification using large language models
  • fold_creator.py: Creates cross-validation folds for model evaluation
  • aggregate_zero_shot_results.py: Aggregates results from multiple zero-shot experiments
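
As an illustration of what creating cross-validation folds involves, here is a minimal sketch that shuffles item identifiers and splits them into k disjoint folds. It does not reproduce fold_creator.py; the function names `make_folds` and `train_test_splits`, the default of 5 folds, and the fixed seed are all assumptions.

```python
import random


def make_folds(item_ids, k=5, seed=0):
    """Split item_ids into k disjoint folds of near-equal size (hypothetical helper)."""
    shuffled = list(item_ids)
    if k < 2 or k > len(shuffled):
        raise ValueError("k must be between 2 and the number of items")
    # A seeded generator keeps the fold assignment reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    # Striding assigns every k-th item to the same fold.
    return [shuffled[i::k] for i in range(k)]


def train_test_splits(folds):
    """Yield (train, test) pairs, holding out one fold at a time."""
    for i, test in enumerate(folds):
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        yield train, test


folds = make_folds(range(10), k=5)
for train, test in train_test_splits(folds):
    print(len(train), len(test))
```

Splitting on identifiers rather than on rows makes it easier to keep all annotations of one conversation in the same fold, which avoids leakage between training and test data when a conversation has several ratings.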

Usage

Running the Main Analysis Pipeline

cd analysis/plots
python firestore_analysis.py

This generates:

  • Manipulation tactics heatmaps
  • Correlation analysis between manipulation types
  • Confusion matrices for classification accuracy

Inter-Annotator Agreement Analysis

cd analysis/inter_annotator_agreement
python iaa_analysis.py

Predictive Model Training

cd analysis/predictive_modelling
python bert_bilstm.py

Key Findings

The analysis categorizes manipulation tactics into several dimensions:

  • Peer Pressure: Using social conformity to influence decisions
  • Reciprocity Pressure: Creating obligation through perceived favors
  • Gaslighting: Undermining the user's perception of reality
  • Guilt-Tripping: Inducing guilt to drive behavior
  • Emotional Blackmail: Threatening emotional consequences
  • Fear Enhancement: Amplifying anxieties to motivate action
  • Negging: Undermining confidence to increase compliance

Output Files

Analysis scripts generate various output files:

  • PDF/PNG plots: Visualizations of manipulation patterns and correlations
  • CSV files: Prediction results and aggregated metrics
  • JSON files: Structured analysis results and model outputs
  • Log files: Detailed execution logs for debugging

Data Loading

All scripts have been updated to use local JSON data files. The shared data loader module (analysis/shared_data_loader.py) handles data loading consistently across all analysis scripts, replacing the previous Google Cloud Firestore integration.
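
The interface of analysis/shared_data_loader.py is not documented in this README, so the following is only a sketch of how a loader for the three local JSON files could look. The names `DATA_FILES`, `load_json`, and `load_all`, and the default `data` directory argument, are hypothetical and may differ from the actual module.

```python
import json
from pathlib import Path

# File names as listed in the Data Structure section.
DATA_FILES = ("conversations.json", "survey_responses.json", "all_data.json")


def load_json(name, data_dir="data"):
    """Read one JSON file from data_dir and return the parsed content."""
    path = Path(data_dir) / name
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def load_all(data_dir="data"):
    """Load every known data file, keyed by file name without the extension."""
    return {Path(name).stem: load_json(name, data_dir) for name in DATA_FILES}
```

A script would then call `load_all("data")["conversations"]` from the repository root, or pass a different directory when running from a subfolder such as analysis/plots.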

Contributing

When adding new analysis scripts:

  1. Use the shared data loader from analysis/shared_data_loader.py
  2. Follow the existing code structure for consistency
  3. Document key findings and outputs in code comments

License

See LICENSE file for details.

Citation

If you use this analysis in your research, please cite the associated paper (details to be added).
