This repository contains the solution for the AI Evaluation Engineer Technical Task for i.AI. The task involves generating synthetic consultation responses, extracting themes using Themefinder, creating a second set of theme mappings with controlled randomness, and comparing the two sets of theme mappings.
project/
├── data/ # Data directory
│ ├── synthetic_responses.json # Generated consultation responses
│ ├── theme_mapping_1.json # First set of theme mappings
│ ├── theme_mapping_2.json # Second set of theme mappings
│ ├── comparison_results.json # Detailed comparison results
│ ├── visualizations/ # Visualizations of the comparison
│ ├── sample_responses.json # Sample consultation responses
│ ├── sample_theme_mapping_1.json # Sample first theme mapping
│ ├── sample_theme_mapping_2.json # Sample second theme mapping
│ ├── sample_comparison_results.json # Sample comparison results
│ └── theme_descriptions.md # Descriptions of theme meanings
├── scripts/ # Python scripts
│ ├── data_generation.py # Script to generate synthetic responses
│ ├── theme_extraction.py # Script to extract themes using Themefinder
│ ├── theme_variation.py # Script to create a second set of theme mappings
│ ├── theme_comparison.py # Script to compare the two sets of theme mappings
│ ├── utils.py # Utility functions
│ └── test_utils.py # Tests for utility functions
├── docs/ # Documentation
│ ├── themefinder_usage.md # Guide for using Themefinder
│ └── metrics_explanation.md # Explanation of comparison metrics
├── summary.md # Summary paragraph of the comparison
├── plan.md # Working plan for the assessment
├── run_pipeline.sh # Shell script to run the complete pipeline
├── run_pipeline.py # Python script to run the complete pipeline
└── README.md # This file
- Python 3.9+
- Required packages:
  - openai
  - numpy
  - matplotlib
  - pandas
  - scikit-learn
  - themefinder (if available)
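Because themefinder is listed as optional and a fallback implementation is provided, a script has to decide at runtime which backend to use. The sketch below shows one common way to do that check. It is an illustration only, not the code in scripts/theme_extraction.py; the function names `themefinder_available` and `choose_backend` are hypothetical.

```python
import importlib.util


def themefinder_available() -> bool:
    """Return True if the optional `themefinder` package is importable."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec("themefinder") is not None


def choose_backend(force_fallback: bool = False) -> str:
    """Pick a backend name (hypothetical helper, not part of the repository).

    Returns "fallback" when the caller forces it or when themefinder is
    not installed, otherwise "themefinder".
    """
    if force_fallback or not themefinder_available():
        return "fallback"
    return "themefinder"
```

Calling `choose_backend(force_fallback=True)` always selects the fallback, which mirrors the intent of the `--force-fallback` flag described later in this README.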
- Clone the repository
- Install the required packages:
pip3 install openai numpy matplotlib pandas scikit-learn
pip3 install themefinder  # Optional, fallback implementation provided
- Clone the repository
- Create and activate a virtual environment:
# Create a virtual environment
python3 -m venv venv
# Activate the virtual environment
source venv/bin/activate
# Upgrade pip
pip install --upgrade pip
# Install the required packages
pip install -r requirements.txt
- To deactivate the virtual environment when done:
deactivate
- To activate it again later:
source venv/bin/activate
- Clone the repository
- Create and activate a conda environment:
# Create conda environment from the environment.yml file
conda env create -f environment.yml
# Activate the environment
conda activate iai-assessment
# Create a new conda environment with Python 3.9
conda create -n iai-assessment python=3.9
# Activate the environment
conda activate iai-assessment
# Install core dependencies using conda
conda install -c conda-forge numpy pandas matplotlib scikit-learn
# Install remaining dependencies using pip
pip install openai pytest black flake8 mypy
# Try to install themefinder (may not be available)
pip install themefinder
- To deactivate the conda environment when done:
conda deactivate
- To activate it again later:
conda activate iai-assessment
Generate 300 synthetic consultation responses:
python3 generate_in_batches.py
This script generates responses in small batches, pausing between batches to avoid rate limits. Generating and combining all 300 responses takes around 15-20 minutes. See BATCH_GENERATION_README.md for details.
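The batching-with-delays approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of generate_in_batches.py: the batch size, the delay, and the `generate_batch` callable (which would wrap the actual API call) are all hypothetical.

```python
import time
from typing import Callable, Iterator, List


def batch_sizes(total: int, batch_size: int) -> Iterator[int]:
    """Yield the size of each batch needed to reach `total` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    remaining = total
    while remaining > 0:
        size = min(batch_size, remaining)
        yield size
        remaining -= size


def generate_in_batches(
    generate_batch: Callable[[int], List[str]],  # hypothetical API wrapper
    total: int = 300,
    batch_size: int = 10,         # assumed value, not taken from the repo
    delay_seconds: float = 30.0,  # assumed value, not taken from the repo
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Collect `total` responses, pausing between consecutive batches."""
    responses: List[str] = []
    for size in batch_sizes(total, batch_size):
        if responses:
            # Wait before every batch except the first to stay under rate limits.
            sleep(delay_seconds)
        responses.extend(generate_batch(size))
    return responses
```

Passing `sleep` as a parameter makes the pacing easy to test without actually waiting.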
Extract themes from the synthetic responses using Themefinder (or fallback implementation):
python3 scripts/theme_extraction.py --input data/synthetic_responses.json --output data/theme_mapping_1.json
To force the fallback implementation even if Themefinder is available:
python3 scripts/theme_extraction.py --force-fallback
Create a second set of theme mappings with controlled randomness:
python3 scripts/theme_variation.py --input data/theme_mapping_1.json --output data/theme_mapping_2.json --variation 0.3
The --variation parameter controls the degree of randomness (0.0 to 1.0).
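One plausible reading of "controlled randomness" is that each assigned theme is swapped for a different one with probability equal to the variation value. The sketch below illustrates that idea only; it is not the logic of scripts/theme_variation.py. It assumes a mapping shaped as response ID to list of theme labels, which may differ from the actual JSON format, and the function name `vary_mapping` is hypothetical.

```python
import random
from typing import Dict, Iterable, List, Optional


def vary_mapping(
    mapping: Dict[str, List[str]],  # assumed shape: response ID -> theme labels
    variation: float,
    themes: Optional[Iterable[str]] = None,
    seed: Optional[int] = None,
) -> Dict[str, List[str]]:
    """Return a copy of `mapping` with themes randomly reassigned.

    Each assigned theme is replaced with probability `variation` by a theme
    that the response did not originally have and has not yet been given.
    """
    if not 0.0 <= variation <= 1.0:
        raise ValueError("variation must be between 0.0 and 1.0")
    rng = random.Random(seed)  # a seed makes the variation reproducible
    if themes is None:
        themes = {t for assigned in mapping.values() for t in assigned}
    all_themes = sorted(themes)

    varied: Dict[str, List[str]] = {}
    for response_id, assigned in mapping.items():
        new_assigned = list(assigned)
        for i in range(len(assigned)):
            if rng.random() < variation:
                alternatives = [
                    t for t in all_themes
                    if t not in assigned and t not in new_assigned
                ]
                if alternatives:  # keep the original theme if nothing else fits
                    new_assigned[i] = rng.choice(alternatives)
        varied[response_id] = new_assigned
    return varied
```

Under this sketch, a variation of 0.0 leaves the mapping unchanged, and a variation of 1.0 replaces every theme for which an unused alternative exists.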
Compare the two sets of theme mappings and generate a summary:
python3 scripts/theme_comparison.py --mapping1 data/theme_mapping_1.json --mapping2 data/theme_mapping_2.json --output data/comparison_results.json --summary summary.md
You can run the complete pipeline using either the shell script or the Python script:
./run_pipeline.sh --skip-generation
python3 run_pipeline.py --count 300 --variation 0.3
python3 run_pipeline.py --skip-generation
Both scripts will:
- Generate synthetic consultation responses (using sample data by default to avoid OpenAI API issues)
- Extract themes using Themefinder (or fallback)
- Create a second theme mapping with controlled randomness
- Compare the two theme mappings and generate a summary
Note: The pipeline is configured to use sample data by default to avoid OpenAI API rate limits and dependency issues. If you want to generate new data, you'll need to modify the scripts to remove the --use-sample flag and ensure the openai package is installed.
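To give a feel for the final comparison step listed above, the sketch below computes two simple agreement measures between a pair of theme mappings: the share of responses with identical theme sets and the mean Jaccard similarity. These are illustrative choices, not necessarily the metrics that scripts/theme_comparison.py reports (see docs/metrics_explanation.md for those). The mapping shape (response ID to list of theme labels) and the function names are assumptions.

```python
from typing import Dict, Iterable, List


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity of two theme collections (1.0 if both are empty)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def compare_mappings(
    mapping1: Dict[str, List[str]],  # assumed shape: response ID -> themes
    mapping2: Dict[str, List[str]],
) -> Dict[str, float]:
    """Summarise agreement over the response IDs present in both mappings."""
    shared = sorted(set(mapping1) & set(mapping2))
    if not shared:
        raise ValueError("the two mappings share no response IDs")
    scores = [jaccard(mapping1[r], mapping2[r]) for r in shared]
    exact = sum(1 for r in shared if set(mapping1[r]) == set(mapping2[r]))
    return {
        "responses_compared": float(len(shared)),
        "exact_match_rate": exact / len(shared),
        "mean_jaccard": sum(scores) / len(shared),
    }
```

Jaccard similarity gives partial credit when two mappings share some but not all themes for a response, whereas the exact match rate counts only full agreement.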
Additional documentation is available in the docs directory:
- docs/themefinder_usage.md: Guide for using the Themefinder API
- docs/metrics_explanation.md: Explanation of the metrics used in theme comparison
Sample data files are provided in the data directory:
- data/sample_responses.json: Sample consultation responses
- data/sample_theme_mapping_1.json: Sample first theme mapping
- data/sample_theme_mapping_2.json: Sample second theme mapping
- data/sample_comparison_results.json: Sample comparison results
- data/theme_descriptions.md: Descriptions of what each theme represents
This project was developed with AI assistance. The following AI tools were used:
- Azure OpenAI API (GPT-4o) for generating synthetic consultation responses.
- AI assistance for code development and documentation: ChatGPT 4.0 (custom prompting GPT), Perplexity MCP, Claude 3.7, and Cline for agentic AI capabilities inside VS Code.
Thomas James Butler
Date: 31/03/2025