# Research Methods Classification Using the Section Classifier Framework

This notebook demonstrates the new section classifier framework by discovering, classifying, and reviewing research-method categories in the Methods sections of physics education research articles.



## Initial Setup

In [None]:
import pandas as pd
import json

from LLM_classifier import (
    load_categories_from_json,
    save_categories_to_json,
    create_timestamped_path,
    CostEstimationWrapper,
    Section
)
from methods_classifiers import MethodsSectionClassifier

import sys
import os
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))
from python_code import helpers


from IPython.display import Markdown, display  # type: ignore



In [None]:

# Load the API key from the secrets file and initialize the classifier
api_key = helpers.safe_load_env_variable("OPENAI_API_KEY", "../.env.secret")
classifier = MethodsSectionClassifier(
    api_key=api_key,
    temperature=0.6,
    max_tokens=15000,
    general_model="gpt-4.1-mini",
    reasoning_model="o3-mini"
)

## Data loading

In [None]:
# Load the data
df = pd.read_pickle("../section_type_classification/data/classified_sections_light_gpt-4.1-mini_20250716_122412.pkl")

# Load existing methods categories
methods = load_categories_from_json("./methods_data/initial_categories_20250807_104053.json")

Successfully loaded 25 categories from ./methods_data/initial_categories_20250807_104053.json


## Filter Research Methods Sections

In [5]:

# Helper function to filter DataFrame by category
def filter_df_by_category(
    df,
    value: str,
    column: str = "classification_highest_prob_gpt-4.1-mini"
):
    """
    Returns a DataFrame filtered to only rows where the specified column matches the given value.
    """
    mask = df[column] == value
    return df[mask].copy()

# Filter for Methods sections
methods_df = filter_df_by_category(df, "Methods")
print(f"Found {len(methods_df)} methods sections")

Found 1541 methods sections


## Discover Methods

In [None]:
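def to_string(section: Section) -> str:
    """Format a section as the text passed to category discovery."""
    return f"""Section title: {section.title}
Content: {section.text}"""

# Convert the filtered rows into Section objects
sections = classifier._df_to_sections(methods_df)

prepped_data = [to_string(el) for el in sections]

# Discovery runs only on the sections from index 400 onward
discovered_categories = classifier.discover_categories(prepped_data[400:])

print(f"Discovered {len(discovered_categories)} categories!")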

Category discovery using model: gpt-4.1-mini
Input tokens used: 105256
Output tokens used: 261
Discovered 51 categories!


In [8]:
review_result = classifier.review_and_clean_categories(discovered_categories)
reviewed_disc_categories = review_result.cleaned_categories

Category review using model: o3-mini
Reviewing 51 categories
Input tokens used: 1811
Output tokens used: 4842
Review summary:
  - Original categories: 51
  - Cleaned categories: 25
  - Removed categories: 15
  - Merge suggestions: 5
  - Removed: Recording System Requirements, Technology for Educational Recording, System Setup for Recording, Component Layout for Recording Systems, Hardware Considerations for Recording, Instructional Conceptions and Practices Framework, Data Collection and Analysis, Modeling and Theoretical Frameworks, Data Analysis Methods, Educational Context, Assessment Instruments, Curriculum and Instructional Materials, Validation and Reliability Measures, Participant Demographics, Research Questions and Objectives
  - Suggested merges: Survey Methodology in Physics Education, Experimental Design in Physics Education, Qualitative Interview Methodology in Physics Education, Instructional Study Methods in Physics Education, Data Collection Methods in Physics Education

In [None]:
save_categories_to_json(reviewed_disc_categories, "methods_data/initial_categories.json")


Categories saved to methods_data/initial_categories_20250807_104053.json
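
The saved file name differs from the path passed to `save_categories_to_json`: a timestamp is appended before the extension, which is why the data-loading cell reads `initial_categories_20250807_104053.json`. The sketch below illustrates that naming pattern and one way to pick the newest file instead of hard-coding the timestamp. `timestamped_path` and `latest_timestamped` are hypothetical helpers written for this illustration. They are not the implementation of `create_timestamped_path` in `LLM_classifier`, whose behavior is assumed here only from the file names printed above.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def timestamped_path(path: str, now: Optional[datetime] = None) -> Path:
    """Return `path` with a _YYYYMMDD_HHMMSS suffix inserted before the extension."""
    p = Path(path)
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return p.with_name(f"{p.stem}_{stamp}{p.suffix}")


def latest_timestamped(path: str) -> Optional[Path]:
    """Return the newest file matching <stem>_<date>_<time><suffix>, or None."""
    p = Path(path)
    candidates = sorted(p.parent.glob(f"{p.stem}_????????_??????{p.suffix}"))
    # This timestamp format sorts lexicographically in chronological order.
    return candidates[-1] if candidates else None


example = timestamped_path(
    "methods_data/initial_categories.json",
    now=datetime(2025, 8, 7, 10, 40, 53),
)
print(example.as_posix())
```

With a helper like `latest_timestamped`, a later cell could load the most recent category file without editing the path after every run.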


In [None]:
# methods = reviewed_disc_categories

## Batch Classify Sections

#### Estimate costs

In [None]:
cost_estimator = CostEstimationWrapper(classifier)

data = classifier._df_to_sections(methods_df)

est = cost_estimator.estimate_full_workflow(data, methods)

display(Markdown(cost_estimator.generate_cost_report(est)))

# Cost Estimation Report
Generated: 2025-08-06 00:21:54

## Category Review

- **Cost**: $0.0029
- **Model**: o3-mini
- **Input Tokens**: 6,589
- **Output Tokens**: 3,170

## Batch Classification

- **Total Cost**: $1.3243
- **Items Processed**: 589
- **Average Cost per Item**: $0.0022
- **Total Input Tokens**: 5,108,903
- **Total Output Tokens**: 47,120

## Summary

- **Total Estimated Cost**: $1.3272
- **Total Input Tokens**: 5,115,492
- **Total Output Tokens**: 50,290

*Note: These are estimates based on current pricing and may vary.*
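
The figures in the report are internally consistent, which can be checked by recomputing the summary from the per-stage numbers. The snippet below only re-adds values copied from the report above. The variable names are chosen for this illustration and are not part of `CostEstimationWrapper`.

```python
# Figures copied from the cost estimation report above.
review = {"cost": 0.0029, "input_tokens": 6_589, "output_tokens": 3_170}
batch = {"cost": 1.3243, "input_tokens": 5_108_903, "output_tokens": 47_120}
items_processed = 589

total_cost = round(review["cost"] + batch["cost"], 4)
total_input = review["input_tokens"] + batch["input_tokens"]
total_output = review["output_tokens"] + batch["output_tokens"]
avg_cost_per_item = round(batch["cost"] / items_processed, 4)

print(f"Total cost: ${total_cost:.4f}")
print(f"Total input tokens: {total_input:,}")
print(f"Total output tokens: {total_output:,}")
print(f"Average cost per item: ${avg_cost_per_item:.4f}")
```

Note that the report counts 589 items, whereas `methods_df` above contains 1541 sections, so the recorded estimate may come from a different input than the batch run that follows.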

#### Do the actual classification

Run the classification. The `category_creation_temperature` parameter controls how aggressively the model creates new categories.

In [None]:
# Run batch classification
result_df, updated_categories, errors = classifier.batch_classify_sections_df(methods_df, methods, category_creation_temperature="balanced")


Processing element 1/1541
Input tokens used: 2517
Output tokens used: 566
Processing element 2/1541
Input tokens used: 2183
Output tokens used: 486
Processing element 3/1541
Input tokens used: 3073
Output tokens used: 362
Processing element 4/1541
Input tokens used: 2630
Output tokens used: 483
Processing element 5/1541
Input tokens used: 2706
Output tokens used: 376
Processing element 6/1541
Input tokens used: 2058
Output tokens used: 282
Processing element 7/1541
Input tokens used: 2235
Output tokens used: 282
Processing element 8/1541
Input tokens used: 2404
Output tokens used: 249
Processing element 9/1541
Input tokens used: 2180
Output tokens used: 93
Processing element 10/1541
Input tokens used: 4508
Output tokens used: 169
Processing element 11/1541
Input tokens used: 2042
Output tokens used: 337
Processing element 12/1541
Input tokens used: 2506
Output tokens used: 401
Processing element 13/1541
Input tokens used: 1692
Output tokens used: 256
Processing element 14/1541
Input to

Let us preview the results to make sure they make sense.

In [8]:
result_df

Unnamed: 0,article_id,section_title,classifications_gpt-4.1-mini,probabilities_gpt-4.1-mini,highest_prob_gpt-4.1-mini,text_excerpts_gpt-4.1-mini
0,10.1103/PhysRevSTPER.2.010103,THEORETICAL FRAME: A MODEL OF COGNITION,"[Resource Framing, Knowledge Integration Frame...","{'Resource Framing': 0.6, 'Knowledge Integrati...",Resource Framing,{'Resource Framing': [{'excerpt': 'We are inte...
1,10.1103/PhysRevSTPER.2.010103,STUDENT MODEL SPACE: A MATHEMATICAL REPRESENTA...,"[Mathematical Modeling Framework, Postpositivism]","{'Mathematical Modeling Framework': 0.85, 'Pos...",Mathematical Modeling Framework,{'Mathematical Modeling Framework': [{'excerpt...
2,10.1103/PhysRevSTPER.2.010105,BACKGROUND AND VALIDITY OF BEMA,"[Measurement Theory Framework, Formative Asses...","{'Measurement Theory Framework': 0.75, 'Format...",Measurement Theory Framework,{'Measurement Theory Framework': [{'excerpt': ...
3,10.1103/PhysRevSTPER.2.020101,THEORETICAL FRAMES,"[Modeling, Structure Mapping, and Conceptual B...","{'Modeling, Structure Mapping, and Conceptual ...","Modeling, Structure Mapping, and Conceptual Bl...","{'Modeling, Structure Mapping, and Conceptual ..."
4,10.1103/PhysRevSTPER.2.020103,DEFINING SCIENTIFIC ABILITIES,[Formative Assessment Framework (Assessment fo...,{'Formative Assessment Framework (Assessment f...,Formative Assessment Framework (Assessment for...,{'Formative Assessment Framework (Assessment f...
...,...,...,...,...,...,...
584,10.1103/PhysRevPhysEducRes.20.020145,BACKGROUND,"[Philosophy & History of Science Framework, Ep...",{'Philosophy & History of Science Framework': ...,Philosophy & History of Science Framework,{'Philosophy & History of Science Framework': ...
585,10.1103/PhysRevPhysEducRes.20.020146,THE COMMOGNITIVE THEORY,[Commognitive Theory],{'Commognitive Theory': 1.0},Commognitive Theory,{'Commognitive Theory': [{'excerpt': 'Rooted i...
586,10.1103/PhysRevPhysEducRes.20.020146,ANALYSIS OF SCIENTIFIC CONTENT: CONCEPTUALIZAT...,"[Model of Educational Reconstruction (MER), Co...",{'Model of Educational Reconstruction (MER)': ...,Model of Educational Reconstruction (MER),{'Model of Educational Reconstruction (MER)': ...
587,10.1103/PhysRevPhysEducRes.20.020147,THEORETICAL BACKGROUND,[Johnson-Laird Mental Representation Framework...,{'Johnson-Laird Mental Representation Framewor...,Johnson-Laird Mental Representation Framework,{'Johnson-Laird Mental Representation Framewor...


The classifier also extracts text excerpts that support each classification. These are currently unused, but they could be used, for example, to generate improved and more detailed descriptions of research methods by passing a batch of excerpts for the same category to an LLM.

Here we print out a sample of them for a simple overview of what it extracts.

In [None]:
# Print out the text excerpts from 3 random rows using sample
sampled = result_df.sample(3, random_state=42)
for idx, row in sampled.iterrows():
    excerpts = row.get("text_excerpts_gpt-4.1-mini", [])
    # If excerpts is a string, try to parse as JSON
    if isinstance(excerpts, str):
        try:
            excerpts = json.loads(excerpts)
        except Exception:
            pass  # If it can't be parsed, leave as is
    if not excerpts or excerpts == "No excerpts found":
        print(f"\nRow {idx}: No excerpts found")
        continue
    print(f"\nRow {idx} excerpts by category:")
    for excerpt in excerpts:
        # If excerpt is a string, try to parse as JSON
        if isinstance(excerpt, str):
            try:
                excerpt = json.loads(excerpt)
            except Exception:
                excerpt = {}
        category = excerpt.get("category", "Unknown Category")
        text = excerpt.get("excerpt", "No excerpt text")
        print(f"  Category: {category}\n    Excerpt: {text}")


Row 1491 excerpts by category:
  Category: Unknown Category
    Excerpt: No excerpt text
  Category: Unknown Category
    Excerpt: No excerpt text
  Category: Unknown Category
    Excerpt: No excerpt text
  Category: Unknown Category
    Excerpt: No excerpt text

Row 1155 excerpts by category:
  Category: Unknown Category
    Excerpt: No excerpt text
  Category: Unknown Category
    Excerpt: No excerpt text
  Category: Unknown Category
    Excerpt: No excerpt text

Row 1252 excerpts by category:
  Category: Unknown Category
    Excerpt: No excerpt text
  Category: Unknown Category
    Excerpt: No excerpt text
  Category: Unknown Category
    Excerpt: No excerpt text
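
Every line above prints `Unknown Category`, which suggests that the loop does not match the stored structure. In the preview of `result_df`, `text_excerpts_gpt-4.1-mini` appears to be a dict that maps each category name to a list of `{'excerpt': ...}` dicts, so iterating over it directly yields only the category names. A minimal sketch of a traversal for that shape follows. The structure is inferred from the truncated preview, `iter_excerpts` is a hypothetical helper, and the sample values are placeholders rather than real excerpts.

```python
from typing import Any, Iterator, Tuple


def iter_excerpts(excerpts: Any) -> Iterator[Tuple[str, str]]:
    """Yield (category, excerpt_text) pairs from a category -> list-of-dicts mapping."""
    if not isinstance(excerpts, dict):
        return  # unexpected shape (e.g. a plain string): yield nothing
    for category, items in excerpts.items():
        for item in items or []:
            if isinstance(item, dict):
                yield category, item.get("excerpt", "No excerpt text")
            else:
                yield category, str(item)


# Placeholder data in the assumed shape (not taken from the dataset).
sample = {
    "Category A": [{"excerpt": "first placeholder"}, {"excerpt": "second placeholder"}],
    "Category B": [{"excerpt": "third placeholder"}],
}

pairs = list(iter_excerpts(sample))
for category, text in pairs:
    print(f"  Category: {category}\n    Excerpt: {text}")
```

In the cell above, the inner loop could call such a helper on each row's excerpts, provided the real column has the shape assumed here.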


#### Classification and new category processing and saving

In [14]:
len(updated_categories)

44

In [15]:

# Save updated categories if any new ones were discovered
if len(updated_categories) > len(methods):
    print(f"\nDiscovered {len(updated_categories) - len(methods)} new categories")
    save_categories_to_json(updated_categories, "./methods_data/reviewed_categories.json")


Discovered 19 new categories
Categories saved to ./methods_data/reviewed_categories_20250807_190400.json


In [None]:
# Merge result_df columns into methods_df using 'article_id' and 'section_title' for alignment, prefixing with 'method_'
merge_cols = [col for col in result_df.columns if col not in ['article_id', 'section_title']]
methods_df_merged = methods_df.merge(
    result_df.rename(columns={col: f"method_{col}" for col in merge_cols}),
    on=['article_id', 'section_title'],
    how='left'
)

In [None]:
import numpy as np

# Count the rows in methods_df_merged with a NaN value in 'method_probabilities_gpt-4.1-mini'
nan_count = methods_df_merged['method_probabilities_gpt-4.1-mini'].isna().sum()
print(f"Number of rows with NaN in 'method_probabilities_gpt-4.1-mini': {nan_count}")


Number of rows with NaN in 'theory_classification_gpt_4.1-mini': 0
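
A NaN count of zero shows that every section found a match, but a left merge on `article_id` and `section_title` can also duplicate rows if that key pair is not unique in `result_df`. The sketch below shows how pandas can check both conditions with the `validate` and `indicator` arguments of `DataFrame.merge`. The toy frames and their values are made up for illustration.

```python
import pandas as pd

left = pd.DataFrame({
    "article_id": ["a1", "a1", "a2"],
    "section_title": ["METHODS", "PROCEDURE", "METHODS"],
})
right = pd.DataFrame({
    "article_id": ["a1", "a2"],
    "section_title": ["METHODS", "METHODS"],
    "method_highest_prob": ["Survey", "Interview"],
})

merged = left.merge(
    right,
    on=["article_id", "section_title"],
    how="left",
    validate="many_to_one",  # raises MergeError if the key repeats in `right`
    indicator=True,          # adds a `_merge` column: 'both' or 'left_only'
)

unmatched = int((merged["_merge"] == "left_only").sum())
print(f"Rows: {len(merged)} (left had {len(left)}), unmatched: {unmatched}")
```

Comparing `len(merged)` with the length of the left frame catches accidental row duplication, and the `_merge` column identifies unmatched rows without relying on NaN values in a particular column.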


In [11]:
methods_df_merged

Unnamed: 0,article_abstract,article_articleType,article_authors,article_affiliations,article_date,article_type,article_metadata_last_modified_at,article_last_modified_at,article_id,article_identifiers,...,section_title_embedding,section_content_embedding,probability_dist_gpt-4.1-mini,classification_gpt-4.1-mini,classification_highest_prob_gpt-4.1-mini,failed_validation_gpt-4.1-mini,theory_classifications_gpt-4.1-mini,theory_probabilities_gpt-4.1-mini,theory_highest_prob_gpt-4.1-mini,theory_text_excerpts_gpt-4.1-mini
0,{'value': '<p>We report a detailed study of th...,article,"[{'type': 'Person', 'name': 'N. D. Finkelstein...","[{'name': 'Department of Physics, University o...",2005-09-08,article,2006-03-02T15:40:18+0000,2014-08-22 11:32:05+00:00,10.1103/PhysRevSTPER.1.010101,{'doi': '10.1103/PhysRevSTPER.1.010101'},...,"[-0.014114919118583202, 0.06385361403226852, 0...","[-0.030135443434119225, 0.04174668341875076, -...","{'Methods': 0.9, 'Introduction / Motivation': ...",[Methods],Methods,False,"[Tutorial Implementation, Course and Curriculu...","{'Tutorial Implementation': 0.5, 'Course and C...",Tutorial Implementation,{'Tutorial Implementation': [{'excerpt': 'Tuto...
1,{'value': '<p>We investigated the common diffi...,article,"[{'type': 'Person', 'name': 'Lorenzo G. Rimold...",[{'name': 'Department of Physics and Astronomy...,2005-10-04,article,2006-03-02T15:40:56+0000,2014-02-26 22:53:09+00:00,10.1103/PhysRevSTPER.1.010102,{'doi': '10.1103/PhysRevSTPER.1.010102'},...,"[-0.06506878137588501, 0.03131435438990593, 0....","[-0.029744913801550865, 0.005188309587538242, ...","{'Methods': 0.97, 'Introduction / Motivation':...",[Methods],Methods,False,"[Demonstration-Based Tasks, Multiple-Choice In...","{'Demonstration-Based Tasks': 0.35, 'Multiple-...",Demonstration-Based Tasks,{'Demonstration-Based Tasks': [{'excerpt': 'Th...
2,{'value': '<p>We investigated the common diffi...,article,"[{'type': 'Person', 'name': 'Lorenzo G. Rimold...",[{'name': 'Department of Physics and Astronomy...,2005-10-04,article,2006-03-02T15:40:56+0000,2014-02-26 22:53:09+00:00,10.1103/PhysRevSTPER.1.010102,{'doi': '10.1103/PhysRevSTPER.1.010102'},...,"[-0.021777695044875145, -0.006472709123045206,...","[-0.03321903944015503, -0.0014654365368187428,...","{'Methods': 0.9, 'Results': 0.05, 'Introductio...",[Methods],Methods,False,"[Demonstration-Based Tasks, Qualitative Interv...","{'Demonstration-Based Tasks': 0.75, 'Qualitati...",Demonstration-Based Tasks,{'Demonstration-Based Tasks': [{'excerpt': 'Th...
3,{'value': '<p>This paper examines the effects ...,article,"[{'type': 'Person', 'name': 'N. D. Finkelstein...","[{'name': 'Department of Physics, University o...",2005-10-06,article,2006-03-02T15:41:24+0000,2014-08-22 11:32:06+00:00,10.1103/PhysRevSTPER.1.010103,{'doi': '10.1103/PhysRevSTPER.1.010103'},...,"[-0.08216888457536697, -0.0027919262647628784,...","[-0.02911161445081234, -0.028258124366402626, ...","{'Methods': 0.85, 'Introduction / Motivation':...",[Methods],Methods,False,"[Experimental Design in Physics Education, Cou...",{'Experimental Design in Physics Education': 0...,Experimental Design in Physics Education,{'Experimental Design in Physics Education': [...
4,{'value': '<p>Student success in solving physi...,article,"[{'type': 'Person', 'name': 'Patrick B. Kohl',...","[{'name': 'Department of Physics, University o...",2005-10-19,article,2006-03-02T15:41:57+0000,2014-02-26 22:53:09+00:00,10.1103/PhysRevSTPER.1.010104,{'doi': '10.1103/PhysRevSTPER.1.010104'},...,"[-0.03895077481865883, 0.0529254786670208, 0.0...","[-0.040330804884433746, 0.021954139694571495, ...","{'Introduction / Motivation': 0.05, 'Theoretic...",[Methods],Methods,False,"[Homework and Quiz Study Methods, Experimental...","{'Homework and Quiz Study Methods': 0.7, 'Expe...",Homework and Quiz Study Methods,{'Homework and Quiz Study Methods': [{'excerpt...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1536,{'value': '<p>We examine students’ challenges ...,article,"[{'type': 'Person', 'name': 'Christof Keebaugh...","[{'name': 'Department of Physics, Pennsylvania...",2024-12-05,article,2024-12-05T16:21:31+0000,2024-12-05 16:21:31+00:00,10.1103/PhysRevPhysEducRes.20.020149,{'doi': '10.1103/PhysRevPhysEducRes.20.020149'},...,"[-0.061682552099227905, -0.005451112519949675,...","[-0.07386224716901779, 0.015368035063147545, 0...","{'Introduction / Motivation': 0.0, 'Theoretica...","[Methods, Theoretical Framework]",Methods,False,"[Pedagogical Interventions and Tools, Assessme...","{'Pedagogical Interventions and Tools': 0.6, '...",Pedagogical Interventions and Tools,{'Pedagogical Interventions and Tools': [{'exc...
1537,{'value': '<p>Undergraduate students on track ...,article,"[{'type': 'Person', 'name': 'Madeline Harmer',...","[{'name': 'Physics and Astronomy Department, <...",2024-12-13,article,2024-12-13T15:03:03+0000,2024-12-13 15:03:03+00:00,10.1103/PhysRevPhysEducRes.20.020150,{'doi': '10.1103/PhysRevPhysEducRes.20.020150'},...,"[-0.03920810669660568, 0.05327513813972473, 0....","[-0.08286938071250916, 0.034298013895750046, -...","{'Methods': 0.85, 'Limitations': 0.12, 'Ethica...","[Methods, Limitations]",Methods,False,[Laboratory Instruction and Student Activities...,{'Laboratory Instruction and Student Activitie...,Laboratory Instruction and Student Activities ...,{'Laboratory Instruction and Student Activitie...
1538,{'value': '<p>One hallmark of expertise in phy...,article,"[{'type': 'Person', 'name': 'Emily Marshman', ...",[{'name': 'Department of Physics and Natural S...,2024-12-23,article,2024-12-23T17:00:21+0000,2024-12-23 17:00:21+00:00,10.1103/PhysRevPhysEducRes.20.020152,{'doi': '10.1103/PhysRevPhysEducRes.20.020152'},...,"[-0.055263690650463104, 0.03912600874900818, 0...","[-0.06580902636051178, -0.0011794535676017404,...","{'Methods': 0.95, 'Introduction / Motivation':...",[Methods],Methods,False,"[Tutorial Implementation, Qualitative Intervie...","{'Tutorial Implementation': 0.35, 'Qualitative...",Tutorial Implementation,{'Tutorial Implementation': [{'excerpt': 'The ...
1539,{'value': '<p>[This paper is part of the Focus...,article,"[{'type': 'Person', 'name': 'Nilüfer Didiş Kör...",[{'name': 'Department of Mathematics and Scien...,2024-12-26,article,2024-12-26T14:57:19+0000,2024-12-26 14:57:19+00:00,10.1103/PhysRevPhysEducRes.20.020153,{'doi': '10.1103/PhysRevPhysEducRes.20.020153'},...,"[-0.05474643036723137, 0.03916209936141968, 0....","[-0.06832382082939148, 0.029234101995825768, -...","{'Methods': 0.95, 'Introduction / Motivation':...",[Methods],Methods,False,"[Case Study Analysis in Physics Education, Stu...",{'Case Study Analysis in Physics Education': 0...,Case Study Analysis in Physics Education,{'Case Study Analysis in Physics Education': [...


In [12]:
helpers.save_processed_embeddings(methods_df_merged, "./methods_data/methods_classification_df")

Saved processed_embeddings to ./methods_data/methods_classification_df_20250807_141802.pkl


'./methods_data/methods_classification_df_20250807_141802.pkl'

## Review categories

### Load classified data (to avoid rerunning)

In [None]:
methods_df_merged = pd.read_pickle("./methods_data/methods_classification_df_20250807_141802.pkl")
updated_categories = load_categories_from_json("./methods_data/reviewed_categories_20250721_163329.json")
print(methods_df_merged.keys())
print(len(updated_categories))

Successfully loaded 135 categories from ./reviewed_categories_20250721_163329.json
Index(['article_abstract', 'article_articleType', 'article_authors',
       'article_affiliations', 'article_date', 'article_type',
       'article_metadata_last_modified_at', 'article_last_modified_at',
       'article_id', 'article_identifiers', 'article_issue',
       'article_pageStart', 'article_hasArticleId', 'article_numPages',
       'article_publisher', 'article_rights', 'article_journal',
       'article_title', 'article_volume', 'article_notes',
       'article_tocSection', 'article_fundings',
       'article_classificationSchemes', 'article_doi', 'article_full_text_xml',
       'article_full_text', 'article_year', 'section_relative_position',
       'section_label', 'section_title', 'section_content', 'section_id',
       'section_title_embedding', 'section_content_embedding',
       'probability_dist_gpt-4.1-mini', 'classification_gpt-4.1-mini',
       'classification_highest_prob_gpt-4.1-mi

### Run Cleaning

In [16]:
reviewed_categories = classifier.review_and_clean_categories(updated_categories)

Category review using model: o3-mini
Reviewing 44 categories
Input tokens used: 2073
Output tokens used: 3374
Review summary:
  - Original categories: 44
  - Cleaned categories: 38
  - Removed categories: 4
  - Merge suggestions: 2
  - Removed: Data Collection Methods in Physics Education, Educational Psychology Framework Application, Instructional Context Description, Positionality and Identity Reflection in Physics Education Research
  - Suggested merges: Student Reasoning Analysis, Instructional Intervention Studies


In [18]:
from IPython.display import Markdown
issues = classifier._validate_review_result(reviewed_categories, updated_categories)
report = classifier.generate_validation_report(reviewed_categories, updated_categories, issues)
with open("review_report.md", "w", encoding="utf-8") as f:
    f.write(report)
display(Markdown(report))


# Category Review Validation Report

## Summary

- **Original categories**: 44
- **Cleaned categories**: 38
- **Removed categories**: 4
- **Merge suggestions**: 2

## Category Differences
This is based on the actual difference between the original categories and the cleaned categories.
It does not take into account the merge suggestions.
See the validation section for validation based on the return from the model.

### ✅ Added Categories
- `Instructional Intervention Studies`
- `Student Reasoning Analysis`

### ❌ Removed Categories
- `Data Collection Methods in Physics Education`
- `Educational Psychology Framework Application`
- `Instructional Context Description`
- `Instructional Study Methods in Physics Education`
- `Pedagogical Interventions and Tools`
- `Positionality and Identity Reflection in Physics Education Research`
- `Student Learning and Reasoning Studies`
- `Student Reasoning and Coding in Physics Education`

### ➡️ Unchanged Categories
- `Analogical Scaffolding in Instruction`
- `Animation-Based Assessment`
- `Assessment Development and Validation`
- `Case Study Analysis in Physics Education`
- `Classroom Implementation and Instructional Strategies`
- `Cluster Analysis and Student Conceptual Patterns`
- `Content Analysis in Physics Education`
- `Course Environment Analysis`
- `Course and Curriculum Design`
- `Cross-Sequential Longitudinal Study Design`
- `Demonstration-Based Tasks`
- `Design-Based Research Methodology in Physics Education`
- `Ethnographic Observation and Field Notes Analysis`
- `Experimental Apparatus and Materials`
- `Experimental Design in Physics Education`
- `Homework and Quiz Study Methods`
- `Identity Survey Study`
- `Instructional Practice Comparative Study`
- `Instructor Interview Studies`
- `Laboratory Instruction and Student Activities Analysis`
- `Longitudinal Case Study`
- `Mixed-Methods Study: Simulation and Qualitative Coding`
- `Multiple-Choice Instrument Analysis`
- `Professional Development in Physics Education`
- `Qualitative Interview Methodology in Physics Education`
- `Reasoning Chain Construction and Network Analysis in Physics Education`
- `Secondary Data Analysis in Physics Education Research`
- `Simulation and Demonstration-Based Instructional Design`
- `Social Network Analysis in Physics Education`
- `Statistical Analysis in Physics Education`
- `Survey Methodology in Physics Education`
- `Test Statistical Evaluation`
- `Thematic Analysis and Qualitative Coding`
- `Tutorial Implementation`
- `Usability and Learning Technology Studies`
- `Video Data Collection and Analysis`

## ✅ No Validation Issues

All validation checks passed successfully.

## 🔄 Merge Suggestions

**1. Student Reasoning Analysis**
Combines: `Student Learning and Reasoning Studies`, `Student Reasoning and Coding in Physics Education`
Reasoning: Both entries focus on analyzing how students think and solve physics problems. Merging them provides a unified category that captures the range of approaches (from broad investigations to detailed coding schemes) used to explore student reasoning.

**2. Instructional Intervention Studies**
Combines: `Pedagogical Interventions and Tools`, `Instructional Study Methods in Physics Education`
Reasoning: Both methods center on the evaluation of teaching interventions and the impact of instructional designs on learning outcomes. Combining them streamlines classification and highlights common features in how educational innovations are studied in physics education research.
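
The counts in this report can be reconciled by hand: the 44 original categories lose the 4 removed ones and the 4 that feed the two merges, and gain the 2 merged names, giving 38. The "Category Differences" section therefore lists 8 removed names rather than 4, because it is a plain set difference. The sketch below reproduces that logic on a subset of the names from the report. `diff_categories` and `apply_review` are hypothetical helpers, not methods of the classifier.

```python
from typing import Dict, List, Set, Tuple


def diff_categories(original: Set[str], cleaned: Set[str]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (added, removed, unchanged) as plain set differences."""
    return cleaned - original, original - cleaned, original & cleaned


def apply_review(original: Set[str], removed: Set[str], merges: Dict[str, List[str]]) -> Set[str]:
    """Drop removed categories and replace merge sources with the merged name."""
    cleaned = set(original) - removed
    for merged_name, sources in merges.items():
        cleaned -= set(sources)
        cleaned.add(merged_name)
    return cleaned


# Subset of names taken from the report above.
original = {
    "Tutorial Implementation",
    "Instructional Context Description",
    "Student Learning and Reasoning Studies",
    "Student Reasoning and Coding in Physics Education",
}
removed = {"Instructional Context Description"}
merges = {
    "Student Reasoning Analysis": [
        "Student Learning and Reasoning Studies",
        "Student Reasoning and Coding in Physics Education",
    ],
}

cleaned = apply_review(original, removed, merges)
added, dropped, unchanged = diff_categories(original, cleaned)
print(sorted(added))
print(sorted(dropped))
print(sorted(unchanged))

# Count check with the figures from the report: 44 - 4 removed - 4 merged + 2 new.
expected_cleaned_count = 44 - 4 - 4 + 2
print(expected_cleaned_count)
```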

## 📋 Complete Category Listings

### Original Categories

- **Analogical Scaffolding in Instruction**: Instructional design using analogies and multiple representations to scaffold student learning of complex physics concepts.
- **Animation-Based Assessment**: Use of computer animation versions of conceptual tests to explore student understanding and assessment validity.
- **Assessment Development and Validation**: Methods related to the creation, validation, and use of assessment instruments to measure student learning and understanding.
- **Case Study Analysis in Physics Education**: Research methodology using case study approach to construct a holistic and in-depth understanding of a situation or phenomenon through triangulation of multiple data sources including surveys, interviews, and assessments.
- **Classroom Implementation and Instructional Strategies**: Methods detailing the actual execution of instructional activities in classroom settings, including student group work, feedback, and instructor roles.
- **Cluster Analysis and Student Conceptual Patterns**: Use of cluster analysis to identify patterns in student conceptual responses in physics.
- **Content Analysis in Physics Education**: Use of content analysis methods to analyze qualitative data such as essays or interview transcripts to explore student or teacher perspectives and experiences in physics education.
- **Course Environment Analysis**: Methods for analyzing lectures, exams, and assignments to characterize instructional environments.
- **Course and Curriculum Design**: Methods involving the planning, structuring, and implementation of course content and activities, including curriculum transformation and instructional design.
- **Cross-Sequential Longitudinal Study Design**: Research methodology involving a combination of longitudinal and cross-sectional designs to track multiple cohorts over time, allowing linkage of data across cohorts to study development along a temporal scale efficiently.
- **Data Collection Methods in Physics Education**: This merged category covers procedures and tools for gathering research data, including combined descriptions of overall study design, specific data collection instruments, and protocols.
- **Demonstration-Based Tasks**: Design and use of physical demonstration tasks to probe student conceptual models during interviews.
- **Design-Based Research Methodology in Physics Education**: Research methodology involving iterative development and study of educational interventions in real classroom settings, integrating design and research to improve teaching and learning.
- **Educational Psychology Framework Application**: Application of educational psychology theoretical frameworks, such as wise schooling and stereotype threat, to interpret classroom observation data and understand student learning outcomes.
- **Ethnographic Observation and Field Notes Analysis**: Use of ethnographic methods including detailed field notes and classroom observations to analyze instructor practices and student interactions in physics education settings.
- **Experimental Apparatus and Materials**: Description and use of physical devices or teaching materials developed or adapted for research or instructional purposes.
- **Experimental Design in Physics Education**: A comprehensive category that addresses the design of controlled experiments, including participant selection, intervention procedures, pre-/post-testing, and comparative studies across domains.
- **Homework and Quiz Study Methods**: Design and use of homework and quiz problems in various representational formats to study student understanding.
- **Identity Survey Study**: Use of repeated identity survey instruments to measure constructs such as physics identity components (interest, competence, recognition) longitudinally.
- **Instructional Context Description**: Description of the instructional context including course structure, content, student demographics, and instructional approach, used to situate the research study.
- **Instructional Practice Comparative Study**: Comparative analysis of different instructors' teaching practices, including use of rubrics and coding of observational data, to relate instructional methods to student learning disparities.
- **Instructional Study Methods in Physics Education**: A unified category that includes studies evaluating instructional interventions, course formats, and materials, covering both general instructional studies and content-specific interventions (e.g., Newton's third law materials).
- **Instructor Interview Studies**: Semi-structured interviews with physics faculty to explore instructional conceptions and practices.
- **Laboratory Instruction and Student Activities Analysis**: Comparative studies of design-based and traditional labs, including coding of student activities and interactions.
- **Longitudinal Case Study**: Research methodology using longitudinal case study design involving tracking participants over time to study development and changes.
- **Mixed-Methods Study: Simulation and Qualitative Coding**: Research methodology involving the use of mixed-reality simulation activities combined with qualitative coding of participant reflections and video data to study pedagogical skills and participant impressions.
- **Multiple-Choice Instrument Analysis**: Development and use of model-based analysis of student responses to multiple-choice conceptual assessments.
- **Pedagogical Interventions and Tools**: Design and evaluation of specific teaching methods or tools aimed at improving student learning and epistemic development.
- **Positionality and Identity Reflection in Physics Education Research**: Research method category involving researchers' reflection on their own identities, experiences, and positionalities as they relate to the study of physics education, particularly focusing on how identity, recognition, and social resources impact participation and belonging in physics communities.
- **Professional Development in Physics Education**: Research methods focusing specifically on the design, implementation, and evaluation of professional development programs for physics instructors, including workshops, training meetings, observations, coaching, and reflective practices to prepare instructors for teaching reform-based physics courses.
- **Qualitative Interview Methodology in Physics Education**: This category encompasses qualitative approaches to interviewing, including think‐aloud protocols, demonstration-based and stimulated recall interviews, as well as structured protocols used with both experts and students.
- **Reasoning Chain Construction and Network Analysis in Physics Education**: Research methodology involving the use of reasoning chain construction tasks where students select and order reasoning elements to build physics problem solutions, combined with network analysis techniques to quantitatively analyze associations between reasoning elements, including methods such as network sparsification, community detection, and centrality measures.
- **Secondary Data Analysis in Physics Education Research**: Research methodology involving the reinterpretation and reanalysis of existing data sets collected by other researchers, including re-coding and re-categorization of student responses or difficulties using new theoretical frameworks.
- **Simulation and Demonstration-Based Instructional Design**: Design and use of combined simulation and demonstration-based instructional activities, including development of worksheets and homework assignments that integrate simulation exploration with conceptual questions to enhance student understanding.
- **Social Network Analysis in Physics Education**: Research methods involving the collection, construction, and analysis of social network data to study student interactions, integration, and influence within physics education contexts. This includes the use of centrality measures, network boundary definitions, weighted and directed ties, and longitudinal network data collection.
- **Statistical Analysis in Physics Education**: Use of statistical tests and models such as Shapiro-Wilk test, Kruskal-Wallis rank sum test, chi-square test, logistic regression, likelihood ratio tests, and procedures to handle missing data, applied to analyze physics education research data.
- **Student Learning and Reasoning Studies**: Studies examining student reasoning across and within domains using different representational and instructional treatments.
- **Student Reasoning and Coding in Physics Education**: Methods for coding and analyzing student reasoning patterns and responses to conceptual physics questions.
- **Survey Methodology in Physics Education**: A consolidated category covering the development, administration, scoring, analysis, validation, and reliability assessment of surveys (including instruments to assess student beliefs and attitudes).
- **Test Statistical Evaluation**: Use of item difficulty, discrimination indices, reliability coefficients, and other statistics to evaluate assessment instruments.
- **Thematic Analysis and Qualitative Coding**: Use of thematic analysis methods to qualitatively analyze qualitative data such as transcripts of peer discussions to identify themes related to learning strategies and cognitive processes.
- **Tutorial Implementation**: Description of implementing tutorial-based instruction in physics courses, including course structure, staffing, and instructional methods.
- **Usability and Learning Technology Studies**: Research methods focusing on the evaluation of educational technologies and their impact on student learning and engagement.
- **Video Data Collection and Analysis**: Methodologies for collecting and analyzing video data of student interactions and TA-student engagement in tutorials.

### Cleaned Categories

- **Analogical Scaffolding in Instruction**: Instructional design using analogies and multiple representations to scaffold student learning of complex physics concepts.
- **Animation-Based Assessment**: Use of computer animation versions of conceptual tests to explore student understanding and assessment validity.
- **Assessment Development and Validation**: Methods related to the creation, validation, and use of assessment instruments to measure student learning and understanding.
- **Case Study Analysis in Physics Education**: Research methodology using case study approach to construct a holistic and in-depth understanding of a situation or phenomenon through triangulation of multiple data sources including surveys, interviews, and assessments.
- **Classroom Implementation and Instructional Strategies**: Methods detailing the actual execution of instructional activities in classroom settings, including student group work, feedback, and instructor roles.
- **Cluster Analysis and Student Conceptual Patterns**: Use of cluster analysis to identify patterns in student conceptual responses in physics.
- **Content Analysis in Physics Education**: Use of content analysis methods to analyze qualitative data such as essays or interview transcripts to explore student or teacher perspectives and experiences in physics education.
- **Course Environment Analysis**: Methods for analyzing lectures, exams, and assignments to characterize instructional environments.
- **Course and Curriculum Design**: Methods involving the planning, structuring, and implementation of course content and activities, including curriculum transformation and instructional design.
- **Cross-Sequential Longitudinal Study Design**: Research methodology involving a combination of longitudinal and cross-sectional designs to track multiple cohorts over time, allowing linkage of data across cohorts to study development along a temporal scale efficiently.
- **Demonstration-Based Tasks**: Design and use of physical demonstration tasks to probe student conceptual models during interviews.
- **Design-Based Research Methodology in Physics Education**: Research methodology involving iterative development and study of educational interventions in real classroom settings, integrating design and research to improve teaching and learning.
- **Ethnographic Observation and Field Notes Analysis**: Use of ethnographic methods including detailed field notes and classroom observations to analyze instructor practices and student interactions in physics education settings.
- **Experimental Apparatus and Materials**: Description and use of physical devices or teaching materials developed or adapted for research or instructional purposes.
- **Experimental Design in Physics Education**: A comprehensive category that addresses the design of controlled experiments, including participant selection, intervention procedures, pre-/post-testing, and comparative studies across domains.
- **Homework and Quiz Study Methods**: Design and use of homework and quiz problems in various representational formats to study student understanding.
- **Identity Survey Study**: Use of repeated identity survey instruments to measure constructs such as physics identity components (interest, competence, recognition) longitudinally.
- **Instructional Intervention Studies**: This category consolidates research methods that evaluate and compare instructional interventions, including both the design and evaluation of specific pedagogical tools and broader instructional studies. It merges entries such as 'Pedagogical Interventions and Tools' and 'Instructional Study Methods in Physics Education', which both address systematic evaluation of instructional practices and course innovations.
- **Instructional Practice Comparative Study**: Comparative analysis of different instructors' teaching practices, including use of rubrics and coding of observational data, to relate instructional methods to student learning disparities.
- **Instructor Interview Studies**: Semi-structured interviews with physics faculty to explore instructional conceptions and practices.
- **Laboratory Instruction and Student Activities Analysis**: Comparative studies of design-based and traditional labs, including coding of student activities and interactions.
- **Longitudinal Case Study**: Research methodology using longitudinal case study design involving tracking participants over time to study development and changes.
- **Mixed-Methods Study: Simulation and Qualitative Coding**: Research methodology involving the use of mixed-reality simulation activities combined with qualitative coding of participant reflections and video data to study pedagogical skills and participant impressions.
- **Multiple-Choice Instrument Analysis**: Development and use of model-based analysis of student responses to multiple-choice conceptual assessments.
- **Professional Development in Physics Education**: Research methods focusing specifically on the design, implementation, and evaluation of professional development programs for physics instructors, including workshops, training meetings, observations, coaching, and reflective practices to prepare instructors for teaching reform-based physics courses.
- **Qualitative Interview Methodology in Physics Education**: This category encompasses qualitative approaches to interviewing, including think‐aloud protocols, demonstration-based and stimulated recall interviews, as well as structured protocols used with both experts and students.
- **Reasoning Chain Construction and Network Analysis in Physics Education**: Research methodology involving the use of reasoning chain construction tasks where students select and order reasoning elements to build physics problem solutions, combined with network analysis techniques to quantitatively analyze associations between reasoning elements, including methods such as network sparsification, community detection, and centrality measures.
- **Secondary Data Analysis in Physics Education Research**: Research methodology involving the reinterpretation and reanalysis of existing data sets collected by other researchers, including re-coding and re-categorization of student responses or difficulties using new theoretical frameworks.
- **Simulation and Demonstration-Based Instructional Design**: Design and use of combined simulation and demonstration-based instructional activities, including development of worksheets and homework assignments that integrate simulation exploration with conceptual questions to enhance student understanding.
- **Social Network Analysis in Physics Education**: Research methods involving the collection, construction, and analysis of social network data to study student interactions, integration, and influence within physics education contexts. This includes the use of centrality measures, network boundary definitions, weighted and directed ties, and longitudinal network data collection.
- **Statistical Analysis in Physics Education**: Use of statistical tests and models such as Shapiro-Wilk test, Kruskal-Wallis rank sum test, chi-square test, logistic regression, likelihood ratio tests, and procedures to handle missing data, applied to analyze physics education research data.
- **Student Reasoning Analysis**: This merged category encompasses methods that investigate student reasoning and conceptual understanding through both qualitative coding and analysis of reasoning tasks. It includes approaches originally labeled as 'Student Learning and Reasoning Studies' and 'Student Reasoning and Coding in Physics Education', uniting studies examining reasoning processes with those involving systematic coding of student responses.
- **Survey Methodology in Physics Education**: A consolidated category covering the development, administration, scoring, analysis, validation, and reliability assessment of surveys (including instruments to assess student beliefs and attitudes).
- **Test Statistical Evaluation**: Use of item difficulty, discrimination indices, reliability coefficients, and other statistics to evaluate assessment instruments.
- **Thematic Analysis and Qualitative Coding**: Use of thematic analysis methods to qualitatively analyze qualitative data such as transcripts of peer discussions to identify themes related to learning strategies and cognitive processes.
- **Tutorial Implementation**: Description of implementing tutorial-based instruction in physics courses, including course structure, staffing, and instructional methods.
- **Usability and Learning Technology Studies**: Research methods focusing on the evaluation of educational technologies and their impact on student learning and engagement.
- **Video Data Collection and Analysis**: Methodologies for collecting and analyzing video data of student interactions and TA-student engagement in tutorials.

### Removed Categories

- **Data Collection Methods in Physics Education**: This category attempted to cover overall study design, individual data collection instruments, and protocols under one umbrella. (Reason: The category is too broad and general, lacking the specificity needed to be useful for classification as a discrete research method.)
- **Educational Psychology Framework Application**: This entry describes the application of educational psychology theoretical frameworks (e.g., wise schooling, stereotype threat) to interpret data rather than a concrete research method. (Reason: It is a theoretical framework application, not a research method per se. It does not describe a procedure or protocol for collecting or analyzing data.)
- **Instructional Context Description**: This entry focuses on describing the instructional context (course structure, student demographics, etc.) used to situate research rather than outlining a systematic method. (Reason: It is a descriptive background element rather than a research method or approach for data collection or analysis.)
- **Positionality and Identity Reflection in Physics Education Research**: This category involves researcher reflexivity and personal narrative on identity and positionality. (Reason: While critical for qualitative work, it is not a research method but rather a practice of reflective commentary that does not specify a systematic procedure for data collection or analysis.)
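
The difference between the original and cleaned lists can be checked mechanically by comparing titles. The following is a minimal sketch, not the notebook's own code: `diff_titles` is a hypothetical helper, and it assumes each category is a dict with a `"title"` key, as in the JSON listings in this notebook. The sample uses a handful of titles from the lists above rather than the full set.

```python
def diff_titles(original, cleaned):
    """Return (dropped, introduced) title lists, each sorted alphabetically.

    dropped:    titles present in `original` but absent from `cleaned`
                (removed outright or absorbed into a merged category)
    introduced: titles present only in `cleaned` (new merged categories)
    """
    original_titles = {category["title"] for category in original}
    cleaned_titles = {category["title"] for category in cleaned}
    dropped = sorted(original_titles - cleaned_titles)
    introduced = sorted(cleaned_titles - original_titles)
    return dropped, introduced


# Small illustrative subset of the lists above (descriptions omitted).
original = [
    {"title": "Pedagogical Interventions and Tools"},
    {"title": "Instructional Study Methods in Physics Education"},
    {"title": "Instructional Context Description"},
    {"title": "Tutorial Implementation"},
]
cleaned = [
    {"title": "Instructional Intervention Studies"},
    {"title": "Tutorial Implementation"},
]

dropped, introduced = diff_titles(original, cleaned)
print("Dropped or merged away:", dropped)
print("Newly introduced:", introduced)
```

A title comparison cannot tell a removed category from one that was merged into another; that distinction still has to come from the removal reasons and the merged descriptions.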

## 📄 JSON Format (Copy-Paste Ready)

### Original Categories (JSON)
```json
[
  {
    "title": "Analogical Scaffolding in Instruction",
    "description": "Instructional design using analogies and multiple representations to scaffold student learning of complex physics concepts."
  },
  {
    "title": "Animation-Based Assessment",
    "description": "Use of computer animation versions of conceptual tests to explore student understanding and assessment validity."
  },
  {
    "title": "Assessment Development and Validation",
    "description": "Methods related to the creation, validation, and use of assessment instruments to measure student learning and understanding."
  },
  {
    "title": "Case Study Analysis in Physics Education",
    "description": "Research methodology using case study approach to construct a holistic and in-depth understanding of a situation or phenomenon through triangulation of multiple data sources including surveys, interviews, and assessments."
  },
  {
    "title": "Classroom Implementation and Instructional Strategies",
    "description": "Methods detailing the actual execution of instructional activities in classroom settings, including student group work, feedback, and instructor roles."
  },
  {
    "title": "Cluster Analysis and Student Conceptual Patterns",
    "description": "Use of cluster analysis to identify patterns in student conceptual responses in physics."
  },
  {
    "title": "Content Analysis in Physics Education",
    "description": "Use of content analysis methods to analyze qualitative data such as essays or interview transcripts to explore student or teacher perspectives and experiences in physics education."
  },
  {
    "title": "Course Environment Analysis",
    "description": "Methods for analyzing lectures, exams, and assignments to characterize instructional environments."
  },
  {
    "title": "Course and Curriculum Design",
    "description": "Methods involving the planning, structuring, and implementation of course content and activities, including curriculum transformation and instructional design."
  },
  {
    "title": "Cross-Sequential Longitudinal Study Design",
    "description": "Research methodology involving a combination of longitudinal and cross-sectional designs to track multiple cohorts over time, allowing linkage of data across cohorts to study development along a temporal scale efficiently."
  },
  {
    "title": "Data Collection Methods in Physics Education",
    "description": "This merged category covers procedures and tools for gathering research data, including combined descriptions of overall study design, specific data collection instruments, and protocols."
  },
  {
    "title": "Demonstration-Based Tasks",
    "description": "Design and use of physical demonstration tasks to probe student conceptual models during interviews."
  },
  {
    "title": "Design-Based Research Methodology in Physics Education",
    "description": "Research methodology involving iterative development and study of educational interventions in real classroom settings, integrating design and research to improve teaching and learning."
  },
  {
    "title": "Educational Psychology Framework Application",
    "description": "Application of educational psychology theoretical frameworks, such as wise schooling and stereotype threat, to interpret classroom observation data and understand student learning outcomes."
  },
  {
    "title": "Ethnographic Observation and Field Notes Analysis",
    "description": "Use of ethnographic methods including detailed field notes and classroom observations to analyze instructor practices and student interactions in physics education settings."
  },
  {
    "title": "Experimental Apparatus and Materials",
    "description": "Description and use of physical devices or teaching materials developed or adapted for research or instructional purposes."
  },
  {
    "title": "Experimental Design in Physics Education",
    "description": "A comprehensive category that addresses the design of controlled experiments, including participant selection, intervention procedures, pre-/post-testing, and comparative studies across domains."
  },
  {
    "title": "Homework and Quiz Study Methods",
    "description": "Design and use of homework and quiz problems in various representational formats to study student understanding."
  },
  {
    "title": "Identity Survey Study",
    "description": "Use of repeated identity survey instruments to measure constructs such as physics identity components (interest, competence, recognition) longitudinally."
  },
  {
    "title": "Instructional Context Description",
    "description": "Description of the instructional context including course structure, content, student demographics, and instructional approach, used to situate the research study."
  },
  {
    "title": "Instructional Practice Comparative Study",
    "description": "Comparative analysis of different instructors' teaching practices, including use of rubrics and coding of observational data, to relate instructional methods to student learning disparities."
  },
  {
    "title": "Instructional Study Methods in Physics Education",
    "description": "A unified category that includes studies evaluating instructional interventions, course formats, and materials, covering both general instructional studies and content-specific interventions (e.g., Newton's third law materials)."
  },
  {
    "title": "Instructor Interview Studies",
    "description": "Semi-structured interviews with physics faculty to explore instructional conceptions and practices."
  },
  {
    "title": "Laboratory Instruction and Student Activities Analysis",
    "description": "Comparative studies of design-based and traditional labs, including coding of student activities and interactions."
  },
  {
    "title": "Longitudinal Case Study",
    "description": "Research methodology using longitudinal case study design involving tracking participants over time to study development and changes."
  },
  {
    "title": "Mixed-Methods Study: Simulation and Qualitative Coding",
    "description": "Research methodology involving the use of mixed-reality simulation activities combined with qualitative coding of participant reflections and video data to study pedagogical skills and participant impressions."
  },
  {
    "title": "Multiple-Choice Instrument Analysis",
    "description": "Development and use of model-based analysis of student responses to multiple-choice conceptual assessments."
  },
  {
    "title": "Pedagogical Interventions and Tools",
    "description": "Design and evaluation of specific teaching methods or tools aimed at improving student learning and epistemic development."
  },
  {
    "title": "Positionality and Identity Reflection in Physics Education Research",
    "description": "Research method category involving researchers' reflection on their own identities, experiences, and positionalities as they relate to the study of physics education, particularly focusing on how identity, recognition, and social resources impact participation and belonging in physics communities."
  },
  {
    "title": "Professional Development in Physics Education",
    "description": "Research methods focusing specifically on the design, implementation, and evaluation of professional development programs for physics instructors, including workshops, training meetings, observations, coaching, and reflective practices to prepare instructors for teaching reform-based physics courses."
  },
  {
    "title": "Qualitative Interview Methodology in Physics Education",
    "description": "This category encompasses qualitative approaches to interviewing, including think\u2010aloud protocols, demonstration-based and stimulated recall interviews, as well as structured protocols used with both experts and students."
  },
  {
    "title": "Reasoning Chain Construction and Network Analysis in Physics Education",
    "description": "Research methodology involving the use of reasoning chain construction tasks where students select and order reasoning elements to build physics problem solutions, combined with network analysis techniques to quantitatively analyze associations between reasoning elements, including methods such as network sparsification, community detection, and centrality measures."
  },
  {
    "title": "Secondary Data Analysis in Physics Education Research",
    "description": "Research methodology involving the reinterpretation and reanalysis of existing data sets collected by other researchers, including re-coding and re-categorization of student responses or difficulties using new theoretical frameworks."
  },
  {
    "title": "Simulation and Demonstration-Based Instructional Design",
    "description": "Design and use of combined simulation and demonstration-based instructional activities, including development of worksheets and homework assignments that integrate simulation exploration with conceptual questions to enhance student understanding."
  },
  {
    "title": "Social Network Analysis in Physics Education",
    "description": "Research methods involving the collection, construction, and analysis of social network data to study student interactions, integration, and influence within physics education contexts. This includes the use of centrality measures, network boundary definitions, weighted and directed ties, and longitudinal network data collection."
  },
  {
    "title": "Statistical Analysis in Physics Education",
    "description": "Use of statistical tests and models such as Shapiro-Wilk test, Kruskal-Wallis rank sum test, chi-square test, logistic regression, likelihood ratio tests, and procedures to handle missing data, applied to analyze physics education research data."
  },
  {
    "title": "Student Learning and Reasoning Studies",
    "description": "Studies examining student reasoning across and within domains using different representational and instructional treatments."
  },
  {
    "title": "Student Reasoning and Coding in Physics Education",
    "description": "Methods for coding and analyzing student reasoning patterns and responses to conceptual physics questions."
  },
  {
    "title": "Survey Methodology in Physics Education",
    "description": "A consolidated category covering the development, administration, scoring, analysis, validation, and reliability assessment of surveys (including instruments to assess student beliefs and attitudes)."
  },
  {
    "title": "Test Statistical Evaluation",
    "description": "Use of item difficulty, discrimination indices, reliability coefficients, and other statistics to evaluate assessment instruments."
  },
  {
    "title": "Thematic Analysis and Qualitative Coding",
    "description": "Use of thematic analysis methods to qualitatively analyze qualitative data such as transcripts of peer discussions to identify themes related to learning strategies and cognitive processes."
  },
  {
    "title": "Tutorial Implementation",
    "description": "Description of implementing tutorial-based instruction in physics courses, including course structure, staffing, and instructional methods."
  },
  {
    "title": "Usability and Learning Technology Studies",
    "description": "Research methods focusing on the evaluation of educational technologies and their impact on student learning and engagement."
  },
  {
    "title": "Video Data Collection and Analysis",
    "description": "Methodologies for collecting and analyzing video data of student interactions and TA-student engagement in tutorials."
  }
]
```
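
Before reusing a pasted block like the one above, it may help to confirm that it parses and that every entry has the expected shape. The sketch below uses only the standard library. `validate_categories` is a hypothetical helper, not part of `LLM_classifier`, and the sample string is an abbreviated two-entry excerpt of the listing with the second description shortened.

```python
import json
from collections import Counter


def validate_categories(raw_json):
    """Parse a JSON array of categories and collect basic structural problems."""
    categories = json.loads(raw_json)
    problems = []
    for index, entry in enumerate(categories):
        # Every entry is expected to carry both keys used in this notebook.
        missing = {"title", "description"} - set(entry)
        if missing:
            problems.append(f"entry {index} is missing {sorted(missing)}")
    # Duplicate titles would make later title-based lookups ambiguous.
    title_counts = Counter(entry.get("title") for entry in categories)
    for title, count in title_counts.items():
        if count > 1:
            problems.append(f"duplicate title: {title!r}")
    return categories, problems


# Raw string so the \u2010 escape reaches the JSON parser unchanged.
sample = r'''
[
  {
    "title": "Cluster Analysis and Student Conceptual Patterns",
    "description": "Use of cluster analysis to identify patterns in student conceptual responses in physics."
  },
  {
    "title": "Qualitative Interview Methodology in Physics Education",
    "description": "Qualitative approaches to interviewing, including think\u2010aloud protocols."
  }
]
'''

categories, problems = validate_categories(sample)
print(f"Parsed {len(categories)} categories; problems: {problems}")
```

The `\u2010` escape in the JSON decodes to the Unicode hyphen character, which is why the Markdown listing shows "think‐aloud" with a non-ASCII hyphen.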

### Cleaned Categories (JSON)
```json
[
  {
    "title": "Analogical Scaffolding in Instruction",
    "description": "Instructional design using analogies and multiple representations to scaffold student learning of complex physics concepts."
  },
  {
    "title": "Animation-Based Assessment",
    "description": "Use of computer animation versions of conceptual tests to explore student understanding and assessment validity."
  },
  {
    "title": "Assessment Development and Validation",
    "description": "Methods related to the creation, validation, and use of assessment instruments to measure student learning and understanding."
  },
  {
    "title": "Case Study Analysis in Physics Education",
    "description": "Research methodology using case study approach to construct a holistic and in-depth understanding of a situation or phenomenon through triangulation of multiple data sources including surveys, interviews, and assessments."
  },
  {
    "title": "Classroom Implementation and Instructional Strategies",
    "description": "Methods detailing the actual execution of instructional activities in classroom settings, including student group work, feedback, and instructor roles."
  },
  {
    "title": "Cluster Analysis and Student Conceptual Patterns",
    "description": "Use of cluster analysis to identify patterns in student conceptual responses in physics."
  },
  {
    "title": "Content Analysis in Physics Education",
    "description": "Use of content analysis methods to analyze qualitative data such as essays or interview transcripts to explore student or teacher perspectives and experiences in physics education."
  },
  {
    "title": "Course Environment Analysis",
    "description": "Methods for analyzing lectures, exams, and assignments to characterize instructional environments."
  },
  {
    "title": "Course and Curriculum Design",
    "description": "Methods involving the planning, structuring, and implementation of course content and activities, including curriculum transformation and instructional design."
  },
  {
    "title": "Cross-Sequential Longitudinal Study Design",
    "description": "Research methodology involving a combination of longitudinal and cross-sectional designs to track multiple cohorts over time, allowing linkage of data across cohorts to study development along a temporal scale efficiently."
  },
  {
    "title": "Demonstration-Based Tasks",
    "description": "Design and use of physical demonstration tasks to probe student conceptual models during interviews."
  },
  {
    "title": "Design-Based Research Methodology in Physics Education",
    "description": "Research methodology involving iterative development and study of educational interventions in real classroom settings, integrating design and research to improve teaching and learning."
  },
  {
    "title": "Ethnographic Observation and Field Notes Analysis",
    "description": "Use of ethnographic methods including detailed field notes and classroom observations to analyze instructor practices and student interactions in physics education settings."
  },
  {
    "title": "Experimental Apparatus and Materials",
    "description": "Description and use of physical devices or teaching materials developed or adapted for research or instructional purposes."
  },
  {
    "title": "Experimental Design in Physics Education",
    "description": "A comprehensive category that addresses the design of controlled experiments, including participant selection, intervention procedures, pre-/post-testing, and comparative studies across domains."
  },
  {
    "title": "Homework and Quiz Study Methods",
    "description": "Design and use of homework and quiz problems in various representational formats to study student understanding."
  },
  {
    "title": "Identity Survey Study",
    "description": "Use of repeated identity survey instruments to measure constructs such as physics identity components (interest, competence, recognition) longitudinally."
  },
  {
    "title": "Instructional Intervention Studies",
    "description": "This category consolidates research methods that evaluate and compare instructional interventions, including both the design and evaluation of specific pedagogical tools and broader instructional studies. It merges entries such as 'Pedagogical Interventions and Tools' and 'Instructional Study Methods in Physics Education', which both address systematic evaluation of instructional practices and course innovations."
  },
  {
    "title": "Instructional Practice Comparative Study",
    "description": "Comparative analysis of different instructors' teaching practices, including use of rubrics and coding of observational data, to relate instructional methods to student learning disparities."
  },
  {
    "title": "Instructor Interview Studies",
    "description": "Semi-structured interviews with physics faculty to explore instructional conceptions and practices."
  },
  {
    "title": "Laboratory Instruction and Student Activities Analysis",
    "description": "Comparative studies of design-based and traditional labs, including coding of student activities and interactions."
  },
  {
    "title": "Longitudinal Case Study",
    "description": "Research methodology using longitudinal case study design involving tracking participants over time to study development and changes."
  },
  {
    "title": "Mixed-Methods Study: Simulation and Qualitative Coding",
    "description": "Research methodology involving the use of mixed-reality simulation activities combined with qualitative coding of participant reflections and video data to study pedagogical skills and participant impressions."
  },
  {
    "title": "Multiple-Choice Instrument Analysis",
    "description": "Development and use of model-based analysis of student responses to multiple-choice conceptual assessments."
  },
  {
    "title": "Professional Development in Physics Education",
    "description": "Research methods focusing specifically on the design, implementation, and evaluation of professional development programs for physics instructors, including workshops, training meetings, observations, coaching, and reflective practices to prepare instructors for teaching reform-based physics courses."
  },
  {
    "title": "Qualitative Interview Methodology in Physics Education",
    "description": "This category encompasses qualitative approaches to interviewing, including think\u2010aloud protocols, demonstration-based and stimulated recall interviews, as well as structured protocols used with both experts and students."
  },
  {
    "title": "Reasoning Chain Construction and Network Analysis in Physics Education",
    "description": "Research methodology involving the use of reasoning chain construction tasks where students select and order reasoning elements to build physics problem solutions, combined with network analysis techniques to quantitatively analyze associations between reasoning elements, including methods such as network sparsification, community detection, and centrality measures."
  },
  {
    "title": "Secondary Data Analysis in Physics Education Research",
    "description": "Research methodology involving the reinterpretation and reanalysis of existing data sets collected by other researchers, including re-coding and re-categorization of student responses or difficulties using new theoretical frameworks."
  },
  {
    "title": "Simulation and Demonstration-Based Instructional Design",
    "description": "Design and use of combined simulation and demonstration-based instructional activities, including development of worksheets and homework assignments that integrate simulation exploration with conceptual questions to enhance student understanding."
  },
  {
    "title": "Social Network Analysis in Physics Education",
    "description": "Research methods involving the collection, construction, and analysis of social network data to study student interactions, integration, and influence within physics education contexts. This includes the use of centrality measures, network boundary definitions, weighted and directed ties, and longitudinal network data collection."
  },
  {
    "title": "Statistical Analysis in Physics Education",
    "description": "Use of statistical tests and models such as Shapiro-Wilk test, Kruskal-Wallis rank sum test, chi-square test, logistic regression, likelihood ratio tests, and procedures to handle missing data, applied to analyze physics education research data."
  },
  {
    "title": "Student Reasoning Analysis",
    "description": "This merged category encompasses methods that investigate student reasoning and conceptual understanding through both qualitative coding and analysis of reasoning tasks. It includes approaches originally labeled as 'Student Learning and Reasoning Studies' and 'Student Reasoning and Coding in Physics Education', uniting studies examining reasoning processes with those involving systematic coding of student responses."
  },
  {
    "title": "Survey Methodology in Physics Education",
    "description": "A consolidated category covering the development, administration, scoring, analysis, validation, and reliability assessment of surveys (including instruments to assess student beliefs and attitudes)."
  },
  {
    "title": "Test Statistical Evaluation",
    "description": "Use of item difficulty, discrimination indices, reliability coefficients, and other statistics to evaluate assessment instruments."
  },
  {
    "title": "Thematic Analysis and Qualitative Coding",
    "description": "Use of thematic analysis methods to qualitatively analyze qualitative data such as transcripts of peer discussions to identify themes related to learning strategies and cognitive processes."
  },
  {
    "title": "Tutorial Implementation",
    "description": "Description of implementing tutorial-based instruction in physics courses, including course structure, staffing, and instructional methods."
  },
  {
    "title": "Usability and Learning Technology Studies",
    "description": "Research methods focusing on the evaluation of educational technologies and their impact on student learning and engagement."
  },
  {
    "title": "Video Data Collection and Analysis",
    "description": "Methodologies for collecting and analyzing video data of student interactions and TA-student engagement in tutorials."
  }
]
```

### Removed Categories (JSON)
```json
[
  {
    "title": "Data Collection Methods in Physics Education",
    "description": "This category attempted to cover overall study design, individual data collection instruments, and protocols in one umbrella, making it overly broad and insufficiently specific for classification.",
    "reason": "The category is too broad and general, lacking the specificity needed to be useful for classification as a discrete research method."
  },
  {
    "title": "Educational Psychology Framework Application",
    "description": "This entry describes the application of educational psychology theoretical frameworks (e.g., wise schooling, stereotype threat) to interpret data rather than a concrete research method.",
    "reason": "It is a theoretical framework application, not a research method per se. It does not describe a procedure or protocol for collecting or analyzing data."
  },
  {
    "title": "Instructional Context Description",
    "description": "This entry focuses on describing the instructional context (course structure, student demographics, etc.) used to situate research rather than outlining a systematic method.",
    "reason": "It is a descriptive background element rather than a research method or approach for data collection or analysis."
  },
  {
    "title": "Positionality and Identity Reflection in Physics Education Research",
    "description": "This category involves researcher reflexivity and personal narrative on identity and positionality.",
    "reason": "While critical for qualitative work, it is not a research method but rather a practice of reflective commentary that does not specify a systematic procedure for data collection or analysis."
  }
]
```

### Merges (JSON)
```json
[
  {
    "merged_title": "Instructional Intervention Studies",
    "frameworks_to_merge": [
      "Pedagogical Interventions and Tools",
      "Instructional Study Methods in Physics Education"
    ],
    "reasoning": "Both methods center on the evaluation of teaching interventions and the impact of instructional designs on learning outcomes. Combining them streamlines classification and highlights common features in how educational innovations are studied in physics education research."
  },
  {
    "merged_title": "Student Reasoning Analysis",
    "frameworks_to_merge": [
      "Student Learning and Reasoning Studies",
      "Student Reasoning and Coding in Physics Education"
    ],
    "reasoning": "Both entries focus on analyzing how students think and solve physics problems. Merging them provides a unified category that captures the range of approaches (from broad investigations to detailed coding schemes) used to explore student reasoning."
  }
]
```

## 💭 Review Reasoning

Each method was reviewed considering its status as a genuine research practice in physics education, its specificity, and its conceptual distinctiveness. Categories that were overly broad, duplicated under different names, or pertained more to theoretical frameworks than to applied research procedures were either merged with overlapping categories or removed entirely. The resulting cleaned list now includes categories that are both sufficiently detailed and distinct for useful classification in physics education research.
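
The removals and merges listed above amount to a simple list transformation. The following is a minimal sketch of that bookkeeping, not the classifier's actual implementation: the `Category` dataclass and the `apply_review` helper are hypothetical stand-ins, and the merged description is left empty because in the notebook it is written by the reviewing model.

```python
from dataclasses import dataclass


@dataclass
class Category:
    """Hypothetical stand-in for the category objects used in the notebook."""
    title: str
    description: str = ""


def apply_review(categories, removed_titles, merges):
    """Drop removed categories and replace each merge group with one entry."""
    # Titles that disappear because they are folded into a merged category
    merged_sources = {t for m in merges for t in m["frameworks_to_merge"]}
    drop = set(removed_titles) | merged_sources
    kept = [c for c in categories if c.title not in drop]
    # One new entry per merge; the description is a placeholder here
    merged = [Category(m["merged_title"]) for m in merges]
    return sorted(kept + merged, key=lambda c: c.title)


initial = [
    Category("Instructional Context Description"),
    Category("Student Learning and Reasoning Studies"),
    Category("Student Reasoning and Coding in Physics Education"),
    Category("Tutorial Implementation"),
]
removed = ["Instructional Context Description"]
merges = [
    {
        "merged_title": "Student Reasoning Analysis",
        "frameworks_to_merge": [
            "Student Learning and Reasoning Studies",
            "Student Reasoning and Coding in Physics Education",
        ],
    }
]

cleaned = apply_review(initial, removed, merges)
cleaned_titles = [c.title for c in cleaned]
print(cleaned_titles)
```
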


In [19]:
save_categories_to_json(reviewed_categories.cleaned_categories, "./methods_data/reviewed_categories.json")

Categories saved to ./methods_data/reviewed_categories_20250807_190755.json


## Create Meta-Categories

In [20]:
from methods_classifiers import ResearchMethodsClassifier


# Note: gpt-4.1 is used here because the inputs are short.
classifier = ResearchMethodsClassifier(
    api_key=api_key,
    temperature=0.5,
    max_tokens=1000,
    general_model="gpt-4.1",
    reasoning_model="o3-mini",
)

In [23]:
research_methods = load_categories_from_json("./methods_data/reviewed_categories_20250807_141801.json")

Successfully loaded 44 categories from ./methods_data/reviewed_categories_20250807_141801.json


#### Discover/load initial categories

The classifier can pass the raw data, in this case the method titles and descriptions, to an LLM to discover an initial set of categories.

In [24]:
preped_discovery_data = [f"{cat.title}: {cat.description}" for cat in research_methods]
discovered_meta_categories = classifier.discover_categories(preped_discovery_data)

Category discovery using model: gpt-4.1
Input tokens used: 2026
Output tokens used: 361


In [25]:
for x in discovered_meta_categories:
    print(x.title)
    print(x.description)
    print("-"*100)

Instructional Design and Implementation Methods
Methods focused on the planning, structuring, and execution of instructional activities, including curriculum development, classroom interventions, and use of specific pedagogical tools.
----------------------------------------------------------------------------------------------------
Assessment and Evaluation Methods
Methods related to the creation, validation, statistical evaluation, and analysis of assessment instruments such as tests, surveys, and quizzes to measure student learning and conceptual understanding.
----------------------------------------------------------------------------------------------------
Qualitative Research Methods
Approaches involving the collection and analysis of qualitative data, such as interviews, thematic analysis, content analysis, and ethnographic observation, to explore perspectives, reasoning, and experiences.
----------------------------------------------------------------------------------------

#### Review and clean the initial categories

The classifier can also review and clean up the categories. Because meta-categorization uses a negligible number of tokens, we run the review on the initial categories as well.

In [29]:
review_response = classifier.review_and_clean_categories(discovered_meta_categories)
reviewed_meta_categories = review_response.cleaned_categories

Category review using model: o3-mini
Reviewing 9 categories
Input tokens used: 906
Output tokens used: 1379
Review summary:
  - Original categories: 9
  - Cleaned categories: 8
  - Removed categories: 1
  - Merge suggestions: 0
  - Removed: Secondary Data Analysis and Theoretical Framework Application


We can also save these if we'd like.

In [None]:
save_categories_to_json(reviewed_meta_categories, "./methods_data/reviewed_meta_categories.json")

Categories saved to ./methods_data/reviewed_meta_categories_20250807_191155.json


### Run meta-categorization

We can now classify the research methods using this set of meta-categories.

#### Cost estimate

Let us first estimate the cost of running our classifier with the selected models.

In [None]:
cost_classifier = CostEstimationWrapper(classifier)

estimates = cost_classifier.estimate_full_workflow(research_methods, reviewed_meta_categories, preped_discovery_data)
display(Markdown(cost_classifier.generate_cost_report(estimates)))


# Cost Estimation Report
Generated: 2025-08-06 23:35:44

## Category Discovery

- **Cost**: $0.0219
- **Model**: gpt-4.1
- **Input Tokens**: 7,619
- **Output Tokens**: 830

## Category Review

- **Cost**: $0.0004
- **Model**: o3-mini
- **Input Tokens**: 525
- **Output Tokens**: 470

## Batch Classification

- **Total Cost**: $0.2255
- **Items Processed**: 135
- **Average Cost per Item**: $0.0017
- **Total Input Tokens**: 69,546
- **Total Output Tokens**: 10,800

## Summary

- **Total Estimated Cost**: $0.2477
- **Total Input Tokens**: 77,690
- **Total Output Tokens**: 12,100

*Note: These are estimates based on current pricing and may vary.*
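
The figures in such a report come down to token counts multiplied by per-token prices. The sketch below shows that arithmetic only; `estimate_cost` is a hypothetical helper, not part of `CostEstimationWrapper`, and the per-million-token rates are assumed placeholder values that should be checked against current provider pricing.

```python
def estimate_cost(input_tokens, output_tokens,
                  price_in_per_million, price_out_per_million):
    """Return the estimated cost in USD for one call or one batch."""
    return (
        input_tokens * price_in_per_million
        + output_tokens * price_out_per_million
    ) / 1_000_000


# Assumed placeholder rates in USD per million tokens (not taken from the notebook)
ASSUMED_PRICE_IN = 2.0
ASSUMED_PRICE_OUT = 8.0

# Token counts from the "Category Discovery" section of the report
discovery_cost = estimate_cost(7_619, 830, ASSUMED_PRICE_IN, ASSUMED_PRICE_OUT)
print(f"${discovery_cost:.4f}")
```

Under these assumed rates the discovery step comes out at roughly two cents, the same order of magnitude as the report.
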

#### Classifying

In [30]:
classification_response = classifier.batch_classify(research_methods, reviewed_meta_categories, "balanced")

Processing element 1/44
Input tokens used: 691
Output tokens used: 63
Processing element 2/44
Input tokens used: 689
Output tokens used: 63
Processing element 3/44
Input tokens used: 692
Output tokens used: 102
Processing element 4/44
Input tokens used: 686
Output tokens used: 65
Processing element 5/44
Input tokens used: 690
Output tokens used: 90
Processing element 6/44
Input tokens used: 688
Output tokens used: 61
Processing element 7/44
Input tokens used: 691
Output tokens used: 90
Processing element 8/44
Input tokens used: 691
Output tokens used: 133
Processing element 9/44
Input tokens used: 694
Output tokens used: 89
Processing element 10/44
Input tokens used: 693
Output tokens used: 70
Processing element 11/44
Input tokens used: 687
Output tokens used: 57
Processing element 12/44
Input tokens used: 692
Output tokens used: 59
Processing element 13/44
Input tokens used: 693
Output tokens used: 65
Processing element 14/44
Input tokens used: 690
Output tokens used: 57
Processing el

In [31]:

classification = classification_response[0]
extended_categories = classification_response[1]

# Number of meta-categories added during classification,
# measured against the originally discovered set
categories_diff = len(extended_categories) - len(discovered_meta_categories)
print(f"{categories_diff} new meta-categories generated during classification!")

1 new meta-categories generated during classification!
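
A difference in list lengths says how many categories were added, but not which ones. A minimal sketch of a title-based comparison follows; the `Category` dataclass and `new_titles` helper are hypothetical stand-ins, and "Network Analysis Methods" is an invented example title, not a result from this run.

```python
from dataclasses import dataclass


@dataclass
class Category:
    """Hypothetical stand-in for the category objects used in the notebook."""
    title: str
    description: str = ""


def new_titles(before, after):
    """Return titles present in `after` but not in `before`, in order."""
    known = {c.title for c in before}
    return [c.title for c in after if c.title not in known]


before = [
    Category("Qualitative Research Methods"),
    Category("Assessment and Evaluation Methods"),
]
after = before + [Category("Network Analysis Methods")]  # invented example

added = new_titles(before, after)
print(added)
```
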


### Analyzing

With the methods classified, we can now look for patterns across the meta-categories.

Let us first add a new column to our DataFrame. For this, we need a mapping from methods to meta-categories.

In [None]:
category_mapping = {}

for cat in classification:
    cat_title = cat[0].title
    cat_prob = cat[1].probabilities  # This is a list of ProbabilityScore objects
    # Find the ProbabilityScore object with the highest probability
    max_prob_obj = max(cat_prob, key=lambda p: p.probability)
    category_mapping[cat_title] = max_prob_obj.category

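# NOTE: this stores each section's 'method_highest_prob_gpt-4.1-mini' value wrapped in a
# one-element list; category_mapping is not applied here.
methods_df_merged['classification_meta_framework_category'] = methods_df_merged['method_highest_prob_gpt-4.1-mini'].map(lambda el: [el])

# Alternative: map every method of a section to its meta-category and keep only unique values via a set.
# methods_df_merged['classification_meta_framework_category'] = methods_df_merged['method_classifications_gpt-4.1-mini'].map(lambda r: set([category_mapping.get(el) for el in r]))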
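
Applying a method-to-meta-category mapping to a column can be sketched with `Series.map` on toy data. The assignments in `toy_mapping` are illustrative assumptions, not the classifier's output, and `toy_df` with its column names is hypothetical.

```python
import pandas as pd

# Illustrative assignments only; the real mapping comes from the classification
toy_mapping = {
    "Tutorial Implementation": "Instructional Design and Implementation Methods",
    "Test Statistical Evaluation": "Assessment and Evaluation Methods",
}

toy_df = pd.DataFrame({
    "method": [
        "Tutorial Implementation",
        "Test Statistical Evaluation",
        "Unmapped Method",  # deliberately missing from toy_mapping
    ]
})

# Series.map with a dict yields NaN for keys that are not in the dict
toy_df["meta_category"] = toy_df["method"].map(toy_mapping)

# Rows without a meta-category are worth inspecting before plotting
unmapped = toy_df[toy_df["meta_category"].isna()]
print(unmapped["method"].tolist())
```

Checking for unmapped rows first avoids silently losing sections when rows with missing values are dropped later.
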

In [45]:
for el in methods_df_merged["classification_meta_framework_category"]:
    print(el)

['Tutorial Implementation']
['Demonstration-Based Tasks']
['Demonstration-Based Tasks']
['Experimental Design in Physics Education']
['Homework and Quiz Study Methods']
['Video Data Collection and Analysis']
['Experimental Apparatus and Materials']
['Experimental Apparatus and Materials']
['Experimental Apparatus and Materials']
['Experimental Apparatus and Materials']
['Survey Methodology in Physics Education']
['Survey Methodology in Physics Education']
['Survey Methodology in Physics Education']
['Assessment Development and Validation']
['Homework and Quiz Study Methods']
['Course Environment Analysis']
['Multiple-Choice Instrument Analysis']
['Animation-Based Assessment']
['Test Statistical Evaluation']
['Homework and Quiz Study Methods']
['Tutorial Implementation']
['Analogical Scaffolding in Instruction']
['Multiple-Choice Instrument Analysis']
['Test Statistical Evaluation']
['Assessment Development and Validation']
['Assessment Development and Validation']
['Course and Curricul

In [None]:
reversed_map = {}

for key, val in category_mapping.items():
    reversed_map.setdefault(val, []).append(key)

# with open(create_timestamped_path("./methods_data/research_methods_meta_map.json"), "w") as fp:
#     json.dump(reversed_map, fp)


# pd.DataFrame({"Meta Category":reversed_map.keys(), "Frameworks": reversed_map.values()})
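
The inversion above groups methods by their meta-category. A self-contained sketch with toy data shows the resulting structure and that it survives a JSON round trip; the mapping entries are illustrative assumptions, not results from this run.

```python
import json

# Illustrative method -> meta-category assignments
toy_mapping = {
    "Tutorial Implementation": "Instructional Design and Implementation Methods",
    "Test Statistical Evaluation": "Assessment and Evaluation Methods",
    "Survey Methodology in Physics Education": "Assessment and Evaluation Methods",
}

# Invert: meta-category -> list of methods, preserving insertion order
toy_reversed = {}
for method, meta in toy_mapping.items():
    toy_reversed.setdefault(meta, []).append(method)

# Dicts of lists of strings serialize to JSON without extra handling
payload = json.dumps(toy_reversed, indent=2, sort_keys=True)
restored = json.loads(payload)
print(payload)
```
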

Using this, we can now plot the development over time.

In [None]:
import plotly.express as px
from LLM_classifier.utils import create_timestamped_path
import pandas as pd

# Column names used for the plot
category_col = "classification_meta_framework_category"
year_col = 'article_year'

# 1. Start with the original DataFrame and drop rows with NaN values
plot_df = methods_df_merged.dropna(subset=[category_col, year_col]).copy()

# 2. Keep only rows whose category list is non-empty.
is_not_empty_mask = plot_df[category_col].str.len() > 0
plot_df = plot_df[is_not_empty_mask].copy()

print(f"Number of sections: {len(plot_df)}")

# 3. Explode the DataFrame: one row per category in each list, with the year repeated.
plot_df = plot_df.explode(category_col)

print(f"Number of exploded entries: {len(plot_df)}")

# 4. Convert year to integer
plot_df[year_col] = plot_df[year_col].astype(int)

# 5. Count sections per year and category on the exploded data.
category_counts = (
    plot_df.groupby([year_col, category_col])
    .size()
    .reset_index(name='count')
)

# 6. Create the interactive plot
fig = px.line(
    category_counts,
    x=year_col,
    y='count',
    color=category_col,
    markers=True,
    labels={
        year_col: 'Year',
        'count': 'Number of Sections',
        category_col: 'Methods Category'
    },
    title='Methods Category Counts by Year (Exploded Data)'
)

fig.update_layout(
    legend_title_text='Methods Category',
    legend=dict(
        orientation="v",
        yanchor="top",
        y=1,
        xanchor="left",
        x=1.01
    ),
    width=1000,
    height=500
)

# output_path = create_timestamped_path("plotly_method_year_exploded.html")
# fig.write_html(output_path)
# print(f"Plot saved to: {output_path}")
fig.show()


Number of sections: 1541
Number of exploded entries: 1541
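
Both counts are equal here because every list in the category column holds exactly one element, so exploding adds no rows. The toy sketch below shows what `explode` followed by `groupby` does when a section carries several categories; the data and the `categories` column name are invented for illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    "article_year": [2010, 2010, 2011],
    "categories": [["A", "B"], ["A"], ["B"]],  # first section has two categories
})

# One row per (section, category) pair; the year is repeated for each pair
exploded = toy.explode("categories")

# Count entries per year and category
counts = (
    exploded.groupby(["article_year", "categories"])
    .size()
    .reset_index(name="count")
)
print(len(toy), len(exploded))
print(counts)
```

With multi-category sections, the exploded count exceeds the section count, and a section contributes to every category it is assigned.
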
