EgyptAmaru/chain

CHaiN

CHaiN is a tool for AI-assisted qualitative research analysis and report generation. It accepts 4 input types (research findings, transcripts, data visualizations, and structured spreadsheet data) and produces 3 export formats from a single analysis run. The design is built around a single-source-of-truth JSON schema, a 5-stage insight pipeline, and 2 human review gates placed where researcher input has the most downstream impact.

Stack

| Layer | Tools |
| --- | --- |
| Runtime | Google Apps Script |
| Frontend | HTML, CSS, JavaScript, Bootstrap |
| AI | Gemini API |
| Data | JSON schema, Google Sheets, Google Slides |

Pipeline

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'clusterBkg': '#141e26', 'clusterBorder': '#5a7a8a'}}}%%
flowchart TD
    A([Input Collection]) --> T1([Research Findings])
    A --> T2([Transcripts])
    A --> T3([Data Visualizations])
    A --> T4([Spreadsheet Data])
    T1 & T2 & T3 & T4 --> B[Project Detail Extraction\nPre-pipeline API call]
    B --> G1{Gate 1\nResearcher reviews project details}

    subgraph pipeline ["5-stage insight pipeline"]
        C["Batch Analysis\nModel: Flash-lite"]
        C --> D["Cross-file Synthesis\nModel: Flash"]
        D --> E["Thematic Grouping\nModel: Flash · Temp: 0.3"]
        E --> F["Group Data Generation\nModel: Flash"]
        F --> H["Executive Summary\nModel: Flash"]
    end

    G1 --> C
    H --> G2{Gate 2\nResearcher reviews insights}
    G2 --> I[Export Generation\nPost-pipeline rendering]
    I --> J([Slides Report])
    I --> K([Spreadsheet])
    I --> L([HTML Report])

    classDef gate fill:#2a2a2a,stroke:#c8a96e,color:#c8a96e
    classDef pipeline fill:#1e1e1e,stroke:#5a7a8a,color:#e8e4dc
    classDef pre fill:#1a1a1a,stroke:#787470,color:#c8c4bc
    classDef terminal fill:#0d1f0f,stroke:#6b9e78,color:#6b9e78

    class G1,G2 gate
    class C,D,E,F,H pipeline
    class B,I pre
    class A,J,K,L,T1,T2,T3,T4 terminal
```

User Flow

  1. Select input type: research findings, transcripts, data visualizations, or spreadsheet data
  2. Provide input files and a research plan
  3. Review project details extracted from the research plan
  4. The pipeline runs: batch analysis, synthesis, thematic grouping, group data generation, executive summary
  5. Review AI-generated insights and revise before committing
  6. The pipeline generates 3 exports from the confirmed insights
  7. Receive links to the generated assets: Slides report, spreadsheet, HTML report

Architecture Decisions

5-stage pipeline with sequential prompting

Sequential prompting processes one file at a time. This prevents two failure modes: requests that fail because they exceed token limits, and vague or incomplete responses that result when the model processes too much data at once.

Model selection and temperature are set per stage based on cost and quality tradeoffs. Gemini Flash-lite handles batch analysis because cost compounds per file and single-file extraction does not require the stronger model. Gemini Flash handles synthesis and thematic grouping because these stages run once and synthesis has no human review gate, making quality preservation the priority. Temperature is fixed at 0.3 for thematic grouping because inconsistent theme labels cause the grouping logic to fail downstream.
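A minimal sketch of how per-stage settings and one-file-at-a-time batch analysis could fit together. The stage keys, the `STAGE_CONFIG` shape, and the injected `callModel` function are illustrative assumptions rather than CHaiN's actual code. The model labels mirror the pipeline diagram and are not real Gemini model IDs, and only the 0.3 temperature comes from this README.

```javascript
// Hypothetical per-stage settings. Temperatures that this README does not
// state are simply omitted rather than guessed.
const STAGE_CONFIG = {
  batchAnalysis:       { model: 'flash-lite' },
  crossFileSynthesis:  { model: 'flash' },
  thematicGrouping:    { model: 'flash', temperature: 0.3 },
  groupDataGeneration: { model: 'flash' },
  executiveSummary:    { model: 'flash' },
};

/**
 * Runs batch analysis sequentially, sending one file per request.
 * `callModel` is a hypothetical function (config, prompt) => result that
 * stands in for the real Gemini request.
 */
function runBatchAnalysis(files, callModel) {
  const results = [];
  for (const file of files) {
    // Each request carries only this file's content.
    results.push({
      name: file.name,
      analysis: callModel(STAGE_CONFIG.batchAnalysis, file.text),
    });
  }
  return results;
}
```

Keeping the settings in one object means a model or temperature change for a stage is a one-line edit, and passing `callModel` in makes the loop testable without network access.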

Single-source-of-truth schema with dynamic retrieval

A JSON schema defines the structure for all pipeline outputs. Each stage retrieves only the fields it needs via a filtering function. This reduces the need for defensive programming and limits each call to only the fields relevant to that stage.

The schema is also filtered by input type, since input type determines which fields are relevant. Leaving unused fields in the schema causes the model to generate unpredictable values for them, creating inconsistent downstream handling and cleanup overhead.
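The two filters described above (by stage and by input type) might look roughly like the sketch below. The field names, the `stages` and `inputTypes` annotations, and the `filterSchema` function are all hypothetical. The real schema lives in schema.js and may be organized differently.

```javascript
// Hypothetical input-type identifiers for the 4 input types.
const ALL_INPUT_TYPES = ['findings', 'transcripts', 'visualizations', 'spreadsheet'];

// Hypothetical schema: each field records its type, the stages that use it,
// and the input types for which it is relevant.
const SCHEMA = {
  insight:    { type: 'string', stages: ['batchAnalysis', 'crossFileSynthesis'], inputTypes: ALL_INPUT_TYPES },
  quote:      { type: 'string', stages: ['batchAnalysis'], inputTypes: ['transcripts'] },
  chartTitle: { type: 'string', stages: ['batchAnalysis'], inputTypes: ['visualizations'] },
  theme:      { type: 'string', stages: ['thematicGrouping'], inputTypes: ALL_INPUT_TYPES },
};

/**
 * Returns only the fields that apply to the given stage and input type,
 * mapped to their types, so the model is never asked for unused fields.
 */
function filterSchema(schema, stage, inputType) {
  const filtered = {};
  for (const [field, def] of Object.entries(schema)) {
    if (def.stages.includes(stage) && def.inputTypes.includes(inputType)) {
      filtered[field] = def.type;
    }
  }
  return filtered;
}
```

With this sample schema, `filterSchema(SCHEMA, 'batchAnalysis', 'transcripts')` returns `{ insight: 'string', quote: 'string' }`, leaving out `chartTitle` (wrong input type) and `theme` (wrong stage).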

Human review gate placement

2 review gates are placed where researcher input has the most downstream impact.

The first gate follows project detail extraction. A missing or unclear project objective propagates through every subsequent pipeline stage, making this the highest-leverage intervention point. The second gate follows insight generation, giving the researcher a final opportunity to revise before the pipeline commits to generating exports.

Parameterized linear reduction

A linear reduction formula calculates a target insight count proportional to data volume. The reduction factor of 0.65 controls for saturation as sample size grows: it is strong enough to cut redundant insights but restrained enough to preserve nuance. The floor of 6 ensures the target never drops below a meaningful minimum for a substantive report.

Rather than a fixed number of themes, the pipeline derives a min/max range targeting 2 to 6 insights per group, keeping structure proportional to data volume.
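This README does not spell out the exact formula, so the following is only one plausible reading. It assumes the target is the raw insight count multiplied by 0.65 and clamped to the floor of 6, and that the group range follows from allowing 2 to 6 insights per group. The function names and the choice of raw insight count as the volume measure are assumptions.

```javascript
const REDUCTION_FACTOR = 0.65; // stated in this README
const INSIGHT_FLOOR = 6;       // stated in this README

// Assumption: target = raw count scaled by the factor, rounded,
// and never allowed to fall below the floor.
function targetInsightCount(rawInsightCount) {
  return Math.max(INSIGHT_FLOOR, Math.round(rawInsightCount * REDUCTION_FACTOR));
}

// Assumption: with 2 to 6 insights per group, the fewest groups occur when
// every group holds 6 insights and the most when every group holds 2.
function groupRange(targetCount) {
  return {
    min: Math.ceil(targetCount / 6),
    max: Math.floor(targetCount / 2),
  };
}
```

Under these assumptions, 40 raw insights give a target of 26 and a range of 5 to 13 groups, while 5 raw insights are held at the floor of 6 with a range of 1 to 3 groups.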

Bidirectional serialization for spreadsheet export and re-entry

The data output object is flattened to dot-notation key-value pairs for export as a Google Sheets file. This preserves hierarchy in a human-readable format. If a researcher re-enters the CHaiN pipeline from the spreadsheet, the dot-separated keys are parsed to reconstruct the original nested object.

A researcher can correct values directly in the sheet and re-enter the pipeline without re-running analysis, or use the same content with a different Google Slides template to change the visual design.
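The round trip can be illustrated with a small pair of functions. This is a sketch, not the code in export.js or read_data.js. The names `flatten` and `unflatten` are assumptions, and it treats all-digit key segments as array indices.

```javascript
// Flattens a nested object or array into { 'a.b.0.c': value } pairs.
function flatten(value, prefix = '', out = {}) {
  for (const [key, child] of Object.entries(value)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (child !== null && typeof child === 'object') {
      flatten(child, path, out); // recurse into nested objects and arrays
    } else {
      out[path] = child;         // leaf value: record it under its full path
    }
  }
  return out;
}

// Rebuilds the nested structure from dot-notation keys.
function unflatten(flat) {
  const root = {};
  for (const [path, value] of Object.entries(flat)) {
    const keys = path.split('.');
    let node = root;
    keys.forEach((key, i) => {
      if (i === keys.length - 1) {
        node[key] = value;
        return;
      }
      if (node[key] === undefined) {
        // An all-digit next segment means this container is an array.
        node[key] = /^\d+$/.test(keys[i + 1]) ? [] : {};
      }
      node = node[key];
    });
  }
  return root;
}
```

For example, `flatten({ groups: [{ theme: 'Trust' }] })` yields `{ 'groups.0.theme': 'Trust' }`, and `unflatten` restores the original shape. Two limitations of this sketch: empty objects and arrays produce no rows and are therefore lost, and keys that themselves contain a dot cannot be round-tripped.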

Multi-format export from a single data object

All 3 export formats are generated from the same data object in a single pipeline run, ensuring consistency across formats and paying the AI analysis cost once regardless of how many formats are generated.

The HTML report was scoped to include animated transitions and quizzes, features that differentiate it from a static slide deck report.

SPA architecture and modular codebase

CHaiN is implemented as a Single Page Application with an HTML form interface, reducing dependency on Google's ecosystem and increasing portability. The Bootstrap framework defines the layout, providing a responsive grid that works across device types without ongoing upkeep.

The codebase is organized into a family of Apps Script files. This makes code reusable and simplifies debugging and iteration, especially as CHaiN expands.

Setup

Running CHaiN

Prerequisites

  • Gemini API key
  • Research plan (Google Doc URL)
  • One of the following data inputs:
    • Google Drive folder containing session transcripts, research findings, or data visualizations
    • CHaiN-formatted Google Sheet
  • Slides report template URL (optional — a default template is pre-populated in the interface)

Deploying Your Own Instance

Create 5 Apps Script projects

Each project corresponds to a folder in this repo. Files with a .js extension on GitHub are script files in Apps Script. Files with an .html extension are HTML files. Include appsscript.json for each project. This file controls runtime permissions, and its display in the editor can be enabled under project settings.

| Apps Script Project | Files |
| --- | --- |
| App | main.js, index.html, collect_create.html, create_elements.html, utilities.html, navigate.html, stylesheet.html |
| Gemini | gemini.js, utilities.js |
| HTML Report | main.js, index.html, global.html, create.html, export.html, navigate.html, change_page.html, stylesheet.html |
| Slides Report | create_report.js, images.js, utilities.js |
| SharedUtilities | read_data.js, export.js, schema.js, utilities.js |

Library dependencies

Import each library into the corresponding project via the Apps Script Libraries panel using the script ID of the source project.

| Project | Libraries to import |
| --- | --- |
| App | SharedUtilities, Gemini, SlidesReport |
| Gemini | SharedUtilities |
| HTML Report | SharedUtilities |
| Slides Report | SharedUtilities |
| SharedUtilities | SlidesReport |

Web app deployment

Deploy the App project as a web app, with "Execute as" set to your account and access set to "Anyone".

API key

Store your Gemini API key as a script property in the Gemini project under the key GOOGLE_API_KEY.
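As a sketch of how the stored key might be read at runtime, the helper below takes a properties object and fails loudly if the key is missing. The function name `getGeminiApiKey` is hypothetical. In Apps Script, the argument would be `PropertiesService.getScriptProperties()`, whose `getProperty` method returns `null` for an unset key.

```javascript
/**
 * Reads the Gemini API key from script properties.
 * `properties` is expected to expose getProperty(name), as the object
 * returned by PropertiesService.getScriptProperties() does.
 */
function getGeminiApiKey(properties) {
  const key = properties.getProperty('GOOGLE_API_KEY');
  if (!key) {
    // Fail early with a clear message instead of sending an unauthenticated request.
    throw new Error('GOOGLE_API_KEY is not set in the Gemini project script properties.');
  }
  return key;
}
```

Inside the Gemini project this would be called as `getGeminiApiKey(PropertiesService.getScriptProperties())`. Passing the properties object in keeps the helper easy to exercise outside Apps Script.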
