LequeuISIR/GDNAnnotationPlatform

GDN-CC Annotation Platform

Codebase for the server and user interface of the annotation platform used in the publication: The GDN-CC Dataset: Automatic Corpus Clarification for AI-enhanced Democratic Citizen Consultations.

This repository provides a complete environment to replicate or extend the annotation process, featuring a Next.js frontend for annotators and a Flask backend for handling data and interacting with LLMs.

For any question or help, do not hesitate to contact me by email at lequeu (at) isir.upmc.fr or to open an issue on this repository.


📸 Platform Overview

Main Annotation Interface

The main annotation interface. On the left is the citizen contribution to annotate. Each colored rectangle is an argumentative unit. Within each argumentative unit, the annotator can segment "affirmations" (statements), "arguments" (premises), and solutions. On the right are the clarifications given by the LLM, which can be modified by the annotator.


🏗️ Architecture

The platform is split into two main components:

  • platformServer/: A Flask server handling data persistence, endpoints for LLM models, and summary generation.
  • platformUI/: A Next.js web application serving the user and admin interfaces.

⚙️ Installation & Setup

Prerequisites

  • Node.js and npm
  • Python (managed via uv)

1. Server Setup (platformServer/)

Navigate to the server directory and install dependencies:

cd platformServer
uv sync

Environment Variables: The server needs three environment variables:

  • GROQ_API_KEY: Your Groq API key.
  • OPENAI_API_KEY: Your OpenAI API key.
  • ANNOTATION_DATA_FILE: Path to your target data file.
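A missing variable is easier to diagnose before the server starts than after. The following is a minimal sketch of such a check and is not part of the repository: the helper name `missing_env_vars` is hypothetical, and only the three variable names come from the list above.

```python
import os

# The three variable names listed above; everything else here is illustrative.
REQUIRED_VARS = ("GROQ_API_KEY", "OPENAI_API_KEY", "ANNOTATION_DATA_FILE")


def missing_env_vars(env=None):
    """Return the required variables that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to try without touching your shell configuration.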

2. UI Setup (platformUI/)

Navigate to the frontend directory and install dependencies:

cd ../platformUI
npm install

🚀 Running the Platform

You need to run both the server and the UI concurrently, in separate terminal instances.

Start the Backend Server:

cd platformServer
uv run app.py --port 3002

The server will run on localhost:3002.

Start the Frontend Interface:

cd platformUI
npm run dev

The interface will run on localhost:3000.
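Since the platform only works when both processes are up, it can help to confirm that each port accepts connections. The sketch below is an assumption-laden convenience, not something the repository ships: `port_is_open` is a hypothetical helper, and it only tests that a TCP connection succeeds, not that the application behind the port is healthy.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the instructions above: 3002 (server) and 3000 (UI).
    for name, port in (("server", 3002), ("UI", 3000)):
        state = "up" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} on localhost:{port}: {state}")
```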

Accessing the Admin Page:

The admin interface is intentionally unlinked from the main navigation. You must access it directly by navigating to http://localhost:3000/admin in your browser.

📊 Data Structure & Export

To aggregate and export all completed annotations, run the following command from the platformServer/ directory:

uv run adminPower.py --save-all

This generates the all_annotations.jsonl file.

JSONL Schema

Each line in the exported file follows this structure:

{ 
    "opinion": { 
        "authorName": "str", 
        "len": "int", 
        "opinionId": "int", 
        "text": "str" 
    },
    "results": [ 
        {
            "color": "str", 
            "segments": { 
                "segmentid": {
                    "color": "str", 
                    "start": "int", 
                    "end": "int", 
                    "type": "str", 
                    "hex": "str", 
                    "text": "str" 
                }   
            },
            "LLMtext": "str", 
            "text": "str" 
        } 
    ],
    "llm": "str", 
    "annotator": "str", 
    "time": "float", 
    "date": "str"   
}
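To make the nesting concrete, here is a short sketch that reads the exported file and counts segments by type. It assumes only the field names shown in the schema above; the function names `read_annotations` and `segment_type_counts` are hypothetical and do not exist in the repository.

```python
import json
from collections import Counter
from pathlib import Path


def read_annotations(lines):
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def segment_type_counts(records):
    """Count segments by their `type` field across all records."""
    counts = Counter()
    for record in records:
        # `results` is a list of argumentative units.
        for unit in record["results"]:
            # `segments` maps a segment ID to the segment's fields.
            for segment in unit["segments"].values():
                counts[segment["type"]] += 1
    return counts


if __name__ == "__main__":
    path = Path("all_annotations.jsonl")
    if path.exists():
        with path.open(encoding="utf-8") as handle:
            print(segment_type_counts(read_annotations(handle)))
    else:
        print(f"{path} not found; run the export command first.")
```

Reading line by line keeps memory use flat, which matters little for small exports but costs nothing.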

Data Dictionary

| Field | Type | Description |
| --- | --- | --- |
| opinion.authorName | String | Represents the theme of the opinion, not the actual author. (Fixed in final dataset). |
| opinion.len | Integer | Total character length of the source text. |
| opinion.opinionId | Integer | Unique identifier for the opinion. |
| results.color | String | Index identifier for the Argumentative Unit (AU). |
| segment.type | String | Classification of the segment (solution, claim, or premise). |
| results.LLMtext | String | Raw text output generated by the LLM. |
| results.text | String | Final text validated/edited by the human annotator. |
| llm | String | LLM used for the clarification. |
| annotator | String | Annotator ID. |
| time | Float | Total time spent (in seconds) from loading the opinion to accepting the clarification. |
| date | String | Datetime of the annotation. |
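The fields above are enough for simple descriptive statistics. As a hedged illustration, the sketch below computes how often annotators changed the LLM output and the mean annotation time per annotator. Both function names are hypothetical, the comparison is an exact string match (so a whitespace-only change counts as an edit), and the demo records are made up for illustration rather than taken from the dataset.

```python
from collections import defaultdict


def edited_fraction(records):
    """Fraction of argumentative units whose final text differs from the LLM output."""
    total = edited = 0
    for record in records:
        for unit in record["results"]:
            total += 1
            if unit["text"] != unit["LLMtext"]:
                edited += 1
    return edited / total if total else 0.0


def mean_time_by_annotator(records):
    """Average annotation time in seconds, keyed by annotator ID."""
    times = defaultdict(list)
    for record in records:
        times[record["annotator"]].append(record["time"])
    return {annotator: sum(values) / len(values) for annotator, values in times.items()}


if __name__ == "__main__":
    # Made-up records containing only the fields these helpers read.
    demo = [
        {"annotator": "A", "time": 10.0,
         "results": [{"LLMtext": "draft", "text": "draft"}]},
        {"annotator": "A", "time": 20.0,
         "results": [{"LLMtext": "draft", "text": "revised"}]},
    ]
    print(edited_fraction(demo))
    print(mean_time_by_annotator(demo))
```

Pass a list rather than a generator if you call both helpers on the same records, since each one iterates over its input once.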

📝 Citation

Preprint:

@article{lequeu2026gdn,
  title={The GDN-CC Dataset: Automatic Corpus Clarification for AI-enhanced Democratic Citizen Consultations},
  author={Lequeu, Pierre-Antoine and Labat, L{\'e}o and Cave, Laur{\`e}ne and Lejeune, Ga{\"e}l and Yvon, Fran{\c{c}}ois and Piwowarski, Benjamin},
  journal={arXiv preprint arXiv:2601.14944},
  year={2026}
}
