Codebase for the server and user interface of the annotation platform used in the publication: The GDN-CC Dataset: Automatic Corpus Clarification for AI-enhanced Democratic Citizen Consultations.
This repository provides a complete environment to replicate or extend the annotation process, featuring a Next.js frontend for annotators and a Flask backend for handling data and interacting with LLMs.
Do not hesitate to contact me by email at lequeu (at) isir.upmc.fr, or to open an issue on this repository, for any questions or help.
The main annotation interface. On the left is the citizen contribution to annotate. Each colored rectangle is an argumentative unit. Within each argumentative unit, annotators can segment "affirmations" (statements), "arguments" (premises), and solutions. On the right are the clarifications produced by the LLM, which can be modified by the annotator.
The platform is split into two main components:
- `platformServer/`: A Flask server handling data persistence, endpoints for LLM models, and summary generation.
- `platformUI/`: A Next.js web application serving the user and admin interfaces.
- Node.js and npm
- Python (managed via `uv`)
Navigate to the server directory and install dependencies:
```
cd platformServer
uv sync
```

Environment Variables: You need three environment variables:

- `GROQ_API_KEY`: Your Groq API key.
- `OPENAI_API_KEY`: Your OpenAI API key.
- `ANNOTATION_DATA_FILE`: Path to your target data file.
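A minimal shell sketch for setting these variables before starting the server (all values below are placeholders, and the data-file path is an assumption):

```shell
# Placeholder values -- substitute your own API keys and data path.
export GROQ_API_KEY="gsk_your_groq_key"
export OPENAI_API_KEY="sk-your_openai_key"
export ANNOTATION_DATA_FILE="/path/to/annotation_data.json"
```

These can also be placed in your shell profile or a local `.env` file if you prefer not to export them per session.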
Navigate to the frontend directory and install dependencies:
```
cd ../platformUI
npm install
```

You need to run both the server and the UI concurrently, in separate terminal instances.

Start the Backend Server:

```
cd platformServer
uv run app.py --port 3002
```

The server will run on `localhost:3002`.
Start the Frontend Interface:
```
cd platformUI
npm run dev
```

The interface will run on `localhost:3000`.
The admin interface is intentionally unlinked from the main navigation. You must access it directly by navigating to http://localhost:3000/admin in your browser.
To aggregate and export all completed annotations, run the following command from the platformServer/ directory:
```
uv run adminPower.py --save-all
```

This generates the `all_annotations.jsonl` file.

JSONL Schema

Each line in the exported file follows this structure:

```json
{
  "opinion": {
    "authorName": "str",
    "len": "int",
    "opinionId": "int",
    "text": "str"
  },
  "results": [
    {
      "color": "str",
      "segments": {
        "segmentid": {
          "color": "str",
          "start": "int",
          "end": "int",
          "type": "str",
          "hex": "str",
          "text": "str"
        }
      },
      "LLMtext": "str",
      "text": "str"
    }
  ],
  "llm": "str",
  "annotator": "str",
  "time": "float",
  "date": "str"
}
```

| Field | Type | Description |
|---|---|---|
| `opinion.authorName` | String | Represents the theme of the opinion, not the actual author (fixed in the final dataset). |
| `opinion.len` | Integer | Total character length of the source text. |
| `opinion.opinionId` | Integer | Unique identifier for the opinion. |
| `results.color` | String | Index identifier for the Argumentative Unit (AU). |
| `segment.type` | String | Classification of the segment (`solution`, `claim`, or `premise`). |
| `results.LLMtext` | String | Raw text output generated by the LLM. |
| `results.text` | String | Final text validated/edited by the human annotator. |
| `llm` | String | LLM used for the clarification. |
| `annotator` | String | Annotator ID. |
| `time` | Float | Total time spent (in seconds) from loading the opinion to accepting the clarification. |
| `date` | String | Datetime of the annotation. |
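As a quick sanity check after export, the JSONL file can be read one record per line with the standard library. The sketch below assumes only the schema above; the `load_annotations` helper name and the record values are illustrative, not part of the platform:

```python
import json

def load_annotations(path):
    """Yield one annotation record per non-empty line of the exported JSONL file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)

# Minimal illustration on a single synthetic record (placeholder values):
record = json.loads(
    '{"opinion": {"authorName": "Theme A", "len": 42, "opinionId": 1, '
    '"text": "..."}, "results": [], "llm": "gpt-4o", '
    '"annotator": "A1", "time": 12.5, "date": "2026-01-01"}'
)
print(record["opinion"]["opinionId"], record["llm"])  # → 1 gpt-4o
```

In practice you would call `load_annotations("all_annotations.jsonl")` and iterate over the records, e.g. to count annotations per annotator.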
Preprint:

```bibtex
@article{lequeu2026gdn,
  title={The GDN-CC Dataset: Automatic Corpus Clarification for AI-enhanced Democratic Citizen Consultations},
  author={Lequeu, Pierre-Antoine and Labat, L{\'e}o and Cave, Laur{\`e}ne and Lejeune, Ga{\"e}l and Yvon, Fran{\c{c}}ois and Piwowarski, Benjamin},
  journal={arXiv preprint arXiv:2601.14944},
  year={2026}
}
```