📊 Create DATA_MODEL.md - Political Data Architecture #195

@pethers

📋 Issue Type

Documentation / Architecture

🎯 Objective

Create a comprehensive DATA_MODEL.md that documents all data structures, entities, relationships, and data flows for riksdagsmonitor's political intelligence platform, including its CIA integration and multi-language support.

📊 Current State

  • ❌ DATA_MODEL.md does not exist
  • ⚠️ Data architecture undocumented
  • ⚠️ CIA product schemas not centrally documented
  • ⚠️ Political entities and relationships implicit
  • ⚠️ Multi-language data structures scattered

Existing Related Docs:

  • ✅ ARCHITECTURE.md (C4 models) - system architecture
  • ✅ README.md - lists 19 CIA products
  • cia-data/ directory structure exists
  • schemas/ directory with JSON validation

🚀 Desired State

Comprehensive DATA_MODEL.md including:

1. Political Entities

  • Politicians (2,494 historical, 349 current MPs)
  • Parties (8 Swedish parties)
  • Committees (15 parliamentary committees)
  • Documents (109,000+ parliamentary docs)
  • Votes (3.5M+ voting records)
  • Ministries (Government cabinet structure)

2. CIA Data Products (19)

Intelligence Dashboards:

  • Overview Dashboard
  • Party Performance
  • Government Cabinet
  • Election Cycle Analysis

Top 10 Rankings:

  • Most Influential MPs
  • Most Productive MPs
  • Most Controversial MPs
  • Most Absent MPs
  • Party Rebels
  • Coalition Brokers
  • Rising Stars
  • Electoral Risk
  • Ethics Concerns
  • Media Presence

Advanced Analytics:

  • Committee Network Analysis
  • Politician Career Analysis
  • Party Longitudinal Analysis

3. Data Relationships

  • Politician → Party (membership)
  • Politician → Committee (assignments)
  • Politician → Vote (voting records)
  • Politician → Document (authorship)
  • Party → Coalition (alignment)
  • Committee → Document (processing)

4. Data Sources

  • Swedish Riksdag API (data.riksdagen.se)
  • Swedish Election Authority (val.se)
  • Swedish National Financial Management Authority (esv.se)
  • World Bank Open Data
  • CIA Platform (www.hack23.com/cia)

5. Data Schemas

  • JSON Schema definitions (schemas/)
  • CSV data structures (cia-data/)
  • Production statistics schema
  • Multi-language metadata

6. Data Pipeline

  • Automated daily updates (03:00 CET)
  • Schema validation workflows
  • Caching strategies (1-24 hour freshness)
  • LocalStorage persistence

7. Multi-Language Data

  • 14-language support (EN, SV, DA, NO, FI, DE, FR, ES, NL, AR, HE, JA, KO, ZH)
  • Translation metadata
  • RTL support (Arabic, Hebrew)
  • Language file structure

📊 CIA Data Integration Context

CIA Product(s): All 19 visualization products
Data Sources:

  • cia-data/production-stats.json - Live statistics
  • cia-data/politician/*.csv - MP data
  • cia-data/seasonal/*.csv - Temporal patterns
  • cia-data/pre-election/*.csv - Election monitoring

Schema References:

Methodology:

  • OSINT collection per DATA_ANALYSIS_INTOP_OSINT.md
  • Risk scoring (45 rules)
  • Network analysis (influence mapping)
  • Transparency assessment

🌐 Translation & Content Alignment

Translation Guide(s):

  • Swedish-Translation-Guide.md (parliamentary terminology)
  • Finnish-Translation-Guide.md (Nordic consistency)
  • Korean-Translation-Guide.md (CJK handling)
  • Spanish-Translation-Guide.md (global Spanish)

Multi-Language Scope: All 14 languages

Implementation Notes:

  • Entity names translated (committee names, document types)
  • Abbreviations preserved (S, M, SD, V, MP, C, L, KD)
  • Document IDs preserved (Prop. 2025/26:100)
  • Date/number formatting per locale

🔧 Implementation Approach

1. Entity-Relationship Diagrams

Create Mermaid ERD diagrams:

erDiagram
    POLITICIAN ||--o{ VOTE : casts
    POLITICIAN }o--|| PARTY : belongs_to
    POLITICIAN }o--o{ COMMITTEE : assigned_to
    POLITICIAN ||--o{ DOCUMENT : authors
    PARTY ||--o{ COALITION : forms
    COMMITTEE ||--o{ DOCUMENT : processes

2. Data Dictionary

For each entity:

  • Name - Entity identifier
  • Attributes - Fields and data types
  • Keys - Primary/foreign keys
  • Relationships - Cardinality (1:1, 1:N, N:M)
  • Source - Data origin (Riksdag API, CIA platform)
  • Update Frequency - Daily, weekly, on-demand
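
As a rough illustration of the data dictionary fields listed above, one entry could be captured in a small structure like the following. This is only a sketch: the class names (`Attribute`, `Relationship`, `DictionaryEntry`), the attribute names (`person_id`, `party_id`, `full_name`), and the "daily" update frequency for politicians are assumptions made for the example, not taken from the existing schemas.

```python
from dataclasses import dataclass, field
from typing import List

# Cardinalities named in the data dictionary description.
ALLOWED_CARDINALITIES = {"1:1", "1:N", "N:M"}


@dataclass(frozen=True)
class Attribute:
    name: str
    data_type: str
    key: str = ""  # "PK", "FK", or "" for a plain attribute


@dataclass(frozen=True)
class Relationship:
    target: str
    cardinality: str

    def __post_init__(self) -> None:
        # Reject anything outside the three documented cardinalities.
        if self.cardinality not in ALLOWED_CARDINALITIES:
            raise ValueError(f"unknown cardinality: {self.cardinality}")


@dataclass
class DictionaryEntry:
    name: str
    source: str
    update_frequency: str
    attributes: List[Attribute] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)

    def primary_keys(self) -> List[str]:
        return [a.name for a in self.attributes if a.key == "PK"]

    def to_markdown_row(self) -> str:
        # One table row: name, keys, relationships, source, update frequency.
        keys = ", ".join(f"{a.name} ({a.key})" for a in self.attributes if a.key)
        rels = ", ".join(f"{r.target} {r.cardinality}" for r in self.relationships)
        return f"| {self.name} | {keys} | {rels} | {self.source} | {self.update_frequency} |"


# Hypothetical entry; field names are illustrative only.
POLITICIAN = DictionaryEntry(
    name="Politician",
    source="Riksdag API",
    update_frequency="daily",
    attributes=[
        Attribute("person_id", "string", "PK"),
        Attribute("party_id", "string", "FK"),
        Attribute("full_name", "string"),
    ],
    relationships=[Relationship("Vote", "1:N"), Relationship("Committee", "N:M")],
)
```

Keeping entries in a structure like this would let the data dictionary tables in DATA_MODEL.md be generated rather than hand-maintained, if that turns out to be useful.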

3. Schema Documentation

  • JSON Schema references
  • CSV column definitions
  • Data validation rules
  • Type safety contracts
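
To make the CSV column definitions and validation rules concrete, here is a minimal sketch of a column-level check using only the standard library. The column names and types in `COLUMNS` are hypothetical; the real definitions would come from the files under cia-data/ and schemas/.

```python
import csv
import io
from typing import Callable, Dict, List

# Hypothetical column definitions: column name -> parser for its type.
COLUMNS: Dict[str, Callable[[str], object]] = {
    "person_id": str,
    "party": str,
    "votes_cast": int,
    "absence_rate": float,
}


def validate_csv(text: str) -> List[str]:
    """Return a list of human-readable validation errors (empty if valid)."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    errors: List[str] = []
    # The header is line 1, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        for column, parse in COLUMNS.items():
            try:
                parse(row[column])
            except (TypeError, ValueError):
                errors.append(
                    f"line {line_no}: {column}={row[column]!r} is not {parse.__name__}"
                )
    return errors
```

Called on a file whose second data row has `many` in the `votes_cast` column, this reports the offending line and column instead of failing on the first problem.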

4. Data Flow Diagrams

  • Collection → Validation → Storage → Presentation
  • CIA export → riksdagsmonitor import pipeline
  • Daily statistics update workflow
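
The Collection → Validation → Storage → Presentation flow can be sketched as four small functions chained together. Everything here is a stand-in: the record shape (`party`, `members`), the in-memory dictionary used as storage, and the validation rule are assumptions chosen to keep the example self-contained, and the sample values are dummy data.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def validate(records: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Split records into (valid, rejected) using a simple illustrative rule."""
    valid: List[Record] = []
    rejected: List[Record] = []
    for record in records:
        party = record.get("party")
        members = record.get("members")
        ok = (
            isinstance(party, str)
            and party != ""
            and isinstance(members, int)
            and members >= 0
        )
        (valid if ok else rejected).append(record)
    return valid, rejected


def store(records: List[Record], storage: Dict[str, Record]) -> None:
    """Persist valid records, keyed by party abbreviation."""
    for record in records:
        storage[str(record["party"])] = record


def present(storage: Dict[str, Record]) -> List[str]:
    """Render stored records as display lines, sorted by party."""
    return [f"{party}: {storage[party]['members']}" for party in sorted(storage)]


def run_pipeline(collected: List[Record]) -> Tuple[List[str], int]:
    """Run validation, storage, and presentation on already collected records."""
    storage: Dict[str, Record] = {}
    valid, rejected = validate(collected)
    store(valid, storage)
    return present(storage), len(rejected)


# Dummy input standing in for the collection step (e.g. a CIA export).
SAMPLE = [
    {"party": "S", "members": 3},
    {"party": "", "members": 2},    # rejected: empty party
    {"party": "M", "members": 2},
    {"party": "KD", "members": -1},  # rejected: negative count
]
```

Returning the rejected count alongside the presentation output gives the daily update workflow something to log or alert on when an export contains bad rows.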

5. Multi-Language Data Architecture

  • Language file structure (index_{lang}.html)
  • Translation metadata storage
  • RTL layout data requirements
  • Hreflang SEO structure
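
A small sketch of how the language file structure, RTL flag, and hreflang tags relate to each other. The language codes and the `index_{lang}.html` pattern come from this issue; the base URL is a placeholder, and applying the pattern uniformly to every language (including the default one) is an assumption.

```python
from typing import List

# The 14 languages listed in this issue, as lowercase codes.
LANGUAGES = ["en", "sv", "da", "no", "fi", "de", "fr", "es", "nl", "ar", "he", "ja", "ko", "zh"]

# Arabic and Hebrew are the right-to-left languages named in the issue.
RTL_LANGUAGES = {"ar", "he"}


def page_filename(lang: str) -> str:
    """File name for a language page, following the index_{lang}.html pattern."""
    if lang not in LANGUAGES:
        raise ValueError(f"unsupported language: {lang}")
    return f"index_{lang}.html"


def text_direction(lang: str) -> str:
    """Value for the HTML dir attribute."""
    return "rtl" if lang in RTL_LANGUAGES else "ltr"


def hreflang_links(base_url: str) -> List[str]:
    """One alternate link tag per language; base_url is a placeholder."""
    base = base_url.rstrip("/")
    return [
        f'<link rel="alternate" hreflang="{lang}" href="{base}/{page_filename(lang)}">'
        for lang in LANGUAGES
    ]
```

For example, `hreflang_links("https://example.org")` yields 14 tags, one per language page, which could be emitted into the head of every language file.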

6. Performance Considerations

  • Caching strategies (LocalStorage, GitHub CDN)
  • Data freshness checks (1-24 hours)
  • Lazy loading patterns
  • Code splitting by dashboard
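
The freshness check behind the 1-24 hour caching window can be illustrated with a small time-based cache. This is a sketch of the logic only: an in-memory dictionary stands in for LocalStorage, and the class name `FreshnessCache` and its injectable clock are assumptions made so the behaviour can be tested without waiting.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, Optional, Tuple

MIN_TTL = timedelta(hours=1)
MAX_TTL = timedelta(hours=24)


def utc_now() -> datetime:
    return datetime.now(timezone.utc)


class FreshnessCache:
    """Returns cached values only while they are younger than the TTL."""

    def __init__(self, ttl: timedelta, clock: Callable[[], datetime] = utc_now) -> None:
        # Keep the TTL inside the 1-24 hour window described above.
        if not MIN_TTL <= ttl <= MAX_TTL:
            raise ValueError("ttl must be between 1 and 24 hours")
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[datetime, object]] = {}

    def put(self, key: str, value: object) -> None:
        self._entries[key] = (self._clock(), value)

    def get(self, key: str) -> Optional[object]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Stale: drop the entry so the caller refetches.
            del self._entries[key]
            return None
        return value
```

A caller would treat a `None` result as a signal to refetch the data (for instance production-stats.json) and `put` the new value back.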

📚 References

Repository:

CIA References:

ISMS Policies:

✅ Acceptance Criteria

  • DATA_MODEL.md created with comprehensive data architecture
  • Entity-relationship diagrams (Mermaid ERD)
  • Data dictionary for all entities (Politicians, Parties, Committees, Documents, Votes, Ministries)
  • 19 CIA product schemas documented
  • Data source mapping (Riksdag API, CIA platform)
  • Data pipeline documentation (collection → validation → storage)
  • Multi-language data architecture (14 languages)
  • Schema references (JSON Schema, CSV definitions)
  • Performance and caching strategy
  • Document control metadata (version, owner, last updated)
  • C4 data model integration with ARCHITECTURE.md
  • Reviewed by documentation-architect agent

🤖 Recommended Agent

documentation-architect - Specialized in C4 models, ERD diagrams, Mermaid visualization, and comprehensive technical documentation

🏷️ Labels

type:documentation, priority:high, component:architecture, component:data-integration, component:cia-data, isms, agent:documentation-architect
