Feature: Implementation-specific tags (techniques, dependencies, patterns) #2434

@MarkusNeusinger

Summary

Add a new tagging system for implementation-level tags that capture library-specific techniques, dependencies, and patterns. These are distinct from the existing spec-level tags, which describe what is visualized rather than how it is built.

Current State

Spec-level tags (in specification.yaml) describe the plot itself:

  • plot_type: scatter, bar, line, heatmap...
  • data_type: numeric, categorical, timeseries...
  • domain: statistics, finance, science...
  • features: basic, stacked, animated, 3d...

These apply uniformly to ALL implementations of a spec.

Proposed Addition

Implementation-level tags (in metadata/{library}.yaml) describe how an implementation works. Unlike spec-level tags, they can differ between libraries for the same spec.

Suggested Tag Dimensions

Dimension | Purpose | Examples
--- | --- | ---
dependencies | External libraries used beyond the main plotting lib | sklearn, scipy, statsmodels, numpy-advanced, pandas-groupby
techniques | Specific coding/visualization techniques | manual-ticks, custom-colormap, twin-axes, inset-axes, manual-kde, subplot-mosaic
patterns | Code structure patterns | animation-loop, callback-functions, streaming-data, lazy-loading
data_ops | Data manipulation approaches | pivot-table, groupby-agg, rolling-window, binning, normalization
styling | Visual styling approaches | theme-customization, annotation-heavy, minimal-chrome, publication-ready

Example Use Cases

  1. Find all plots using sklearn:

    ?impl_deps=sklearn
    
  2. Find matplotlib implementations with manual tick configuration:

    ?lib=matplotlib&impl_tech=manual-ticks
    
  3. Find implementations using custom colormaps:

    ?impl_tech=custom-colormap
    
  4. Find seaborn plots with twin axes:

    ?lib=seaborn&impl_tech=twin-axes
    

Proposed Schema

metadata/{library}.yaml Addition

library: matplotlib
specification_id: scatter-regression
# ... existing fields ...

# NEW: Implementation-specific tags
impl_tags:
  dependencies:
    - sklearn
    - scipy
  techniques:
    - manual-ticks
    - twin-axes
  data_ops:
    - linear-regression
  styling:
    - publication-ready
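
To make the shape of `impl_tags` concrete, here is a minimal validation sketch. It assumes the YAML has already been loaded into a plain dict and that the five dimensions from the table form a fixed set; the issue leaves that choice open, so `ALLOWED_DIMENSIONS` and `validate_impl_tags` are hypothetical names, not existing project code.

```python
# Hypothetical set of dimensions; "fixed vs free-form" is still undecided.
ALLOWED_DIMENSIONS = {"dependencies", "techniques", "patterns", "data_ops", "styling"}


def validate_impl_tags(impl_tags):
    """Return a list of problems found in an impl_tags mapping (empty if valid)."""
    if not isinstance(impl_tags, dict):
        return ["impl_tags must be a mapping"]
    problems = []
    for dimension, tags in impl_tags.items():
        if dimension not in ALLOWED_DIMENSIONS:
            problems.append(f"unknown dimension: {dimension}")
            continue
        if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
            problems.append(f"{dimension} must be a list of strings")
            continue
        if len(set(tags)) != len(tags):
            problems.append(f"{dimension} contains duplicate tags")
    return problems


# The example block from the proposed schema, as a loaded dict.
example = {
    "dependencies": ["sklearn", "scipy"],
    "techniques": ["manual-ticks", "twin-axes"],
    "data_ops": ["linear-regression"],
    "styling": ["publication-ready"],
}
print(validate_impl_tags(example))
```

A check like this could run in CI or in the review workflow, so malformed tags are caught before they reach the database.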

Database Schema Addition

class Impl(Base):
    # ... existing fields ...
    impl_tags: Mapped[Optional[dict]]  # JSONB: {dependencies, techniques, patterns, ...}

A GIN index on impl_tags would keep filtering on these tags efficient.
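
As a rough sketch of how a GIN-indexed JSONB column could be queried, the snippet below builds the JSON document for a PostgreSQL `@>` containment test. The table name `impls`, the index name, and the `%s` placeholder style are assumptions for illustration; only the `impl_tags` column comes from the proposal.

```python
import json

# Assumed names: table "impls" and index "ix_impls_impl_tags" are illustrative.
CREATE_INDEX_SQL = "CREATE INDEX ix_impls_impl_tags ON impls USING GIN (impl_tags)"
# "%s" is a driver-style placeholder (assumption); the payload is passed as a parameter.
FILTER_SQL = "SELECT * FROM impls WHERE impl_tags @> %s::jsonb"


def containment_payload(filters):
    """Build the JSON document for a JSONB `@>` containment test.

    `filters` maps a dimension to the tags that must all be present,
    e.g. {"techniques": ["twin-axes"]}. Empty dimensions are dropped.
    """
    cleaned = {dim: sorted(set(tags)) for dim, tags in filters.items() if tags}
    return json.dumps(cleaned, sort_keys=True)


print(containment_payload({"dependencies": ["sklearn"], "techniques": ["twin-axes"]}))
```

Because JSONB containment treats arrays as sets, a row tagged with both `manual-ticks` and `twin-axes` would match a payload that asks only for `twin-axes`.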

API Extension

New filter parameters:

  • impl_deps - Filter by dependencies
  • impl_tech - Filter by techniques
  • impl_pattern - Filter by patterns
  • impl_data - Filter by data operations
  • impl_style - Filter by styling approaches
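
The parameters above could be translated into per-dimension filters roughly as follows. This is a sketch under stated assumptions: the mapping from parameter name to `impl_tags` key, the comma-separated "all of these tags" semantics, and the helper names `parse_impl_filters` and `matches` are illustrative and not part of the existing API.

```python
from urllib.parse import parse_qs

# Parameter names come from the proposal; the mapping to impl_tags keys is assumed.
PARAM_TO_DIMENSION = {
    "impl_deps": "dependencies",
    "impl_tech": "techniques",
    "impl_pattern": "patterns",
    "impl_data": "data_ops",
    "impl_style": "styling",
}


def parse_impl_filters(query_string):
    """Translate a query string into {dimension: [tags]} filters."""
    params = parse_qs(query_string.lstrip("?"))
    filters = {}
    for param, dimension in PARAM_TO_DIMENSION.items():
        # Assumption: comma-separated values mean "all of these tags".
        tags = [t for value in params.get(param, []) for t in value.split(",") if t]
        if tags:
            filters[dimension] = tags
    return filters


def matches(impl_tags, filters):
    """Return True if the implementation carries every requested tag."""
    return all(
        set(tags) <= set((impl_tags or {}).get(dimension, []))
        for dimension, tags in filters.items()
    )


filters = parse_impl_filters("?lib=matplotlib&impl_tech=manual-ticks")
print(filters, matches({"techniques": ["manual-ticks", "twin-axes"]}, filters))
```

Parameters that are not implementation filters, such as `lib`, are ignored here and would be handled by the existing filter logic.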

Benefits

  1. Discoverability: Users can find implementations using specific techniques they want to learn
  2. Learning paths: "Show me all examples of twin-axes" across all libraries
  3. Dependency awareness: Know which plots require sklearn before trying to run them
  4. Cross-library comparison: "How do different libraries handle manual tick setting?"
  5. Code quality insights: Find "publication-ready" styled implementations

Implementation Considerations

Tag Assignment

  • Could be auto-detected by analyzing the implementation code
  • Or assigned by AI during code review (impl-review.yml)
  • Or a combination: AI suggests, human approves
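
Auto-detection is most tractable for the dependencies dimension, since imports can be read straight from the syntax tree. The sketch below uses Python's standard `ast` module; the `KNOWN_DEPENDENCIES` allow-list and the function name are hypothetical, and technique tags such as twin-axes would need separate, more heuristic rules.

```python
import ast

# Hypothetical allow-list mapping top-level import names to dependency tags.
KNOWN_DEPENDENCIES = {
    "sklearn": "sklearn",
    "scipy": "scipy",
    "statsmodels": "statsmodels",
}


def detect_dependencies(source):
    """Return sorted dependency tags for modules imported in `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from x.y import z" only; relative imports are skipped.
            modules = [node.module]
        else:
            continue
        for module in modules:
            top_level = module.split(".")[0]
            if top_level in KNOWN_DEPENDENCIES:
                found.add(KNOWN_DEPENDENCIES[top_level])
    return sorted(found)


code = (
    "import matplotlib.pyplot as plt\n"
    "from sklearn.linear_model import LinearRegression\n"
    "import scipy.stats as st\n"
)
print(detect_dependencies(code))
```

Output from a detector like this could seed the AI-suggested tags, with a human approving the final list.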

Migration

  • Existing implementations would need tagging (could be done in bulk via workflow)
  • New implementations get tagged during generation/review

Questions to Decide

  1. Auto-detection vs AI-assigned vs manual?
  2. Fixed vocabulary or free-form tags?
  3. Which dimensions are most valuable to start with?
  4. Frontend UI: separate filter section or integrated with existing?

Related

  • Current tagging docs: docs/concepts/tagging-system.md
  • Spec tags stored in: plots/{spec-id}/specification.yaml
  • Database models: core/database/models.py

Labels: enhancement
