feat: implementation-level tags for code analysis (#2434) #3179

Merged: MarkusNeusinger merged 9 commits into main from feature/impl-tags on Jan 7, 2026

@MarkusNeusinger
Owner

Summary

  • Add implementation-level tags (impl_tags) to describe HOW code implements a plot
  • Backfill all 1,476 metadata files with impl_tags
  • Add 5 new filter categories in the UI: uses, technique, pattern, dataprep, style
  • Short URL keys for filters: dep, tech, pat, prep, style
  • Add tooltips for all 11 filter categories
  • Track impl_tags in Plausible analytics
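
The short URL keys above line up with the five impl-level tag dimensions. As a minimal sketch of how a query string might be turned into filter groups: the helper name `parse_impl_filters`, the comma-separated value format, and the key-to-dimension pairing (inferred from the order in which this PR lists them) are all assumptions, not the PR's confirmed implementation.

```python
from urllib.parse import parse_qs

# Short URL key -> impl_tags dimension. The pairing is inferred from the order
# in which the PR lists keys and dimensions; treat it as an assumption.
IMPL_URL_KEYS = {
    "dep": "dependencies",
    "tech": "techniques",
    "pat": "patterns",
    "prep": "dataprep",
    "style": "styling",
}


def parse_impl_filters(query: str) -> list[dict]:
    """Turn a URL query string into impl-level filter groups (hypothetical helper)."""
    groups = []
    for key, raw_values in parse_qs(query).items():
        if key not in IMPL_URL_KEYS:
            continue  # spec-level keys such as lib/plot/data are ignored in this sketch
        for raw in raw_values:
            # Assumption: several values for one key are comma-separated.
            values = [v for v in raw.split(",") if v]
            if values:
                groups.append({"category": key, "values": values})
    return groups


groups = parse_impl_filters("lib=bokeh&dep=scipy,sklearn&style=minimal-chrome")
print(groups)
```

Here `lib=bokeh` is skipped because it is a spec-level key, while `dep` and `style` each produce one filter group.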

Changes

Backend

  • New impl_tags structure in metadata YAML files
  • API returns impl-level filter counts alongside spec-level
  • Updated FilterCountsResponse schema

Frontend

  • 11 filter categories in dropdown (6 spec-level + 5 impl-level)
  • Tooltips describing each filter category
  • Plausible tracking for all filter types

Data

  • 1,476 metadata files updated with impl_tags
  • Tags: dependencies, techniques, patterns, dataprep, styling
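
To make the five dimensions concrete, here is a rough sketch of what one parsed `impl_tags` block could look like, together with a small structural check. The tag values are ones mentioned elsewhere in this PR, but the combination is made up for illustration; `validate_impl_tags` is a hypothetical helper, and the rule that all five keys must be present is an assumption.

```python
IMPL_TAG_DIMENSIONS = ("dependencies", "techniques", "patterns", "dataprep", "styling")

# Illustrative metadata as it might look after loading a YAML file.
# The specific combination of tags is invented for this example.
example_metadata = {
    "impl_tags": {
        "dependencies": ["scipy"],
        "techniques": ["annotations", "html-export"],
        "patterns": ["data-generation"],
        "dataprep": ["kde"],
        "styling": ["alpha-blending", "minimal-chrome"],
    }
}


def validate_impl_tags(metadata: dict) -> list[str]:
    """Return a list of structural problems in metadata["impl_tags"] (empty if none)."""
    impl_tags = metadata.get("impl_tags")
    if not isinstance(impl_tags, dict):
        return ["impl_tags is missing or not a mapping"]
    problems = []
    for dim, tags in impl_tags.items():
        if dim not in IMPL_TAG_DIMENSIONS:
            problems.append(f"{dim}: unknown dimension")
        elif not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
            problems.append(f"{dim}: expected a list of strings")
    # Assumption: every file carries all five keys, possibly with empty lists.
    for dim in IMPL_TAG_DIMENSIONS:
        if dim not in impl_tags:
            problems.append(f"{dim}: missing")
    return problems


print(validate_impl_tags(example_metadata))
```

A check along these lines could run over the backfilled files before syncing them to the database.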

Test plan

  • Unit tests pass (874 tests)
  • Ruff check and format pass
  • Manual testing: all 11 categories appear in dropdown
  • Tooltips display correctly

Closes #2434

🤖 Generated with Claude Code

MarkusNeusinger and others added 5 commits January 7, 2026 12:41
- Introduced impl_tags to describe implementation details
- Updated workflows and scripts to handle new impl_tags structure
- Enhanced filtering and counting mechanisms for impl-level tags

Added impl_tags section with 5 dimensions:
- dependencies: 14 unique (selenium, scipy, sklearn most common)
- techniques: 24 unique (html-export, annotations, layer-composition)
- patterns: 10 unique (data-generation in 93% of files)
- dataprep: 9 unique (kde, normalization, regression)
- styling: 6 unique (alpha-blending, grid-styling, minimal-chrome)

Cleaned up styling tags:
- Removed 'publication-ready' (too vague, not meaningful)
- Fixed 'custom-colormap' to only tag actual cmap= usage
- Added 'minimal-chrome' where axis('off') is used

Updated tagging rules in prompts/impl-tags-generator.md

Issue #2434

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
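
The two styling rules in the commit above ('custom-colormap' only for actual `cmap=` usage, 'minimal-chrome' where `axis('off')` is used) can be illustrated with a small pattern check. This is only a sketch: per the commit message, the real rules live in prompts/impl-tags-generator.md, and the regex table and the `suggest_styling_tags` helper below are hypothetical stand-ins.

```python
import re

# Hypothetical rule table: styling tag -> pattern that must appear in the plot code.
STYLING_RULES = {
    "custom-colormap": re.compile(r"\bcmap\s*="),
    "minimal-chrome": re.compile(r"\baxis\(\s*['\"]off['\"]\s*\)"),
}


def suggest_styling_tags(source: str) -> list[str]:
    """Return the styling tags whose pattern occurs in the given source code."""
    return sorted(tag for tag, pattern in STYLING_RULES.items() if pattern.search(source))


code = '''
fig, ax = plt.subplots()
ax.imshow(grid, cmap="viridis")
ax.axis("off")
'''
print(suggest_styling_tags(code))  # ['custom-colormap', 'minimal-chrome']
```

A plain text match like this would also fire on a variable assignment such as `cmap = ...`, so it is a coarse approximation of the rule rather than a faithful implementation.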

Added _create_direct_engine_sync() function that uses pg8000 driver
for sync database operations. This allows running the sync script
locally with DATABASE_URL instead of only in Cloud SQL environments.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
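
As a rough illustration of the idea in the commit above, the sketch below rewrites a DATABASE_URL so that SQLAlchemy would select the synchronous pg8000 driver. The helper name `to_pg8000_url` is hypothetical, and the assumption that the incoming URL may name an async driver such as asyncpg is not confirmed by this PR; the actual `_create_direct_engine_sync()` body is not shown here.

```python
def to_pg8000_url(database_url: str) -> str:
    """Rewrite a PostgreSQL URL to use the sync pg8000 driver (hypothetical helper)."""
    scheme, sep, rest = database_url.partition("://")
    if not sep:
        raise ValueError("not a URL: missing '://'")
    # Drop any driver suffix, e.g. "postgresql+asyncpg" -> "postgresql".
    dialect = scheme.split("+", 1)[0]
    if dialect == "postgres":  # common legacy alias
        dialect = "postgresql"
    if dialect != "postgresql":
        raise ValueError(f"unsupported dialect: {dialect}")
    return f"postgresql+pg8000://{rest}"


url = to_pg8000_url("postgresql+asyncpg://user:secret@localhost:5432/plots")
print(url)  # postgresql+pg8000://user:secret@localhost:5432/plots

# With SQLAlchemy and pg8000 installed, the URL could then be used like this:
# engine = sqlalchemy.create_engine(url)
```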

- Update _image_matches_groups tests to include impl_lookup parameter
- Update _calculate_contextual_counts test to include impl_lookup
- Update _calculate_or_counts tests to include impl_lookup parameter
- Update init_db_sync test to mock _create_direct_engine_sync
- Update app version/description assertions to match current values

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add impl_dep, impl_tech, impl_pat, impl_prep, impl_style
to the tracked filter categories for Plausible analytics.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 7, 2026 13:49
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces implementation-level tags to describe HOW code implements plots, complementing existing spec-level tags that describe WHAT is visualized. The system adds 5 new filter dimensions (dependencies, techniques, patterns, dataprep, styling) with corresponding UI filters and analytics tracking.

Key Changes:

  • New two-level tagging architecture separating specification intent from implementation details
  • Backend support for impl_tags in database schema, API responses, and filtering logic
  • Frontend UI with 11 total filter categories (6 spec-level + 5 impl-level) including tooltips
  • Backfilled 1,476 metadata files with implementation tags

Reviewed changes

Copilot reviewed 300 out of 1496 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| docs/reference/tagging-system.md | Documents two-level tagging system with naming conventions and examples |
| core/database/models.py | Adds impl_tags JSONB column to Impl model |
| core/database/connection.py | Adds sync engine creation for local development |
| automation/scripts/sync_to_postgres.py | Syncs impl_tags from YAML to database |
| alembic/versions/a2f4b8c91d23_add_impl_tags.py | Migration adding impl_tags column with GIN index |
| api/schemas.py | Adds impl_tags field and 5 new filter count categories |
| api/routers/plots.py | Implements filtering and counting logic for impl-level tags |
| app/src/types/index.ts | Defines 5 new filter categories with labels and tooltips |
| app/src/hooks/useAnalytics.ts | Tracks impl-level filter usage in Plausible |
| app/src/components/FilterBar.tsx | Displays tooltips for all filter categories |
| plots//metadata/.yaml | 1,476 files backfilled with impl_tags structure |
| .github/workflows/spec-create.yml | Updates tagging guide reference |
| CLAUDE.md | Documents impl_tags in metadata structure |

Comment thread core/database/connection.py Outdated

 def _create_direct_engine():
-    """Create engine using direct DATABASE_URL connection."""
+    """Create async engine using direct DATABASE_URL connection."""

Copilot AI Jan 7, 2026

Corrected docstring: function creates a sync engine, not async.

Comment thread api/routers/plots.py Outdated
-    filter_groups: list[dict], all_images: list[dict], spec_id_to_tags: dict, spec_lookup: dict
+    filter_groups: list[dict], all_images: list[dict], spec_id_to_tags: dict, spec_lookup: dict, impl_lookup: dict
 ) -> list[dict]:
     """Calculate OR preview counts for each filter group."""

Copilot AI Jan 7, 2026

The function signature has 5 parameters but the docstring only documents the first 3. Add documentation for spec_lookup and impl_lookup parameters.

Suggested change
-    """Calculate OR preview counts for each filter group."""
+    """Calculate OR preview counts for each filter group.
+    Args:
+        filter_groups: List of filter group dictionaries defining categories and values for filtering.
+        all_images: List of image dictionaries to evaluate against the filter groups.
+        spec_id_to_tags: Mapping from specification IDs to their associated tag metadata.
+        spec_lookup: Mapping from specification IDs to full specification metadata, including tags.
+        impl_lookup: Mapping from (spec_id, library) pairs to implementation-level tag metadata.
+    Returns:
+        A list of dictionaries, one per filter group, mapping each possible value in that group
+        to the number of images that would match when that value is applied with the other groups.
+    """

Comment on lines +17 to +24
- custom-legend
- bezier-curves
- columndatasource
- html-export
patterns:
- data-generation
- iteration-over-groups
- columndatasource

Copilot AI Jan 7, 2026

The value columndatasource appears in both techniques (line 20) and patterns (line 25). Based on the tagging documentation, columndatasource is a code structure pattern, not a visualization technique. Remove it from the techniques list.
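
A duplicate like the one flagged here could be caught mechanically. The following is a small sketch, assuming (as the review comment implies but the PR does not state as a hard rule) that a tag should belong to only one dimension; `find_cross_dimension_duplicates` is a hypothetical helper, and the input mirrors the YAML excerpt above.

```python
from collections import defaultdict


def find_cross_dimension_duplicates(impl_tags: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each tag that appears in more than one dimension to those dimensions."""
    dims_by_tag = defaultdict(list)
    for dim, tags in impl_tags.items():
        for tag in set(tags):  # set() so a repeat inside one dimension counts once
            dims_by_tag[tag].append(dim)
    return {tag: dims for tag, dims in dims_by_tag.items() if len(dims) > 1}


impl_tags = {
    "techniques": ["custom-legend", "bezier-curves", "columndatasource", "html-export"],
    "patterns": ["data-generation", "iteration-over-groups", "columndatasource"],
}
print(find_cross_dimension_duplicates(impl_tags))
# {'columndatasource': ['techniques', 'patterns']}
```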

@codecov

codecov Bot commented Jan 7, 2026

Codecov Report

❌ Patch coverage is 76.41509% with 25 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| api/routers/plots.py | 74.68% | 20 Missing ⚠️ |
| core/database/connection.py | 77.77% | 4 Missing ⚠️ |
| automation/scripts/sync_to_postgres.py | 50.00% | 1 Missing ⚠️ |


- Use short URL keys for impl-level filters: dep, tech, pat, prep, style
  (consistent with spec-level filters like lib, plot, data)
- Add FILTER_TOOLTIPS with descriptions for all 11 filter categories
- Update Plausible analytics orderedKeys for tracking
- Remove publication-ready from tagging docs example

Issue #2434

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings January 7, 2026 13:55
- Fix docstring: sync engine, not async (connection.py)
- Add missing docstring params for spec_lookup and impl_lookup (plots.py)
- Remove duplicate columndatasource from techniques (bokeh.yaml)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 300 out of 1496 changed files in this pull request and generated no new comments.

- Add 5 impl-level URL params: dep, tech, pat, prep, style
- Update category values reference with spec/impl distinction
- Add filter_* properties for og:image tracking
- Update API response example with impl-level counts
Copilot AI review requested due to automatic review settings January 7, 2026 14:03
- Add tests for dep, tech, pat, prep, style filter matching
- Add test for impl_not_in_lookup case
- Add test for global_counts with impl_tags
- Add test for contextual_counts with impl_tags

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
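
The matching behaviour these tests target can be pictured with a simplified matcher. This is a sketch under stated assumptions, not the code in api/routers/plots.py: it assumes AND across filter groups and OR within a group, treats an image whose (spec_id, library) pair is absent from `impl_lookup` as having no impl tags, and uses hypothetical image fields `spec_id` and `library`.

```python
# Short filter key -> impl_tags dimension (pairing assumed from the PR's listing order).
IMPL_KEYS = {
    "dep": "dependencies",
    "tech": "techniques",
    "pat": "patterns",
    "prep": "dataprep",
    "style": "styling",
}


def image_matches_groups(image: dict, groups: list[dict], impl_lookup: dict) -> bool:
    """Return True if the image satisfies every impl-level filter group."""
    # Missing lookup entry -> empty tags, so any impl-level filter rejects the image.
    impl_tags = impl_lookup.get((image["spec_id"], image["library"]), {})
    for group in groups:
        dimension = IMPL_KEYS.get(group["category"])
        if dimension is None:
            continue  # spec-level categories are not modelled in this sketch
        tags = impl_tags.get(dimension, [])
        if not any(value in tags for value in group["values"]):
            return False
    return True


impl_lookup = {
    ("scatter-basic", "matplotlib"): {"dependencies": ["scipy"], "styling": ["alpha-blending"]},
}
tagged = {"spec_id": "scatter-basic", "library": "matplotlib"}
not_in_lookup = {"spec_id": "scatter-basic", "library": "bokeh"}
groups = [{"category": "dep", "values": ["scipy", "sklearn"]}]

print(image_matches_groups(tagged, groups, impl_lookup))         # True
print(image_matches_groups(not_in_lookup, groups, impl_lookup))  # False
```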
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 300 out of 1498 changed files in this pull request and generated 2 comments.



def _create_direct_engine_sync():
    """Create sync engine using direct DATABASE_URL connection (for sync scripts)."""

Copilot AI Jan 7, 2026

The function docstring says it's for 'sync scripts', but this is more specific - it's for local development scripts that need sync database access. Consider updating to match the explanation in init_db_sync() which mentions 'scripts like sync_to_postgres.py'.

Suggested change
-    """Create sync engine using direct DATABASE_URL connection (for sync scripts)."""
+    """Create sync engine using direct DATABASE_URL connection.
+    Used for local development scripts that need synchronous database access,
+    such as scripts like sync_to_postgres.py.
+    """

Comment thread api/routers/plots.py
Comment on lines +199 to +212
     filter_groups: list[dict], all_images: list[dict], spec_id_to_tags: dict, spec_lookup: dict, impl_lookup: dict
 ) -> list[dict]:
-    """Calculate OR preview counts for each filter group."""
+    """Calculate OR preview counts for each filter group.
+
+    Args:
+        filter_groups: List of filter group dictionaries defining categories and values.
+        all_images: List of image dictionaries to evaluate against the filter groups.
+        spec_id_to_tags: Mapping from specification IDs to their associated tag metadata.
+        spec_lookup: Mapping from specification IDs to full specification metadata.
+        impl_lookup: Mapping from (spec_id, library) pairs to implementation-level tags.
+
+    Returns:
+        List of dicts, one per filter group, mapping values to matching image counts.
+    """

Copilot AI Jan 7, 2026

While the docstring was added, other functions in this file like _calculate_global_counts, _calculate_contextual_counts, _filter_images, and _image_matches_groups also gained new parameters but did not receive updated docstrings. For consistency, consider adding similar documentation to those functions.
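
For readers unfamiliar with the counting functions named here, the OR preview described in the docstring above can be sketched as follows. This is a simplified illustration, not the PR's code: it only handles impl-level groups, uses full dimension names as categories, omits `spec_id_to_tags` and `spec_lookup`, and assumes hypothetical image fields `spec_id` and `library`.

```python
from collections import Counter


def calculate_or_counts(filter_groups: list[dict], all_images: list[dict], impl_lookup: dict) -> list[dict]:
    """For each group, count its values among images that match all OTHER groups."""
    results = []
    for i, group in enumerate(filter_groups):
        others = filter_groups[:i] + filter_groups[i + 1:]
        counts = Counter()
        for image in all_images:
            tags = impl_lookup.get((image["spec_id"], image["library"]), {})
            # An image qualifies if every other group has at least one matching value.
            if all(set(g["values"]) & set(tags.get(g["category"], [])) for g in others):
                counts.update(set(tags.get(group["category"], [])))
        results.append(dict(counts))
    return results


impl_lookup = {
    ("a", "matplotlib"): {"dependencies": ["scipy"], "styling": ["alpha-blending"]},
    ("a", "bokeh"): {"dependencies": ["selenium"], "styling": ["alpha-blending"]},
    ("b", "matplotlib"): {"dependencies": ["scipy", "sklearn"], "styling": ["grid-styling"]},
}
all_images = [
    {"spec_id": "a", "library": "matplotlib"},
    {"spec_id": "a", "library": "bokeh"},
    {"spec_id": "b", "library": "matplotlib"},
]
filter_groups = [
    {"category": "dependencies", "values": ["scipy"]},
    {"category": "styling", "values": ["alpha-blending"]},
]
or_counts = calculate_or_counts(filter_groups, all_images, impl_lookup)
print(or_counts)
```

With this data, the dependencies group is previewed against the two images that have `alpha-blending`, and the styling group against the two images that use `scipy`, so each value shown in a dropdown reflects what selecting it would add under the other active filters.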

@MarkusNeusinger MarkusNeusinger merged commit e7f89a9 into main Jan 7, 2026
8 of 9 checks passed
@MarkusNeusinger MarkusNeusinger deleted the feature/impl-tags branch January 7, 2026 14:11
Development

Successfully merging this pull request may close these issues.

Feature: Implementation-specific tags (techniques, dependencies, patterns)

2 participants