feat: automated index recommendations with live schema (#7) #54
Merged
…der (#7)

Foundation for automated index recommendations (issue #7), Layers 1-3:

- Layer 1: UserDatabaseConnection model with Fernet-encrypted credentials (cryptography>=42.0, DB_CONNECTION_KEY env var). CRUD views at /connections/.
- Layer 2: LiveSchemaContext snapshots tables/columns/indexes/FKs from a user's PG/MySQL/SQLite via DatabaseIntrospector and caches them in the query_analysis_cache (2h TTL). Wires DatabaseStatisticsManager._fetch_live_statistics. AnalysisContext gains an optional live_schema.
- Layer 3: IndexRecommender service: extracts candidates from WHERE/JOIN/ORDER BY/GROUP BY, classifies redundancy (EXACT/SUBSUMED) against existing indexes, scores cost-benefit via HypoPG EXPLAIN-deltas on PostgreSQL (with graceful heuristic fallback) or row-count×selectivity heuristics on MySQL/SQLite, and emits engine-specific CREATE/DROP DDL. Caps at 5 ranked recommendations, surfaces advisories.

39 new tests (test_connections, test_live_schema_context, test_index_recommender).
Full suite: 633 tests, 18 fail / 4 err, same baseline as main (no regressions).

Layers 4-9 (grade-flow integration, UI panel, ML features, GA4 event, integration tests, docs) follow in subsequent commits.
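The Layer 3 redundancy check and ranking could look roughly like the sketch below. It is an illustration, not the PR's IndexRecommender code: `classify_redundancy`, `rank`, the `Recommendation` fields, and the `CONFIDENCE_WEIGHT` values are hypothetical, and it assumes SUBSUMED means the candidate's columns form a leading prefix of a wider existing index.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Optional, Sequence, Tuple


class Redundancy(Enum):
    EXACT = "EXACT"
    SUBSUMED = "SUBSUMED"


# Assumed weights: the PR only states that ranking uses
# improvement x confidence_weight, not the actual values.
CONFIDENCE_WEIGHT = {"HIGH": 1.0, "MEDIUM": 0.6, "LOW": 0.3}


@dataclass(frozen=True)
class Recommendation:
    columns: Tuple[str, ...]
    improvement: float  # predicted improvement as a fraction, 0.0 to 1.0
    confidence: str     # "HIGH", "MEDIUM", or "LOW"


def classify_redundancy(
    candidate: Sequence[str],
    existing_indexes: Iterable[Sequence[str]],
) -> Optional[Redundancy]:
    """Compare a candidate's column list against existing index column lists."""
    cand = tuple(c.lower() for c in candidate)
    subsumed = False
    for index_columns in existing_indexes:
        cols = tuple(c.lower() for c in index_columns)
        if cols == cand:
            return Redundancy.EXACT
        # A wider index whose leading columns match can serve the same lookups.
        if len(cols) > len(cand) and cols[: len(cand)] == cand:
            subsumed = True
    return Redundancy.SUBSUMED if subsumed else None


def rank(recommendations: Iterable[Recommendation], limit: int = 5) -> List[Recommendation]:
    """Order by improvement x confidence weight, highest first, and cap the list."""
    scored = sorted(
        recommendations,
        key=lambda r: r.improvement * CONFIDENCE_WEIGHT[r.confidence],
        reverse=True,
    )
    return scored[:limit]
```

With these assumed weights, a HIGH-confidence candidate at 0.5 predicted improvement outranks a LOW-confidence candidate at 0.7, which is the intent of weighting by confidence.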
Completes issue #7 Layers 4-9 on top of Layer 1-3 foundation:

- L4: QueryAnalysis.index_recommendations JSONField (migration 0005); QueryGradeForm gains optional user-scoped db_connection picker; grade_query view runs IndexRecommender after analyze_query when a connection is supplied, persists results, touches last_used_at. Failures are logged but never break the grade.
- L5: "Index recommendations" panel in grade_results.html: confidence pill, redundancy badge, predicted-improvement %, expandable CodeMirror DDL block per recommendation, "Copy all DDL" action. Connection picker in grade_form.html. Navbar dropdown link to /connections/.
- L6: FeatureExtractor.extract_index_features companion method (existing_index_count, unindexed_join/where_columns, largest_table_rows, recommended_index_count). Kept *separate* from the 45-feature main vector so deployed HYBRID_SCORER models keep loading. Persisted under index_recommendations.index_features for future supervised training.
- L7: Server-side GA4 event index_recommendation_generated via session-flag pattern with recommendation_count / database_engine / confidence_high_count / redundant_filtered_count params.
- L8: 4 TransactionTestCase integration tests covering the happy path, no-connection fallback, user-scoped picker, and introspector-failure resilience. Mocks _try_hypopg_improvement to avoid alias leakage that triggers ATOMIC_REQUESTS KeyError.

Full suite: 637 tests, 18 fail / 4 err, same baseline as main (no regressions).
+43 new passing tests (12 + 7 + 20 + 4).
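The L4 rule that failures are logged but never break the grade amounts to a broad exception guard around the recommender call. A minimal sketch, assuming a hypothetical `recommend_safely` helper and a zero-argument callable standing in for the real IndexRecommender invocation:

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger(__name__)


def recommend_safely(
    run_recommender: Callable[[], List[Dict[str, Any]]],
) -> List[Dict[str, Any]]:
    """Run the recommender; on any error, log it and return no recommendations."""
    try:
        return run_recommender()
    except Exception:
        # logger.exception records the traceback, so the failure stays visible
        # in the logs while the caller continues with an empty result.
        logger.exception("Index recommendation failed; grade result is unaffected")
        return []
```

Catching `Exception` broadly is deliberate here: the recommendations are an optional add-on to the grade, so an introspection or connection error should degrade to "no recommendations" rather than fail the request.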
ringo380 added a commit that referenced this pull request on May 17, 2026
After PRs #65–#68 merged, the pre-existing-failure floor was 15 (11 failures + 4 errors / 637 tests). All 15 were either UX-pass template-string drift (sentence vs. title case, retitled headings), behavior drift (anon trial removed login gate), or missing fixture paths. None were real bugs.

Categories:

- test_anonymous_trial.test_anon_grade_page_shows_trial_banner (1)
  Asserted "Trial mode", but no template renders that string anywhere. Switched to "free grades left", which the banner does render.
- test_feedback (5)
  Title-case → sentence-case across submit form heading, update heading, and analytics page heading. test_feedback_button_in_results asserted "Provide Feedback" but the actual button on grade_results is labeled "Detailed feedback" (links to the same submit_feedback URL).
- test_integration (3, legacy)
  - test_authentication_required: /grade/ is no longer login-gated (anon trial flow); only history/account/connections require auth.
  - test_full_query_grading_workflow: "Query Analysis Results" retitled to "Grade results" in the UX pass.
  - test_grade_display_formatting: grade-{letter} CSS class was retired; grade pill now uses Tailwind utilities. Assert visible grade letter directly.
- test_database_analysis.test_database_analyze_get (1)
  Page heading retitled "Database Architecture Analysis" → "Connect a database" (#54 connection-mgmt UI).
- test_optimization.test_optimization_integration_workflow (1)
  Optimization section + tab labels lowercased and shortened.
- analyzer.tests.ParserTestCase (4 errors)
  setUp() looked for sample logs under analyzer/samples/ but they live at the repo-root samples/ dir. Fixed the path computation.

After this change: `python manage.py test analyzer` → 637 tests, 0 failures, 0 errors, 14 skipped.
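The ParserTestCase path fix can be illustrated as below. This is a sketch, assuming the test module sits one directory below the repo root (for example analyzer/tests.py); `samples_dir` is a hypothetical helper, not a name taken from the commit.

```python
from pathlib import Path


def samples_dir(test_module_file: str) -> Path:
    """Return <repo-root>/samples for a test module located in <repo-root>/analyzer/."""
    app_dir = Path(test_module_file).resolve().parent  # <repo-root>/analyzer
    return app_dir.parent / "samples"                  # <repo-root>/samples
```

Inside setUp() this would be called as `samples_dir(__file__)`. The earlier behavior described in the commit corresponds to stopping at `app_dir / "samples"`.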
Closes #7. Absorbs the schema-introspection slice of #6.
Summary
/connections/.
What's in here
- UserDatabaseConnection model, Fernet crypto, CRUD views/templates: analyzer/models/connection_models.py, analyzer/services/connection_crypto.py, analyzer/views/connection_views.py, analyzer/templates/analyzer/connections/
- LiveSchemaContext (Redis-cached schema snapshots) + wires DatabaseStatisticsManager._fetch_live_statistics: analyzer/services/live_schema_context.py, analyzer/ml/integration/database_stats.py
- IndexRecommender service + DB-specific DDL generator: analyzer/services/index_recommender.py, analyzer/services/index_script_generator.py
- QueryAnalysis.index_recommendations JSONField + grade-flow wiring: analyzer/views/query_grading_views.py, analyzer/forms/query_forms.py
- Templates: analyzer/templates/analyzer/grade_results.html, grade_form.html, base.html
- extract_index_features for future supervised training (does not change the 45-feature main vector): analyzer/ml/core/feature_extractor.py
- index_recommendation_generated server-side event via session-flag pattern: analyzer/views/query_grading_views.py
- Tests: analyzer/test_connections.py, test_live_schema_context.py, test_index_recommender.py, test_grade_with_live_schema.py
Cost-benefit approach (why not HybridQueryGrader?)
HybridQueryGrader scores query text patterns. It has no awareness of indexes or schema, so feeding hypothetical indexes through it would produce identical scores. Real cost-benefit needs EXPLAIN-grounded data:
- PostgreSQL: checks pg_extension for HypoPG, creates each candidate as a hypothetical index, and diffs Total Cost from EXPLAIN (FORMAT JSON). HIGH confidence. Falls back to the heuristic and emits an advisory if HypoPG isn't installed.
- MySQL/SQLite: row-count × selectivity heuristics. MEDIUM confidence with row counts, LOW without.
Recommendations are ranked by improvement × confidence_weight, capped at 5.
Required before merge
- Set DB_CONNECTION_KEY on Railway (railway variables --service querygrade --set "DB_CONNECTION_KEY=$(python -c 'from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())')" --skip-deploys)
- cryptography>=42.0 is added to both requirements.txt and requirements-prod.txt
Test plan
- `python manage.py test analyzer.test_connections analyzer.test_live_schema_context analyzer.test_index_recommender analyzer.test_grade_with_live_schema`: all 43 pass
- `python manage.py test analyzer`: 637 tests, 18 fail / 4 err (same as main baseline; no regressions)
- … orders table, save connection, grade SELECT * FROM orders WHERE customer_id = 42, verify ~95-99% predicted improvement card with valid DDL
- … "Already exists" redundancy badge
- … MEDIUM confidence
- … index_recommendation_generated with the four params after a successful recommendation
Out of scope (follow-up issues)