Nelson/aip 611 determine number of experimentsruns in last 3 months by Dashing-Nelson · Pull Request #1963 · everycure-org/matrix

Dashing-Nelson · 2025-11-24T11:01:48Z

Description of the changes

This pull request introduces several improvements to the pipeline configuration and execution for the matrix project, focusing on better thread safety, enhanced configurability via environment variables, and updated dependencies. The most significant changes include switching to a thread-safe Spark session manager, enabling pipeline parameters to be set through environment variables, and running Kedro pipelines with the ThreadRunner for enhanced parallelism.

Pipeline execution and configuration:

Updated Kedro pipeline runs in both the Makefile and docker-compose.ci.yml to use the ThreadRunner, enabling parallel execution and potentially faster test runs.
Changed dynamic pipeline parameters (n_cross_val_folds and num_shards) in settings.py to be configurable via environment variables, improving flexibility for different environments.
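
As a rough illustration of the environment-variable approach described above, the following sketch reads the two parameters with integer defaults. The variable names (`N_CROSS_VAL_FOLDS`, `NUM_SHARDS`), the default values, the `int_from_env` helper and the `DYNAMIC_PIPELINE_PARAMS` dictionary are all assumptions made for this example. The actual names and defaults in settings.py may differ.

```python
import os


def int_from_env(name: str, default: int) -> int:
    """Read an integer setting from the environment, falling back to a default."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError as exc:
        # Fail early with a readable message instead of deep inside the pipeline.
        raise ValueError(f"{name} must be an integer, got {raw!r}") from exc


# Hypothetical variable names and defaults, for illustration only.
DYNAMIC_PIPELINE_PARAMS = {
    "n_cross_val_folds": int_from_env("N_CROSS_VAL_FOLDS", 3),
    "num_shards": int_from_env("NUM_SHARDS", 1),
}
```

With this pattern, a CI job could export a smaller value before running the pipeline, while local runs keep the defaults.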

Thread safety and Spark session management:

Replaced getActiveSession() with SparkManager.get_or_create_session() in gcp.py to ensure thread safety when creating Spark DataFrames.
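
The idea behind a thread-safe "get or create" call can be sketched without Spark at all. The `SessionManager` class below is a hypothetical stand-in, not the project's `SparkManager`. It assumes the goal is for all threads started by the ThreadRunner to share a single session object that is created exactly once.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class SessionManager(Generic[T]):
    """Hypothetical thread-safe holder for a lazily created session."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._lock = threading.Lock()
        self._session: Optional[T] = None

    def get_or_create_session(self) -> T:
        # Double-checked locking: cheap read first, take the lock only to create.
        if self._session is None:
            with self._lock:
                if self._session is None:
                    self._session = self._factory()
        return self._session


if __name__ == "__main__":
    # A plain object stands in for a real Spark session in this sketch.
    manager = SessionManager(factory=object)
    with ThreadPoolExecutor(max_workers=8) as pool:
        sessions = list(pool.map(lambda _: manager.get_or_create_session(), range(32)))
    print(all(s is sessions[0] for s in sessions))
```

In a real setup the factory would presumably build the Spark session. That wiring is left out here because it depends on the project's configuration.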

Dependency and environment management:

Updated the submodule commit for infra/secrets, likely pulling in new secrets or configuration changes.
Added an import for os in settings.py to support environment variable configuration.

Fixes / Resolves the following issues:

Checklist:

Added label to PR (e.g. enhancement or bug)
Ensured the PR is named descriptively. FYI: This name is used as part of our changelog & release notes.
Looked at the diff on github to make sure no unwanted files have been committed.
Made corresponding changes to the documentation
Added tests that prove my fix is effective or that my feature works
Any dependent changes have been merged and published in downstream modules
If breaking changes occur or you need everyone to run a command locally after pulling in latest main, uncomment the below "Merge Notification" section and describe steps necessary for people
Ran on sample data using kedro run -e sample -p test_sample (see sample environment guide)

- Implemented a Jupyter notebook that connects to the MLflow tracking server.
- Added steps for GKE authentication, service discovery, and port-forwarding.
- Included functionality to query and analyze experiments and runs from the last 3 months.
- Summarized results by experiment and status, with options for detailed statistics and CSV export.
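
The summarising step of such a notebook might look roughly like the sketch below. It is not the notebook's code. It assumes the runs have already been fetched from the tracking server into plain dictionaries with hypothetical keys `experiment`, `status` and `start_time`, where `start_time` is in milliseconds since the Unix epoch (the unit MLflow uses for run start times). "Last 3 months" is approximated as 90 days.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from typing import Iterable, Mapping, Optional


def summarize_recent_runs(
    runs: Iterable[Mapping],
    now: Optional[datetime] = None,
    days: int = 90,
) -> Counter:
    """Count runs per (experiment, status) that started within the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff_ms = int((now - timedelta(days=days)).timestamp() * 1000)
    counts: Counter = Counter()
    for run in runs:
        # Keep only runs that started at or after the cutoff.
        if run["start_time"] >= cutoff_ms:
            counts[(run["experiment"], run["status"])] += 1
    return counts


def to_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"experiment": "exp-a", "status": "FINISHED",
         "start_time": to_ms(datetime(2025, 10, 1, tzinfo=timezone.utc))},
        {"experiment": "exp-b", "status": "FAILED",
         "start_time": to_ms(datetime(2025, 1, 1, tzinfo=timezone.utc))},
    ]
    reference = datetime(2025, 11, 24, tzinfo=timezone.utc)
    for key, count in summarize_recent_runs(sample, now=reference).items():
        print(key, count)
```

Passing `now` explicitly keeps the window reproducible, which helps when the same count needs to be re-derived later.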

…1963)

* Change test runner from ThreadRunner to ParallelRunner in docker test command
* Change test runner from ParallelRunner to ThreadRunner in docker test command
* Add runbook to count MLflow experiments from the last 3 months
  - Implemented a Jupyter notebook that connects to the MLflow tracking server.
  - Added steps for GKE authentication, service discovery, and port-forwarding.
  - Included functionality to query and analyze experiments and runs from the last 3 months.
  - Summarized results by experiment and status, with options for detailed statistics and CSV export.

* Correct trails to trial * Update ec_clinical_trial ingestion, transform and fabricator with ec_id * Update ec_clinical_trials transformation columns * Update off label ingestion and transformation to handle ec_drug_id * Update kgml ground truth ingestion, transformation and fabrication * Update drugbank * Update ec ground truth * Update pipelines/matrix/conf/base/fabricator/parameters.yml * Update pipelines/matrix/conf/base/fabricator/parameters.yml * Update pipelines/matrix/conf/base/fabricator/parameters.yml * Follow Piotr's widsom * Minor fixes * Join generated pairs with known pairs on EC_id * Add xgboost * Join with embeddings on curie * Bump secrets * Remove source like columns before predicting * Update pipelines/matrix/src/matrix/pipelines/matrix_generation/nodes.py Co-authored-by: Alexei Stepanenko <alexei.stepa@gmail.com> * UV lock * Fix matrix test data * Fix fabricator params * Drop disease synonyms column * Fix drug/disease in the embeddings * Add default pipeline * Remove polars changes * Add key node pages to dashboard (#1887) * Add index & templated page for characterizing key nodes * remove prefixes, show count col first, remove unique subjects/objects cols * Optimize key_nodes_stats query performance by removing expensive descendant edge computations and sourcing from pre-computed release aggregate data * pull primary knowledge source out of the edge tables on key nodes pages * Add interactive chord diagram for key node category visualization This commit adds an interactive chord/radial diagram to visualize key node connections to biolink categories, with drill-down functionality to show example edges. 
New Features: - Interactive chord diagram showing key node in center with connected categories in oval layout - Direction-agnostic category grouping using biolink hierarchy (reduces 44 to ~14 parent groups) - Node sizes scaled by distinct connected nodes count - Link widths scaled by total edge count - Click-to-drill-down: select category to see example edges - Diverse edge sampling: 10 example edges per primary knowledge source - Clickable knowledge source links to detail pages - Dark mode compatible styling New Files: - sources/bq/key_nodes_category_summary.sql: Aggregates edges by parent biolink categories - sources/bq/key_nodes_category_edges.sql: Fetches example edges with diverse sampling - pages/_components/KeyNodeChordDashboard.svelte: Main visualization component with ECharts - pages/_lib/key-node-chord/constants.js: Layout constants and color palette - pages/_lib/key-node-chord/chord-layout.js: Position calculations and formatting utilities Modified Files: - pages/Key Nodes/[key_node_id].md: Integrated chord dashboard component Technical Details: - Used ECharts with direct initialization for click event handling (Evidence.dev workaround) - Oval-shaped layout (OUTER_RADIUS_X: 280, OUTER_RADIUS_Y: 150) - Straight connecting lines (curveness: 0) - Relative paths for knowledge source links with infores: prefix - Evidence.dev DataTable integration for drill-down with pagination reset 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Code tidying for chord dashboard Code Cleanup: - Remove unused SQL columns (subject_category, object_category) from queries - Remove unused import (CATEGORY_COLORS) from Svelte component - Improve variable naming clarity (c → category, n → node) Documentation: - Add explanatory comments for complex category mapping logic in SQL - Document design decisions (oval shape, scaling ranges, straight lines) - Add edge case handling comments (equal counts in scaling) - Add 
component header documenting key technical decisions Technical Details: - Explain node size range (20-60px) ensures clickability without overlap - Explain link width range (2-12px) maintains visibility without overwhelming - Document biolink hierarchy mapping strategy reduces 44+ categories to ~14 groups - Clarify why direct ECharts initialization is used (for click event handling) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Remove search from DataTables and fix premature error message Fixes: - Remove search=true from 5 DataTable components (chord dashboard + 4 on key node page) - Change error message condition to check if key_node_info is defined before showing The search feature requires Query objects but we're passing filtered JavaScript arrays, triggering "Search Failed - Please use a query instead" toast warnings. The error message was showing immediately on page load before the query completed. Now uses {:else if key_node_info !== undefined} to only show after query finishes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Remove Connection Flow Sankey and rename section to Graph Edges Changes: - Remove "Connection Flow" Sankey diagram section (replaced by chord diagram) - Remove unused key_node_connected_categories SQL query - Rename "Interactive Category Explorer" to "{node_name} Graph Edges" - Update section description to better explain the visualization The chord diagram provides a cleaner, more interactive way to explore category connections than the Sankey flow diagram. 
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * use a shared data source for bar charts, graph and table, make sure connected category is always the other side of the association * removed unused source queries * remove unused another no-longer used query * move category colors from component to shared colors.js * key node custom viz labels compatible with dark mode * use category color from key node in key node viz, improve border colors in graph & bar charts * Update title for example/sample edges table * randomize key node sample edges * revert to deterministic sorting of example edges * update key node graph node and label colors for better visibility * include display of pks in edge counts added/removed/significant change, since it was already the natural key we were using, added explanatory text for each table * swap out markdown bold for strong tags --------- Co-authored-by: Claude <noreply@anthropic.com> * Nelson/aip 616 rd make topological embeddings resilient to spot failures (#1957) * Add spot node pools to GKE configuration and remove spot tolerations from Argo workflow template * Add tolerations for neo4j pod to enhance resource scheduling --------- Co-authored-by: Jacques Vergine <jacques.vergine35@gmail.com> * Nelson/aip 611 determine number of experimentsruns in last 3 months (#1963) * Change test runner from ThreadRunner to ParallelRunner in docker test command * Change test runner from ParallelRunner to ThreadRunner in docker test command * Add runbook to count MLflow experiments from the last 3 months - Implemented a Jupyter notebook that connects to the MLflow tracking server. - Added steps for GKE authentication, service discovery, and port-forwarding. - Included functionality to query and analyze experiments and runs from the last 3 months. - Summarized results by experiment and status, with options for detailed statistics and CSV export. 
* docs: add LiteLLM New Provider Guide and update usage documentation (#1964) * Add xgboost * UV lock * Bump uv lock * Correct trails to trial * trails to trials --------- Co-authored-by: Alexei Stepanenko <alexei.stepa@gmail.com> Co-authored-by: Kevin Schaper <kevinschaper@gmail.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Nelson Alfonso <45660392+Dashing-Nelson@users.noreply.github.com>

Dashing-Nelson self-assigned this Nov 24, 2025

Dashing-Nelson requested a review from a team as a code owner November 24, 2025 11:01

Dashing-Nelson requested a review from pascalwhoop November 24, 2025 11:01

Dashing-Nelson had a problem deploying to dev November 24, 2025 11:02 — with GitHub Actions Error

Dashing-Nelson added 3 commits November 24, 2025 11:02

Change test runner from ThreadRunner to ParallelRunner in docker test…

c1df0fd

… command

Change test runner from ParallelRunner to ThreadRunner in docker test…

8daf3e6

… command

Dashing-Nelson force-pushed the nelson/aip-611-determine-number-of-experimentsruns-in-last-3-months branch from 6cf404b to 81223c0 Compare November 24, 2025 11:02

Dashing-Nelson requested a review from JacquesVergine November 24, 2025 11:02

Dashing-Nelson temporarily deployed to dev November 24, 2025 11:03 — with GitHub Actions Inactive

JacquesVergine approved these changes Nov 26, 2025

Dashing-Nelson enabled auto-merge (squash) November 26, 2025 17:01

Merge branch 'main' into nelson/aip-611-determine-number-of-experimen…

89e74f2

…tsruns-in-last-3-months

Dashing-Nelson merged commit 9c4e3da into main Nov 26, 2025
8 checks passed

Dashing-Nelson temporarily deployed to dev November 26, 2025 17:01 — with GitHub Actions Inactive

Dashing-Nelson deleted the nelson/aip-611-determine-number-of-experimentsruns-in-last-3-months branch November 26, 2025 17:01

Dashing-Nelson merged 4 commits into main from nelson/aip-611-determine-number-of-experimentsruns-in-last-3-months
