enlarge disk size#1978

Merged
pascalwhoop merged 1 commit into main from hotfix/PVC
Dec 3, 2025

Conversation

@pascalwhoop
Contributor

No description provided.

@pascalwhoop pascalwhoop merged commit 9a61dbf into main Dec 3, 2025
9 checks passed
@pascalwhoop pascalwhoop deleted the hotfix/PVC branch December 3, 2025 18:13
JacquesVergine pushed a commit that referenced this pull request Dec 4, 2025
JacquesVergine added a commit that referenced this pull request Dec 17, 2025
* Rename GT edges' subject_ec_id to source_ec_id

* Update test and function to use pyspark

* fix typo

* feat:  docker cloud build for matrix image (#1822)

* feat:  docker cloud build for matrix image

* Refactor cloud build configuration to use dynamic GCP project and environment variables

* Update GCP project name in Makefile for production environment

* refactor: update build_push_docker to handle environment-specific image pushes

* fix: update docker_cloud_build substitutions to use TAG variable

* doc: clarify build_push_docker function description to specify cloud image building

---------

Co-authored-by: Nelson Alfonso <45660392+Dashing-Nelson@users.noreply.github.com>
Co-authored-by: Nelson Alfonso <nelson_alfonso@icloud.com>
Co-authored-by: Jacques Vergine <jacques.vergine35@gmail.com>

* Fix EC clinical trial ingestion file to parquet (#1972)

* Update pipelines/matrix/src/matrix/pipelines/matrix_generation/nodes.py

Co-authored-by: Alexei Stepanenko <alexei.stepa@gmail.com>

* Remove KG and pandas dataset now that we've moved to spark

* Update pipelines/matrix/src/matrix/pipelines/matrix_generation/nodes.py

Co-authored-by: Alexei Stepanenko <alexei.stepa@gmail.com>

* Refactor flag adding in functions

* bump

* Fix off label bug

* Fix source and target regexp to only include "source_" + number columns

* Set spark checkpoint

* Migrate old disease list code to Kedro (v2) (#1943)

Migrated the matrix-disease-list repo functionality to kedro with extensive refactoring (see PR)

---------

Co-authored-by: Jacques Vergine <jacques.vergine35@gmail.com>

* enlarge disk size (#1978)

* fix: update OpenAI key reference (#1968)

* fix: update OpenAI key reference and set target revision for Helm release

* feat: enhance PostgreSQL pooler parameters for improved performance

* Fix docker cloud build  (#1977)

* Revert "feat:  docker cloud build for matrix image (#1822)"

This reverts commit 09bf859.

* bump uv lock

* Set spark checkpoint directory (#1979)

* Set spark checkpoint

* Bump secrets

* Fix spark utils

* Drop duplicates

* Change data leakage test to happen on EC_id

* WIP - Update run comparison parameters

* Update run comparison file paths

* Remove ec_id joins

* Fix off label join (thanks Piotr)

* Bump off label

* Bump to trigger CI again

---------

Co-authored-by: Pascal Bro <pascal@everycure.org>
Co-authored-by: Nelson Alfonso <45660392+Dashing-Nelson@users.noreply.github.com>
Co-authored-by: Nelson Alfonso <nelson_alfonso@icloud.com>
Co-authored-by: Alexei Stepanenko <alexei.stepa@gmail.com>
Co-authored-by: Nico Matentzoglu <nicolas.matentzoglu@gmail.com>
JacquesVergine added a commit that referenced this pull request Dec 18, 2025
* Rename GT edges' subject_ec_id to source_ec_id

* Update test and function to use pyspark

* fix typo

* feat:  docker cloud build for matrix image (#1822)

* feat:  docker cloud build for matrix image

* Refactor cloud build configuration to use dynamic GCP project and environment variables

* Update GCP project name in Makefile for production environment

* refactor: update build_push_docker to handle environment-specific image pushes

* fix: update docker_cloud_build substitutions to use TAG variable

* doc: clarify build_push_docker function description to specify cloud image building

---------

Co-authored-by: Nelson Alfonso <45660392+Dashing-Nelson@users.noreply.github.com>
Co-authored-by: Nelson Alfonso <nelson_alfonso@icloud.com>
Co-authored-by: Jacques Vergine <jacques.vergine35@gmail.com>

* Fix EC clinical trial ingestion file to parquet (#1972)

* Update pipelines/matrix/src/matrix/pipelines/matrix_generation/nodes.py

Co-authored-by: Alexei Stepanenko <alexei.stepa@gmail.com>

* Remove KG and pandas dataset now that we've moved to spark

* Update pipelines/matrix/src/matrix/pipelines/matrix_generation/nodes.py

Co-authored-by: Alexei Stepanenko <alexei.stepa@gmail.com>

* Refactor flag adding in functions

* bump

* Fix off label bug

* Fix source and target regexp to only include "source_" + number columns

* Set spark checkpoint

* Migrate old disease list code to Kedro (v2) (#1943)

Migrated the matrix-disease-list repo functionality to kedro with extensive refactoring (see PR)

---------

Co-authored-by: Jacques Vergine <jacques.vergine35@gmail.com>

* enlarge disk size (#1978)

* fix: update OpenAI key reference (#1968)

* fix: update OpenAI key reference and set target revision for Helm release

* feat: enhance PostgreSQL pooler parameters for improved performance

* Fix docker cloud build  (#1977)

* Revert "feat:  docker cloud build for matrix image (#1822)"

This reverts commit 09bf859.

* bump uv lock

* Set spark checkpoint directory (#1979)

* Set spark checkpoint

* Bump secrets

* Fix spark utils

* Drop duplicates

* Change data leakage test to happen on EC_id

* WIP - Update run comparison parameters

* Update run comparison file paths

* Remove ec_id joins

* Fix off label join (thanks Piotr)

* Bump off label

* Bump to trigger CI again

* Add JaM to matrix pipeline

* Update jam columns

* Rename ec_ground_truth to medic_ground_truth

* Remove on_or_off label

---------

Co-authored-by: Pascal Bro <pascal@everycure.org>
Co-authored-by: Nelson Alfonso <45660392+Dashing-Nelson@users.noreply.github.com>
Co-authored-by: Nelson Alfonso <nelson_alfonso@icloud.com>
Co-authored-by: Alexei Stepanenko <alexei.stepa@gmail.com>
Co-authored-by: Nico Matentzoglu <nicolas.matentzoglu@gmail.com>
Dashing-Nelson pushed a commit that referenced this pull request Dec 18, 2025
Dashing-Nelson added a commit that referenced this pull request Dec 18, 2025
* Rename GT edges' subject_ec_id to source_ec_id

* Update test and function to use pyspark

* fix typo

* feat:  docker cloud build for matrix image (#1822)

* feat:  docker cloud build for matrix image

* Refactor cloud build configuration to use dynamic GCP project and environment variables

* Update GCP project name in Makefile for production environment

* refactor: update build_push_docker to handle environment-specific image pushes

* fix: update docker_cloud_build substitutions to use TAG variable

* doc: clarify build_push_docker function description to specify cloud image building

---------

Co-authored-by: Nelson Alfonso <45660392+Dashing-Nelson@users.noreply.github.com>
Co-authored-by: Nelson Alfonso <nelson_alfonso@icloud.com>
Co-authored-by: Jacques Vergine <jacques.vergine35@gmail.com>

* Fix EC clinical trial ingestion file to parquet (#1972)

* Update pipelines/matrix/src/matrix/pipelines/matrix_generation/nodes.py

Co-authored-by: Alexei Stepanenko <alexei.stepa@gmail.com>

* Remove KG and pandas dataset now that we've moved to spark

* Update pipelines/matrix/src/matrix/pipelines/matrix_generation/nodes.py

Co-authored-by: Alexei Stepanenko <alexei.stepa@gmail.com>

* Refactor flag adding in functions

* bump

* Fix off label bug

* Fix source and target regexp to only include "source_" + number columns

* Set spark checkpoint

* Migrate old disease list code to Kedro (v2) (#1943)

Migrated the matrix-disease-list repo functionality to kedro with extensive refactoring (see PR)

---------

Co-authored-by: Jacques Vergine <jacques.vergine35@gmail.com>

* enlarge disk size (#1978)

* fix: update OpenAI key reference (#1968)

* fix: update OpenAI key reference and set target revision for Helm release

* feat: enhance PostgreSQL pooler parameters for improved performance

* Fix docker cloud build  (#1977)

* Revert "feat:  docker cloud build for matrix image (#1822)"

This reverts commit 09bf859.

* bump uv lock

* Set spark checkpoint directory (#1979)

* Set spark checkpoint

* Bump secrets

* Fix spark utils

* Drop duplicates

* Change data leakage test to happen on EC_id

* WIP - Update run comparison parameters

* Update run comparison file paths

* Remove ec_id joins

* Fix off label join (thanks Piotr)

* Bump off label

* Bump to trigger CI again

---------

Co-authored-by: Pascal Bro <pascal@everycure.org>
Co-authored-by: Nelson Alfonso <45660392+Dashing-Nelson@users.noreply.github.com>
Co-authored-by: Nelson Alfonso <nelson_alfonso@icloud.com>
Co-authored-by: Alexei Stepanenko <alexei.stepa@gmail.com>
Co-authored-by: Nico Matentzoglu <nicolas.matentzoglu@gmail.com>

2 participants