Merged
33 commits
75e1775
CU-869aujr7h: Add nightly workflow to check library stability
mart-r Oct 14, 2025
68deffd
CU-869aujr7h: Update working directory in new workflow
mart-r Oct 14, 2025
63e7eea
CU-869aujr7h: Update comment in new workflow
mart-r Oct 14, 2025
aaf9906
CU-869aujr7h: Disallow incompatible transformers version
mart-r Oct 14, 2025
86698af
CU-869aujr7h: Fix worklflow install / sync
mart-r Oct 14, 2025
b3b955a
CU-869aujr7h: Make worklflow only have read permissions
mart-r Oct 14, 2025
99042b0
CU-869aujr7h: Install without lock
mart-r Oct 14, 2025
9ad4a9e
CU-869aujr7h: Use non-uv pip for lock-free install
mart-r Oct 14, 2025
07d072e
CU-869aujr7h: Force usage of correct python version in workflow
mart-r Oct 14, 2025
606769d
CU-869aujr7h: Fix versions in workflow (3.10 instead of 3.1)
mart-r Oct 14, 2025
ecc18de
Typing fix for regression utils
mart-r Oct 10, 2025
ad6eb74
Typing fix for modern bert RelCAT
mart-r Oct 10, 2025
0138827
Merge branch 'main' into build/medcat/CU-869aujr7h-add-library-stabil…
mart-r Oct 14, 2025
3dd38f4
CU-869aujr7h: Change the way tests timeout is set up
mart-r Oct 14, 2025
4cc196f
CU-869aujr7h: Attempt to fix builds on Windows by ignoring Windows + …
mart-r Oct 14, 2025
2f5beb7
Merge branch 'main' into build/medcat/CU-869aujr7h-add-library-stabil…
mart-r Oct 14, 2025
aeabbaa
CU-869aujr7h: Remove python 3.9 from matrix
mart-r Oct 14, 2025
112d3f9
CU-869aujr7h: Attempt fix mock for Windows
mart-r Oct 14, 2025
53eee06
CU-869aujr7h: Use CPU-only torch for MacOS in workflow to avoid MPS i…
mart-r Oct 14, 2025
7b84d9a
CU-869aujr7h: Force installation to happen through bash so IF works o…
mart-r Oct 14, 2025
ef5c3e5
Merge branch 'main' into build/medcat/CU-869aujr7h-add-library-stabil…
mart-r Oct 15, 2025
8817043
CU-869aujr7h: Add 3.13 for lib stability workflow
mart-r Oct 15, 2025
f201270
CU-869aujr7h: [NEEDS TO BE REVERTED] Only run on MacOS and Windows on…
mart-r Oct 15, 2025
ea89d14
CU-869aujr7h: Allow 45 minutes for tests so tests on MacOS don't time…
mart-r Oct 15, 2025
2347a6c
CU-869aujr7h: Use temporary directory instead of named temp file for …
mart-r Oct 15, 2025
2dbce7f
CU-869aujr7h: Avoid heavy RAM tests (DeID) on MacOS during CI
mart-r Oct 15, 2025
3976a51
CU-869aujr7h: Ignore further tests for MacOS runner
mart-r Oct 15, 2025
663a20c
CU-869aujr7h: Make component tests more flexible
mart-r Oct 15, 2025
ff56c75
CU-869aujr7h: Fix test skip method call
mart-r Oct 16, 2025
242a02f
Revert "CU-869aujr7h: [NEEDS TO BE REVERTED] Only run on MacOS and Wi…
mart-r Oct 16, 2025
c337428
CU-869aujr7h: Remove push-specific workflow triggers
mart-r Oct 16, 2025
62da42b
CU-869avau57: Require numpy 2.1 or above for python 3.13
mart-r Oct 16, 2025
ed8426e
CU-869aujr7h: Rename helper method to avoid heavy RAM tests on MacOS …
mart-r Oct 16, 2025
58 changes: 58 additions & 0 deletions .github/workflows/medcat-v2-lib-stabiliy.yml
@@ -0,0 +1,58 @@
+name: MedCAT-nightly-stability-check
+permissions:
+  contents: read
+on:
+  schedule:
+    - cron: "0 3 * * *" # every day at 3am UTC
Collaborator

Happy to try this at 3am for now to see what it is like. Personally, I might suggest running it during working hours, or at least just before 9am. It's a trade-off between the email arriving when it is useful and slowing down other builds.

Collaborator Author

The reason I wanted it overnight is so that it doesn't disturb other work. This spawns 4x3=12 jobs, all of which take quite a while (20-30 minutes). If this were to happen during work hours, it might cause workflows for active work to be queued. So I figured I'd run it at night, when no other work is happening, and deal with it in the morning if I have to.

+  workflow_dispatch: # allow manual runs
+  pull_request:
+    paths:
+      - ".github/workflows/medcat-v2-lib-stabiliy.yml"
+
+defaults:
+  run:
+    working-directory: ./medcat-v2
+
+
+jobs:
+  test:
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest, macos-latest, windows-latest]
Collaborator

From a design point of view:

In general, could you make this as close to the _main one as possible? That keeps it simple and easy to maintain.

(It's great to try all the other OSes just in this one and not in main, so that is one difference that is good.)

But taking the Test step as an example, can we make them completely identical? Either change main or this one, as I don't want to have to think about whether timeout-minutes is better than timeout (as an example of keeping it simple).

In here:

      - name: Test
        run: |
          uv run --python ${{ matrix.python-version }} python -m unittest discover
        timeout-minutes: 45

Vs in main

      - name: Test
        run: |
          timeout 30m uv run python -m unittest discover

Collaborator Author

The only reason this specific step was changed is that the timeout command approach didn't work on the Windows runner, which doesn't have the timeout command available. And I didn't think it was within the scope here to change the main workflow to be in line with this change.

From a brief look, we should be able to fully modularise this, i.e. have the normal workflow steps (linting, typing, tests) in a separate workflow file (that never runs on its own) and use it for both the main workflow and this one, but with slightly different inputs. I think it's probably worth doing, but I'd leave it for another task.
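
For illustration only: a third way to get an identical time limit on every runner would be to enforce it from Python rather than from the shell or the workflow file. This is a minimal sketch, not something the PR does; `run_with_timeout` and `TEST_CMD` are hypothetical names, and the exit code 124 is chosen only to mirror what GNU `timeout` returns on expiry.

```python
import subprocess
import sys
from typing import List

# Hypothetical default: the same unittest invocation the workflow uses
TEST_CMD: List[str] = [sys.executable, "-m", "unittest", "discover"]


def run_with_timeout(cmd: List[str], timeout_s: float) -> int:
    """Run cmd and return its exit code, or 124 if the time limit is hit."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising; 124 mirrors GNU timeout
        return 124
    return completed.returncode
```

A wrapper script could then end with `sys.exit(run_with_timeout(TEST_CMD, 45 * 60))`, which behaves the same on Linux, macOS and Windows since it does not rely on a shell utility.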

+        python-version: [ "3.10", "3.11", "3.12", "3.13"]
+
+    steps:
+      - uses: actions/checkout@v4
+      - uses: astral-sh/setup-uv@v3
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install with latest deps
+        shell: bash
+        run: |
+          uv run --python ${{ matrix.python-version }} python -m ensurepip
Collaborator

Hey, a wildcard one for another time (if ever):

This one reads the pyproject file and the Python code in this folder, right? So it kind of says "the GitHub project is still working based on what is in main".

Are we able to go a layer higher and do something extra to say "the PyPI library is working"? In other words, really test for "Can users install and use MedCAT today?"

uv pip install medcat # So actually bring in the latest pypi version
uv run some-easy-test

With some super easy test like

x = CAT.load_model_pack(" ")
result = x.get_entities("test")
assert whatever

Collaborator Author

Yes, this reads the dependencies from pyproject.toml and installs based on that.

We could add something to test a PyPI-based install as well. But I'm not sure it adds a lot. It sounds like testing PyPI's infrastructure and/or wheel mechanics at that point, and I feel like that's someone else's responsibility.

Perhaps a better option for this would be an isolated test in the production / release workflow? Something that tests that installation of the wheel does in fact work as expected.

Collaborator

I think I'm coming more from the perspective of what is actually being verified:

Right now there's nothing that checks for the case of "I followed your readme instructions today and it worked". The test as it is here starts with "I checked out your main branch... and ran unit tests", which isn't quite the same thing.

Interestingly, back when versions were pinned, it did actually test for that in a way; at least the tutorial/service tests verified it, I think. Maybe this change is really where the gap was introduced, and the uv.lock part has confused it.

As a suggestion: if, alongside your test here, we also made the tutorials run nightly but from the pinned version, would that fully address the uv.lock concern as well as serve as the full user-facing test?
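
As a rough illustration of the "did the published package install and import?" check discussed in this thread, the following sketch uses only the standard library. It is an assumption about what such a smoke test could look like, not part of the PR; `installed_version` and `smoke_check` are hypothetical helpers, and the model-loading step suggested above is deliberately left out because it needs a model pack.

```python
import importlib
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def smoke_check(dist_name: str, module_name: str) -> bool:
    """True if the distribution is installed and its module can be imported."""
    if installed_version(dist_name) is None:
        return False
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True
```

In a fresh environment, after installing the package from PyPI, a nightly job could run something like `assert smoke_check("medcat", "medcat")`; this only verifies installability and importability, not model behaviour.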

+          uv run --python ${{ matrix.python-version }} python -m pip install --upgrade pip
+          # install cpu-only torch for MacOS
+          if [[ "$RUNNER_OS" == "macOS" ]]; then
+            uv run --python ${{ matrix.python-version }} python -m pip install torch --index-url https://download.pytorch.org/whl/cpu
+          fi
+          uv run --python ${{ matrix.python-version }} python -m pip install ".[spacy,deid,meta-cat,rel-cat,dict-ner,dev]"
+      - name: Check types
+        run: |
+          uv run --python ${{ matrix.python-version }} python -m mypy --follow-imports=normal medcat
+      - name: Ruff linting
+        run: |
+          uv run --python ${{ matrix.python-version }} python -m ruff check medcat --preview
+      - name: Test
+        run: |
+          uv run --python ${{ matrix.python-version }} python -m unittest discover
+        timeout-minutes: 45
+
+      - name: Model regression
+        run: |
+          uv run --python ${{ matrix.python-version }} bash tests/backwards_compatibility/run_current.sh
10 changes: 7 additions & 3 deletions medcat-v2/medcat/utils/cdb_state.py
@@ -3,6 +3,7 @@
 from typing import TypedDict, cast
 import tempfile
 import dill
+import os

 from copy import deepcopy

@@ -216,7 +217,10 @@ def on_disk_memory_capture(cdb):
     Yields:
         None
     """
-    with tempfile.NamedTemporaryFile() as tf:
-        save_cdb_state(cdb, tf.name)
+    # NOTE: using temporary directory so that it also works on Windows
+    # otherwise you can't reopen a temporary file in Windows (apparently)
+    with tempfile.TemporaryDirectory() as temp_dir:
+        temp_file_name = os.path.join(temp_dir, "cdb_state.dat")
+        save_cdb_state(cdb, temp_file_name)
         yield
-        load_and_apply_cdb_state(cdb, tf.name)
+        load_and_apply_cdb_state(cdb, temp_file_name)
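
The pattern in this change can be shown in isolation. On Windows, a file created by `tempfile.NamedTemporaryFile` generally cannot be opened a second time by name while the original handle is still open, whereas a path inside a `TemporaryDirectory` is not held open by anything. The sketch below is a stand-alone illustration under that assumption; `temp_file_path` is a hypothetical helper, not a MedCAT function.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temp_file_path(file_name: str = "state.dat") -> Iterator[str]:
    """Yield a path inside a fresh temporary directory.

    Nothing holds the file open here, so callers may open it by name as
    often as needed; the whole directory is removed on exit.
    """
    with tempfile.TemporaryDirectory() as temp_dir:
        yield os.path.join(temp_dir, file_name)


with temp_file_path() as path:
    with open(path, "wb") as f:  # first open: write the snapshot
        f.write(b"snapshot")
    with open(path, "rb") as f:  # second open by name: read it back
        restored = f.read()
```

The file itself is never deleted explicitly; cleanup happens when the directory context exits, which is the same lifetime the original named temporary file had.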
7 changes: 6 additions & 1 deletion medcat-v2/pyproject.toml
@@ -57,7 +57,8 @@ classifiers = [
 # For an analysis of this field vs pip's requirements files see:
 # https://packaging.python.org/discussions/install-requires-vs-requirements/
 dependencies = [ # Optional
-    "numpy>2.0",
+    "numpy>=2.1; python_version >= '3.13'",
+    "numpy>=2.0; python_version < '3.13'",
     "dill",
     "pandas>=2.2,<3.0",
     "tqdm>=4.64,<5.0",
@@ -102,6 +103,8 @@ dict_ner = [
 ]
 deid = [
     "datasets>=2.2.2,<3.0.0",
+    # Transformers 4.57 doesn't support 3.9
+    "transformers!=4.57.0; python_version == '3.9'",
     "transformers>=4.41.0,<5.0", # avoid major bump
     # Transformers 4.57 doesn't support 3.9
     "transformers!=4.57.0; python_version == '3.9'",
@@ -112,6 +115,8 @@ deid = [
     "scipy>=1.14; python_version >= '3.13'",
 ]
 rel_cat = [
+    # Transformers 4.57 doesn't support 3.9
+    "transformers!=4.57.0; python_version == '3.9'",
     "transformers>=4.41.0,<5.0", # avoid major bump
     # Transformers 4.57 doesn't support 3.9
     "transformers!=4.57.0; python_version == '3.9'",
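
To make the effect of the two marker-guarded numpy entries in the `dependencies` hunk explicit, here is a small sketch that mirrors the selection in plain Python. Installers evaluate the markers themselves; `numpy_requirement` is a hypothetical function used only for illustration. Comparing version tuples (rather than strings) matters because `"3.9" < "3.13"` is false as a string comparison but true as a version comparison.

```python
import sys
from typing import Tuple


def numpy_requirement(python_version: Tuple[int, int]) -> str:
    """Mirror the two marker-guarded numpy entries for a given Python version."""
    if python_version >= (3, 13):
        return "numpy>=2.1"
    return "numpy>=2.0"


# Requirement that would apply to the interpreter running this snippet
current = numpy_requirement(sys.version_info[:2])
```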
23 changes: 21 additions & 2 deletions medcat-v2/tests/components/helper.py
@@ -40,13 +40,32 @@ def setUpClass(cls):
         cls.vtokenizer = FTokenizer()
         cls.comp_cnf: ComponentConfig = getattr(
             cls.cnf.components, cls.comp_type.name)
+        if isinstance(cls.default_creator, Type):
+            cls._def_creator_name_opts = (cls.default_creator.__name__,)
+        else:
+            # classmethod
+            cls._def_creator_name_opts = (".".join((
+                # either class.method_name
+                cls.default_creator.__self__.__name__,
+                cls.default_creator.__name__)),
+                # or just method_name
+                cls.default_creator.__name__
+            )

     def test_has_default(self):
         avail_components = types.get_registered_components(self.comp_type)
         self.assertEqual(len(avail_components), self.expected_def_components)
         name, cls_name = avail_components[0]
-        self.assertEqual(name, self.default)
-        self.assertIs(cls_name, self.default_creator.__name__)
+        # 1 name / cls name
+        eq_name = [name == self.default for name, _ in avail_components]
+        eq_cls = [cls_name in self._def_creator_name_opts
+                  for _, cls_name in avail_components]
+        self.assertEqual(sum(eq_name), 1)
+        # NOTE: for NER both the default as well as the Dict based NER
+        # have the same class name, so may be more than 1
+        self.assertGreaterEqual(sum(eq_cls), 1)
+        # needs to have the same class where name is equal
+        self.assertTrue(eq_cls[eq_name.index(True)])

     def test_can_create_def_component(self):
         component = types.create_core_component(
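
The name-matching logic added to `setUpClass` above relies on the fact that a bound classmethod exposes its owning class via `__self__`. The following stand-alone sketch shows the idea outside the test suite; `creator_name_options` and `Component` are hypothetical stand-ins, and the sketch checks against the built-in `type` rather than `typing.Type`.

```python
from typing import Callable, Tuple, Union


def creator_name_options(creator: Union[type, Callable]) -> Tuple[str, ...]:
    """Names under which a default creator might be registered."""
    if isinstance(creator, type):
        # a plain class: only the class name is possible
        return (creator.__name__,)
    # a bound classmethod: __self__ is the class it is bound to
    owner = creator.__self__.__name__
    return (f"{owner}.{creator.__name__}", creator.__name__)


class Component:  # hypothetical stand-in for a registered component
    @classmethod
    def create_new(cls) -> "Component":
        return cls()
```

With these definitions, `creator_name_options(Component)` yields a single name, while `creator_name_options(Component.create_new)` yields both the qualified and the bare method name, which is why the test accepts either.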
5 changes: 4 additions & 1 deletion medcat-v2/tests/components/ner/trf/test_transformers_ner.py
@@ -14,12 +14,13 @@
 from medcat.model_creation.cdb_maker import CDBMaker
 from transformers import TrainerCallback

-from unittest import TestCase
+from unittest import TestCase, skipIf
 import unittest.mock

 from ...addons.meta_cat.test_meta_cat import FakeTokenizer
 from ....pipeline.test_pipeline import FakeCDB, Config
 from .... import RESOURCES_PATH
+from ....utils.ner.test_deid import is_macos_on_ci


 class TransformersNERTests(TestCase):
@@ -280,6 +281,8 @@ def test_ignore_extra_labels(self):
         )


+@skipIf(not is_macos_on_ci(),
+        "MacOS on workflow doesn't have enough memory")
 class AdditionalTransfromersNERTests(TestCase):
     TOKENIZER = FakeTokenizer()
     CNF = ConfigTransformersNER()
11 changes: 10 additions & 1 deletion medcat-v2/tests/utils/ner/test_deid.py
@@ -36,6 +36,10 @@
 cnf.general.nlp.provider = 'spacy'


+def is_macos_on_ci() -> bool:
+    return os.getenv("RUNNER_OS", "None").lower() != "macos"
+
+
 def _get_def_cdb():
     return CDB(config=cnf)

@@ -112,13 +116,16 @@ def _train_model_once() -> tuple[tuple[Any, Any, Any], deid.DeIdModel]:
     return retval, model


-_TRAINED_MODEL_AND_INFO = _train_model_once()
+if is_macos_on_ci():
+    _TRAINED_MODEL_AND_INFO = _train_model_once()


 def train_model_once() -> tuple[tuple[Any, Any, Any], deid.DeIdModel]:
     return _TRAINED_MODEL_AND_INFO


+@unittest.skipIf(not is_macos_on_ci(),
+                 "MacOS on workflow doesn't have enough memory")
Collaborator

Minor: could you alternatively check whether the system has enough available memory? I'm guessing this change will stop the test running on your Mac locally as well.

Collaborator Author

Yes, I could check for available memory. But I don't know exactly how much memory is necessary, so I didn't want to put in a number that I didn't trust.

That said, because the method checks the RUNNER_OS environment variable (rather than just the OS), and that isn't (normally) set on a regular system (it certainly isn't on mine), the tests can still be run locally.
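
Worth noting when reading the diff: as shown, `is_macos_on_ci()` returns True when `RUNNER_OS` is anything other than macOS (including unset), and the call sites negate it, so the behaviour matches this explanation even though the name reads the other way round. A sketch of the same check with the positive sense is given below; `is_macos_ci_runner` and `HeavyMemoryTests` are hypothetical names, and the only assumption is that GitHub Actions sets `RUNNER_OS` to `macOS` on macOS runners.

```python
import os
import unittest
from typing import Mapping


def is_macos_ci_runner(env: Mapping[str, str] = os.environ) -> bool:
    """True only when RUNNER_OS (set by GitHub Actions) says macOS."""
    return env.get("RUNNER_OS", "").lower() == "macos"


@unittest.skipIf(is_macos_ci_runner(),
                 "macOS CI runner doesn't have enough memory")
class HeavyMemoryTests(unittest.TestCase):  # hypothetical test class
    def test_placeholder(self) -> None:
        self.assertTrue(True)
```

Because the check is keyed on the environment variable rather than on `platform.system()`, a developer running the suite on a local Mac, where `RUNNER_OS` is normally unset, still runs the heavy tests.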

 class DeIDModelTests(unittest.TestCase):
     save_folder = os.path.join("results", "final_model")

@@ -171,6 +178,8 @@ def test_add_new_concepts(self):
 ''' # noqa


+@unittest.skipIf(not is_macos_on_ci(),
+                 "MacOS on workflow doesn't have enough memory")
 class DeIDModelWorks(unittest.TestCase):
     save_folder = os.path.join("results", "final_model")

11 changes: 6 additions & 5 deletions medcat-v2/tests/utils/test_cdb_state.py
@@ -113,18 +113,19 @@ def test_state_restored(self):

 class StateSavedOnDiskTests(StateSavedTests):
     on_disk = True
-    _named_tempory_file = tempfile.NamedTemporaryFile
+    _named_tempory_directory = tempfile.TemporaryDirectory

     @classmethod
     def saved_name_temp_file(cls):
-        tf = cls._named_tempory_file()
-        cls.temp_file_name = tf.name
+        tf = cls._named_tempory_directory()
+        cls.temp_file_name = os.path.join(tf.name, "cdb_state.dat")
         return tf

     @classmethod
     def setUpClass(cls) -> None:
-        with mock.patch("builtins.open", side_effect=open) as cls.popen:
-            with mock.patch("tempfile.NamedTemporaryFile",
+        with mock.patch("medcat.utils.cdb_state.open", side_effect=open
+                        ) as cls.popen:
+            with mock.patch("tempfile.TemporaryDirectory",
                             side_effect=cls.saved_name_temp_file) as cls.pntf:
                 return super().setUpClass()
