build(medcat): CU-869aujr7h Add nightly workflow to check library stability #171
New file: .github/workflows/medcat-v2-lib-stabiliy.yml (58 lines added)

```yaml
name: MedCAT-nightly-stability-check
permissions:
  contents: read
on:
  schedule:
    - cron: "0 3 * * *" # every day at 3am UTC
  workflow_dispatch: # allow manual runs
  pull_request:
    paths:
      - ".github/workflows/medcat-v2-lib-stabiliy.yml"

defaults:
  run:
    working-directory: ./medcat-v2

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
```
Collaborator:
From a design point of view, I'm thinking you can generally make this as close to the main workflow as possible, keeping it simple and easy to maintain. (Trying all the other OSes here rather than in main is a good difference to have, though.) But taking the Test step as an example, can we make the two completely identical? Either change main or this one - I don't want to have to think about whether timeout-minutes is better than timeout, as an example of keeping it simple.

Author:
The only reason that specific step was changed is that it didn't work on the Windows runner, since it doesn't have the timeout command. From a brief look, we should be able to fully modularise this, i.e. have the normal workflow steps (linting, typing, tests) in a separate workflow file that never runs on its own, and use it from both the main workflow and this one with slightly different inputs. I think it's probably worth doing, but I'd leave it for another task.
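As an illustration of the modularisation idea above (not part of this PR), the shared steps could live in a reusable workflow invoked via workflow_call. This is a minimal sketch; the file name, input names, and the exact steps included are assumptions, not anything defined in this repository.

```yaml
# .github/workflows/medcat-v2-shared-checks.yml (hypothetical name) - never runs on its own
name: MedCAT-shared-checks
on:
  workflow_call:
    inputs:
      os:
        required: true
        type: string
      python-version:
        required: true
        type: string

jobs:
  checks:
    runs-on: ${{ inputs.os }}
    defaults:
      run:
        working-directory: ./medcat-v2
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v3
        with:
          python-version: ${{ inputs.python-version }}
      # linting, typing and tests would live here once, identical for main and nightly
      - name: Test
        run: uv run --python ${{ inputs.python-version }} python -m unittest discover
        timeout-minutes: 45
```

The main and nightly workflows would then each call it from a job with `uses: ./.github/workflows/medcat-v2-shared-checks.yml`, passing their own matrix values via `with:`.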
```yaml
        python-version: [ "3.10", "3.11", "3.12", "3.13"]

    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v3
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install with latest deps
        shell: bash
        run: |
          uv run --python ${{ matrix.python-version }} python -m ensurepip
```
Collaborator:
As a wildcard for another time (if ever): this reads the pyproject file and the Python in this folder, right? So it effectively says "the GitHub project is still working based on what is in main". Are we able to go a layer higher and do something extra to say "the PyPI library is working"? In other words, really test for "can users probably install and use MedCAT today?" with some super simple test.

Author:
Yes, this reads the dependencies from the pyproject file. We could add something to test a PyPI-based install as well, but I'm not sure it adds a lot. It sounds like testing PyPI's infrastructure and/or wheel mechanics at that point, and I feel like that's someone else's responsibility. Perhaps a better option would be an isolated test in the production / release workflow: something that tests that installation of the wheel does in fact work as expected.

Collaborator:
I'm coming more from the perspective of what is actually being verified. Right now there's nothing that checks for the case of "I followed your readme instructions today and it worked". This test as it stands starts with "I checked out your main branch and ran unit tests", which isn't quite the same thing. Interestingly, back when it was pinned versions it did actually test for this in a way - at least the tutorial/service tests verified it, I think. Maybe this change is really where the gap was introduced, and the uv.lock part has confused it. As a suggestion: if, alongside your test here, we also made the tutorials run nightly but from the pinned version, would that solve the uv.lock concern as well as being the full user-facing test?
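Purely for illustration (not part of this PR), a user-facing check along the lines discussed above could install the published package from PyPI and run a trivial import. The job name, the assumption that the published package is called "medcat", and the bare-import smoke test are all assumptions here, not anything this repository defines.

```yaml
# Hypothetical additional nightly job: checks that "pip install medcat" works for users today.
  pypi-smoke-test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12", "3.13"]
    steps:
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install released package from PyPI
        run: python -m pip install medcat  # assumes the published package name is "medcat"
      - name: Import smoke test
        run: python -c "import medcat"  # only checks the installed wheel is importable
```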
```yaml
          uv run --python ${{ matrix.python-version }} python -m pip install --upgrade pip
          # install cpu-only torch for MacOS
          if [[ "$RUNNER_OS" == "macOS" ]]; then
            uv run --python ${{ matrix.python-version }} python -m pip install torch --index-url https://download.pytorch.org/whl/cpu
          fi
          uv run --python ${{ matrix.python-version }} python -m pip install ".[spacy,deid,meta-cat,rel-cat,dict-ner,dev]"
      - name: Check types
        run: |
          uv run --python ${{ matrix.python-version }} python -m mypy --follow-imports=normal medcat
      - name: Ruff linting
        run: |
          uv run --python ${{ matrix.python-version }} python -m ruff check medcat --preview
      - name: Test
        run: |
          uv run --python ${{ matrix.python-version }} python -m unittest discover
        timeout-minutes: 45

      - name: Model regression
        run: |
          uv run --python ${{ matrix.python-version }} bash tests/backwards_compatibility/run_current.sh
```
Changed file: DeID model test module (Python; file path not shown in this diff view)

```diff
@@ -36,6 +36,10 @@
 cnf.general.nlp.provider = 'spacy'


+def is_macos_on_ci() -> bool:
+    return os.getenv("RUNNER_OS", "None").lower() != "macos"
+
+
 def _get_def_cdb():
     return CDB(config=cnf)


@@ -112,13 +116,16 @@ def _train_model_once() -> tuple[tuple[Any, Any, Any], deid.DeIdModel]:
     return retval, model


-_TRAINED_MODEL_AND_INFO = _train_model_once()
+if is_macos_on_ci():
+    _TRAINED_MODEL_AND_INFO = _train_model_once()


 def train_model_once() -> tuple[tuple[Any, Any, Any], deid.DeIdModel]:
     return _TRAINED_MODEL_AND_INFO


+@unittest.skipIf(not is_macos_on_ci(),
+                 "MacOS on workflow doesn't have enough memory")
```
Collaborator:
Minor, but could you alternatively check whether the system has enough available memory? I'm guessing this change will also stop the test running on your Mac locally.

Author:
Yes, I could check for available memory, but I don't know exactly how much memory is needed, and as such I didn't want to put in a number I didn't trust. With that said, because the method checks the RUNNER_OS environment variable, it only takes effect on the CI runners; the tests will still run locally on a Mac.
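For what it's worth, a memory-based skip along the lines suggested above could look roughly like the sketch below. This is not part of the PR: psutil is assumed to be available (it is not necessarily a dependency of these tests), and the 16 GiB threshold is an arbitrary placeholder, since the actual memory needed for DeID training isn't known.

```python
import unittest

import psutil  # assumed available; not currently a dependency of these tests

# Placeholder threshold - the real memory requirement of DeID training is unknown here.
_MIN_AVAILABLE_BYTES = 16 * 1024 ** 3  # 16 GiB


def has_enough_memory() -> bool:
    """Return True if the machine currently has enough free memory for DeID training."""
    return psutil.virtual_memory().available >= _MIN_AVAILABLE_BYTES


@unittest.skipIf(not has_enough_memory(),
                 "Not enough available memory for DeID model training")
class DeIDModelTests(unittest.TestCase):
    # ... existing tests unchanged ...
    pass
```

This would skip the tests on any low-memory machine (CI or local) rather than keying off the runner OS, at the cost of having to pick a trustworthy threshold.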
```diff
 class DeIDModelTests(unittest.TestCase):
     save_folder = os.path.join("results", "final_model")


@@ -171,6 +178,8 @@ def test_add_new_concepts(self):
     ''' # noqa


+@unittest.skipIf(not is_macos_on_ci(),
+                 "MacOS on workflow doesn't have enough memory")
 class DeIDModelWorks(unittest.TestCase):
     save_folder = os.path.join("results", "final_model")
```
Collaborator:
Happy to try this at 3am for now to see what it's like. Personally, I might suggest running it during working hours, or just before 9am anyway - it's a trade-off between when having the email come in is useful and slowing down other builds.

Author:
The reason I wanted it overnight is so that it doesn't disturb other work. This spawns 4x3 = 12 workflows, all of which take quite a while (20-30 minutes). If this were to happen during work hours, it might cause workflows for active work to be queued. So I figured I'd run it at night, when no other work is happening, and deal with it in the morning if I have to.
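For reference only, moving the run to just before 9am UTC as suggested would only mean changing the cron line in the schedule trigger, along these lines:

```yaml
on:
  schedule:
    - cron: "0 8 * * *" # every day at 8am UTC, so results land just before the working day
```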