Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: test docker CIL #838

Merged
merged 11 commits into from
Oct 13, 2023
Merged

CI: test docker CIL #838

merged 11 commits into from
Oct 13, 2023

Conversation

@casperdcl casperdcl self-assigned this Sep 12, 2023
@casperdcl
Copy link
Member Author

@paskino I'm not sure this is actually required, since we now have RUN_CTEST

if [ "$RUN_CTEST" = 1 ]; then

paskino
paskino previously approved these changes Sep 12, 2023
Copy link
Contributor

@paskino paskino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's very good. It picks the version from version_config.cmake and uses that.

@KrisThielemans
Copy link
Member

I think it's very good. It picks the version from version_config.cmake and uses that.

but we're getting CIL via conda...

@KrisThielemans
Copy link
Member

Error is

++ docker run --rm synerbi/sirf:service python -c 'import cil; print(cil.version.commit_hash[1:])'
error: exec: "python": executable file not found in $PATH
+ DEFAULT_CIL_TAG='Creating sirfuser:1000:1000
Adding user `sirfuser'\'' to group `users'\'' ...
Adding user sirfuser to group users
Done.
Updating file ownership for /home/sirfuser
Switching to sirfuser and executing python -c import cil; print(cil.version.commit_hash[1:])'

So, aside from not finding python, the docker run adds lots of other things to the output. I guess it could be avoided by doing a dummy docker run first, such that sirfuser is already created. Sigh

@casperdcl
Copy link
Member Author

btw how urgent is fixing this?

I was thinking of reworking/simplifing the docker stuff... but I expect it would take some time to get a more robust solution working.

@paskino
Copy link
Contributor

paskino commented Sep 18, 2023

We would like to merge this in and then shortly release SIRF 3.5.0 with relevant tested docker images.

@KrisThielemans
Copy link
Member

Or we don't merge it in, and test the docker images manually.

mkdir cil_sirf_test
touch cil_sirf_test/__init__.py
# DEFAULT_CIL_TAG=$(sed -nr 's/^\s*set\(DEFAULT_CIL_TAG (\w+)\)/\1/p' version_config.cmake)
DEFAULT_CIL_TAG=$(docker run --rm synerbi/sirf:service /opt/conda/bin/python -c 'import cil; print(cil.version.commit_hash[1:])' | tail -n1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to CIL_TAG or CIL_HASH

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to leave it as DEFAULT_CIL_TAG because that's what it's called in version_config.cmake (so it's easy to grep for all usages in future)... wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, the reason for this fix is that it might be different from the DEFAULT_CIL_TAG if we install from conda (otherwise we could have simple used version_config.cmake)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put another way, it is the "actual" CIL Tag

- name: Build docker image
# Builds docker image from Docker file.
shell: bash -l {0}
run: |
set -ex;
# ${{ env.compose_command }} pull sirf
${{ env.compose_command }} build --pull sirf


- name: Test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- name: Test
- name: Test CIL

@casperdcl
Copy link
Member Author

casperdcl commented Sep 20, 2023

first problem: cil.version is borked. Hack for now:

docker run --rm -it synerbi/sirf:service /opt/conda/bin/python
>>> import cil
>>> cil.version.version
'23.0.1'
>>> cil.version.commit_hash  # borked
'-1'
>>> f'v{cil.version.version}' if cil.version.commit_hash == '-1' else cil.version.commit_hash
'v23.0.1'

second problem: sirf isn't found

docker run --rm -v ./cil_sirf_test:/cil_sirf_test synerbi/sirf:service /opt/conda/bin/python -m unittest discover -v /cil_sirf_test
...
----------------------------------------------------------------------
TEST SYSTEM CONFIGURATION
CIL version:  23.0.1
{'has_astra': True,
 'has_ccpi_regularisation': False,
 'has_cvxpy': False,
 'has_ipp': True,
 'has_numba': True,
 'has_nvidia': False,
 'has_tigre': True,
 'has_tomophantom': False}
----------------------------------------------------------------------

test_Gradient (test_SIRF.TestGradientMR_2D) ... skipped 'Skipping as SIRF is not available'
test_TVdenoisingMR (test_SIRF.TestGradientMR_2D) ... skipped 'Has SIRF'
test_Gradient (test_SIRF.TestGradientPET_2D) ... skipped 'Skipping as SIRF is not available'
test_Gradient (test_SIRF.TestGradientPET_3D) ... skipped 'Skipping as SIRF is not available'
test_BlockDataContainer_with_SIRF_DataContainer_add (test_SIRF.TestSIRFCILIntegration) ... skipped 'Has SIRF'
test_BlockDataContainer_with_SIRF_DataContainer_divide (test_SIRF.TestSIRFCILIntegration) ... skipped 'Has SIRF'
test_BlockDataContainer_with_SIRF_DataContainer_multiply (test_SIRF.TestSIRFCILIntegration) ... skipped 'Has SIRF'
test_BlockDataContainer_with_SIRF_DataContainer_subtract (test_SIRF.TestSIRFCILIntegration) ... skipped 'Has SIRF'

----------------------------------------------------------------------
Ran 8 tests in 0.000s

OK (skipped=8)

third problem: PYTHON_INSTALL_DIR should be replaced with /opt/conda but for some reason isn't

docker run --rm synerbi/sirf:service cat .bashrc | grep activate
[ -f PYTHON_INSTALL_DIR/bin/activate ] && . PYTHON_INSTALL_DIR/bin/activate

hack for now:

docker run --rm -v ./cil_sirf_test:/cil_sirf_test synerbi/sirf:service bash --login -c '/opt/conda/bin/python -m unittest discover -v /cil_sirf_test'

fourth problem: gadgetron isn't running

Started reading acquisitions from /opt/SIRF-SuperBuild/INSTALL/share/SIRF-3.5/data/examples/MR/simulated_MR_2D_cartesian.h5
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%..
Finished reading acquisitions from /opt/SIRF-SuperBuild/INSTALL/share/SIRF-3.5/data/examples/MR/simulated_MR_2D_cartesian.h5
connection to gadgetron server failed, trying again...
...
connection to gadgetron server failed
ERROR
...
----------------------------------------------------------------------
Ran 8 tests in 7.271s

FAILED (errors=6)

hack for now:

#file: ./cil_sirf_test/run.sh
this=$(dirname "${BASH_SOURCE[0]}")
$SIRF_PATH/../../INSTALL/bin/gadgetron >& ~/gadgetron.log&
/opt/conda/bin/python -m unittest discover -v $this
for i in $(jobs -p); do kill -n 15 $i; done 2>/dev/null
docker run --rm -v ./cil_sirf_test:/cil_sirf_test synerbi/sirf:service bash --login /cil_sirf_test/run.sh

fifth problem: AttributeError: 'ImageData' object has no attribute 'get_item'

Started reading acquisitions from /opt/SIRF-SuperBuild/INSTALL/share/SIRF-3.5/data/examples/MR/simulated_MR_2D_cartesian.h5
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%..
Finished reading acquisitions from /opt/SIRF-SuperBuild/INSTALL/share/SIRF-3.5/data/examples/MR/simulated_MR_2D_cartesian.h5
Message received with ID: 5
Input stream has terminated
Message received with ID: 5
Input stream has terminated
WARNING:root:C backend is only for arrays of datatype float32 - defaulting to `numpy` backend
ok
test_TVdenoisingMR (test_SIRF.TestGradientMR_2D) ... Started reading acquisitions from /opt/SIRF-SuperBuild/INSTALL/share/SIRF-3.5/data/examples/MR/simulated_MR_2D_cartesian.h5
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%..
Finished reading acquisitions from /opt/SIRF-SuperBuild/INSTALL/share/SIRF-3.5/data/examples/MR/simulated_MR_2D_cartesian.h5
Message received with ID: 5
Input stream has terminated
Message received with ID: 5
Input stream has terminated
WARNING:root:C backend is only for arrays of datatype float32 - defaulting to `numpy` backend
...
WARNING:root:C backend is only for arrays of datatype float32 - defaulting to `numpy` backend
ok
test_Gradient (test_SIRF.TestGradientPET_2D) ... ok
test_Gradient (test_SIRF.TestGradientPET_3D) ... ok
test_BlockDataContainer_with_SIRF_DataContainer_add (test_SIRF.TestSIRFCILIntegration) ... ERROR
test_BlockDataContainer_with_SIRF_DataContainer_divide (test_SIRF.TestSIRFCILIntegration) ... ERROR
test_BlockDataContainer_with_SIRF_DataContainer_multiply (test_SIRF.TestSIRFCILIntegration) ... ERROR
test_BlockDataContainer_with_SIRF_DataContainer_subtract (test_SIRF.TestSIRFCILIntegration) ... ERROR

======================================================================
ERROR: test_BlockDataContainer_with_SIRF_DataContainer_add (test_SIRF.TestSIRFCILIntegration)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/cil_sirf_test/test_SIRF.py", line 343, in test_BlockDataContainer_with_SIRF_DataContainer_add
    self.assertBlockDataContainerEqual(bdc , bdc1)
  File "/cil_sirf_test/testclass.py", line 40, in assertBlockDataContainerEqual
    self.assertBlockDataContainerEqual(container1.get_item(col),container2.get_item(col))
  File "/cil_sirf_test/testclass.py", line 34, in assertBlockDataContainerEqual
    if issubclass(container1.get_item(col).__class__, DataContainer):
AttributeError: 'ImageData' object has no attribute 'get_item'

======================================================================
ERROR: test_BlockDataContainer_with_SIRF_DataContainer_divide (test_SIRF.TestSIRFCILIntegration)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/cil_sirf_test/test_SIRF.py", line 298, in test_BlockDataContainer_with_SIRF_DataContainer_divide
    self.assertBlockDataContainerEqual(bdc , bdc1)
  File "/cil_sirf_test/testclass.py", line 40, in assertBlockDataContainerEqual
    self.assertBlockDataContainerEqual(container1.get_item(col),container2.get_item(col))
  File "/cil_sirf_test/testclass.py", line 34, in assertBlockDataContainerEqual
    if issubclass(container1.get_item(col).__class__, DataContainer):
AttributeError: 'ImageData' object has no attribute 'get_item'

======================================================================
ERROR: test_BlockDataContainer_with_SIRF_DataContainer_multiply (test_SIRF.TestSIRFCILIntegration)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/cil_sirf_test/test_SIRF.py", line 319, in test_BlockDataContainer_with_SIRF_DataContainer_multiply
    self.assertBlockDataContainerEqual(bdc , bdc1)
  File "/cil_sirf_test/testclass.py", line 40, in assertBlockDataContainerEqual
    self.assertBlockDataContainerEqual(container1.get_item(col),container2.get_item(col))
  File "/cil_sirf_test/testclass.py", line 34, in assertBlockDataContainerEqual
    if issubclass(container1.get_item(col).__class__, DataContainer):
AttributeError: 'ImageData' object has no attribute 'get_item'

======================================================================
ERROR: test_BlockDataContainer_with_SIRF_DataContainer_subtract (test_SIRF.TestSIRFCILIntegration)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/cil_sirf_test/test_SIRF.py", line 363, in test_BlockDataContainer_with_SIRF_DataContainer_subtract
    self.assertBlockDataContainerEqual(bdc , bdc1)
  File "/cil_sirf_test/testclass.py", line 40, in assertBlockDataContainerEqual
    self.assertBlockDataContainerEqual(container1.get_item(col),container2.get_item(col))
  File "/cil_sirf_test/testclass.py", line 34, in assertBlockDataContainerEqual
    if issubclass(container1.get_item(col).__class__, DataContainer):
AttributeError: 'ImageData' object has no attribute 'get_item'

----------------------------------------------------------------------
Ran 8 tests in 35.843s

FAILED (errors=4)

KMN

@KrisThielemans
Copy link
Member

I had to look that abbrev up (although I guessed correctly)!

some of these seem to tie up with the experience of #827 (comment) and other difficulties experienced there.

No idea about the get_item problem

@casperdcl casperdcl dismissed paskino’s stale review September 20, 2023 18:36

failing tests need your review

@paskino
Copy link
Contributor

paskino commented Sep 21, 2023

The get_item problem has been fixed as of TomographicImaging/CIL#1481 or better TomographicImaging/CIL#1502 which means that the minimum commit of CIL that you can use is 0ba2f8e2935db66ec3dfe8c88e407e8d0eb5609f which is exactly the hash that is in

set(DEFAULT_CIL_TAG 0ba2f8e2935db66ec3dfe8c88e407e8d0eb5609f)

Notice that the bug is in test_SIRF.py rather than in CIL itself. This mess will disappear once we tag CIL.

@casperdcl
Copy link
Member Author

casperdcl commented Sep 21, 2023

You mean the CIL version installed (v23.0.1) has nothing to do with the version specified in version_config.cmake (0ba2f8e2)? Then what's the point of the latter?

@KrisThielemans
Copy link
Member

Right, that'll be then because we get the conda version of CIL on docker. So, a few choices:

  • give up on this test for 3.5
  • set BUILD_CIL=ON and somehow prevent to install CIL via conda on top (i.e. don't use SIRF-Exercises/environment.yml at all, or edit it first.). This is an interesting problem in itself: if you build a package yourself, how do you prevent conda messing it up.
  • make a CIL-dev conda version
  • release a new version of CIL

which one do you prefer?

@KrisThielemans
Copy link
Member

You mean the CIL version installed (v23.0.1) has nothing to do with the version specified in version_config.cmake (0ba2f8e2)? Then what's the point of the latter?

see 2nd bullet above. version_config.cmake is used when using the SB with BUILD_CIL=ON. Not when doing conda install cil

@paskino
Copy link
Contributor

paskino commented Sep 21, 2023

  • give up on this test for 3.5
  • set BUILD_CIL=ON and somehow prevent to install CIL via conda on top (i.e. don't use SIRF-Exercises/environment.yml at all, or edit it first.). This is an interesting problem in itself: if you build a package yourself, how do you prevent conda messing it up.
  • make a CIL-dev conda version

This exists, the install command is about conda install cil -c ccpi/label/dev

  • release a new version of CIL

A new version of CIL should be out by the 27/9/23.

IMHO all this mess will go away once we tag and release CIL, so I suggest to wait until then.

@KrisThielemans
Copy link
Member

ok. Of course, we will have the same problem later. I think what this is saying (no surprise): either use conda or use the SuperBuild, but don't try to mix.

@casperdcl
Copy link
Member Author

casperdcl commented Sep 22, 2023

@KrisThielemans
Copy link
Member

I have no idea really, but looks interesting.

We started with the SB because lots of things were not available on various systems (Gadgetron, ISMRMD, STIR, ...). That is less and less the case (although most of these are on conda only). Another reason was to have tight version control. I'm not sure how scikit-build would resolve external dependencies.

I'm happy to go with whatever makes our life easier (but serves the needs).

This needs a serious discussion of course.

@casperdcl
Copy link
Member Author

fyi this PR is blocked on TomographicImaging/CIL#1509 :)

@KrisThielemans
Copy link
Member

that' s a long checklist!

@paskino
Copy link
Contributor

paskino commented Oct 3, 2023

This seems to need a SIRF update then... (which means branching off from 3.5 to release 3.5.1).

Let's then proceed with tagging the SB with 3.5 without this PR and tackle this later.

BTW, we are writing a list of residual incompatibilities between SIRF and CIL. We added to CIL a number of hacks, which we would like to remove, to overcome these incompatibilities, TomographicImaging/CIL#1516

@KrisThielemans
Copy link
Member

Let's then proceed with tagging the SB with 3.5 without this PR and tackle this later.

I don't know... I guess you'd want the freshly released CIL in our version_config.cmake. That'd presumably make GHA fail (or at least, I'd hope it would. TBH, I don't understand why it didn't, unless the current CIL tag didn't include that test).

We could leave the CIL version as-is and then release 3.5.1 soon after which would presumably

  • use SIRF 3.5.1 with fixing the above
  • use CIL from today
  • switch BUILD_CIL=ON in docker, and remove using the SIRF- exercises environment.yml.

In any case, these steps seems minor (aside from probably the last one...) and need to be completed ASAP. If they are indeed minor, then releasing 3.5.0 as-is might be double work. Up to you.

@KrisThielemans
Copy link
Member

Of course, there are 2 issues on our milestone list that do need to be resolved. https://github.com/SyneRBI/SIRF-SuperBuild/milestone/14

@paskino paskino added this to the v3.6 milestone Oct 4, 2023
@casperdcl casperdcl mentioned this pull request Oct 9, 2023
@casperdcl casperdcl modified the milestones: v3.6, v3.5 Oct 11, 2023
casperdcl and others added 8 commits October 11, 2023 20:47
still need to do error checking nad renaming to CIL_HASH, but worth seeing what it does

Co-authored-by: Casper da Costa-Luis <casper.dcl@physics.org>
- fix `cil.version`
- fix missing `sirf`
- fix missing `python`
- fix missing `gadgetron`
@casperdcl
Copy link
Member Author

casperdcl commented Oct 11, 2023

ok so

should be gewd for SIRF 3.5.0?

@casperdcl casperdcl merged commit 5f394eb into master Oct 13, 2023
12 checks passed
@casperdcl casperdcl deleted the docker-CIL-tests branch October 13, 2023 09:51
@paskino paskino mentioned this pull request Oct 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

jupyter missing in latest docker pre-built service image
3 participants