Integration with Travis-CI #1

Open
jeixav opened this issue Dec 2, 2018 · 33 comments

Comments

@jeixav

jeixav commented Dec 2, 2018

Is it possible to use this Dockerfile with Travis-CI?

I am interested in testing Python-RPN with Travis-CI, but have not managed to make it work yet.

The Dockerfile and .travis.yml files I set up are:

My last Travis build is here: https://travis-ci.com/jeixav/python-rpn/builds/93263102

@neishm
Owner

neishm commented Dec 3, 2018

I've never tried that before, but I think it should be possible.

Looks like the profile is being created at /home/ssm/.profile in the Dockerfile, but it's being referenced as /home/ssm/.profile_armnlib in the .travis.yml recipe?

@neishm
Owner

neishm commented Dec 4, 2018

Never mind, rmnlib-install was recently updated to create a profile at /home/ssm/.profile_armnlib. The other profile being generated in the Dockerfile is no longer necessary. I can see from the build log that .profile_armnlib is being generated, but for some reason .travis.yml is unable to find the file later on. My only guess is it hasn't actually mounted the docker image for the before_script stage...

@neishm
Owner

neishm commented Dec 4, 2018

I think the problem is, pytest has to be invoked within the built docker image. The script section doesn't actually run in this image (as far as I know), it has to be explicitly done through docker run. I have an updated version at neishm/python-rpn@5cf91a3 which gets past the initial failure, however pytest fails for most tests. The main error is:

AttributeError: /home/ssm/ssm-domains-base/lib/linux26-x86-64/lib/Linux_x86-64/gfortran/librmnshared_016.3.1.so: undefined symbol: crc32

If I remove the crc32 section from rpnpy.librmn.proto entirely, then a lot more things pass.

@neishm
Owner

neishm commented Dec 5, 2018

Aaaand, here's the source of the crc32 problem: ECCC-ASTD-MRD/librmn@686e512.

Looks like librmn 16.3 and onward will no longer have a crc32 method.

@neishm
Owner

neishm commented Dec 5, 2018

Just opened a new issue for the missing crc32 symbol (meteokid/python-rpn#17).

I added a try/except block to work around this problem (neishm/python-rpn@2c6a6e5), and now more than half the tests are passing (so, progress?)
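
A minimal sketch of this kind of workaround, assuming the binding is looked up as an attribute on a ctypes library handle (ctypes raises AttributeError for a symbol the library does not export). The helper name resolve_crc32 and the stand-in class are hypothetical, not the code from the commit above, and the zlib.crc32 fallback is the replacement mentioned later in this thread:

```python
import zlib


def resolve_crc32(lib):
    """Return a crc32 callable, preferring the one exported by `lib`.

    `lib` stands in for a ctypes.CDLL handle (hypothetical usage); ctypes
    raises AttributeError when the shared library has no such symbol.
    """
    try:
        return lib.crc32
    except AttributeError:
        # Symbol is missing (as with librmn 16.3): use the standard library.
        # Note: zlib.crc32(data, start) does not have the C routine's signature,
        # so callers would need adapting in a real binding.
        return zlib.crc32


class LibWithoutCrc32:
    """Stand-in for a library handle that lacks the crc32 symbol."""


crc32 = resolve_crc32(LibWithoutCrc32())
checksum = crc32(b"python-rpn")
```

The point of the pattern is that the import of the binding module no longer fails outright; only callers that actually need a checksum are affected.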

The remaining failures are mainly from:

  • A missing c_crc32 method. Expected from the workaround. Affects some of the rpnpy.librmn.grids methods.
  • A missing libburp_c library (not available in rmnlib-install or any public repository yet?)
  • A missing ATM_MODEL_DFILES environment variable, and the associated test files.
  • A missing Cvgd_new_gen2 symbol in libdescrip, probably because rmnlib-install uses an older version of that library

Build log is at https://api.travis-ci.org/v3/job/463976070/log.txt

@neishm
Owner

neishm commented Dec 10, 2018

The crc32 problem is resolved now (switched to zlib.crc32). The vgrid (libdescrip) package has been updated to version 6.4, which resolves the missing symbol. The $ATM_MODEL_DFILES directory is now created from the gem-data package at http://collaboration.cmc.ec.gc.ca/science/ssm/. However, this does not yet have the bcmk_vgrid files, so some rpnpy.vgd tests are still failing.

The majority of the remaining failures are due to:

  • Missing $CMCGRIDF directory. There are some tests which require the most recent forecast data in RPN standard format. Maybe some substitute files could be used, from a past date? Unfortunately I'm unable to locate any public-facing candidates for these files.
  • Missing libburp_c_shared.so - this is not yet open-source, so just have to wait until it's available.

There are a few failures which might be genuine problems. The following tests have AssertionErrors:

  • RpnPyLibrmnBurp.testmrbcvtencodeKnownValues
  • RpnPyLibrmnBurp.testmrfoptKnownValues
  • Librmn_grids_Test.test_ll2rll_rot
  • Librmn_grids_Test.test_ll2rll_rot2
  • VGDProtoTests.testGetPutOptInt

Latest log is at https://api.travis-ci.org/v3/job/466197770/log.txt

@jeixav
Author

jeixav commented Mar 31, 2019

Please try to clone https://github.com/jeixav/python-rpn.git on a system with git-lfs installed:

git clone https://github.com/jeixav/python-rpn.git
cd python-rpn
git checkout feat/travis

That should provide the file cmcgridf/prog/regeta/2019033000_048 to help the tests that reference $CMCGRIDF. The file seems excessively large (764M). Maybe @toutoumeteo could help craft a smaller file.

The Travis build failed. It may be due to the way Docker and OpenMPI interact, see: https://stackoverflow.com/a/46163274.

@neishm
Owner

neishm commented Apr 2, 2019

Travis-CI builds should already be within an Ubuntu container, so you might be able to transform the Dockerfile rules into the equivalent lines for the .travis.yml file and put everything directly in there. I think the equivalent commands would be something like

before_script:
  - apt-get update && apt-get install -y git make libssl-dev ksh gfortran libopenmpi-dev python liburi-perl wget libncurses5-dev libc6-dev-i386 openmpi-bin
  - git clone https://github.com/jeixav/rmnlib-install.git
  - cd rmnlib-install
  - git fetch && git checkout 92f256dae42c27ba37ea1f91dbbac97b079728d9

  # Add gem-data as a package to install.
  # Copy and adapt the lines in the Makefile from the 'afsisio' package.
  - sed -i '/afsisio.* \\$/{p;s/afsisio_1.0u/gem-data_4.2.0/}' Makefile
  - sed -i '/afsisio.*:/,/^$/H;${p;s/.*//;x;s/afsisio_1.0u/gem-data_4.2.0/g}' Makefile

  - make auto-install VGRID_RELEASE=6.4

  # Auto-load profile for interactive sessions.
  - echo . $HOME/.profile_armnlib >> $HOME/.bashrc
  - echo . r.load.dot dev/gfortran >> $HOME/.bashrc

  # Extra packages needed for python-rpn tests
  - apt-get update && apt-get install -y python-pytest python-numpy python-tz python-scipy

then the .bashrc file would have to somehow be sourced in the script: section of .travis.yml?

Or better yet, the commands could go into a standalone script (like install-rmnlib-directly-on-my-machine-and-possibly-bung-it-up.sh), and just call that script from .travis.yml?

@neishm
Owner

neishm commented May 15, 2019

Just tried building the feat/travis branch on my own fork, but it's crashing even sooner than yours, complaining that an LFS object doesn't exist on the server.

https://travis-ci.org/neishm/python-rpn/jobs/532931309

@jeixav
Author

jeixav commented May 15, 2019

The git clone command that fails with https://github.com/neishm/python-rpn.git works if I replace it with https://github.com/jeixav/python-rpn.git.

I think you need to clone my Python-RPN Git repository, grab the binary files stored with git-lfs and add them to your Python-RPN repository.

@neishm
Owner

neishm commented May 17, 2019

I was able to get the LFS file working by manually cloning into a "new" repository (not using the fork button). Apparently there's an issue where you can't push LFS objects into a repository if it happens to be a fork of another repository - git-lfs/git-lfs#1906.

I'm partway through getting a native travis-ci version working without the Docker layer, to avoid that OpenMPI problem (https://github.com/neishm/python-rpn-unforked/tree/local-ssm-files). Unfortunately it's still crashing for me because it's looking for an RPN file with today's date (https://api.travis-ci.org/v3/job/533924289/log.txt)... trying to figure out where that's coming from.

Also testing out pushing the gem-data and other SSM files to git-lfs since the collaboration site can become too slow to work with travis-ci sometimes.

@jeixav
Author

jeixav commented May 18, 2019

Is today's date produced by calls to datetime.date.today() in the tests? For example:

https://github.com/meteokid/python-rpn/blob/2d0c280700f1a9a79a72af453d576f62be7cc449/share/tests/test_cookbook.py#L349

neishm pushed a commit to neishm/python-rpn-wheel that referenced this issue May 23, 2019
…x container.

This is cobbled together from python-rpn continuous integration tests that are
being developed by Kristjan Onu.  See discussion at neishm/rmnlib-install-docker#1

Note: Unlike the CI tests, the correctness of python-rpn is not being checked
here. The purpose of this test suite is to make sure the wheel is properly
packaged, with no missing modules / shared library symbols / etc.
This is currently only tested for a 64-bit Linux environment.

So far there are multiple failures (25 failures, 13 errors to be exact).  Still
need to determine which errors may be due to the wheel creation, and which ones
are upstream errors with python-rpn or the tests themselves.
@neishm
Owner

neishm commented May 23, 2019

Yes, that was the problem. If I link the CMCGRIDF file to another filename with today's date, then it's happy. Now getting an AssertionError with test_librmn_grids.Librmn_grids_Test (log: https://api.travis-ci.org/v3/job/536488362/log.txt). At least the CMCGRIDF errors are addressed...
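
A rough sketch of that linking step, assuming the data file is named like 2019033000_048 (first 8 characters YYYYMMDD, remainder kept unchanged). The helper name link_as_today is made up for illustration and is not taken from the CI scripts:

```python
import datetime
import os


def link_as_today(src_path, today=None):
    """Symlink `src_path` next to itself under a name carrying today's date.

    Assumes names like '2019033000_048': the first 8 characters are YYYYMMDD
    and the rest (run hour, forecast hour) is kept unchanged.
    """
    today = today or datetime.date.today()
    directory, name = os.path.split(src_path)
    new_name = today.strftime("%Y%m%d") + name[8:]
    link_path = os.path.join(directory, new_name)
    if not os.path.lexists(link_path):
        # Relative link, so the directory can be moved as a whole.
        os.symlink(name, link_path)
    return link_path
```

Calling link_as_today("cmcgridf/prog/regeta/2019033000_048") before the test run would give the tests a file carrying the current date without copying the data.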

@jeixav
Author

jeixav commented May 24, 2019

Can you check if the AssertionError is reproducible on a Science network supercomputer? If yes, @meteokid may be able to help.

@neishm
Owner

neishm commented May 24, 2019

I can reproduce on the science network (specifically eccc-ppp1), by running:

. r.load.dot eccc/mrd/rpn/MIG/ENV/py/2.7/rpnpy/2.1.b3
python /fs/ssm/eccc/mrd/rpn/MIG/ENV/d/py/2.7/rpnpy/rpnpy_2.1.b3/rpnpy_2.1.b3_all/share/tests/test_librmn_grids.py

This gives the following AssertionErrors:

======================================================================
FAIL: test_ll2rll_rot (__main__.Librmn_grids_Test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/fs/ssm/eccc/mrd/rpn/MIG/ENV/d/py/2.7/rpnpy/rpnpy_2.1.b3/rpnpy_2.1.b3_all/share/tests/test_librmn_grids.py", line 395, in test_ll2rll_rot
    self.assertTrue(ok)
AssertionError: False is not true

======================================================================
FAIL: test_ll2rll_rot2 (__main__.Librmn_grids_Test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/fs/ssm/eccc/mrd/rpn/MIG/ENV/d/py/2.7/rpnpy/rpnpy_2.1.b3/rpnpy_2.1.b3_all/share/tests/test_librmn_grids.py", line 417, in test_ll2rll_rot2
    self.assertTrue(ok)
AssertionError: False is not true

It's giving the same error on the science network when testing against the latest master branch.

@neishm
Owner

neishm commented Jun 7, 2019

I was able to resolve some of the test failures (including the test_ll2rll_rot and test_ll2rll_rot2 failures). Now it's stuck on some vgd tests (https://api.travis-ci.org/v3/job/542846145/log.txt). Those same tests seem to pass when running on the science network, so not sure why they're failing in the Travis-CI environment.

Also, libburpc tests are still disabled since we don't yet have a libburpc package within rmnlib-install.

@toutoumeteo
Contributor

toutoumeteo commented Jun 8, 2019 via email

@neishm
Owner

neishm commented Jun 8, 2019

Looks like those are the only two errors. Re-ran to allow arbitrary number of test failures:
https://api.travis-ci.org/v3/job/543230091/log.txt

Still just those two tests are failing. It's using the vgrid from rmnlib-install (https://github.com/mfvalin/rmnlib-install/tree/master/vgrid_6.4_linux26-x86-64).

Strangely, testing from a similar version of vgrid (6.4.b2) in the python-rpn-wheels package seems to work, so not sure if it's actually a problem in the vgrid source or not...

@toutoumeteo
Contributor

Is https://github.com/neishm/python-rpn branch fix_tests the right repo and branch to investigate the vgrid problem you’ve reported?

André

P.S. Sorry for the previous empty email

@neishm
Owner

neishm commented Jun 9, 2019

Hi André,
The latest Travis-CI test was done from https://github.com/neishm/python-rpn-unforked, on the local-ssm-files branch.

@toutoumeteo
Contributor

toutoumeteo commented Jun 10, 2019 via email

@toutoumeteo
Contributor

I was able to run the vgrid tests, including the new one for ALLOW_SIGMA, on travis and they all run fine. To make it faster, I commented out the gem-data part of the .travis.yml file, it took only 7.5 minutes.

My repo is https://github.com/toutoumeteo/python-rpn-unforked.git branch feat/travis

There is nothing there to add to your repo. I put the info here only to document the work that was done.

@neishm
Owner

neishm commented Jun 10, 2019

Thanks, André. So if I understand correctly, there was no problem with vgrid when you ran the tests? I need to have a closer look at your setup to see what the difference is...

To answer your question about images, no I don't think we have that. I'm new to this whole CI thing, so I'm still trying to figure out all these features. I know @jeixav was playing around with git-lfs to get some speedups with the CI tests, but this sounds like another option we could try?

For what it's worth, I'm able to trigger the vgrid "error" locally with the latest python-rpn-wheel repository, by running make and then make test. However, if I change the test to run test_vgd_proto.py or test_vgd_base.py directly instead of pytest, then there is no error. Triggering the "error" seems to depend on how the test is invoked...

@toutoumeteo
Contributor

Mike,

This is correct, there was no problem when I ran the vgrid tests on GitHub CI (travis) and gitlab CI.

I will start to look at the rpnpy code and make some tests with the python wheel that I've installed on my server and eventually with travis.

@toutoumeteo
Contributor

How do I get the ATM_MODEL_DFILES to run the rpnpy tests locally on my server? I looked in the travis log but could not figure out where it gets it.

Thanks

@neishm
Owner

neishm commented Jun 12, 2019

The ATM_MODEL_DFILES are being installed from the file local-ssm-files/gem-data_4.2.0_all.ssm. It requires git-lfs in order to properly fetch the data, otherwise the file is just a placeholder.
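
One way to tell whether the file was really fetched or is still a placeholder: git-lfs pointer files are small text files whose first line is "version https://git-lfs.github.com/spec/v1". A small check along those lines (the function name is_lfs_pointer is just for illustration):

```python
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if `path` looks like a git-lfs pointer, not the real data."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX
```

If this returns True for local-ssm-files/gem-data_4.2.0_all.ssm, the clone was most likely made without git-lfs installed.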

The package can also be downloaded from http://collaboration.cmc.ec.gc.ca/science/ssm/gem-data_4.2.0_all.ssm. It's just a gzipped tarball, so you should be able to untar it and set ATM_MODEL_DFILES to <extraction point>/gem-data_4.2.0_all/share/data/dfiles.
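
The untar-and-export steps could look roughly like this in Python, assuming the archive has already been downloaded and is trusted. The function name install_dfiles and its arguments are hypothetical; only the directory layout comes from the description above:

```python
import os
import tarfile


def install_dfiles(ssm_path, dest_dir, package="gem-data_4.2.0_all"):
    """Unpack the .ssm archive (a gzipped tarball) and set ATM_MODEL_DFILES."""
    os.makedirs(dest_dir, exist_ok=True)
    # Only unpack archives from a source you trust.
    with tarfile.open(ssm_path, "r:gz") as archive:
        archive.extractall(dest_dir)
    dfiles = os.path.join(dest_dir, package, "share", "data", "dfiles")
    if not os.path.isdir(dfiles):
        raise FileNotFoundError(dfiles)
    # Visible to this process and to any child process it starts (e.g. pytest).
    os.environ["ATM_MODEL_DFILES"] = dfiles
    return dfiles
```

Setting the variable through os.environ only affects the current Python process and its children, so a shell session would still need its own export.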

@neishm
Owner

neishm commented Jun 17, 2019

I just got a notice from GitHub that I've exceeded my limit for LFS data, so the current approach isn't sustainable in the long-term. I'm going to look into options for reducing the file size - maybe add some hooks in the unit tests to snoop on which records / files are actually read, and create test files with only those records? Hopefully the size can be brought down significantly from the 1.48 GB that the ATM_MODEL_DFILES and CMCGRIDF data is using...
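
To make the "snoop" idea concrete, here is a sketch of a wrapper that logs which (file, record) pairs a reader function is asked for. Everything here is hypothetical: fake_read stands in for whatever call actually reads a record, and the record names are placeholders.

```python
import functools


def record_reads(read_func, seen):
    """Wrap `read_func(path, key)` so every (path, key) request is logged."""
    @functools.wraps(read_func)
    def wrapper(path, key):
        seen.add((path, key))
        return read_func(path, key)
    return wrapper


def fake_read(path, key):
    """Hypothetical reader standing in for the real record-reading call."""
    return f"{path}:{key}"


seen = set()
read = record_reads(fake_read, seen)
read("2019033000_048", "TT")
read("2019033000_048", "TT")  # repeated reads are logged only once
read("2019033000_048", "UU")
```

After a full test run, the contents of seen would list the records a trimmed-down test file needs to contain.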

@neishm
Owner

neishm commented Jun 18, 2019

Actually, since GitLab CI is already working for André's tests (with super speed), and there appears to be no size limit on the images there, would it make sense to just move the CI branch to GitLab? Or would that complicate the workflow too much?

@toutoumeteo
Contributor

toutoumeteo commented Jun 18, 2019 via email

@neishm
Owner

neishm commented Jun 20, 2019

I've identified the issue I was having with the vgrid tests. The tests themselves are fine when run individually, but some of them have side-effects on the 'ALLOW_SIGMA' option that can cause assertion errors when run together. I have a potential fix for this documented at https://github.com/meteokid/python-rpn/pull/19/files/382a934d49c80efd1c50001901d6c896180378ce..70f71226276274d0db0ae0eff2382c692e16d64b
Haven't tested on the travis-ci build yet - need to revert from my LFS data back to the SSM web depot due to quota issues with LFS.
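
For anyone hitting the same kind of cross-test side effect, the general pattern is to restore the global option after each test. The sketch below uses a plain dict as a stand-in for the library's option state, so put_opt and get_opt here are hypothetical helpers, not the rpnpy.vgd API, and this is not the code from the pull request:

```python
import unittest

# Stand-in for the library's global option table (hypothetical; the real
# option lives inside the vgrid library, not in a Python dict).
_OPTIONS = {"ALLOW_SIGMA": False}


def put_opt(name, value):
    _OPTIONS[name] = value


def get_opt(name):
    return _OPTIONS[name]


class OptionTests(unittest.TestCase):
    def setUp(self):
        # Remember the state on entry and restore it even if the test fails.
        saved = get_opt("ALLOW_SIGMA")
        self.addCleanup(put_opt, "ALLOW_SIGMA", saved)

    def test_1_enable_sigma(self):
        put_opt("ALLOW_SIGMA", True)
        self.assertTrue(get_opt("ALLOW_SIGMA"))

    def test_2_default_is_untouched(self):
        # Would fail without the cleanup, because test_1 runs first.
        self.assertFalse(get_opt("ALLOW_SIGMA"))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(OptionTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Run one at a time, both tests pass with or without the cleanup; the failure only appears when they share a process, which matches the behaviour described above.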

@toutoumeteo
Contributor

Congratulations Mike!

I looked at your code and it looks fine to me. I see now that one has to set ALLOW_SIGMA back to its default state (false) so the other tests can run OK. Sometimes it is the tests that have a bug...

@neishm
Owner

neishm commented Jun 21, 2019

travis-ci tests are passing with this latest update. Log: https://travis-ci.org/neishm/python-rpn-unforked/builds/548868850

Updates are in https://github.com/neishm/python-rpn-unforked/tree/feat/travis

Note that this is still skipping the libburpc tests, since I don't think that library is available in rmnlib-install yet.

@neishm
Owner

neishm commented Jul 27, 2019

For doing quick tests, the python-rpn-wheel package generates a small test dataset (around 11MB) with simplified versions of the data files. It's saved in the eccc_rpnpy_tests wheel file from the "make" invocation. It needs to download over a gigabyte of data for the initial creation, but after pre-generating it can be easily installed for subsequent tests. I've had some success using it for integration tests for python-rpn-wheel (see for instance https://github.com/neishm/python-rpn-wheel/blob/daa86e64df89f620941e9414fd2677caf0959e80/.travis.yml#L24-L26).

A basic interactive test environment can be set up by running source <(rpy.testenv) which would give ATM_MODEL_DFILES, etc.
