Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run make check in Jenkins #125

Merged

Conversation

elliottslaughter
Copy link
Contributor

I noticed that make check is not being run in Jenkins. I'm adding it because I suspect there are failures in the test suite that aren't being caught at the moment.

@ahbarnett
Copy link
Collaborator

@janden @garrettwrong Dear Joakim & Garrett, do either of you have a few minutes to look into what's happening here? Since we're starting to get a bunch of users, it would be great to have CI? (of course, various devs like Melody, Johannes, and occasionally me, are regularly checking it locally..)
It fails at the Py import stage, ie it's an env issue not a code issue:

    from pycuda._driver import *  # noqa
E   SystemError: initialization of _driver raised unreported exception

I know nothing about the jenkins GPU setup... Thanks, Alex

@janden
Copy link
Collaborator

janden commented Feb 4, 2022

So it looks like there are two issues here:
– CI is failing currently due to NumPy version mismatches. This should be handled by #129.
– We are currently only running the Python tests, not the C++ tests.

Regarding the issue of the C++ tests, I think that our idea was that the Python tests could cover most of the code, so any bugs would trip them. That being said, I don't know that that is achieved, so it might make sense to include them. @ahbarnett, what do you think?

If we do want to include them, we have to see how that might best be achieved. It could be here in the Jenkinsfile or it could be in the Dockerfile where the library is being built (and the other calls to make are performed).

@ahbarnett
Copy link
Collaborator

ahbarnett commented Feb 4, 2022

Actually I have a question about this for @janden: does the Jenkins test (now using python3 -m pip install -e)
actually compile libcufinufft.so locally, or are this precompiled and downloaded from somewhere else?

It the .so is locally compiled, then I think the py tests are good enough for CI. The tests in make check are more extensive, less supportable, and let's keep them for devs only.

But @elliottslaughter, do you suspect particular make check items of not working on the Jenkins system?
Maybe now since commit 46a8e8a fixes the environment, you can upstream fetch that to your branch anyway.

@elliottslaughter
Copy link
Contributor Author

I think I was confused about the make check because I expected an issue like #116 (comment) to break the build, but it doesn't. (I suspect it may be a GCC vs Clang issue, so the default CUDA setup with GCC doesn't catch it.)

Anyway, this is your project so feel free to do as you see fit, but my own two cents would be that the more tests, the better. Even if the Python and C++ tests overlap, make check is at least testing a different language interface. The CI is the perfect place to put this since you don't need to remember to run it, and it helps sanity check that everything stays working.

I rebased and the run passes, but maybe I messed the Jenkins syntax because I can't actually see the make check running anywhere in the job output.

@ahbarnett
Copy link
Collaborator

ahbarnett commented Feb 8, 2022 via email

Copy link
Collaborator

@ahbarnett ahbarnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Elliot,
Thanks for upstreaming this PR. But I can't just pull it in since I don't see the Jenkins run producing any output from make check so I don't believe it's being run in CI, riight? However, I now can be persuaded that having it in CI is a good thing. Maybe @janden as author or the Jenkinsfile could help get this PR in? Thanks all, Alex

@elliottslaughter
Copy link
Contributor Author

elliottslaughter commented Feb 9, 2022

Is it possible that you're running a centrally-controlled Jenkinsfile, rather than the one in the repo? I tried another version of my change, and it literally isn't taking the script I'm putting in. I'm starting to suspect it's not actually running the Jenkinsfile out of the repository at all but using a centrally managed version, or something along those lines.

@ahbarnett
Copy link
Collaborator

ahbarnett commented Feb 9, 2022 via email

@janden
Copy link
Collaborator

janden commented Feb 10, 2022

Actually I have a question about this for @janden: does the Jenkins test (now using python3 -m pip install -e)
actually compile libcufinufft.so locally, or are this precompiled and downloaded from somewhere else?

It's compiled as part of the Dockerfile each time the CI is run.

Anyway, this is your project so feel free to do as you see fit, but my own two cents would be that the more tests, the better. Even if the Python and C++ tests overlap, make check is at least testing a different language interface. The CI is the perfect place to put this since you don't need to remember to run it, and it helps sanity check that everything stays working.

I agree.

Is it possible that you're running a centrally-controlled Jenkinsfile, rather than the one in the repo?

I'm not sure. It could be that it runs the one in the master branch (for safety reasons). That being said, I'm fairly certain that the Dockerfile is being updated from the branch, so you may have better luck putting in the make check line there. I also think it makes more sense to put it there since that's where the other calls to make are.

@elliottslaughter
Copy link
Contributor Author

Good news: running make check in the Dockerfile does indeed cause it to run.

Bad news: it is now hitting:

../bin/spread1d_test: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory

I'm not really familiar with how these containers are set up, so I'm not sure where this library would be located?

@janden
Copy link
Collaborator

janden commented Feb 11, 2022

I'm not really familiar with how these containers are set up, so I'm not sure where this library would be located?

This line

ENV LD_LIBRARY_PATH /io/lib:${LD_LIBRARY_PATH}
sets the necessary environment variable. Try invoking make check after it.

@elliottslaughter
Copy link
Contributor Author

@janden I believe the most recent change I made matches what you suggested, but the Jenkins build still fails.

@janden
Copy link
Collaborator

janden commented Feb 14, 2022

Looks like the LD_LIBRARY_PATH is still incorrect. Let me try something.

@janden
Copy link
Collaborator

janden commented Feb 14, 2022

So I think I know what's happening. The way the docker image is set up, it doesn't actually have access to the GPU during the build phase. At this point, all that's happening is that the library is being compiled, so it doesn't actually need the GPU. Later, when we're testing, we do use the GPU, which is why Jenkins runs the docker image with the --gpus 1 flag set (otherwise we'd see the same crash during pytest).

It may be possible to give the docker image access to the GPU during the build phase, but this seems complicated. I therefore suggest you move the make check invocation back into the Jenkinsfile. Then we get back to the original problem that Jenkins won't run it until it's in master, but we'll just have to merge it in and see what happens.

@elliottslaughter
Copy link
Contributor Author

Ok, I have reset this PR to its original form, adding the make check line to Jenkinsfile.

However, I really think you should check the Jenkins configuration first. As you can see at the link below, one of the options to configure the contents of a Jenkinsfile is through the admin UI. If this is how the instance is currently set up, it literally won't matter what is in the master branch (or anything else, aside from the UI).

https://www.jenkins.io/doc/book/pipeline/getting-started/#through-the-classic-ui

You'll want to double check that you've followed these instructions to make sure you're pulling the Jenkinsfile from the repository itself:

https://www.jenkins.io/doc/book/pipeline/getting-started/#defining-a-pipeline-in-scm

I don't see anything in there that would suggest that it only pays attention to certain branches (which is why I'm suggesting double checking that it's not configured via the UI).

@janden
Copy link
Collaborator

janden commented Feb 15, 2022

I don't have admin access to Jenkins, so there is no good way for me to verify or modify these settings.

It looks like Jenkins doesn't trust the Jenkinsfile in this PR (see https://jenkins.flatironinstitute.org/job/cufinufft/indexing/events), but it does trust the one from master. It might trust it if the PR comes from this repository. Will give it a try. Otherwise it won't hurt to just merge it.

@janden
Copy link
Collaborator

janden commented Feb 15, 2022

Ok. Looks like my hypothesis was correct.

I suggest we close this PR, undraft #134, and continue there. Since it's a branch on the main repository, Jenkins trusts the Jenkinsfile, so we can make any changes we want there and it should show up in the CI.

@ahbarnett ahbarnett merged commit 3ae9871 into flatironinstitute:master Feb 15, 2022
@elliottslaughter
Copy link
Contributor Author

Looks like make check ran in master. Thanks all!

@elliottslaughter elliottslaughter deleted the eds/jenkins-make-check branch February 15, 2022 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants