Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increased tolerance in restart test. #241

Merged
merged 1 commit into from
Feb 1, 2018

Conversation

joakim-hove
Copy link
Member

No description provided.

@joakim-hove
Copy link
Member Author

jenkins build this with downstreams please

@atgeirr
Copy link
Member

atgeirr commented Feb 1, 2018

Seems like a quick fix, but what was the actual test that was failing due to the unit conversion change? I'd expect all non-temperature results to be identical (subtracting zero should not change anything) so what changed? The test result output is not too useful, as it just points to the comparison function and not actual failed comparison, did you run this in the debugger to check?

@joakim-hove
Copy link
Member Author

did you run this in the debugger to check?

Well - std::cout << at least. The failure was only in the temperature field; have updated the test to make that clearer.

@joakim-hove
Copy link
Member Author

jenkins build this with downstreams please

@joakim-hove
Copy link
Member Author

jenkins build this with downstream please

@joakim-hove joakim-hove merged commit ccc3997 into OPM:master Feb 1, 2018
@atgeirr
Copy link
Member

atgeirr commented Feb 1, 2018

I get lots of test failures in opm-simulators now. Not sure how (or if) it is related to this though, after all Jenkins was green on this.

@atgeirr
Copy link
Member

atgeirr commented Feb 1, 2018

The test failures are related to changed TEMP, and so are related to this mess. I could not understand why Jenkins was green on this, but it turns out that the opm-simulator tests were never run for this PR (nor of course for the original troublemaker...). For this PR, I think that your second command to jenkins maybe overwrote the first (correct) one. Perhaps jenkins should return failure whenever it fails to parse the command? Would that be easy to do, @akva2? It also seems to be the "with downstreams" that cause the most mistakes, could we simply make that the default and remove the option to not build the downstream modules?

@akva2
Copy link
Member

akva2 commented Feb 1, 2018

sanity checking the input is a bit too much meh in bash, and there are way to many corner cases to consider imo. i can remove the option though i really think people should pay attention and not let jenkins pay the price..

@joakim-hove
Copy link
Member Author

Double damn - first of all: My fault - again; I will fix.

But even though I am the main guilty here: I am not the first one to f... up this, and I will not be the last, so maybe we could make some improvements to the system to reduce the number of problems for distracted operators like myself. Some suggestions/thoughts:

  1. It can take "quite long" time to get an indication that something has started, that was what happened here - and I tried again with wrong syntax. Could we increase the polling frequency - or maybe even use push from github?

  2. Maybe "with downstreams" should be default; i.e. jenkins build this please should build with downstreams - could possibly introduce an alternative syntax for only building the current PR.

  3. I agree with @atgeirr that sanity checking of the input would make sense. I don't know (anything frankly) about the system - but the fact sanity checking in bash is painful quite frankly seems like a weird argument to me - can I recommend e.g 🐍?

Anyway - I take full responsability for repeated f...ups and will now go about fixing the mess.

@akva2
Copy link
Member

akva2 commented Feb 1, 2018

  1. that will replace the 5 min poll with lots of bugs with the github triggers, in particular with rebased branches. it will build the old branches and lots of confusion. the delay serves a purpose.
  2. i totally disagree but i have opened a PR to do this like you want it.
  3. jenkins interacts with the job scripts through environment variables and we spawn lots of processes. python would be an extremely shoddy tool for the job. code cannot fix lack of attention, no matter how hard i try so in my book it's an utter waste of effort. if i'm told to spend time on it i will of course.

@joakim-hove joakim-hove deleted the fix-restart-tol branch February 1, 2018 10:57
@andlaus
Copy link
Contributor

andlaus commented Feb 1, 2018

thanks for moping this up quickly -- I certainly also deserve a fair amount of the blame!

@joakim-hove
Copy link
Member Author

joakim-hove commented Feb 1, 2018

that will replace the 5 min poll [...]

Looking at: #242 right now the Github heading on my comment is "joakim-hove commented 44 minutes ago" and there is no response from Jenkins - so as a end-user I do not have a guarantee of response within one polling period? Lack of response creates uncertainty.

[...] in my book it's an utter waste of effort

I do not know the effort involved, and certainly do not pay the bill - but anyway: When people repeatedly make mistakes like I have done today you can always (rightly so) say that people should get their act together and pay attention, but maybe it would also make sense to look into improving the system?

@akva2
Copy link
Member

akva2 commented Feb 1, 2018

if you use the github triggers, it immediately fires the trigger when a branch is rewritten. the problem is it can take github longer time to update the pull/merge/xxx branch than 'instant'. you then end up building the old version of the rewritten branch, which leads to headaches. while this can still happen with the 5 min poll, it's much more unlikely.

i'm not objecting to making changes, but my point is where does it stop? should i verify all repo names? the existence of pr's? etc etc. no matter how hard you try there will things that ultimately is at the mercy of the attention of the user.

@atgeirr
Copy link
Member

atgeirr commented Feb 1, 2018

The failure to trigger in #242 is due to some timeout in the jenkins plugin. However it should be safe to re-issue the command if it does not "take" after 10 minutes or so. I have added a brief Troubleshooting section to https://wiki.opm-project.org/index.php?title=Jenkins_triggers

Defaulting to building with downstreams: should be online now for the PR-builders, #242 has run tests in opm-simulators despite not having the old downstreams trigger.

About syntax checking: we don't invest in it now, as the most-frequent error (forgetting to ask for downstreams builds) has been taken care of.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants