Started by user anonymous
Building remotely on mm-win-xp-1 in workspace c:\jenkins\workspace\mozilla-central_addons
Deleting project workspace... No emails were triggered.
Unable to access upstream workspace for artifact copy. Slave node offline?
Build step 'Copy artifacts from another project' marked build as failure
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure
Sending email to: firstname.lastname@example.org
Somehow this is related to the mozmill-environment artifact. I manually checked the workspace on each platform and none of them existed anymore, even though I saw them yesterday after running this job. So somehow they have been deleted. Running the jobs per platform fixes the problem for now. Let's monitor and get it fixed for real.
This happened again today. See http://10.250.73.243:8080/job/ondemand_update/1517/console
Rerunning get_mozmill-environments fixed it once more.
But it failed again a couple of hours later when Juan ran the ondemand update testrun. I have restarted Jenkins and now it seems to stick. I think early next quarter we have to make sure that we upgrade Jenkins and all the plugins ASAP. I can well believe that this is related to some bad interaction.
So this issue is directly related to our mozmill-environment job, which is a multi-axis (matrix) job in Jenkins. It gets triggered manually whenever we release a new version of the environment. So far all the jobs have ended successfully.
But under some unknown circumstances the workspace of this job gets deleted. We do not know why that happens and we have to investigate further. Dave will ask in #jenkins whether this is a known issue. If it is not and we can't get it fixed soon, I will create a crontab script which checks for the workspace folder every 5 minutes and, if it is not present, runs the job via the Jenkins API and sends us an email with details. That should help us analyze the system log and retrieve further information.
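A minimal sketch of what such a watchdog could look like, assuming Python 3 is available on the master. The URL, workspace path, and mail addresses are hypothetical placeholders, and a Jenkins instance with security enabled would also need credentials (and possibly a CSRF crumb), which are left out here.

```python
import os
import smtplib
import urllib.request
from email.message import EmailMessage

# Hypothetical placeholders, not taken from the real setup.
JENKINS_URL = "http://localhost:8080"
JOB_NAME = "get_mozmill-environments"
WORKSPACE = "/path/to/jenkins/workspace/get_mozmill-environments"
MAIL_FROM = "watchdog@example.org"
MAIL_TO = "team@example.org"


def trigger_build(jenkins_url=JENKINS_URL, job_name=JOB_NAME):
    """Queue a build via the Jenkins remote API (POST /job/<name>/build)."""
    url = "%s/job/%s/build" % (jenkins_url.rstrip("/"), job_name)
    request = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


def send_report(body, smtp_host="localhost"):
    """Mail the details through a local SMTP server (assumed to exist)."""
    msg = EmailMessage()
    msg["Subject"] = "Workspace of %s went missing" % JOB_NAME
    msg["From"] = MAIL_FROM
    msg["To"] = MAIL_TO
    msg.set_content(body)
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)


def check_workspace(path=WORKSPACE, trigger=trigger_build, notify=send_report):
    """Return True if the workspace exists; otherwise rebuild and notify."""
    if os.path.isdir(path):
        return True
    status = trigger()
    notify("Workspace %s was missing; triggered %s (HTTP %s)."
           % (path, JOB_NAME, status))
    return False


if __name__ == "__main__":
    check_workspace()
```

A crontab entry along the lines of `*/5 * * * * python3 /path/to/check_workspace.py` (path hypothetical) would run it every 5 minutes.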
It would appear that we are hitting JENKINS-4501
If this job doesn't have to be a matrix job, let's go with that! I'm totally behind that idea. Would you have the time to work on that, Dave? If not, I could try to get this started tomorrow.
I might be able to work on this on the plane tomorrow.
We landed the patch from Dave and I think we don't have to go back to a matrix job. Let's close this issue now.
Looks like a bug in the xshell plugin stopped us from delivering it to production. As Dave pointed out it has been fixed by jenkinsci/xshell-plugin@a13e799 but we cannot upgrade to it because it requires a higher Jenkins version. Not sure why we haven't seen this problem earlier.
We're not blocked from changing the environments job back to a standard job any more, but I don't think we've seen this error recently. If we haven't, I think we should close this issue for now. What do you think @whimboo?
If we could really switch away from a matrix job I would support that move!
Probably just need to unbitrot my https://github.com/davehunt/mozmill-ci/tree/remove-matrix branch.
We saw this again today on staging
As a workaround we might want to re-save the config of that matrix job twice a month. As Dave mentioned, he might find the time to work on this next week. Our goal would be to get the matrix job removed completely.
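If the re-save were to be automated rather than done by hand, a rough sketch could fetch the job's `config.xml` through the Jenkins remote API and post it back unchanged. Whether this has the same effect as saving the job in the web UI is an assumption that has not been verified here; the URL is a hypothetical placeholder and authentication is omitted.

```python
import urllib.request

# Hypothetical placeholder, not taken from the real setup.
JENKINS_URL = "http://localhost:8080"
JOB_NAME = "get_mozmill-environments"


def config_url(jenkins_url, job_name):
    """Build the URL of the job's config.xml endpoint."""
    return "%s/job/%s/config.xml" % (jenkins_url.rstrip("/"), job_name)


def resave_config(jenkins_url=JENKINS_URL, job_name=JOB_NAME,
                  opener=urllib.request.urlopen):
    """Download config.xml and POST the same bytes back to Jenkins."""
    url = config_url(jenkins_url, job_name)
    # GET the current job configuration.
    with opener(urllib.request.Request(url), timeout=30) as response:
        xml = response.read()
    # POST it back unchanged, which asks Jenkins to update the job config.
    post = urllib.request.Request(
        url, data=xml, method="POST",
        headers={"Content-Type": "application/xml"})
    with opener(post, timeout=30) as response:
        return response.status


if __name__ == "__main__":
    resave_config()
```

The `opener` parameter only exists so the function can be exercised without a running Jenkins.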
Switched get_mozmill-environments from a matrix job to a standard job. (
Switch get_mozmill-environments from a matrix to a standard job. (#151)
I pushed the code from PR #325 to staging and noticed that we do not restrict this job to run on master only. That means curl will fail because it is not installed on all the slaves. I will come up with a follow-up PR to fix that.
Don't allow roaming for get_mozmill-environments job (#151)
Ok, everything is live on staging and works as expected. I have seen that we had another fallout from this issue on staging today, so let's see how everything works over the next few days.
Fix copy artifacts filter for Aurora update job (#151)
Follow-up fix for the broken aurora_update job has been pushed to staging and is active now. The only missing piece is the xshell plugin now. But @davehunt is working on a temporary version we can use in the interim.
This is no longer an issue for us given that we do not use the matrix job anymore. The change is live on production, and we can close out this issue.