Run each build in a new directory in /tmp #349

Closed
aneeshusa opened this issue Apr 30, 2016 · 8 comments

Comments

@aneeshusa
Member

@aneeshusa aneeshusa commented Apr 30, 2016

IRC convo: http://logs.glob.uno/?c=mozilla%23servo&s=28+Apr+2016&e=7+May+2016#c419373

There is an intermittent problem on the Mac builders where a seemingly empty directory will be unremovable, with the filesystem reporting it is not empty, breaking the builds until the directory is moved out of the way. This is hypothesized to be an HFS/HFS+ bug.

We may be able to limit the impact of this by running each build in a new, separate directory in /tmp (e.g. made with mktemp), so that even if one build encounters this problem, subsequent builds are in a different directory and are not affected. These directories should be removed at the end of the build so we do not run out of disk space; bonus points for reporting a warning if we encounter one of these mystical non-removable directories while attempting to do so.

Note that this can manifest during the git step added in ServoFactory, so the directory creation needs to happen beforehand.
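
As a rough illustration of the idea above (not the actual ServoFactory code), the following Python sketch creates a fresh directory per build, removes it afterwards, and emits a warning if removal fails. The function name `run_in_fresh_dir`, the `servo-build-` prefix, and the `build` callable are all hypothetical; the directory is created before anything else runs, so a git step would go inside `build`.

```python
import shutil
import tempfile
import warnings


def run_in_fresh_dir(build, base=None):
    """Run build(workdir) in a new temporary directory, then clean it up.

    `build` is a hypothetical callable standing in for the whole build,
    including any git checkout. `base=None` uses the system default
    temporary location (typically /tmp on Linux; it may differ elsewhere).
    """
    workdir = tempfile.mkdtemp(prefix="servo-build-", dir=base)
    try:
        return build(workdir)
    finally:
        try:
            shutil.rmtree(workdir)
        except OSError as err:
            # The "non-removable directory" case: report it, but do not
            # fail the build, since the next build gets its own directory.
            warnings.warn(
                "could not remove build directory %s: %s" % (workdir, err),
                RuntimeWarning,
            )
```

Because each call gets a new directory, a leftover directory from one build only costs disk space and does not block the next build.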

@aneeshusa
Member Author

@aneeshusa aneeshusa commented May 5, 2016

I'm not 100% sure about this, but this may also speed up builds on Linux if /tmp is backed by tmpfs. Also depends on how much RAM we have on the builders and how much space is needed during compilation/testing.

@aneeshusa
Member Author

@aneeshusa aneeshusa commented May 6, 2016

This may have spread to our Windows builder:

exceptions.WindowsError: [Error 5] Access is denied: u'c:\\buildbot\\slave\\windows\\build\\.cargo\\git\\checkouts\\angle-7cd76ec19448e619\\.cargo-lock-angle-7cd76ec19448e619-servo'

@aneeshusa
Member Author

@aneeshusa aneeshusa commented May 7, 2016

I looked into this and it is possible, but is blocked on #316.

Neither Buildbot 8 nor 9 seems to natively support running each build in a different directory; the working directory can be configured for each slave and/or builder, but not for individual builds, as it resolves to a static string once startup is complete.
However, we can instead dynamically change the steps: add a step at the beginning of every build to create a temporary directory, then cd into that directory for all subsequent steps; this would require #316 for dynamic step creation. This would be somewhat similar to how the Windows builder uses cd, for instance. We would also need to add a step to remove the temporary directory, which will need to set alwaysRun = True to ensure that cleanup always happens at the end of the build even if it ends prematurely (e.g. due to a failed compile).
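
To make the step ordering concrete, here is a toy model in plain Python. It deliberately does not use Buildbot's API: `Step`, `run_build`, `make_tmpdir`, `remove_tmpdir`, and the `always_run` flag are hypothetical stand-ins that only mimic the behavior described above, where a failed step halts the remaining steps except those marked to always run.

```python
import shutil
import tempfile


class Step:
    """Hypothetical stand-in for a build step (not a Buildbot class)."""

    def __init__(self, name, action, always_run=False):
        self.name = name
        self.action = action          # callable(state) -> True on success
        self.always_run = always_run  # mimics the alwaysRun idea


def run_build(steps):
    """Run steps in order; after a failure, run only always_run steps."""
    state = {"workdir": None}
    ok = True
    ran = []
    for step in steps:
        if not ok and not step.always_run:
            continue
        ran.append(step.name)
        try:
            success = step.action(state)
        except Exception:
            success = False
        ok = ok and bool(success)
    return ok, ran, state


def make_tmpdir(state):
    # First step: create the per-build directory before anything else.
    state["workdir"] = tempfile.mkdtemp(prefix="build-")
    return True


def remove_tmpdir(state):
    # Last step: remove the directory, even if earlier steps failed.
    if state["workdir"] is not None:
        shutil.rmtree(state["workdir"])
    return True
```

For example, with steps `mktemp`, `compile`, `test`, and `cleanup` (the last created with `always_run=True`), a failing `compile` causes `test` to be skipped while `cleanup` still runs. In a real configuration the intermediate steps would run their commands with the working directory set to `state["workdir"]`.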

@metajack
Contributor

@metajack metajack commented May 7, 2016

At this point maybe it makes more sense to investigate further the behavior that prompted this workaround?

@aneeshusa
Member Author

@aneeshusa aneeshusa commented May 7, 2016

@metajack It seems that all 4 of the linked instances occurred on servo-mac1, so it might be an isolated problem. I'll try to take a closer look tomorrow.

@aneeshusa
Member Author

@aneeshusa aneeshusa commented May 13, 2016

I went to take a look at this, but when we paved over mac1 and mac3 recently, all of the evidence was wiped away. I'll investigate if it happens again.

@larsbergstrom
Contributor

@larsbergstrom larsbergstrom commented May 24, 2016

It seems to happen when the machine mysteriously dies and attempts to reboot. Sometimes that reboot succeeds, but at least twice in the last few weeks it has not come back up fully and I've had to open a support ticket to get it turned back on.

@edunham
Contributor

@edunham edunham commented Jul 5, 2017

Have we reproduced the underlying issue in the 14 months since this issue was last touched? If so, please re-open. If not, something unrelated might have fixed it somehow.

@edunham edunham closed this Jul 5, 2017