Random failures on CI related to file rename #9568

Closed
mxr576 opened this issue Dec 20, 2020 · 4 comments · Fixed by #9580
Comments

@mxr576
Contributor

mxr576 commented Dec 20, 2020

We are running several builds (15-20) in parallel on the same server as part of our CI workflow. Some builds randomly fail with the errors described below. All builds run in Docker, and every build item spins up an isolated environment from the same images. The source code is mounted from the host, but Composer's cache is container-only; it is not shared or mounted.

I'm not sure what other information I can disclose that would help figure out what exactly is failing and why. Does this ring a bell for anybody?

My composer.json:

N/A

Output of composer diagnose:

Checking git settings: OK
Checking http connectivity to packagist: OK
Checking platform settings: OK
Checking git settings: OK
Checking http connectivity to packagist: OK
Checking https connectivity to packagist: OK
Checking github.com oauth access: OK
Checking disk free space: OK
Checking pubkeys: 
Tags Public Key Fingerprint: 57815BA2 7E54DC31 7ECC7CC5 573090D0  87719BA6 8F3BB723 4E5D42D0 84A14642
Dev Public Key Fingerprint: 4AC45767 E5EC2265 2F0C1167 CBBB8A2B  0C708369 153E328C AD90147D AFE50952
OK
Checking composer version: OK
Composer version: 2.0.8
PHP version: 7.3.25
PHP binary path: /usr/local/bin/php
OpenSSL version: OpenSSL 1.1.1g  21 Apr 2020

When I run this command:

composer install foo/bar

I get the following output:

[ErrorException]                                                                                                                                                                                                              

rename(/home/user/.composer/cache/repo/https---repo.packagist.org/provider-php-http~promise.json.tmp,/home/user/.composer/cache/repo/https---repo.packagist.org/provider-php-http~promise.json): No such file or directory  
@mvannes

mvannes commented Dec 24, 2020

Experiencing the same issue in our GitLab CI. It feels related to Composer 2, since we started seeing these build failures when we adopted it. So far we've found no real cause for why this happens.

@mxr576
Contributor Author

mxr576 commented Dec 30, 2020

Another common error that could be related to this: Composer fails to extract downloaded artifacts. I/O or network error? 🤔

image

@rtm-ctrlz
Contributor

> The source code is mounted from the host

It looks like all of your concurrent jobs are using the same vendor directory (mounted from the host). If that is true, I think you will hit race conditions at almost every step during installs/updates.

My setup is the opposite of yours:

  • a separate build directory (source + vendor) for every job; reason: every test can do whatever it needs with the sources/Composer
  • a shared Composer cache; reason: faster installs (the cache is stored on an NVMe SSD) and less network usage (faster, and less load on package repos)

Also, I don't quite understand how you are getting the cache-rename issue if every job uses its own cache directory :(

@mxr576
Contributor Author

mxr576 commented Jan 5, 2021

> Looks like all of your concurrent jobs using same (mount from host) vendor directory,

Damn, you are right. The initial script that creates those builds, which otherwise run separately from each other, uses the same Composer instance. Before Composer 2, we disabled the Composer cache with COMPOSER_CACHE_DIR=/dev/null, and now I remember why: we had similar issues.
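For anyone scripting this, here is a minimal sketch of the two cache options mentioned in this thread: disabling the cache with COMPOSER_CACHE_DIR=/dev/null, or giving every job its own cache directory. The helper names (`composer_env`, `run_composer_install`), the `job_id` parameter and the `/tmp/composer-cache` root are hypothetical and only for illustration; COMPOSER_CACHE_DIR itself is the variable already used above.

```python
import os
import subprocess
from typing import Dict, Optional


def composer_env(job_id: Optional[str],
                 cache_root: str = "/tmp/composer-cache") -> Dict[str, str]:
    """Build the environment for one Composer run.

    job_id=None disables the cache (COMPOSER_CACHE_DIR=/dev/null);
    otherwise each job gets its own directory under cache_root.
    cache_root is an assumed location, not something from this issue.
    """
    env = dict(os.environ)
    if job_id is None:
        env["COMPOSER_CACHE_DIR"] = "/dev/null"
    else:
        env["COMPOSER_CACHE_DIR"] = os.path.join(cache_root, job_id)
    return env


def run_composer_install(workdir: str,
                         job_id: Optional[str]) -> subprocess.CompletedProcess:
    """Run `composer install` in workdir with the environment built above."""
    return subprocess.run(
        ["composer", "install", "--no-interaction"],
        cwd=workdir,
        env=composer_env(job_id),
        check=False,
    )
```

With a per-job directory, parallel jobs never write to the same cache files, at the cost of each job warming its own cache.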

So #9568 (comment) is either unrelated or can be mitigated, and the fix in #9580 looks good to me for the race condition in cache writes that I originally reported here.
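To illustrate the kind of race in cache writes discussed here, the following is a rough sketch, not Composer's code and not necessarily what #9580 does. It replays one bad interleaving of two writers that share a fixed `<target>.tmp` path, which produces the same "No such file or directory" failure on rename, and shows one common mitigation: a unique temp file per writer. All function and file names are made up for the example.

```python
import os
import tempfile


def replay_shared_tmp_race(cache_dir: str) -> bool:
    """Replay, step by step, two writers sharing one '<target>.tmp' path.

    Returns True if the second rename fails because the temp file is gone.
    """
    target = os.path.join(cache_dir, "provider.json")
    tmp = target + ".tmp"
    # Writer A and writer B both write to the same temp path.
    for payload in (b"A", b"B"):
        with open(tmp, "wb") as fh:
            fh.write(payload)
    os.rename(tmp, target)      # writer A renames first
    try:
        os.rename(tmp, target)  # writer B: the temp file no longer exists
    except FileNotFoundError:
        return True
    return False


def write_unique_tmp(target: str, data: bytes) -> None:
    """Write through a per-writer temp file in the same directory, then rename."""
    directory = os.path.dirname(target) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    # Each writer renames only its own temp file, so the source always exists.
    os.replace(tmp, target)
```

With unique temp names the last writer still wins, but no writer can remove another writer's temp file before it is renamed.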

janmashat pushed a commit to Pronovix/drupal-qa that referenced this issue Jan 7, 2021
* Add patch: "mkdir can fail in \\Drupal\\Core\\Test\\TestRunnerKernel::boot()
because of a race condition [#3190859]"
* Add patch: file_scan_ignore_directories is ignored in kernel tests [#3190974]
* Add patch: copy() can fail in FunctionalTestSetupTrait::prepareSettings()
because of a race condition  [#3191369]
* Add missing custom PHPUnit bootstrap file to CI test runner
* Rollback removal of COMPOSER_CACHE_DIR=/dev/null because without that
random failures can occur when multiple builds are running in parallel,
ex.: composer/composer#9568 (comment)
* Instead of killing Composer local caching with COMPOSER_CACHE_DIR=/dev/null
introduce a dedicated Composer cache component for every component that
consecutive component builds can leverage
* Fix random "Undefined index: value" in parallel tests caused by
database connection error
* Make sure that required database and webserver services are actually
able to accept requests before CI starts running integration tests

Change-Id: Ie691b8b94ba6d95fadf1699ab4e2eeb86a5b9941