Revert "Revert "Split long tests into multiple checks"" #32515

Merged: alesapin merged 2 commits into master from revert-32514-revert-32496-smaller_checks on Dec 10, 2021


alesapin
Member

Reverts #32514

@alesapin
Member Author

alesapin commented Dec 10, 2021

This looks like a very obscure bug in Docker. After I merged the original PR, functional tests started failing in master with a very strange error:

  1. https://github.com/ClickHouse/ClickHouse/runs/4485218767?check_suite_focus=true
  2. https://github.com/ClickHouse/ClickHouse/runs/4485015291?check_suite_focus=true

/run.sh: line 118: ./process_functional_tests_result.py: No such file or directory

Note that I didn't change anything related to process_functional_tests_result.py. Fortunately, we have ls -lha at the beginning of the run.sh script:

+ ls -lha
total 130M
drwxr-xr-x   1 root root 4.0K Dec 10 12:03 .
drwxr-xr-x   1 root root 4.0K Dec 10 12:03 ..
-rwxr-xr-x   1 root root    0 Dec 10 12:03 .dockerenv
lrwxrwxrwx   1 root root    7 Oct  6 13:47 bin -> usr/bin
drwxr-xr-x   2 root root 4.0K Apr 15  2020 boot
drwxr-xr-x   5 root root  340 Dec 10 12:03 dev
-rwxr-xr-x   1 root root 769K Dec 10 10:13 dpkg-deb
drwxr-xr-x   1 root root 4.0K Dec 10 12:04 etc
drwxr-xr-x   2 root root 4.0K Apr 15  2020 home
lrwxrwxrwx   1 root root    7 Oct  6 13:47 lib -> usr/lib
lrwxrwxrwx   1 root root    9 Oct  6 13:47 lib32 -> usr/lib32
lrwxrwxrwx   1 root root    9 Oct  6 13:47 lib64 -> usr/lib64
lrwxrwxrwx   1 root root   10 Oct  6 13:47 libx32 -> usr/libx32
-rwxr-xr-x   1 root root  21M Dec  9 21:16 mc
drwxr-xr-x   2 root root 4.0K Oct  6 13:47 media
-rwxr-xr-x   1 root root 108M Dec  9 04:37 minio
drwxr-xr-x   2 root root 4.0K Oct  6 13:47 mnt
drwxr-xr-x   2 root root 4.0K Oct  6 13:47 opt
drwxrwxr-x   2 1000 1000 4.0K Dec 10 12:03 package_folder
dr-xr-xr-x 292 root root    0 Dec 10 12:03 proc
drwx------   1 root root 4.0K Dec 10 11:43 root
drwxr-xr-x   1 root root 4.0K Dec 10 12:04 run
-rwxrwxr-x   1 root root 8.5K Dec 10 11:42 run.sh
lrwxrwxrwx   1 root root    8 Oct  6 13:47 sbin -> usr/sbin
-rwxrwxr-x   1 root root 1.5K Dec 10 11:09 setup_minio.sh
drwxr-xr-x   2 root root 4.0K Oct  6 13:47 srv
dr-xr-xr-x  13 root root    0 Dec 10 12:03 sys
drwxrwxr-x   2 1000 1000 4.0K Dec 10 12:03 test_output
drwxrwxrwt   1 root root 4.0K Dec 10 11:44 tmp
drwxr-xr-x   1 root root 4.0K Dec 10 10:12 usr
drwxr-xr-x   1 root root 4.0K Oct  6 13:58 var

So the file /process_functional_tests_result.py really did disappear. The test log shows the run command:

docker run --volume=/home/ubuntu/actions-runner/_work/_temp/stateless_memory/packages:/package_folder --volume=/home/ubuntu/actions-runner/_work/_temp/stateless_memory/result_path:/test_output --volume=/home/ubuntu/actions-runner/_work/_temp/stateless_memory/server_log:/var/log/clickhouse-server --cap-add=SYS_PTRACE -e MAX_RUN_TIME=9720 -e S3_URL="https://clickhouse-datasets.s3.amazonaws.com" -e RUN_BY_HASH_NUM=1 -e RUN_BY_HASH_TOTAL=3 -e ADDITIONAL_OPTIONS="--hung-check --print-time" clickhouse/stateless-test:0-a2aa0bc6cc2454801d7e7a0e589446fa22fd56aa

So the broken image is clickhouse/stateless-test:0-a2aa0bc6cc2454801d7e7a0e589446fa22fd56aa. Good, at least I can look inside it:

$ docker run -it clickhouse/stateless-test:0-a2aa0bc6cc2454801d7e7a0e589446fa22fd56aa bash
Unable to find image 'clickhouse/stateless-test:0-a2aa0bc6cc2454801d7e7a0e589446fa22fd56aa' locally
0-a2aa0bc6cc2454801d7e7a0e589446fa22fd56aa: Pulling from clickhouse/stateless-test
f3ef4ff62e0d: Already exists 
3d1a8d4d93fb: Already exists 
462f6959c661: Pull complete 
b6c3f9606c06: Pull complete 
0819924d87b0: Pull complete 
e0a205380810: Pull complete 
2a1a19031c05: Pull complete 
82b024a930f7: Pull complete 
8d8a78a6b2c3: Pull complete 
93122f251cd7: Pull complete 
935c96713a56: Pull complete 
e8fe7237deee: Pull complete 
04ae740202e2: Pull complete 
49282087d3d9: Pull complete 
557c42f3ba25: Pull complete 
f29129b9a12b: Pull complete 
5f30820f10e6: Pull complete 
Digest: sha256:a76776cec3fecc34bc834118fadc6de467b61752b7cf97f7271f96863297c516
Status: Downloaded newer image for clickhouse/stateless-test:0-a2aa0bc6cc2454801d7e7a0e589446fa22fd56aa
root@8d756f052a93:/# ls -lha
total 130M
drwxr-xr-x   1 root root 4.0K Dec 10 19:57 .
drwxr-xr-x   1 root root 4.0K Dec 10 19:57 ..
-rwxr-xr-x   1 root root    0 Dec 10 19:57 .dockerenv
lrwxrwxrwx   1 root root    7 Sep 21 19:48 bin -> usr/bin
drwxr-xr-x   2 root root 4.0K Apr 15  2020 boot
drwxr-xr-x   5 root root  360 Dec 10 19:57 dev
-rwxr-xr-x   1 root root 769K Nov 29 13:18 dpkg-deb
drwxr-xr-x   1 root root 4.0K Dec 10 19:57 etc
drwxr-xr-x   2 root root 4.0K Apr 15  2020 home
lrwxrwxrwx   1 root root    7 Sep 21 19:48 lib -> usr/lib
lrwxrwxrwx   1 root root    9 Sep 21 19:48 lib32 -> usr/lib32
lrwxrwxrwx   1 root root    9 Sep 21 19:48 lib64 -> usr/lib64
lrwxrwxrwx   1 root root   10 Sep 21 19:48 libx32 -> usr/libx32
-rwxr-xr-x   1 root root  21M Dec 10 03:16 mc
drwxr-xr-x   2 root root 4.0K Sep 21 19:48 media
-rwxr-xr-x   1 root root 108M Dec  9 10:37 minio
drwxr-xr-x   2 root root 4.0K Sep 21 19:48 mnt
drwxr-xr-x   2 root root 4.0K Sep 21 19:48 opt
dr-xr-xr-x 450 root root    0 Dec 10 19:57 proc
-rwxrwxr-x   1 root root 5.4K Oct 13 11:34 process_functional_tests_result.py
drwx------   1 root root 4.0K Dec 10 18:21 root
drwxr-xr-x   1 root root 4.0K Nov 29 13:18 run
-rwxrwxr-x   1 root root 8.5K Dec 10 17:56 run.sh
lrwxrwxrwx   1 root root    8 Sep 21 19:48 sbin -> usr/sbin
-rwxrwxr-x   1 root root 1.5K Dec 10 17:56 setup_minio.sh
drwxr-xr-x   2 root root 4.0K Sep 21 19:48 srv
dr-xr-xr-x  13 root root    0 Dec 10 19:57 sys
drwxrwxrwt   1 root root 4.0K Dec 10 18:21 tmp
drwxr-xr-x   1 root root 4.0K Nov 29 13:18 usr
drwxr-xr-x   1 root root 4.0K Sep 21 20:00 var

And... the file exists. It's some kind of obscure magic. I also checked the runs for the commits in my PR with these changes (for example https://github.com/ClickHouse/ClickHouse/runs/4482963254?check_suite_focus=true), and the file is present there too:

+ ls -lha
total 130M
drwxr-xr-x   1 root root 4.0K Dec 10 08:43 .
drwxr-xr-x   1 root root 4.0K Dec 10 08:43 ..
-rwxr-xr-x   1 root root    0 Dec 10 08:42 .dockerenv
lrwxrwxrwx   1 root root    7 Sep 21 13:48 bin -> usr/bin
drwxr-xr-x   2 root root 4.0K Apr 15  2020 boot
drwxr-xr-x   5 root root  340 Dec 10 08:43 dev
-rwxr-xr-x   1 root root 769K Nov 29 07:18 dpkg-deb
drwxr-xr-x   1 root root 4.0K Dec 10 08:43 etc
drwxr-xr-x   2 root root 4.0K Apr 15  2020 home
lrwxrwxrwx   1 root root    7 Sep 21 13:48 lib -> usr/lib
lrwxrwxrwx   1 root root    9 Sep 21 13:48 lib32 -> usr/lib32
lrwxrwxrwx   1 root root    9 Sep 21 13:48 lib64 -> usr/lib64
lrwxrwxrwx   1 root root   10 Sep 21 13:48 libx32 -> usr/libx32
-rwxr-xr-x   1 root root  21M Dec  9 21:16 mc
drwxr-xr-x   2 root root 4.0K Sep 21 13:48 media
-rwxr-xr-x   1 root root 108M Dec  9 04:37 minio
drwxr-xr-x   2 root root 4.0K Sep 21 13:48 mnt
drwxr-xr-x   2 root root 4.0K Sep 21 13:48 opt
drwxrwxr-x   2 1000 1000 4.0K Dec 10 08:42 package_folder
dr-xr-xr-x 307 root root    0 Dec 10 08:43 proc
-rwxrwxr-x   1 root root 5.4K Oct 13 05:34 process_functional_tests_result.py
drwx------   1 root root 4.0K Dec 10 08:11 root
drwxr-xr-x   1 root root 4.0K Dec 10 08:43 run
-rwxrwxr-x   1 root root 8.5K Dec 10 08:02 run.sh
lrwxrwxrwx   1 root root    8 Sep 21 13:48 sbin -> usr/sbin
-rwxrwxr-x   1 root root 1.5K Dec 10 08:02 setup_minio.sh
drwxr-xr-x   2 root root 4.0K Sep 21 13:48 srv
dr-xr-xr-x  13 root root    0 Dec 10 08:43 sys
drwxrwxr-x   2 1000 1000 4.0K Dec 10 08:42 test_output
drwxrwxrwt   1 root root 4.0K Dec 10 08:11 tmp
drwxr-xr-x   1 root root 4.0K Nov 29 07:18 usr
drwxr-xr-x   1 root root 4.0K Sep 21 14:00 var

The result of the investigation is 0_o.
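
Comparing the listings by eye is error-prone, so here is a minimal sketch of how the comparison could be automated. It is not part of the ClickHouse CI: the helper name `names_from_ls` is made up for illustration, and the two strings are short excerpts of the listings above, not the full output.

```python
def names_from_ls(listing):
    """Return the set of entry names found in `ls -l` style output."""
    names = set()
    for line in listing.splitlines():
        # A long-format entry has 9 columns: mode, links, owner, group,
        # size, month, day, time-or-year, name (the name may contain spaces).
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue  # skips "total 130M", the "+ ls -lha" trace line, blanks
        name = fields[8].split(" -> ")[0]  # drop the symlink target, if any
        if name not in (".", ".."):
            names.add(name)
    return names


# Excerpt of the listing from the failing run in master.
FAILING_RUN = """\
+ ls -lha
total 130M
lrwxrwxrwx   1 root root    7 Oct  6 13:47 bin -> usr/bin
-rwxr-xr-x   1 root root 108M Dec  9 04:37 minio
drwxrwxr-x   2 1000 1000 4.0K Dec 10 12:03 package_folder
dr-xr-xr-x 292 root root    0 Dec 10 12:03 proc
drwx------   1 root root 4.0K Dec 10 11:43 root
-rwxrwxr-x   1 root root 8.5K Dec 10 11:42 run.sh
"""

# Excerpt of the listing from a run on the PR commits.
PR_RUN = """\
+ ls -lha
total 130M
lrwxrwxrwx   1 root root    7 Sep 21 13:48 bin -> usr/bin
-rwxr-xr-x   1 root root 108M Dec  9 04:37 minio
drwxrwxr-x   2 1000 1000 4.0K Dec 10 08:42 package_folder
dr-xr-xr-x 307 root root    0 Dec 10 08:43 proc
-rwxrwxr-x   1 root root 5.4K Oct 13 05:34 process_functional_tests_result.py
drwx------   1 root root 4.0K Dec 10 08:11 root
-rwxrwxr-x   1 root root 8.5K Dec 10 08:02 run.sh
"""

# Entries present in the PR run but absent from the failing run.
missing = sorted(names_from_ls(PR_RUN) - names_from_ls(FAILING_RUN))
print(missing)  # ['process_functional_tests_result.py']
```

The sketch compares names only. It ignores sizes and timestamps, which also differ between the listings.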

@alesapin
Member Author

So I'll just wait and, if everything is OK, try to merge one more time. I have no other ideas.

@alexey-milovidov
Member

"No such file or directory" is just the message text for the ENOENT errno. The Linux kernel returns ENOENT in the following cases:

ENOENT The file pathname or a script or ELF interpreter does not exist, or a shared library needed for the file or interpreter cannot be found.

Note the last part: a shared library needed for the file or interpreter cannot be found.

It probably also happens if the dynamic loader is not executable, or if one of the shared libraries has some capabilities or obscure filesystem flags set.
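
To illustrate the point, here is a minimal sketch showing that executing a file which exists can still fail with ENOENT when its interpreter is missing. It assumes a Linux system, a temporary directory that is not mounted noexec, and that /bin/sh exists. The helper name `exec_errno` and the path `/nonexistent/interpreter` are made up for the example. Nothing here is taken from run.sh.

```python
import errno
import os
import stat
import subprocess
import tempfile


def exec_errno(script_text):
    """Write an executable script, try to run it, return the errno or None."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script")
        # Close the file before executing it, otherwise exec may fail
        # with ETXTBSY ("Text file busy").
        with open(path, "w") as f:
            f.write(script_text)
        os.chmod(path, stat.S_IRWXU)
        assert os.path.exists(path)  # the file itself is definitely there
        try:
            subprocess.run([path], check=False)
        except OSError as exc:
            return exc.errno
        return None


# Hypothetical interpreter path, chosen so that it does not exist.
print(exec_errno("#!/nonexistent/interpreter\n") == errno.ENOENT)

# Same file layout, but with an interpreter that exists: no error.
print(exec_errno("#!/bin/sh\nexit 0\n"))
```

Under those assumptions the first call should report ENOENT even though the script file exists, and the second should return None. This only demonstrates the missing-interpreter case. In the failing runs above, the file was also absent from the ls -lha listing, so it does not by itself explain that failure.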

@alesapin
Member Author

Ok, everything works. I'm going to try one more time.

@alesapin alesapin merged commit 5a54251 into master Dec 10, 2021
@alesapin alesapin deleted the revert-32514-revert-32496-smaller_checks branch December 10, 2021 19:30