Jenkins build sometimes fails in postgres #5380

Open
svarnau opened this issue Aug 11, 2020 · 5 comments

svarnau commented Aug 11, 2020

Standard error from external program {{ make MAKELEVEL=0 -j 80 }} running in '/nfusr/centos-gcp-cloud/jenkins-worker-4a1/jenkins/jenkins-github-yugabyte-db-centos-master-clang-release-1176/build/release-clang-dynamic-ninja/postgres_build', saving stdout to {{ /nfusr/centos-gcp-cloud/jenkins-worker-4a1/jenkins/jenkins-github-yugabyte-db-centos-master-clang-release-1176/build/release-clang-dynamic-ninja/postgres_build/make.out }}, stderr to {{ /nfusr/centos-gcp-cloud/jenkins-worker-4a1/jenkins/jenkins-github-yugabyte-db-centos-master-clang-release-1176/build/release-clang-dynamic-ninja/postgres_build/make.err }}:
cat: access/objfiles.txt: No such file or directory
cat: bootstrap/objfiles.txt: No such file or directory
cat: catalog/objfiles.txt: No such file or directory
cat: parser/objfiles.txt: No such file or directory
cat: commands/objfiles.txt: No such file or directory
cat: executor/objfiles.txt: No such file or directory
cat: foreign/objfiles.txt: No such file or directory
cat: lib/objfiles.txt: No such file or directory
...

svarnau self-assigned this Aug 11, 2020
svarnau changed the title from "Parallel build sometimes fails in postgres" to "Jenkins build sometimes fails in postgres" Aug 12, 2020

svarnau commented Aug 12, 2020

I looked through a number of build logs for patterns. Several builds had real compile errors, and one appeared to be a network issue (no route to host). A few others appeared to be caused by malformed dependency files, for example:

.deps/fmgrtab.Po:139: *** missing separator. Stop.

These errors are most likely caused by parallelism or network issues.


svarnau commented Aug 13, 2020

Generating dependency files could be causing this error with parallel make: if a .Po dependency file is being written while another make process is trying to read it, we could see a makefile format error.
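
To illustrate the hypothesis above, here is a minimal Python sketch of what a reader could see if it picked up a dependency file that was only partly written. The file contents, the truncation point, and the `first_malformed_line` helper are made up for illustration, and the check is a deliberately simplified stand-in for make's parser (it only looks for a `:` or `=` separator on each logical line).

```python
def first_malformed_line(text):
    """Return (line_number, content) of the first logical line that has no
    ':' or '=' separator, or None if every line looks like a rule or assignment.

    Simplified on purpose: handles backslash continuations, blank lines and
    comments, but not recipes or other makefile syntax.
    """
    pending = ""
    start = None
    for number, raw in enumerate(text.splitlines(), start=1):
        if start is None:
            start = number
        if raw.endswith("\\"):
            # Continuation: keep accumulating the logical line.
            pending += raw[:-1] + " "
            continue
        logical = (pending + raw).strip()
        line_start, pending, start = start, "", None
        if not logical or logical.startswith("#"):
            continue
        if ":" not in logical and "=" not in logical:
            return line_start, logical
    return None


# Hypothetical contents of a complete compiler-generated dependency file.
FULL = (
    "fmgrtab.o: fmgrtab.c ../../../src/include/postgres.h \\\n"
    " ../../../src/include/c.h\n"
    "\n"
    "../../../src/include/postgres.h:\n"
    "\n"
    "../../../src/include/c.h:\n"
)

# What a concurrent reader might see if the writer has not finished yet.
PARTIAL = FULL[:-4]

print(first_malformed_line(FULL))
print(first_malformed_line(PARTIAL))
```

The complete file passes the check, while the truncated copy ends in a bare path with no separator, which is the kind of line make rejects with "missing separator".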

The postgres makefile has an option to not generate dependencies, which might be a good fit for Jenkins running a clean build, but it is not clear whether that changes other compiler configuration.


svarnau commented Aug 13, 2020

A trial build with autoupdate=no looks very promising for building postgres without dependency files. We would need to make that an optional flag that Jenkins can use.


svarnau commented Aug 14, 2020

Changing the build to not enable dependencies works with release/debug builds but not with asan/tsan builds. An alternative approach is to leave dependencies alone and retry the postgres build if it hits this error.

svarnau added a commit that referenced this issue Apr 14, 2021
Summary:
Retry the postgres build if a transient error message is found in the error log.
For now, the only message we look for is one caused by the creation of dependency files, which can
produce a make error:
   .deps/fmgrtab.Po:139: *** missing separator. Stop.

Test Plan:
yb_build.sh, seeding a postgres code error that looks like the transient error.
Jenkins: compile only

Reviewers: mbautin, jharveysmith

Reviewed By: jharveysmith

Subscribers: rsami, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D9158
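
The retry approach described in the commit above could look roughly like the following Python sketch. It is not the actual `build_postgres.py` code: the regular expression, the `build_with_retries` name, the `run_build` callable and the attempt limit are all assumptions made for illustration.

```python
import re

# Assumed pattern for the one transient message named in this issue.
TRANSIENT_ERROR_RE = re.compile(r"\.deps/\S+\.Po:\d+: \*\*\* missing separator")


def has_transient_error(stderr_text):
    """True if any line of stderr matches the known transient pattern."""
    return any(TRANSIENT_ERROR_RE.search(line) for line in stderr_text.splitlines())


def build_with_retries(run_build, max_attempts=3):
    """Call run_build() until it succeeds, retrying only on transient failures.

    run_build is a hypothetical callable returning (exit_code, stderr_text).
    Returns the number of attempts that were needed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        exit_code, stderr_text = run_build()
        if exit_code == 0:
            return attempt
        # Real compile errors, and transient errors on the last attempt, are fatal.
        if attempt == max_attempts or not has_transient_error(stderr_text):
            raise RuntimeError(
                f"build failed on attempt {attempt} (exit code {exit_code})"
            )
```

In a real build script, `run_build` would wrap the `make` invocation and return its exit code and captured stderr. Only failures that match the transient pattern are retried, so genuine compile errors still fail fast.
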
svarnau closed this as completed Apr 15, 2021
mbautin reopened this Apr 15, 2021

mbautin commented Apr 15, 2021

I think we should not close this because the current workaround does not address the root cause.

YintongMa pushed a commit to YintongMa/yugabyte-db that referenced this issue May 26, 2021
mbautin added a commit that referenced this issue Jun 30, 2021
Summary:
Improving `compile_commands.json` generation to achieve fully error-free offline indexing of our C/C++ code using clangd-indexer. Soon clangd-indexer will be packaged with the LLVM build that we distribute and use, and then we can run clangd-indexer during our CI/CD and verify that the compile_commands.json file remains correct.

Also fixing a bug in the `build_postgres.py` script where we are detecting and retrying transient errors during PostgreSQL build. It looks like these retries were not happening because we were not splitting postgres build stderr into lines.

Removing the logic that re-executed the remote compilation script on the same host with Bash debugging turned on in compiler-wrapper.sh when stderr was empty. This was causing errors from which the retry loop failed to recover, while making the logic more complicated and not providing any benefit in diagnosing issues. Instead, we simply retry on any non-zero exit code when stderr is empty.

Also fixing various shellcheck issues.

Test Plan:
Jenkins

```
/opt/yb-build/llvm/yb-llvm-v12.0.0-1624139287-d28af7c6/bin/clangd-indexer --executor=all-TUs compile_commands.json >clangd.dex
```

Then start Visual Studio Code with the following settings:

```
    "clangd.path": "/opt/yb-build/llvm/yb-llvm-v12.0.0-1624139287-d28af7c6/bin/clangd",
    "clangd.arguments": [
        "-index-file=/my/home/directory/code/yugabyte-db/clangd.dex",
        "--background-index=false"
    ],

```

and verify that code navigation works and that the Clangd server initializes properly and loads the static index:
```
I[18:58:31.050] clangd version 12.0.0 (https://github.com/llvm/llvm-project.git d28af7c654d8db0b68c175db5ce212d74fb5e9bc)
I[18:58:31.050] PID: 554546
I[18:58:31.050] Working directory: /my/home/directory/code/yugabyte-db
I[18:58:31.050] argv[0]: /opt/yb-build/llvm/yb-llvm-v12.0.0-1624139287-d28af7c6/bin/clangd
I[18:58:31.050] argv[1]: -index-file=/my/home/directory/code/yugabyte-db/clangd.dex
I[18:58:31.050] argv[2]: --background-index=false
I[18:58:31.050] Starting LSP over stdin/stdout
I[18:58:31.050] <-- initialize(0)
I[18:58:31.050] Client supports legacy semanticHighlights notification and standard semanticTokens request, choosing the latter (no notifications).
I[18:58:31.051] --> reply:initialize(0) 0 ms
I[18:58:31.101] <-- initialized
I[18:58:31.103] <-- textDocument/didOpen
I[18:58:31.196] Loaded compilation database from /my/home/directory/code/yugabyte-db/compile_commands.json
```

Also `./yb_build.sh --shellcheck`

Reviewers: rsami, rskannan, jharveysmith, steve.varnau

Reviewed By: steve.varnau

Subscribers: ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D12036
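
The commit above notes that retries were not happening because the postgres build stderr was not being split into lines. The actual code is not shown in this thread, so the following Python sketch only illustrates one way a bug of that kind can arise: a per-line check is handed the whole stderr string, so it iterates over single characters and never matches. The marker string and function name are hypothetical.

```python
# Hypothetical marker; the real script may match a different pattern.
TRANSIENT_MARKER = "missing separator"


def any_line_is_transient(lines):
    """Expects an iterable of lines and checks each one for the marker."""
    return any(TRANSIENT_MARKER in line for line in lines)


stderr_text = (
    "cat: access/objfiles.txt: No such file or directory\n"
    ".deps/fmgrtab.Po:139: *** missing separator.  Stop.\n"
)

# Passing the raw string iterates over characters, so nothing ever matches.
unsplit_result = any_line_is_transient(stderr_text)

# Splitting into lines first lets the per-line check see whole messages.
split_result = any_line_is_transient(stderr_text.splitlines())

print(unsplit_result, split_result)
```

Because a Python string is itself iterable, a mistake like this raises no exception, and the retry is simply skipped.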