Build of bindings-common fails because of early file removal #2

Open
quentin-ag opened this issue Aug 28, 2020 · 5 comments
Comments

@quentin-ag

quentin-ag commented Aug 28, 2020

Level

Minor

Component

bindings-common (build)

Environment

  • COMPSs version: 2.7
  • Java version: 1.8.0
  • Python version: 3.8.5
  • GCC version: 10.2.0
  • libtool version: 2.4.6
  • Operating System: Linux (Manjaro)[1]

[1] successfully reproduced on Debian 10, CentOS 7 and Arch Linux.

Description

The build process of bindings-common fails because a necessary file cannot be found. The logs show that it is removed too early.
This issue seems similar to #1, although I do not see how they could have the same cause.

Minimal example to reproduce

cd ${compss_src}/builders
./buildlocal ${compss_target}

The error can then be reproduced with the same command as executed by buildlocal:

cd ${compss_src}/builders/tmp/compss/programming_model/bindings/bindings-common
./install_common "${compss_target}/Bindings/bindings-common"

Exception

The script ${compss_src}/compss/programming_model/bindings/bindings-common/install_common, which is called by ${compss_src}/builders/buildlocal, exits during the make clean install step with the following error message:

libtool:   error: 'libbindings_common_la-BindingDataManager.lo' is not a valid libtool object

Below is the relevant excerpt from the full output [2]:

/bin/sh ../libtool  --tag=CXX   --mode=link g++  -g -O2 -shared -L/usr/lib/jvm/java-8-openjdk/jre/lib/amd64/server -ljvm  -o libbindings_common.la -rpath ${compss_target}/Bindings/bindings-common/lib libbindings_common_la-BindingDataManager.lo libbindings_common_la-BindingExecutor.lo libbindings_common_la-JavaNioConnStreamBuffer.lo libbindings_common_la-AbstractCache.lo libbindings_common_la-compss_worker.lo libbindings_common_la-common.lo libbindings_common_la-GS_compss.lo  
libtool:   error: 'libbindings_common_la-BindingDataManager.lo' is not a valid libtool object
make[1]: *** [Makefile:436: libbindings_common.la] Error 1
make[1]: Leaving directory '${compss_src}/builders/tmp/compss/programming_model/bindings/bindings-common/src'
make: *** [Makefile:404: install-recursive] Error 1

The error message occasionally names another libbindings_common_la-*.lo file instead of libbindings_common_la-BindingDataManager.lo.
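
To see which object files are actually absent when the link step fails, a check along the following lines can be run in the src directory right after the failure. It is only a sketch: the helper name report_missing_lo is hypothetical, and the file list is copied from the link command above. libtool prints the same error for a file that exists but is not a valid libtool object, a case this sketch does not detect.

```shell
# Hypothetical helper: report which libtool objects named in the link command
# are absent from a directory (default: the current directory).
report_missing_lo() {
    dir=${1:-.}
    missing=0
    for name in BindingDataManager BindingExecutor JavaNioConnStreamBuffer \
                AbstractCache compss_worker common GS_compss; do
        file="$dir/libbindings_common_la-$name.lo"
        if [ ! -f "$file" ]; then
            echo "missing: $file"
            missing=$((missing + 1))
        fi
    done
    echo "$missing of 7 libtool objects missing"
    # Succeed only when every object is present.
    [ "$missing" -eq 0 ]
}
```

For example, `report_missing_lo .` run from the src directory lists each absent file and returns a non-zero status if any is missing.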

N.B. I have replaced some file paths with placeholders, such as ${compss_src}.

[2] Output of

cd ${compss_src}/builders/tmp/compss/programming_model/bindings/bindings-common
./install_common "${compss_target}/Bindings/bindings-common"

Expected behaviour and workaround

The build should not fail. Presumably, libbindings_common_la-BindingDataManager.lo should not be removed while it is still needed by the link step.

I worked around the issue by not removing any of these files:

diff --git a/compss/programming_model/bindings/bindings-common/src/Makefile.am b/compss/programming_model/bindings/bindings-common/src/Makefile.am
index 529340b5b..84ab6f144 100644
--- a/compss/programming_model/bindings/bindings-common/src/Makefile.am
+++ b/compss/programming_model/bindings/bindings-common/src/Makefile.am
@@ -25,4 +25,4 @@ libbindings_common_la_LDFLAGS = -shared -L$(JAVA_LIB_DIR) -ljvm
 ACLOCAL_AMFLAGS =-I m4

 clean:
-	rm -f *.o *.lo *~
+	rm -f *.o *~

This modification solved the issue for me. However, it does not look like a clean fix.
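
One way to narrow this down without patching Makefile.am is to run the two targets as separate, serial make invocations instead of the single make clean install. This is a diagnostic sketch only: it assumes that the ordering of clean relative to the build matters, which nothing in this report confirms, and run_clean_then_install and the MAKE override are hypothetical names.

```shell
# Hypothetical diagnostic: run 'clean' and 'install' one after the other,
# each in its own make invocation, in the given directory.
# MAKE can be overridden (for example MAKE=gmake); it defaults to 'make'.
run_clean_then_install() {
    dir=${1:-.}
    # Show any inherited make options; a '-j' here would request parallel jobs.
    echo "MAKEFLAGS=${MAKEFLAGS:-}"
    # -j1 requests serial execution for each step.
    ${MAKE:-make} -C "$dir" -j1 clean &&
        ${MAKE:-make} -C "$dir" -j1 install
}
```

If the build succeeds this way with the original clean rule in place, that would point at the interaction between clean and the other targets rather than at the rule's file list.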

@jorgee
Member

jorgee commented Aug 31, 2020

Looking at the full log, it seems the installation is done twice! The first run works and the second one fails.

@quentin-ag
Author

quentin-ag commented Aug 31, 2020

Your observation is right. Unfortunately, the log file was wrong.

I have just reproduced the error, and the output contains only the second part. I must have made a mistake when I captured the log, probably appending to the log file (>>) instead of overwriting it (>).

I have updated the link to point to a new, correct log file (of the same command [2]). I apologise for the mistake.
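
For anyone capturing the same log, the difference between the two redirections can be sketched as follows. capture_log is a hypothetical helper, not part of COMPSs; it truncates the log on every run and records both standard output and standard error.

```shell
# Hypothetical helper: run a command and overwrite (not append to) a log file.
# Usage: capture_log LOGFILE COMMAND [ARGS...]
capture_log() {
    log=$1
    shift
    # '>' truncates the file first; '>>' would keep output from earlier runs.
    # '2>&1' sends stderr, where error messages are usually written,
    # to the same file as stdout.
    "$@" > "$log" 2>&1
}
```

For example, `capture_log install_common.log ./install_common "${compss_target}/Bindings/bindings-common"` leaves only the output of the latest run in install_common.log (the log file name is an arbitrary choice).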

@jorgee
Member

jorgee commented Sep 1, 2020

Do you have a portable environment where I can test this installation with the OS which is failing (such as a docker image)?

@quentin-ag
Author

Unfortunately, no. I have tried to reproduce the issue in an environment that you could use, especially with the same GCC and libtool versions, but the problem did not appear.

I have tried again with the latest version of branch 2.7 (commit b2e235f). The issue is unchanged on my Arch-based distributions (Manjaro and Arch Linux) and the workaround still works. There is no such problem on my Debian and CentOS environments, but I am not sure what to deduce from this.

At the moment I do not have much time to try to reproduce the issue in a portable environment. I propose to keep this ticket open so that I can give updates when I have more information, or if the situation changes.
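
When comparing a failing machine with a working one, it may help to record the versions of the build tools in each environment. The sketch below prints the first line of the version banner for the tools listed in the Environment section plus make. Including make is an assumption, since its version is not reported in this issue, and print_tool_versions is a hypothetical name.

```shell
# Hypothetical helper: print one version line per build tool, or a
# "not found" note when the tool is not on PATH.
print_tool_versions() {
    for tool in gcc libtool make; do
        if command -v "$tool" >/dev/null 2>&1; then
            # Only the first line of the version banner is kept.
            echo "$tool: $("$tool" --version 2>/dev/null | head -n 1)"
        else
            echo "$tool: not found"
        fi
    done
}
```

Running `print_tool_versions` on each machine and comparing the outputs side by side shows at a glance whether the toolchains differ.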

@jorgee
Member

jorgee commented Sep 16, 2020

I have found an official Arch Linux docker image. I was looking for a Manjaro one. I will try the installation there to see whether the problem occurs.

compsuperscalar pushed a commit that referenced this issue Feb 10, 2021
Resolve "Stop MN agents on application end"

Closes #2

See merge request wdc/compss/framework!5