Compilation issues on Alpine Linux #102

Closed
brebs-gh opened this issue Aug 8, 2021 · 13 comments

@brebs-gh

brebs-gh commented Aug 8, 2021

Below is an APKBUILD file, to package swi-prolog for Alpine Linux (which uses Musl instead of glibc). It has ".txt" appended to its filename, to upload it as a file here.

APKBUILD.txt

With swi-prolog 8.3.27 (and many previous versions), these tests fail:

53 - semweb:con (SEGFAULT)
58 - semweb:rdf_db (SEGFAULT)
59 - semweb:subprop (SEGFAULT)

With swi-prolog 8.3.28, the "26 - swipl:thread (SEGFAULT)" test also fails, and the test phase never finishes.

@JanWielemaker
Member

This issue has been mentioned on SWI-Prolog. There might be relevant details there:

https://swi-prolog.discourse.group/t/how-to-build-a-docker-image-with-swipl-and-jpl/4255/9

@brebs-gh
Author

Running with:
USE_PUBLIC_NETWORK_TESTS=false ctest -j 16 --output-on-failure || true

shows:

  File "/home/brebs/apk/swipl/src/swipl-8.3.28/packages/language_server/python/test_prologserver.py", line 650, in test_server_options_and_shutdown
    self.assertEqual(afterShutdownThreads, initialThreads)
AssertionError: Lists differ: ['mai[107 chars]er1_conn1_comm:running', 'language_server3_conn3_goal:running'] != ['mai[107 chars]er1_conn1_comm:running']

First list contains 1 additional elements.
First extra element 5:
'language_server3_conn3_goal:running'

  ['main:running',
   'gc:running',
   'language_server1:running',
   'language_server1_conn1_goal:running',
-  'language_server1_conn1_comm:running',
?                                       ^
+  'language_server1_conn1_comm:running']
?                                       ^
-  'language_server3_conn3_goal:running']

@brebs-gh
Author

Compiling without -DCMAKE_BUILD_TYPE=PGO stops the test phase from hanging. Still has the usual 3 test failures:

 54 - semweb:con (SEGFAULT)
 59 - semweb:rdf_db (SEGFAULT)
 60 - semweb:subprop (SEGFAULT)

@JanWielemaker
Member

The hang probably relates to https://swi-prolog.discourse.group/t/difficult-to-reproduce-problems-while-running-tests/4266/20?u=jan. We expect a proper patch for that soon. The others are unclear.

@brebs-gh
Author

brebs-gh commented Sep 2, 2021

With swi-prolog 8.3.29, PGO compilation combined with testing runs without hanging, but with these 4 test failures:

46 - mqi:mqi (Failed)
54 - semweb:con (SEGFAULT)
59 - semweb:rdf_db (SEGFAULT)
60 - semweb:subprop (SEGFAULT)

The mqi test error is:
AssertionError: Lists differ: ['mai[59 chars]running', 'mqi1_conn1_comm:running', 'mqi3_conn3_goal:running'] != ['mai[59 chars]running', 'mqi1_conn1_comm:running']

@JanWielemaker
Member

I'm trying to have a look using Docker. I think this is running Alpine 3.14 (how can I verify that?). Your deps refer to openjdk15, while openjdk11 seems to be the latest here. The package ossp-uuid-dev is missing as well (and is needed by mqi). Am I missing something?

@EricZinda

EricZinda commented Sep 2, 2021

I'm investigating the language_server failure above. It is happening on the last line of code below and means that the goal thread for the connection created using Unix Domain sockets did not go away.

  1. When the test calls stop_language_server/1 the predicate eventually calls thread_signal(Thread_ID, abort) on the goal thread.
  2. The test then calls the Python thread_list to get a list of active threads using Prolog thread_property and compares them to the list before the test to see if any are hanging around that should have been aborted.

I believe there could be a race in this test where the goal thread either hasn't reacted to the abort yet, or thread_property hasn't been updated yet. Not entirely sure how the Prolog system works so I can't say for sure.

I added a couple of comments below to test this hypothesis. If you put them into test_prologserver.py and uncomment the sleep(5) line, you can see whether that fixes the failure, which would confirm the suspected race condition.

                    # unixDomainSocket() should be used if supplied (non-windows).
                    socketPath = mkdtemp()
                    unixDomainSocket = PrologServer.unix_domain_socket_file(socketPath)
                    result = monitorThread.query("language_server([unix_domain_socket('{}'), password(testpassword), server_thread(ServerThreadID)])".format(unixDomainSocket))
                    serverThreadID = result[0]["ServerThreadID"]
                    with PrologServer(launch_server=False, unix_domain_socket=unixDomainSocket, password="testpassword", prolog_path=self.prologPath) as newServer:
                        with newServer.create_thread() as prologThread:
                            result = prologThread.query("true")
                            self.assertEqual(result, True)
                    result = monitorThread.query("stop_language_server({})".format(serverThreadID))
                    self.assertEqual(result, True)
                    # Uncomment to see if it fixes it
                    # sleep(5)
                    afterShutdownThreads = self.thread_list(monitorThread)
                    self.assertEqual(afterShutdownThreads, initialThreads)
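If the race suspicion holds, a fixed sleep(5) could be replaced by polling with a timeout. The following is only a minimal sketch of that idea, not the fix that was applied to the test suite; `wait_until` and its `timeout` and `interval` parameters are hypothetical names introduced here for illustration.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Hypothetical helper (not part of test_prologserver.py).
    Returns True as soon as the predicate holds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Simulated stand-in for self.thread_list(monitorThread): the extra goal
    # thread is still listed on the first two polls and gone on the third.
    initial = ["main:running", "gc:running"]
    snapshots = [initial + ["goal:running"], initial + ["goal:running"], initial]
    polls = iter(snapshots)
    print(wait_until(lambda: next(polls) == initial, timeout=1.0, interval=0.01))
```

In the test, the final assertion could then be written as `self.assertTrue(wait_until(lambda: self.thread_list(monitorThread) == initialThreads))`, which returns as soon as the goal thread disappears instead of always waiting the full five seconds.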

@JanWielemaker
Member

Running it verbosely in the Alpine Docker image indicates that, because ossp-uuid is missing, uuid(UUID, [format(integer)]) fails. I extended the pure Prolog emulation of the UUID library and now the mqi test passes.

@brebs-gh
Author

brebs-gh commented Sep 2, 2021

To clarify - I run Alpine "Edge" (basically the bleeding-edge), which becomes the next version of Alpine. Kinda like e.g. Debian Unstable.

To show the OS version, run:
cat /etc/os-release

ossp-uuid is only in Edge, in the testing repo (which is not enabled by default), so if you have removed its dependency then that's ideal :-)
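The same version check can be scripted, for example inside a CI container. This is a rough sketch that assumes /etc/os-release follows the usual KEY=value format with optionally shell-quoted values; `parse_os_release` and `read_os_release` are hypothetical helper names, not part of any tool mentioned in this thread.

```python
import shlex


def parse_os_release(text):
    """Parse os-release content (KEY=value lines) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be shell-quoted; shlex.split strips the quotes.
        parts = shlex.split(value)
        info[key] = parts[0] if parts else ""
    return info


def read_os_release(path="/etc/os-release"):
    """Read and parse the os-release file at `path`."""
    with open(path, encoding="utf-8") as handle:
        return parse_os_release(handle.read())


if __name__ == "__main__":
    release = read_os_release()
    print(release.get("ID"), release.get("VERSION_ID"))
```

The `ID` and `VERSION_ID` fields are the ones most useful for telling a release such as 3.14 apart from Edge.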

@brebs-gh
Author

brebs-gh commented Sep 2, 2021

For the openjdk 15, please change it to 11, or basically the latest available. I don't think Alpine's package manager has the concept of the latest version. In Edge I've got openjdk 9 to 16 available, visible with e.g.:

apk search openjdk | grep src | sort

@JanWielemaker
Member

Pushed SWI-Prolog/swipl-devel@836862a, which seems to work around the crashes. Before that, I pushed an update to library(uuid) to provide the UUID services mqi requires in pure Prolog, so it works if ossp-uuid is not around.

Now trying the whole build on our CI environment. It seems to build well, but there is a small issue with the reporting, so the result pages do not update properly. Will look at that tomorrow.

@JanWielemaker
Member

Enjoy 100% success at https://dev.swi-prolog.org/ci/home. You can find the dependencies and config at https://github.com/SWI-Prolog/docker-swipl-linux-ci/tree/master/alpine/3.14

@JanWielemaker
Member

This issue has been mentioned on SWI-Prolog. There might be relevant details there:

https://swi-prolog.discourse.group/t/how-to-build-a-docker-image-with-swipl-and-jpl/4255/10
