
add gcompat for the recognize app #1569

Merged
1 commit merged into main on Dec 21, 2022

Conversation

szaimen
Collaborator

@szaimen szaimen commented Dec 21, 2022

It is needed for the recognize app IIRC.

Signed-off-by: Simon L <szaimen@e.mail.de>
@szaimen szaimen added 3. to review Waiting for reviews enhancement New feature or request labels Dec 21, 2022
@szaimen
Collaborator Author

szaimen commented Dec 21, 2022

@marcelklehr I read that gcompat is required for the recognize app to work. Is this correct? :)

@thomasverelst

thomasverelst commented Dec 21, 2022

Hi, this is based on my gist https://gist.github.com/thomasverelst/60403f22e7a9f5200b0dc43615c54720, where I included it.
I'll repeat it here:
"I'm running my setup on ARM64 architecture for now (Oracle free tier), so maybe that's why I needed gcompat. Recognize seems to include a Node.js binary, but it doesn't run in the Docker container (I tried running it in the container shell), and gcompat was required. I also tried to include nodejs and npm as apks in the master Docker command, but even when pointing the path in Recognize to the right nodejs, it failed to execute. Maybe gcompat is not needed for x86-64; I did not test."

@marcelklehr
Member

marcelklehr commented Dec 21, 2022

I read that gcompat is required for the recognize app to work.

Huh, I never read that anywhere. I've been running it fine inside Docker on an x86_64 machine (until the setup destroyed itself, but that's another story...).

EDIT: Ah, you mean for the Alpine package. Yeah, glibc is necessary if you don't want to run in WASM mode.
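A quick way to check from the standard library whether an environment has glibc at all (a sketch added for illustration; it is not part of recognize or AIO):

```python
import platform

# platform.libc_ver() inspects the libc the running interpreter was linked
# against. On glibc systems it returns e.g. ("glibc", "2.36"); on musl-based
# systems such as Alpine it typically returns ("", ""), since there is no glibc.
libc, version = platform.libc_ver()
if libc == "glibc":
    print(f"glibc {version}: natively linked libraries like libtensorflow can load")
else:
    print("no glibc detected (likely musl): expect WASM mode or a compat shim")
```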

@szaimen szaimen added this to the next milestone Dec 21, 2022
@szaimen szaimen merged commit ce54d6d into main Dec 21, 2022
@delete-merged-branch delete-merged-branch bot deleted the enh/noid/add-gcompat branch December 21, 2022 22:21
@szaimen
Collaborator Author

szaimen commented Dec 23, 2022

This is now released with v4.0.1 Beta. Testing and feedback is welcome! See https://github.com/nextcloud/all-in-one#how-to-switch-the-channel

@erwanlpfr

Hello,

With the current beta tag from Docker Hub, I still have the issue mentioned here: nextcloud/recognize#604

Do you have any idea how to debug it? I can't run npm ci.

Thanks!

@marcelklehr
Member

@erwanlpfr Hello! Can you try running .../recognize/bin/node .../recognize/src/test_libtensorflow.js where .../recognize is the path to recognize on your server?

@erwanlpfr

Hmm, I waited one day and it seems to work now.
Maybe it was just slow for some reason.

Thanks for your reply!

@marcelklehr
Member

Great! I'm happy it works now :)

@fabianbees

fabianbees commented Jan 19, 2023

@erwanlpfr Hello! Can you try running .../recognize/bin/node .../recognize/src/test_libtensorflow.js where .../recognize is the path to recognize on your server?

@marcelklehr For me recognize still doesn't work in the nextcloud-aio container; gcompat doesn't solve the issue of libtensorflow not being compatible with Alpine/musl.

The following command never finishes, which explains the loading icon of recognize described here: nextcloud/recognize#609

bash-5.1# pwd
/var/www/html/custom_apps/recognize
bash-5.1# ./bin/node ./src/test_libtensorflow.js

If I install Alpine's musl build of Node.js (v16.17.1) via apk, test_libtensorflow.js finishes, but with an error.
gcompat does not help with the incompatibility of libtensorflow with musl; see:

bash-5.1# apk add nodejs npm gcompat
(1/3) Installing c-ares (1.18.1-r0)
(2/3) Installing nodejs (16.17.1-r0)
(3/3) Installing npm (8.10.0-r0)
Executing busybox-1.35.0-r17.trigger
OK: 371 MiB in 205 packages
bash-5.1# which node
/usr/bin/node
bash-5.1# /usr/bin/node --version
v16.17.1
bash-5.1# /usr/bin/node ./src/test_libtensorflow.js
node:internal/modules/cjs/loader:1210
  return process.dlopen(module, path.toNamespacedPath(filename));
                 ^

Error: Error relocating /var/www/html/custom_apps/recognize/node_modules/@tensorflow/tfjs-node/lib/napi-v8/../../deps/lib/libtensorflow.so.2: __memcpy_chk: symbol not found
    at Object.Module._extensions..node (node:internal/modules/cjs/loader:1210:18)
    at Module.load (node:internal/modules/cjs/loader:1004:32)
    at Function.Module._load (node:internal/modules/cjs/loader:839:12)
    at Module.require (node:internal/modules/cjs/loader:1028:19)
    at require (node:internal/modules/cjs/helpers:102:18)
    at Object.<anonymous> (/var/www/html/custom_apps/recognize/node_modules/@tensorflow/tfjs-node/dist/index.js:72:16)
    at Module._compile (node:internal/modules/cjs/loader:1126:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1180:10)
    at Module.load (node:internal/modules/cjs/loader:1004:32)
    at Function.Module._load (node:internal/modules/cjs/loader:839:12) {
  code: 'ERR_DLOPEN_FAILED'
}

Without gcompat, libtensorflow complains about the missing glibc dynamic loader:

bash-5.1# apk del gcompat
(1/3) Purging gcompat (1.0.0-r4)
(2/3) Purging musl-obstack (1.2.3-r0)
(3/3) Purging libucontext (1.2-r0)
OK: 371 MiB in 202 packages
bash-5.1# /usr/bin/node ./src/test_libtensorflow.js
node:internal/modules/cjs/loader:1210
  return process.dlopen(module, path.toNamespacedPath(filename));
                 ^

Error: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /var/www/html/custom_apps/recognize/node_modules/@tensorflow/tfjs-node/lib/napi-v8/../../deps/lib/libtensorflow.so.2)
    at Object.Module._extensions..node (node:internal/modules/cjs/loader:1210:18)
    at Module.load (node:internal/modules/cjs/loader:1004:32)
    at Function.Module._load (node:internal/modules/cjs/loader:839:12)
    at Module.require (node:internal/modules/cjs/loader:1028:19)
    at require (node:internal/modules/cjs/helpers:102:18)
    at Object.<anonymous> (/var/www/html/custom_apps/recognize/node_modules/@tensorflow/tfjs-node/dist/index.js:72:16)
    at Module._compile (node:internal/modules/cjs/loader:1126:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1180:10)
    at Module.load (node:internal/modules/cjs/loader:1004:32)
    at Function.Module._load (node:internal/modules/cjs/loader:839:12) {
  code: 'ERR_DLOPEN_FAILED'
}

@marcelklehr
Member

In that case I'm out of ideas.

@szaimen
Collaborator Author

szaimen commented Jan 20, 2023

@fabianbees thanks for testing! Can you check if adding libc6-compat additionally makes it work?

@fabianbees

@fabianbees thanks for testing! Can you check if adding libc6-compat additionally makes it work?

I have already tested that as well; for me it didn't make any difference, unfortunately still the same issue.

Maybe the only viable long-term solution for enabling compatibility with recognize would be to move this container back to a Debian userspace, which uses glibc natively.

@szaimen
Collaborator Author

szaimen commented Jan 20, 2023

No, moving back to Debian is not an option for AIO, so we need to find a way to make it work with Alpine Linux.

@szaimen
Collaborator Author

szaimen commented Jan 20, 2023

What I wonder: wasn't recognize already running in WASM mode on Alpine Linux in the past? Why is it not running at all anymore? Should we remove gcompat again?

@szaimen
Collaborator Author

szaimen commented Jan 20, 2023

Wow, is this the final answer? tensorflow/tfjs#1425. I thought recognize was able to work in WASM mode at some point?

@marcelklehr
Member

wasn't recognize already running in WASM mode on Alpine Linux in the past? Why is it not running at all anymore? Should we remove gcompat again?

I believe it still works in WASM mode. With gcompat, the check of whether libtensorflow can be loaded just never ends; without gcompat, it simply fails.

@szaimen
Collaborator Author

szaimen commented Jan 21, 2023

I see. Then removing gcompat again probably makes sense, since the test then at least finishes and WASM mode can be enabled.

@szaimen
Collaborator Author

szaimen commented Jan 21, 2023

Will be done in #1816

@szaimen
Collaborator Author

szaimen commented Feb 2, 2023

This is now released with v4.3.0 Beta. Testing and feedback is welcome! See https://github.com/nextcloud/all-in-one#how-to-switch-the-channel

@thomasverelst

thomasverelst commented Feb 4, 2023

I think the beta update (Nextcloud AIO v4.3.1) broke my recognize installation (reinstalled v3.4.0, WASM toggle enabled in Recognize settings):

Could not execute the Node.js binary. You may need to set the path to a working binary manually.

In logs:

Exception: Failed to install Tensorflow.js: sh: /var/www/html/custom_apps/recognize/bin/node: not found

It used to work before (in WASM mode on ARM), so it seems that some setups require gcompat even for WASM mode (luckily it's easy to add manually). I hope I'm not looking at this the wrong way; just reporting what I see :-)

@szaimen
Collaborator Author

szaimen commented Feb 4, 2023

There seems to be no way to make it work for everyone, so we removed gcompat again (as it breaks recognize on other instances). But if you need it, you can add it back manually as you did 👍

@szaimen
Collaborator Author

szaimen commented Feb 19, 2023

Looks like this should remove the need for gcompat on arm64 in a future update: nextcloud/recognize@30602dc
