Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Server crash when hedgedoc container infrastructure is missing avx cpu flag with "Illegal instruction" #5229

Open
Guilamb opened this issue Nov 8, 2023 · 5 comments
Labels
type: bug Something isn't working

Comments

@Guilamb
Copy link

Guilamb commented Nov 8, 2023

Description

Issue Description


We are currently transitioning from Etherpad to Hedgedoc as our primary note taking tool.
During the migration, we encountered an issue where copying and pasting bullet points from Etherpad into a Hedgedoc note, as well as uploading images, have been consistently resulting in a server crash.

As an example the following text always triggers a server crash :

    Mysql extended version 3 salles sous bosh et non extended version en cours de migration sur k3s

For the images, they are correctly uploaded on the server before the crash, but not displayed when the server restart.

We have successfully used the migration tool for old notes, and although the final layout requires some adjustments, no crashes occurred during the migration process.

Steps to reproduce

  1. Copy the specified text or bullet points from an Etherpad document.
  2. Attempt to paste it into a Hedgedoc note or upload an image to a note.
  3. Observe that the server crashes upon performing these actions.

Expected behaviour

Content from Etherpad should be pasted into Hedgedoc notes without causing server crashes, and image uploads should work as expected.

Logs

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:19.547Z info:     Executing (default): SELECT `sid`, `expires`, `data`, `createdAt`, `updatedAt` FROM `Sessions` AS `Session` WHERE `Session`.`sid` = 'olFtjwSmXX

U4_Q4DB9islT6X-SI4Xnbc';                                                                                                                                                                                    

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:19.547Z info:     Executing (default): SELECT count(*) AS `count` FROM `Notes` AS `Note`;                                                                        

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:19.550Z info:     Executing (default): SELECT count(*) AS `count` FROM `Users` AS `User`;                                                                        

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:19.551Z info:     Executing (default): INSERT INTO `Sessions` (`sid`,`expires`,`data`,`createdAt`,`updatedAt`) VALUES (?,?,?,?,?);                               

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:19.552Z info:     10.16.18.69 - - [08/Nov/2023:13:49:19 +0000] "GET /status HTTP/1.1" 200 266 "-" "kube-probe/1.24"                                              

hedgedoc-859b57d5cb-nq5qc                                                                                                                                                                                   

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:19.554Z info:     10.16.18.69 - - [08/Nov/2023:13:49:19 +0000] "GET /_health HTTP/1.1" 200 14 "-" "kube-probe/1.24"                                              

hedgedoc-859b57d5cb-nq5qc                                                                                                                                                                                   

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:24.547Z info:     Executing (default): SELECT count(*) AS `count` FROM `Notes` AS `Note`;                                                                        

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:24.547Z info:     Executing (default): SELECT `sid`, `expires`, `data`, `createdAt`, `updatedAt` FROM `Sessions` AS `Session` WHERE `Session`.`sid` = 'CL11YqZ2k4

f7ozsX5GlsM7gV30RfqlfE';                                                                                                                                                                                    

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:24.550Z info:     Executing (default): SELECT count(*) AS `count` FROM `Users` AS `User`;                                                                        

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:24.551Z info:     Executing (default): INSERT INTO `Sessions` (`sid`,`expires`,`data`,`createdAt`,`updatedAt`) VALUES (?,?,?,?,?);                               

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:24.552Z info:     10.16.18.69 - - [08/Nov/2023:13:49:24 +0000] "GET /status HTTP/1.1" 200 266 "-" "kube-probe/1.24"

hedgedoc-859b57d5cb-nq5qc 

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:24.553Z info:     10.16.18.69 - - [08/Nov/2023:13:49:24 +0000] "GET /_health HTTP/1.1" 200 14 "-" "kube-probe/1.24"

hedgedoc-859b57d5cb-nq5qc 

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:27.485Z debug:     SERVER received [a6825b47-7c22-45b7-b2d8-811d2d366af0] user status from [_h4At88jS-rBJ7vEAAAA]: {"idle":false,"type":"lg"}

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:29.546Z info:     Executing (default): SELECT `sid`, `expires`, `data`, `createdAt`, `updatedAt` FROM `Sessions` AS `Session` WHERE `Session`.`sid` = 'Xwc2NHwCXX

YDsFRH9P2vMqrCxdmBqmYR';

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:29.547Z info:     Executing (default): SELECT count(*) AS `count` FROM `Notes` AS `Note`;

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:29.550Z info:     Executing (default): INSERT INTO `Sessions` (`sid`,`expires`,`data`,`createdAt`,`updatedAt`) VALUES (?,?,?,?,?);

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:29.551Z info:     Executing (default): SELECT count(*) AS `count` FROM `Users` AS `User`;

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:29.552Z info:     10.16.18.69 - - [08/Nov/2023:13:49:29 +0000] "GET /status HTTP/1.1" 200 266 "-" "kube-probe/1.24"

hedgedoc-859b57d5cb-nq5qc 

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:29.553Z info:     10.16.18.69 - - [08/Nov/2023:13:49:29 +0000] "GET /_health HTTP/1.1" 200 14 "-" "kube-probe/1.24"

hedgedoc-859b57d5cb-nq5qc 

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:34.546Z info:     Executing (default): SELECT `sid`, `expires`, `data`, `createdAt`, `updatedAt` FROM `Sessions` AS `Session` WHERE `Session`.`sid` = 'HXNfFq3oTG

cJFs_pjy1OPWlpJKWjT_MS';

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:34.547Z info:     Executing (default): SELECT count(*) AS `count` FROM `Notes` AS `Note`;

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:34.550Z info:     Executing (default): INSERT INTO `Sessions` (`sid`,`expires`,`data`,`createdAt`,`updatedAt`) VALUES (?,?,?,?,?);

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:34.550Z info:     Executing (default): SELECT count(*) AS `count` FROM `Users` AS `User`;

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:34.552Z info:     10.16.18.69 - - [08/Nov/2023:13:49:34 +0000] "GET /status HTTP/1.1" 200 266 "-" "kube-probe/1.24"

hedgedoc-859b57d5cb-nq5qc 

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:34.552Z info:     10.16.18.69 - - [08/Nov/2023:13:49:34 +0000] "GET /_health HTTP/1.1" 200 14 "-" "kube-probe/1.24"

hedgedoc-859b57d5cb-nq5qc 

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:34.552Z info:     10.16.18.69 - - [08/Nov/2023:13:49:34 +0000] "GET /_health HTTP/1.1" 200 14 "-" "kube-probe/1.24"

hedgedoc-859b57d5cb-nq5qc 

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:39.547Z info:     Executing (default): SELECT `sid`, `expires`, `data`, `createdAt`, `updatedAt` FROM `Sessions` AS `Session` WHERE `Session`.`sid` = 'v9rM1YtDlM

R8fECzX95Xi7PUGDRUmGee';

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:39.547Z info:     Executing (default): SELECT count(*) AS `count` FROM `Notes` AS `Note`;

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:39.550Z info:     Executing (default): SELECT count(*) AS `count` FROM `Users` AS `User`;

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:39.550Z info:     Executing (default): INSERT INTO `Sessions` (`sid`,`expires`,`data`,`createdAt`,`updatedAt`) VALUES (?,?,?,?,?);

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:39.552Z info:     10.16.18.69 - - [08/Nov/2023:13:49:39 +0000] "GET /status HTTP/1.1" 200 266 "-" "kube-probe/1.24"

hedgedoc-859b57d5cb-nq5qc 

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:39.553Z info:     10.16.18.69 - - [08/Nov/2023:13:49:39 +0000] "GET /_health HTTP/1.1" 200 14 "-" "kube-probe/1.24"

hedgedoc-859b57d5cb-nq5qc 

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:44.547Z info:     Executing (default): SELECT count(*) AS `count` FROM `Notes` AS `Note`;

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:44.547Z info:     Executing (default): SELECT `sid`, `expires`, `data`, `createdAt`, `updatedAt` FROM `Sessions` AS `Session` WHERE `Session`.`sid` = 'Z9JdDqQJ7G

tKmFY3l6TMnC99PRuJEcUT';

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:44.552Z info:     Executing (default): SELECT count(*) AS `count` FROM `Users` AS `User`;

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:44.552Z info:     Executing (default): INSERT INTO `Sessions` (`sid`,`expires`,`data`,`createdAt`,`updatedAt`) VALUES (?,?,?,?,?);

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:44.555Z info:     10.16.18.69 - - [08/Nov/2023:13:49:44 +0000] "GET /status HTTP/1.1" 200 266 "-" "kube-probe/1.24"

hedgedoc-859b57d5cb-nq5qc 

hedgedoc-859b57d5cb-nq5qc 2023-11-08T13:49:44.558Z info:     10.16.18.69 - - [08/Nov/2023:13:49:44 +0000] "GET /_health HTTP/1.1" 200 14 "-" "kube-probe/1.24"

hedgedoc-859b57d5cb-nq5qc 

Stream closed EOF for zz-test-hedgedoc-cnpg/hedgedoc-859b57d5cb-nq5qc (hedgedoc)

Config

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: hedgedoc

spec:
  interval: 3m
  chart:
    spec:
      chart: charts/hedgedoc
      version: 0.3.1
      sourceRef:
        kind: GitRepository
        name: adfinis-helmcharts-git-repo
      interval: 1m
  values:
    image:
      tag: "1.9.9"
    postgresql:
      enabled: false
    externalDatabase:
      type: "mysql"
      username: "XXX"
      password: ${db_password}
      host: ${db_host}
      port: 3306
      database: "XXX"
    hedgedoc:
      persistentStorage:
        storageClass: "local-path"
      env:
        CMD_DOMAIN: hedgedoc.${cloudfoundry_ops_domain}
        CMD_PROTOCOL_USESSL: true
        CMD_DOCUMENT_MAX_LENGTH: 500000
        CMD_ALLOW_FREEURL: true
        CMD_DB_DIALECT: mariadb
        CMD_DB_HOST: ${db_host}
        CMD_DB_PORT: 3306
        CMD_DB_DATABASE: XXX
        CMD_DB_USERNAME: XXX
        CMD_DB_PASSWORD: ${db_password}
        CMD_LOGLEVEL: debug
        DEBUG: true

## MariaDB
---
+--------------+-----------------------------------------+----------------------------+------------------------+----------+----------------+

| CATALOG_NAME | SCHEMA_NAME                             | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME | SQL_PATH | SCHEMA_COMMENT |

+--------------+-----------------------------------------+----------------------------+------------------------+----------+----------------+

| def          | information_schema                      | utf8mb3                    | utf8mb3_general_ci     | NULL     |                |

| def          | cf_b7c7ea3f_bcbe_4c50_a9a6_888427c79d4d | utf8mb4                    | utf8mb4_unicode_ci     | NULL     |                |

+--------------+-----------------------------------------+----------------------------+------------------------+----------+----------------+

Your Setup

The chart use the official docker image : https://quay.io/repository/hedgedoc/hedgedoc
as we can see here : https://github.com/adfinis/helm-charts/blob/44765d55692c21a7e08b34a88bafde6dfc69e4e2/charts/hedgedoc/values.yaml#L9

Same error encountered using this other chart :
https://artifacthub.io/packages/helm/truecharts/hedgedoc/9.0.9

Additional context

More information about our intended use of hedgedoc : https://github.com/orange-cloudfoundry/paas-templates/issues/1979

@Guilamb Guilamb added the type: bug Something isn't working label Nov 8, 2023
@Guilamb
Copy link
Author

Guilamb commented Nov 8, 2023

I've tried to get a dump core of the app using the flag : "--abort-on-uncaught-exception", but without success.
I have also tried a bunch of other options in my search for useful traces to debug the application without getting anything that seemed relevant to me.

NODE_DEBUG=cluster,net,http,fs,tls,module,timers nodejs --report-uncaught-exception --report-on-signal --report-on-fatalerror --report-compact --abort-on-uncaught-exception --report-filename="/hedgedoc/public/uploads/report.json" app.js >public/uploads/logs.txt 2>&1

Here a tail of the 50 last lines before crashing with the previous flags :

NET 7: _read - n 16384 isConnecting? false hasHandle? true
NET 7: onconnection
NET 7: _read - n 16384 isConnecting? false hasHandle? true
NET 7: Socket._handle.readStart
HTTP 7: SERVER new http connection
NET 7: onconnection
NET 7: _read - n 16384 isConnecting? false hasHandle? true
NET 7: Socket._handle.readStart
HTTP 7: SERVER new http connection
HTTP 7: SERVER socketOnParserExecute 111
HTTP 7: write ret = true
HTTP 7: SERVER socketOnParserExecute 110
2023-11-08T15:30:03.765Z info:  Executing (default): SELECT `sid`, `expires`, `data`, `createdAt`, `updatedAt` FROM `Sessions` AS `Session` WHERE `Session`.`sid` = 'G9gb8WsrBXmlRZAoj-p99ktuaV5ozV04';
2023-11-08T15:30:03.766Z info:  Executing (default): SELECT count(*) AS `count` FROM `Notes` AS `Note`;
NET 7: _read - n 16384 isConnecting? false hasHandle? true
NET 7: _read - n 16384 isConnecting? false hasHandle? true
2023-11-08T15:30:03.767Z info:  Executing (default): SELECT count(*) AS `count` FROM `Users` AS `User`;
2023-11-08T15:30:03.768Z info:  Executing (default): INSERT INTO `Sessions` (`sid`,`expires`,`data`,`createdAt`,`updatedAt`) VALUES (?,?,?,?,?);
NET 7: _read - n 16384 isConnecting? false hasHandle? true
HTTP 7: write ret = true
HTTP 7: outgoing message end.
NET 7: _final: not ended, call shutdown()
2023-11-08T15:30:03.769Z info:  10.16.18.69 - - [08/Nov/2023:15:30:03 +0000] "GET /status HTTP/1.1" 200 266 "-" "kube-probe/1.24"

NET 7: afterShutdown destroyed=false
NET 7: destroy
NET 7: close
NET 7: close handle
NET 7: has server
NET 7: SERVER _emitCloseIfDrained
NET 7: SERVER handle? true   connections? 2
NET 7: _read - n 16384 isConnecting? false hasHandle? true
NET 7: emit close
HTTP 7: server socket close
HTTP 7: write ret = true
HTTP 7: outgoing message end.
NET 7: _final: not ended, call shutdown()
2023-11-08T15:30:03.770Z info:  10.16.18.69 - - [08/Nov/2023:15:30:03 +0000] "GET /_health HTTP/1.1" 200 14 "-" "kube-probe/1.24"

NET 7: afterShutdown destroyed=false
NET 7: destroy
NET 7: close
NET 7: close handle
NET 7: has server
NET 7: SERVER _emitCloseIfDrained
NET 7: SERVER handle? true   connections? 1
NET 7: emit close
HTTP 7: server socket close
NET 7: _read - n 16384 isConnecting? false hasHandle? true
Illegal instruction

@timmwille
Copy link

Wasn't that fixed some time back? I remember we had that issue about 2019 or so? I can research deeper if info needed

@Guilamb
Copy link
Author

Guilamb commented Nov 13, 2023

If you found the old issue maybe it could help, also this post from the forum seems to have similar symptoms : https://community.hedgedoc.org/t/cannot-cut-copy-paste/996/4

@Guilamb
Copy link
Author

Guilamb commented Nov 21, 2023

Wasn't that fixed some time back? I remember we had that issue about 2019 or so? I can research deeper if info needed

So, I looked up similar old issues to try to find out the issue you were talking about :

However, they seem to be related to configuration issues with unset CMD_DOMAIN, CMD_PROTOCOL_USESSL, and CMD_URL_ADDPORT. In my case, they are set up, as you can see in the configuration specified above. Only the CMD_URL_ADDPORT hasn't been specified because the default value is already false.

If you have any ideas, it would be helpful. On my side, I'll try to obtain more logs or traces from Hedgedoc.

@gberche-orange
Copy link

The issue we were observing seems related to hedgedoc container image provided at https://docs.hedgedoc.org/setup/docker/, likely requiring use of AVX extension in the CPU instruction set, see full details into https://github.com/orange-cloudfoundry/paas-templates/issues/2195

The problem does not reproduce after modifying the container infrastructure with a linux kernel properly exposing the CPU avx capabilities.

If other hedgedoc users are running old cpus without the avx capability, then it would be helpful to confirm which nodejs library is requiring avx to consider alternatives.

For our part, the lack of avx capability is rather an infrastructure-specific issue which can be best addressed at the infrastructure level.

@Guilamb Guilamb changed the title Server crash when using special characters and when uploading images Server crash when hedgedoc container infrastructure is missing avx cpu flag with "Illegal instruction" Nov 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants