Error during upgrade from 5.8.7 to 5.9.x #4038

Closed
robert-pudlowski-mox opened this issue Aug 10, 2023 · 9 comments

@robert-pudlowski-mox

robert-pudlowski-mox commented Aug 10, 2023

Description

I have a fully working OpenCTI instance on version 5.8.7 and would like to upgrade it to the latest version.

Unfortunately, after upgrading to version 5.9.6, OpenCTI fails to start because of a problem with tasks during migration. See the errors below.
OpenCTI version 5.8.7 works without any issues.

Environment

  1. EKS
  2. OpenCTI version: 5.9.6
  3. OpenSearch 2.3 as a managed service on AWS

Reproducible Steps

Steps to create the smallest reproducible scenario:

  1. Run OpenCTI version 5.8.7 on EKS
  2. Bump the OpenCTI version from 5.8.7 to 5.9.6

Expected Output

OpenCTI is able to run

Actual Output

{"category":"APP","level":"info","message":"[OPENCTI] Starting platform","timestamp":"2023-08-10T11:05:16.215Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[OPENCTI] Checking dependencies statuses","timestamp":"2023-08-10T11:05:16.217Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[SEARCH] OpenSearch (2.3.0) client selected / runtime sorting disabled","timestamp":"2023-08-10T11:05:16.306Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[CHECK] Search engine is alive","timestamp":"2023-08-10T11:05:16.307Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[CHECK] Minio is alive","timestamp":"2023-08-10T11:05:16.328Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[CHECK] RabbitMQ is alive","timestamp":"2023-08-10T11:05:16.392Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[REDIS] Redis 'base' client ready","timestamp":"2023-08-10T11:05:16.399Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[REDIS] Clients initialized in Single mode","timestamp":"2023-08-10T11:05:16.399Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[CHECK] Redis is alive","timestamp":"2023-08-10T11:05:16.400Z","version":"5.9.6"}
{"category":"APP","level":"warn","message":"[CHECK] SMTP seems down, email notification will may not work","timestamp":"2023-08-10T11:05:16.408Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[CHECK] Python3 is available","timestamp":"2023-08-10T11:05:16.442Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[REDIS] Redis 'subscriber' client ready","timestamp":"2023-08-10T11:05:16.445Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[OPENCTI-MODULE] Cache manager pub sub listener initialized","timestamp":"2023-08-10T11:05:16.446Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[REDIS] Redis 'lock' client ready","timestamp":"2023-08-10T11:05:16.449Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[INIT] Starting platform initialization","timestamp":"2023-08-10T11:05:16.450Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[INIT] Existing platform detected, initialization...","timestamp":"2023-08-10T11:05:16.480Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[INIT] admin user initialized","timestamp":"2023-08-10T11:05:17.549Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Read 3 migrations from the database","timestamp":"2023-08-10T11:05:17.692Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] 6 migrations will be executed","timestamp":"2023-08-10T11:05:17.694Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Triggers remove unused fields: recipients, user_ids, group_ids","timestamp":"2023-08-10T11:05:17.694Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Triggers remove unused fields: recipients, user_ids, group_ids > started","timestamp":"2023-08-10T11:05:17.695Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Triggers remove unused fields: recipients, user_ids, group_ids > elastic running task QOFSwWvIQHS87tX4eRdoOA:11339386","timestamp":"2023-08-10T11:05:17.707Z","version":"5.9.6"}
{"category":"APP","error":{"context":{"category":"technical","error":{"meta":{"body":{"error":{"reason":"task [QOFSwWvIQHS87tX4eRdoOA:11339386] isn't running and hasn't stored its results","root_cause":[{"reason":"task [QOFSwWvIQHS87tX4eRdoOA:11339386] isn't running and hasn't stored its results","type":"resource_not_found_exception"}],"type":"resource_not_found_exception"},"status":404},"headers":{"access-control-allow-origin":"*","connection":"keep-alive","content-length":"305","content-type":"application/json; charset=UTF-8","date":"Thu, 10 Aug 2023 11:05:27 GMT"},"meta":{"aborted":false,"attempts":0,"connection":{"_openRequests":0,"deadCount":0,"headers":{},"id":"https://opensearch_endpoint","resurrectTimeout":0,"roles":{"data":true,"ingest":true},"status":"alive","url":"https://opensearch_endpoint/"},"context":null,"name":"opensearch-js","request":{"id":28,"options":{},"params":{"body":null,"headers":{"user-agent":"opensearch-js/2.3.0 (linux 5.4.242-155.348.amzn2.x86_64-x64; Node.js v20.4.0)"},"method":"GET","path":"/_tasks/QOFSwWvIQHS87tX4eRdoOA%3A11339386","querystring":"","timeout":30000}}},"statusCode":404},"name":"ResponseError"},"http_status":500,"reason":"Error updating elastic"},"message":"A database error has occurred","name":"DatabaseError","stack":"DatabaseError: A database error has occurred\n at error (/opt/opencti/build/src/config/errors.js:8:10)\n at DatabaseError (/opt/opencti/build/src/config/errors.js:54:48)\n at /opt/opencti/build/src/migrations/1687529127415-triggers-remove-unused-field.js:47:11\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\n at oqs.up (/opt/opencti/build/src/migrations/1687529127415-triggers-remove-unused-field.js:52:3)"},"level":"error","message":"[MIGRATION] Error during migration","timestamp":"2023-08-10T11:05:27.718Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[INIT] Platform initialization done","timestamp":"2023-08-10T11:05:27.720Z","version":"5.9.6"}
{"category":"APP","error":{"context":{"category":"technical","error":{"_error":{},"_showLocations":false,"_showPath":false,"_stack":"DatabaseError: A database error has occurred\n at error (/opt/opencti/build/src/config/errors.js:8:10)\n at DatabaseError (/opt/opencti/build/src/config/errors.js:54:48)\n at /opt/opencti/build/src/migrations/1687529127415-triggers-remove-unused-field.js:47:11\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\n at oqs.up (/opt/opencti/build/src/migrations/1687529127415-triggers-remove-unused-field.js:52:3)","data":{"category":"technical","error":{"meta":{"body":{"error":{"reason":"task [QOFSwWvIQHS87tX4eRdoOA:11339386] isn't running and hasn't stored its results","root_cause":[{"reason":"task [QOFSwWvIQHS87tX4eRdoOA:11339386] isn't running and hasn't stored its results","type":"resource_not_found_exception"}],"type":"resource_not_found_exception"},"status":404},"headers":{"access-control-allow-origin":"*","connection":"keep-alive","content-length":"305","content-type":"application/json; charset=UTF-8","date":"Thu, 10 Aug 2023 11:05:27 GMT"},"meta":{"aborted":false,"attempts":0,"connection":{"_openRequests":0,"deadCount":0,"headers":{},"id":"https://opensearch_endpoint/","resurrectTimeout":0,"roles":{"data":true,"ingest":true},"status":"alive","url":"https://opensearch_endpoint/"},"context":null,"name":"opensearch-js","request":{"id":28,"options":{},"params":{"body":null,"headers":{"user-agent":"opensearch-js/2.3.0 (linux 5.4.242-155.348.amzn2.x86_64-x64; Node.js v20.4.0)"},"method":"GET","path":"/_tasks/QOFSwWvIQHS87tX4eRdoOA%3A11339386","querystring":"","timeout":30000}}},"statusCode":404},"name":"ResponseError"},"http_status":500,"reason":"Error updating elastic"},"internalData":{},"name":"DatabaseError","time_thrown":"2023-08-10T11:05:27.718Z"},"http_status":500,"reason":"[OPENCTI] Platform initialization fail"},"message":"An unknown error has occurred","name":"UnknownError","stack":"UnknownError: An unknown error has occurred\n at error (/opt/opencti/build/src/config/errors.js:8:10)\n at UnknownError (/opt/opencti/build/src/config/errors.js:68:47)\n at platformInit (/opt/opencti/build/src/initialization.js:393:13)\n at processTicksAndRejections (node:internal/process/task_queues:95:5)\n at platformStart (/opt/opencti/build/src/boot.js:183:5)"},"level":"error","message":"[OPENCTI] Platform start fail","timestamp":"2023-08-10T11:05:27.720Z","version":"5.9.6"}
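
The interesting part of a log like this is buried deep inside the nested error objects. As a rough aid, here is a minimal Python sketch that walks a stream of JSON log objects and lists, for each entry at level "error", every nested "reason" string. It assumes only that the log is a sequence of JSON objects separated by whitespace; the function names (`iter_json_objects`, `find_reasons`, `summarize_errors`) are illustrative and not part of OpenCTI.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value found in a stream of concatenated objects."""
    # strict=False tolerates raw line breaks inside string values,
    # which can appear when a log is copied out of a terminal or web page.
    decoder = json.JSONDecoder(strict=False)
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def find_reasons(node):
    """Recursively collect every string stored under a "reason" key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "reason" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(find_reasons(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_reasons(item))
    return found


def summarize_errors(text):
    """Return (message, unique reasons) for each entry logged at level "error"."""
    summary = []
    for entry in iter_json_objects(text):
        if entry.get("level") == "error":
            # dict.fromkeys removes duplicates while keeping first-seen order.
            reasons = list(dict.fromkeys(find_reasons(entry.get("error", {}))))
            summary.append((entry.get("message"), reasons))
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py < opencti.log
    for message, reasons in summarize_errors(sys.stdin.read()):
        print(message)
        for reason in reasons:
            print("   ", reason)
```

Run against the output above, the entries it reports should be the two error-level ones, and both carry the same OpenSearch reason about the task not running and not having stored its results.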

Additional information

Screenshots (optional)

robert-pudlowski-mox added the bug label Aug 10, 2023
@adel-akloul-mox

@SamuelHassine any recommendation on how to upgrade from 5.8.7 to 5.9.6?
Should we upgrade to an intermediate version instead of going straight from 5.8.7 to 5.9.6?

@robert-pudlowski-mox
Author

@Archidoit Do you mean upgrading step by step, i.e. 5.8.7 --> 5.9.0 --> 5.9.1 --> 5.9.2, etc.?

@Archidoit
Member

Maybe try to run the migrations until the one that fails (triggers-remove-unused-field). Then run the triggers-remove-unused-field migration and the following ones.

@adel-akloul-mox

adel-akloul-mox commented Aug 10, 2023

@Archidoit how do we run individual migrations one by one? I am not aware of any documentation or playbook for this operation.

I found the mentioned triggers migration scripts in opencti-platform/opencti-graphql/src/migrations, but I am not sure how to execute them one by one, since the documentation only describes automated migration: [image]

@adel-akloul-mox

adel-akloul-mox commented Aug 10, 2023

The migration scripts 1687529127415-triggers-remove-unused-field.js and 1687529127420-trigger-type-modifications.js are the ones failing in our case. Could this be because we don't have any triggers, as shown by the query below?

curl -su "$ESUSER:$ESPWD" -XPOST "$URL/_search" -H 'Content-Type:application/json' -d '
{
  "query": {
    "term": {
      "entity_type.keyword": {
        "value": "trigger"
      }
    }
  }
}
' | jq
{
  "took": 43,
  "timed_out": false,
  "_shards": {
    "total": 68,
    "successful": 68,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  }
}

After removing those two migration scripts and rebuilding OpenCTI, the instance starts normally.

  1. Could you confirm that it is an acceptable workaround to SKIP the scripts 1687529127415-triggers-remove-unused-field.js and 1687529127420-trigger-type-modifications.js, given that we don't have trigger entities in our instance?
  2. Could you also confirm whether the migration scripts are expected to fail when there is no trigger entity in the database, or whether this is an anomaly in our database?
/opt/opencti # yarn serv
{"category":"APP","level":"info","message":"[OPENCTI] Starting platform","timestamp":"2023-08-10T18:38:39.768Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[OPENCTI] Checking dependencies statuses","timestamp":"2023-08-10T18:38:39.771Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[SEARCH] OpenSearch (2.3.0) client selected / runtime sorting disabled","timestamp":"2023-08-10T18:38:40.067Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[CHECK] Search engine is alive","timestamp":"2023-08-10T18:38:40.068Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[CHECK] Minio is alive","timestamp":"2023-08-10T18:38:40.102Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[CHECK] RabbitMQ is alive","timestamp":"2023-08-10T18:38:40.146Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[REDIS] Redis 'base' client ready","timestamp":"2023-08-10T18:38:40.160Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[REDIS] Clients initialized in Single mode","timestamp":"2023-08-10T18:38:40.161Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[CHECK] Redis is alive","timestamp":"2023-08-10T18:38:40.162Z","version":"5.9.6"}
{"category":"APP","level":"warn","message":"[CHECK] SMTP seems down, email notification will may not work","timestamp":"2023-08-10T18:38:40.172Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[CHECK] Python3 is available","timestamp":"2023-08-10T18:38:40.218Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[REDIS] Redis 'subscriber' client ready","timestamp":"2023-08-10T18:38:40.224Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[OPENCTI-MODULE] Cache manager pub sub listener initialized","timestamp":"2023-08-10T18:38:40.226Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[REDIS] Redis 'lock' client ready","timestamp":"2023-08-10T18:38:40.231Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[INIT] Starting platform initialization","timestamp":"2023-08-10T18:38:40.235Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[INIT] Existing platform detected, initialization...","timestamp":"2023-08-10T18:38:40.302Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[INIT] admin user initialized","timestamp":"2023-08-10T18:38:42.120Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Read 3 migrations from the database","timestamp":"2023-08-10T18:38:42.340Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] 4 migrations will be executed","timestamp":"2023-08-10T18:38:42.342Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual > started","timestamp":"2023-08-10T18:38:42.343Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual > Migration of entity setting","timestamp":"2023-08-10T18:38:42.344Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual > Migration open vocabularies","timestamp":"2023-08-10T18:38:42.517Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual > Create 0 vocabularies for category threat_actor_individual_type_ov","timestamp":"2023-08-10T18:38:43.538Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[REDIS] Redis 'publisher' client ready","timestamp":"2023-08-10T18:38:44.244Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual > Create 0 vocabularies for category threat_actor_individual_role_ov","timestamp":"2023-08-10T18:38:46.496Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual > Create 0 vocabularies for category threat_actor_individual_sophistication_ov","timestamp":"2023-08-10T18:38:47.741Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual > Migrating threat actors 0/39","timestamp":"2023-08-10T18:38:49.188Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual > Migrating threat actors 39/39","timestamp":"2023-08-10T18:39:02.370Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual > done","timestamp":"2023-08-10T18:39:02.371Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Saving current configuration, 1688674984685-threat-actor-split.js","timestamp":"2023-08-10T18:39:03.622Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Adding default order value to opinion open vocabulary","timestamp":"2023-08-10T18:39:03.622Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Adding default order value to opinion open vocabulary done in 761 ms","timestamp":"2023-08-10T18:39:04.383Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Saving current configuration, 1688710489709-add_order_to_opinion_ov.js","timestamp":"2023-08-10T18:39:05.273Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Adding trigger scope and authorized_capabilities > started","timestamp":"2023-08-10T18:39:05.273Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Adding trigger scope and authorized_capabilities > done","timestamp":"2023-08-10T18:39:05.343Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Saving current configuration, 1688769447643-trigger-knowledge-migration.js","timestamp":"2023-08-10T18:39:06.230Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual (inferred) > started","timestamp":"2023-08-10T18:39:06.231Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual (inferred) > Migrating threat actors 0/39","timestamp":"2023-08-10T18:39:06.286Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual (inferred) > Migrating threat actors 39/39","timestamp":"2023-08-10T18:39:08.631Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Threat-actors to group and individual (inferred) > done","timestamp":"2023-08-10T18:39:08.631Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Saving current configuration, 1689722008218-threat-actor-split-inferred.js","timestamp":"2023-08-10T18:39:10.051Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Migration process completed","timestamp":"2023-08-10T18:39:10.051Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[MIGRATION] Platform version updated to 5.9.6","timestamp":"2023-08-10T18:39:10.286Z","version":"5.9.6"}
{"category":"APP","level":"info","message":"[INIT] Platform initialization done","timestamp":"2023-08-10T18:39:10.630Z","version":"5.9.6"}

@adel-akloul-mox

adel-akloul-mox commented Aug 11, 2023

@richard-julien, I believe this ticket is a follow-up to #3999 ...
We somehow managed to fix the issue by rebuilding the frontend after stripping out the two scripts that were failing in our case (1687529127415-triggers-remove-unused-field.js and 1687529127420-trigger-type-modifications.js).

It is still unclear to us why those scripts made the upgrade fail, since we have NO entities of type trigger, as shown in my previous post. Since the MIGRATION scripts expect such entities, I am wondering whether this is a bug or whether my OpenSearch database is missing expected objects for whatever reason.

I would also like to confirm that, before and while attempting the upgrade, GET /_tasks shows no task [QOFSwWvIQHS87tX4eRdoOA:11339386]. Is that expected?
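
For context on that last question: the task list API (GET /_tasks) reports tasks that are executing at the moment of the call, grouped by node, so an ID handed out during the migration cannot appear before the upgrade, and a task that has already finished would drop out of the list. A minimal Python sketch for checking a saved GET /_tasks response body for a given task ID follows. The helper names are hypothetical, and the sample body is hand-written in the shape of that response rather than captured from this cluster.

```python
def flatten_tasks(tasks_response):
    """Map task id -> action for every task in a GET /_tasks response body."""
    tasks = {}
    # The response groups tasks under nodes.<node_id>.tasks.<task_id>.
    for node in tasks_response.get("nodes", {}).values():
        for task_id, info in node.get("tasks", {}).items():
            tasks[task_id] = info.get("action", "")
    return tasks


def is_listed(tasks_response, task_id):
    """True if task_id appears among the currently executing tasks."""
    return task_id in flatten_tasks(tasks_response)


if __name__ == "__main__":
    # Hand-written sample, not real cluster output.
    sample = {
        "nodes": {
            "QOFSwWvIQHS87tX4eRdoOA": {
                "tasks": {
                    "QOFSwWvIQHS87tX4eRdoOA:42": {
                        "action": "cluster:monitor/tasks/lists",
                    }
                }
            }
        }
    }
    print(is_listed(sample, "QOFSwWvIQHS87tX4eRdoOA:11339386"))
```

An absent ID in this list does not by itself say whether the task failed, finished, or was never started; it only says the task was not running when the list was taken.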

@richard-julien
Member

As I tried to explain, this is an issue with OpenSearch.
These 2 migration scripts use the background task mechanism of Elastic/OpenSearch: we create a task and follow its execution. The error you get occurs because OpenSearch answers that no task is running with the ID it gave us just before, when the task was created.
The workflow is:

  1. Start the platform
  2. Check which migrations to execute
  3. Find the triggers migration
  4. The triggers migration executes an updateByQuery with wait_for_completion=false > OpenSearch answers with the ID of the task QOFSwWvIQHS87tX4eRdoOA:11339386
  5. 10 sec later, we check the status of the task with "method":"GET","path":"/_tasks/QOFSwWvIQHS87tX4eRdoOA:11339386"
  6. This check fails with task [QOFSwWvIQHS87tX4eRdoOA:11339386] isn't running and hasn't stored its results

It is clearly OpenSearch that fails to return information about a task that was just created. Maybe the task is started on a node that then cannot answer the query in the cluster... I don't know.
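
The polling part of this workflow (steps 5 and 6) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the OpenCTI implementation: `follow_task`, `TaskNotFound` and the caller-supplied `fetch_task` function are hypothetical names. The only API detail relied on is that a successful GET /_tasks/<id> response carries a boolean "completed" field.

```python
import time


class TaskNotFound(Exception):
    """The task API answered 404 for this task id."""


def follow_task(task_id, fetch_task, interval_s=10.0, max_checks=30, sleep=time.sleep):
    """Poll a background task until it reports completion.

    fetch_task is a caller-supplied (hypothetical) function that performs
    GET /_tasks/<task_id>, returns the parsed JSON body on success and
    raises TaskNotFound when the server answers 404.
    """
    for _ in range(max_checks):
        # Step 5: wait, then ask for the task status.
        sleep(interval_s)
        status = fetch_task(task_id)  # Step 6: a 404 surfaces here as TaskNotFound.
        if status.get("completed"):
            return status
    raise TimeoutError(f"task {task_id} still running after {max_checks} checks")
```

With a `fetch_task` that raises `TaskNotFound` on the very first check, the loop stops immediately, which matches the behaviour in the log: the first status request, about 10 seconds after the task was created, already returned 404.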

Can you ask AWS OpenSearch support whether they are aware of this kind of situation?

For info, rebuilding after removing some migrations has a huge potential to cause serious problems on your platform in the future, and is of course strongly NOT recommended :)

@adel-akloul-mox

Yep, happy to close the ticket. We understand we have to follow up with AWS support.

@richard-julien
Member

Thanks for the update. Please keep us informed about your discussion with AWS. Maybe I misunderstood something and we need to do something differently in the product.

Archidoit self-assigned this Aug 18, 2023
Archidoit closed this as not planned Aug 18, 2023
SamuelHassine added the question and solved labels and removed the bug label Feb 11, 2024