
[Bug] Migration fails from 4.0.5 to 4.1 #1861

Closed
mphbig opened this issue Mar 21, 2021 · 5 comments
mphbig commented Mar 21, 2021

Request Type

Bug

Work Environment

OS version (server): Debian 9
OS version (client): 10
TheHive version / git hash: 4.1.0-1
Package Type: DEB
Browser type & version: Firefox 86.0.1

Problem Description

The reindexing step fails when upgrading from TheHive 4.0.5. It seems to be caused by an unwanted character in one of my cases. The reindexing process appears to run forever; I let it run for 4 hours with the thehive process showing no activity in htop.
I use the Lucene index mechanism.
I tried restarting TheHive, but it starts indexing everything again and gets stuck at the same point.

Steps to Reproduce

  1. Upgrade from version 4.0.5 to 4.1
  2. Restart TheHive, wait until TheHive reaches the reindexation step
  3. Watch the logs, which contain only "Reindex job is running", while htop shows at best 0.7% CPU usage for TheHive.

Possible Solutions

As I said, I think it is caused by a weird character in one of my cases (see logs below). Everything seems to stop after the error occurs.

Complementary information

Mar 21 14:53:49 [DEBUG] from org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScanMetrics in Thread-12 [|] [457975760:doc-updates] Incremented by 1
Mar 21 14:53:49 [DEBUG] from org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScanMetrics in Thread-12 [|] [457975760:success-tx] Incremented by 1
Mar 21 14:53:49 [ERROR] from org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScannerExecutor in Thread-12 [|] Unexpected error processing data: {}
    java.lang.IllegalArgumentException: Document contains at least one immense term in field="description_____s" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[85, 115, 101, 114, 32, 76, 111, 103, 105, 110, 32, 83, 117, 99, 99, 101, 115, 115, 32, 40, 73, 80, 118, 54, 32, 77, 105, 99, 114, 111]...', original message: bytes can be at most 32766 in length; got 105149
    at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:853)
    at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:430)
    at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:394)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:251)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:494)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1616)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1608)
    at org.janusgraph.diskstorage.lucene.LuceneIndex.restore(LuceneIndex.java:305)
    at org.janusgraph.diskstorage.indexing.IndexTransaction.restore(IndexTransaction.java:128)
    at org.janusgraph.graphdb.olap.job.IndexRepairJob.workerIterationEnd(IndexRepairJob.java:201)
    at org.janusgraph.graphdb.olap.VertexJobConverter.workerIterationEnd(VertexJobConverter.java:118)
    at org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScannerExecutor$Processor.run(StandardScannerExecutor.java:285)
    Caused by: org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes can be at most 32766 in length; got 105149
    at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:265)
    at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:151)
    at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:843)
    ... 11 common frames omitted
Mar 21 14:53:49 [DEBUG] from org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScanMetrics in Thread-12 [|] [457975760:success-tx] Incremented by 1
Mar 21 14:53:50 [DEBUG] from org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScanMetrics in application-akka.actor.default-dispatcher-15 [|] [457975760:adds] Returning zero by default (was null)
Mar 21 14:53:50 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-15 [|] Reindex job is running: 0 record(s) indexed
Mar 21 14:53:51 [DEBUG] from org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScanMetrics in application-akka.actor.default-dispatcher-15 [|] [457975760:adds] Returning zero by default (was null)
Mar 21 14:53:51 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-15 [|] Reindex job is running: 0 record(s) indexed
Mar 21 14:53:52 [DEBUG] from org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScanMetrics in application-akka.actor.default-dispatcher-15 [|] [457975760:adds] Returning zero by default (was null)
Mar 21 14:53:52 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-15 [|] Reindex job is running: 0 record(s) indexed
Mar 21 14:53:52 [DEBUG] from com.datastax.driver.core.RequestHandler in thp-timeouter-0 [|] onTimeout triggered but the response was completed by another thread, cancelling (retryCount = 0, queryState = QueryState(count=0, inProgress=false, cancelled=false), queryStateRef = QueryState(count=0, inProgress=false, cancelled=false))
Mar 21 14:53:52 [DEBUG] from com.datastax.driver.core.RequestHandler in thp-timeouter-0 [|] onTimeout triggered but the response was completed by another thread, cancelling (retryCount = 0, queryState = QueryState(count=0, inProgress=false, cancelled=false), queryStateRef = QueryState(count=0, inProgress=false, cancelled=false))
Mar 21 14:53:53 [DEBUG] from org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScanMetrics in application-akka.actor.default-dispatcher-15 [|] [457975760:adds] Returning zero by default (was null)
Mar 21 14:53:53 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-15 [|] Reindex job is running: 0 record(s) indexed
Mar 21 14:53:54 [DEBUG] from org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScanMetrics in application-akka.actor.default-dispatcher-15 [|] [457975760:adds] Returning zero by default (was null)
Mar 21 14:53:54 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-15 [|] Reindex job is running: 0 record(s) indexed
Mar 21 14:53:55 [DEBUG] from org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScanMetrics in application-akka.actor.default-dispatcher-15 [|] [457975760:adds] Returning zero by default (was null)
Mar 21 14:53:55 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-15 [|] Reindex job is running: 0 record(s) indexed
Mar 21 14:53:56 [DEBUG] from org.janusgraph.diskstorage.keycolumnvalue.scan.StandardScanMetrics in application-akka.actor.default-dispatcher-15 [|] [457975760:adds] Returning zero by default (was null)
Mar 21 14:53:56 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-15 [|] Reindex job is running: 0 record(s) indexed
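For reference, the byte prefix Lucene prints in the `IllegalArgumentException` above is simply the UTF-8 bytes of the start of the offending field value, so decoding it can help identify which case contains the oversized description. A minimal sketch:

```java
import java.nio.charset.StandardCharsets;

public class DecodeTermPrefix {
    public static void main(String[] args) {
        // Prefix bytes copied verbatim from the error message in the log above.
        byte[] prefix = {85, 115, 101, 114, 32, 76, 111, 103, 105, 110, 32, 83,
                         117, 99, 99, 101, 115, 115, 32, 40, 73, 80, 118, 54,
                         32, 77, 105, 99, 114, 111};
        // Decode as UTF-8 to recover the start of the offending description.
        System.out.println(new String(prefix, StandardCharsets.UTF_8));
        // Prints: User Login Success (IPv6 Micro
    }
}
```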
mphbig added the bug and TheHive4 (TheHive4 related issues) labels Mar 21, 2021
To-om (Contributor) commented Mar 22, 2021

This error occurs because neither Lucene nor Elasticsearch can index fields larger than 32 kB. You have a description that is 105,149 bytes long.
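As a sketch of what staying under that limit involves (this is not TheHive's actual fix, which landed in 4.1.1): an application would have to truncate on the UTF-8 *byte* length, not the character count, and avoid cutting a multi-byte character in half. `truncateUtf8` below is a hypothetical helper, assuming Lucene's documented 32766-byte per-term limit:

```java
import java.nio.charset.StandardCharsets;

public class TermLimit {
    // Lucene rejects any single indexed term whose UTF-8 encoding exceeds 32766 bytes.
    static final int MAX_TERM_BYTES = 32766;

    // Truncate s so its UTF-8 encoding fits in maxBytes, without splitting
    // a multi-byte character or surrogate pair (hypothetical helper).
    static String truncateUtf8(String s, int maxBytes) {
        int bytes = 0, i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            // UTF-8 size of this code point: 1, 2, 3, or 4 bytes.
            int cpBytes = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + cpBytes > maxBytes) break;
            bytes += cpBytes;
            i += Character.charCount(cp);
        }
        return s.substring(0, i);
    }

    public static void main(String[] args) {
        String huge = "x".repeat(105_149); // same size as the failing description
        String safe = truncateUtf8(huge, MAX_TERM_BYTES);
        System.out.println(safe.getBytes(StandardCharsets.UTF_8).length); // 32766 for ASCII input
    }
}
```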

To-om added this to the 4.1.1 milestone Mar 22, 2021
KaanSK (Contributor) commented Mar 22, 2021

I would like to request prioritizing this, as reverting TheHive to 4.0.5/4.0.4/4.0.3 still shows "Reindex job is running: 0 record(s) indexed".
Removing the index config from application.conf did not change anything; the indexing process still runs and gets stuck.
In my case, my assumption is that long description fields on some alert types caused this issue.

LaZyDK commented Mar 22, 2021

I am having the exact same issue.

nadouani (Contributor) commented

4.1.1 is planned for tomorrow

nadouani (Contributor) commented

The fix is ready; we just need to test it before shipping a new patch.
