Skip to content

[pull] master from crate:master#111

Merged
pull[bot] merged 7 commits intodolfly:masterfrom
crate:master
Oct 20, 2025
Merged

[pull] master from crate:master#111
pull[bot] merged 7 commits intodolfly:masterfrom
crate:master

Conversation

@pull
Copy link
Copy Markdown

@pull pull bot commented Oct 20, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

matriv and others added 7 commits October 20, 2025 05:35
After running benchmarks they showed that having both table and col
stats in one index, lead to slower loading of stats, especially
when trying to load only a specific table or col (future plan).
```
Benchmark
TableStatsServiceBenchmark.measureLoadColStatsFromDisk                  avgt   25  447.783 ± 2.687  ms/op
TableStatsServiceBenchmark.measureLoadColStatsFromDisk_separateIndices  avgt   25  378.624 ± 2.228  ms/op
TableStatsServiceBenchmark.measureLoadColStatsFromDisk_withoutType      avgt   25  401.765 ± 3.324  ms/op
```
```
Benchmark                                                          Mode  Cnt   Score   Error  Units
TableStatsServiceBenchmark.measureLoadAllFromDisk                  avgt   25  23.900 ± 0.101  ms/op
TableStatsServiceBenchmark.measureLoadAllFromDisk_separateIndices  avgt   25  20.978 ± 0.251  ms/op
```

The separate indices comes with a small hit on save though:
```
Benchmark                                                               Mode  Cnt   Score   Error  Units
TableStatsServiceBenchmark.measureSaveTableStatsToDisk                  avgt   25  34.159 ± 0.415  ms/op
TableStatsServiceBenchmark.measureSaveTableStatsToDisk_separateIndices  avgt   25  53.407 ± 3.433  ms/op
```
After some benchmarking it seems that MMap performs better,
so change to `MMapDirectory.open()` which fallbacks to NIOFS
depending on the underlying system:

```
Benchmark                                                    Mode  Cnt   Score   Error  Units
TableStatsServiceBenchmark.measureSaveTableStatsToDisk_MMap  avgt   25  49.791 ± 0.660  ms/op
TableStatsServiceBenchmark.measureSaveTableStatsToDisk_NIO   avgt   25  51.251 ± 1.962  ms/op
```
```
Benchmark                                                    Mode  Cnt   Score   Error  Units
TableStatsServiceBenchmark.measureLoadAllStatsFromDisk_MMap  avgt   25  15.983 ± 0.130  ms/op
TableStatsServiceBenchmark.measureLoadAllStatsFromDisk_NIO   avgt   25  23.275 ± 0.178  ms/op
```
Follow up to #18021

ByteArrayOutputStream resizing can sometimes double the size of the underlying array.
We must be precise in accounting and don't rely on eventual catch up as
double the size of already big structure is significant.

We have seen a heap dump with byte array of ByteArrayOutputStream occupying 998 MB.

  at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
  at java.util.Arrays.copyOf([BI)[B (Arrays.java:3540)
  at java.io.ByteArrayOutputStream.ensureCapacity(I)V (ByteArrayOutputStream.java:100)
  at java.io.ByteArrayOutputStream.write([BII)V (ByteArrayOutputStream.java:132)
  at io.crate.rest.action.SqlHttpHandler$RamAccountingOutputStream.write([BII)V (SqlHttpHandler.java:347)
  at com.fasterxml.jackson.core.json.UTF8JsonGenerator._flushBuffer()V (UTF8JsonGenerator.java:2243)
  at com.fasterxml.jackson.core.json.UTF8JsonGenerator.writeNumber(J)V (UTF8JsonGenerator.java:1010)
  at org.elasticsearch.common.xcontent.json.JsonXContentGenerator.writeNumber(J)V (JsonXContentGenerator.java:193)
  at org.elasticsearch.common.xcontent.XContentBuilder.value(J)Lorg/elasticsearch/common/xcontent/XContentBuilder; (XContentBuilder.java:478)
  at org.elasticsearch.common.xcontent.XContentBuilder.value(Ljava/lang/Long;)Lorg/elasticsearch/common/xcontent/XContentBuilder; (XContentBuilder.java:474)
  at org.elasticsearch.common.xcontent.XContentBuilder.lambda$static$9(Lorg/elasticsearch/common/xcontent/XContentBuilder;Ljava/lang/Object;)V (XContentBuilder.java:81)
  at org.elasticsearch.common.xcontent.XContentBuilder$$Lambda+0x000000001d21ebf0.write(Lorg/elasticsearch/common/xcontent/XContentBuilder;Ljava/lang/Object;)V (Unknown Source)
  at org.elasticsearch.common.xcontent.XContentBuilder.unknownValue(Ljava/lang/Object;Ljava/util/Map;)V (XContentBuilder.java:787)
  at org.elasticsearch.common.xcontent.XContentBuilder.mapContents(Ljava/util/Map;ZLjava/util/Map;)Lorg/elasticsearch/common/xcontent/XContentBuilder; (XContentBuilder.java:837)
  at org.elasticsearch.common.xcontent.XContentBuilder.map(Ljava/util/Map;Ljava/util/Map;)Lorg/elasticsearch/common/xcontent/XContentBuilder; (XContentBuilder.java:820)
  at org.elasticsearch.common.xcontent.XContentBuilder.unknownValue(Ljava/lang/Object;Ljava/util/Map;)V (XContentBuilder.java:794)
  at org.elasticsearch.common.xcontent.XContentBuilder.value(Ljava/lang/Object;)Lorg/elasticsearch/common/xcontent/XContentBuilder; (XContentBuilder.java:771)
  at io.crate.rest.action.ResultToXContentBuilder.addRow(Lio/crate/data/Row;I)Lio/crate/rest/action/ResultToXContentBuilder; (ResultToXContentBuilder.java:123)
apache/lucene@9b185b9

Parts where already applied, the remaining changes are mostly
clean ups to keep it in-sync with the original Lucene implementaion.
@pull pull bot locked and limited conversation to collaborators Oct 20, 2025
@pull pull bot added the ⤵️ pull label Oct 20, 2025
@pull pull bot merged commit e40c657 into dolfly:master Oct 20, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants