1.10.0 | 2024-03-04 16:33:50 +0100
* Release 1.10.0.
* Bump freebsd13 CI job to freebsd-13.3. (Benjamin Bannier, Corelight)
(cherry picked from commit 31866cae8ecd5df2dcd07c0657a6792857157f16)
1.10.0-dev.150 | 2024-02-26 12:17:07 +0100
* Fix stray Python escape sequence. (Benjamin Bannier, Corelight)
At least on my platform with python-3.12.2 the previous code produced a
warning
../scripts/autogen-type-erased:137: SyntaxWarning: invalid escape sequence '\{'
i = re.search("(.*) *with *default *(\{.*)$", s)
(cherry picked from commit 09bc76d9dd815567fcd767b707462eff5d9ffe9f)
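One common way to avoid this kind of warning is to write the pattern as a raw string, so that `\{` reaches the regex engine unchanged. The following is a minimal sketch, not necessarily the change made in the commit; the input string `s` is made up for illustration and is not taken from `autogen-type-erased`.

```python
import re

# Hypothetical input line; the real script's inputs may look different.
s = "int x with default {0}"

# A raw string keeps the backslash for the regex engine, so Python does not
# try to interpret `\{` as a string escape sequence.
m = re.search(r"(.*) *with *default *(\{.*)$", s)

declaration = m.group(1).strip()  # text before "with default"
default = m.group(2)              # the braced default value
print(declaration, default)
```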
1.10.0-dev.149 | 2024-02-22 09:48:01 +0100
* GH-1585: Put closing of unit sinks behind feature guard. (Benjamin Bannier, Corelight)
This code gets emitted, regardless of whether a sink was actually
connected or not. Put it behind a feature guard so it does not enable
the feature on its own.
Closes #1585.
1.10.0-dev.147 | 2024-02-21 10:30:22 +0100
* GH-1667: Always advance input before attempting resynchronization. (Benjamin Bannier, Corelight)
When we enter resynchronization after hitting a parse error we
previously would have left the input alone, even though we know it fails
to parse. We then relied fully on resynchronization to advance the
input.
While this just pushed work downstream when synchronizing on literals,
it could cause us to lose input if synchronizing on regular expressions
if we happened to fail parsing due to a gap which is now at the front of
the input (parse errors from gaps are the most likely resynchronization
scenario when parsing genuine traffic); in this case the regular
expression would synchronize at the second byte after the input and we
would synchronize only at a later position.
With this patch we always forcibly advance the input to the next non-gap
position. This has no effect for synchronization on literals, but allows
it to happen earlier for regular expressions.
Closes #1667.
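The idea of skipping a leading gap before resynchronizing can be sketched roughly as follows. This is a toy model, not the runtime's implementation: the input is a list in which `None` stands for a gap byte, and the names `next_non_gap` and `resync_on_regex` are hypothetical.

```python
import re
from typing import Optional, Sequence


def next_non_gap(data: Sequence[Optional[int]], pos: int) -> int:
    """Return the first index at or after `pos` that holds data, not a gap.

    Gaps are modeled as `None`; returns `len(data)` if only gaps remain.
    """
    while pos < len(data) and data[pos] is None:
        pos += 1
    return pos


def resync_on_regex(data: Sequence[Optional[int]], pos: int, pattern: bytes) -> Optional[int]:
    """Skip a leading gap, then search the contiguous data that follows."""
    pos = next_non_gap(data, pos)
    end = pos
    while end < len(data) and data[end] is not None:
        end += 1
    m = re.search(pattern, bytes(data[pos:end]))
    return pos + m.start() if m else None


# Three gap bytes at the front, followed by real data.
stream = [None, None, None] + list(b"xxAB12xx")
start = next_non_gap(stream, 0)
found = resync_on_regex(stream, 0, rb"AB\d+")
```

In this model the search starts directly behind the gap instead of depending on the matcher to cope with the gap itself.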
* Refactor test `spicy.types.unit.synchronize-on-gap`. (Benjamin Bannier, Corelight)
This refactoring cleans up how we feed gaps into the parser to make testing
with more inputs simpler.
1.10.0-dev.144 | 2024-02-14 15:55:35 +0100
* GH-1652: Fix filters consuming too much data. (Benjamin Bannier, Corelight)
We would previously assume that a filter would consume all available
data. This only holds if the filter is attached to a top-level unit, but
in general not if some sub-unit uses a filter. With this patch we
explicitly compute how much data is consumed.
Closes #1652.
1.10.0-dev.142 | 2024-02-08 17:00:53 +0100
* GH-1668: Fix incorrect data consumption for `&max-size`. (Benjamin Bannier, Corelight)
We would previously handle `&size` and `&max-size` almost identically
with the only difference that `&max-size` sets up a slightly larger view
to accommodate a sentinel. In particular, we also used identical code to
set up the position where parsing should resume after such a field.
This was incorrect as it is in general impossible to tell where parsing
continues after a field with `&max-size` since it does not signify a
fixed view like `&size`. In this patch we now compute the next position
for a `&max-size` field by inspecting the limited view to detect how
much data was extracted.
Closes #1668.
1.10.0-dev.140 | 2024-02-08 13:22:52 +0100
* GH-1522: Drop overzealous validator. (Benjamin Bannier, Corelight)
This validator was intended to reject incorrect parsing of vectors but instead
ended up rejecting all vector parsing if the vector elements themselves produced
vectors. Since this code has no test and it seems to have no clear purpose this
patch drops this validation.
Closes #1522.
1.10.0-dev.138 | 2024-02-08 13:20:31 +0100
* Remove now unused testing `random.seed`. (Benjamin Bannier, Corelight)
* GH-1659: Lift requirement that `bytes` forwarded from filter be mutable. (Benjamin Bannier, Corelight)
Since we did not declare the `bytes` argument to `unit::forward`
constant it was implicitly mutable so that it was impossible to e.g.,
pass literal bytes. This patch uses the originally intended type.
Closes #1659.
1.10.0-dev.135 | 2024-01-30 12:29:28 +0100
* GH-1665: Fix running of `codebase` tests. (Benjamin Bannier, Corelight)
1.10.0-dev.133 | 2024-01-29 10:59:23 +0100
* GH-1489: Deprecate &bit-order on bit ranges. (Arne Welzel, Corelight)
This does not appear to have any effect and allowing it may be
confusing to users. Deprecate it with the idea of eventual
removal.
* Add extensive bitfield test including endianness behavior. (Arne Welzel, Corelight)
* Add bitfield examples. (Arne Welzel, Corelight)
1.10.0-dev.129 | 2024-01-23 17:48:19 +0100
* Adjust end of Bison-generated locations. (Robin Sommer, Corelight)
The Bison-side end column points one beyond the element, which isn't
that nice in error messages, so we adjust them now.
* Extend location printing to include single-line ranges. (Robin Sommer, Corelight)
For a location of, e.g., "line 1, column 5 to 10", we now print
`1:5-10`, whereas we used to print it as only `1:5`, hence dropping
information.
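The printing rule described above can be sketched as a small formatter. The function name and signature are hypothetical, and the formats for zero-width and multi-line locations are assumptions; only the `1:5-10` form is taken from the entry.

```python
def format_location(line_from: int, col_from: int, line_to: int, col_to: int) -> str:
    """Render a source location as text (hypothetical helper)."""
    if line_from == line_to:
        if col_from == col_to:
            # Assumption: a zero-width location prints as a single position.
            return f"{line_from}:{col_from}"
        # Single-line range: keep the end column instead of dropping it.
        return f"{line_from}:{col_from}-{col_to}"
    # Assumption: the multi-line format is not described in the entry above.
    return f"{line_from}:{col_from}-{line_to}:{col_to}"


print(format_location(1, 5, 1, 10))
```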
* Fix Bison locations. (Robin Sommer, Corelight)
If a Bison rule started with an optional element, its location would
start at the end of the *previous* token if that optional element
wasn't present. We now skip token locations that are zero-sized at the
beginning of a rule.
1.10.0-dev.125 | 2024-01-23 16:27:43 +0100
* Bump 3rdparty/any from `e88b1bf` to `7c76129`
* Fix incorrect check_suite type in GH actions [skip CI]. (Benjamin Bannier, Corelight)
1.10.0-dev.122 | 2024-01-18 08:39:05 +0100
* Update documentation of `offset()`. (Benjamin Bannier, Corelight)
The originally documented caveats do not apply anymore, so remove them.
* Always store a valid begin iterator in units. (Benjamin Bannier, Corelight)
* Always pass a valid begin iterator to parse methods. (Benjamin Bannier, Corelight)
* GH-1648: Provide meaningful unit `__begin` value when parsing starts. (Benjamin Bannier, Corelight)
We previously would not provide `__begin` when starting the initial
parse. This meant that e.g., `offset()` was not usable if nothing ever
got parsed.
With this patch we provide a meaningful value now.
Closes #1648.
1.10.0-dev.117 | 2024-01-17 16:43:22 +0100
* GH-1640: Implement skipping for any field with known size. (Benjamin Bannier, Corelight)
This patch adds `skip` support for fields with `&size` attribute or of
builtin type with known size. If a unit has a known size and it is
specified in a `&size` attribute this also allows skipping over unit
fields.
* Add method to field to optionally return its size. (Benjamin Bannier, Corelight)
* Fix skipping of literal fields with condition. (Benjamin Bannier, Corelight)
1.9.0-111 | 2024-01-11 11:55:07 +0100
* GH-1645: Fix `&size` check. (Robin Sommer, Corelight)
The current parsing offset could legitimately end up just beyond the
`&size` amount.
1.9.0-109 | 2024-01-11 09:00:08 +0100
* GH-1634: Fix infinite loop in regular expression parsing. (Robin Sommer, Corelight)
1.9.0-107 | 2024-01-11 08:59:05 +0100
* Bump justrx to pull in bugfix. (Robin Sommer, Corelight)
* CI: Re-enable `clang17_ubuntu_debug_task`. (Robin Sommer, Corelight)
1.10.0-dev.103 | 2024-01-09 14:00:44 +0100
* Remove references to Cirrus cron task. (Benjamin Bannier, Corelight)
We do not schedule CI runs with cron anymore.
* Run CI ASAN task also for PRs. (Benjamin Bannier, Corelight)
* Skip tests around stack size on macos with ASAN. (Benjamin Bannier, Corelight)
These tests produce false positives on macos with ASAN which seem hard
to get rid of. Disable them on the smallest possible subset of setups.
* Fix ASAN false positives introduced with d332827aee7d70cdb642631d7a289751e1d8a36a. (Benjamin Bannier, Corelight)
Since the logging changes of d332827aee7d70cdb642631d7a289751e1d8a36a
ASAN reports false positives on e.g., macos-13.6.2 with its
clang-1500.0.40.1. This patch slightly reorganizes the code so these
false positives are not triggered.
1.10.0-dev.98 | 2024-01-09 13:55:27 +0100
* Bump dependencies. (Benjamin Bannier, Corelight)
* GH-1632: Bump justrx to pull in bugfix. (Robin Sommer, Corelight)
Closes #1632.
* CI: drop freebsd-12, add freebsd-14. (Benjamin Bannier, Corelight)
* GH-1500: Add `+=` operator for `string`. (Benjamin Bannier, Corelight)
This allows appending to a `string` without having to allocate a new
string. This might perform better most of the time.
Closes #1500.
1.9.0-97 | 2024-01-09 10:13:30 +0100
* GH-1632: Bump justrx to pull in bugfix. (Robin Sommer, Corelight)
1.10.0-dev.91 | 2024-01-04 12:44:32 +0100
* Add unit tests for extracting from expanding Views. (Benjamin Bannier, Corelight)
* Add unit tests for extracting from Views with gaps. (Benjamin Bannier, Corelight)
* Add fast pass to noop unsafe stream iterator increment/decrement. (Benjamin Bannier, Corelight)
We already had this optimization for safe, but not for unsafe iterators.
* GH-1628: Avoid potentially inefficient data access in `View::extract`. (Benjamin Bannier, Corelight)
While we already had a fast path to extract data out of `View`s
consisting of a single chunk we still would use a potentially
inefficient approach when extracting from `View`s over multiple chunks.
This patch uses the same optimized handling for both cases.
Closes #1628.
1.10.0-dev.86 | 2024-01-02 16:38:08 +0100
* Suppress clang-tidy `misc-include-cleaner` lint. (Benjamin Bannier, Corelight)
This currently triggers a lot of issues. While it seems to be useful,
getting us to a passing state seems to require substantial work.
* Remove outdated workaround for clang-tidy. (Benjamin Bannier, Corelight)
With newer clang-tidy versions this is not needed anymore.
* Reenable clang-tidy `readability-simplify-boolean-expr` lint. (Benjamin Bannier, Corelight)
This currently triggers no issues, and the lint seems to be generally
useful.
* Drop `;` after `#pragma`. (Benjamin Bannier, Corelight)
* Fix clang-tidy `readability-redundant-string-init` lints. (Benjamin Bannier, Corelight)
* Fix clang-tidy `bugprone-exception-escape` lints. (Benjamin Bannier, Corelight)
* Fix clang-tidy `misc-header-include-cycle` lints. (Benjamin Bannier, Corelight)
* Fix clang-tidy `bugprone-use-after-move` lints. (Benjamin Bannier, Corelight)
* Suppress clang-tidy `bugprone-switch-missing-default-case` lints. (Benjamin Bannier, Corelight)
* Fix clang-tidy `readability-avoid-unconditional-preprocessor-if` lints. (Benjamin Bannier, Corelight)
* Fix clang-tidy `modernize-use-emplace` lints. (Benjamin Bannier, Corelight)
* Suppress clang-tidy `modernize-macro-to-enum` lints. (Benjamin Bannier, Corelight)
* Fix clang-tidy `performance-avoid-endl` lints. (Benjamin Bannier, Corelight)
* Fix clang-tidy `modernize-type-traits` lints. (Benjamin Bannier, Corelight)
* Update clang-tidy suppression for newer clang-tidy versions. (Benjamin Bannier, Corelight)
* Remove nightly task. (Benjamin Bannier, Corelight)
This task originally tested against Zeek's `master` branch. We have not done
that for some time, so this task has lost its purpose.
* Remove `--rpath` flag to `ci/run-ci`. (Benjamin Bannier, Corelight)
Uses of this flag have been mirroring external setup of
`LD_LIBRARY_PATH` in `.cirrus.yml` for some time with no added benefit.
Drop it instead of investing work in keeping it consistent.
* Disable building benchmarks in CI. (Benjamin Bannier, Corelight)
On some platforms the CMake configure phase of 3rdparty/benchmark fails
with errors like
```
CMake Error at 3rdparty/benchmark/CMakeLists.txt:301 (message):
Failed to determine the source files for the regular expression backend
```
Since it is not executed in CI anyway disable it.
* Refresh LLVM version in CI Docker image. (Benjamin Bannier, Corelight)
* Remove unused Docker image args. (Benjamin Bannier, Corelight)
* Bump CI image to ubuntu-22.04. (Benjamin Bannier, Corelight)
1.10.0-dev.64 | 2024-01-02 15:50:26 +0100
* Add dependency of generated file `spicy-build` on source file. (Benjamin Bannier, Corelight)
Without this dependency we never updated the generated file.
* Remove use of `IntrusivePtr` over `const`. (Benjamin Bannier, Corelight)
It looks like using a `IntrusivePtr<const T>` interferes with default'ed
move constructors and assignment operators so that the non-move versions
are called. This can incur additional overhead.
This patch simply gets rid of all `IntrusivePtr<const T>` in favor of
`IntrusivePtr<T>` which does not seem to show this issue. It might be
possible to instead adjust something e.g., `ManagedObject` so
`IntrusivePtr` does not regress when holding semantically `const`
values though.
For a big internal parser this shows runtime improvements up to 2%.
1.10.0-dev.61 | 2024-01-02 15:50:05 +0100
* Fix docs namespace for symbols from `filter` module. (Benjamin Bannier, Corelight)
We previously would document these symbols to be in `spicy` even though
they are in `filter`.
1.10.0-dev.59 | 2023-12-15 13:29:41 +0100
* Add move ctr for `stream::SafeConstIterator`. (Benjamin Bannier, Corelight)
* Use unchecked operations for `View::unsafeEnd`. (Benjamin Bannier, Corelight)
We previously would call `end()` to compute `unsafeEnd` which incurs
overheads this function is explicitly designed to avoid. We now switch
the implementation of this function to completely unsafe code avoiding
these overheads.
* Add move ctr for `stream::View`. (Benjamin Bannier, Corelight)
In C++17 move ctors are not by default generated, add them explicitly.
We also get rid of the `View` dtr since it adds nothing (e.g., the class
is declared `final`), but declaring them could e.g., make the class not
POD anymore (this is not the case here though).
1.10.0-dev.55 | 2023-12-12 18:38:24 +0100
* GH-1617: Fix handling of `%synchronize-*` attributes for units in lists. (Benjamin Bannier, Corelight)
We previously would not detect `%synchronize-at` or `%synchronize-from`
attributes if the unit was not directly in a field, i.e., we mishandled
the common case of synchronizing on a unit in a list.
With this patch we now handle these attributes, regardless of how the
unit appears.
1.10.0-dev.53 | 2023-12-12 13:05:41 +0100
* Avoid unnecessary copy when constructing `Stream` from `Bytes`. (Benjamin Bannier, Corelight)
* Back Chunk with a std::string instead of std::variant. (Arne Welzel, Corelight)
This is probably a bit funky due to the trailing \0 and a Chunk not
being a string, but hey, hilti::rt::Bytes is backed by std::string, too.
Looking at the creation of Chunk::Chunk(), there's actually a memset()
operation visible due to the variant holding some 40 bytes that are
zero initialized upon construction. std::string supports SSO, so we
can leverage that even if the actual size is out of our control. The
std::get_if()s do show up very marginally for the View::size() calls.
1.10.0-dev.50 | 2023-12-11 12:08:12 +0100
* GH-1615: Optimize C++ tuple element coercions. (Benjamin Bannier, Corelight)
This patch introduces an optimization to pass C++ tuples through if they
do not need C++-side coercion. This emits better code if a tuple ctor
already has the right types. We also optimize codegen if we are coercing
from a temporary tuple in that we now capture it in a temporary and
coerce elements from that instead of repeatedly emitting the same tuple
ctor (we scale with the number of tuple elements now not its square
anymore).
1.10.0-dev.48 | 2023-12-08 15:55:12 +0100
* Use deterministic destruction of Spicy runtime in unit tests. (Benjamin Bannier, Corelight)
We previously would shut down the runtimes in unit tests with
library destructors. Due to recent changes ASAN correctly flags this as
potential use-after-free errors (e.g., it might run after the
configuration global has already been destructed).
With this patch we implement our own doctest main function and
deterministically tear down runtimes from that.
1.10.0-dev.46 | 2023-12-08 09:56:14 +0100
* Remove redundant forward decl. (Benjamin Bannier, Corelight)
* GH-1611: Silence ASAN warning. (Benjamin Bannier, Corelight)
1.10.0-dev.43 | 2023-12-08 09:55:13 +0100
* Allocate Vector::_control lazily. (Arne Welzel, Corelight)
* Allocate Bytes::_control lazily. (Arne Welzel, Corelight)
1.10.0-dev.40 | 2023-12-06 12:31:27 +0100
* GH-1605: Allow for unresolved types for set `in` operator. (Benjamin Bannier, Corelight)
1.10.0-dev.38 | 2023-12-06 12:28:27 +0100
* Avoid reallocating Bytes when appending from view (Arne Welzel, Corelight)
When copying a full view into a Bytes instance, avoid potential
reallocations and memcpy() by pre-allocating enough capacity in
the underlying string.
* Allocate vector of correct size right away when constructing Chunks. (Benjamin Bannier, Corelight)
* Avoid one extra std::string copy in Stream::append. (Arne Welzel, Corelight)
For a chunk of significant size, seems that would result in an extra
malloc and copy of the input data.
* Reduce safe iterator use in internal code for `Bytes`. (Benjamin Bannier, Corelight)
The safe iterator for bytes dynamically allocates which can cause
overhead. Use index-based or at least string iterators to reduce that
overhead where possible.
* Use unsafe iterators in `View::extract`. (Benjamin Bannier, Corelight)
* Avoid unneeded parameter copy in parse functions. (Benjamin Bannier, Corelight)
* Streamline `View::firstBlock`. (Benjamin Bannier, Corelight)
* Reduce allocations when connecting sinks by mime-type. (Benjamin Bannier, Corelight)
* Reduce allocations for `Sink` debug logging. (Benjamin Bannier, Corelight)
* Deprecate `builder::string`. (Benjamin Bannier, Corelight)
* Reduce allocations when creating exceptions. (Benjamin Bannier, Corelight)
* Reduce allocations in `builder::addAssert`. (Benjamin Bannier, Corelight)
* Reduce allocations in `ParserBuilder::waitForInput`. (Benjamin Bannier, Corelight)
* Reduce allocations in `Builder::startProfiler`. (Benjamin Bannier, Corelight)
* Reduce allocations in `Builder::addDebug*` methods. (Benjamin Bannier, Corelight)
* Reduce allocations in `ParserBuilder::parseError`. (Benjamin Bannier, Corelight)
* Reduce copying when forwarding data to sinks. (Benjamin Bannier, Corelight)
* Reduce lookup overhead in indent/dedent logger functions. (Benjamin Bannier, Corelight)
* Emit C++ string literals for HILTI string literals. (Benjamin Bannier, Corelight)
When emitting literals for HILTI strings (string ctors) we would
previously explicitly force creation of `std::string`. This was almost
always an unnecessary pessimisation over emitting string literals since
even if their C++ uses expected `std::string`, string literals can
convert to this type implicitly; at the same time it made it impossible
to make effective use of APIs accepting `std::string_view`.
With this patch we now emit C++ string literals for HILTI string
literals.
* Emit locations as generated strings. (Benjamin Bannier, Corelight)
* Avoid allocations when creating parsers. (Benjamin Bannier, Corelight)
* Use non-owning strings for `fmt`. (Benjamin Bannier, Corelight)
* Use non-owning strings for `printParserState`. (Benjamin Bannier, Corelight)
* GH-1591: Use non-owning strings in the logging framework. (Benjamin Bannier, Corelight)
Closes #1591.
* Add `to_string_for_print` for string literals. (Benjamin Bannier, Corelight)
* GH-1589: Reduce string allocations on hot parse path. (Benjamin Bannier, Corelight)
When waiting for input we pass down strings for a possible error message
and the triggering location. In generated code these are always
literals.
With this patch we do not take them as owning strings, but instead as
views into existing strings to minimize allocations. In the case of
error messages the created low-level exception objects already had used
string_views, so this also aligns the APIs.
Closes #1589.
* Bump pre-commit hooks (Benjamin Bannier, Corelight)
1.10.0-dev.10 | 2023-12-06 10:40:22 +0100
* Allow unsafe Vector iteration over underlying std::vector. (Arne Welzel, Corelight)
1.10.0-dev.8 | 2023-11-17 14:04:16 +0100
* doc: Fix typo and outdated info for host applications (Anthony VEREZ)
* Add placeholder section for 1.10 to NEWS. (Benjamin Bannier, Corelight)
1.9.0 | 2023-10-24 16:28:18 +0200
* Release 1.9.0.
1.9.0-dev.159 | 2023-10-24 16:26:42 +0200
* Bump 3rdparty/utf8proc (dependabot[bot])
1.9.0-dev.157 | 2023-10-23 10:58:57 +0200
* Fix spicy-build to correctly infer library directory. (Robin Sommer, Corelight)
Closes https://github.com/zeek/zeek/issues/3384.
1.9.0-dev.155 | 2023-10-23 10:54:13 +0200
* GH-1565: Disable capturing backtraces with HILTI exceptions in non-debug builds. (Robin Sommer, Corelight)
They can be expensive to capture, and aren't used anywhere by default
unless explicitly requested.
We change it so that the exception class still remains ABI
compatible between release and debug builds. It's the compilation
settings of the code including `exception.h` that determines if a
backtrace is captured.
Closes #1565.
1.9.0-dev.152 | 2023-10-23 10:49:19 +0200
* GH-1567: Speed up runtime calls to start profilers. (Robin Sommer, Corelight)
We now cache the profiler tags to avoid frequent re-computation.
Closes #1567.
1.9.0-dev.150 | 2023-10-23 10:48:59 +0200
* GH-1568: Fix bitfield's lack of declaring its C++-side type dependencies. (Robin Sommer, Corelight)
Closes #1568.
1.9.0-dev.148 | 2023-10-23 10:10:39 +0200
* Remove check of Zeek docs in Cirrus cron config. (Benjamin Bannier, Corelight)
This is a follow-up to 2b92da596631ff1a29b7deac05e00faaad9305f3.
1.9.0-dev.146 | 2023-10-13 12:26:24 +0200
* Remove Zeek-specific documentation. (Robin Sommer, Corelight)
This now lives Zeek-side.
This keeps the section structure so that we make things easier to
find, and put in pointers to the new Zeek-side documentation.
It leaves everything in the development section as is; hard to piece
apart and seems fine to leave it here.
1.9.0-dev.144 | 2023-10-13 11:47:35 +0200
* GH-1571: Remove trimming inside individual chunks. (Benjamin Bannier, Corelight)
Trimming `Chunk`s (always from the left) causes a lot of internal work
with only limited benefit since we manage visibility with `stream::View`s
on top of `Chunk`s anyway.
This patch removes trimming inside `Chunk`s so now any trimming only
removes `Chunk`s from `Chain`s, but does not internally change
individual `Chunk`s anymore. This might lead to slightly increased memory
use, but callers usually have that data in memory anyway.
Closes #1571.
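The trimming behaviour described here can be illustrated with a toy model. The `Chain` class below is a hypothetical stand-in, not the runtime's C++ class: trimming drops only chunks that lie entirely before the requested offset and leaves a straddling chunk whole.

```python
from collections import deque


class Chain:
    """Toy chain of chunks; each chunk is stored as (start_offset, data)."""

    def __init__(self) -> None:
        self.chunks = deque()
        self.end = 0  # offset one past the last byte appended

    def append(self, data: bytes) -> None:
        self.chunks.append((self.end, data))
        self.end += len(data)

    def trim(self, offset: int) -> None:
        """Drop chunks that end at or before `offset`.

        A chunk straddling `offset` is kept whole rather than cut.
        """
        while self.chunks:
            start, data = self.chunks[0]
            if start + len(data) > offset:
                break
            self.chunks.popleft()


chain = Chain()
chain.append(b"abcd")  # offsets 0..3
chain.append(b"efgh")  # offsets 4..7
chain.trim(6)          # falls inside the second chunk
```

After `trim(6)` the second chunk still holds all four bytes. In the real runtime, per the entry above, which of them are visible is governed by a `stream::View`.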
1.9.0-dev.142 | 2023-10-09 11:10:35 +0200
* Explicitly set ASM compiler. (Benjamin Bannier, Corelight)
When not setting an ASM compiler we might end up configuring Clang as C
compiler, but GCC as ASM compiler. The code in `3rdparty/fiber` then
causes Clang-only flags like `-Weverything` to be passed to GCC (which
does not understand it).
The way we need to do this is subtle. We need to rely on
`3rdparty/fiber` to explicitly `enable_language(ASM)` and not set it as
a project language to work around `find_package(BISON)` breaking if
`CMAKE_ASM_COMPILER` is set. By removing ASM from project languages and
setting the ASM config after Bison was found, but before we configure
`3rdparty/fiber` things seem to work as intended.
1.9.0-dev.140 | 2023-10-06 12:34:02 +0200
* Set `ParserState::begin` on all possible paths into a parser. (Benjamin Bannier, Corelight)
We previously would sometimes leave `ParserState::begin` unset, e.g.,
when beginning parsing of a top-level production. With this
patch we now set its value on all possible paths.
* Fix spicy-rt-tests for freebsd-12. (Benjamin Bannier, Corelight)
* GH-1089: Make `offset()` independent of random access functionality. (Benjamin Bannier, Corelight)
With this patch we store the value returned by `offset()` directly in
the unit instead of computing it on the fly when requested from `cur -
begin`. With that `offset()` can be used without enabling random access
functionality on the unit.
Closes #1089.
* Move `offset()` into its own feature. (Benjamin Bannier, Corelight)
This patch makes `__position` depend on `uses_offset` so uses of it
(`position()`, `offset()`) can be tracked separately. Since `offset()`
still requires random access `uses_offset` enables `uses_random_access`
for that use case; we plan to relax that in a follow-up patch.
* Consistently wrap `ParserState::begin` in optional. (Benjamin Bannier, Corelight)
1.9.0-dev.134 | 2023-10-04 11:20:46 +0200
* Bump 3rdparty/fiber from `f75b2f9` to `ada36b2` (dependabot[bot])
Bumps [3rdparty/fiber](https://github.com/simonfxr/fiber) from `f75b2f9` to `ada36b2`.
- [Commits](https://github.com/simonfxr/fiber/compare/f75b2f93aa312b9922f9c977021e5470ff7715fa...ada36b254c10d487eb4d8d108a3cb156538e1885)
---
updated-dependencies:
- dependency-name: 3rdparty/fiber
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
* Bump 3rdparty/benchmark from `0d98dba` to `7736df0` (dependabot[bot])
Bumps [3rdparty/benchmark](https://github.com/google/benchmark) from `0d98dba` to `7736df0`.
- [Release notes](https://github.com/google/benchmark/releases)
- [Commits](https://github.com/google/benchmark/compare/0d98dba29d66e93259db7daa53a9327df767a415...7736df03049c362c7275f7573de6d6a685630e0a)
---
updated-dependencies:
- dependency-name: 3rdparty/benchmark
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
* Bump 3rdparty/SafeInt from `4cafc91` to `925235c` (dependabot[bot])
Bumps [3rdparty/SafeInt](https://github.com/dcleblanc/SafeInt) from `4cafc91` to `925235c`.
- [Release notes](https://github.com/dcleblanc/SafeInt/releases)
- [Commits](https://github.com/dcleblanc/SafeInt/compare/4cafc9196c4da9c817992b20f5253ef967685bf8...925235c06de490865f871218fa2bd8ac38241a95)
---
updated-dependencies:
- dependency-name: 3rdparty/SafeInt
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
* GH-1549: GH-1554: Fix potential infinite loop when trimming data before stream. (Benjamin Bannier, Corelight)
Previously we would trigger an infinite loop if one tried to trim before
the head chunk of a stream. In practice this seems to have been no issue
due to #1549 and us emitting far fewer calls to trim than possible.
This patch adds an explicit check whether we need to trim anything, and
exits the low-level function early for such cases.
Closes #1554.
1.9.0-dev.128 | 2023-09-29 13:27:28 +0200
* GH-1550: Replace recursive deletion with explicit loop to avoid stack overflow. (Benjamin Bannier, Corelight)
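The general technique can be shown on a toy linked chain. This is an analogy in Python, not the C++ code from the fix, and `Node`, `release_recursive` and `release_iterative` are hypothetical names: the recursive variant needs one stack frame per element, while the loop uses constant stack space.

```python
class Node:
    """One element of a singly linked chain."""

    def __init__(self, next_node=None):
        self.next = next_node


def release_recursive(node):
    """Unlink a chain recursively; stack depth grows with chain length."""
    if node is None:
        return 0
    rest = node.next
    node.next = None
    return 1 + release_recursive(rest)


def release_iterative(node):
    """Unlink a chain with a loop; stack use stays constant."""
    count = 0
    while node is not None:
        rest = node.next
        node.next = None  # detach before moving on
        node = rest
        count += 1
    return count


# Build a long chain and tear it down without recursion.
head = None
for _ in range(100_000):
    head = Node(head)

released = release_iterative(head)
```

Calling `release_recursive` on a chain of this length is expected to raise `RecursionError` under CPython's default recursion limit, which is the Python counterpart of the stack overflow the entry refers to.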
1.9.0-dev.126 | 2023-09-29 13:26:53 +0200
* GH-1549: Add feature guards to accesses of a unit's `__position`. (Benjamin Bannier, Corelight)
Access of `__position` triggers a random access functionality. In order
to distinguish our internal uses from accesses due to user code, most
access in our generated code should be guarded with a feature constant
(`if` or ternary).
In this patch we add proper guards for a couple of instances where we did not
do that correctly. That mishap caused all units with containers to be
random access (even the root unit) which in turn could have led to
e.g., unbounded memory growth, or runtime overhead due to generation and
execution of unneeded code, or expensive cleanup on very large untrimmed
inputs.
Closes #1549.
* Refactor feature guard helper to directly take unit ID instead of type. (Benjamin Bannier, Corelight)
1.9.0-dev.123 | 2023-09-27 09:41:04 +0200
* Escape rendered bitfield fields. (Benjamin Bannier, Corelight)
* Enforce that `Bitfield`s always own their fields. (Benjamin Bannier, Corelight)
* Fix typos. (Benjamin Bannier, Corelight)
1.9.0-dev.119 | 2023-09-27 09:39:35 +0200
* GH-1542: GH-1547: Bump dependencies. (Benjamin Bannier, Corelight)
1.9.0-dev.111 | 2023-09-26 16:05:25 +0200
* Add GH dependabot config. (Benjamin Bannier, Corelight)
1.9.0-dev.109 | 2023-09-22 16:35:02 +0200
* Add explicit dependency for local pre-commit hook. (Benjamin Bannier, Corelight)
* Fix file selection in clang-format pre-commit hook. (Benjamin Bannier, Corelight)
1.9.0-dev.106 | 2023-09-22 12:19:21 +0200
* GH-1533: Fix access to anonymous bitfield element through a constant value. (Robin Sommer, Corelight)
1.9.0-dev.104 | 2023-09-22 11:04:19 +0200
* Artificially limit the number of open files. (Benjamin Bannier, Corelight)
This works around a silent failure in reproc where it would refuse to
run on systems with huge rlimits for the number of open files. We have
seen this hit on huge production boxes.
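A minimal POSIX sketch of such a limit, assuming a hypothetical cap `MaxOpenFiles` and helper name `limitOpenFiles`; the value and call site used by the actual patch are not stated in this entry.

```cpp
#include <algorithm>
#include <sys/resource.h>

// Hypothetical cap; chosen for illustration only.
constexpr rlim_t MaxOpenFiles = 4096;

// Lowers the soft RLIMIT_NOFILE limit to at most `cap`.
// Returns true on success.
bool limitOpenFiles(rlim_t cap = MaxOpenFiles) {
    struct rlimit limit {};
    if ( getrlimit(RLIMIT_NOFILE, &limit) != 0 )
        return false;

    // Only the soft limit is touched, and it is only ever lowered, so no
    // extra privileges are needed.
    limit.rlim_cur = std::min(limit.rlim_cur, cap);
    return setrlimit(RLIMIT_NOFILE, &limit) == 0;
}
```

Child processes inherit the lowered soft limit, which is what makes this usable as a workaround before spawning a subprocess.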
1.9.0-dev.102 | 2023-09-22 11:04:01 +0200
* GH-1478: Add regression test for #1478. (Benjamin Bannier, Corelight)
* Add begin to parser state. (Benjamin Bannier, Corelight)
This patch adds the current begin position to the parser state, and
makes the corresponding changes to generated parser functions so it is
passed down.
We already modelled the semantic beginning of the input in the unit, but
had no reliable way to keep this up-to-date across non-unit contexts
like `&parse-from`. This would then for certain setups lead to generated
code where `input` and `position` would point to different inputs which in
turn caused `offset` (modelled as `position - input`) to be incorrect.
* Factor computation of feature const into separate function. (Benjamin Bannier, Corelight)
* Expand validator error message. (Benjamin Bannier, Corelight)
* Add trait class to bitfield types. (Robin Sommer, Corelight)
This lets C++ templates test if a class T is a bitfield.
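One common way to build such a trait is an empty tag base class combined with `std::is_base_of`. The sketch below assumes that approach; `trait::IsBitfield`, `Flags`, `is_bitfield_v` and `kind` are hypothetical names and not necessarily those used by the runtime library.

```cpp
#include <string_view>
#include <type_traits>

namespace trait {
// Empty tag class; deriving from it marks a type as a bitfield.
struct IsBitfield {};
} // namespace trait

// Hypothetical generated bitfield type carrying the trait.
struct Flags : trait::IsBitfield {
    unsigned a = 0;
    unsigned b = 0;
};

// An ordinary struct without the trait.
struct Plain {
    int x = 0;
};

template<typename T>
inline constexpr bool is_bitfield_v = std::is_base_of_v<trait::IsBitfield, T>;

// Example consumer: selects a label at compile time based on the trait.
template<typename T>
constexpr std::string_view kind() {
    if constexpr ( is_bitfield_v<T> )
        return "bitfield";
    else
        return "other";
}
```

Templates can then branch on `is_bitfield_v<T>` without knowing the concrete bitfield type.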
1.9.0-dev.95 | 2023-09-20 12:14:50 +0200
* Add trait class to bitfield types. (Robin Sommer, Corelight)
1.9.0-dev.93 | 2023-09-20 09:26:29 +0200
* Declare Spicy pygments extension as parallel-safe. [skip CI] (Benjamin Bannier, Corelight)
1.9.0-dev.91 | 2023-09-19 11:50:32 +0200
* Skip validating links to https://www.icir.org/hilti. (Benjamin Bannier, Corelight)
* Document using anonymous field for extracting TCP messages. (Benjamin Bannier, Corelight)
1.9.0-dev.88 | 2023-09-15 10:47:53 +0200
* Use find_package(Python) with version. (Arne Welzel, Corelight)
Zeek's configure sets Python_EXECUTABLE as a hint, but Spicy is using
find_package(Python3) and would only use Python3_EXECUTABLE as hint.
This results in Spicy finding a different (the default) Python executable
when configuring Zeek with --with-python=/opt/custom/bin/python3.
Switch Spicy over to use find_package(Python) and add the minimum
version so it knows to look for Python3.
1.9.0-dev.86 | 2023-09-14 10:36:25 +0200
* Add support for passing arbitrary C++ compiler flags. (Benjamin Bannier, Corelight)
This adds a magic environment variable `HILTI_CXX_FLAGS` which if set
specifies compiler flags which should be passed during C++ compilation
after implicit flags. This could be used to e.g., set defines, or set
low-level compiler flags.
Even with this flag, for passing include directories one should still
use `HILTI_CXX_INCLUDE_DIRS` since they are searched before any
implicitly added paths.
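The ordering described above (implicit flags first, then the contents of `HILTI_CXX_FLAGS`) can be pictured with the following C++ sketch. The helper name `cxxFlags` and the plain whitespace splitting are assumptions made for illustration; how the variable is actually tokenized is not described in this entry.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Returns `flags` (the implicit flags) followed by any flags found in the
// HILTI_CXX_FLAGS environment variable.
std::vector<std::string> cxxFlags(std::vector<std::string> flags) {
    if ( const char* extra = std::getenv("HILTI_CXX_FLAGS") ) {
        // Assumption: flags are separated by whitespace, with no quoting.
        std::istringstream in(extra);
        for ( std::string flag; in >> flag; )
            flags.push_back(flag);
    }

    return flags;
}
```

Appending after the implicit flags lets a user-provided flag override an earlier implicit one for options where the last occurrence wins.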
1.9.0-dev.84 | 2023-09-07 17:12:00 +0200
* GH-1467: Support bitfield constants in Spicy for parsing. (Robin Sommer, Corelight)
One can now define bitfield "constants" for parsing by providing
integer expressions with fields:
    type Foo = unit {
        x: bitfield(8) {
            a: 0..3 = 2;
            b: 4..7;
            c: 7 = 1;
        };
    };
This will first parse the bitfield as usual and then enforce that the
two bit ranges that come with expressions (i.e., `a` and `c`)
indeed contain the expected values. If they don't, that's a parse
error.
We also support using such bitfield constants for look-ahead parsing:
    type Foo = unit {
        x: uint8[];
        y: bitfield(8) {
            a: 0..3 = 4;
            b: 4..7;
        };
    };
This will parse uint8s until a value is discovered that has its bits
set as defined by the bitfield constant.
(We use the term "constant" loosely here: only the bits with values
are actually enforced to be constant, all others are parsed as usual.)
Closes #1467.
* Extend bitfield type with per-item storage for constant values. (Robin Sommer, Corelight)
This allows associating an expression with each bit value. We don't
use this from HILTI because there isn't a good syntax to do so
(and/or: it's not worth adding), but we'll use (and test) this from
Spicy in a subsequent commit.
* Add bitfield constants. (Robin Sommer, Corelight)
It's now possible to initialize a bitfield value through an assignment
from a struct constructor expression:
    type BF = bitfield(8) {
        a: 0..3;
        b: 4..7;
        c: 4..5;
    };
    global BF bf = [$a = 1, $c = 2];
* Change internal tuple representation of bitfield to store optional values. (Robin Sommer, Corelight)
This will allow us to create bitfield constants where not all
elements are set. We also add an operator to test if an element is
set (test forthcoming in a subsequent commit).
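To make the semantics of the bitfield constants in this release concrete, here is a small C++ model, not the generated parser code. It assumes bit 0 is the least significant bit and stores each item's expected value as an optional, so that items without a value are not enforced. `Item`, `bits`, `matches` and `findMatch` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// One bitfield item: an inclusive bit range plus an optional expected value.
// An unset value means the item is parsed as usual and not enforced.
struct Item {
    unsigned lo;
    unsigned hi;
    std::optional<std::uint8_t> expected;
};

// Extracts bits `lo..hi` (inclusive) from `byte`.
std::uint8_t bits(std::uint8_t byte, unsigned lo, unsigned hi) {
    const unsigned width = hi - lo + 1;
    return static_cast<std::uint8_t>((byte >> lo) & ((1u << width) - 1u));
}

// True if every item that carries an expected value holds it in `byte`.
// A parser would report a parse error when this returns false.
bool matches(std::uint8_t byte, const std::vector<Item>& items) {
    for ( const auto& item : items ) {
        if ( item.expected && bits(byte, item.lo, item.hi) != *item.expected )
            return false;
    }

    return true;
}

// Look-ahead flavor: index of the first byte matching the constant, if any.
std::optional<std::size_t> findMatch(const std::vector<std::uint8_t>& data,
                                     const std::vector<Item>& items) {
    for ( std::size_t i = 0; i < data.size(); ++i ) {
        if ( matches(data[i], items) )
            return i;
    }

    return std::nullopt;
}
```

In this model the first Spicy example from GH-1467 corresponds to the items `{0, 3, 2}`, `{4, 7, std::nullopt}` and `{7, 7, 1}`, and the look-ahead example corresponds to scanning a byte sequence with `findMatch` until the low four bits equal 4.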
1.9.0-dev.79 | 2023-09-07 11:15:05 +0200
* GH-1520: Fix handling of `spicy-dump --enable-print`. (Benjamin Bannier, Corelight)
This flag was understood, but not handled since at least v1.1.0.
1.9.0-dev.77 | 2023-09-06 14:38:55 +0200
* Clarify error handling docs [skip CI]. (Benjamin Bannier, Corelight)
The docs previously made it sound as if exceptions could potentially be
swallowed with `on %error` handlers. Since this is currently not
supported, but an often requested feature, clarify the docs to not stir
up too much anticipation.
1.9.0-dev.75 | 2023-09-05 16:15:17 +0200
* Merge branch 'topic/bbannier/rtd-search-suppress-doxygen-results' (Benjamin Bannier, Corelight)
* GH-1516: Do not include generated Doxygen documentation in RTD search results [skip CI]. (Benjamin Bannier, Corelight)
Doxygen documentation is likely not relevant for users and hides results
they are looking for. Also, Doxygen already comes with its own search
which is reachable from the developer documentation.
Closes #1516.
* Reduce ccache cache size in CI. (Benjamin Bannier, Corelight)
1.9.0-dev.71 | 2023-09-01 10:33:35 +0200
* Drop removed RTD config key. (Benjamin Bannier, Corelight)
1.9.0-dev.69 | 2023-09-01 09:57:22 +0200
* GH-1503: Handle anonymous bitfields inside `switch` statements. (Robin Sommer, Corelight)
We now map items of anonymous bitfields inside `switch` cases into
the unit namespace, just like we already do for top-level fields. We
also catch if two anonymous bitfields inside those cases carry the
same name, which would make accesses ambiguous.
So the following works now:
```
switch (self.n) {
    0 -> : bitfield(8) {
        A: 0..7;
    };
    * -> : bitfield(8) {
        B: 0..7;
    };
};
```
Whereas this does not work:
```
switch (self.n) {
    0 -> : bitfield(8) {
        A: 0..7;
    };
    * -> : bitfield(8) {
        A: 0..7;
    };
};
```
The latter not working is reasonable I think, rather than trying to
figure out that the two cases have the same type and hence could be
mapped to a single data member in the resulting code (which we do, and
support, for cases of the same *name* and type).
Closes #1503.
* Polish string rendering of anonymous bitfields in structs. (Robin Sommer, Corelight)
We now print their ID as `<anon>`. This isn't perfect; it would be
nicer if we printed out the IDs of the bitfield's items at the
top-level (like `spicy-dump` does). However, we don't have the
necessary type information available when rendering as a string, and
this seems good enough.
* Support `.?` and `?.` for items of anonymous bitfields. (Robin Sommer, Corelight)
1.9.0-dev.64 | 2023-08-28 17:05:18 +0200
* GH-1508: Fix returned value for `<unit>.position()`. (Benjamin Bannier, Corelight)
We declared this method to return an iterator, but potentially returned
an optional iterator. This patch adds an additional deref so we get the
correct value.
1.9.0-dev.62 | 2023-08-28 12:11:40 +0200
* GH-1504: Use user-inaccessible chars for encoding `::` in feature variables. (Benjamin Bannier, Corelight)
When setting up feature tracking variables for the optimizer we
previously would normalize `:` as `_`, e.g., `mod::Unit` would lead to
feature variables `__feat%mod__Unit%...`. In various places in the