Add size constraints to CTIM collection fields (XDR-46972) #483
ereteog wants to merge 4 commits into threatgrid:master
yogsototh left a comment
I have only two concerns:
- Why these numbers? To me they look like "magic numbers", and they will eventually break in PROD by suddenly refusing the creation of many entities without warning, which may cause issues with integrations that already work. Personally, I think it may be too much to constrain this statically here rather than directly in the CTIA application, where we could apply different size constraints depending on the source, for example. That would give an integration time to adapt to the new constraints. And instead of a hard reject, we could perhaps also add a mechanism that simply truncates the strings/arrays and keeps only the first part.
- We should check that adding these constraints does not affect Swagger and Swagger UI. From memory, plumatic/schema and Swagger do not handle certain kinds of external constraints like this one very well. So while the API will work, Swagger UI may break or no longer show the correct schemas.
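The first concern contrasts a hard reject with truncation and with limits that vary by source. A minimal Python sketch of that idea follows; it is not CTIA code, and the source names, the `Policy` type, and the limits in `POLICIES` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    max_len: int    # maximum number of items allowed
    truncate: bool  # True: keep the first max_len items; False: reject


# Hypothetical per-source policies; names and numbers are illustrative only.
POLICIES = {
    "legacy-integration": Policy(max_len=500, truncate=True),
    "default": Policy(max_len=500, truncate=False),
}


class CollectionTooLarge(ValueError):
    """Raised when a collection exceeds its limit and truncation is disabled."""


def apply_limit(items, source):
    """Return a list within the source's limit, truncating or rejecting."""
    policy = POLICIES.get(source, POLICIES["default"])
    if len(items) <= policy.max_len:
        return list(items)
    if policy.truncate:
        # Soft mode: keep only the leading items instead of failing.
        return list(items[:policy.max_len])
    raise CollectionTooLarge(
        f"{len(items)} items exceeds limit of {policy.max_len}"
    )
```

With a policy table like this, an existing integration could be moved from `truncate=True` to `truncate=False` once it has adapted, which is the grace period the comment asks for.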
Add pred/max-len constraints to seq-of/set-of collection fields in sighting, incident, and common schemas to prevent oversized documents causing ES bulk write failures.
- Bump flanders to 1.1.1-SNAPSHOT (adds :spec support on collection types)
- Add default-collection-max-len (500) for common collection fields
- Add specific limits for sighting fields (observables, relations, targets, data tables)
- Constrain external_ids and external_references in base entity entries
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace f/any-str with ShortString (1024), MedString (2048), or LongString (5000) across all CTIM schemas based on field semantics and production data analysis (831K sightings, 388K incidents from NAM).
- ShortString: identifiers, names, IPs, labels, type fields
- MedString: indicator specs (snort, SIOC, OpenIOC), metadata values
- LongString: casebook text content
- seq-of ShortString: hashes, variables, permissible_IPs
Observable :value intentionally kept as f/any-str: production data shows process_args up to 32K chars (6376 values > 1024).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
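To illustrate the three string tiers, here is a small Python sketch that uses the limits quoted in the commit message above. It is not the CTIM/flanders implementation; `STRING_LIMITS` and `fits` are hypothetical names.

```python
# Limits taken from the commit message; the names below are hypothetical.
STRING_LIMITS = {
    "ShortString": 1024,
    "MedString": 2048,
    "LongString": 5000,
}


def fits(tier, value):
    """Return True if value is a string no longer than the tier's limit."""
    return isinstance(value, str) and len(value) <= STRING_LIMITS[tier]
```

Under these limits a 32K-character `process_args` value would fail even the `LongString` check, which is consistent with the commit leaving Observable `:value` unconstrained.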
…onstraint
- observables: 2000 → 5000 (P99.9 = 4128)
- targets: 1000 → 2000 (P99.9 = 1841)
- relations: 10000 (unchanged, P99.9 = 9814)
- relation_info.actions: new limit of 1000 (addresses 142K action accumulation pattern)
- pred/max-len: add metadata for downstream 413 error reporting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
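The last item, attaching the limit to the predicate so a downstream handler can report a 413, can be sketched in Python as below. This only illustrates the pattern and is not the actual pred/max-len code: a function attribute stands in for the predicate's metadata, and `max_len` and `check` are hypothetical names.

```python
def max_len(limit):
    """Build a predicate that accepts collections of at most `limit` items."""
    def pred(coll):
        return len(coll) <= limit
    # Attach the limit to the predicate so callers can read it back later.
    pred.max_len = limit
    return pred


def check(field, value, pred):
    """Return (status, message); 413 when the predicate rejects the value."""
    if pred(value):
        return 200, "ok"
    limit = getattr(pred, "max_len", None)
    return 413, f"{field}: {len(value)} items exceeds limit of {limit}"


# Example with the targets limit quoted in the commit message.
targets_ok = max_len(2000)
```

Carrying the limit on the predicate lets the error message name the exact threshold without the handler having to know which schema field was being validated.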
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bfb1769 to 8237a23
Add pred/max-len constraints to collection fields in sighting, incident, and common schemas.
Changes
- Bump flanders to 1.1.1-SNAPSHOT (adds :spec support on SequenceOfType, SetOfType, and MapType)
- Add default-collection-max-len (500) constant in common.cljc for shared collection fields
- Constrain external_ids and external_references in base entity entries
- Data table limits: columns: 100, rows: 10,000
- Sighting limits: targets: 2,000, observables: 5,000, relations: 10,000
- Limit collection fields (categories, assignees, tactics, techniques) to 500
- Add relation_info.actions limit of 1,000
- Add {:max-len len} metadata to pred/max-len for downstream 413 error reporting
- Replace f/any-str with typed string constraints (ShortString, MedString, LongString)
§ QA
No QA is needed. All 139 existing tests (including generative tests) pass.
§ Release Notes
§ Squashed Commits