Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduct DetectInfo fields memory #806

Merged
merged 16 commits into from
Nov 16, 2022
Merged

Conversation

fukusuket
Copy link
Collaborator

@fukusuket fukusuket commented Nov 14, 2022

What Changed

  • Reduced memory usage of DetectInfo Memory usage has almost doubled after introducing custom profiles #789
  • added Profile Enum in option/profile.rs and remove lazy_static PRELOAD_xx.
    • to store the following data.
      • profiles.yaml key-value (before convertion)
      • converted value (ex. %Timestamp% -> 2013-10-24 01:16:13.843 +09:00)
    • for the pattern matching of profile feature
  • changed DetectInfo fileds and behaviors as follows
    • change ext_field type Nested<Vec<CompactString>> -> Vec<(CompactString, Profile)>(tuple)
    • add is_condition to check whether the detected rule was [condition] based.
    • clear details and record_information when the moment stored details string in ext_field
      • I think this change has the most affected on memory reduction in this PR.

Evidence

I got the benchmark with this environment and procedure.
Compared to version 1.8.0, I confirmed reduction of memory usage : -200MB(avg) -510MB(max peak).

version avg mem max mem Elapsed Time
v180 3.13GB peak 5.64GB 00:13:22.948
v190-dev 2.93GB peak 5.13GB 00:13:18.767

v190-dev output is as follows.

Results Summary:

Events with hits / Total events: 1,593,663 / 4,817,181 (Data reduction: 3,223,518 events (66.92%))

Total | Unique detections: 1,629,377 | 143
Total | Unique critical detections: 0 (0.00%) | 0 (0.00%)
Total | Unique high detections: 12,041 (0.74%) | 22 (15.38%)
Total | Unique medium detections: 9,941 (0.61%) | 35 (24.48%)
Total | Unique low detections: 1,056,244 (64.83%) | 40 (27.97%)
Total | Unique informational detections: 551,151 (33.83%) | 46 (32.17%)
...
Elapsed time: 00:13:18.767
Saved file: out.csv (576.3 MB)

- add: Profile Enum for profile feature
  - remove: lazy_static value LOADED_PROFILE_ALIAS, PRELOAD_PROFILE, PRELOAD_PROFILE_REGEX.
- chg: DetectInfo.ext_field Nested<Vec<CompactString>> to Vec<(CompactString, Profile)>
- chg: clear DetectInfo.detail when convert detail in message.rs::insert
@hitenkoku hitenkoku added the enhancement New feature or request label Nov 14, 2022
@hitenkoku hitenkoku linked an issue Nov 14, 2022 that may be closed by this pull request
@YamatoSecurity
Copy link
Collaborator

@fukusuket
対応ありがとうございます!
Intel Macで14GBのデータに対して、-P super-verboseプロファイルで比較してみました。

1.8.0:
メモリ: 20 GB
処理時間: 00:30:04.921

1.9.0-dev:
メモリ: 18.5 GB (7.5%削減)
処理時間: 00:27:24.418 (9%加速)

検知数は一緒ですが、Events with hitsとData reductionの合計が変わっています。
1.8.0:

Events with hits / Total events: 2,705,252 / 14,213,054 (Data reduction: 11,507,802 events (80.97%))

Total | Unique detections: 3,925,540 | 749
Total | Unique critical detections: 104 (0.00%) | 21 (2.80%)
Total | Unique high detections: 451,919 (11.51%) | 303 (40.45%)
Total | Unique medium detections: 185,392 (4.72%) | 249 (33.24%)
Total | Unique low detections: 1,201,680 (30.61%) | 105 (14.02%)
Total | Unique informational detections: 2,086,445 (53.15%) | 71 (9.48%)

1.9.0-dev:

Events with hits / Total events: 2,706,807 / 14,213,054 (Data reduction: 11,506,247 events (80.96%))

Total | Unique detections: 3,925,540 | 749
Total | Unique critical detections: 104 (0.00%) | 21 (2.80%)
Total | Unique high detections: 451,919 (11.51%) | 303 (40.45%)
Total | Unique medium detections: 185,392 (4.72%) | 249 (33.24%)
Total | Unique low detections: 1,201,680 (30.61%) | 105 (14.02%)
Total | Unique informational detections: 2,086,445 (53.15%) | 71 (9.48%)

確認して頂けますか?

@YamatoSecurity
Copy link
Collaborator

YamatoSecurity commented Nov 14, 2022

-P minimalプロファイルでも比較してみました。

1.8.0:
メモリ: 9.26 GB
処理時間: 00:27:20.853

1.9.0-dev:
メモリ: 8 GB (約15%削減)
処理時間: 00:26:57.496 (約2%加速)

minimalプロファイルの方がメモリ使用が削減されますが、処理時間はそんな変わらないようですね。

参考に1.4.3も同じルールで試しました:
メモリ: 100%時に5.3GB、スキャン後に6.75GBに増えた (37%削減)
処理時間: 00:33:47.368 (25%減速)

Total events: 10012719
Data reduction: 7306619 events (72.97%)

Total detections: 3929895
Total critical detections: 104
Total high detections: 452326
Total medium detections: 189474
Total low detections: 1201564
Total informational detections: 2086427

Unique detections: 751
Unique critical detections: 21
Unique high detections: 305
Unique medium detections: 250
Unique low detections: 104
Unique informational detections: 71

@fukusuket
Copy link
Collaborator Author

@YamatoSecurity

検証いただき、ありがとうございます🙇 Events with hitsのリグレッションすみません、修正いたします!
現在の修正は、%Details%のメモリ削減を見込んでいるのですが(そのため、minimalのほうが効果が大きく見えると思われます)、%AllFieldInfo%も同じ手法で削減できないか確認いたします!

@fukusuket
Copy link
Collaborator Author

fukusuket commented Nov 15, 2022

Events with hits件数不具合修正と、%AllFieldInfo%改善版作成いたしました!
%AllFieldInfo%でも%Details%と同様に途中でStringをクリアする処理を入れたところ、修正前後の結果は以下の通りでした。

version profile avg mem max mem elasped time
v180 default 3.13GB 5.64GB 00:13:22.948
v190-dev 03411da (%AllFieldInfo%改善後) default 2.92GB 4.85GB 00:13:07.053
v190-dev 3dc9e80 (%AllFieldInfo%改善前) super-verbose 4.91GB 9.88GB 00:13:08.920
v190-dev 03411da (%AllFieldInfo%改善後) super-verbose 4.33GB 8.66GB 00:13:08.769

また%Details%の処理ももう少し改善(不要なvec/hashmapアクセスの削除)を加えたところ、少し速度改善したようです :)
手元で機能面リグレッションないか、もう少しテストいたします!

@YamatoSecurity
Copy link
Collaborator

@fukusuket あれ、Githubチェックは無事のようですが、cargoをアップデートしたら、ローカル環境で以下の警告が出ていました:

warning: use of deprecated associated function `chrono::TimeZone::timestamp`: use `timestamp_opt()` instead
   --> src/afterfact.rs:955:32
    |
955 |         let offset_sec = Local.timestamp(0, 0).offset().local_minus_utc();
    |                                ^^^^^^^^^
    |
    = note: `#[warn(deprecated)]` on by default

warning: use of deprecated associated function `chrono::TimeZone::ymd`: use `with_ymd_and_hms()` instead
   --> src/detections/detection.rs:258:32
    |
258 |         let default_time = Utc.ymd(1970, 1, 1).and_hms(0, 0, 0);
    |                                ^^^

warning: use of deprecated associated function `chrono::Date::<Tz>::and_hms`: Use and_hms_opt() instead
   --> src/detections/detection.rs:258:48
    |
258 |         let default_time = Utc.ymd(1970, 1, 1).and_hms(0, 0, 0);
    |                                                ^^^^^^^

warning: use of deprecated associated function `chrono::TimeZone::ymd`: use `with_ymd_and_hms()` instead
  --> src/detections/rule/count.rs:31:28
   |
31 |     let default_time = Utc.ymd(1977, 1, 1).and_hms(0, 0, 0);
   |                            ^^^

warning: use of deprecated associated function `chrono::Date::<Tz>::and_hms`: Use and_hms_opt() instead
  --> src/detections/rule/count.rs:31:44
   |
31 |     let default_time = Utc.ymd(1977, 1, 1).and_hms(0, 0, 0);
   |                                            ^^^^^^^

warning: use of deprecated associated function `chrono::TimeZone::timestamp`: use `timestamp_opt()` instead
   --> src/options/update.rs:223:44
    |
223 |         let mut latest_update_date = Local.timestamp(0, 0);
    |                                            ^^^^^^^^^

warning: `hayabusa` (lib) generated 6 warnings

念の為、確認をお願いできますか?

@YamatoSecurity
Copy link
Collaborator

-P super-verboseでもう一回試したら、20GB -> 18.5 -> 更に 16GBまで減りました!
とても良い感じです!

Events数も直りました。ありがとうございました!

Copy link
Collaborator

@YamatoSecurity YamatoSecurity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
@hitenkoku どうですか?

@fukusuket
Copy link
Collaborator Author

fukusuket commented Nov 15, 2022

@YamatoSecurity
パフォーマンスさっそく見ていただきありがとうございます🙇

warning: use of deprecated associated function `chrono::TimeZone::timestamp`

Chrono 0.4.23リリース(3days ago)で現在使っているいくつかの関数がdeprecatedになったことが原因のようです :(
別issueでChronoのdeprecate関数の削除を対応するのがよさそうかと思いました!
https://docs.rs/chrono/0.4.23/chrono/offset/trait.TimeZone.html#method.ymd
https://github.com/chronotope/chrono/releases/tag/v0.4.23

@YamatoSecurity
Copy link
Collaborator

なるほど、良いと思います。では、とりあえずChronoを0.4.22に固定しました。

Chrono 0.4.23リリース(3days ago)で現在使っているいくつかの関数がdeprecatedになったことが原因のようです :( 別issueでChronoのdeprecate関数の削除を対応するのがよさそうかと思いました! https://docs.rs/chrono/0.4.23/chrono/offset/trait.TimeZone.html#method.ymd https://github.com/chronotope/chrono/releases/tag/v0.4.23

@fukusuket
Copy link
Collaborator Author

すみません...🙇 1点まだ不具合があり、%AllFieldInfo%に値ないときは、- (ハイフン)を出力するのがもともとの仕様だったため、そちら修正いたします!

@fukusuket
Copy link
Collaborator Author

fukusuket commented Nov 15, 2022

%AllFieldInfo%に値ないときは、- (ハイフン)を出力する

↑の修正および
v180と比較して、-j, -J, -P minimal,-P (standard),-P all-field-info オプションの出力結果に不要な差分がないことを確認しました。手元でのテストは完了です。

@YamatoSecurity @hitenkoku d5636ca にてレビュー頂けますと幸いです、よろしくお願いいたします🙇

@hitenkoku
Copy link
Collaborator

この件は既にmainにマージされている 7266abf で対応済みです。

@fukusuket あれ、Githubチェックは無事のようですが、cargoをアップデートしたら、ローカル環境で以下の警告が出ていました:

warning: use of deprecated associated function `chrono::TimeZone::timestamp`: use `timestamp_opt()` instead
   --> src/afterfact.rs:955:32
    |
955 |         let offset_sec = Local.timestamp(0, 0).offset().local_minus_utc();
    |                                ^^^^^^^^^
    |
    = note: `#[warn(deprecated)]` on by default

warning: use of deprecated associated function `chrono::TimeZone::ymd`: use `with_ymd_and_hms()` instead
   --> src/detections/detection.rs:258:32
    |
258 |         let default_time = Utc.ymd(1970, 1, 1).and_hms(0, 0, 0);
    |                                ^^^

warning: use of deprecated associated function `chrono::Date::<Tz>::and_hms`: Use and_hms_opt() instead
   --> src/detections/detection.rs:258:48
    |
258 |         let default_time = Utc.ymd(1970, 1, 1).and_hms(0, 0, 0);
    |                                                ^^^^^^^

warning: use of deprecated associated function `chrono::TimeZone::ymd`: use `with_ymd_and_hms()` instead
  --> src/detections/rule/count.rs:31:28
   |
31 |     let default_time = Utc.ymd(1977, 1, 1).and_hms(0, 0, 0);
   |                            ^^^

warning: use of deprecated associated function `chrono::Date::<Tz>::and_hms`: Use and_hms_opt() instead
  --> src/detections/rule/count.rs:31:44
   |
31 |     let default_time = Utc.ymd(1977, 1, 1).and_hms(0, 0, 0);
   |                                            ^^^^^^^

warning: use of deprecated associated function `chrono::TimeZone::timestamp`: use `timestamp_opt()` instead
   --> src/options/update.rs:223:44
    |
223 |         let mut latest_update_date = Local.timestamp(0, 0);
    |                                            ^^^^^^^^^

warning: `hayabusa` (lib) generated 6 warnings

念の為、確認をお願いできますか?

Copy link
Collaborator

@hitenkoku hitenkoku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

確認しました。chronoのlintエラーに関連したバージョン固定は不要ですのでそちらを戻していただければ他は問題ありません

Cargo.toml Outdated Show resolved Hide resolved
Copy link
Collaborator

@hitenkoku hitenkoku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM。ご対応ありがとうございます。

@hitenkoku hitenkoku merged commit bbbe65f into main Nov 16, 2022
@hitenkoku hitenkoku deleted the 789-reduct-detectinfo-memory branch November 16, 2022 00:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory usage has almost doubled after introducing custom profiles
3 participants