[GitLab] Add support for version 18.9.2#17870

Merged
moxarth-rathod merged 8 commits into elastic:main from moxarth-rathod:gitlab-fix
Apr 7, 2026

@moxarth-rathod
Contributor

Proposed commit message

gitlab: add support for version 18.9.2

Add support for the new fields in the GitLab integration. Add ignore_missing
to pipeline processors for better null handling.

Tested on gitlab 18.9.2 CE.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

How to test this PR locally

  • Clone the integrations repo.
  • Install elastic-package locally.
  • Start the Elastic Stack using elastic-package.
  • Move to the integrations/packages/gitlab directory.
  • Run the following command to run the tests.

elastic-package test

@moxarth-rathod moxarth-rathod self-assigned this Mar 18, 2026
@moxarth-rathod moxarth-rathod requested a review from a team as a code owner March 18, 2026 06:29
@moxarth-rathod moxarth-rathod added enhancement New feature or request Team:Security-Service Integrations Security Service Integrations team [elastic/security-service-integrations] Integration:gitlab GitLab Team:SDE-Crest Crest developers on the Security Integrations team [elastic/sit-crest-contractors] labels Mar 18, 2026
@elasticmachine

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

@github-actions
Contributor

✅ Vale Linting Results

No issues found on modified lines!


The Vale linter checks documentation changes against the Elastic Docs style guide.

To use Vale locally or report issues, refer to Elastic style guide for Vale.

@elastic-vault-github-plugin-prod

elastic-vault-github-plugin-prod bot commented Mar 18, 2026

🚀 Benchmarks report

Package gitlab 👍(1) 💚(0) 💔(6)

Data stream    Previous EPS    New EPS    Diff (%)               Result
api            3058.1          2032.52    -1025.58 (-33.54%)     💔
application    2732.24         1776.2     -956.04 (-34.99%)      💔
audit          8064.52         5847.95    -2216.57 (-27.49%)     💔
auth           8000            3558.72    -4441.28 (-55.52%)     💔
pages          16393.44        8771.93    -7621.51 (-46.49%)     💔
sidekiq        12658.23        2008.03    -10650.2 (-84.14%)     💔

To see the full report, comment with /test benchmark fullreport

@andrewkroh andrewkroh added the documentation Improvements or additions to documentation. Applied to PRs that modify *.md files. label Mar 18, 2026
Contributor

@ShourieG ShourieG left a comment

🤖 AI-Generated Review | Elastic Integration PR Review Bot

⚠️ This is an automated review generated by an AI assistant. Please verify all suggestions before applying changes. This review does not represent a human reviewer's opinion.


PR Review | elastic/integrations #17870

Field Mapping

Data Stream: api (package: gitlab)

File: packages/gitlab/data_stream/api/fields/fields.yml

Issue 1: time field stored as keyword instead of date
Severity: 🟠 High
Location: packages/gitlab/data_stream/api/fields/fields.yml line 163

Problem: The field gitlab.api.time is typed as keyword. A field named time almost certainly represents a timestamp and should use date type for proper time-series indexing and range queries.
Recommendation:

- name: time
  type: date
  description: Timestamp of the API request.

Issue 2: cpu_s field typed as long instead of float/double
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/api/fields/fields.yml line 91

Problem: gitlab.api.cpu_s uses type: long, but the _s suffix convention throughout this file denotes seconds (a duration), which is typically fractional. All other *_s fields use float or double. Using long truncates sub-second CPU time.
Recommendation:

- name: cpu_s
  type: float
  description: CPU time consumed by the request, in seconds.

Issue 3: Ambiguous short field names without descriptions (db, duration, view, etc.)
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/api/fields/fields.yml line 201

Problem: Newly added fields db, duration, gitaly_duration, queue_duration, and view are very short, ambiguous names with no descriptions. The file already contains db_duration_s, duration_s, gitaly_duration_s, queue_duration_s, and view_duration_s — making it unclear what these new fields represent or how they differ.
Recommendation:

- name: db
  type: double
  description: Total database time for the request (legacy field, in milliseconds).
- name: duration
  type: float
  description: Total request duration (legacy field, in milliseconds).

Issue 4: user_id typed as keyword, conflicts with meta.user_id typed as long
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/api/fields/fields.yml line 357

Problem: Newly added gitlab.api.user_id is typed as keyword, while the existing meta.user_id (line 331) is typed as long. These likely represent the same concept (a numeric user identifier) but with conflicting types.
Recommendation:

- name: user_id
  type: long
  description: Numeric ID of the user making the API request.

Issue 5: Flat dotted field names instead of nested groups (meta.* fields)
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/api/fields/fields.yml line 39

Problem: Fields like meta.feature_category, meta.client_id, meta.caller_id, meta.remote_ip, meta.user, meta.gl_user_id, meta.organization_id, meta.project, meta.root_namespace, and meta.user_id use flat dotted names instead of a proper nested type: group hierarchy. This violates the field naming convention.
Recommendation:

- name: meta
  type: group
  fields:
    - name: feature_category
      type: keyword
    - name: client_id
      type: keyword
    - name: remote_ip
      type: ip
    - name: user_id
      type: long

Issue 6: Missing descriptions on all custom fields
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/api/fields/fields.yml line 1

Problem: No custom field in this file has a description property. Every custom field definition requires a meaningful description explaining what the field contains.
Recommendation:

- name: db_count
  type: long
  description: Total number of database calls made during the request.
- name: correlation_id
  type: keyword
  description: Unique identifier used to correlate log entries for a single request.
- name: route
  type: keyword
  description: The matched route pattern for the API request.

Issue 7: rate_limiting_gates defined as empty group
Severity: 🔵 Low
Location: packages/gitlab/data_stream/api/fields/fields.yml line 284

Problem: rate_limiting_gates is defined as type: group with fields: [] (empty). An empty group definition is non-functional and likely indicates missing subfield definitions or a structural error.
Recommendation:

# Option A: dynamic structure
- name: rate_limiting_gates
  type: flattened
  description: Rate limiting gate information for the request.

# Option B: remove if unused

💡 Suggestions

  1. Standardize duration field types: new *_max_duration_s fields use double while existing *_duration_s fields use float. Consider standardizing all duration fields to double.
  2. gitaly_duration_s (line 169–170) uses double while existing gitaly_duration (line 207–208) uses float. If both represent the same metric in different units, document this clearly.
  3. params.secret_token and params.private_token (lines 228–263) store sensitive credential values. Consider documenting that these should be redacted/scrubbed in the pipeline before indexing.
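
Suggestion 3 could be sketched in the ingest pipeline roughly as follows. This is a minimal sketch, not the change made in this PR: the field paths gitlab.api.params.secret_token and gitlab.api.params.private_token are assumed from the fields.yml names mentioned above, and the tag is hypothetical.

```yaml
# Sketch only: field paths and tag are assumptions, not taken from the PR.
- remove:
    field:
      - gitlab.api.params.secret_token
      - gitlab.api.params.private_token
    tag: remove_api_sensitive_params   # hypothetical tag
    ignore_missing: true               # do nothing when the fields are absent
```

If event.original is preserved, the same values would still be present in the raw log line, so removing the parsed fields alone may not be sufficient.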

Data Stream: application (package: gitlab)

File: packages/gitlab/data_stream/application/fields/fields.yml

Issue 1: namespace_id typed as keyword — should be long
Severity: 🟠 High
Location: packages/gitlab/data_stream/application/fields/fields.yml line 56

Problem: namespace_id is typed as keyword. GitLab's data model uses auto-increment integer primary keys for all entity IDs. The sibling field meta.user_id in the same file is already typed as long, making this an internal inconsistency. Storing as keyword prevents numeric range queries.
Recommendation:

- name: namespace_id
  type: long
  description: GitLab internal integer ID of the namespace (group or personal namespace).

Issue 2: user_id typed as keyword — should be long
Severity: 🟠 High
Location: packages/gitlab/data_stream/application/fields/fields.yml line 80

Problem: user_id is typed as keyword, inconsistent with meta.user_id (typed as long) in the same file. GitLab's Rails schema defines these as bigint columns.
Recommendation:

- name: user_id
  type: long
  description: GitLab internal integer ID of the user associated with the event.

Issue 3: Nested duration_s fields must use double, not long
Severity: 🟠 High
Location: packages/gitlab/data_stream/application/fields/fields.yml line 119

Problem: GitLab emits all duration_s values as floating-point seconds (e.g., 0.038, 1.204). Any nested duration_s field typed as long would silently truncate sub-second precision. This is a data-loss bug.
Recommendation:

- name: duration_s
  type: double
  description: |
    Duration of the operation in seconds. GitLab emits this as a
    floating-point value; sub-second precision is expected.

Issue 4: All custom fields missing descriptions
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/application/fields/fields.yml line 7

Problem: All custom fields — action, duration_s, exception.backtrace, exception.class, exception.message, path, path_traversal_check_duration_s, pid, correlation_id, meta.organization_id, params.user.login, params.user.password, ua — are missing descriptions. This is confirmed pervasive across all fields.
Recommendation:

- name: action
  type: keyword
  description: The Rails controller action that handled the request (e.g., "index", "show").
- name: pid
  type: long
  description: OS process ID of the GitLab Puma worker that handled the request.
- name: correlation_id
  type: keyword
  description: Unique identifier used to correlate log entries across GitLab services for a single request.

Data Stream: pages (package: gitlab)

File: packages/gitlab/data_stream/pages/fields/fields.yml

Issue 1: remote_addr typed as keyword instead of ip
Severity: 🟠 High
Location: packages/gitlab/data_stream/pages/fields/fields.yml line 49

Problem: gitlab.pages.remote_addr is typed as keyword, but the name strongly implies it stores an IP address. The sibling field remote_ip (line 51) already uses ip type. Storing IP addresses as keyword prevents IP-range queries and CIDR matching.
Recommendation:

- name: remote_addr
  type: ip
  description: Remote IP address of the client connection.

Issue 2: listen_addr.Zone uses PascalCase instead of lowercase
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/pages/fields/fields.yml line 32

Problem: The field gitlab.pages.listen_addr.Zone uses PascalCase (Zone) instead of the required lowercase snake_case naming convention. All field names must be lowercase.
Recommendation:

- name: zone
  type: keyword
  description: Network zone for the listen address.

Issue 3: All gitlab.pages custom fields missing descriptions
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/pages/fields/fields.yml line 7

Problem: All custom fields under gitlab.pages are missing the required description property.
Recommendation:

- name: config_addr
  type: keyword
  description: Address used for the GitLab Pages configuration listener.
- name: duration_ms
  type: long
  description: Duration of the request in milliseconds.
- name: remote_addr
  type: ip
  description: Remote IP address of the client connection.

💡 Suggestions

  1. gitlab.pages.msg (line 38–39) is typed as keyword. If this stores free-form log messages, consider using text with a keyword multi-field for aggregation.
  2. gitlab.pages.host (line 19–20) is typed as keyword. If this can contain IP addresses, consider using ip type or documenting that it stores hostnames only.
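
Suggestion 1 might look like this in fields.yml. A minimal sketch, assuming gitlab.pages.msg really does hold free-form log text; the description wording is illustrative only.

```yaml
# Sketch only: assumes msg is free-form text rather than a fixed set of values.
- name: msg
  type: text                  # full-text search on the message body
  description: Log message emitted by GitLab Pages.   # illustrative wording
  multi_fields:
    - name: keyword           # exposed as gitlab.pages.msg.keyword
      type: keyword           # exact-match filtering and aggregations
```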

Data Stream: production (package: gitlab)

File: packages/gitlab/data_stream/production/fields/fields.yml

Issue 1: path_traversal_check_duration_s uses long instead of float/double
Severity: 🟠 High
Location: packages/gitlab/data_stream/production/fields/fields.yml line 248

Problem: path_traversal_check_duration_s uses type long, but the _s suffix indicates a duration in seconds with sub-second decimal precision. Every other *_duration_s field in this file uses float or double. Using long will silently truncate fractional seconds.
Recommendation:

- name: path_traversal_check_duration_s
  type: float
  description: Time spent on path traversal checks, in seconds.

Issue 2: Flat dotted field names instead of nested groups for meta.* fields
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/production/fields/fields.yml line 177

Problem: Fields like meta.user, meta.feature_category, meta.client_id, meta.organization_id, meta.caller_id, meta.remote_ip, meta.user_id, and meta.search.page use flat dotted names instead of a proper nested meta group structure. The newly added meta.organization_id (line 189) perpetuates this pattern.
Recommendation:

- name: meta
  type: group
  fields:
    - name: user
      type: keyword
    - name: feature_category
      type: keyword
    - name: organization_id
      type: keyword
    - name: remote_ip
      type: ip
    - name: user_id
      type: long

Issue 3: Missing descriptions on newly added custom fields
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/production/fields/fields.yml line 7

Problem: The vast majority of custom fields — including all newly added ones (action, correlation_id, meta.organization_id, ua, path, exception.*, params.user.*, db_*_write_count, db_*_txn_max_duration_s) — are missing description properties.
Recommendation:

- name: action
  type: keyword
  description: The controller action that handled the request.
- name: correlation_id
  type: keyword
  description: Unique identifier used to correlate log entries for a single request.
- name: ua
  type: keyword
  description: The User-Agent string of the client making the request.
- name: path
  type: keyword
  description: The URL path of the request.

Issue 4: Ambiguous time field type without description
Severity: 🔵 Low
Location: packages/gitlab/data_stream/production/fields/fields.yml line 163

Problem: The time field uses type keyword. If this field stores a timestamp or time value, it should use date type. Without a description, the intent is ambiguous.
Recommendation:

- name: time
  type: keyword
  description: Formatted time string (e.g., response time label). Not a timestamp.

💡 Suggestions

  1. exception.backtrace (line 167–168) stores stack trace data which can be very long. Consider adding ignore_above: 8192 or using text type.
  2. params.user.password (line 243–244) and params.new_user.password (line 228–229) store password values. Ensure these are redacted/masked at the pipeline level before indexing.
  3. The ua field (line 297–298) is a user agent string. Consider using user_agent.original (ECS) instead.
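
For suggestion 3, one possible shape is a rename in the ingest pipeline. This is a sketch under assumptions: the source path gitlab.production.ua is inferred from the data stream and field names above, and the tag is hypothetical.

```yaml
# Sketch only: source field path and tag are assumptions.
- rename:
    field: gitlab.production.ua
    target_field: user_agent.original   # ECS field for the raw User-Agent string
    tag: rename_ua_to_user_agent_original   # hypothetical tag
    ignore_missing: true
```

With this approach the custom ua definition in fields.yml would no longer be needed, since the value lives in the ECS field instead.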

Data Stream: sidekiq (package: gitlab)

File: packages/gitlab/data_stream/sidekiq/fields/fields.yml

Issue 1: redis_cache_duration_s typed as keyword instead of double
Severity: 🔴 Critical
Location: packages/gitlab/data_stream/sidekiq/fields/fields.yml line 373

Problem: redis_cache_duration_s is typed as keyword but it is a duration in seconds. Every other *_duration_s field in this file is typed as double. This is almost certainly a copy-paste error and will cause indexing failures or silent type coercion.
Recommendation:

- name: redis_cache_duration_s
  type: double   # was: keyword — inconsistent with all other *_duration_s fields

Issue 2: exclusive_lock_wait_duration_s typed as long instead of double
Severity: 🟠 High
Location: packages/gitlab/data_stream/sidekiq/fields/fields.yml line 172

Problem: exclusive_lock_wait_duration_s is typed as long, but the _s suffix indicates a fractional-seconds duration. All other *_duration_s fields use double. The adjacent exclusive_lock_hold_duration_s (line 168) is correctly typed as double.
Recommendation:

- name: exclusive_lock_wait_duration_s
  type: double   # was: long — inconsistent with all other *_duration_s fields

Issue 3: Field name contains hyphen: sentry-trace
Severity: 🟠 High
Location: packages/gitlab/data_stream/sidekiq/fields/fields.yml line 472

Problem: The field name sentry-trace contains a hyphen (-), which violates Elasticsearch field naming conventions and the snake_case requirement. Hyphens in field names can cause query parsing issues and mapping conflicts.
Recommendation:

- name: sentry_trace   # was: sentry-trace
  type: keyword

Issue 4: Nearly all custom fields are missing descriptions
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/sidekiq/fields/fields.yml line 1

Problem: Only redis_calls has a description. All other custom fields — including the many db_* metrics, exclusive_lock_*, size_limiter_* — are missing descriptions, significantly reducing usability.
Recommendation:
Add description properties to all custom fields, prioritizing the most important and non-obvious ones.


💡 Suggestions

  1. exception.message (line 165): Consider using type: text with a keyword multi-field since exception messages are free-form text that benefits from full-text search.
  2. exception.backtrace (line 162): Consider text or wildcard type rather than keyword to avoid truncation of long stack traces.
  3. db.duration_m vs db.duration_s (lines 24–27): Having both minutes and seconds variants is unusual. Confirm these are genuinely separate source fields and not a duplication error.

Pipeline

Data Stream: api (package: gitlab)

File: packages/gitlab/data_stream/api/elasticsearch/ingest_pipeline/default.yml

Issue 1: foreach processor incorrectly references _ingest._value.name for IP conversion
Severity: 🟠 High
Location: packages/gitlab/data_stream/api/elasticsearch/ingest_pipeline/default.yml line 146

Problem: The foreach processor iterates over source.ip (a flat string array). The inner convert processor incorrectly references _ingest._value.name (treating each element as an object with a .name subfield). Since source.ip contains plain IP strings, _ingest._value IS the IP string — not an object. The on_failure block also removes _ingest._value.ip which doesn't exist. This entire block is broken and will never produce correct output.
Recommendation:

- foreach:
    field: source.ip
    if: ctx.source?.ip instanceof List
    processor:
      convert:
        field: _ingest._value
        type: ip
        ignore_missing: true

Issue 2: set processors copy from fields already consumed by earlier rename processors (host, method, path, pid)
Severity: 🟠 High
Location: packages/gitlab/data_stream/api/elasticsearch/ingest_pipeline/default.yml line 456

Problem: The set processors at lines 456–502 use copy_from to copy gitlab.api.host, gitlab.api.method, gitlab.api.path, and gitlab.api.pid — but these fields were already consumed by rename processors at lines 83–98 and 167–170. After a rename, the source field no longer exists, so these set processors silently do nothing. host.name, http.request.method, url.path, and process.pid will never be populated by these processors.
Recommendation:

# REMOVE these dead-code set processors (lines 456-502):
# - set: field: host.name, copy_from: gitlab.api.host  (already renamed)
# - set: field: http.request.method, copy_from: gitlab.api.method  (already renamed)
# - set: field: url.path, copy_from: gitlab.api.path  (already renamed)
# - set: field: process.pid, copy_from: gitlab.api.pid  (already renamed)

Issue 3: set processors for status, user_id, username are dead code — source fields already consumed by rename
Severity: 🟠 High
Location: packages/gitlab/data_stream/api/elasticsearch/ingest_pipeline/default.yml line 772

Problem: The set processors at lines 772–792 copy from gitlab.api.status, gitlab.api.user_id, and gitlab.api.username — all of which were already consumed by rename processors at lines 83–86, 171–174, and 179–182 respectively. These set processors are dead code.
Recommendation:

# REMOVE these dead-code set processors:
# - set: field: http.response.status_code, copy_from: gitlab.api.status  (already renamed at line 83)
# - set: field: user.id, copy_from: gitlab.api.user_id  (already renamed at line 171)
# - set: field: user.name, copy_from: gitlab.api.username  (already renamed at line 179)

Issue 4: event.kind set via append (creates array) and duplicates global on_failure logic
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/api/elasticsearch/ingest_pipeline/default.yml line 229

Problem: event.kind is set via append (creating an array) when error.message != null. event.kind is a scalar ECS keyword field — using append creates an array which is incorrect. Additionally, this duplicates the global on_failure handler's responsibility.
Recommendation:

- set:
    field: event.kind
    value: pipeline_error
    if: ctx.error?.message != null

Issue 5: date processor uses ignore_failure instead of on_failure with error reporting
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/api/elasticsearch/ingest_pipeline/default.yml line 29

Problem: The date processor uses ignore_failure: true which silently swallows timestamp parse failures. If gitlab.api.time contains an unparseable value, @timestamp falls back to ingest time with no indication in the document. The processor also lacks a tag.
Recommendation:

- date:
    field: gitlab.api.time
    target_field: '@timestamp'
    tag: date_parse_api_time
    formats:
      - ISO8601
    on_failure:
      - append:
          field: error.message
          value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} failed with message: {{{_ingest.on_failure_message}}}'

Issue 6: Inconsistent on_failure error message format in older convert processors
Severity: 🔵 Low
Location: packages/gitlab/data_stream/api/elasticsearch/ingest_pipeline/default.yml line 57

Problem: The on_failure error messages in older convert processors (lines 46–69, 119) use a truncated format omitting processor type, tag, and pipeline context. Newly added processors use the full recommended format.
Recommendation:

value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.on_failure_pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'

Issue 7: Global on_failure handler sets event.kind after error.message append
Severity: 🔵 Low
Location: packages/gitlab/data_stream/api/elasticsearch/ingest_pipeline/default.yml line 799

Problem: The global on_failure handler appends to error.message before setting event.kind: pipeline_error. The recommended order is to set event.kind first.
Recommendation:

on_failure:
  - set:
      field: event.kind
      value: pipeline_error
  - append:
      field: error.message
      value: 'Processor {{{_ingest.on_failure_processor_type}}} ...'
  - append:
      field: tags
      value: preserve_original_event
      allow_duplicates: false

💡 Suggestions

  1. Add a terminate processor after ecs.version to short-circuit agent error documents: if: ctx.error?.message != null.
  2. related.ip is never populated despite extracting source.ip and gitlab.api.meta.remote_ip. Consider adding append processors for correlation.
  3. http.response.status_code is renamed from gitlab.api.status but never converted to long. ECS defines this as long.
  4. process.pid should be long per ECS. No convert is present after the rename.
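
Suggestions 3 and 4 could be sketched as two convert processors placed after the corresponding renames. The tags are hypothetical, and the on_failure layout simply follows the error message format recommended in Issue 6 above.

```yaml
# Sketch only: tags are hypothetical; place after the renames that create these fields.
- convert:
    field: http.response.status_code
    type: long
    tag: convert_http_response_status_code_to_long
    ignore_missing: true
    on_failure:
      - remove:                # drop the unconvertible value so the document still maps
          field: http.response.status_code
          ignore_missing: true
      - append:
          field: error.message
          value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.on_failure_pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
- convert:
    field: process.pid
    type: long
    tag: convert_process_pid_to_long
    ignore_missing: true
    on_failure:
      - remove:
          field: process.pid
          ignore_missing: true
      - append:
          field: error.message
          value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.on_failure_pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
```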

Data Stream: application (package: gitlab)

File: packages/gitlab/data_stream/application/elasticsearch/ingest_pipeline/default.yml

Issue 1: append used for event.kind: pipeline_error — creates invalid array
Severity: 🟠 High
Location: packages/gitlab/data_stream/application/elasticsearch/ingest_pipeline/default.yml line 292

Problem: event.kind is already set to the scalar string "event". Using append to add pipeline_error will produce an array ["event", "pipeline_error"], which is invalid — event.kind is a single-value keyword field.
Recommendation:

- set:
    field: event.kind
    value: pipeline_error
    if: ctx.error?.message != null

Issue 2: append to related.user for user_id missing null guard
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/application/elasticsearch/ingest_pipeline/default.yml line 169

Problem: The append to related.user for gitlab.application.user_id has no if guard. If the field is absent, an empty string or null representation will be appended to related.user, polluting the correlation array.
Recommendation:

- append:
    field: related.user
    tag: append_application_user_id_into_related_user
    value: '{{{gitlab.application.user_id}}}'
    allow_duplicates: false
    if: ctx.gitlab?.application?.user_id != null

Issue 3: on_failure error message for remote_ip convert missing full Mustache context
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/application/elasticsearch/ingest_pipeline/default.yml line 204

Problem: The on_failure block for the convert processor at line 194 appends only {{{_ingest.on_failure_message}}} to error.message, omitting processor type, tag, and pipeline context.
Recommendation:

on_failure:
  - remove:
      field: gitlab.application.meta.remote_ip
      ignore_missing: true
  - append:
      field: error.message
      value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag
        {{{_ingest.on_failure_processor_tag}}} in pipeline
        {{{_ingest.on_failure_pipeline}}} failed with message:
        {{{_ingest.on_failure_message}}}'

Issue 4: drop processor if condition lacks null guard on event.original
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/application/elasticsearch/ingest_pipeline/default.yml line 22

Problem: The drop processor calls ctx.event.original.startsWith('#') without a safe-navigation operator. If event.original is null, this will throw a NullPointerException and fail the pipeline unnecessarily.
Recommendation:

- drop:
    if: ctx.event?.original != null && ctx.event.original.startsWith('#')
    description: Drop if logline contains header(s), which startswith `#`.

Issue 5: Grok patterns not anchored with ^
Severity: 🟡 Medium
Location: packages/gitlab/data_stream/application/elasticsearch/ingest_pipeline/default.yml line 174

Problem: The grok patterns are not anchored with ^. Unanchored grok patterns cause the regex engine to scan the entire input string for a match, which can significantly degrade performance on long messages.
Recommendation:

patterns:
  - '^Failed%{SPACE}Login:%{SPACE}username=%{USERNAME:user.name}%{SPACE}ip=%{IP:source.ip}'
  - '^%{USERNAME:user.name}%{SPACE}created%{SPACE}a%{SPACE}new%{SPACE}project%{SPACE}"%{GREEDYDATA:gitlab.application.project_name}"'

Issue 6: convert for user_admin to boolean missing tag and on_failure
Severity: 🔵 Low
Location: packages/gitlab/data_stream/application/elasticsearch/ingest_pipeline/default.yml line 190

Problem: The convert processor for gitlab.application.user_admin (to boolean) has no on_failure handler. If the field contains an unexpected value, the pipeline will silently fail to the global handler without a targeted error message.
Recommendation:

- convert:
    field: gitlab.application.user_admin
    type: boolean
    tag: convert_user_admin_to_boolean
    ignore_missing: true
    on_failure:
      - append:
          field: error.message
          value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag
            {{{_ingest.on_failure_processor_tag}}} failed: {{{_ingest.on_failure_message}}}'

💡 Suggestions

  1. Add a terminate processor after ecs.version to short-circuit agent error documents.
  2. Consider appending iam to event.category for user creation/removal and group creation/removal events.
  3. For authentication events, event.type values start (login) and end (logout) are not set alongside info.
  4. source.ip extracted by grok is not appended to related.ip. Consider adding it.
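
Suggestion 4 could be sketched as a guarded append placed after the grok processor. The tag is hypothetical, and the sketch assumes source.ip holds a single IP string at this point in the pipeline.

```yaml
# Sketch only: assumes source.ip is a single string set by the grok patterns.
- append:
    field: related.ip
    tag: append_source_ip_into_related_ip   # hypothetical tag
    value: '{{{source.ip}}}'
    allow_duplicates: false
    if: ctx.source?.ip != null              # skip when grok did not extract an IP
```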

Data Stream: audit (package: gitlab)

File: packages/gitlab/data_stream/audit/elasticsearch/ingest_pipeline/default.yml

Issue 1: Typo in if condition causes related.user append to never fire
Severity: 🟠 High
Location: packages/gitlab/data_stream/audit/elasticsearch/ingest_pipeline/default.yml line 182

Problem: Typo in the if condition: ctx.gitlab?.auadit?.entity_id — note the extra a in auadit. This condition will always evaluate to null/false, so gitlab.audit.entity_id will never be appended to related.user for User entity types, silently dropping correlation data.
Recommendation:
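
A minimal sketch of the corrected guard, assuming the processor is an append into related.user and that the entity type is stored in gitlab.audit.entity_type (both inferred from the problem statement, not shown in the source). The tag is hypothetical.

```yaml
# Sketch only: processor shape, entity_type field, and tag are assumptions.
- append:
    field: related.user
    tag: append_audit_entity_id_into_related_user   # hypothetical tag
    value: '{{{gitlab.audit.entity_id}}}'
    allow_duplicates: false
    # "audit" spelled correctly; the original condition read "auadit".
    if: ctx.gitlab?.audit?.entity_id != null && ctx.gitlab.audit.entity_type == 'User'
```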

@moxarth-rathod moxarth-rathod requested a review from ShourieG March 23, 2026 10:19
@moxarth-rathod
Contributor Author

@ShourieG some of the bot's comments are on code outside the scope of this PR, so I'll leave those as-is. The remaining comments have all been addressed.

@moxarth-rathod
Contributor Author

@ShourieG do we need any other changes? The required comments are already addressed.

@ShourieG
Contributor

@moxarth-rathod, can we add some of the updated logs to the system test? Besides this, there are a few nits; otherwise it looks good atm.

…ipeline/default.yml

Co-authored-by: Shourie Ganguly <shourie.ganguly@elastic.co>
tag: set_host_name_from_pages_host
copy_from: gitlab.pages.host
ignore_empty_value: true
- convert:
Contributor

If this fails, we'll end up with a document that fails to map. Same for the converts below.
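
One common way to address this concern is to remove the offending field when the conversion fails, so the document can still be indexed. A minimal sketch, using gitlab.pages.duration_ms as a stand-in because the converts under review are not shown here; the tag is hypothetical.

```yaml
# Sketch only: the field is a stand-in and the tag is hypothetical.
- convert:
    field: gitlab.pages.duration_ms
    type: long
    tag: convert_pages_duration_ms_to_long
    ignore_missing: true
    on_failure:
      - remove:                # avoid indexing a value that conflicts with the mapping
          field: gitlab.pages.duration_ms
          ignore_missing: true
      - append:
          field: error.message
          value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.on_failure_pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
```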

@moxarth-rathod moxarth-rathod requested review from ShourieG and efd6 March 30, 2026 08:17
Contributor

@ShourieG ShourieG left a comment

LGTM from my end but please wait for @efd6 to approve before merging.

@elasticmachine

💚 Build Succeeded

History

cc @moxarth-rathod

@moxarth-rathod moxarth-rathod merged commit ddb87de into elastic:main Apr 7, 2026
13 checks passed
@elastic-vault-github-plugin-prod

Package gitlab - 2.6.0 containing this change is available at https://epr.elastic.co/package/gitlab/2.6.0/
