Wrong field type for new gl2_processing_duration_ms field in existing indices #18387

Closed
bernd opened this issue Feb 26, 2024 · 6 comments
@bernd
Member

bernd commented Feb 26, 2024

The new gl2_processing_duration_ms message field has the wrong field type in existing indices.

We updated the index templates for the new gl2_processing_duration_ms field, but in existing write indices, the field will be mapped to keyword because of the dynamic template for gl2_ fields.

$ http :9200/graylog_4/_mapping
{
    "graylog_4": {
        "mappings": {
            "dynamic_templates": [
                {
                    "internal_fields": {
                        "mapping": {
                            "type": "keyword"
                        },
                        "match": "gl2_*"
                    }
                },
                {
                    "store_generic": {
                        "mapping": {
                            "type": "keyword"
                        },
                        "match_mapping_type": "string"
                    }
                }
            ],
            "properties": {
                // [...]
                "gl2_processing_duration_ms": {
                    "type": "keyword"
                },
                // [...]
            }
        }
    }
}

To fix this, we need to update the mapping of existing indices before we ingest any data that has the gl2_processing_duration_ms field.
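A minimal sketch of what such a mapping update could look like, assuming the Elasticsearch/OpenSearch put-mapping endpoint (`PUT /<index>/_mapping`) and the `long` type that a later comment in this thread names for the field. The helper `build_put_mapping_request` is hypothetical and not part of Graylog; it only describes the request and does not send anything.

```python
import json


def build_put_mapping_request(index, field, field_type):
    """Return (method, path, body) for adding an explicit field mapping.

    Hypothetical helper for illustration: nothing is sent over the network.
    """
    body = {"properties": {field: {"type": field_type}}}
    return "PUT", f"/{index}/_mapping", json.dumps(body, sort_keys=True)


# "long" is the type mentioned in this thread; the index name is from the example above.
method, path, body = build_put_mapping_request(
    "graylog_4", "gl2_processing_duration_ms", "long"
)
print(method, path)
print(body)
```

This only helps while the field is still unmapped in the index. Once the dynamic template has mapped it to keyword, the type of the existing field cannot be changed in place.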

Update

This is also an issue for the previously added fields:

                "gl2_processing_duration_ms": {
                    "type": "keyword"
                },
                "gl2_processing_timestamp": {
                    "type": "keyword"
                },
                "gl2_receive_timestamp": {
                    "type": "keyword"
                },

This is an index that was created in 2019.
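To spot affected indices, one could compare a `_mapping` response against the expected types. The sketch below is an illustration only: `find_mismatched_fields` is a hypothetical helper, `long` is the type named later in this thread, and `date` for the two timestamp fields is an assumption that is not confirmed here.

```python
def find_mismatched_fields(mapping_response, expected_types):
    """Return {index: {field: (actual, expected)}} for fields with a wrong type."""
    mismatches = {}
    for index, data in mapping_response.items():
        properties = data.get("mappings", {}).get("properties", {})
        for field, expected in expected_types.items():
            actual = properties.get(field, {}).get("type")
            # Unmapped fields are skipped: they can still get the right type.
            if actual is not None and actual != expected:
                mismatches.setdefault(index, {})[field] = (actual, expected)
    return mismatches


# Shape taken from the _mapping output shown above.
response = {
    "graylog_4": {
        "mappings": {
            "properties": {
                "gl2_processing_duration_ms": {"type": "keyword"},
                "gl2_processing_timestamp": {"type": "keyword"},
                "gl2_receive_timestamp": {"type": "keyword"},
            }
        }
    }
}

# Assumed target types: "date" for the timestamps is a guess, not from this thread.
expected = {
    "gl2_processing_duration_ms": "long",
    "gl2_processing_timestamp": "date",
    "gl2_receive_timestamp": "date",
}

mismatched = find_mismatched_fields(response, expected)
print(mismatched)
```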

Steps to Reproduce (for bugs)

  1. Update to the latest snapshot
  2. Ingest messages
  3. Check the field type of gl2_processing_duration_ms in existing indices

Your Environment

  • Graylog Version: 6.0.0-SNAPSHOT
@mpfz0r
Contributor

mpfz0r commented Mar 6, 2024

Possible Fix for the new fields (gl2_processing_duration_ms):

  1. Apply current mappings from the index templates to all (open write) indices.
    • This could fail if we change a type.
    • New fields should not be a problem. Maybe we only add the mappings for new fields.
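The "only add the mappings for new fields" variant can be sketched as a simple difference between the template properties and the properties an index already has. `new_field_mappings` is a hypothetical helper, and the sample property sets below are made up for illustration.

```python
def new_field_mappings(template_properties, index_properties):
    """Return only those template fields that the index does not map yet.

    Fields that already exist in the index are left alone, so a changed
    type never ends up in the update and cannot make it fail.
    """
    return {
        field: mapping
        for field, mapping in template_properties.items()
        if field not in index_properties
    }


# Illustrative data, not taken from a real Graylog index template.
template_properties = {
    "gl2_message_id": {"type": "keyword"},
    "gl2_processing_duration_ms": {"type": "long"},
}
index_properties = {
    "gl2_message_id": {"type": "keyword"},
}

update_body = {"properties": new_field_mappings(template_properties, index_properties)}
print(update_body)
```

The resulting body would be what gets sent to the put-mapping endpoint of each open write index.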

Possible fix for wrong mappings (gl2_message_id):

  1. On all indices:
    • If they have the wrong mapping (gl2_message_id != keyword):
      • Add a multi-field mapping: gl2_message_id.fixed
      • Also add a field alias: gl2_message_id_sortable -> gl2_message_id.fixed
    • For every other index we can skip the multi-field, but set a field alias: gl2_message_id_sortable -> gl2_message_id
  2. Add the gl2_message_id_sortable -> gl2_message_id alias to the default index template.
  3. Change the default sort order to use gl2_message_id_sortable
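The per-index part of this proposal could produce mapping bodies roughly like the ones below. This is a sketch under assumptions: `alias_mapping_update` is a hypothetical helper, `text` is only an example of a wrong type, and whether the search backend accepts a field alias that points at a multi-field sub-field has not been verified here.

```python
def alias_mapping_update(current_type):
    """Build a put-mapping body for one index.

    current_type is the type gl2_message_id currently has in that index.
    """
    if current_type == "keyword":
        # Mapping is already correct: the alias can point at the field itself.
        return {
            "properties": {
                "gl2_message_id_sortable": {"type": "alias", "path": "gl2_message_id"}
            }
        }
    # Wrong mapping: keep the existing type (it cannot be changed in place),
    # add a keyword multi-field and point the alias at that sub-field.
    return {
        "properties": {
            "gl2_message_id": {
                "type": current_type,
                "fields": {"fixed": {"type": "keyword"}},
            },
            "gl2_message_id_sortable": {
                "type": "alias",
                "path": "gl2_message_id.fixed",
            },
        }
    }


good = alias_mapping_update("keyword")
bad = alias_mapping_update("text")  # "text" is just an example of a wrong type
print(good)
print(bad)
```

With both variants in place, queries could sort on gl2_message_id_sortable without knowing which kind of index they hit.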

@mpfz0r
Contributor

mpfz0r commented Mar 18, 2024

@bernd
This index of yours must be ancient:

$ http :9200/graylog_4/_mapping
{
    "graylog_4": {
        "mappings": {
            "dynamic_templates": [
                {
                    "internal_fields": {
                        "mapping": {
                            "type": "keyword"
                        },
                        "match": "gl2_*"
                    }
                },

We've only been mapping string types to keyword for 5 years:
d574b91
So gl2_processing_duration_ms would be a long for every index created since.
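The difference between the two template generations can be illustrated with a small simulation of dynamic template resolution (first matching template wins). This is a simplified model, not the search backend's actual code. The old templates are copied from the mapping above, while the shape of the newer `internal_fields` rule (an added `match_mapping_type` of `string`) is an assumption based on the sentence above.

```python
from fnmatch import fnmatchcase


def detected_type(value):
    """Approximate the JSON type that dynamic mapping would detect."""
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    return "string"


def resolve_type(templates, field, value):
    """Return the type from the first matching dynamic template, else a default."""
    json_type = detected_type(value)
    for template in templates:
        (rule,) = template.values()  # each entry has the shape {name: rule}
        if "match" in rule and not fnmatchcase(field, rule["match"]):
            continue
        if "match_mapping_type" in rule and rule["match_mapping_type"] != json_type:
            continue
        return rule["mapping"]["type"]
    # Simplified defaults when no template applies.
    defaults = {"long": "long", "double": "float", "boolean": "boolean", "string": "text"}
    return defaults[json_type]


# Copied from the 2019 index mapping shown in this thread.
old_templates = [
    {"internal_fields": {"match": "gl2_*", "mapping": {"type": "keyword"}}},
    {"store_generic": {"match_mapping_type": "string", "mapping": {"type": "keyword"}}},
]

# Assumed shape of the newer rule: gl2_* fields become keyword only for strings.
newer_templates = [
    {"internal_fields": {"match": "gl2_*", "match_mapping_type": "string",
                         "mapping": {"type": "keyword"}}},
    {"store_generic": {"match_mapping_type": "string", "mapping": {"type": "keyword"}}},
]

old_result = resolve_type(old_templates, "gl2_processing_duration_ms", 12)
new_result = resolve_type(newer_templates, "gl2_processing_duration_ms", 12)
print(old_result, new_result)
```

In this model the old rule catches every gl2_ field regardless of its value, while the newer rule lets a numeric value fall through to the default numeric type.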

And all indices since Graylog 3.1 have the processing and receive timestamp fields:
540dad7

Given that fixing mappings on old indices is not possible, I don't see what we can improve here.

@bernd
Member Author

bernd commented Mar 18, 2024

@bernd This index of yours must be ancient:

@mpfz0r Yeah, as mentioned in my description: 😄

This is an index that was created in 2019.

@mpfz0r
Contributor

mpfz0r commented Mar 21, 2024

@bernd
But you also wrote: "but in existing write indices, the field will be mapped to keyword because of the dynamic template for gl2_ fields."
That is only true for write indices that have been open since 2019.
Is this something we would find in the real world?

@bernd
Member Author

bernd commented Mar 21, 2024

@bernd But you also wrote but in existing write indices, the field will be mapped to keyword because of the dynamic template for gl2_ fields. Which is only true for write indices that had been open since 2019. Is this something we would find in the real world?

@mpfz0r Probably only in tiny edge cases. I guess we can scrap the issue then.

@mpfz0r
Contributor

mpfz0r commented Apr 10, 2024

ok 🤝

FWIW, I did start #18673

mpfz0r closed this as completed Apr 10, 2024
mpfz0r reopened this Apr 10, 2024
mpfz0r closed this as not planned Apr 10, 2024