Skip to content

Conversation

@mrodm
Copy link
Contributor

@mrodm mrodm commented May 30, 2023

Closes #1253

This PR adds the support to check fields key from the documents retrieved from the Elasticsearch instead of _source.
For that, for each system test it is checked whether or not the synthetic mode is enabled:

  • This value is retrieved querying to the Component templates API

In case that data stream (component template) has synthetic enabled, there are some changes required:

  • validation in system tests:
    • disabled Normalization
    • update validation about dataset
      • all the elements in fields are arrays, and that's need to be taken into account when checking for exact values
  • update how sample event files are generated (so static tests keep working):
    • when synthetics source mode is enabled, this file is generated using the fields key from the documents with some modifications
      • if normalization is not set and there is just one element in the field, change the field to be just an scalar value.
      • if normalization is set, keep the the field as array

A test package has been included to test the synthetic mode:

  • sample_event.json is written using the fields key from the document
  • until 8.7.1 is not used as default version, synthetics is not going to be used

As part of this PR:

  • remove uses of errors.Wrap in the internal/fields/validate.go.
  • add a simple test for GetValue method from common.MapStr.

Considerations

As synthetic source mode is enabled (as part of the time series), _source field gets built from the documents and the fields on the document could be differnt.

For that, in case synthetic source mode is enabled, it is used the fields key from the documents retrieved from Elasticsearch instead of the _source field.

This makes that sample_event.json files must be re-generated for those data streams.

Packages (data streams) that do not use this mode are tested using the same _source field. No changes for them.

How to test

For this it can be used the apache_tomcat from the integrations repository

  1. Build elastic-package from this branch
  2. Go to the apache_tomcat package folder
  3. Update the manifest of the cache datastream to set the time_series mode. Add at the end of the manifest this definition:
    elasticsearch:
      index_mode: time_series
  4. Run the following commands:
    elastic-package stack down
    elastic-package build -v
    elastic-package stack up -v -d --version 8.7.1
    # regenerate sample_event.json files for the desired data streams
    elastic-package test system -v -g --data-streams cache,requests
    elastic-package test -v

@mrodm mrodm self-assigned this May 30, 2023
Comment on lines +271 to +274
if len(results.ComponentTemplates) == 0 {
logger.Debugf("no component template found for data stream %s", dataStream)
return false, nil
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In current elastic stack version, there is no component template for sql_input package (input) for the Elastic stack version started (8.6.2).

In that scenario, I assume that synthetic is not enabled.

@mrodm
Copy link
Contributor Author

mrodm commented May 31, 2023

/test

@mrodm mrodm requested a review from a team June 1, 2023 09:41
@mrodm
Copy link
Contributor Author

mrodm commented Jun 1, 2023

The test package fails with these errors:

synthetic_mode/synthetic Verify sample_event.json:
[0] field "event.dataset" should have value "synthetic_mode.synthetic", it has "[synthetic_mode.synthetic]"
[1] field "data_stream.dataset" should have value "synthetic_mode.synthetic", it has "[synthetic_mode.synthetic]"

Probably it needs to wait until elastic-package runs by default 8.7.1 version so the synthetic source mode is detected.

@jlind23
Copy link
Contributor

jlind23 commented Jun 1, 2023

Probably it needs to wait until elastic-package runs by default 8.7.1 version so the synthetic source mode is detected.

Shall we update the stack version being used then?

@mrodm
Copy link
Contributor Author

mrodm commented Jun 1, 2023

Probably it needs to wait until elastic-package runs by default 8.7.1 version so the synthetic source mode is detected.

Shall we update the stack version being used then?

It is being done in this PR #1274
We could wait for this.

@jlind23
Copy link
Contributor

jlind23 commented Jun 1, 2023

It is being done in this PR #1274
We could wait for this.

I guess we are good now :)

@mrodm
Copy link
Contributor Author

mrodm commented Jun 1, 2023

It is being done in this PR #1274
We could wait for this.

I guess we are good now :)

There are still errors to be fixed in field validations when running static tests:

  • If the sample_event.json is written using the fields key:
Run static tests for the package
--- Test results for package: synthetic_mode - START ---
FAILURE DETAILS:
synthetic_mode/synthetic Verify sample_event.json:
[0] field "event.dataset" should have value "synthetic_mode.synthetic", it has "[synthetic_mode.synthetic]"
[1] field "data_stream.dataset" should have value "synthetic_mode.synthetic", it has "[synthetic_mode.synthetic]"


╭────────────────┬─────────────┬───────────┬──────────────────────────┬────────────────────────────────────────────┬──────────────╮
│ PACKAGE        │ DATA STREAM │ TEST TYPE │ TEST NAME                │ RESULT                                     │ TIME ELAPSED │
├────────────────┼─────────────┼───────────┼──────────────────────────┼────────────────────────────────────────────┼──────────────┤
│ synthetic_mode │ synthetic   │ static    │ Verify sample_event.json │ FAIL: one or more errors found in document │  53.778195ms │
╰────────────────┴─────────────┴───────────┴──────────────────────────┴────────────────────────────────────────────┴──────────────╯
  • If the sample_event.json is written using the _source key:
Run static tests for the package
--- Test results for package: synthetic_mode - START ---
FAILURE DETAILS:
synthetic_mode/synthetic Verify sample_event.json:
[0] field "event.type" is not normalized as expected: expected array, found "info" (string)
[1] field "event.category" is not normalized as expected: expected array, found "web" (string)


╭────────────────┬─────────────┬───────────┬──────────────────────────┬────────────────────────────────────────────┬──────────────╮
│ PACKAGE        │ DATA STREAM │ TEST TYPE │ TEST NAME                │ RESULT                                     │ TIME ELAPSED │
├────────────────┼─────────────┼───────────┼──────────────────────────┼────────────────────────────────────────────┼──────────────┤
│ synthetic_mode │ synthetic   │ static    │ Verify sample_event.json │ FAIL: one or more errors found in document │  53.692446ms │
╰────────────────┴─────────────┴───────────┴──────────────────────────┴────────────────────────────────────────────┴──────────────╯
--- Test results for package: synthetic_mode - END   ---

Trying to find a way to solve that yet.

@jlind23
Copy link
Contributor

jlind23 commented Jun 5, 2023

@mrodm can I also mark this as closing elastic/integrations#6099 ?

@mrodm
Copy link
Contributor Author

mrodm commented Jun 5, 2023

@mrodm can I also mark this as closing elastic/integrations#6099 ?

@jlind23 Not sure if it could be considered closed. If the current approach is approved, some changes in that integration would be needed so its tests are successful:

  • at least, it would be needed to update sample_event.json file for the data streams that require synthetic mode.

@mrodm mrodm marked this pull request as ready for review June 5, 2023 12:52
Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job here, there are quite some tricky cases.

I wonder if we can make this code less dependant on synthetic mode being enabled or not. For example we could try to always merge the fields in the source and the fields in fields. And assume that we have to disable normalization validation for all fields coming from fields. Then maybe we wouldn't even need to check if synthetic is enabled. But we can revisit this later when supporting runtime fields.

Something I think we have to confirm in any case before merging this is the format of the sample event, objects used to be expanded, but they are not here for the synthetic mode.

Comment on lines +360 to +365
func (h hits) getDocs(syntheticsEnabled bool) []common.MapStr {
if syntheticsEnabled {
return h.Fields
}
return h.Source
}
Copy link
Member

@jsoriano jsoriano Jun 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably fine for synthetics, but I wonder if we should actually merge fields and source, to support also the runtime fields cases. But we can think about this when supporting tests with runtime fields.

@mrodm
Copy link
Contributor Author

mrodm commented Jun 6, 2023

/test

@jlind23 jlind23 requested a review from jsoriano June 7, 2023 06:47
Expand objects when docs retrieved from Elasticsearch and synthetic mode
is enabled. That would keep the same format of the sample_event.json
when that file is generated.
Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, the sample document looks better now, added a couple of suggestions more about it.

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

cc @mrodm

@mrodm mrodm requested a review from jsoriano June 7, 2023 13:20
Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍

@mrodm mrodm merged commit a28f696 into elastic:main Jun 8, 2023
@mrodm mrodm deleted the use_fields_synthetics branch June 8, 2023 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support fields validation when synthetic source is used

4 participants