Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ingest: dot_expander_processor prevent null add/append to source document #35106

Merged
merged 4 commits into from Nov 5, 2018

Conversation

jakelandis
Copy link
Contributor

@jakelandis jakelandis commented Oct 30, 2018

  • don't allow null values to be added to the source document
    if the field does not exist in the source document.
  • don't allow null values to be appended to an existing field
    if the field does not exist in the source document.

I noticed some peculiar behavior w/r/t the dot expander processor adding in null values when either 1) the field declared is missing from the source document 2) the source document already contains the nested objects. For example:

PUT _ingest/pipeline/expander
{
  "processors": [
    {
      "dot_expander": {
        "field": "network.name"
      }
    }
  ]
}
POST test/_doc/1?pipeline=expander
{
  "foo": "bar"
}

POST test/_doc/2?pipeline=expander
{
  "network": {
    "name" :"bar"
  }
}

GET test/_doc/_mget
{
  "docs": [
    {
      "_id": "1"
    },
    {
      "_id": "2"
    }
  ]
}

Results in:

{
  "docs" : [
    {
      "_index" : "test",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 21,
      "found" : true,
      "_source" : {
        "foo" : "bar",
        "network" : {
          "name" : null
        }
      }
    },
    {
      "_index" : "test",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 1,
      "found" : true,
      "_source" : {
        "network" : {
          "name" : [
            "bar",
            null
          ]
        }
      }
    }
  ]
}

With this change:

{
  "docs": [
    {
      "_index": "test",
      "_type": "_doc",
      "_id": "1",
      "_version": 1,
      "found": true,
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_index": "test",
      "_type": "_doc",
      "_id": "2",
      "_version": 1,
      "found": true,
      "_source": {
        "network": {
          "name": "bar"
        }
      }
    }
  ]
}

EDIT : updated text to match latest commit (examples did not need to change)

* support for ignore_missing, such that if the field does not exist in
source document the processor will be skipped

* fix bug such that if the location to expand to already exists as
nested objects, don't append a null value.
@jakelandis jakelandis added >bug >enhancement :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v7.0.0 v6.6.0 labels Oct 30, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@rjernst rjernst requested a review from talevy October 31, 2018 16:27
@jakelandis
Copy link
Contributor Author

After a some discussion with @talevy , I will remove the ignore_missing here since it isn't a good fit w/r/t how the other processors define it's behavior. We will however keep the behavior that prevents the null from appearing the IngestDocument, just will do so without introducing a new option.

@jakelandis jakelandis changed the title ingest: dot_expander_processor, ignore_missing and prevent null ingest: dot_expander_processor prevent null add/append to source document Nov 1, 2018
@jakelandis
Copy link
Contributor Author

@talevy - changes made as discussed. The GH diff is kinda ugly, but it really a 1 line change.

Copy link
Contributor

@talevy talevy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some questions, otherwise looks good!

@@ -143,4 +143,41 @@ public void testEscapeFields_path() throws Exception {
assertThat(document.getFieldValue("field.foo.bar.baz", String.class), equalTo("value"));
}


public void testEscapeFields_doNotingIfFieldNotInSourceDoc() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/doNoting/doNothing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

//hasField returns false since it requires the expanded form, which is not expanded since we did not ask for it to be
assertFalse(document.hasField("foo.bar"));
//nothing has changed
assertEquals(document.getSourceAndMetadata().get("foo.bar"), source.get("foo.bar"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this necessary given the next line?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, are any of the other assertions necessary given that we are checking that source has not changed at all in this case (and no metadata was added).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, comparing to source is mis-leading since source is the object that is getting changed, so assertEquals(document.getSourceAndMetadata(), source); is like asserting 1==1.

I removed the assertions against source and kept the specific assertions. Thanks for spotting this.

assertTrue(document.hasField("foo.bar"));
//nothing changed
assertEquals(document.getSourceAndMetadata().get("foo.bar"), source.get("foo.bar"));
assertEquals(document.getSourceAndMetadata(), source);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question as above, is this not a deep enough equals?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed all misleading assertions against source

Copy link
Contributor

@talevy talevy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. thanks!

@jakelandis jakelandis merged commit 9150a93 into elastic:master Nov 5, 2018
jakelandis added a commit that referenced this pull request Nov 6, 2018
…ment (#35106)

* don't allow null values to be added or appended to the 
source document if the field does not exist.
@jakelandis
Copy link
Contributor Author

6.6.0 backport : 1e02754

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v6.6.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants