generate timestamp when path is null #4718

Closed
khoan opened this Issue Jan 14, 2014 · 7 comments

@khoan

commented Jan 14, 2014

Shouldn't the timestamp be generated when the value at `path` is null?

Mapping definition:

curl -X PUT  http://localhost:9200/twitter/ -d '{
    "mappings": {
        "_default_": {
            "_timestamp" : {
                "enabled" : "yes",
                "store": "yes",
                "path" : "post_date"
            },
            "properties": {
                "message": {
                    "type": "string"
                }
            }
        }
    }
}'

Indexing a document without `post_date` returns an error:

curl -X PUT http://127.0.0.1:9200/twitter/tweet/123 -d '{
  "message": "bam bam"
}'

=> {"error":"ElasticSearchParseException[failed to parse doc to extract routing/timestamp]; nested: TimestampParsingException[failed to parse timestamp [null]]; ","status":400}

Indexing succeeds when `post_date` is present:

curl -X PUT http://127.0.0.1:9200/twitter/tweet/123 -d '{
  "message": "bam bam",
  "post_date": "2009-11-15T14:12:12Z"
}'

=> {"ok":true,"_index":"twitter","_type":"tweet","_id":"123","_version":1}
@clintongormley

Member

commented Jul 25, 2014

This should probably be an option that can be turned on, so that bad data isn't silently ignored.

@dadoonet

Member

commented Jul 25, 2014

@clintongormley Where should we put that new option? Index settings? Mapping? Other?
Which name should we use: ignore_missing_timestamp or so?

@dadoonet

Member

commented Jul 25, 2014

The more I think about it, the more I think we should define a new option named `default` for the `_timestamp` field. `default` could be `now` (the default), a date which respects the configured `format`, or `null`.

I remember a use case on the mailing list where a user did not have a value for every document and wanted to fall back to 01/01/1970 in that case.

The new _timestamp field would look like this:

{
    "tweet" : {
        "_timestamp" : {
            "enabled" : true,
            "path" : "post_date",
            "format" : "YYYY-MM-dd",
            "default" : "1970-01-01"
        }
    }
}

Or

{
    "tweet" : {
        "_timestamp" : {
            "enabled" : true,
            "path" : "post_date",
            "format" : "YYYY-MM-dd",
            "default" : "now"
        }
    }
}

Or

{
    "tweet" : {
        "_timestamp" : {
            "enabled" : true,
            "path" : "post_date",
            "format" : "YYYY-MM-dd",
            "default" : null
        }
    }
}

WDYT?
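
A minimal sketch of how the three `default` variants above could be resolved for a single document. It is only an illustration of the proposed rules in Python, not the Elasticsearch implementation: `resolve_timestamp`, `TimestampParsingError`, the `explicit` and `now` parameters, and the restriction to a flat top-level `path` are all assumptions made for the example.

```python
from datetime import datetime, timezone


class TimestampParsingError(ValueError):
    """Hypothetical stand-in for the TimestampParsingException in the issue."""


def resolve_timestamp(mapping, doc, explicit=None, now=None):
    """Pick a timestamp for one document under the proposed `default` rules.

    mapping  -- the `_timestamp` mapping as a dict
    doc      -- the document source as a dict
    explicit -- a timestamp passed with the index request, if any
    now      -- injectable clock, only used to make the sketch testable
    """
    # 1. A timestamp given with the request wins.
    if explicit is not None:
        return explicit

    # 2. Otherwise use the value found at `path` (top-level keys only here).
    path = mapping.get("path")
    if path is not None and doc.get(path) is not None:
        return doc[path]

    # 3. Otherwise fall back to `default`; a missing key means "now".
    default = mapping.get("default", "now")
    if default is None:
        # "default": null makes the timestamp mandatory.
        raise TimestampParsingError("timestamp is required by mapping")
    if default == "now":
        current = now or datetime.now(timezone.utc)
        return current.strftime("%Y-%m-%dT%H:%M:%SZ")
    return default


mapping = {"enabled": True, "path": "post_date",
           "format": "YYYY-MM-dd", "default": "1970-01-01"}
print(resolve_timestamp(mapping, {"message": "bam bam"}))
```

With `"default": null` the same call raises instead of silently accepting a document without a date, which matches the concern about bad data being ignored.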

@clintongormley

Member

commented Jul 25, 2014

Sounds reasonable to me

@dadoonet

Member

commented Jul 25, 2014

PR Opened #7036.

dadoonet added a commit to dadoonet/elasticsearch that referenced this issue Jul 28, 2014

Generate timestamp when path is null
Indexing fails when `_timestamp` is enabled, the `path` option is set, and the document has no value at that path.
It fails with a `TimestampParsingException[failed to parse timestamp [null]]` message.

Reproduction:

```
DELETE test
PUT  test
{
    "mappings": {
        "test": {
            "_timestamp" : {
                "enabled" : "yes",
                "path" : "post_date"
            }
        }
    }
}
PUT test/test/1
{
  "foo": "bar"
}
```

You can define a default value to use when no timestamp is provided
within the index request or in the `_source` document.

By default, the default value is `now`, which means the date the document was processed by the indexing chain.

You can disable that default value by setting `default` to `null`. It means that a timestamp is mandatory:

```
{
    "tweet" : {
        "_timestamp" : {
            "enabled" : true,
            "default" : null
        }
    }
}
```

If you don't provide any timestamp value, indexing will fail.

You can also set the default value to any date that respects the timestamp format:

```
{
    "tweet" : {
        "_timestamp" : {
            "enabled" : true,
            "format" : "YYYY-MM-dd",
            "default" : "1970-01-01"
        }
    }
}
```

If you don't provide any timestamp value, `_timestamp` will be set to this default value.

Closes elastic#4718.
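
To make "any date respecting timestamp format" concrete, here is a small Python sketch that checks a `default` value against a format and converts it to milliseconds since the epoch. The helper name `to_epoch_millis` is hypothetical, the token table covers only the few Joda-style tokens used in this thread, and treating the value as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Partial, assumed mapping from Joda-style tokens to strptime directives.
# Only the tokens that appear in this thread are covered.
_TOKENS = [("YYYY", "%Y"), ("yyyy", "%Y"), ("MM", "%m"), ("dd", "%d")]

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(value, joda_format="YYYY-MM-dd"):
    """Parse `value` with a Joda-like pattern and return epoch milliseconds.

    Raises ValueError when the value does not match the format, which is
    the situation a mapping with a malformed `default` would run into.
    """
    fmt = joda_format
    for token, directive in _TOKENS:
        fmt = fmt.replace(token, directive)
    parsed = datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
    # Integer division by a timedelta avoids float rounding.
    return (parsed - _EPOCH) // timedelta(milliseconds=1)


print(to_epoch_millis("1970-01-01"))
```

A value such as `01/01/1970` does not match `YYYY-MM-dd` and raises `ValueError` in this sketch, so it would have to be written as `1970-01-01` or paired with a matching format.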

dadoonet added a commit that referenced this issue Jul 31, 2014

Generate timestamp when path is null

Closes #4718.
Closes #7036.

(cherry picked from commit 85eb0ea)

@dadoonet dadoonet closed this in 85eb0ea Jul 31, 2014

dadoonet added a commit that referenced this issue Sep 8, 2014

Generate timestamp when path is null

Closes #4718.
Closes #7036.
@alfasin

commented May 1, 2015

Can you do:
"enabled" : "yes", ???

I thought that the only valid values of "enabled" are true/false...

@dadoonet

Member

commented May 1, 2015

Actually, "enabled":"whateveryouwant" might be treated as true. But you're right, true is definitely better!
This boolean parsing might change in the future to be more strict.
More info: http://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-core-types.html#boolean
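
For illustration, the lenient parsing described here can be sketched in Python as below. The function name `lenient_bool` is hypothetical, and the exact set of falsy strings is an assumption based on the legacy boolean documentation linked above; check it against the version you run.

```python
# Assumed set of values the lenient parser treats as false;
# everything else, including "yes", counts as true.
FALSY = {"false", "0", "off", "no", ""}


def lenient_bool(value):
    """Interpret a mapping value the lenient way: only a few strings are false."""
    if isinstance(value, bool):
        return value
    return str(value) not in FALSY


for candidate in ("yes", "whateveryouwant", "no", "off", True):
    print(repr(candidate), "->", lenient_bool(candidate))
```

Under these assumptions a typo such as `"enabled": "flase"` would still enable the feature, which is a good reason to prefer the literal `true` and `false`.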
