Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add doc values support to boolean fields. #7961

Merged
merged 1 commit into from Apr 2, 2015

Conversation

jpountz
Copy link
Contributor

@jpountz jpountz commented Oct 2, 2014

This pull request makes boolean handled like dates and ipv4 addresses: things
are stored as as numerics under the hood and aggregations add some special
formatting logic in order to return true/false in addition to 1/0.

For example, here is an output of a terms aggregation on a boolean field:

   "aggregations": {
      "top_f": {
         "doc_count_error_upper_bound": 0,
         "buckets": [
            {
               "key": 0,
               "key_as_string": "false",
               "doc_count": 2
            },
            {
               "key": 1,
               "key_as_string": "true",
               "doc_count": 1
            }
         ]
      }
   }

Sorted numeric doc values are used under the hood.

Close #4678
Close #7851

@s1monw
Copy link
Contributor

s1monw commented Oct 2, 2014

@jpountz this looks good to me but I think we need to make sure that then index.created.version is >= 1.5.0 ? Otherwise we get inconsistent results?

@s1monw s1monw removed the review label Oct 2, 2014
@jpountz
Copy link
Contributor Author

jpountz commented Oct 2, 2014

@s1monw I pushed a new commit that only switches to this new field data on new indices. That said, I'm wondering if that is the right trade-off:

  • if we only enable this behavior on new indices (what the last commit does) then requests across indices that have several versions will fail (even if all nodes have the same version) since some indices will serialize booleans as strings and others as numbers
  • if we enable this behavior all the time (what happens without the last commit) then requests will fail if not all nodes have the same version of elasticsearch

Another option is to only push this change to 2.0?

@jpountz jpountz added review and removed v1.5.0 labels Oct 2, 2014
@jpountz
Copy link
Contributor Author

jpountz commented Oct 8, 2014

After discussing this with @s1monw I made the change 2.0-only.

@jpountz jpountz force-pushed the enhancement/boolean_doc_values branch from 9845f29 to 9f44137 Compare October 12, 2014 15:56
@jpountz
Copy link
Contributor Author

jpountz commented Oct 12, 2014

@s1monw I updated the 2.0 migration guide with the breaking change for boolean fields.

@s1monw
Copy link
Contributor

s1monw commented Oct 12, 2014

Similar to what @rjernst did in #8013 I think I'd really like to see this to be tested against an old index. See https://github.com/elasticsearch/elasticsearch/pull/8013/files#diff-ec2b53cfb6bd9c938d5a48d14a34b609R43 otherwise LGTM

@jpountz
Copy link
Contributor Author

jpountz commented Oct 13, 2014

@s1monw There used to be a backward compatibility test (jpountz@9700deb#diff-8) but I removed it when I made this change 2.0-only since we cannot run clusters that contain both 1.x and 2.0 nodes.

@jpountz
Copy link
Contributor Author

jpountz commented Nov 3, 2014

@s1monw I'm wondering that your last comment was maybe about #7954 ?

@s1monw
Copy link
Contributor

s1monw commented Nov 3, 2014

I don't think so while it applies to that too :) I think we should check that we can upgrade an existing boolean field or at least handle it just fine if somebody does a full restart with 2.0

@jpountz
Copy link
Contributor Author

jpountz commented Apr 1, 2015

@s1monw This PR has been stalled for some time so I rebased it to a recent master and added the bw compat tests that you were asking for. Could you take another look?

Note that I only regenerated the static bw index for 1.5 for now but I will need to do it for all versions before pushing if I don't want to break the bw tests.

@jpountz
Copy link
Contributor Author

jpountz commented Apr 1, 2015

@rjernst You might want to have a look at this one too as I know you've done a lot of stuff recently around bw compat testing and mappings.

@s1monw
Copy link
Contributor

s1monw commented Apr 2, 2015

LGTM

This pull request makes boolean handled like dates and ipv4 addresses: things
are stored as as numerics under the hood and aggregations add some special
formatting logic in order to return true/false in addition to 1/0.

For example, here is an output of a terms aggregation on a boolean field:
```
   "aggregations": {
      "top_f": {
         "doc_count_error_upper_bound": 0,
         "buckets": [
            {
               "key": 0,
               "key_as_string": "false",
               "doc_count": 2
            },
            {
               "key": 1,
               "key_as_string": "true",
               "doc_count": 1
            }
         ]
      }
   }
```

Sorted numeric doc values are used under the hood.

Close elastic#4678
Close elastic#7851
@jpountz jpountz force-pushed the enhancement/boolean_doc_values branch from 7b9119b to 08f93cf Compare April 2, 2015 13:59
@jpountz jpountz removed the review label Apr 2, 2015
jpountz added a commit that referenced this pull request Apr 2, 2015
Fielddata: Add doc values support to boolean fields.

Close #7961
@jpountz jpountz merged commit e8a42c6 into elastic:master Apr 2, 2015
@jpountz jpountz deleted the enhancement/boolean_doc_values branch April 2, 2015 14:06
@clintongormley clintongormley changed the title Fielddata: Add doc values support to boolean fields. Add doc values support to boolean fields. Jun 6, 2015
@clintongormley clintongormley added the :Search/Mapping Index mappings, including merging and defining field types label Jun 6, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking >enhancement :Search/Mapping Index mappings, including merging and defining field types v2.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Core type Boolean cannot be set as Doc_Value Create dedicated fielddata implementation for boolean fields
3 participants