Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

implement handling for other facet types #73

Open
willkg opened this Issue · 6 comments

2 participants

@willkg
Owner

On IRC we discussed some alternatives:

First, add a type argument to .facet() that changes the facet type it's creating. Then we'd end up with code like this:

s.facet('fielda', 'fieldb')  # terms facet
s.facet('fielda', type='date_histogram', interval='days')  # date histogram facet

This is harder to document and we know at least "interval" is an argument that affects multiple facet types, but has different shapes of values. Ew.

Second, we could have a Facet class with subclasses like DateHistogram where you provide the relevant bits:

s.facet(DateHistogram('fielda', interval='days'))

This is easy to document. We have F, so there's some precedence for this shape of things.

Third, we could have different methods:

s.facet_date_histogram('fielda', interval='days')

This is easy to document. We do different filter types as different methods, so there's precedence for this, too.

@willkg
Owner

Oops--the bit about having different filter types as different methods is bogus. We only have one .filter.

@willkg
Owner

If we look at doing multiple facet types, then the second approach looks better. Then you could do something like this:

s.facet(TermsFacet('fielda', 'fieldb'), Histogram('fielda', interval=100), DateHistogram('fielda', interval='days'))

I don't know how often that comes up in practice, but it nicely mirrors the resulting ES output:

{
    "query" : {
        "match_all" : {  }
    },
    "facets" : {
        "tag" : {
            "terms" : {
                "field" : "tag",
                "size" : 10
            }
        },
        "histo1" : {
            "histogram" : {
                "field" : "field_name",
                "interval" : 100
            }
        },
        "histo2" : {
            "date_histogram" : {
                "field" : "field_name",
                "interval" : "day"
            }
        }
    }
}

Also, it's interesting to point out that the value of facets is a dict of facet names to facets.

Maybe we should make it like this instead?:

s.facet(facetname1=DateHistogram('fielda', interval='days'))
@willkg
Owner

I like that last idea best so far. It's a little wordy, but it's closer to the ES api and it's explicit in important ways.

I'll try implementing that tomorrow morning (or late tonight--whichever has more free time) and see whether it tastes yucky or not.

@willkg
Owner

The last comment is from 8 months ago. Since then, we reimplemented filters and queries in an extensible way. Facets should follow suit.

@eire1130

Continuing from #146

I agree it's not a great idea, if for no other reason than it gets away practices set forth in django.

I'm not sure where you guys are at in this discussion, but I could see this going a could of different ways:

1: Using something like Django's aggregation framework, so something like from elasticutils.facets import Term, Histogram, Statistical, etc

And then from there do something like, S().query(lol='cats').facets(Statistical("field", options))

Or, have an options meta class that is passed in when declaring a facet, so something like this:

options = Options(options)

`S().query(lol='cats').facets(field__statistical=options)

I probably prefer the former to the later, keeps it closer to django and easier for new devs to pick up I think.

The thing is, different facet types have different options available. Those options should get exposed via the API and enforced as well.

Thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.