I got it to work by combining multiple More Like This queries each with their own analyzer instead of trying to use per_field_analyzer. That worked out better anyway, allowing me to have separate settings (e.g. stop_words, min_word_length) for unigrams vs bigrams.
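A rough sketch of what that combination might look like, written as a plain Python dict for the Elasticsearch query DSL. The field names (`text`, `text.shingles`), the analyzer names (`unigram_analyzer`, `bigram_analyzer`), and all the numeric settings are assumptions for illustration, not the values actually used here. Only the `more_like_this` parameters themselves (`fields`, `like`, `analyzer`, `stop_words`, `min_word_length`, `min_term_freq`, `max_query_terms`) come from the Elasticsearch query DSL.

```python
import json


def mlt_clause(like_text, field, analyzer, **options):
    """Build one more_like_this clause with its own analyzer and settings."""
    clause = {"fields": [field], "like": like_text, "analyzer": analyzer}
    clause.update(options)
    return {"more_like_this": clause}


def combined_mlt_query(like_text):
    """Combine a unigram and a bigram more_like_this clause in one bool query.

    Field and analyzer names are hypothetical; they must match whatever
    is defined in the index settings and mappings.
    """
    unigram = mlt_clause(
        like_text,
        field="text",
        analyzer="unigram_analyzer",
        stop_words=["section", "act"],  # illustrative stop words only
        min_word_length=4,
        max_query_terms=50,
    )
    bigram = mlt_clause(
        like_text,
        field="text.shingles",
        analyzer="bigram_analyzer",
        min_term_freq=1,
        max_query_terms=25,
    )
    # "should" lets a document match on either clause; scores add up
    # when both the unigram and the bigram clause match.
    return {"query": {"bool": {"should": [unigram, bigram]}}}


if __name__ == "__main__":
    print(json.dumps(combined_mlt_query("text of the bill to compare"), indent=2))
```

The resulting dict can be sent as the body of a search request. Keeping each clause separate is what allows different `stop_words` and `min_word_length` values for unigrams and bigrams.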
===
Flatgov discussion
From Daniel:
I think multi-word phrases probably make a lot more sense here than single words.
We can also probably limit small bill-to-bill comparisons based on how CRS/LOC categorizes them. There are only a handful of monster bills, and those are the ones where it's probably useful to identify when smaller bills are components. We can also potentially use the section headings as a clue to narrow that down.
We could also use the Library of Congress summaries. They are much shorter but have to identify the key concepts, so they could be a way to winnow it down.
For use of shingles (multi-word phrases) with 'more-like-this', see, e.g. https://discuss.elastic.co/t/more-like-this-and-shingles-phrases/100775
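As a minimal sketch of the shingle idea, the snippet below shows what two-word shingles look like and how an index could be configured to produce them. The `shingle` token filter and its `min_shingle_size`, `max_shingle_size`, and `output_unigrams` options are part of Elasticsearch. The names `bigram_filter`, `bigram_analyzer`, and the `text.shingles` subfield are made up for this example.

```python
def bigrams(text):
    """Pure-Python illustration of two-word shingles (not how ES computes them)."""
    tokens = text.lower().split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def shingle_index_body():
    """Index settings and mappings with a bigram-only shingle analyzer.

    All analyzer, filter, and field names here are hypothetical.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "bigram_filter": {
                        "type": "shingle",
                        "min_shingle_size": 2,
                        "max_shingle_size": 2,
                        "output_unigrams": False,  # emit only two-word phrases
                    }
                },
                "analyzer": {
                    "bigram_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "bigram_filter"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "text": {
                    "type": "text",
                    # Subfield indexed as shingles, so more_like_this can
                    # target phrases separately from single words.
                    "fields": {
                        "shingles": {"type": "text", "analyzer": "bigram_analyzer"}
                    },
                }
            }
        },
    }


print(bigrams("Affordable Care Act"))  # ['affordable care', 'care act']
```

Indexing the same text into a shingled subfield keeps the single-word field available, so both can be queried side by side.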