add analyzer to specific field #9

aleha84 · 2017-03-27T13:27:36Z

Using version 3.0.0-SNAPSHOT.
When I run a command like GET /index/type/_mapping/field/content, I get this:

{
  "index": {
    "mappings": {
      "type": {
        "content": {
          "full_name": "content",
          "mapping": {
            "content": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      }
    }
  }
}

Is it possible to set a specific analyzer for specific fields, as described here: https://www.elastic.co/guide/en/elasticsearch/reference/current/analyzer.html
Most of my content is in Russian, and I want to search the content field using Russian morphology and stop words.

essiembre · 2017-03-28T03:05:14Z

This channel is for the Norconex Elasticsearch Committer only.
I think what you are asking is related to the configuration of Elasticsearch itself, not the Committer library. Please confirm.

aleha84 · 2017-03-28T05:41:29Z

If I understand correctly, Elasticsearch creates mappings automatically based on the data sent to it, so when I run the crawler for the first time with the Elasticsearch Committer, it creates the index and type automatically. But once crawling has finished, the index is already populated and I can't change the analyzer property of its fields, because analysis happens at index time.

essiembre · 2017-03-28T16:53:41Z

That's because you are using the dynamic mapping feature of Elasticsearch, which tries to guess the data type of each field it receives. If you want to control this, you have to define the schema yourself (static mapping). This is something you do within Elasticsearch, not the Collector (refer to the Elastic documentation for this).

This being said, if you want to discover which fields are found, you can leave the dynamic mapping while you are developing/testing. Then you can analyze the fields you get and create the best schema for you before re-indexing for real.
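As a rough sketch of that review step, the snippet below flattens a mapping response (as returned by GET /index/_mapping) into field name and type pairs. It assumes the Elasticsearch 5.x response layout (index, then `mappings`, then type, then `properties`); the sample response and the function name `list_fields` are made up for illustration.

```python
def list_fields(mapping_response):
    """Yield (index, type, field, datatype) for each top-level mapped field."""
    for index_name, index_body in mapping_response.items():
        for type_name, type_body in index_body.get("mappings", {}).items():
            for field, spec in type_body.get("properties", {}).items():
                # Fields without an explicit "type" are object fields.
                yield index_name, type_name, field, spec.get("type", "object")


# Hypothetical response, modeled on the dynamic mapping shown in this issue.
sample = {
    "index": {
        "mappings": {
            "type": {
                "properties": {
                    "content": {
                        "type": "text",
                        "fields": {
                            "keyword": {"type": "keyword", "ignore_above": 256}
                        },
                    },
                    "title": {"type": "text"},
                }
            }
        }
    }
}

for row in list_fields(sample):
    print(row)
```

Only top-level fields are listed; nested objects and multi-fields such as `keyword` would need a recursive walk.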

You can also use a few different taggers to help you get just what you want. For instance:

KeepOnlyTagger: Use this to only keep the fields you are interested in.
RenameTagger: Use this to rename fields you get from the collector to the names you want in Elasticsearch.
DebugTagger: Can print on console/logs the fields captured and their value, so you have an idea while you are developing (before it reaches Elasticsearch).

The above are part of the Importer module and it is recommended to use them as post-parse handlers so all fields extracted during the parsing of documents are there.

aleha84 · 2017-03-28T17:13:51Z

I already have a workaround. Before the first indexing run, I just put a mapping for the "content" and "title" fields with specific analyzer properties. The Committer only adds to these properties and does not override the existing ones. Works fine. I will think about KeepOnlyTagger. Thanks.
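For readers landing here, the workaround can be sketched roughly as follows. This is an illustration under assumptions, not the exact request used in this thread: it assumes Elasticsearch 5.x (where mappings are nested under a type name), reuses the `index` and `type` placeholders from the original question, and relies on Elasticsearch's built-in `russian` analyzer. The helper name `build_index_body` is hypothetical. The script only prints the JSON body; sending it with PUT /index before the first crawl is left to whatever HTTP client you prefer.

```python
import json


def build_index_body(type_name, analyzed_fields, analyzer="russian"):
    """Build an index-creation body that sets an analyzer on the given text fields."""
    properties = {
        field: {"type": "text", "analyzer": analyzer}
        for field in analyzed_fields
    }
    # Assumption: ES 5.x layout, with mappings keyed by type name.
    return {"mappings": {type_name: {"properties": properties}}}


body = build_index_body("type", ["content", "title"])
print(json.dumps(body, indent=2))
```

Fields not listed in the body would still be picked up by dynamic mapping when the Committer sends documents.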

essiembre · 2017-03-28T17:22:05Z

Great, thanks for confirming.

essiembre closed this as completed Mar 28, 2017
