# Normalizers

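Normalizers are similar to analyzers, except that they can emit only a single token. As a result, they have no tokenizer and accept only a subset of the available character filters and token filters: only filters that operate character by character are allowed. For example, a lowercase filter is allowed, but a stemming filter is not, because it needs to look at the keyword as a whole. The filters that can currently be used in a normalizer are: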
`arabic_normalization`, `asciifolding`, `bengali_normalization`, `cjk_width`, `decimal_digit`, `elision`, `german_normalization`, `hindi_normalization`, `indic_normalization`, `lowercase`, `persian_normalization`, `scandinavian_folding`, `serbian_normalization`, `sorani_normalization`, `uppercase`
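
To make the "single token, character by character" idea concrete, here is a minimal local sketch that only mimics a normalizer with `sed` and `tr`. It is not how Elasticsearch implements normalizers, and the `normalize` function name is hypothetical. It maps guillemets to double quotes (a character filter) and then lowercases ASCII letters (a character-wise token filter), leaving the input as one unsplit value.

```shell
# Hypothetical helper that imitates a normalizer locally.
# Step 1 (char filter): map « and » to a plain double quote.
# Step 2 (token filter): lowercase ASCII letters one character at a time.
# No tokenization happens, so the whole input stays a single value.
normalize() {
    printf '%s' "$1" \
        | sed -e 's/«/"/g' -e 's/»/"/g' \
        | tr 'A-Z' 'a-z'
}

normalize '«Hello» World'; echo
# prints: "hello" world
```

A stemmer could not be written this way, because it has to inspect a whole word before deciding what to change.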

## 1. Built-in normalizers

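At the time of writing, Elasticsearch provided no built-in normalizers, so the only option was to build a custom one. (More recent releases ship a built-in `lowercase` normalizer; any other normalization still needs a custom definition.)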

## 2. Custom normalizers

A custom normalizer takes a list of [character filters](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-charfilters.html) and a list of [token filters](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-tokenfilters.html).

In [None]:
# create index
echo -e "* create index as: ";
settings='{
    "settings": {
        "number_of_replicas": 0,
        "number_of_shards": 1,
        "analysis": {
            "char_filter": {
                "quote_filter": {
                    "type": "mapping",
                    "mappings": [
                        "« => \"",
                        "» => \""
                    ]
                }
            },
            "normalizer": {
                "quote_normalizer": {
                    "type": "custom",
                    "char_filter": [
                        "quote_filter"
                    ],
                    "filter": [
                        "lowercase",
                        "asciifolding"
                    ]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "text": {
                "type": "keyword",
                "normalizer": "quote_normalizer"
            }
        }
    }
}';
curl -H 'Cache-Control: no-cache' -H 'Content-Type: application/json' \
     -X PUT 'http://localhost:9200/analyzer?pretty' -d "$settings";

# create document
echo -e "\n* create document:";
doc='{
    "text": "Hello, this is Elasticsearch"
}';
curl -H 'Cache-Control: no-cache' -H 'Content-Type: application/json' \
     -X POST 'http://localhost:9200/analyzer/_doc/001?pretty&refresh=true' -d "$doc";
     
echo -e "\n* query after create:";
curl -H 'Cache-Control: no-cache' -H 'Content-Type: application/json' \
     -X GET 'http://localhost:9200/analyzer/_search?pretty';

# delete index
echo -e "\n* delete index as:";
curl -H 'Cache-Control: no-cache' -H 'Content-Type: application/json' \
     -X DELETE 'http://localhost:9200/analyzer?pretty';
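
One way to check what the custom normalizer does to a value is the `_analyze` API, which accepts a `normalizer` parameter. The sketch below is an illustration rather than part of the original walkthrough. The `analyze_body` helper and the `ES_URL` variable are hypothetical, and the request only makes sense while the `analyzer` index still exists, that is, before the delete step above. With the settings shown earlier, the response should contain a single token with the guillemets mapped to double quotes and the letters lowercased.

```shell
# Hypothetical helper: print the JSON body for an _analyze request.
# Assumption: the text contains no double quotes or backslashes,
# because no JSON escaping is performed here.
analyze_body() {
    printf '{"normalizer": "%s", "text": "%s"}' "$1" "$2"
}

body=$(analyze_body 'quote_normalizer' '«Hello» World')
echo "$body"

# Contact the cluster only when ES_URL is set,
# for example ES_URL=http://localhost:9200
if [ -n "${ES_URL:-}" ]; then
    curl -s -H 'Content-Type: application/json' \
         -X POST "$ES_URL/analyzer/_analyze?pretty" -d "$body"
fi
```

Keeping the body in a function makes it easy to try several inputs against the same normalizer without repeating the curl call.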