Support multiple different remote dictionaries #777

Open
wants to merge 1 commit into base: master

@Serendo Serendo commented May 7, 2020

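When `custom_dict_name` is not specified, the URL from the ik configuration file is requested directly.
When `custom_dict_name` is specified, `url/custom_dict_name` is requested instead.

Example:
1. If the ik configuration file sets the dictionary URL to http://dict.xxx.com/dict,
   the default request path is http://dict.xxx.com/dict.
2. To use a custom dictionary path, configure the ik analyzer as follows: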
```
PUT ik-test
{
    "settings": {
        "analysis.analyzer": {
            "custom_ik": {
                "type": "ik_smart",
                "enable_remote_dict": true,
                "custom_dict_name": "custom_dict"
            }
        }
    },
    "mappings": {
        "_doc": {
            "properties": {
                "field1": {
                    "type": "text",
                    "analyzer": "custom_ik"
                }
            }
        }
    }
}

POST ik-test/_analyze
{
    "field": "field1",
    "text": "脑暴最棒加一"
}
```
With this configuration, the dictionary is fetched from http://dict.xxx.com/dict/custom_dict.
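
The URL rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this description, not the plugin's actual code (the plugin itself is written in Java); the function name `resolve_dict_url` is hypothetical, and the slash normalization is an assumption added so the joined URL never contains a doubled `/`.

```python
from typing import Optional


def resolve_dict_url(base_url: str, custom_dict_name: Optional[str] = None) -> str:
    """Return the URL the remote dictionary would be fetched from.

    base_url         -- the dictionary URL from the ik configuration file
    custom_dict_name -- the per-analyzer setting; None or "" means "not specified"
    """
    if not custom_dict_name:
        # No custom name: request the configured URL unchanged.
        return base_url
    # Custom name: append it as a sub-path (slash handling is an assumption).
    return base_url.rstrip("/") + "/" + custom_dict_name.lstrip("/")


print(resolve_dict_url("http://dict.xxx.com/dict"))
print(resolve_dict_url("http://dict.xxx.com/dict", "custom_dict"))
```

Because the custom name is always appended to the configured URL, every analyzer's dictionary lives under the same base address.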

@kurokosan98 kurokosan98 left a comment

You're a legend, love you!

stoplyy commented Dec 14, 2021

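Setting remote_ext_dict by appending a path is awkward. Since this is a remote dictionary, a given index may need a remote_ext_dict at a completely different address. For example, IKAnalyzer is configured with a local file path, but one index wants to use a remote API address. Or the cluster has no remote_ext_dict configured at all, and one index wants to use remote_ext_dict by setting custom_dict_name. Neither case is possible with this approach.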

Serendo (Author) commented Aug 30, 2022

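> Setting remote_ext_dict by appending a path is awkward. Since this is a remote dictionary, a given index may need a remote_ext_dict at a completely different address. For example, IKAnalyzer is configured with a local file path, but one index wants to use a remote API address. Or the cluster has no remote_ext_dict configured at all, and one index wants to use remote_ext_dict by setting custom_dict_name. Neither case is possible with this approach.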

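I run the dictionary server on nginx. If the file at the sub-path does not exist, the server returns the default dictionary (the one at the parent path). This makes the dictionary server easy to maintain: when a user needs a custom dictionary, I create a folder and put the dictionary file in it. For an external remote API address, forwarding the request through nginx should also work, except that the remote API address is then no longer configured in the index settings.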
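
The fallback behaviour described in this comment can be sketched in Python as follows. It is a rough illustration of the server-side lookup, not the actual nginx configuration, which is not shown in this thread. The directory layout (one `words.dic` file per folder), the file name, and the function name `pick_dict_file` are all assumptions made for the example.

```python
import tempfile
from pathlib import Path
from typing import Optional


def pick_dict_file(root: Path, custom_dict_name: Optional[str],
                   filename: str = "words.dic") -> Path:
    """Choose which dictionary file to serve.

    root is the directory behind the base dictionary URL. A custom dictionary
    is assumed to live in root/<custom_dict_name>/<filename>.
    """
    if custom_dict_name:
        custom_file = root / custom_dict_name / filename
        if custom_file.is_file():
            # The user has their own dictionary folder: serve it.
            return custom_file
    # No custom name, or no such folder/file: fall back to the default dictionary.
    return root / filename


# Small demo on a temporary directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "words.dic").write_text("default\n", encoding="utf-8")
    (root / "team_a").mkdir()
    (root / "team_a" / "words.dic").write_text("custom\n", encoding="utf-8")

    print(pick_dict_file(root, "team_a").read_text(encoding="utf-8").strip())
    print(pick_dict_file(root, "team_b").read_text(encoding="utf-8").strip())
```

With this layout, adding a custom dictionary for a user only requires creating a new folder on the server; indices whose `custom_dict_name` has no folder keep receiving the default dictionary.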

@1120475708

There is a small problem with this approach: if an index is no longer used, its dictionary seems to remain in ES indefinitely.
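
One possible way to address this concern is to track which indices use each custom dictionary and drop the dictionary once the last user is gone. The sketch below is purely hypothetical: it is not part of this pull request or of the plugin, and the class `DictRegistry` and its methods are invented for illustration.

```python
from typing import Dict, List, Set


class DictRegistry:
    """Tracks which indices use each loaded custom dictionary (hypothetical)."""

    def __init__(self) -> None:
        # dictionary name -> names of the indices currently using it
        self._users: Dict[str, Set[str]] = {}

    def acquire(self, dict_name: str, index_name: str) -> None:
        """Record that index_name uses dict_name (loading it on first use)."""
        self._users.setdefault(dict_name, set()).add(index_name)

    def release(self, dict_name: str, index_name: str) -> bool:
        """Record that index_name no longer uses dict_name.

        Returns True if this was the last user and the dictionary was dropped.
        """
        users = self._users.get(dict_name)
        if users is None:
            return False
        users.discard(index_name)
        if users:
            return False
        del self._users[dict_name]
        return True

    def loaded(self) -> List[str]:
        """Names of the dictionaries that are still held in memory."""
        return sorted(self._users)


registry = DictRegistry()
registry.acquire("custom_dict", "ik-test")
registry.acquire("custom_dict", "ik-test-2")
registry.release("custom_dict", "ik-test")
print(registry.loaded())   # still held: one index remains
registry.release("custom_dict", "ik-test-2")
print(registry.loaded())   # dropped after the last index released it
```

Whether the plugin can observe index deletion and call something like `release` at the right time is an open question that this sketch does not answer.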
