ik_max_word splits an English word with a trailing period into two tokens #993

Open
sissilab opened this issue Jan 13, 2023 · 0 comments

Comments

@sissilab

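With ik_max_word, the input FoX. is split into two tokens (fox. and fox). How can I handle this so that only fox is emitted?

If I use a custom char_filter of type mapping to map . to a space, it also affects cases where the . is needed, such as U.F.O, where the . must be kept.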

GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "FoX."
}

{
  "tokens": [
    {
      "token": "fox.",
      "start_offset": 0,
      "end_offset": 4,
      "type": "LETTER",
      "position": 0
    },
    {
      "token": "fox",
      "start_offset": 0,
      "end_offset": 3,
      "type": "ENGLISH",
      "position": 1
    }
  ]
}
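
For reference, the mapping workaround described above might look roughly like the requests below. This is only a sketch under assumptions: it assumes the IK plugin registers ik_max_word as a tokenizer as well as an analyzer, and it uses Elasticsearch's built-in mapping char filter with \u0020 standing in for a space. The second request runs the same filter on U.F.O, where every . is replaced as well, which is the side effect this issue is about.

```
# Sketch only: assumes "ik_max_word" is also available as a tokenizer.
# Maps every "." to a space before tokenization.
GET _analyze
{
  "tokenizer": "ik_max_word",
  "char_filter": [
    {
      "type": "mapping",
      "mappings": [". => \\u0020"]
    }
  ],
  "text": "FoX."
}

# Same char filter on an abbreviation: the periods in U.F.O are replaced too.
GET _analyze
{
  "tokenizer": "ik_max_word",
  "char_filter": [
    {
      "type": "mapping",
      "mappings": [". => \\u0020"]
    }
  ],
  "text": "U.F.O"
}
```

Because the filter is applied to the whole input, it cannot distinguish a sentence-final period from one inside an abbreviation.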