Word segmentation: single characters are still split out even after adding a custom dictionary? #1035

Open
crossmaya opened this issue Dec 1, 2023 · 1 comment

crossmaya commented Dec 1, 2023

'analyzer' => 'ik_max_word',
'text' => '我爱迪丽热巴'

How can I suppress the tokens '爱迪', '热', and '巴'?

Array
(
[tokens] => Array
    (
        [0] => Array
            (
                [token] => 我爱
                [start_offset] => 0
                [end_offset] => 2
                [type] => CN_WORD
                [position] => 0
            )

        [1] => Array
            (
                [token] => 爱迪
                [start_offset] => 1
                [end_offset] => 3
                [type] => CN_WORD
                [position] => 1
            )

        [2] => Array
            (
                [token] => 迪丽热巴
                [start_offset] => 2
                [end_offset] => 6
                [type] => CN_WORD
                [position] => 2
            )

        [3] => Array
            (
                [token] => 热
                [start_offset] => 4
                [end_offset] => 5
                [type] => CN_WORD
                [position] => 3
            )

        [4] => Array
            (
                [token] => 巴
                [start_offset] => 5
                [end_offset] => 6
                [type] => CN_CHAR
                [position] => 4
            )

    )

)

@Emptyrain

Using ik_smart should probably be enough.
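
For anyone who wants to try this, here is a minimal sketch of the same `_analyze` request with the analyzer switched to `ik_smart`, which does the coarsest-grained split, whereas `ik_max_word` does the finest-grained one. It is written in Python with only the standard library. The body it builds has the same two keys as the PHP array in the question. The helper names `build_analyze_body` and `analyze`, and the example base URL, are assumptions for illustration and are not part of IK or any Elasticsearch client. Whether '迪丽热巴' stays whole still depends on the custom dictionary actually being loaded.

```python
import json
import urllib.request


def build_analyze_body(text, analyzer="ik_smart"):
    """Build the JSON body for Elasticsearch's _analyze API (hypothetical helper)."""
    return {"analyzer": analyzer, "text": text}


def analyze(base_url, text, analyzer="ik_smart"):
    """POST the body to <base_url>/_analyze and return the token strings.

    base_url is an assumption (for example "http://localhost:9200");
    adjust it for your own cluster.
    """
    body = json.dumps(build_analyze_body(text, analyzer)).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Keep only the token text from each entry in the "tokens" list.
    return [token["token"] for token in payload["tokens"]]


# Print the request body only; no network call is made here.
print(json.dumps(build_analyze_body("我爱迪丽热巴"), ensure_ascii=False))
```

Calling `analyze("http://localhost:9200", "我爱迪丽热巴")` against a node with the IK plugin installed returns the token list, which can then be compared with the `ik_max_word` output above.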
