Does Flask-WhooshAlchemy not support Chinese? #46

Closed
ToonoW opened this Issue Apr 26, 2016 · 12 comments

ToonoW commented Apr 26, 2016

I have set up Flask-WhooshAlchemy in my project.

Querying the database through Flask-WhooshAlchemy with English search terms works fine.

But when I query with Chinese text, I always get an empty list.
Please help me.

Here is my model:

class Post(db.Model):
    __tablename__ = 'posts'
    __searchable__ = ['title']  # these fields will be indexed by whoosh

    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.Text)

Here is what I did and what it returned:

>>> p = Post(title='my third and last post 失意时行不行')
>>> db.session.add(p)
>>> db.session.commit()
>>> Post.query.whoosh_search('my').all()
[<app.models.Post object at 0x10629f630>]
>>> Post.query.whoosh_search('行不行').all()
[]

chenbotao828 commented May 19, 2016

Ha, I ran into the same problem. @ToonoW, have you solved it yet?

ToonoW commented May 19, 2016

@chenbotao828 In the end I got it working with a workaround. It's acceptable; the word segmentation isn't ideal, but it's usable. http://www.v2ex.com/t/274600#reply5

@ToonoW ToonoW closed this May 19, 2016

@ToonoW ToonoW reopened this May 19, 2016

chenbotao828 commented May 19, 2016

@ToonoW I tried the method from the link, but it still doesn't seem to work. Same as before: Chinese text isn't being segmented. Is there anything else that needs to be configured beyond that?

ToonoW commented May 19, 2016

@chenbotao828 Are you using Python 3? If so, you need to use that foreign developer's patched version of whoosh.

chenbotao828 commented May 19, 2016

@ToonoW I'm using Python 2, and I followed the Chinese version of his tutorial step by step. English search worked fine before. With Chinese search, I can match a whole sentence: searching for the full "我爱北京天安门" finds it, but searching for just "北京" returns nothing. Do I need to build something like a tokenizer myself?

ToonoW commented May 19, 2016

For word segmentation, use jieba. My fix was for Python 3; with Python 2 there aren't as many problems.

chenbotao828 commented May 19, 2016

OK, here is the situation in detail:
Python version: 2.7
I have been following the foreign author's tutorial all the way to the full-text search chapter.
English search works fine, but Chinese text does not get segmented on its own.
Then, following the method from the link:

  1. whooshalchemyplus is installed and has replaced whooshalchemy.
  2. jieba's analyzer has been added in models.py, just these two lines:
from jieba.analyse import ChineseAnalyzer
....
__analyzer__ = ChineseAnalyzer()
...

Is there anything else that needs to be set up? @ToonoW, please advise.
PS: typed on my phone, please excuse the formatting.
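
For reference, a fuller version of those two lines in context could look like the sketch below. This is only a sketch: it assumes Flask-WhooshAlchemyPlus with whoosh installed, and that a Flask-SQLAlchemy instance named db is exposed by the app package; the Post model mirrors the one in the original report.

from jieba.analyse import ChineseAnalyzer   # shipped with jieba; requires whoosh to be installed

from app import db                          # assumed: the Flask-SQLAlchemy instance lives in the app package


class Post(db.Model):
    __tablename__ = 'posts'
    __searchable__ = ['title']              # fields indexed by whoosh
    __analyzer__ = ChineseAnalyzer()        # let jieba tokenize Chinese text

    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.Text)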

ToonoW commented May 19, 2016

Did you do this when initializing the app?

from flask_whooshalchemyplus import whoosh_index

# Post is the model that needs full-text search
whoosh_index(app, Post)

The __searchable__ fields also need to be added to every model you want to search.
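
Put together, the initialization could look like the following sketch. The database URI, the WHOOSH_BASE directory, and the module layout are illustrative assumptions, not taken from this thread.

# app/__init__.py (sketch)
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///app.db'
app.config['WHOOSH_BASE'] = 'whoosh_index'   # directory where the whoosh index files are stored

db = SQLAlchemy(app)

from app.models import Post                  # imported after db exists so the model can use it
from flask_whooshalchemyplus import whoosh_index

whoosh_index(app, Post)                      # register Post; rows committed after this get indexed

After that, inserting a post and running Post.query.whoosh_search('北京').all() should return it, as in the session from the original report.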

chenbotao828 commented May 19, 2016

Yes, I did. I followed the tutorial step by step, deleted all the posts, and re-entered them. If I hadn't, English search probably wouldn't work either. My computer isn't with me right now, so I'll try again tomorrow. Thanks, @ToonoW.

chenbotao828 commented May 20, 2016

@ToonoW It's solved now. Before, I had only deleted all the Posts; this time I reset the database completely, and Chinese searches with word segmentation now return results. The segmentation results are actually pretty good, too. Thanks!

ToonoW commented May 20, 2016

@chenbotao828 Ah, I forgot to tell you: only newly added data gets added to the full-text search index, i.e., only data inserted after setup is searchable.
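
In other words, rows that were already in the table before the index was set up never make it into the whoosh index. One blunt way to pick them up, consistent with the re-entering of data that worked in this thread, is to re-create those rows so they pass through the indexing hook on commit. A sketch only, assuming the module layout from the sketches above and a small posts table where new primary keys are acceptable:

from app import db            # assumed module layout, as in the sketches above
from app.models import Post

# Re-create the existing posts so they are indexed on commit.
# Note: the rows get new primary keys, so this is only reasonable for
# small tables with nothing referencing them.
posts = Post.query.all()
titles = [p.title for p in posts]

for p in posts:
    db.session.delete(p)      # ORM-level delete, one row at a time
db.session.commit()

db.session.add_all([Post(title=t) for t in titles])
db.session.commit()           # the freshly added rows are now indexed

If your version of Flask-WhooshAlchemyPlus ships a bulk re-indexing helper, that is the cleaner route; the sketch above relies only on the behaviour confirmed in this thread.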

@ToonoW ToonoW closed this May 20, 2016

fanne commented Jun 27, 2016

https://segmentfault.com/q/1010000005811334 This is the problem I ran into; I have no idea what's causing it.
@chenbotao828
@ToonoW
