Optimize knowledge base document operations #1413

Merged
merged 4 commits into from
Sep 8, 2023

Commits on Sep 4, 2023

  1. 8475a5d
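  2. Split KnowledgeFile's file2text into three parts (file2docs, docs2texts, and file2text) and, while keeping the interface unchanged, implement:

    1. Support for the chunk_size and chunk_overlap parameters
    2. Support for a custom text_splitter
    3. Support for custom docs
    Fix: CSV files do not use the text_splitter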
    liunux4odoo committed Sep 4, 2023
    93b133f
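
To illustrate the three-stage split described in the commit message above, here is a minimal sketch. It is not the project's code: `Doc`, `split_text`, and `KnowledgeFileSketch` are hypothetical stand-ins, the splitter is a naive fixed-width character splitter, and the CSV branch assumes the fix means that CSV rows bypass the text splitter.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a loaded document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def split_text(text: str, chunk_size: int = 250, chunk_overlap: int = 50) -> List[str]:
    """Naive character splitter: fixed-size windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    stop = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, stop, step)]


class KnowledgeFileSketch:
    """Hypothetical illustration of file2docs -> docs2texts -> file2text."""

    def __init__(self, text: str, is_csv: bool = False):
        self.text = text
        self.is_csv = is_csv

    def file2docs(self) -> List[Doc]:
        # Stage 1: load raw documents (assumed here: one Doc per CSV row).
        if self.is_csv:
            return [Doc(line) for line in self.text.splitlines() if line.strip()]
        return [Doc(self.text)]

    def docs2texts(
        self,
        docs: Optional[List[Doc]] = None,
        chunk_size: int = 250,
        chunk_overlap: int = 50,
        text_splitter: Optional[Callable[[str], List[str]]] = None,
    ) -> List[Doc]:
        # Stage 2: split documents into chunks; callers may pass their own docs.
        docs = docs if docs is not None else self.file2docs()
        if self.is_csv:
            return docs  # assumed reading of the fix: CSV rows are not re-split
        splitter = text_splitter or (lambda t: split_text(t, chunk_size, chunk_overlap))
        return [Doc(chunk, dict(d.metadata)) for d in docs for chunk in splitter(d.page_content)]

    def file2text(self, **kwargs) -> List[Doc]:
        # Stage 3: the original entry point, now a thin wrapper over the two stages.
        return self.docs2texts(**kwargs)
```

Keeping `file2text` as a wrapper is what lets existing callers stay unchanged while new callers pass `chunk_size`, `chunk_overlap`, a custom `text_splitter`, or their own `docs`.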

Commits on Sep 8, 2023

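  1. New features:

    - add_docs/delete_docs/update_docs in knowledge base management all support batch operations and use multithreading to improve efficiency
    - The API's rebuild-knowledge-base endpoint supports multithreading
    - add_docs accepts a parameter that controls whether vectorization continues after files are uploaded
    - add_docs/update_docs accept custom docs (as JSON). Distinguishing between complete and supplementary custom docs is under consideration for later
    - The download_doc endpoint adds a `preview` parameter, supporting either download or preview
    - kb_service adds a `save_vector_store` method to make saving the vector store easier (FAISS only; a no-op for other stores)
    - Extract the document_loader & text_splitter logic from KnowledgeFile, in preparation for vectorizing in-memory files later
    - KnowledgeFile supports caching of docs & splitted_docs, which makes customization in intermediate steps easier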
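
The batch, multithreaded behaviour listed above could look roughly like the following sketch. It uses only the standard library's `ThreadPoolExecutor`; the function name `process_files_batch` and the `process_one` callback are hypothetical and do not reflect the project's actual helpers.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Tuple

logger = logging.getLogger(__name__)


def process_files_batch(
    file_names: Iterable[str],
    process_one: Callable[[str], object],
    max_workers: int = 4,
) -> Tuple[Dict[str, object], Dict[str, str]]:
    """Run process_one on every file in a thread pool.

    Returns (succeeded, failed): results keyed by file name, and error
    messages keyed by file name. One failing file does not abort the batch.
    """
    succeeded: Dict[str, object] = {}
    failed: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its file name so results can be attributed.
        futures = {pool.submit(process_one, name): name for name in file_names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                succeeded[name] = future.result()
            except Exception as exc:
                logger.error("failed to process %s: %s", name, exc)
                failed[name] = str(exc)
    return succeeded, failed
```

Threads suit this kind of work when each file is dominated by I/O (reading from disk, calling an embedding service); collecting failures per file lets a batch endpoint report partial success.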
    
    Other:
    - Change some error output from print to logger.error
    liunux4odoo committed Sep 8, 2023
    661a0e9
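
The caching of docs & splitted_docs mentioned in this commit can be pictured with the sketch below. The class `CachedFileSketch`, its constructor arguments, and the `refresh` flag are assumptions made for illustration; only the attribute names `docs` and `splitted_docs` come from the commit message.

```python
from typing import Callable, List, Optional


class CachedFileSketch:
    """Hypothetical illustration: cache the loaded docs and the split docs."""

    def __init__(
        self,
        loader: Callable[[], List[str]],
        splitter: Callable[[List[str]], List[str]],
    ):
        self._loader = loader
        self._splitter = splitter
        self.docs: Optional[List[str]] = None
        self.splitted_docs: Optional[List[str]] = None

    def file2docs(self, refresh: bool = False) -> List[str]:
        # Load once; reload only when asked, and drop the stale split result.
        if self.docs is None or refresh:
            self.docs = self._loader()
            self.splitted_docs = None
        return self.docs

    def docs2texts(self, docs: Optional[List[str]] = None, refresh: bool = False) -> List[str]:
        # Custom docs replace the cached ones and invalidate the split cache.
        if docs is not None:
            self.docs = docs
            self.splitted_docs = None
        if self.splitted_docs is None or refresh:
            self.splitted_docs = self._splitter(self.file2docs())
        return self.splitted_docs
```

Because both stages are cached on the object, a caller can inspect or replace `docs` between loading and splitting without triggering another load.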
  2. 4cfee9c