Problem Description
Initializing the document store with init_database.py fails: only one file is loaded successfully, and all the other files fail.
(faiss) [opc@llm-test Langchain-Chatchat]$ python copy_config_example.py
(faiss) [opc@llm-test Langchain-Chatchat]$ python init_database.py --recreate-vs
recreating all vector stores
2023-11-16 09:27:00,152 - faiss_cache.py[line:80] - INFO: loading vector store in 'samples/vector_store/m3e-base' from disk.
2023-11-16 09:27:00,564 - SentenceTransformer.py[line:66] - INFO: Load pretrained SentenceTransformer: moka-ai/m3e-base
Batches: 100%|██████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 37.38it/s]
2023-11-16 09:27:01,893 - loader.py[line:54] - INFO: Loading faiss with AVX2 support.
2023-11-16 09:27:01,908 - loader.py[line:56] - INFO: Successfully loaded faiss with AVX2 support.
2023-11-16 09:27:01,922 - faiss_cache.py[line:80] - INFO: loading vector store in 'samples/vector_store/m3e-base' from disk.
Batches: 100%|██████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 41.18it/s]
2023-11-16 09:27:01,953 - migrate.py[line:77] - ERROR: ValueError: 暂未支持的文件格式 .jsonl,已跳过
2023-11-16 09:27:01,953 - migrate.py[line:77] - ERROR: ValueError: 暂未支持的文件格式 .xlsx,已跳过
2023-11-16 09:27:01,953 - migrate.py[line:77] - ERROR: ValueError: 暂未支持的文件格式 .jsonl,已跳过
2023-11-16 09:27:01,953 - migrate.py[line:77] - ERROR: ValueError: 暂未支持的文件格式 .xlsx,已跳过
2023-11-16 09:27:01,954 - utils.py[line:292] - INFO: CSVLoader used for /home/opc/LLM/Langchain-Chatchat/knowledge_base/samples/content/test_files/langchain-ChatGLM_closed.csv
2023-11-16 09:27:01,954 - utils.py[line:292] - INFO: CSVLoader used for /home/opc/LLM/Langchain-Chatchat/knowledge_base/samples/content/test_files/langchain-ChatGLM_open.csv
2023-11-16 09:27:01,954 - utils.py[line:292] - INFO: UnstructuredFileLoader used for /home/opc/LLM/Langchain-Chatchat/knowledge_base/samples/content/test_files/test.txt
文档切分示例:page_content=': 0\ntitle: 效果如何优化\nfile: 2023-04-04.00\nurl: https://github.com/imClumsyPanda/langchain-ChatGLM/issues/14\ndetail: 如图所示,将该项目的README.md和该项目结合后,回答效果并不理想,请问可以从哪些方面进行优化\nid: 0' metadata={'source': '/home/opc/LLM/Langchain-Chatchat/knowledge_base/samples/content/test_files/langchain-ChatGLM_open.csv', 'row': 0}
正在将 samples/test_files/langchain-ChatGLM_open.csv 添加到向量库,共包含323条文档
2023-11-16 09:27:02,255 - utils.py[line:373] - ERROR: RuntimeError: 从文件 samples/test_files/langchain-ChatGLM_closed.csv 加载文档时出错:Error loading /home/opc/LLM/Langchain-Chatchat/knowledge_base/samples/content/test_files/langchain-ChatGLM_closed.csv
Batches: 0%| | 0/11 [00:00<?, ?it/s]huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
2023-11-16 09:27:03,455 - utils.py[line:160] - INFO: NumExpr defaulting to 8 threads.
2023-11-16 09:27:03,735 - utils.py[line:373] - ERROR: ImportError: 从文件 samples/test_files/test.txt 加载文档时出错:libGL.so.1: cannot open shared object file: No such file or directory
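The last error in the log says the dynamic loader could not open libGL.so.1 while loading test.txt, and the tokenizers warnings suggest setting TOKENIZERS_PARALLELISM explicitly. A minimal diagnostic sketch for both, assuming it is run in the same (faiss) environment, might look like the following. The helper name `can_load_shared_library` is hypothetical and is not part of Langchain-Chatchat. If the check reports the library as missing, it is usually supplied by a system OpenGL package (for example `mesa-libGL` on RHEL-family distributions or `libgl1` on Debian-family ones); which one applies to this machine is an assumption.

```python
import ctypes
import os

# The tokenizers warning in the log suggests setting this variable explicitly.
# It only has an effect if set before the tokenizers library is first used.
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library.

    Hypothetical helper for diagnosis only; not part of Langchain-Chatchat.
    """
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the library cannot be found or opened, which matches
        # the "cannot open shared object file" message in the log.
        return False


if __name__ == "__main__":
    if can_load_shared_library("libGL.so.1"):
        print("libGL.so.1 is loadable")
    else:
        print("libGL.so.1 is missing; a system OpenGL package is probably needed")
```

This only confirms whether the library named in the ImportError is visible to the loader. It does not explain the separate RuntimeError for langchain-ChatGLM_closed.csv, whose underlying cause is not shown in the log.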