Modelcache dev #4

Merged: 3 commits, Nov 13, 2023
145 changes: 145 additions & 0 deletions README.md
Expand Up @@ -90,6 +90,151 @@ res = requests.post(url, headers=headers, json=json.dumps(data))
Coming soon...
## modules
![modelcache modules](docs/modelcache_modules_en.png)
## Function-Comparison
In terms of functionality, we have made several changes to the repository. First, to work around network issues with Hugging Face and improve inference speed, we added local inference for embeddings. Second, given the limitations of the SQLAlchemy framework, we rewrote the module that interacts with relational databases, which allows more flexible database operations. Third, because LLM products in practice often need to serve multiple users and multiple models, we added multi-tenancy support to ModelCache, along with preliminary compatibility for system commands and multi-turn dialogue.
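
To make the multi-tenancy idea concrete, here is a minimal sketch of per-model data isolation. It is an illustration only: the class `MultiTenantCache` and its methods are hypothetical names, not ModelCache's actual API, and exact-match lookup stands in for the real vector search.

```python
from typing import Dict, Optional


class MultiTenantCache:
    """Toy cache that keeps a separate store per model (tenant).

    Hypothetical illustration; not ModelCache's real interface.
    """

    def __init__(self) -> None:
        # One independent query -> answer mapping per model name.
        self._stores: Dict[str, Dict[str, str]] = {}

    def insert(self, model: str, query: str, answer: str) -> None:
        """Write an entry into the store that belongs to `model`."""
        self._stores.setdefault(model, {})[query] = answer

    def query(self, model: str, query: str) -> Optional[str]:
        """Look up `query` only within the store that belongs to `model`."""
        return self._stores.get(model, {}).get(query)

    def clear(self, model: str) -> None:
        """Drop every cached entry for one model, leaving other models untouched."""
        self._stores.pop(model, None)


cache = MultiTenantCache()
cache.insert("model_a", "hello", "hi there")
print(cache.query("model_a", "hello"))
print(cache.query("model_b", "hello"))
```

Keying every operation by model name means a hit for one model can never be served to another, and clearing one tenant's cache does not affect the rest.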

<html>
<head>
<style>
table, th, td {
border-collapse: collapse;
text-align: left;
padding: 10px;
margin-left: 20px;
margin-right: 20px;
}
.checkmark {
font-size: 24px;
}

</style>
</head>
<body>

<table>
<tr>
<th>Module</th>
<th>Function</th>
<th>ModelCache</th>
<th>GPTCache</th>
</tr>
<tr>
<td rowspan="2">Basic Interface</td>
<td>Data query interface</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>Data writing interface</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td rowspan="3">Embedding</td>
<td>Embedding model configuration</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>Large model embedding layer</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td>BERT model long text processing</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td rowspan="2">Large model invocation</td>
<td>Decoupling from large models</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td>Local loading of embedding model</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td rowspan="2">Data isolation</td>
<td>Model data isolation</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>Hyperparameter isolation</td>
<td></td>
<td></td>
</tr>
<tr>
<td rowspan="3">Databases</td>
<td>MySQL</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>Milvus</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>OceanBase</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td rowspan="3">Session management</td>
<td>Single-turn dialogue</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>System commands</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td>Multi-turn dialogue</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td rowspan="2">Data management</td>
<td>Data persistence</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>One-click cache clearance</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td rowspan="2">Tenant management</td>
<td>Support for multi-tenancy</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td>Milvus multi-collection capability</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td>Other</td>
<td>Long-short dialogue distinction</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
</table>

</body>
</html>

## Core-Features
ModelCache adopts the main idea of GPTCache and includes four core modules: adapter, embedding, similarity, and data_manager. The adapter module handles the business logic of various tasks and connects the embedding, similarity, and data_manager modules. The embedding module converts text into semantic vector representations; it transforms user queries into vector form. The rank module sorts the recalled vectors and evaluates their similarity. The data_manager module manages the databases. To better support industrial applications, we have made architectural and functional upgrades as follows:
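
As a rough illustration of the module flow described above (not the upgrade list itself, and not ModelCache's actual code), the sketch below wires a toy embedding, a similarity ranking step, and an in-memory data manager together through an adapter. All names (`embed`, `cosine`, `DataManager`, `Adapter`, `threshold`) are assumptions made for this example; a bag-of-words count vector stands in for a real embedding model, and a Python list stands in for the database.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class DataManager:
    """In-memory store of (vector, answer) rows; a real one would use databases."""

    def __init__(self) -> None:
        self.rows: List[Tuple[Counter, str]] = []

    def save(self, vector: Counter, answer: str) -> None:
        self.rows.append((vector, answer))

    def recall(self) -> List[Tuple[Counter, str]]:
        return list(self.rows)


class Adapter:
    """Connects embedding, similarity ranking, and data management."""

    def __init__(self, data_manager: DataManager, threshold: float = 0.8) -> None:
        self.data_manager = data_manager
        self.threshold = threshold  # minimum similarity to count as a cache hit

    def insert(self, query: str, answer: str) -> None:
        self.data_manager.save(embed(query), answer)

    def query(self, query: str) -> Optional[str]:
        vector = embed(query)
        # Rank recalled rows by similarity to the query, best first.
        ranked = sorted(
            ((cosine(vector, v), answer) for v, answer in self.data_manager.recall()),
            key=lambda pair: pair[0],
            reverse=True,
        )
        if ranked and ranked[0][0] >= self.threshold:
            return ranked[0][1]
        return None  # cache miss: the caller would fall back to the LLM


adapter = Adapter(DataManager())
adapter.insert("how do I reset my password", "Use the reset link.")
print(adapter.query("how do I reset my password"))
print(adapter.query("what is the weather today"))
```

The adapter is the only piece that knows about the other three, which is what lets each of them be swapped (for example, a different embedding model or database) without touching the rest.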

143 changes: 143 additions & 0 deletions README_CN.md
Expand Up @@ -90,6 +90,149 @@ res = requests.post(url, headers=headers, json=json.dumps(data))
Coming soon...
## Architecture overview
![modelcache modules](docs/modelcache_modules.png)
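## Feature comparison
In terms of functionality, to work around Hugging Face network issues and improve inference speed, we added local inference for embeddings. Given some limitations of the SQLAlchemy framework, we rewrote the module that interacts with relational databases so that database operations are more flexible. In practice, large-model products need to serve multiple users and multiple models, so ModelCache adds multi-tenancy support and preliminary compatibility with system commands and multi-turn dialogue.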
<html>
<head>
<style>
table, th, td {
border-collapse: collapse;
text-align: left;
padding: 8px;
margin-left: 20px;
margin-right: 20px;
}
.checkmark {
font-size: 24px;
}

</style>
</head>
<body>

<table>
<tr>
<th>Module</th>
<th>Function</th>
<th>ModelCache</th>
<th>GPTCache</th>
</tr>
<tr>
<td rowspan="2">Basic Interface</td>
<td>Data query interface</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>Data writing interface</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td rowspan="3">Embedding</td>
<td>Embedding model configuration</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>Large model embedding layer</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td>BERT model long text processing</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td rowspan="2">Large model invocation</td>
<td>Decoupling from large models</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td>Local loading of embedding model</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td rowspan="2">Data isolation</td>
<td>Model data isolation</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>Hyperparameter isolation</td>
<td></td>
<td></td>
</tr>
<tr>
<td rowspan="3">Databases</td>
<td>MySQL</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>Milvus</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>OceanBase</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td rowspan="3">Session management</td>
<td>Single-turn dialogue</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>System commands</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td>Multi-turn dialogue</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td rowspan="2">Data management</td>
<td>Data persistence</td>
<td class="checkmark">&#9745; </td>
<td class="checkmark">&#9745; </td>
</tr>
<tr>
<td>One-click cache clearance</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td rowspan="2">Tenant management</td>
<td>Support for multi-tenancy (multiple models)</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td>Milvus multi-collection capability</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
<tr>
<td>Other</td>
<td>Long-short dialogue distinction</td>
<td class="checkmark">&#9745; </td>
<td></td>
</tr>
</table>

</body>
</html>
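## Core features
ModelCache follows the main idea of GPTCache and includes a set of core modules: adapter, embedding, similarity, and data_manager. The adapter module handles the business logic of various tasks and connects the embedding, similarity, and data_manager modules. The embedding module converts text into semantic vector representations; it transforms user queries into vectors that are then used for recall or storage. The rank module sorts the recalled vectors and evaluates their similarity. The data_manager module manages the databases. To better support industrial applications, we have also made architectural and functional upgrades, as follows: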

Binary file modified docs/modelcache_modules_en.png