Merge pull request #26 from GAIR-NLP/dev

add example for med doc qa
GAIR-NLP · Aug 14, 2023 · bb7869c
2 parents 7e62ef3 + 883b89f

Showing 3 changed files with 27 additions and 3 deletions.
diff --git a/example/example.py b/example/example.py
@@ -45,6 +45,30 @@
                         "The aim of preprocessing is to ensure that the data is in"
                         " a format that can be used by the machine learning algorithm.\n2. Feature Selection: Once the data has been preprocessed, the next step is to select the relevant features that will be used to train the machine learning algorithm. This involves identifying the features that are most important for predicting the target variable. Feature selection can be done using various techniques, such as filter methods, wrapper methods, and embedded methods.\n3. Model Selection: After feature selection, the next step is to select the appropriate machine learning algorithm to use. There are various types of machine learning algorithms, such as supervised learning, unsupervised learning, and reinforcement learning. The choice of algorithm depends on the nature of the problem and the type of data available.\n4. Training the Model: Once the model has been selected, the next step is to train the machine learning algorithm using the preprocessed data. This involves using a training set of data to optimize the parameters of the model so that it can accurately predict the target variable.\n5. Model Evaluation: After the model has been trained, the next step is to evaluate its performance. This involves testing the model on a separate test set of data and measuring its accuracy, precision, recall, and other performance metrics.\n6. Model Deployment: Once the model has been evaluated and fine-tuned, it can be deployed in a production environment. 
This involves integrating the model into a larger system or workflow, and monitoring its performance over time.\n\nResearch papers and textbooks that support the above points include:\n\n* \"An Introduction to Machine Learning\" by Alpaydin, B., (2010) which provides a comprehensive overview of machine learning algorithms and their applications.\n* \"Data Mining: Concepts and Techniques\" by Han, J., Kamber, B., & Pei, J., (2011) which provides a detailed introduction to data mining and its applications.\n* \"Machine Learning: Trends, Perspectives, and Prospects\" by Kamber, B., & Pei, J., (2012) which provides an overview of the current trends and future prospects of machine learning.\n* \"Machine Learning for Data Mining\" by He, C., & Wang, Y., (2015) which provides an in-depth introduction to machine learning algorithms and their applications in data mining.",
                 "category": "scientific"
+            },
+            {
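+                "prompt":"""Patient: Since last month the corner of my mouth has been a little sore. What do you think is causing it?
+Doctor: Pain at the corner of the mouth can be caused by angular cheilitis or by a lack of certain vitamins. Do you normally pay attention to oral hygiene?
+Patient: I'm fairly careful about oral hygiene. I brush my teeth every day and clean with dental floss.
+Patient: It's just that work has been rather stressful lately, and I often eat fast food and takeout.
+Doctor: Have you recently split the corner of your mouth by being careless while eating?
+Patient: I haven't really noticed that. The corner of my mouth just hurt a little once or twice while eating.
+Doctor: Then let's provisionally call it angular cheilitis. I suggest you try applying an oral care ointment. If things clearly improve after a few days, angular cheilitis is probably the cause.
+Patient: Should I take vitamin supplements as well?
+Doctor: Yes, vitamins B2 and B6 are of some benefit for glossitis and angular cheilitis. You can supplement them in moderation.
+Patient: How should I usually take them each day? Are there any foods or medicines you would recommend?
+Doctor: We generally suggest buying vitamin B2 and B6 tablets at a pharmacy. You can also eat more green vegetables, legumes, and nuts, all of which are fairly rich in these two vitamins.
+Patient: OK, I understand. Thank you, doctor.""",
+                "response":"""Chief complaint: Frequent headaches since last month
+History of present illness: Since last month the patient has had pain at the corner of the mouth, with no broken skin. Work stress has been high recently and eating habits have changed, with frequent fast food and takeout; the patient also has a habit of excessive caffeine intake.
+Past medical history: The patient has a history of hypertension.
+Diagnosis: Oral disease: angular cheilitis.
+Treatment recommendations:
+1. Apply an oral care ointment once in the morning and once in the evening, with antibiotic treatment added.
+2. Supplement vitamins B2 and B6 appropriately, either by eating green vegetables, legumes, and nuts or by buying oral tablets at a pharmacy, and additionally supplement vitamin D.
+3. Maintain oral hygiene, eat less fast food and takeout, and give up the coffee habit.
+4. Reduce work stress as far as possible and keep a positive frame of mind.""",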
+                "category": "med_doc_qa",
             }
 ]
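
The entry added in this hunk uses the same three keys as the existing examples: `prompt`, `response`, and `category`. A minimal sketch of a shape check follows, assuming those three non-empty string keys are all an entry needs. `REQUIRED_KEYS` and `validate_entry` are hypothetical names for illustration, not part of the repository.

```python
from typing import List

# Hypothetical helper, not part of the repository: checks that an example
# entry carries the three string keys seen in example/example.py.
REQUIRED_KEYS = ("prompt", "response", "category")


def validate_entry(entry: dict) -> List[str]:
    """Return a list of problems found in one entry (empty if none)."""
    problems = []
    for key in REQUIRED_KEYS:
        value = entry.get(key)
        # A key counts as present only if it maps to a non-blank string.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty string for key {key!r}")
    return problems


# Placeholder stand-in for the entry added in this diff.
entry = {
    "prompt": "<doctor-patient dialogue>",
    "response": "<medical record summary>",
    "category": "med_doc_qa",
}

print(validate_entry(entry))
```

A check like this would catch a missing `category` before an entry is run, but whether the project needs one is a design decision outside this diff.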
 

diff --git a/factool/utils/prompts/agreement_verification.yaml b/factool/utils/prompts/agreement_verification.yaml
@@ -70,7 +70,7 @@ med_doc_qa:
   system: |-
     You are a brilliant assistant.
   user: |-
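-    You will receive a piece of text. Your task is to find the factual errors in it. Do not make any speculation; anything the evidence cannot directly support should be treated as an error.
+    You will receive a piece of text. Your task is to find the factual errors in it. Do not use any prior knowledge in your reasoning; anything the evidence cannot directly support should be treated as an error.
     When judging whether the text contains factual errors, you may refer to the evidence provided. The evidence may be useful, and it may also be contradictory. You must be careful when judging the factuality of the text.
     Your answer must be a dictionary with four keys, "reasoning", "factuality", "error", and "correction", corresponding to the reasoning process, the factuality of the given text (True / False), the factual error in the text, and the corrected text.
     The given text is as follows
@@ -92,7 +92,7 @@ med_doc_qa:
     [text]: {claim}
     The given evidence is as follows
     [evidence]: {evidence}
-    Reply strictly in the following format and return strictly a Python dictionary. Do not return anything else. Start your answer with ”[“.
+    Reply strictly in the following format and return strictly a Python dictionary. Do not return anything else. Start your answer with "[".
     [reply format]: 
     ["evidence 1","evidence 2","evidence 3"]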
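
The `{claim}` and `{evidence}` placeholders, together with the instruction to start the reply with `[`, suggest a fill-then-parse flow. A minimal sketch follows, assuming the placeholders are filled with `str.format` and the reply is parsed as a Python list literal. `USER_TEMPLATE`, `build_user_prompt`, and `parse_reply` are hypothetical names, and the repository's actual code may work differently.

```python
import ast
from typing import List, Optional

# Shortened stand-in for the user prompt. Only the two placeholder names
# ({claim}, {evidence}) are taken from the diff; the rest is illustrative.
USER_TEMPLATE = "[text]: {claim}\n[evidence]: {evidence}"


def build_user_prompt(claim: str, evidence: List[str]) -> str:
    """Fill the two placeholders of the template."""
    return USER_TEMPLATE.format(claim=claim, evidence=evidence)


def parse_reply(reply: str) -> Optional[list]:
    """Parse a model reply that should be a Python list literal.

    Returns the parsed list, or None when the reply does not start
    with "[" or is not a valid list literal.
    """
    text = reply.strip()
    if not text.startswith("["):
        return None
    try:
        # literal_eval only accepts literals, so it does not execute code.
        parsed = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None
    return parsed if isinstance(parsed, list) else None


print(build_user_prompt("claim text", ["evidence 1", "evidence 2"]))
print(parse_reply('["evidence 1","evidence 2","evidence 3"]'))
```

Under this assumption the quote fix matters because the prompt now shows the exact ASCII character the reply is expected to start with.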
    
diff --git a/version.py b/version.py
@@ -1 +1 @@
-__version__ = "0.1.2"
+__version__ = "0.1.3"