# Evaluation Background
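With the rapid development of artificial intelligence, the number and complexity of academic papers are growing exponentially, and extracting key information from this mass of literature efficiently and accurately has become a major challenge for researchers and industry practitioners. Traditional literature search and reading can no longer keep up, while most existing question-answering systems are still limited to shallow parsing of structured knowledge bases or short texts and cannot deeply understand the complex logic, experimental data, and theoretical derivations in academic papers. To address this, Southeast University and Huawei Noah's Ark Lab have jointly launched the evaluation "Complex Question Answering over Papers in the Artificial Intelligence Domain" (《人工智能领域论文复杂问题问答评测》), which aims to drive breakthroughs in natural language processing for deep understanding of, and reasoning over, academic literature. The task requires participating systems to answer multiple-choice questions about a given English paper, based on its full text. Each question has more than one correct answer, which tests a model's combined abilities in long-text understanding, multi-hop reasoning, and critical analysis.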

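The core challenge of this evaluation lies in the distinctive complexity of academic papers: they typically contain dense technical terminology, complex mathematical formulas, theoretical arguments that span paragraphs, and multi-dimensional experimental results. A participating model must not only identify explicit information in the text accurately, but also integrate figure and table data, method descriptions, and experimental conclusions, and even infer implicit views that the authors do not state outright. For example, a question may ask the system to judge whether the method proposed in a paper outperforms the baseline under specific conditions, and the answer may require combining the theoretical analysis in the method section, the comparison tables in the experiments, and the limitations noted in the discussion. The multiple-choice design adds further difficulty, because some options may look plausible but lack rigorous support, or hold only under specific conditions, so a large model needs strong logical rigor and robustness to distractors. We hope this evaluation will stimulate academic and industrial exploration of techniques for understanding complex academic text and provide technical support for applications such as intelligent literature review and research assistants.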

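What sets this evaluation apart is its close alignment with real academic needs: compared with traditional knowledge-graph QA or open-domain QA, it places greater emphasis on deep semantic parsing of unstructured long text and on higher-order reasoning. The multiple-choice format also makes the results easier to score. Our carefully constructed dataset covers 85 papers from top AI conferences and journals and 225 multiple-choice questions annotated by PhD students specializing in AI. Question types include experimental validation, theoretical derivation, and conclusion analysis, to ensure a comprehensive assessment of model performance. To advance large-model technology for the academic domain, Southeast University and Huawei Noah's Ark Lab are organizing this evaluation task at CCKS 2025.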

## Simple Demo
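Next, we use the File API provided by ChatGPT (OpenAI) to upload each paper, then use the paper_id of each entry in multi_choice_questions.json to find the matching paper and ask ChatGPT the corresponding question.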
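
As a rough sketch of the lookup step, the questions can be grouped by `paper_id` so that each paper is uploaded once and all of its questions are asked against the same file. The record layout (`paper_id` as a numeric string, `question` holding the stem and options) is inferred from the code later in this notebook; `group_by_paper` is a hypothetical helper and the sample records are made up for illustration.

```python
from collections import defaultdict

def group_by_paper(questions):
    """Group question records by paper_id, ordered numerically by id."""
    grouped = defaultdict(list)
    for record in questions:
        grouped[record["paper_id"]].append(record)
    # paper_id is stored as a string, so sort by its integer value
    return dict(sorted(grouped.items(), key=lambda kv: int(kv[0])))

# Hypothetical records, for illustration only
sample = [
    {"paper_id": "10", "question": "Q1\nA. ...\nB. ...\nC. ...\nD. ..."},
    {"paper_id": "2", "question": "Q2\nA. ...\nB. ...\nC. ...\nD. ..."},
    {"paper_id": "10", "question": "Q3\nA. ...\nB. ...\nC. ...\nD. ..."},
]
grouped = group_by_paper(sample)
print({pid: len(qs) for pid, qs in grouped.items()})
```

Sorting on the integer value keeps paper "2" ahead of paper "10", which a plain string sort would not.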

In [None]:

!pip install --upgrade openai
!pip install python-dotenv

In [1]:
import os
import json
import time
import openai
from openai import OpenAI
print(openai.__version__)
from dotenv import load_dotenv
load_dotenv('D:\\workspace\\pycharm_project\\paper-parser\\.env')

# Create the OpenAI client
client = OpenAI(
    api_key=os.getenv('OPENAI_API_KEY'),
    base_url=os.getenv('OPENAI_URL')  # only needed when using a custom URL
)

SYS_PROMPT="""You are an excellent paper analyzer. You will be provided with a PDF file of a research paper.
Your task is to thoroughly read and analyze the paper's content, including text, images, and tables.
Subsequently, you will be asked to determine if a given definition related to the paper is correct.
If and only if you are uncertain about a definition, you should endeavor to find evidence within the paper, potentially hidden in its tables or images, to support your answer.
Please reply in the following JSON format:
{
    "answer": "your answer, correct or incorrect",
    "reason": "your reason"
}
"""
# print(os.getenv('OPENAI_URL'))
# print(os.getenv('OPENAI_API_KEY'))

demo_file = "D:\\workspace\\pycharm_project\\paper-parser\\papers\\0\\2024.acl-long.820.pdf"
paper_base_path = "D:\\workspace\\pycharm_project\\paper-parser\\papers\\"
answers_file = "D:\\workspace\\pycharm_project\\paper-parser\\analysis_answers.json"
final_formatted_answers_file = "D:\\workspace\\pycharm_project\\paper-parser\\analysis_answers_formatted.json"

wait_time = 6
def parse_question(question):
    """Split a question record into its stem and its four options (A-D)."""
    text = question["question"]
    q = text.split("\n")[0]

    # Collect all non-empty lines
    lines = [line.strip() for line in text.split("\n") if line.strip()]

    # The first line is the question stem; the rest are options
    if len(lines) <= 1:
        return q, []
    all_options = lines[1:]

    # Record where each option starts: {option letter: index in all_options}
    option_starts = {}
    for i, line in enumerate(all_options):
        for letter in ['A', 'B', 'C', 'D']:
            if (line.startswith(f"{letter}.") or line.startswith(f"{letter} ")) and letter not in option_starts:
                option_starts[letter] = i
                break

    # If any option is missing, return an empty list
    expected_order = ['A', 'B', 'C', 'D']
    if any(letter not in option_starts for letter in expected_order):
        return q, []

    # Build the full option texts in A, B, C, D order
    ordered_definitions = []
    for i, letter in enumerate(expected_order):
        start_idx = option_starts[letter]
        # An option runs until the next option starts (or to the end of the text)
        if i + 1 < len(expected_order):
            end_idx = option_starts[expected_order[i + 1]]
        else:
            end_idx = len(all_options)
        # Join multi-line options into a single line separated by spaces
        ordered_definitions.append(" ".join(all_options[start_idx:end_idx]))

    return q, ordered_definitions

def remove_markdown(text):
    """Strip Markdown code-fence markers from the text."""
    return text.replace("```json", "").replace("```", "")

import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

def formatted_answer(answers):
    formatted_answers = []
    for a in answers:
        correct_answer = ''
        for choice in a.keys():
            if choice in ['A', 'B', 'C', 'D']:
                if a[choice]["answer"] == "correct":
                    correct_answer += choice
        formatted_answer = {
            "paper_id":  a["paper_id"],
            "question":  a["question"],
            "correct_answer": correct_answer
        }
        formatted_answers.append(formatted_answer)
    return formatted_answers

def to_file(answers=None, formatted_answers=None):
    """Write the raw and/or formatted answers to their JSON files."""
    if answers is not None:
        # Open each file once and pass the handle to json.dump
        with open(answers_file, "w", encoding="utf-8") as f:
            json.dump(answers, f, indent=2, ensure_ascii=False)
        print(f"✅ 已保存分析结果到 {answers_file}")
    if formatted_answers is not None:
        with open(final_formatted_answers_file, "w", encoding="utf-8") as f:
            json.dump(formatted_answers, f, indent=2, ensure_ascii=False)
        print(f"✅ 已保存格式化答案到 {final_formatted_answers_file}")

1.84.0
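
`remove_markdown` strips code-fence markers, but parsing still fails if the model wraps the JSON in extra prose. A slightly more tolerant sketch pulls out the first `{...}` span before parsing. This is an assumption on our part and not part of the original pipeline; `extract_json` is a hypothetical helper name and the sample reply is invented.

```python
import json
import re

def extract_json(text):
    """Return the JSON object found in text, or None if parsing fails."""
    # Greedy match from the first '{' to the last '}' (DOTALL lets '.' span newlines)
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

# Build a fenced reply without writing a literal fence in this example
fence = "`" * 3
reply = (
    "Here is my verdict:\n" + fence + "json\n"
    + '{"answer": "correct", "reason": "See Table 2."}'
    + "\n" + fence
)
parsed = extract_json(reply)
print(parsed)
```

Returning `None` instead of raising lets the caller decide whether to store the raw reply, as the demo loop does for unparseable answers.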


In [11]:
# Inspect in detail which features the OpenAI client exposes
import sys
print("Python 路径:", sys.executable)
print("OpenAI 版本:", openai.__version__)

print("\n=== 检查 client 的可用属性 ===")
print("client 的主要属性:", [attr for attr in dir(client) if not attr.startswith('_')])

print("\n=== 检查 client.beta 的可用属性 ===")
try:
    beta_attrs = [attr for attr in dir(client.beta) if not attr.startswith('_')]
    print("client.beta 的属性:", beta_attrs)
except Exception as e:
    print("❌ client.beta 不存在:", str(e))

print("\n=== 测试各个 API ===")
try:
    # Probe vector_stores
    vs = client.beta.vector_stores
    print("✅ client.beta.vector_stores 存在")
    print("vector_stores 的方法:", [attr for attr in dir(vs) if not attr.startswith('_')])
except AttributeError as e:
    print("❌ client.beta.vector_stores 不存在:", str(e))

try:
    # Probe assistants
    assistants = client.beta.assistants
    print("✅ client.beta.assistants 存在")
    print("assistants 的方法:", [attr for attr in dir(assistants) if not attr.startswith('_')])
except AttributeError as e:
    print("❌ client.beta.assistants 不存在:", str(e))

try:
    # Probe threads
    threads = client.beta.threads
    print("✅ client.beta.threads 存在")
    print("threads 的方法:", [attr for attr in dir(threads) if not attr.startswith('_')])
except AttributeError as e:
    print("❌ client.beta.threads 不存在:", str(e))

# Check whether a vector store API exists under another name
print("\n=== 查找可能的向量存储相关API ===")
for attr in dir(client.beta):
    if not attr.startswith('_') and 'vector' in attr.lower():
        print(f"找到可能的向量API: {attr}")

# Inspect the full client structure
print("\n=== client的完整结构 ===")
print("检查client是否有其他向量相关的API...")
for attr in dir(client):
    if not attr.startswith('_') and ('vector' in attr.lower() or 'store' in attr.lower()):
        print(f"找到: client.{attr}")

Python 路径: D:\Soft\tools\miniconda\python.exe
OpenAI 版本: 1.84.0

=== 检查 client 的可用属性 ===
client 的主要属性: ['api_key', 'audio', 'auth_headers', 'base_url', 'batches', 'beta', 'chat', 'close', 'completions', 'containers', 'copy', 'custom_auth', 'default_headers', 'default_query', 'delete', 'embeddings', 'evals', 'files', 'fine_tuning', 'get', 'get_api_list', 'images', 'is_closed', 'max_retries', 'models', 'moderations', 'organization', 'patch', 'platform_headers', 'post', 'project', 'put', 'qs', 'request', 'responses', 'timeout', 'uploads', 'user_agent', 'vector_stores', 'websocket_base_url', 'with_options', 'with_raw_response', 'with_streaming_response']

=== 检查 client.beta 的可用属性 ===
client.beta 的属性: ['assistants', 'chat', 'realtime', 'threads', 'with_raw_response', 'with_streaming_response']

=== 测试各个 API ===
❌ client.beta.vector_stores 不存在: 'Beta' object has no attribute 'vector_stores'
✅ client.beta.assistants 存在
assistants 的方法: ['create', 'delete', 'list', 'retrieve', 'update', 'with_r
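
The output above shows that in this SDK version `vector_stores` lives on the client itself, while `assistants` and `threads` are still under `client.beta`. One way to cope with such moves is to resolve the attribute path in one place. The sketch below uses a stand-in object rather than a real `OpenAI` client, and `resolve_api` is a hypothetical helper.

```python
from types import SimpleNamespace

def resolve_api(client, name):
    """Return client.<name> if present, else client.beta.<name>, else None."""
    api = getattr(client, name, None)
    if api is None:
        api = getattr(getattr(client, "beta", None), name, None)
    return api

# Stand-in mimicking the layout printed above (not a real OpenAI client)
fake_client = SimpleNamespace(
    vector_stores="top-level vector_stores",
    beta=SimpleNamespace(assistants="beta assistants", threads="beta threads"),
)
print(resolve_api(fake_client, "vector_stores"))
print(resolve_api(fake_client, "assistants"))
print(resolve_api(fake_client, "missing"))
```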

In [12]:
class PaperAnalyzer:
    def __init__(self):
        self.uploaded_files = {}  # uploaded files: {file_path: file_id}
        self.vector_stores = {}   # created vector stores: {file_path: vector_store_id}
        self.assistants = {}      # created assistants: {vector_store_id: assistant_id}

    def get_or_upload_file(self, file_path):
        """Get or upload a file, avoiding duplicate uploads."""
        if file_path in self.uploaded_files:
            print(f"♻️ 复用已上传的文件: {self.uploaded_files[file_path]}")
            return self.uploaded_files[file_path]

        print(f"🔄 首次上传文件: {file_path}")
        # Use a context manager so the file handle is closed after the upload
        with open(file_path, "rb") as f:
            file_response = client.files.create(
                file=f,
                purpose="assistants"
            )
        file_id = file_response.id
        self.uploaded_files[file_path] = file_id
        print(f"✅ 文件上传完成: {file_id}")
        return file_id

    def get_or_create_vector_store(self, file_path, store_name):
        """Get or create a vector store, avoiding duplicate creation."""
        if file_path in self.vector_stores:
            # print(f"♻️ 复用已有的 Vector Store: {self.vector_stores[file_path]}")
            return self.vector_stores[file_path]

        print(f"🔄 创建新的 Vector Store: {store_name}")
        vector_store = client.vector_stores.create(name=store_name)
        vector_store_id = vector_store.id

        # Upload the file to the vector store; close the handle afterwards
        with open(file_path, "rb") as f:
            file_batch = client.vector_stores.file_batches.upload_and_poll(
                vector_store_id=vector_store_id,
                files=[f]
            )

        self.vector_stores[file_path] = vector_store_id
        print(f"✅ Vector Store 创建完成: {vector_store_id}, 状态: {file_batch.status}")
        return vector_store_id

    def get_or_create_assistant(self, vector_store_id, file_id):
        """Get or create an assistant, avoiding duplicate creation."""
        if vector_store_id in self.assistants:
            # print(f"♻️ 复用已有的助手: {self.assistants[vector_store_id]}")
            return self.assistants[vector_store_id]

        print("🔄 创建新的助手...")
        params = dict(
            name="Paper Analyzer",
            description="A paper reader",
            instructions=SYS_PROMPT,
            model="gpt-4-turbo",
            tools=[
                {"type": "file_search"},
                {"type": "code_interpreter"}
            ],
            tool_resources={
                "file_search": {"vector_store_ids": [vector_store_id]},
                "code_interpreter": {"file_ids": [file_id]}
            }
        )

        # Prefer a top-level assistants API if the SDK exposes one;
        # otherwise fall back to the beta namespace.
        if hasattr(client, "assistants"):
            api, ok_msg = client.assistants, "✅ 使用正式 assistants API 创建成功"
        else:
            api, ok_msg = client.beta.assistants, "✅ 使用 beta assistants API 创建成功"

        # The original retry path used the same model (gpt-4-turbo), so the
        # second attempt simply repeats the call with identical parameters.
        last_error = None
        for _ in range(2):
            try:
                assistant = api.create(**params)
            except Exception as e:
                last_error = e
                print(f"❌ 创建助手失败: {e}")
                continue
            print(ok_msg)
            self.assistants[vector_store_id] = assistant.id
            print(f"✅ 助手创建完成: {assistant.id}")
            return assistant.id

        print(f"❌ 所有模型都失败了: {last_error}")
        return None

    @staticmethod
    def _threads_api():
        """Return client.threads if the SDK exposes it, otherwise client.beta.threads."""
        return client.threads if hasattr(client, "threads") else client.beta.threads

    def create_thread(self):
        """Create a conversation thread, preferring the non-beta API."""
        try:
            return self._threads_api().create()
        except Exception as e:
            print(f"❌ 创建线程失败: {e}")
            return None

    def send_message(self, thread_id, content):
        """Send a user message to the thread, preferring the non-beta API."""
        try:
            return self._threads_api().messages.create(
                thread_id=thread_id,
                role="user",
                content=content
            )
        except Exception as e:
            print(f"❌ 发送消息失败: {e}")
            return None

    def run_assistant(self, thread_id, assistant_id):
        """Start an assistant run on the thread, preferring the non-beta API."""
        try:
            return self._threads_api().runs.create(
                thread_id=thread_id,
                assistant_id=assistant_id
            )
        except Exception as e:
            print(f"❌ 运行助手失败: {e}")
            return None

    def get_run_status(self, thread_id, run_id):
        """Retrieve the run object (and its status), preferring the non-beta API."""
        try:
            return self._threads_api().runs.retrieve(
                thread_id=thread_id,
                run_id=run_id
            )
        except Exception as e:
            print(f"❌ 获取运行状态失败: {e}")
            return None

    def get_messages(self, thread_id):
        """List the messages in the thread, preferring the non-beta API."""
        try:
            return self._threads_api().messages.list(thread_id=thread_id)
        except Exception as e:
            print(f"❌ 获取消息失败: {e}")
            return None

    def cleanup_resources(self):
        """Clean up all created resources (best effort: failures are ignored)."""
        print("🧹 清理资源...")

        # Delete assistants
        for assistant_id in self.assistants.values():
            try:
                try:
                    client.assistants.delete(assistant_id)
                    print(f"🗑️ 已删除助手 (正式API): {assistant_id}")
                except AttributeError:
                    client.beta.assistants.delete(assistant_id)
                    print(f"🗑️ 已删除助手 (beta API): {assistant_id}")
            except Exception:
                pass

        # Delete vector stores
        for vector_store_id in self.vector_stores.values():
            try:
                client.vector_stores.delete(vector_store_id)
                print(f"🗑️ 已删除 Vector Store: {vector_store_id}")
            except Exception:
                pass

        # Delete uploaded files
        for file_id in self.uploaded_files.values():
            try:
                client.files.delete(file_id)
                print(f"🗑️ 已删除文件: {file_id}")
            except Exception:
                pass

        print("✅ 资源清理完成")

# Extend PaperAnalyzer with per-paper-ID resource cleanup
class PaperAnalyzerEnhanced(PaperAnalyzer):
    def __init__(self):
        super().__init__()
        self.paper_resources = {}  # resources per paper ID: {paper_id: {'file_path': str, 'file_id': str, 'vector_store_id': str}}

    def get_or_upload_file(self, file_path, paper_id=None):
        """Get or upload a file, avoiding duplicate uploads, and record the paper-ID mapping."""
        if file_path in self.uploaded_files:
            print(f"♻️ 复用已上传的文件: {self.uploaded_files[file_path]}")
            file_id = self.uploaded_files[file_path]
        else:
            print(f"🔄 首次上传文件: {file_path}")
            # Use a context manager so the file handle is closed after the upload
            with open(file_path, "rb") as f:
                file_response = client.files.create(
                    file=f,
                    purpose="assistants"
                )
            file_id = file_response.id
            self.uploaded_files[file_path] = file_id
            print(f"✅ 文件上传完成: {file_id}")

        # Record the mapping between the paper ID and its resources
        if paper_id is not None:
            if paper_id not in self.paper_resources:
                self.paper_resources[paper_id] = {}
            self.paper_resources[paper_id]['file_path'] = file_path
            self.paper_resources[paper_id]['file_id'] = file_id

        return file_id

    def get_or_create_vector_store(self, file_path, store_name, paper_id=None):
        """Get or create a vector store, avoiding duplicate creation, and record the paper-ID mapping."""
        if file_path in self.vector_stores:
            print(f"♻️ 复用已有的 Vector Store: {self.vector_stores[file_path]}")
            vector_store_id = self.vector_stores[file_path]
        else:
            print(f"🔄 创建新的 Vector Store: {store_name}")
            vector_store = client.vector_stores.create(name=store_name)
            vector_store_id = vector_store.id

            # Upload the file to the vector store; close the handle afterwards
            with open(file_path, "rb") as f:
                file_batch = client.vector_stores.file_batches.upload_and_poll(
                    vector_store_id=vector_store_id,
                    files=[f]
                )

            self.vector_stores[file_path] = vector_store_id
            print(f"✅ Vector Store 创建完成: {vector_store_id}, 状态: {file_batch.status}")

        # Record the mapping between the paper ID and its resources
        if paper_id is not None:
            if paper_id not in self.paper_resources:
                self.paper_resources[paper_id] = {}
            self.paper_resources[paper_id]['vector_store_id'] = vector_store_id

        return vector_store_id

    def cleanup_paper_resources(self, paper_id):
        """Clean up the resources (file and vector store) of the given paper ID, keeping the assistant."""
        if paper_id not in self.paper_resources:
            print(f"⚠️ 论文ID {paper_id} 没有找到对应的资源")
            return

        paper_resource = self.paper_resources[paper_id]
        print(f"🧹 开始清理论文 {paper_id} 的资源...")

        # Delete the vector store
        if 'vector_store_id' in paper_resource:
            vector_store_id = paper_resource['vector_store_id']
            try:
                client.vector_stores.delete(vector_store_id)
                print(f"🗑️ 已删除 Vector Store: {vector_store_id}")

                # Remove it from the mapping
                file_path = paper_resource.get('file_path')
                if file_path and file_path in self.vector_stores:
                    del self.vector_stores[file_path]
            except Exception as e:
                print(f"⚠️ 删除 Vector Store 失败: {e}")

        # Delete the uploaded file
        if 'file_id' in paper_resource:
            file_id = paper_resource['file_id']
            try:
                client.files.delete(file_id)
                print(f"🗑️ 已删除文件: {file_id}")

                # Remove it from the mapping
                file_path = paper_resource.get('file_path')
                if file_path and file_path in self.uploaded_files:
                    del self.uploaded_files[file_path]
            except Exception as e:
                print(f"⚠️ 删除文件失败: {e}")

        # Remove the paper from the paper-resource mapping
        del self.paper_resources[paper_id]
        print(f"✅ 论文 {paper_id} 的资源清理完成")

    def cleanup_all_papers_keep_assistants(self):
        """Clean up the files and vector stores of all papers, keeping the assistants."""
        paper_ids = list(self.paper_resources.keys())
        for paper_id in paper_ids:
            self.cleanup_paper_resources(paper_id)
        print("✅ 所有论文资源清理完成，助手已保留")

    def get_paper_resource_info(self, paper_id=None):
        """Return the resources of one paper, or an overall summary when no paper_id is given."""
        if paper_id:
            if paper_id in self.paper_resources:
                return self.paper_resources[paper_id]
            else:
                print(f"⚠️ 论文ID {paper_id} 没有找到")
                return None
        else:
            return {
                "论文资源": self.paper_resources,
                "助手数量": len(self.assistants),
                "文件数量": len(self.uploaded_files),
                "Vector Store数量": len(self.vector_stores)
            }

# Replace the original analyzer with the enhanced version
paper_analyzer = PaperAnalyzerEnhanced()
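
The enhanced analyzer keeps one uploaded file and one vector store per paper, and the demo function releases the oldest paper once more than two files are held. The same idea in isolation, as a minimal sketch: `BoundedResourceCache`, `create`, and `destroy` are hypothetical names, and the callbacks stand in for the real upload and delete calls.

```python
from collections import OrderedDict

class BoundedResourceCache:
    """Get-or-create cache that evicts the oldest entry beyond max_items."""

    def __init__(self, create, destroy, max_items=2):
        self._create = create      # called as create(key) -> resource
        self._destroy = destroy    # called as destroy(resource) on eviction
        self._max_items = max_items
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            self._items[key] = self._create(key)
            # Evict in insertion order (FIFO) until we are back under the limit
            while len(self._items) > self._max_items:
                _, oldest = self._items.popitem(last=False)
                self._destroy(oldest)
        return self._items[key]

deleted = []
cache = BoundedResourceCache(create=lambda k: f"file-{k}", destroy=deleted.append)
for paper_id in ["0", "0", "1", "2"]:
    cache.get(paper_id)
print(deleted)
```

Because the questions are processed in `paper_id` order, first-in-first-out eviction is enough here: a paper that has been released is not requested again.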

In [13]:
# Update the demo function to use the enhanced analyzer
def run_enhanced_demo():
    answers = []
    try:
        # Load the questions
        with open("D:\\workspace\\pycharm_project\\paper-parser\\multi_choice_questions.json", encoding='utf-8') as f:
            multi_choice_questions = json.load(f)
        sorted_questions = sorted(multi_choice_questions, key=lambda x: int(x["paper_id"]))

        for paper in sorted_questions:
            paper_id = paper["paper_id"]
            q, definitions = parse_question(paper)
            answers_details = {
                "paper_id": paper_id,
                "question": q,
            }

            file_dir = os.path.join(paper_base_path, paper_id)
            file_path = os.path.join(file_dir, os.listdir(file_dir)[0])
            print(f"📄 论文文件路径: {file_path}")

            # Use the enhanced analyzer, passing in the paper ID
            file_id = paper_analyzer.get_or_upload_file(file_path, paper_id=paper_id)
            vector_store_id = paper_analyzer.get_or_create_vector_store(file_path, f"vector_store_{paper_id}", paper_id=paper_id)
            assistant_id = paper_analyzer.get_or_create_assistant(vector_store_id, file_id)

            if not assistant_id:
                print("❌ 无法创建助手，跳过...")
                # Skip this question instead of aborting the whole run
                continue

            # Create a thread
            thread = paper_analyzer.create_thread()
            if not thread:
                print("❌ 无法创建线程，跳过...")
                continue

            print(f"❓ 问题: {q}")
            print(f"📋 定义数量: {len(definitions)}")

            for idx, definition in enumerate(definitions):
                print(f"\n🔄 处理定义 {idx+1}: {definition[:100]}...")

                try:
                    # Send the message
                    message = paper_analyzer.send_message(
                        thread.id,
                        f"{q}\n{definition}"
                    )

                    if not message:
                        print("❌ 发送消息失败")
                        continue

                    # Run the assistant
                    run = paper_analyzer.run_assistant(thread.id, assistant_id)

                    if not run:
                        print("❌ 运行助手失败")
                        continue

                    # Wait for completion
                    # print("⏳ 等待助手回复...")
                    max_attempts = 30
                    attempts = 0

                    while attempts < max_attempts:
                        run_status = paper_analyzer.get_run_status(thread.id, run.id)

                        if not run_status:
                            print("❌ 无法获取运行状态")
                            break

                        status_icons = {
                            "queued": "⏳",
                            "in_progress": "🔄",
                            "completed": "✅",
                            "failed": "❌",
                            "cancelled": "🚫",
                            "requires_action": "🔧"
                        }
                        icon = status_icons.get(run_status.status, "🔍")
                        print(f"{icon} 状态: {run_status.status}")

                        # Stop polling on any terminal status, including expired/incomplete runs
                        if run_status.status in ["completed", "failed", "cancelled", "expired", "incomplete"]:
                            break
                        elif run_status.status == "requires_action":
                            print(f"🔧 需要操作: {run_status.required_action}")
                            break

                        time.sleep(wait_time)
                        attempts += 1

                    # Skip this option if the run status could not be retrieved
                    if not run_status:
                        continue

                    if run_status.status == "completed":
                        # Fetch the reply
                        messages = paper_analyzer.get_messages(thread.id)
                        if messages:
                            for msg in messages.data:
                                if msg.role == "assistant":
                                    response_text = msg.content[0].text.value
                                    # print(f"\n💡 助手回复: {response_text}")

                                    try:
                                        resp = json.loads(remove_markdown(response_text))
                                        answers_details[definition[0]] = {
                                            "answer": resp["answer"],
                                            "reason": resp["reason"]
                                        }
                                        print(f"✅ 答案解析成功: {resp['answer']}")
                                    except (json.JSONDecodeError, KeyError):
                                        print("⚠️ 回复不是有效的JSON格式，保存原始回复")
                                        # Key by option letter, consistent with the success path
                                        answers_details[definition[0]] = {
                                            "answer": "解析失败",
                                            "reason": response_text
                                        }
                                    break
                    else:
                        print(f"❌ 运行未完成，状态: {run_status.status}")
                        if hasattr(run_status, 'last_error') and run_status.last_error:
                            print(f"错误详情: {run_status.last_error}")

                except Exception as e:
                    print(f"❌ 处理定义时出错: {e}")
                    continue

            answers.append(answers_details)
            fo_answers = formatted_answer(answers)
            to_file(answers, fo_answers)
            # Show resource information
            print("\n📊 当前资源状态:")
            print(json.dumps(paper_analyzer.get_paper_resource_info(), indent=2, ensure_ascii=False))
            resource_status = paper_analyzer.get_paper_resource_info()
            if resource_status['文件数量'] > 2:
                removed_id = list(resource_status["论文资源"].keys())[0]
                cleanup_paper_by_id(removed_id)
                print(f"🗑️ 已清理论文ID: {removed_id}")

        # print("\n🎉 === 最终结果 ===")
        # print(json.dumps(answers, indent=2, ensure_ascii=False))

        return answers

    except Exception as e:
        print(f"❌ 演示运行出错: {e}")
        import traceback
        traceback.print_exc()
        return answers

# Cleanup functions
def cleanup_paper_by_id(paper_id):
    """Clean up the resources of the given paper ID."""
    paper_analyzer.cleanup_paper_resources(paper_id)

def cleanup_all_papers():
    """Clean up all paper resources but keep the assistants."""
    paper_analyzer.cleanup_all_papers_keep_assistants()

def cleanup_everything():
    """Clean up all resources, including the assistants."""
    paper_analyzer.cleanup_resources()

print("✅ 增强版演示函数已准备好")
print("🎯 运行 run_enhanced_demo() 来测试新功能")
print("🧹 运行 cleanup_paper_by_id('0') 来清理论文0的资源")
print("🧹 运行 cleanup_all_papers() 来清理所有论文但保留助手")
print("🧹 运行 cleanup_everything() 来清理所有资源")


✅ 增强版演示函数已准备好
🎯 运行 run_enhanced_demo() 来测试新功能
🧹 运行 cleanup_paper_by_id('0') 来清理论文0的资源
🧹 运行 cleanup_all_papers() 来清理所有论文但保留助手
🧹 运行 cleanup_everything() 来清理所有资源
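
The status loop inside `run_enhanced_demo` can be factored into a small helper that is easy to test without network access. This is a sketch under assumptions: `poll_until_done`, `get_status`, and the `sleep` parameter are hypothetical names, and the terminal set is modelled on the statuses the demo loop checks rather than on an exhaustive list.

```python
import time

# Statuses on which polling stops (an assumption modelled on the demo loop)
TERMINAL = {"completed", "failed", "cancelled", "requires_action"}

def poll_until_done(get_status, interval=6.0, max_attempts=30, sleep=time.sleep):
    """Call get_status() until it returns a terminal status or attempts run out."""
    status = None
    for _ in range(max_attempts):
        status = get_status()
        # None means the status could not be fetched; give up immediately
        if status is None or status in TERMINAL:
            break
        sleep(interval)
    return status

# Simulated run that is queued, then in progress, then completed
statuses = iter(["queued", "in_progress", "completed"])
result = poll_until_done(lambda: next(statuses), sleep=lambda seconds: None)
print(result)
```

Passing `sleep` as a parameter lets a test replace the real delay with a no-op, while the defaults match the 6-second interval and 30 attempts used in the notebook.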


In [14]:
# cleanup_updated()
answers = run_enhanced_demo()

📄 论文文件路径: D:\workspace\pycharm_project\paper-parser\papers\0\2024.acl-long.820.pdf
🔄 首次上传文件: D:\workspace\pycharm_project\paper-parser\papers\0\2024.acl-long.820.pdf
✅ 文件上传完成: file-Bwmr1R7g73BLDhJpwern1e
🔄 创建新的 Vector Store: vector_store_0
✅ Vector Store 创建完成: vs_68462c69a818819197ed97b21d7c35ec, 状态: completed
🔄 创建新的助手...
✅ 使用 beta assistants API 创建成功
✅ 助手创建完成: asst_XXnwzUvvUm73pmmTldifAuxR
❓ 问题: Which of the following factors may cause multilingual large language models to show English bias when processing non-English languages?
📋 定义数量: 4

🔄 处理定义 1: A. The model's training data mainly consists of English text....
⏳ 状态: queued
🔄 状态: in_progress
✅ 状态: completed
✅ 答案解析成功: correct

🔄 处理定义 2: B. The model uses English as the central language in the middle layer for semantic understanding and...
⏳ 状态: queued
✅ 状态: completed
✅ 答案解析成功: correct

🔄 处理定义 3: C. In the model's word embedding space, English word embeddings are more densely distributed and eas...
⏳ 状态: queued
✅ 状态: completed
✅ 答案解析成

In [None]:
import json
formatted_answer(answers)
to_file(answers, formatted_answer(answers))
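
To make the output format concrete, here is a self-contained sketch of how per-option verdicts collapse into the `correct_answer` string. It mirrors the logic of `formatted_answer` but is not the same function; `collapse_verdicts` is a hypothetical name and the sample record is invented for illustration.

```python
def collapse_verdicts(record):
    """Join the letters of all options judged 'correct', in A-D order."""
    return "".join(
        letter
        for letter in "ABCD"
        if record.get(letter, {}).get("answer") == "correct"
    )

# Hypothetical per-option results for one question
record = {
    "paper_id": "0",
    "question": "Which factors ...?",
    "A": {"answer": "correct", "reason": "..."},
    "B": {"answer": "incorrect", "reason": "..."},
    "C": {"answer": "correct", "reason": "..."},
    "D": {"answer": "解析失败", "reason": "..."},
}
print(collapse_verdicts(record))
```

An option whose reply could not be parsed is treated the same as an incorrect one, so it never contributes a letter to the final answer.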

In [None]:
cleanup_everything()