In [1]:
!pip install langchain_community langchain_openai graphrag faiss-cpu pdfminer.six

Collecting langchain_community
  Downloading langchain_community-0.3.26-py3-none-any.whl.metadata (2.9 kB)
Collecting langchain_openai
  Downloading langchain_openai-0.3.27-py3-none-any.whl.metadata (2.3 kB)
Collecting graphrag
  Downloading graphrag-2.3.0-py3-none-any.whl.metadata (6.6 kB)
Collecting faiss-cpu
  Downloading faiss_cpu-1.11.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (4.8 kB)
Collecting pdfminer.six
  Downloading pdfminer_six-20250506-py3-none-any.whl.metadata (4.2 kB)
Collecting langchain-core<1.0.0,>=0.3.66 (from langchain_community)
  Downloading langchain_core-0.3.67-py3-none-any.whl.metadata (5.8 kB)
Collecting langchain<1.0.0,>=0.3.26 (from langchain_community)
  Downloading langchain-0.3.26-py3-none-any.whl.metadata (7.8 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain_community)
  Downloading pydantic_settings-2.10.1-py3-none-any.whl.metadata (3.4 kB)
Collecting httpx-sse<1.0.0,>=0.4.0 (from langchain_community)
  Downloading 

In [2]:
from langchain_core.output_parsers import StrOutputParser
from langchain_community.document_loaders import WebBaseLoader, PDFMinerLoader
from dotenv import load_dotenv
import os
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain import hub
from langchain_core.runnables import RunnablePassthrough

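def get_model(temperature=0.9):
    """Return a (chat model, embedding model) pair for an OpenAI-compatible endpoint."""
    # Read credentials from the environment (or a local .env file) instead of
    # hardcoding a secret key in the notebook.
    load_dotenv()
    api_key = os.getenv('API_KEY')
    base_url = os.getenv('BASE_URL', 'https://open.bigmodel.cn/api/paas/v4')
    if not api_key:
        raise RuntimeError('Set API_KEY in the environment or in a .env file')
    return ChatOpenAI(
        api_key=api_key,
        base_url=base_url,
        # 'glm-4-plus'
        model='glm-z1-flash',
        temperature=temperature
    ), OpenAIEmbeddings(
        api_key=api_key,
        base_url=base_url,
        model='embedding-3',
    )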


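def get_vector_store(web_path, embedding):
    """Load a web page or PDF, split it into chunks, and index the chunks in FAISS."""
    # Paths ending in .pdf go through PDFMinerLoader; everything else is treated as a web page.
    loader = PDFMinerLoader(web_path) if web_path.endswith('.pdf') else WebBaseLoader(web_path)
    docs = loader.load()
    # 1000-character chunks with a 100-character overlap between neighbouring chunks.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    split_docs = splitter.split_documents(docs)
    return FAISS.from_documents(split_docs, embedding)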


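def load_vector_store(web_path, embedding):
    """Return the FAISS index for web_path, building and caching it on first use."""
    # Use the last path segment, minus any extension, as the local directory name.
    # Stripping a trailing '/' avoids an empty name for URLs such as '.../some-post/'.
    trimmed = web_path.rstrip('/')
    slash_index = trimmed.rfind('/') + 1
    dot_index = trimmed.rfind('.', slash_index)
    store_name = trimmed[slash_index:] if dot_index < 0 else trimmed[slash_index:dot_index]
    print(store_name)
    if os.path.exists(store_name):
        # Message: "Loading local vector store"
        print('加载本地向量数据库', store_name)
        # This flag allows unpickling the saved index; only load files you created yourself.
        return FAISS.load_local(store_name, embedding, allow_dangerous_deserialization=True)
    vector_store = get_vector_store(web_path, embedding)
    vector_store.save_local(store_name)
    # Message: "Local vector store saved successfully"
    print('本地向量数据库保存成功', store_name)
    return vector_store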
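
The directory name derived in `load_vector_store` keeps URL escapes such as `%20`, which is why the PDF store is saved as `The%20Era%20of%20Experience%20Paper`. As a rough sketch of an alternative, the standard library can decode the escapes and drop any query string. The helper name `store_name_from_url` and the `vector_store` fallback are assumptions made for illustration, not part of the notebook.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def store_name_from_url(url: str) -> str:
    """Hypothetical helper: derive a local directory name from a URL or path."""
    # urlparse separates the path from the query string and fragment.
    path = unquote(urlparse(url).path).rstrip('/')
    # PurePosixPath.stem is the last segment without its final extension.
    stem = PurePosixPath(path).stem
    # Replace spaces so the name is easy to handle in a shell.
    name = stem.replace(' ', '_')
    # Fall back to the host name when the URL has no path at all.
    return name or urlparse(url).netloc or 'vector_store'


print(store_name_from_url('https://hf-mirror.com/blog/zh/eu-ai-act-for-oss-developers'))
print(store_name_from_url('https://hawstein.com/2014/03/06/make-thiner-make-friend-with-time/'))
```

Switching to a helper like this would change the directory names, so stores saved under the old names would be rebuilt on the next run.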

In [3]:
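def rag_chain(path):
    """Build a retrieval-augmented chain for the page or PDF at path."""
    model, embedding = get_model()
    # print(embedding.embed_query('如何开启心智'))  # query text: "how to open up the mind"
    vector_store = load_vector_store(path, embedding)
    # Retrieve the 5 most similar chunks for each question.
    retriever = vector_store.as_retriever(search_kwargs={'k': 5})
    prompt = hub.pull('rlm/rag-prompt')
    # The retriever fills "context"; the question is passed through unchanged.
    chain = (
        {"context": retriever, "question": RunnablePassthrough()}
        | prompt
        | model
        | StrOutputParser()
    )
    return chain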

# web_path = 'https://hawstein.com/2014/03/06/make-thiner-make-friend-with-time/'
web_path = 'https://hf-mirror.com/blog/zh/eu-ai-act-for-oss-developers'
web_rag_chain = rag_chain(web_path)
# Question: "What do you know about AI legislation?"
resp = web_rag_chain.invoke('你知道什么关于人工智能的法案？')
print(resp)

pdf_path = 'https://storage.googleapis.com/deepmind-media/Era-of-Experience%20/The%20Era%20of%20Experience%20Paper.pdf'
pdf_rag_chain = rag_chain(pdf_path)
# Question: "What is RL?"
resp = pdf_rag_chain.invoke('什么是RL？')
print(resp)

eu-ai-act-for-oss-developers
本地向量数据库保存成功 eu-ai-act-for-oss-developers




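<think>
OK, I need to answer the user's question about the EU Artificial Intelligence Act. First, the user's question was asked in Chinese, so the answer should be in Chinese as well. Based on the provided context, I need to extract the relevant information from multiple documents.

First, the documents mention that the Act classifies systems by risk level: unacceptable, high risk, limited risk, and minimal risk. In particular, for limited-risk systems, such as chatbots or image-generating models, developers must disclose AI interaction and mark synthetic content. In addition, the Act requires that GPAI models (such as models trained with more than 10^25 FLOPs of compute) notify the European Commission, and they may be assessed as posing systemic risk. It also mentions that a code of practice is published in May 2025 and that the related obligations apply from August 2025. I need to keep the answer concise, no more than three sentences, covering the main points: risk classification, limited-risk obligations, GPAI obligations, and the effective date.
</think>
The EU Artificial Intelligence Act classifies AI systems by risk level; limited-risk systems must disclose AI interaction and mark synthetic content, GPAI models must notify the European Commission and may be reviewed for systemic risk, and the obligations apply from August 2025.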
The%20Era%20of%20Experience%20Paper
本地向量数据库保存成功 The%20Era%20of%20Experience%20Paper




<think>
Okay, let's see. The user is asking "什么是RL？" which translates to "What is RL?" in Chinese. RL probably stands for Reinforcement Learning, but I need to check the context provided.

Looking at the documents, they mention "experiential learning" several times. The context talks about agents learning through experience, taking sequences of steps to maximize long-term success. There's a comparison with current AI systems that respond immediately without considering future consequences. The documents also mention actions and observations grounded in the environment, which sounds like RL, where agents learn by interacting with an environment and receiving rewards based on actions.

However, the term "RL" isn't explicitly defined here. The context is about experiential learning, which might be a broader concept. But since RL is the most likely expansion of RL, and the context describes elements like actions, rewards, and long-term goals, I can infer that RL here refers to Reinforcemen
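
The replies printed above include the model's `<think>` reasoning because `StrOutputParser` returns the message text unchanged. A minimal sketch of a post-processing step that keeps only the final answer follows. The helper name `strip_think` is hypothetical, and the tag format is assumed to match what appears in the output above.

```python
import re

# Matches one reasoning block, including its tags and any trailing whitespace.
_THINK_BLOCK = re.compile(r'<think>.*?</think>\s*', flags=re.DOTALL)


def strip_think(text: str) -> str:
    """Hypothetical helper: remove <think>...</think> blocks from a model reply."""
    cleaned = _THINK_BLOCK.sub('', text)
    # A reply cut off mid-reasoning has an opening tag but no closing tag;
    # drop everything from that tag onward, since no final answer follows it.
    cleaned = re.sub(r'<think>.*\Z', '', cleaned, flags=re.DOTALL)
    return cleaned.strip()


sample = '\n\n<think>\nreasoning goes here\n</think>\nFinal answer.'
print(strip_think(sample))
```

One way to use it would be to append `| strip_think` after `StrOutputParser()` in `rag_chain`, since LangChain's pipe operator accepts plain functions. That wiring has not been run against this notebook.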

In [4]:
web_rag_chain = rag_chain(web_path)
# Question: "What obligations do developers of limited-risk AI systems have?"
resp = web_rag_chain.invoke('有限风险人工智能系统的开发者需要什么义务？')
print(resp)

pdf_rag_chain = rag_chain(pdf_path)
# Question: "What did AlphaZero discover?"
resp = pdf_rag_chain.invoke('AlphaZero发现了什么？')
print(resp)

eu-ai-act-for-oss-developers
加载本地向量数据库 eu-ai-act-for-oss-developers




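<think>
Hmm, the user is asking what obligations developers of limited-risk AI systems have. I need to read the provided context carefully first and find the relevant information.

First, the first document says that developers of limited-risk AI systems must disclose to users that they are interacting with an AI system, unless this is obvious. This must be done in a clear and detailed way, because end users may not have a technical background. In addition, synthetic content such as generated text and images must be marked, clearly labelled as artificially generated, and in a machine-readable format. Existing tools such as Gradio's watermarking feature can help with this. Also, deployers who use these systems need to inform users as well, especially where emotion recognition and biometric identification are involved, must handle personal data, and must disclose deepfake content.

Next, the second document says that limited-risk systems need to meet transparency requirements, and that models are easier to integrate if developers have already provided adequate documentation. But this is probably more about compliance preparation than a direct answer about the obligations.

The third document is mainly about copyright and personal data regulations, which may have little to do with the user's question. It mentions that developers should refer to Hugging Face tools, such as opt-out mechanisms and personal data removal tools, but it is unclear whether these are part of the obligations.

The fourth document describes the obligations for each risk level in detail; limited risk requires transparency, such as disclosure to users and marking synthetic content. It also mentions deployers' obligations, but the user's question is only about developers, so a distinction may be needed.

The quick summary in the fifth document says the obligations include providing clear documentation, adding tools that disclose model information, and complying with copyright and privacy rules. Hugging Face tools such as model cards and watermarking tools help with compliance.

Taken together, the main obligations are disclosure to users and marking synthetic content. I need to keep the answer concise, in three sentences, covering these two points and mentioning tools such as Gradio's watermarking feature. Also note that the user might confuse developers with deployers, but the question is clearly about developers, so mention only developers' obligations, such as disclosure and marking, plus tool support.
</think>
Developers of limited-risk AI systems must clearly disclose to users that they are interacting with an AI (unless this is obvious) and mark synthetic content (such as text and images) as artificially generated, in a machine-readable format; tools such as Gradio can help with this.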
The%20Era%20of%20Experience%20Paper
加载本地向量数据库 The%20Era%20of%20Experience%20Paper




<think>
Okay, let's tackle this question. The user is asking what AlphaZero discovered. The context provided has several documents, all from the same source, so I need to parse through them.

Looking at the first document, the page content mentions "learning a language, or achieving a scientific breakthrough." But that's pretty general. The second document talks about AlphaProof using RL to generate mathematical proofs beyond existing ones. The third mentions AlphaZero's scalability in RL with self-play in games like chess and Go. The fourth discusses AlphaProof learning different theorem-proof methods. The fifth and sixth documents highlight that AlphaZero discovered new strategies in chess and Go, changing human play.

So the key points here are AlphaZero's new strategies in chess and Go, leading to a paradigm shift. The answer should be concise, three sentences max. Need to make sure it's accurate based on the context. The main discovery is the new strategies in those games, which i
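
The reasoning above refers to the "page content" of each document, which suggests the retrieved documents reached the prompt as whole objects rather than as plain text. A common pattern is to join only the text of each chunk before it is placed in the prompt. The sketch below uses a stand-in class, `StubDocument`, as an assumption so that it runs without LangChain installed. Real retrieved documents expose the same `page_content` attribute.

```python
from dataclasses import dataclass, field


@dataclass
class StubDocument:
    """Stand-in for a retrieved document; only page_content is used here."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Join the text of the retrieved documents, separated by blank lines."""
    return '\n\n'.join(doc.page_content for doc in docs)


docs = [
    StubDocument('First chunk.', {'source': 'a'}),
    StubDocument('Second chunk.', {'source': 'a'}),
]
print(format_docs(docs))
```

In `rag_chain`, this would presumably be wired in as `{"context": retriever | format_docs, "question": RunnablePassthrough()}`. Doing so changes the prompt text, so the replies may differ from those shown here.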