# **BUILD A RAG PIPELINE USING ZEROENTROPY & RETAB**

[ZeroEntropy](https://www.zeroentropy.dev/) provides an AI-powered document retrieval pipeline for indexing, searching, and retrieving documents. This notebook uploads a PDF to a collection and then queries it at three granularities: documents, pages, and snippets.

_Agentic retrieval refers to a retrieval system that actively chooses its search strategy based on the context of each query. Unlike traditional systems, which apply one fixed technique, it selects techniques dynamically, much as a human researcher would, and refines its approach over time through feedback and learning._
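To make the idea of "selecting techniques dynamically" concrete, here is a deliberately tiny sketch. It is not how ZeroEntropy works internally; `choose_granularity` and its hand-written rules are hypothetical and only illustrate picking a retrieval granularity (documents, pages, or snippets) from the query itself.

```python
def choose_granularity(query: str) -> str:
    """Pick a retrieval granularity from simple, hand-written heuristics.

    Hypothetical illustration only: a real agentic system would learn or
    reason about this choice rather than rely on fixed rules.
    """
    q = query.lower()
    # Questions about which file holds something want whole documents.
    if "which document" in q or "which file" in q:
        return "documents"
    # Short, keyword-like queries get page-level context.
    if len(q.split()) <= 4:
        return "pages"
    # Longer natural-language questions get precise snippets.
    return "snippets"


for example in (
    "Which document covers export controls?",
    "AI pillars",
    "What are the main pillars of America's AI Action Plan?",
):
    print(example, "->", choose_granularity(example))
```

The three return values mirror the three query styles used later in this notebook.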

**More information on ZeroEntropy [here](https://www.zeroentropy.dev/).**

In [1]:
# %pip install retab
# %pip install zeroentropy

### **UPLOAD**

In [None]:
import base64
from datetime import datetime
from zeroentropy import ZeroEntropy

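# The client reads the API key from the ZEROENTROPY_API_KEY environment variable.
# The "pdfs" collection used below is assumed to exist already
# (e.g. created once with zclient.collections.add(collection_name="pdfs")).
zclient = ZeroEntropy()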

document_path = "../assets/docs/Americas-AI-Action-Plan.pdf"

# Read local PDF in binary mode
with open(document_path, "rb") as f:
    file_content = f.read()

base64_content = base64.b64encode(file_content).decode('utf-8')

response = zclient.documents.add(
    collection_name="pdfs",
    path="docs/document.pdf",  # Path of the document inside the collection; derive it from the file name if needed
    content={
        "type": "auto",
        "base64_data": base64_content,
    },
    metadata={
        "timestamp": datetime.now().isoformat(),
        "list:tags": ["Trump", "AI Action Plan"],
    }
)
print(response.message)
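If you upload more than one file, the request body above can be built by a small helper. The sketch below uses only the standard library and mirrors the argument shape of the `documents.add` call in the previous cell; the helper name `build_add_kwargs` and the `docs/<filename>` path convention are assumptions made for illustration, not part of the ZeroEntropy SDK.

```python
import base64
from datetime import datetime, timezone


def build_add_kwargs(raw: bytes, filename: str, collection_name: str, tags):
    """Build keyword arguments shaped like the documents.add call above.

    Hypothetical helper: it only assembles a dict and never contacts the API.
    """
    return {
        "collection_name": collection_name,
        # Assumed convention: store every file under "docs/<filename>".
        "path": f"docs/{filename}",
        "content": {
            "type": "auto",
            "base64_data": base64.b64encode(raw).decode("utf-8"),
        },
        "metadata": {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "list:tags": list(tags),
        },
    }


kwargs = build_add_kwargs(b"%PDF-1.7 demo bytes", "plan.pdf", "pdfs", ["AI Action Plan"])
print(kwargs["path"], len(kwargs["content"]["base64_data"]))
```

With a real client, the result would be passed as `zclient.documents.add(**kwargs)`, after reading `raw` from disk in binary mode as in the cell above.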

### **QUERY**

In [3]:
# Query for top k documents

response = zclient.queries.top_documents(
    collection_name="pdfs",
    query="What are the main pillars of America's AI Action Plan?",
    k=1,
)

print(response.results)

[Result(file_url='https://api.zeroentropy.dev/v1/documents/get-document?token=IzwvoeRd5cjOgU08TBuSnnGuq35RlLSQzgg2OHI4H1gVs4vcm0D_-0tS0sBw0YHDMYQUrPmzt_wDah1bXhSJbw_yFBhN_mNXJcQ1-c90yWui_zvRhOrQcuTgs8_u6betqd9j0AmWIWMizNIon0GDQsTDxrBIoS-9aosYHwftyOXjuFNHDAgyN-xFg5E2Sq3Q4z3VJ_jxms552ADy-ol4ckV_hfeDjyH07wm3FBdWOn9ZRHdBPukVgh9lerzWTYoR', metadata=None, path='docs/document.pdf', score=2.398141113209031)]


In [4]:
# Query for top k pages

response = zclient.queries.top_pages(
    collection_name="pdfs",
    query="What are the main pillars of America's AI Action Plan?",
    k=1,
    include_content=True,
)
print(response.results)

[Result(content="OFFICE\n\nTHE WHITE HOUSE\n\nOF THE\n\nPRESIDENT\n\nE PLURIBUS\n\nOFFICE\n\nOF\n\nSCIENCE\n\nUNUM\n\nAND TECHNOLOGY\n\nPOLICY\n\nOF THE\n\nUNITED\n\nWinning the Race\nAMERICA'S\nAI ACTION PLAN\n\nJULY 2025\n", image_url='https://api.zeroentropy.dev/v1/documents/get-page-image?token=Ih1zkqHGCZabMTP1fPKGfLCZO3GyyNyspCp9GKTt-YNSHpckKVM8TS9NE_3XIvgJp4AX806OeQ1oX6KfTEJorCvyTGoqeHS4yHubiRO1BqHSJYtQ_C0pCcobxkwE8eo0FrXYoPRANTxLm0x7dm0ii8QnoRgKM3dbplEZd38aExvjkStsdBInGK06hm22axzXS8FBYEoWIpWCF4X8utffnIw9qC_hbLBBjGxSVyZJ4opNB0niVIMEFmwpE7uGGsNZn6nD0XHVe_9KKG0B77ANJcPxwjKrON03gQn4PF2YWhBpFOTDi44WywQdLTBcgg0w', page_index=0, path='docs/document.pdf', score=1.543041346744185)]
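In a RAG pipeline, page results like the one above are usually concatenated into a context string for a language model. The following is a minimal sketch of that step. It uses `SimpleNamespace` stand-ins carrying the same attributes seen in the output (`content`, `page_index`, `path`, `score`) so it runs without an API key; `build_context` and the `max_chars` budget are hypothetical, and `page_index` is assumed to be zero-based because the cover page is reported at index 0.

```python
from types import SimpleNamespace


def build_context(results, max_chars=4000):
    """Join page contents, best score first, within a character budget."""
    blocks = []
    used = 0
    for r in sorted(results, key=lambda r: r.score, reverse=True):
        # Show a 1-based page number to the reader (assumes 0-based page_index).
        header = f"[{r.path}, page {r.page_index + 1}]"
        block = f"{header}\n{r.content.strip()}"
        if used + len(block) > max_chars:
            break  # Stop before exceeding the budget.
        blocks.append(block)
        used += len(block)
    return "\n\n".join(blocks)


# Stand-in results with made-up content, shaped like response.results above.
pages = [
    SimpleNamespace(content="B text\n", page_index=3, path="docs/document.pdf", score=0.5),
    SimpleNamespace(content="A text", page_index=0, path="docs/document.pdf", score=1.5),
]
print(build_context(pages))
```

Capping the context by characters is a rough proxy for a token budget; a tokenizer-based limit would be more accurate for a specific model.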


In [5]:
# Query for top k snippets with metadata filtering

response = zclient.queries.top_snippets(
    collection_name="pdfs",
    query="What are the main pillars of America's AI Action Plan?",
    k=1,
    filter={
        "list:tags": {
            "$in": ["Trump"]
        }
    },
    precise_responses=True,
    reranker="zerank-1", # Use our Reranker as a post-processing step
)

print(response.results)

[Result(content="2\nAMERICA'S AI ACTION PLAN\n\nPillar I: Accelerate Al Innovation\n\nAI\n\nAmerica must have the most powerful Al systems in the world, but we must also lead the world\nin creative and transformative application of these systems.", end_index=8002, page_span=[4, 6], path='docs/document.pdf', score=0.8611650064526928, start_index=7781)]
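Snippet results carry a `path` plus `start_index` and `end_index`, which is enough to build a prompt with numbered, citable sources for the generation step of a RAG pipeline. The sketch below is one possible way to do that. `format_prompt` and the prompt wording are hypothetical, the snippet is a `SimpleNamespace` stand-in with the attributes shown in the output above, and the offsets are assumed to be character positions within the document; no language model is called.

```python
from types import SimpleNamespace


def format_prompt(question, snippets):
    """Build a prompt that lists retrieved snippets as numbered sources."""
    sources = []
    for i, s in enumerate(snippets, start=1):
        # Assumes start_index/end_index are character offsets in the document.
        label = f"[{i}] {s.path} (chars {s.start_index}-{s.end_index})"
        sources.append(f"{label}\n{s.content.strip()}")
    joined = "\n\n".join(sources)
    return (
        "Answer the question using only the numbered sources below, "
        "and cite them as [n].\n\n"
        f"Sources:\n{joined}\n\n"
        f"Question: {question}"
    )


# Stand-in snippet shaped like the result printed above.
snippet = SimpleNamespace(
    content="Pillar I: Accelerate AI Innovation\n",
    path="docs/document.pdf",
    start_index=7781,
    end_index=8002,
)
print(format_prompt("What is Pillar I of the plan?", [snippet]))
```

The returned string can be sent to whichever LLM client you use; keeping the offsets in the source label lets a reader trace each citation back to the original document.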
