spencerhperkins/PatentSearch

Palisade

Palisade is an AI-native patent search engine that performs semantic retrieval over patent claims. It ingests public patent data, builds a vector index, and exposes both a web UI and an MCP server so AI clients can search patents directly.

What is Palisade?

  • AI-native patent search for semantic similarity across claims.
  • End-to-end pipeline from raw patent XML to searchable embeddings.
  • Multiple interfaces: web app, JSON API, and MCP server for AI clients.

Architecture

Palisade is organized into five main layers:

  1. Ingestion

    • Parses USPTO patent XML and extracts claims, metadata, and specification text.
  2. Indexing

    • Generates claim embeddings with OpenAI.
    • Stores claims and metadata in Postgres.
    • Builds an HNSW index for fast similarity search.
  3. Retrieval

    • Runs semantic search with filters (applicant, title, dates, claim length).
    • Returns ranked matches with similarity scores.
  4. API

    • Flask app serving the web UI and JSON endpoints.
    • Primary search endpoint: POST /api/patents/semantic-claims-search.
  5. MCP Server

    • Exposes the semantic_claims_search tool over MCP.
    • Designed for AI clients such as ChatGPT and Claude.
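As a rough illustration of the API layer, the sketch below builds (but does not send) a request to the search endpoint. The endpoint path and port come from this README; the flat JSON body and the reuse of the MCP tool's filter names (`limit`, `min_similarity`, `applicant`) are assumptions, and `build_search_request` is a hypothetical helper, not part of the project.

```python
import json
import urllib.request

# Endpoint path is from the README; the base URL assumes the local web app.
SEARCH_URL = "http://localhost:5001/api/patents/semantic-claims-search"


def build_search_request(query, **filters):
    """Build, without sending, a POST request for the search endpoint."""
    # Assumption: the endpoint accepts a flat JSON object with these keys.
    payload = {"query": query}
    payload.update({k: v for k, v in filters.items() if v is not None})
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(
    "a battery separator with a ceramic coating",
    limit=5,
    min_similarity=0.5,
    applicant=None,  # unset filters are dropped from the payload
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

With the web app running, passing `request` to `urllib.request.urlopen` would send it; the shape of the response is not documented here, so inspect it before relying on any field names.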

Local Development

Requirements

  • Python 3.12
  • Postgres 14+
  • OpenAI API key (for embeddings)

Install

python -m venv .venv
source .venv/bin/activate
pip install -r patent_search/requirements.txt

Environment

Set the database connection and API keys through a .env file or environment variables:

  • PATENTS_DATABASE_URL (or PATENTS_DB_NAME, PATENTS_DB_USER, PATENTS_DB_PASSWORD, PATENTS_DB_HOST, PATENTS_DB_PORT)
  • OPENAI_API_KEY
  • Optional S3 settings: PATENT_S3_BUCKET, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION
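The two ways of configuring the database can be reconciled as in the sketch below: prefer the full URL, otherwise assemble one from the individual parts. This is a guess at the intended precedence, not the project's actual logic; `resolve_database_url` is a hypothetical helper, and the `localhost` and `5432` fallbacks are assumptions.

```python
import os
from urllib.parse import quote


def resolve_database_url(env=os.environ):
    """Return a Postgres URL from PATENTS_DATABASE_URL or the PATENTS_DB_* parts."""
    url = env.get("PATENTS_DATABASE_URL")
    if url:
        return url
    # Assumed fallbacks for illustration; the project may use other defaults.
    host = env.get("PATENTS_DB_HOST", "localhost")
    port = env.get("PATENTS_DB_PORT", "5432")
    try:
        name = env["PATENTS_DB_NAME"]
        user = env["PATENTS_DB_USER"]
    except KeyError as missing:
        raise RuntimeError(f"missing environment variable {missing}") from None
    password = env.get("PATENTS_DB_PASSWORD", "")
    # Percent-encode credentials so characters like '@' or ':' stay unambiguous.
    auth = quote(user, safe="")
    if password:
        auth += ":" + quote(password, safe="")
    return f"postgresql://{auth}@{host}:{port}/{name}"


example_env = {
    "PATENTS_DB_NAME": "patents",
    "PATENTS_DB_USER": "palisade",
    "PATENTS_DB_PASSWORD": "p@ss",
}
print(resolve_database_url(example_env))
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test without touching the real environment.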

Run the Web App

python app.py

The app runs on http://localhost:5001.

Ingest Public Patent Data

  1. Download USPTO bulk patent XML data.
  2. Run ingestion:
python patent_search/extractor/run_extraction.py dir /path/to/uspto/xml

Or for a single file:

python patent_search/extractor/run_extraction.py single /path/to/file.xml

This step parses patents, embeds claims, and stores them in Postgres.
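To give a feel for the parsing half of this step, here is a minimal sketch that pulls claim text out of a heavily simplified XML sample. The sample only imitates the `claim` / `claim-text` nesting used in USPTO grant XML, and `extract_claims` is a hypothetical function rather than the extractor shipped in this repository; embedding and database writes are left out.

```python
import xml.etree.ElementTree as ET

# Simplified sample; real USPTO grant files carry far more markup.
SAMPLE_XML = """
<us-patent-grant>
  <claims>
    <claim id="CLM-00001" num="00001">
      <claim-text>1. A widget comprising:
        <claim-text>a frame; and</claim-text>
        <claim-text>a hinge coupled to the frame.</claim-text>
      </claim-text>
    </claim>
    <claim id="CLM-00002" num="00002">
      <claim-text>2. The widget of claim 1, wherein the frame is steel.</claim-text>
    </claim>
  </claims>
</us-patent-grant>
"""


def extract_claims(xml_text):
    """Return one dict per <claim> with its number, flattened text, and word count."""
    root = ET.fromstring(xml_text.strip())
    claims = []
    for claim in root.iter("claim"):
        # Join nested claim-text fragments, then collapse runs of whitespace.
        text = " ".join("".join(claim.itertext()).split())
        claims.append(
            {"num": int(claim.get("num")), "text": text, "words": len(text.split())}
        )
    return claims


for claim in extract_claims(SAMPLE_XML):
    print(claim["num"], claim["words"], claim["text"])
```

A word count like the one computed here is the kind of value a claim-length filter could use, though how the project actually measures claim length is not stated in this README.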

MCP Server

Start the MCP server locally:

python mcp_server/mcp_server.py

By default the server binds to host 0.0.0.0 on port 8000.

Docker

Build and run the MCP server with Docker Compose:

docker compose -f mcp_server/docker-compose.yml up --build

The Dockerfile lives at mcp_server/Dockerfile.

Tool

  • Name: semantic_claims_search
  • Required: query
  • Optional: limit, min_similarity, applicant, title, grant_number, application_number, application_date_from, application_date_to, grant_date_from, grant_date_to, min_words, max_words, min_elements, max_elements, ef_search
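MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` message, so a call to this tool might be assembled roughly as follows. The parameter names are the ones listed above; `build_tool_call` and its validation are a hypothetical sketch, and the value types (an integer for `limit`, for example) are assumptions.

```python
import json

TOOL_NAME = "semantic_claims_search"
REQUIRED = {"query"}
OPTIONAL = {
    "limit", "min_similarity", "applicant", "title", "grant_number",
    "application_number", "application_date_from", "application_date_to",
    "grant_date_from", "grant_date_to", "min_words", "max_words",
    "min_elements", "max_elements", "ef_search",
}


def build_tool_call(arguments, request_id=1):
    """Validate argument names and wrap them in a JSON-RPC tools/call message."""
    names = set(arguments)
    missing = REQUIRED - names
    unknown = names - REQUIRED - OPTIONAL
    if missing:
        raise ValueError(f"missing required arguments: {sorted(missing)}")
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": TOOL_NAME, "arguments": dict(arguments)},
    }


message = build_tool_call(
    {"query": "foldable display hinge", "limit": 10, "min_words": 50}
)
print(json.dumps(message, indent=2))
```

In practice an MCP client library builds this message for you; the sketch only shows which fields carry the tool name and its arguments.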

AI Clients

Point your AI client's MCP server configuration at one of the following:

  • Your hosted URL (for example, https://api.palisadeinnovation.com)
  • A local URL such as http://localhost:8000

License

This project is licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). See LICENSE for details.
