Google Search Tool

A Playwright-based Node.js tool that bypasses search engine anti-scraping mechanisms to execute Google searches and extract results. It can be used directly as a command-line tool or as a Model Context Protocol (MCP) server to provide real-time search capabilities to AI assistants like Claude.

Key Features

Local SERP API Alternative: No need to rely on paid search engine results API services; all searches are executed locally
Advanced Anti-Bot Detection Bypass Techniques:
- Intelligent browser fingerprint management that simulates real user behavior
- Automatic saving and restoration of browser state to reduce verification frequency
- Smart headless/headed mode switching, automatically switching to headed mode when verification is needed
- Randomization of device and locale settings to reduce detection risk
Raw HTML Retrieval: Ability to fetch the raw HTML of search result pages (with CSS and JavaScript removed) for analysis and debugging when Google's page structure changes
Page Screenshot: Automatically captures and saves a full-page screenshot when saving HTML content
MCP Server Integration: Provides real-time search capabilities to AI assistants like Claude without requiring additional API keys
Completely Open Source and Free: All code is open source with no usage restrictions, freely customizable and extensible

Technical Features

Developed with TypeScript, providing type safety and a better development experience
Browser automation based on Playwright, supporting multiple browser engines
Command-line parameter support for search keywords
MCP server support for AI assistant integration
Returns search results with title, link, and snippet
Option to retrieve raw HTML of search result pages for analysis
JSON format output
Support for both headless and headed modes (for debugging)
Detailed logging output
Robust error handling
Browser state saving and restoration to reduce the chance of triggering anti-bot detection

Installation

# Install from source
git clone https://github.com/web-agent-master/google-search.git
cd google-search
# Install dependencies
npm install
# Or using yarn
yarn
# Or using pnpm
pnpm install

# Compile TypeScript code
npm run build
# Or using yarn
yarn build
# Or using pnpm
pnpm build

# Link package globally (required for MCP functionality)
npm link
# Or using yarn
yarn link
# Or using pnpm
pnpm link --global

Windows Environment Notes

This tool has been specially adapted for Windows environments:

.cmd files are provided to ensure command-line tools work properly in Windows Command Prompt and PowerShell
Log files are stored in the system temporary directory instead of the Unix/Linux /tmp directory
Windows-specific process signal handling has been added to ensure proper server shutdown
Cross-platform file path handling is used to support Windows path separators

Usage

Command Line Tool

# Direct command line usage
google-search "search keywords"

# Using command line options
google-search --limit 5 --timeout 60000 "search keywords"

# Or using npx
npx google-search-cli "search keywords"

# Run in development mode
pnpm dev "search keywords"

# Run in debug mode (showing browser interface)
pnpm debug "search keywords"

# Get raw HTML of search result page
google-search "search keywords" --get-html

# Get HTML and save to file
google-search "search keywords" --get-html --save-html

# Get HTML and save to specific file
google-search "search keywords" --get-html --save-html --html-output "./output.html"

Available Command Line Options

--limit <number>: Limit the number of results (default: 10)
--timeout <number>: Set timeout in milliseconds (default: 30000)
--no-headless: Deprecated. The tool now always starts in headless mode and automatically switches to headed mode if CAPTCHA verification is encountered
--state-file <path>: Specify browser state file path (default: ./browser-state.json)
--no-save-state: Do not save browser state
--get-html: Get raw HTML of search results page instead of parsed results
--save-html: Save HTML to file (used with --get-html)
--html-output <path>: Specify HTML output file path (used with --get-html --save-html)
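
When the CLI is driven from another Node.js program, the options above can be assembled from a typed object. The following is a minimal sketch and not part of the tool itself: the CliOptions type and the buildArgs helper are hypothetical names, and only the flags documented above are emitted.

```typescript
// Hypothetical helper: builds an argv array for the google-search CLI.
// Only flags documented in this README are used.
interface CliOptions {
  limit?: number;        // --limit <number>
  timeout?: number;      // --timeout <number>, in milliseconds
  stateFile?: string;    // --state-file <path>
  noSaveState?: boolean; // --no-save-state
  getHtml?: boolean;     // --get-html
  saveHtml?: boolean;    // --save-html (only meaningful with --get-html)
  htmlOutput?: string;   // --html-output <path> (with --get-html --save-html)
}

function buildArgs(query: string, opts: CliOptions = {}): string[] {
  const args: string[] = [];
  if (opts.limit !== undefined) args.push("--limit", String(opts.limit));
  if (opts.timeout !== undefined) args.push("--timeout", String(opts.timeout));
  if (opts.stateFile !== undefined) args.push("--state-file", opts.stateFile);
  if (opts.noSaveState) args.push("--no-save-state");
  if (opts.getHtml) {
    args.push("--get-html");
    if (opts.saveHtml) {
      args.push("--save-html");
      if (opts.htmlOutput !== undefined) args.push("--html-output", opts.htmlOutput);
    }
  }
  // The query is one argv entry, so it needs no shell quoting.
  args.push(query);
  return args;
}

const cliArgs = buildArgs("search keywords", { limit: 5, timeout: 60000 });
console.log(cliArgs.join(" "));
```

The resulting array could be handed to Node's child_process.execFile together with the google-search command; the JSON printed on stdout is described under Output Format.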

MCP Server Usage

Configuration

Add the following configuration to your MCP settings file (e.g., ~/.cursor/mcp.json or Claude Desktop configuration):

{
  "mcpServers": {
    "google-search": {
      "command": "google-search-mcp"
    }
  }
}

After configuring, restart your AI assistant (Claude Desktop or Cursor) to enable the Google Search tool.
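
If the settings file already lists other servers, the entry has to be merged in rather than replacing the file. A minimal sketch of that merge on an in-memory object follows; McpConfig and addGoogleSearchServer are hypothetical names, and reading or writing the actual file is left out.

```typescript
// Shape of the relevant part of an MCP settings file, as shown above.
interface McpServerEntry {
  command: string;
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // unrelated settings are carried over untouched
}

// Returns a new config with the google-search entry added; the input is not mutated.
function addGoogleSearchServer(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "google-search": { command: "google-search-mcp" },
    },
  };
}

const existingConfig: McpConfig = { mcpServers: { other: { command: "other-mcp" } } };
console.log(JSON.stringify(addGoogleSearchServer(existingConfig), null, 2));
```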

How to Use in AI Assistants

Once configured, you can directly ask the AI assistant to search for information:

"Search for the latest information about TypeScript 5.0"
"Find tutorials on using Playwright"
"What are the recent developments in AI?"

The AI assistant will automatically call the Google Search tool, retrieve real-time information from the web, and provide answers based on the search results.

Output Format

Standard Search Results

{
  "query": "search keywords",
  "results": [
    {
      "title": "Result title",
      "link": "https://example.com",
      "snippet": "Result summary..."
    }
  ]
}
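
Consumers of this output may want to validate it before use. The sketch below models the structure shown above in TypeScript; the names SearchResult, SearchResponse, and parseSearchResponse are illustrative and are not taken from the tool's source.

```typescript
interface SearchResult {
  title: string;
  link: string;
  snippet: string;
}

interface SearchResponse {
  query: string;
  results: SearchResult[];
}

// Type guard for a single result entry.
function isSearchResult(value: unknown): value is SearchResult {
  if (typeof value !== "object" || value === null) return false;
  const { title, link, snippet } = value as { title?: unknown; link?: unknown; snippet?: unknown };
  return typeof title === "string" && typeof link === "string" && typeof snippet === "string";
}

// Parses the CLI's JSON output and throws if it does not match the documented shape.
function parseSearchResponse(json: string): SearchResponse {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) throw new Error("Expected a JSON object");
  const { query, results } = data as { query?: unknown; results?: unknown };
  if (typeof query !== "string" || !Array.isArray(results) || !results.every(isSearchResult)) {
    throw new Error("Unexpected search response shape");
  }
  return { query, results };
}

const sampleJson =
  '{"query":"search keywords","results":[{"title":"Result title","link":"https://example.com","snippet":"Result summary..."}]}';
const parsedResponse = parseSearchResponse(sampleJson);
console.log(parsedResponse.results.map((r) => `${r.title} -> ${r.link}`).join("\n"));
```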

HTML Retrieval Results

When using the --get-html option:

{
  "query": "search keywords",
  "url": "https://www.google.com/search?q=...",
  "originalHtmlLength": 123456,
  "cleanedHtmlLength": 78900,
  "savedPath": "./google-search-html/query-timestamp.html",
  "screenshotPath": "./google-search-html/query-timestamp.png",
  "htmlPreview": "First 500 characters of HTML..."
}
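
The two length fields reflect how much markup was dropped during cleaning. How the tool cleans the page is not shown in this README, so the following is only a sketch under stated assumptions: cleaning is approximated by regex-stripping script, style, and stylesheet link elements, the preview is taken from the cleaned HTML, and cleanHtml and summarizeHtml are hypothetical names.

```typescript
// Assumption: "CSS and JavaScript removed" means dropping <script>, <style>,
// and stylesheet <link> elements. Regex stripping is a simplification; a real
// implementation may operate on the DOM instead.
function cleanHtml(html: string): string {
  return html
    .replace(/<script\b[\s\S]*?<\/script\s*>/gi, "")
    .replace(/<style\b[\s\S]*?<\/style\s*>/gi, "")
    .replace(/<link\b[^>]*\brel\s*=\s*["']?stylesheet["']?[^>]*>/gi, "");
}

interface HtmlSummary {
  originalHtmlLength: number;
  cleanedHtmlLength: number;
  htmlPreview: string;
}

// Builds the length and preview fields of the output shown above.
// Assumption: the preview is cut from the cleaned HTML, 500 characters by default.
function summarizeHtml(html: string, previewLength = 500): HtmlSummary {
  const cleaned = cleanHtml(html);
  return {
    originalHtmlLength: html.length,
    cleanedHtmlLength: cleaned.length,
    htmlPreview: cleaned.slice(0, previewLength),
  };
}

const rawPage =
  "<html><head><style>a{color:red}</style><script>track()</script></head><body><p>hi</p></body></html>";
console.log(summarizeHtml(rawPage));
```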

Browser State Management

This tool automatically saves browser state to avoid repeated CAPTCHA verification:

First run: If CAPTCHA verification is encountered, the tool will automatically switch to headed mode (visible browser) and wait for you to complete verification
State saving: After successful verification, browser state is automatically saved to ~/.google-search-browser-state.json (for the MCP server) or to browser-state.json in the current directory (for the command line tool)
Subsequent runs: The tool uses saved state to bypass verification, enabling fast headless searches
Fingerprint management: Browser fingerprint configuration is automatically saved and reused to reduce detection risk
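
The sequence above can be summarised as a small decision function. This is a sketch for illustration, not the tool's code: planRun, RunPlan, and defaultStateFile are hypothetical names, while the file locations and the --no-save-state behaviour are taken from this README.

```typescript
type Mode = "headless" | "headed";
type Caller = "cli" | "mcp";

interface RunPlan {
  mode: Mode;         // how the browser is launched for this attempt
  loadState: boolean; // restore saved cookies and fingerprint before searching
  saveState: boolean; // write the state file after a successful search
}

// Default state file per entry point, as documented above.
// Note: "~" still has to be expanded to the home directory by the caller.
function defaultStateFile(caller: Caller): string {
  return caller === "mcp" ? "~/.google-search-browser-state.json" : "./browser-state.json";
}

// Decides how one attempt should run.
// captchaDetected is false on the first attempt and true when retrying after a CAPTCHA.
function planRun(stateFileExists: boolean, captchaDetected: boolean, noSaveState = false): RunPlan {
  return {
    mode: captchaDetected ? "headed" : "headless",
    loadState: stateFileExists,
    saveState: !noSaveState,
  };
}

console.log(defaultStateFile("cli"), planRun(false, false));
console.log(defaultStateFile("mcp"), planRun(true, true));
```

With Playwright, loading and saving state usually map to browser.newContext({ storageState: path }) and context.storageState({ path }); whether the tool uses exactly these calls is not confirmed here.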

Technical Details

Anti-Bot Detection Strategy

This tool uses multiple techniques to bypass Google's anti-bot detection:

Browser Fingerprint Simulation:
- Automatically detects and uses the host machine's timezone, language, and other settings
- Randomizes device types (Desktop Chrome, Firefox, Safari, Edge)
- Simulates a real browser environment (WebGL, plugins, screen resolution, etc.)
Behavior Simulation:
- Random delays simulate human input speed
- Uses real keyboard events instead of direct value setting
- Maintains consistent behavior patterns
State Persistence:
- Saves and restores cookies and local storage
- Maintains consistent browser fingerprint
- Reduces verification frequency by reusing sessions
Intelligent Mode Switching:
- Starts in headless mode for optimal performance
- Automatically switches to headed mode when CAPTCHA is detected
- Prompts user to complete verification, then saves state for future use
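
Two of the techniques above, device randomization and human-like typing delays, can be sketched as pure functions. The selection logic, the 50 to 150 ms delay range, and the names pickDevice and typingDelays are assumptions made for illustration; the device names match descriptors that Playwright ships.

```typescript
// Device descriptor names available in Playwright's `devices` registry.
const DEVICE_NAMES = ["Desktop Chrome", "Desktop Firefox", "Desktop Safari", "Desktop Edge"] as const;
type DeviceName = (typeof DEVICE_NAMES)[number];

// Returns a float in [0, 1), like Math.random; injectable for reproducible runs.
type Rng = () => number;

// Picks one device name uniformly at random.
function pickDevice(rng: Rng = Math.random): DeviceName {
  return DEVICE_NAMES[Math.floor(rng() * DEVICE_NAMES.length)];
}

// Produces one delay per character, uniformly distributed in [minMs, maxMs].
function typingDelays(text: string, minMs = 50, maxMs = 150, rng: Rng = Math.random): number[] {
  return Array.from(text, () => minMs + Math.floor(rng() * (maxMs - minMs + 1)));
}

console.log(pickDevice());
console.log(typingDelays("search keywords"));
```

Playwright's page.keyboard.type(text, { delay }) accepts a single fixed delay per key; to vary the delay, characters can be typed one at a time with a pause from typingDelays between them.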

Logging

All logs are saved to the system temporary directory:

Unix/Linux/macOS: /tmp/google-search-logs/google-search.log
Windows: %TEMP%\google-search-logs\google-search.log

The log level can be controlled via the LOG_LEVEL environment variable:

LOG_LEVEL=debug google-search "search keywords"

Development

# Install dependencies
pnpm install

# Compile TypeScript
pnpm build

# Run in development mode
pnpm dev "search keywords"

# Run in debug mode (show browser)
pnpm debug "search keywords"

# Run MCP server
pnpm mcp

# Test build
pnpm test:build

Troubleshooting

CAPTCHA Verification Issues

If you frequently encounter CAPTCHA verification:

Let the tool automatically switch to headed mode and complete verification
Ensure browser state files are properly saved
Try using different Google domains (the tool automatically randomizes this)
Avoid making too many requests in a short time
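
For the last point, callers that issue many searches can space them out. A minimal sketch of a throttle follows; SearchThrottle is a hypothetical helper, and the 5-second default is an illustrative value rather than a documented threshold.

```typescript
// Hypothetical throttle: enforces a minimum gap between consecutive searches.
class SearchThrottle {
  private lastStart: number | null = null;
  private readonly minIntervalMs: number;

  constructor(minIntervalMs = 5000) {
    this.minIntervalMs = minIntervalMs;
  }

  // Returns how long the caller should wait (in ms) before starting a search
  // requested at time `now`, and records the resulting start time.
  reserve(now: number = Date.now()): number {
    const earliest =
      this.lastStart === null ? now : Math.max(now, this.lastStart + this.minIntervalMs);
    this.lastStart = earliest;
    return earliest - now;
  }
}

const throttle = new SearchThrottle();
const waitMs = throttle.reserve();
console.log(`wait ${waitMs} ms before the next search`);
```

The caller would sleep for the returned number of milliseconds and then run google-search.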

Playwright Installation Issues

If Playwright browser installation fails:

# Manually install Chromium
npx playwright install chromium

# Or install all browsers
npx playwright install

Permission Issues

On Unix/Linux systems, if you encounter permission issues:

# Make bin files executable
chmod +x bin/google-search
chmod +x bin/google-search-mcp

Contributing

Issues and Pull Requests are welcome! Please ensure your code follows the existing style and that all tests pass.

License

ISC License

Credits

This project is maintained by web-agent-master.

Related Projects

Playwright - Modern web automation library
Model Context Protocol - Protocol for AI assistant tool integration
Claude Desktop - Anthropic's AI assistant desktop application
