Product Search Benchmark — PostgreSQL vs MongoDB

A Rails application benchmarking substring search strategies for product inventory systems, comparing PostgreSQL with pg_trgm against MongoDB with regex.

Overview

This project explores how to implement fast, accent-tolerant, case-insensitive substring search for a product autocomplete feature — the kind used in inventory or e-commerce systems where a user types "com g" and expects to find "Água com Gás 500ml".

Key findings

Strategy	Real time / query (500k records)	Index used
PostgreSQL + `pg_trgm` (GIN)	~0.57ms	✅ Bitmap Index Scan
MongoDB + B-tree on `normalized_name`	~1.63ms	❌ Ignored by planner

PostgreSQL with pg_trgm is ~3x faster for substring search at scale.

Stack

Ruby on Rails 7.1
PostgreSQL with pg_trgm extension
MongoDB via Mongoid
RSpec + DatabaseCleaner
Benchmark (stdlib)

How it works

Normalized name

Every product stores a normalized_name column — a lowercase, accent-stripped version of the name generated automatically on save:

def self.normalize_string(str)
  str
    .unicode_normalize(:nfd)  # decompose accents: "á" → "a" + combining mark
    .gsub(/\p{Mn}/, "")       # strip combining marks
    .downcase
    .strip
end

"Água com Gás 500ml" → "agua com gas 500ml"

This makes searches accent-tolerant without any special DB configuration.

Token-based search

Queries are split into tokens and applied as individual AND filters:

def self.search(query)
  tokens = normalize_string(query).split.reject(&:blank?)
  return none if tokens.empty?

  scope = all
  tokens.each { |token| scope = apply_token_filter(scope, token) }
  scope.limit(20)
end

Searching "agua gas" finds any product containing both words in any order — much more natural for autocomplete than an exact substring match.

PostgreSQL filter

scope.where("normalized_name LIKE ?", "%#{token}%")

MongoDB filter

scope.where(normalized_name: /#{Regexp.escape(token)}/i)

Setup

Requirements

Ruby 3.x
PostgreSQL 14+
MongoDB 6+

Install

git clone <repo>
cd estoque
bundle install

Database setup

# PostgreSQL
rails db:create db:migrate

# MongoDB (indexes)
rails runner "ProductMongo.create_indexes"

Enable pg_trgm

The migration enables the extension and creates the GIN index automatically:

rails db:migrate

If you need to run it manually in psql:

CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX index_products_on_normalized_name_trigram
  ON products USING gin (normalized_name gin_trgm_ops);

Running the benchmark

rspec spec/models/product_benchmark_spec.rb --format documentation

This will:

Insert 500,000 products into both PostgreSQL and MongoDB using bulk inserts
Run 100 search queries against each backend
Print Benchmark results side by side
Clean up all records after

Expected output:

                                           user     system      total        real
Product.search('com g'):               0.039609   0.007008   0.046617 (  0.057242)

                                           user     system      total        real
ProductMongo.search('com g'):          0.128274   0.011873   0.140147 (  0.162563)

Reading the benchmark output

Column	What it measures
`user`	CPU time spent by the Ruby process
`system`	CPU time spent by the kernel (I/O, syscalls)
`total`	`user + system`
`real`	Wall-clock time — what the user actually waits

real is the most relevant column for perceived performance.

Running unit tests

rspec spec/models/product_spec.rb --format documentation

Covers:

Substring search with accented input
Case-insensitive search
Multi-token search
Empty result cases
Result limit (max 20)

Why pg_trgm wins for substring search

A standard B-tree index only accelerates LIKE 'prefix%' queries (anchored at the start). For LIKE '%substring%', it is ignored and the database falls back to a sequential scan.

pg_trgm solves this by indexing every trigram (group of 3 characters) in the string. A GIN index on trigrams allows PostgreSQL to use an index scan even for LIKE '%...%' patterns.

MongoDB's B-tree index faces the same limitation with leading-wildcard regex (/substring/). The planner ignores the index and performs a collection scan. The MongoDB equivalent of pg_trgm is Atlas Search (Lucene-based), but it is only available on MongoDB Atlas (cloud).

Conclusion: for a self-hosted inventory system with autocomplete, PostgreSQL + pg_trgm is the pragmatic and performant choice.

Project structure

app/
  models/
    concerns/
      product_searchable.rb   # shared search logic (tokens, normalization)
    product.rb                # ActiveRecord model (PostgreSQL)
    product_mongo.rb          # Mongoid model (MongoDB)
spec/
  models/
    product_spec.rb           # unit tests (correctness)
    product_benchmark_spec.rb # performance benchmark
  support/
    shared_examples/
      product_search.rb       # shared RSpec examples for both backends
db/
  migrate/
    ..._create_products.rb
    ..._add_trigram_index_to_products.rb

Caching (optional)

For production autocomplete, wrap search with Rails cache to avoid hitting the database on repeated queries:

def self.search(query, limit: 20)
  normalized_query = normalize_string(query)

  Rails.cache.fetch("product_search/#{normalized_query}/#{limit}", expires_in: 5.minutes) do
    # ... search logic
  end
end

Invalidate on record changes:

after_commit { Rails.cache.delete_matched("product_search/*") }

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.github		.github
.kamal		.kamal
app		app
bin		bin
config		config
db		db
lib/tasks		lib/tasks
log		log
public		public
script		script
spec		spec
storage		storage
test		test
tmp		tmp
vendor		vendor
.dockerignore		.dockerignore
.gitattributes		.gitattributes
.gitignore		.gitignore
.rspec		.rspec
.rubocop.yml		.rubocop.yml
.ruby-version		.ruby-version
.solargraph.yml		.solargraph.yml
Dockerfile		Dockerfile
Gemfile		Gemfile
Gemfile.lock		Gemfile.lock
README.md		README.md
Rakefile		Rakefile
config.ru		config.ru
tags		tags

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Product Search Benchmark — PostgreSQL vs MongoDB

Overview

Key findings

Stack

How it works

Normalized name

Token-based search

PostgreSQL filter

MongoDB filter

Setup

Requirements

Install

Database setup

Enable pg_trgm

Running the benchmark

Reading the benchmark output

Running unit tests

Why pg_trgm wins for substring search

Project structure

Caching (optional)

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Product Search Benchmark — PostgreSQL vs MongoDB

Overview

Key findings

Stack

How it works

Normalized name

Token-based search

PostgreSQL filter

MongoDB filter

Setup

Requirements

Install

Database setup

Enable pg_trgm

Running the benchmark

Reading the benchmark output

Running unit tests

Why pg_trgm wins for substring search

Project structure

Caching (optional)

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages