A blog indexing service for static-site generation.
- Authenticated reindex API that atomically rebuilds indexed posts.
- Full-text search API returning title, url, snippet, and score metadata.
- Chinese-aware tokenization for query term extraction (2-gram tokenizer in app layer).
- MySQL full-text index with ngram parser for Chinese retrieval support.
- TOML-based configuration as the only runtime configuration source.
- Go 1.25+
- MySQL 9
- HTTP: github.com/gin-gonic/gin
- Logging: github.com/sirupsen/logrus
- Config parser: github.com/pelletier/go-toml/v2
- main.go: runnable API entrypoint.
- data: data models and validation.
- service: business services and orchestration.
- adapter/http: Gin router, handlers, auth middleware.
- adapter/storage/mysql: MySQL repository implementation.
- adapter/index: tokenizer and rune-safe snippet builder.
- config: TOML config loading and validation.
- app: dependency wiring and server bootstrap.
- migrations: SQL schema migrations.
Example config: config.example.toml
Create runtime config from the example file:
cp config.example.toml config.toml
Runtime config is file-only. Keep real secrets in your local config.toml and never commit them.
Apply migrations/001_init_posts.sql to your MySQL database before running the service.
go run .
The default startup config path is config.toml. If the file is missing or incomplete, startup fails.
- GET /healthz
- GET /readyz
POST /v1/index
Headers:
Authorization: Bearer <token>
Body:
{
  "posts": [
    {
      "title": "Example",
      "url": "https://example.com/p/1",
      "content": "Post content...",
      "published_at": 1710832800
    }
  ]
}
Index semantics:
- Each call uploads the full current post set.
- The service reindexes all rows in one transaction.
- Post IDs are generated by the database automatically.
GET /v1/search?q=keyword&page=1&page_size=10
published_at in requests and search responses uses Unix timestamp seconds.
Response item fields:
- title
- url
- snippet
- score (optional)
- matched_terms (optional)
go test ./...
go test -race ./...
- Database layer uses a MySQL full-text index created WITH PARSER ngram.
- Application layer uses deterministic 2-gram tokenization for query term extraction and matched-term/snippet generation.
- Snippet logic is rune-safe to avoid breaking Unicode text boundaries.