Skip to content
This repository has been archived by the owner on Jun 12, 2022. It is now read-only.
/ xuanzang Public archive

A fulltext search backend, especially for static website having sitemap.

License

Notifications You must be signed in to change notification settings

Xuanwo/xuanzang

Repository files navigation

xuanzang

xuanzang is a fulltext search backend build upon wukong.

It will load content form sitemap, index automatically and provide a fulltext search engine by RESTful API.

Features

  • Support chinese word segmentation
  • Load content from sitemap

Quick start

Create a config file like following:

port: 8080
db_path: /project/xuanzang/database
index_path: /project/xuanzang/index

source:
  type: sitemap
  url: https://xuanwo.org/sitemap.xml
  duration: 3600

dictionary: /project/xuanzang/dictionary.txt
stop_tokens: /project/xuanzang/stop_tokens.txt

logger:
  level: debug

Use xuanzang:

:) xuanzang -c /path/to/config.yaml

Try to do a fulltext search by curl:

:) curl 127.0.0.1:8080?text=aspire
{"tokens":["aspire"],"docs":[{"title":"友情链接 // Xuanwo's Blog","url":"https://xuanwo.org/blogroll/","content_text":""}],"total":1}

Installation

Get the latest xuanzang for Linux, macOS and Windows from releases

Configuration

xuanzang has following config options:

# port will controls which port xuanzang will used to provide service.
port: 8080
# db_path is the path to xuanzang's database file.
db_path: /path/to/database
# index_path is the path to xuanzang's index folder.
index_path: /path/to/index

# source is where xuanzang to fetch content.
source:
  # Currently, we only support sitemap.
  type: sitemap
  # url is the URL to sitemap
  url: https://example.com/sitemap.xml
  # xuanzang will index content for every duration seconds.
  duration: 3600

# dictionary is the path to dictionary data.
# https://github.com/Xuanwo/xuanzang/data/dictionary.txt is suitable for new comer.
dictionary: /path/to/dictionary
# stop_tokens is the path to stop_tokens data.
# https://github.com/Xuanwo/xuanzang/data/stop_tokens.txt is suitable for new comer.
stop_tokens: /path/to/stop_tokens

# logger will control the logger behavior.
logger:
  # Available value: fatal, panic, error, warn, info, debug.
  level: debug
  # if output set to "stdout", xuanzang will use stdout to do output.
  output: /path/to/log

Contributing

Please see Contributing Guidelines of this project before submitting patches.

LICENSE

The Apache License (Version 2.0, January 2004).

About

A fulltext search backend, especially for static website having sitemap.

Resources

License

Stars

Watchers

Forks

Packages

No packages published