Nuxeo Elasticsearch VCS sync checker

esync: a tool to compare Nuxeo repository and Elasticsearch content

When using nuxeo-elasticsearch, we want to be sure that the repository content is in sync with the content indexed in Elasticsearch.

This tool detects differences between the Nuxeo database repository and the content indexed in Elasticsearch.

Install

Download

Download the nuxeo-esync-VERSION-capsule-full.jar from https://maven.nuxeo.org.

Version Support

Esync Version   Nuxeo Version   Elasticsearch Version
1.1.X           7.10            1.5.2
2.0.X           8.10            2.3.5
3.0.X           9.10            5.6.4

Starting with esync version 3, the Elasticsearch REST client is used instead of the transport client.

Building from sources

Create the all-in-one jar:

mvn package

The jar is located here:

./target/nuxeo-esync-VERSION-capsule-full.jar

Usage

Configuration

Create /etc/esync.conf or ~/.esync.conf from one of the provided samples:

  • esync-postgresql.conf.example
  • esync-mssql.conf.example
  • esync-mongodb.conf.example

You will need to configure the database and Elasticsearch access.

Refer to the source for the full list of options available.

Invocation

 # using a default conf located in /etc/esync.conf or ~/.esync.conf
 java -jar /path/to/nuxeo-esync-$VERSION-capsule-full.jar

 # using another config file
 java -jar /path/to/nuxeo-esync-$VERSION-capsule-full.jar /path/to/config-file.conf

 # customizing the log
 java -Dlog4j.configuration=file:mylog4j.xml -jar nuxeo-esync-$VERSION-capsule-full.jar

You can find the default log4j.xml in the source repository. The default log file is /tmp/trace.log.

Checkers

The tool runs several checkers concurrently.

Checkers compare the reference database (the expected state) with the Elasticsearch content (the actual state). You should run a full re-index on Elasticsearch before running the tool.

Checkers report different things:

  • Errors like a different number of documents, total or per document type
  • Missing or spurious document types in Elasticsearch
  • Missing document ids in Elasticsearch
  • Spurious document ids in Elasticsearch
  • Differences in document properties such as ACL or path

Here is a list of available checkers.

Cardinality Checker

This is a quick check that compares the total number of documents in the database and in Elasticsearch. Four document counts are compared:

  • documents without version and proxy
  • version documents
  • proxy documents
  • orphan documents other than versions

False positive cases:

  • this does not guarantee that we have the same documents indexed, just the same number.

False negative cases:

  • some system documents are not indexed (like CommentRelation or PublicationRelation)
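
As an illustration only, the comparison described above can be sketched in a few lines of Python. This is not the tool's implementation: the names `COUNT_KINDS`, `CountDiff` and `check_cardinality` are hypothetical, and the counts are made-up sample values standing in for the results of real database and Elasticsearch queries.

```python
from dataclasses import dataclass

# Hypothetical labels for the four counts listed above.
COUNT_KINDS = ("documents", "versions", "proxies", "orphans")


@dataclass(frozen=True)
class CountDiff:
    kind: str
    expected: int  # count from the database (reference)
    actual: int    # count from Elasticsearch


def check_cardinality(expected, actual):
    """Return one CountDiff per count kind whose totals differ."""
    diffs = []
    for kind in COUNT_KINDS:
        exp = expected.get(kind, 0)
        act = actual.get(kind, 0)
        if exp != act:
            diffs.append(CountDiff(kind, exp, act))
    return diffs


# Made-up sample values, for illustration only.
db_counts = {"documents": 1200, "versions": 340, "proxies": 15, "orphans": 2}
es_counts = {"documents": 1198, "versions": 340, "proxies": 15, "orphans": 2}

for diff in check_cardinality(db_counts, es_counts):
    print(f"{diff.kind}: expected {diff.expected}, actual {diff.actual}")
```

An empty result only means the totals agree. It says nothing about which documents are indexed.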

Type Cardinality Checker

Checks the number of documents of each type, for both documents and versions.

False positive cases:

  • this does not guarantee that we have the same documents indexed, just the same number for a primary type.

False negative cases:

  • some system documents are not indexed and reported as missing type
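
A rough Python sketch of a per-type comparison is shown here, assuming the per-type counts have already been collected from both sides. The function name `check_type_cardinality`, the report keys and the sample counts are hypothetical and are not taken from the tool's source.

```python
from collections import Counter


def check_type_cardinality(expected, actual):
    """Compare per-type counts: expected from the database, actual from Elasticsearch."""
    report = {"missing_types": [], "spurious_types": [], "count_mismatch": {}}
    for doc_type in sorted(set(expected) | set(actual)):
        exp = expected.get(doc_type, 0)
        act = actual.get(doc_type, 0)
        if exp > 0 and act == 0:
            # Type exists in the database but is absent from Elasticsearch.
            report["missing_types"].append(doc_type)
        elif exp == 0 and act > 0:
            # Type exists in Elasticsearch but not in the database.
            report["spurious_types"].append(doc_type)
        elif exp != act:
            report["count_mismatch"][doc_type] = (exp, act)
    return report


# Made-up sample values, for illustration only.
db_types = Counter({"File": 900, "Folder": 120, "Note": 40, "CommentRelation": 7})
es_types = Counter({"File": 898, "Folder": 120, "Note": 40})

report = check_type_cardinality(db_types, es_types)
print(report)
```

In this sample, CommentRelation shows up as a missing type even though nothing is wrong, which is the system-document case noted above.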

Type Document Lister

When the Type Cardinality Checker raises a difference, the lists of ids for this type are compared to give the missing and spurious document ids.

False positive cases:

  • none

False negative cases:

  • none

Listing all document ids from the database can take time and memory.
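
The id comparison amounts to a two-way set difference. The Python sketch below illustrates the idea under the assumption that both id lists fit in memory. The function name `diff_document_ids` and the sample ids are hypothetical.

```python
def diff_document_ids(expected_ids, actual_ids):
    """Return (missing, spurious) ids for one document type.

    missing:  ids present in the database but not in Elasticsearch
    spurious: ids present in Elasticsearch but not in the database
    """
    expected = set(expected_ids)
    actual = set(actual_ids)
    missing = sorted(expected - actual)
    spurious = sorted(actual - expected)
    return missing, spurious


# Made-up ids: both sides hold three documents, yet the content differs.
db_ids = ["a1", "b2", "c3"]
es_ids = ["a1", "b2", "d4"]

missing, spurious = diff_document_ids(db_ids, es_ids)
print("missing:", missing)
print("spurious:", spurious)
```

The sample has the same number of ids on both sides, so a count-based check would pass while this comparison still reports one missing and one spurious id.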

ACL Checker

It performs 2 checks:

  • Checks that all documents that hold an ACL are correctly indexed in ES
  • Checks that all documents in ES have a correct ACL

False positive cases:

  • some ACLs can be more permissive in ES

False negative cases:

  • none
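
The two checks can be pictured with the simplified Python sketch below. It is an assumption-laden illustration, not the tool's algorithm: it takes the expected effective ACL of every document as a precomputed mapping and ignores how Nuxeo actually inherits or blocks ACLs. The function name `check_acls`, its parameters and the sample data are all hypothetical.

```python
def check_acls(holders, expected, indexed):
    """Run the two ACL checks on in-memory data.

    holders:  ids of documents that hold an ACL in the database
    expected: doc id -> set of principals expected for that document
    indexed:  doc id -> set of principals indexed in Elasticsearch
    """
    # Check 1: every ACL holder is indexed with the expected ACL.
    # A holder absent from the index yields None, which never equals a set.
    holders_not_indexed = sorted(
        doc_id for doc_id in holders
        if indexed.get(doc_id) != expected[doc_id]
    )
    # Check 2: every indexed document carries the expected ACL.
    wrong_acl = sorted(
        doc_id for doc_id, acl in indexed.items()
        if expected.get(doc_id) != acl
    )
    return holders_not_indexed, wrong_acl


# Made-up sample data, for illustration only.
expected_acls = {
    "root": {"Administrator"},
    "ws1": {"Administrator", "members"},
    "doc1": {"Administrator", "members"},
}
acl_holders = {"root", "ws1"}
indexed_acls = {
    "root": {"Administrator"},
    "doc1": {"Administrator", "members", "Everyone"},
}

holders_not_indexed, wrong_acl = check_acls(acl_holders, expected_acls, indexed_acls)
print("ACL holders not correctly indexed:", holders_not_indexed)
print("indexed documents with an unexpected ACL:", wrong_acl)
```

In this sample, ws1 holds an ACL but is absent from the index, and doc1 is indexed with an extra principal.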

License

Apache License, Version 2.0

About Nuxeo

Nuxeo dramatically improves how content-based applications are built, managed and deployed, making customers more agile, innovative and successful. Nuxeo provides a next generation, enterprise ready platform for building traditional and cutting-edge content oriented applications. Combining a powerful application development environment with SaaS-based tools and a modular architecture, the Nuxeo Platform and Products provide clear business value to some of the most recognizable brands including Verizon, Electronic Arts, Netflix, Sharp, FICO, the U.S. Navy, and Boeing. Nuxeo is headquartered in New York and Paris. More information is available at www.nuxeo.com.
