Skip to content
This repository has been archived by the owner on Jul 20, 2020. It is now read-only.

paypal/seazme-sources

Repository files navigation

Confluence to ElasticSearch

A collection of business processes used to mine Confluence, discover information, intelligently index it and expose all of it as a lightning fast search.

Prerequisites

  • Git (of course).
  • JRE 8 must exist in PATH (JRE 7 is fine too, but might be deprecated).
  • Boot 2.x: installation instructions are avaliable here.

Configuration

There are 3 files: a standard resources/log4j.properties and SeazMe specific config.edn as well as mapping.edn which need to be configured properly.

Usage

After making sure that software from Prerequisites is installed, clone this repo and bring up REPL with boot repl command (it might take a few minutes to download all dependencies when run for the first time).

Examples:

export BOOT_JVM_OPTIONS="-Xmx12g -XX:+UseSerialGC"

./<executable> -a reinit -c context-demo-twiki -d elasticsearch-prod
./<executable> -a scan -c context-demo-twiki -d elasticsearch-prod -s twiki-demo-prod
./<executable> -a scan -c context-demo-confluence -d confluence-demo-cache -s confluence-demo-prod
./<executable> -a scan -c context-demo-confluence -d elasticsearch-prod -s confluence-demo-cache

./<executable> -a scan -c context-demo-twiki -d datahub-prod -s twiki-demo-prod
./<executable> -a scan -c context-demo-confluence -d datahub-prod -s confluence-demo-prod

./<executable> -a update -c context-demo-confluence -d datahub-prod -s confluence-demo-prod

./<executable> -a update -c context-hbase -d elasticsearch-prod
./<executable> -a update -c context-hbase

./<executable> -a patch -c context-demo-jira -d datahub-prod -s jira-demo-prod -p "key in (DATAHUB-1,DATAHUB-2)"

Where executable is either sources or java -jar ~sources-<version-tag>.jar build with build. Add -Dlog4j.configuration=file:resources/log4j.properties to java if logging is desired.

Once above is completed (should take a few minutes), a subdirectory db is created. It contains both copy of Confluence and other data sources.

crontab example

runme:

#!/bin/bash
cd ${0%/*}
java -Dlog4j.configuration=file:log4j.properties -jar sources.jar -a update -c context-12h
SHELL=/bin/bash
PATH=/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin

50 * * * * /usr/bin/flock -n /tmp/sources.lockfile seazme-sources/runme &>> runme.log

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages