togremlin

Convert data to Gremlin (Apache TinkerPop) format for ingestion into graph databases that support it, such as AWS Neptune or ArangoDB.

Example usage

./togr --source ../sampledata/mini.xml

mini.xml

<notes>
   <note>
      <timestamp>2018-08-25T18:42:58+00:00</timestamp>
      <to>Humans</to>
      <from>Dolphins</from>
      <heading>So Long</heading>
      <body>Thanks for All the Fish!</body>
   </note>
   <note>
      <timestamp>2018-09-25T02:30:28+00:00</timestamp>
      <to>Humans</to>
      <from>Douglas Adams</from>
      <heading>Space</heading>
      <body>Space is big. You just won't believe how vastly, hugely, mind- bogglingly big it is. I mean, you may think it is a long way down the road to the chemists, but thats just peanuts to space.</body>
   </note>
</notes>

Output nodes are written to a local output folder:

cat output/note.json
[{
	"body": "Thanks for All the Fish!",
	"from": "Dolphins",
	"heading": "So Long",
	"timestamp": "2018-08-25T18:42:58+00:00",
	"to": "Humans"
}, {
	"body": "Space is big. You just won't believe how vastly, hugely, mind- bogglingly big it is. I mean, you may think it is a long way down the road to the chemists, but thats just peanuts to space.",
	"from": "Douglas Adams",
	"heading": "Space",
	"timestamp": "2018-09-25T02:30:28+00:00",
	"to": "Humans"
}]

Example usage with Edges

A graph database needs edges as well as nodes. To generate edges, mark fields in the XML as keys by supplying an additional key JSON file that declares which fields to use as key fields. Each key field is duplicated into a _key property on its node, and an additional edge node is generated between the related data items.

./togr --source ../sampledata/miniwithedge.xml --key ../sampledata/hitchhikerkey.json

miniwithedge.xml

<?xml version="1.0" encoding="UTF-8"?>
<guide>
 <name>HitchHiker</name>
 <notes>
    <note>
       <timestamp>2018-08-25T18:42:58+00:00</timestamp>
       <to>Humans</to>
       <from>Dolphins</from>
       <heading>So Long</heading>
       <body>Thanks for All the Fish!</body>
    </note>
    <note>
       <timestamp>2018-09-25T02:30:28+00:00</timestamp>
       <to>Humans</to>
       <from>Douglas Adams</from>
       <heading>Space</heading>
       <body>Space is big. You just won't believe how vastly, hugely, mind- bogglingly big it is. I mean, you may think it is a long way down the road to the chemists, but thats just peanuts to space.</body>
    </note>
 </notes>
</guide>

hitchhikerkey.json

{
"guide": {
  "name": "_key",
  "notes": {
    "note": {
      "timestamp": "_key"
    }
  }
}
}
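
To make the shape of the key file concrete, the sketch below walks it and lists every field marked as a key. It is an illustration only, not togr's actual implementation; `keyFields` and `parseKeyFile` are hypothetical helper names, and the dotted path notation is an assumption chosen for readability.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// keyFields walks a decoded key file and returns the dotted path of
// every field whose value is the string "_key".
func keyFields(node map[string]interface{}, prefix string) []string {
	var paths []string
	for name, value := range node {
		path := name
		if prefix != "" {
			path = prefix + "." + name
		}
		switch v := value.(type) {
		case string:
			if v == "_key" {
				paths = append(paths, path)
			}
		case map[string]interface{}:
			// Nested objects mirror nested XML elements.
			paths = append(paths, keyFields(v, path)...)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(paths)
	return paths
}

// parseKeyFile decodes the key JSON and collects its key field paths.
func parseKeyFile(src []byte) ([]string, error) {
	var root map[string]interface{}
	if err := json.Unmarshal(src, &root); err != nil {
		return nil, err
	}
	return keyFields(root, ""), nil
}

// keyFile is the content of hitchhikerkey.json.
const keyFile = `{
"guide": {
  "name": "_key",
  "notes": {
    "note": {
      "timestamp": "_key"
    }
  }
}
}`

func main() {
	paths, err := parseKeyFile([]byte(keyFile))
	if err != nil {
		panic(err)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

For hitchhikerkey.json this lists guide.name and guide.notes.note.timestamp, the two fields that become _key values in the output.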

Again, output nodes are written to a local output folder:

cat guide.json
[{
	"_key": "HitchHiker",
	"name": "HitchHiker"
}]

Edge nodes are distinguished by a leading _ in their name, and each one identifies the keys of both related nodes:

cat _hasnote.json
[{
	"_from": "guide/HitchHiker",
	"_to": "note/2018-08-25T18:42:58+00:00"
}, {
	"_from": "guide/HitchHiker",
	"_to": "note/2018-09-25T02:30:28+00:00"
}]
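
The edge records above follow a simple pattern: each reference is the node type, a slash, and the node's key. A minimal sketch of that pattern is shown below; the `Edge` type and `buildEdges` function are hypothetical and are not taken from togr's source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Edge links two nodes using "<type>/<key>" references,
// matching the _from and _to fields in _hasnote.json.
type Edge struct {
	From string `json:"_from"`
	To   string `json:"_to"`
}

// buildEdges creates one edge from the parent node to each child key.
func buildEdges(parentType, parentKey, childType string, childKeys []string) []Edge {
	edges := make([]Edge, 0, len(childKeys))
	for _, k := range childKeys {
		edges = append(edges, Edge{
			From: parentType + "/" + parentKey,
			To:   childType + "/" + k,
		})
	}
	return edges
}

func main() {
	edges := buildEdges("guide", "HitchHiker", "note", []string{
		"2018-08-25T18:42:58+00:00",
		"2018-09-25T02:30:28+00:00",
	})
	out, err := json.MarshalIndent(edges, "", "\t")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```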

cat note.json
[{
	"_key": "2018-08-25T18:42:58+00:00",
	"body": "Thanks for All the Fish!",
	"from": "Dolphins",
	"heading": "So Long",
	"timestamp": "2018-08-25T18:42:58+00:00",
	"to": "Humans"
}, {
	"_key": "2018-09-25T02:30:28+00:00",
	"body": "Space is big. You just won't believe how vastly, hugely, mind- bogglingly big it is. I mean, you may think it is a long way down the road to the chemists, but thats just peanuts to space.",
	"from": "Douglas Adams",
	"heading": "Space",
	"timestamp": "2018-09-25T02:30:28+00:00",
	"to": "Humans"
}]

Currently only XML to graph format conversion is provided. Additional source inputs may be accommodated if needed.

Install

Build from source

go get -u github.com/peterlamar/togremlin/...

Docker

First build the image.

docker build -t togr .

Then run the image, mounting your current directory into the container.

docker run --rm -it -v $(pwd):/tmp togr [rest_of_command]

Usage

togr [-key pathtokeyfile] [-source pathtosourcefile]

  -key string
    	Filename to retrieve graph key information from
  -source string
    	Filename to retrieve data from
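
The usage text shows single-dash flags while the examples use double dashes. Assuming togr parses its arguments with Go's standard flag package (the usage layout suggests this, but it is an assumption), both spellings are equivalent. The sketch below illustrates this with a hypothetical `parseArgs` helper and `options` type.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options holds the two command-line settings described above.
type options struct {
	key    string
	source string
}

// parseArgs parses -key and -source from args using the standard
// flag package, which accepts one or two leading dashes.
func parseArgs(args []string) (options, error) {
	var opts options
	fs := flag.NewFlagSet("togr", flag.ContinueOnError)
	fs.StringVar(&opts.key, "key", "", "Filename to retrieve graph key information from")
	fs.StringVar(&opts.source, "source", "", "Filename to retrieve data from")
	err := fs.Parse(args)
	return opts, err
}

func main() {
	opts, err := parseArgs(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("key=%q source=%q\n", opts.key, opts.source)
}
```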

License

See LICENSE for details
