updated pipeline - edge record
wshayes committed Aug 27, 2018
1 parent 43add12 commit 7c1f740
Showing 10 changed files with 305 additions and 228 deletions.
62 changes: 19 additions & 43 deletions CHANGELOG.md
@@ -1,53 +1,29 @@
# Change Log
# Changelog
All notable changes to the BEL module will be documented in this file.

## [v0.7.3](https://github.com/belbio/bel/tree/v0.7.3) (2018-01-22)
[Full Changelog](https://github.com/belbio/bel/compare/v0.7.2...v0.7.3)
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## [v0.7.2](https://github.com/belbio/bel/tree/v0.7.2) (2018-01-21)
[Full Changelog](https://github.com/belbio/bel/compare/v0.7.1...v0.7.2)

## [v0.7.1](https://github.com/belbio/bel/tree/v0.7.1) (2018-01-21)
[Full Changelog](https://github.com/belbio/bel/compare/v0.7.0...v0.7.1)
## [Unreleased]
[Full Commit Log](https://github.com/belbio/bel/compare/v0.4.3...HEAD)

## [v0.7.0](https://github.com/belbio/bel/tree/v0.7.0) (2018-01-20)
[Full Changelog](https://github.com/belbio/bel/compare/v0.6.0...v0.7.0)
### Added
-

## [v0.6.0](https://github.com/belbio/bel/tree/v0.6.0) (2018-01-08)
[Full Changelog](https://github.com/belbio/bel/compare/v0.5.8-dev...v0.6.0)
### Changed
-

## [v0.5.8-dev](https://github.com/belbio/bel/tree/v0.5.8-dev) (2018-01-06)
[Full Changelog](https://github.com/belbio/bel/compare/v0.5.7-dev...v0.5.8-dev)
### Fixed
-

## [v0.5.7-dev](https://github.com/belbio/bel/tree/v0.5.7-dev) (2018-01-05)
[Full Changelog](https://github.com/belbio/bel/compare/v0.5.6-dev...v0.5.7-dev)
## [0.4.3] Aug 16, 2018
[Full Commit Log](https://github.com/belbio/bel_api/compare/v0.3.1...v0.4.3)

## [v0.5.6-dev](https://github.com/belbio/bel/tree/v0.5.6-dev) (2018-01-02)
[Full Changelog](https://github.com/belbio/bel/compare/v0.5.5-dev...v0.5.6-dev)
### Changed
- Serving Swagger from BEL API codebase instead of separate documentation location

## [v0.5.5-dev](https://github.com/belbio/bel/tree/v0.5.5-dev) (2018-01-02)
[Full Changelog](https://github.com/belbio/bel/compare/v0.5.4-dev...v0.5.5-dev)
### Fixed
- Fixed term completion species_id bug
- Added missing parameters to Swagger docs

## [v0.5.4-dev](https://github.com/belbio/bel/tree/v0.5.4-dev) (2018-01-02)
[Full Changelog](https://github.com/belbio/bel/compare/v0.5.3...v0.5.4-dev)

**Merged pull requests:**

- Bump coverage from 4.4.1 to 4.4.2 [\#5](https://github.com/belbio/bel/pull/5) ([dependabot[bot]](https://github.com/apps/dependabot))
- Bump mmh3 from 2.3.1 to 2.5.1 [\#4](https://github.com/belbio/bel/pull/4) ([dependabot[bot]](https://github.com/apps/dependabot))
- Bump TatSu from 4.2.2 to 4.2.5 [\#3](https://github.com/belbio/bel/pull/3) ([dependabot[bot]](https://github.com/apps/dependabot))
- Bump mypy from 0.550 to 0.560 [\#2](https://github.com/belbio/bel/pull/2) ([dependabot[bot]](https://github.com/apps/dependabot))
- Bump python\_arango from 3.5.0 to 3.12.1 [\#1](https://github.com/belbio/bel/pull/1) ([dependabot[bot]](https://github.com/apps/dependabot))

## [v0.5.3](https://github.com/belbio/bel/tree/v0.5.3) (2017-12-02)
[Full Changelog](https://github.com/belbio/bel/compare/v0.5.2...v0.5.3)

## [v0.5.2](https://github.com/belbio/bel/tree/v0.5.2) (2017-12-01)
[Full Changelog](https://github.com/belbio/bel/compare/v0.5.1...v0.5.2)

## [v0.5.1](https://github.com/belbio/bel/tree/v0.5.1) (2017-11-16)
[Full Changelog](https://github.com/belbio/bel/compare/list...v0.5.1)

## [list](https://github.com/belbio/bel/tree/list) (2017-11-15)


\* *This Change Log was automatically generated by [github_changelog_generator](https://github.com/skywinder/Github-Changelog-Generator)*
20 changes: 13 additions & 7 deletions bel/db/arangodb.py
@@ -177,16 +177,15 @@ def delete_database(client, db_name, username=None, password=None):
log.warning(f'No arango database {db_name} to delete, does not exist')


# TODO Convert ArangoDB loading to use bulk_import - this will be a significant refactor
# since we can push any document via the generator but bulk_import only works with
# a single collection at a time
# http://python-driver-for-arangodb.readthedocs.io/en/master/classes.html?highlight=bulk#arango.collections.Collection.import_bulk
def batch_load_docs(db, doc_iterator):
def batch_load_docs(db, doc_iterator, on_duplicate='replace'):
"""Batch load documents
Args:
db: ArangoDB client database handle
doc_iterator: iterator that yields (collection_name, doc) tuples
on_duplicate: one of 'error', 'update', 'replace' or 'ignore' (default: 'replace')
https://python-driver-for-arangodb.readthedocs.io/en/master/specs.html?highlight=import_bulk#arango.collection.StandardCollection.import_bulk
"""

batch_size = 10000
@@ -195,6 +194,10 @@ def batch_load_docs(db, doc_iterator):
collections = {}
docs = {}

if on_duplicate not in ['error', 'update', 'replace', 'ignore']:
log.error(f'Bad parameter for on_duplicate: {on_duplicate}')
return

for (collection_name, doc) in doc_iterator:
if collection_name not in collections:
collections[collection_name] = db.collection(collection_name)
@@ -207,12 +210,12 @@ def batch_load_docs(db, doc_iterator):
if counter % batch_size == 0:
log.debug(f'Bulk import arangodb: {counter}')
for cname in docs:
collections[cname].import_bulk(docs[cname], on_duplicate='replace')
collections[cname].import_bulk(docs[cname], on_duplicate=on_duplicate)
docs[cname] = []

log.debug(f'Bulk import arangodb: {counter}')
for cname in docs:
collections[cname].import_bulk(docs[cname], on_duplicate='replace')
collections[cname].import_bulk(docs[cname], on_duplicate=on_duplicate)
docs[cname] = []
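
A minimal sketch of how the new `on_duplicate` argument flows through the batching, assuming the loader behaves as the hunks above suggest. `FakeCollection` and `FakeDB` are hypothetical stand-ins for the python-arango handles so the sketch runs without a database, and `batch_load` is an illustrative re-implementation, not the code in `bel/db/arangodb.py`. Unlike the diff, it raises `ValueError` on a bad value instead of logging and returning.

```python
from collections import defaultdict

VALID_ON_DUPLICATE = ("error", "update", "replace", "ignore")


class FakeCollection:
    """Hypothetical stand-in for a python-arango collection handle."""

    def __init__(self, name):
        self.name = name
        self.calls = []  # one (doc count, on_duplicate) entry per import_bulk call

    def import_bulk(self, documents, on_duplicate=None):
        self.calls.append((len(documents), on_duplicate))


class FakeDB:
    """Hypothetical stand-in for a python-arango database handle."""

    def __init__(self):
        self.collections = {}

    def collection(self, name):
        return self.collections.setdefault(name, FakeCollection(name))


def batch_load(db, doc_iterator, on_duplicate="replace", batch_size=10000):
    """Group docs by collection and flush every batch_size docs (sketch)."""
    if on_duplicate not in VALID_ON_DUPLICATE:
        raise ValueError(f"Bad parameter for on_duplicate: {on_duplicate}")

    docs = defaultdict(list)
    counter = 0

    def flush():
        # Copy the items so reassigning docs[cname] is safe while looping
        for cname, batch in list(docs.items()):
            if batch:  # skip collections with nothing pending
                db.collection(cname).import_bulk(batch, on_duplicate=on_duplicate)
                docs[cname] = []

    for collection_name, doc in doc_iterator:
        docs[collection_name].append(doc)
        counter += 1
        if counter % batch_size == 0:
            flush()

    flush()  # load whatever remains after the last full batch
    return counter
```

Passing `on_duplicate='ignore'` here would leave existing documents untouched, while the default `'replace'` matches the value that was hard-coded before this commit.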


@@ -229,5 +232,8 @@ def arango_id_to_key(_id):
key = re.sub(r"[^a-zA-Z0-9\_\-\:\.\@\(\)\+\,\=\;\$\!\*\'\%]+", '_', _id)
if len(key) > 254:
log.error(f'Arango _key cannot be longer than 254 chars: Len={len(key)} Key: {key}')
elif len(key) < 1:
log.error(f'Arango _key cannot be an empty string: Len={len(key)} Key: {key}')

return key
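
The key checks in this hunk can be sketched as a standalone function. This is an illustration under stated assumptions, not the module's code: `to_key`, `MAX_KEY_LEN` and `_INVALID_KEY_CHARS` are hypothetical names, the 254-character limit is taken from the log message in the diff, and the sketch raises `ValueError` where the diff only logs an error and still returns the key.

```python
import re

MAX_KEY_LEN = 254  # limit taken from the log message in the diff

# Any run of characters outside this set collapses into a single underscore
_INVALID_KEY_CHARS = re.compile(r"[^a-zA-Z0-9_\-:.@()+,=;$!*'%]+")


def to_key(_id):
    """Return a sanitized key, raising instead of logging on a bad length."""
    key = _INVALID_KEY_CHARS.sub("_", _id)
    if len(key) > MAX_KEY_LEN:
        raise ValueError(f"key longer than {MAX_KEY_LEN} chars: {len(key)}")
    if len(key) < 1:
        raise ValueError("key cannot be an empty string")
    return key
```

With this sketch, `to_key("a b/c")` returns `"a_b_c"`, and an empty input raises rather than producing a key the database would reject later.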
