forked from NLHEALTHCARE/PYELT

Pyelt is a python etl framework for creating and filling data vault datawarehouses

PYELT

Usage

This example will create and fill the historical staging area:

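# Assumes Pipeline, CsvFile, SourceToSorMapping and get_root_path have been
# imported from pyelt, and that config and source_config are defined elsewhere.
pipeline = Pipeline(config)
pipe = pipeline.get_or_create_pipe('test_source', source_config)

# Describe the semicolon-delimited source file and declare its primary key.
source_file = CsvFile(get_root_path() + '/sample_data/patienten1.csv', delimiter=';')
source_file.reflect()
source_file.set_primary_key(['patientnummer'])

# Map the source file onto the historical staging table, matching columns automatically.
mapping = SourceToSorMapping(source_file, 'persoon_hstage', auto_map=True)
pipe.mappings.append(mapping)

# Run all registered mappings.
pipeline.run()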

More examples can be found on the GitHub repository of NL Healthcare.

Introduction

Pyelt is a Python DDL and ETL framework for creating and loading Data Vaults for data warehousing.

Pyelt supports several data layers, including Source-of-Record (SOR), Raw Data Vault (RDV), Business Data Vault (BDV) and Datamarts (DM).

Pyelt can import data from several kinds of source systems, such as fixed-length files, CSV files, and different databases.
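
To illustrate what reading a delimited source file involves, here is a minimal sketch using only the Python standard library. It is not pyelt's own implementation: the function name `read_columns_and_check_key` and the sample data are hypothetical, chosen to resemble the semicolon-delimited patient file in the usage example. The sketch reads the column names from the header row and checks that the chosen key columns identify each row uniquely.

```python
import csv
import io


def read_columns_and_check_key(text, key_columns, delimiter=';'):
    """Return (column names, rows, whether the key columns are unique).

    Hypothetical helper for illustration; not part of pyelt.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = list(reader)
    columns = list(reader.fieldnames or [])

    # Fail early if a key column does not exist in the header.
    missing = [k for k in key_columns if k not in columns]
    if missing:
        raise ValueError("unknown key column(s): " + ", ".join(missing))

    # A key is usable only if no two rows share the same key values.
    keys = [tuple(row[k] for k in key_columns) for row in rows]
    is_unique = len(keys) == len(set(keys))
    return columns, rows, is_unique


# Made-up sample data in the same shape as a semicolon-delimited export.
SAMPLE = "patientnummer;naam\n1;Jansen\n2;De Vries\n"

columns, rows, is_unique = read_columns_and_check_key(SAMPLE, ['patientnummer'])
```

Checking the key before loading matters because a staging layer that tracks history per key cannot tell two rows apart if they share the same key values.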

Pyelt is developed to run on a PostgreSQL database.

Pyelt uses SQLAlchemy Core only for the database connection and for reflection. All other SQL statements (DDL, COPY, INSERT and UPDATE statements) are generated by the pyelt framework itself.
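
The division of labour described above (reflection discovers the table structure, and the framework builds the SQL text itself) can be sketched as follows. This is an illustration under stated assumptions, not pyelt's code: it uses the standard library `sqlite3` module as a stand-in for SQLAlchemy and PostgreSQL so that it runs without a database server, and the helper names `reflect_columns` and `build_insert` are hypothetical.

```python
import sqlite3


def reflect_columns(conn, table):
    """Return the column names of an existing table (stand-in for reflection).

    The table name is interpolated into the statement, so this is only
    suitable for trusted identifiers.
    """
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    # Each row describes one column; index 1 holds the column name.
    return [row[1] for row in rows]


def build_insert(table, columns):
    """Build a parameterised INSERT statement as plain SQL text."""
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    return f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})'


# In-memory database with a table named like the staging table in the usage example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persoon_hstage (patientnummer TEXT, naam TEXT)")

columns = reflect_columns(conn, "persoon_hstage")
sql = build_insert("persoon_hstage", columns)

# Execute the generated statement with made-up values.
conn.execute(sql, ("1", "Jansen"))
row_count = conn.execute("SELECT COUNT(*) FROM persoon_hstage").fetchone()[0]
```

Generating SQL text directly keeps database-specific statements, such as PostgreSQL's COPY, fully under the framework's control, at the cost of portability to other databases.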

Write your own mappings to transfer and transform data from sources via staging into the data warehouse.

Content

(The current documentation on pythonhosted is available only in Dutch):

Work in progress:

Background

The pyelt framework is currently under development at NL Healthcare, with the aim of implementing our next-generation data warehouse (DWH2.0). It serves as the foundation for our work in clinical business intelligence (CBI) and machine learning.

Architectural cornerstones of this project are:
