self-host-etl-pipeline

A translation of "Building ETL pipelines with Clojure and transducers" to self-hosted ClojureScript.

Motivation: I was curious whether this code could be cleanly translated to self-hosted ClojureScript using Planck's IO facilities, and how much performance would differ in that environment when using transducers.

The code is in the etl-pipeline.core namespace.

Usage

Start up Planck, setting it to use src for code:

$ planck -c src

Load the code and change to the namespace:

(require 'etl-pipeline.core)
(in-ns 'etl-pipeline.core)

Create a dummy JSON file:

(create-file)

Time processing without transducers:

(time (process ["/tmp/dummy.json"]))
(time (process (repeat 8 "/tmp/dummy.json")))

Time processing with transducers:

(time (process-with-transducers ["/tmp/dummy.json"]))
(time (process-with-transducers (repeat 8 "/tmp/dummy.json")))
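For readers unfamiliar with the distinction being timed, here is a minimal sketch of the two styles. It is not the code in etl-pipeline.core: the record shape and the names valid?, enrich, total-with-seqs and total-with-transducers are hypothetical, chosen only to show how a chain of lazy sequence steps maps onto a composed transducer. It uses only core functions, so it should behave the same in Clojure and ClojureScript.

```clojure
;; Hypothetical sample data; the real pipeline reads JSON from disk.
(def records
  (for [i (range 10)]
    {:id i :amount (* i 10)}))

;; Hypothetical pipeline steps.
(defn valid? [r] (pos? (:amount r)))
(defn enrich [r] (assoc r :doubled (* 2 (:amount r))))

;; Sequence style: each step produces an intermediate lazy sequence.
(defn total-with-seqs [rs]
  (->> rs
       (filter valid?)
       (map enrich)
       (map :doubled)
       (reduce + 0)))

;; Transducer style: the steps are composed once and applied in a
;; single pass, with no intermediate sequences.
(def xform
  (comp (filter valid?)
        (map enrich)
        (map :doubled)))

(defn total-with-transducers [rs]
  (transduce xform + 0 rs))
```

Both functions return the same total for the same input. Note that comp applies transducers left to right, which matches the top-to-bottom order of the ->> form.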

Comparison

Processing without transducers:

1 file:
Clojure: 2857.870524 msecs
Self-host: 8620.306281 msecs

8 files:
Clojure: 29106.211138 msecs
Self-host: 72213.714800 msecs

Processing with transducers:

1 file:
Clojure: 2595.401761 msecs
Self-host: 7374.490957 msecs

8 files:
Clojure: 19478.215058 msecs
Self-host: 60890.650729 msecs
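
To put the timings above in relative terms, the following sketch computes how many times slower self-host was than Clojure for each run, and how much transducers helped within each environment. The timings map simply restates the numbers listed above; the names timings, slowdown and transducer-speedup are assumptions made for this example.

```clojure
;; Timings in msecs, copied from the comparison above.
(def timings
  {:sequences   {1 {:clojure 2857.870524  :self-host 8620.306281}
                 8 {:clojure 29106.211138 :self-host 72213.714800}}
   :transducers {1 {:clojure 2595.401761  :self-host 7374.490957}
                 8 {:clojure 19478.215058 :self-host 60890.650729}}})

(defn slowdown
  "Ratio of self-host time to Clojure time for a style and file count."
  [style files]
  (let [{:keys [clojure self-host]} (get-in timings [style files])]
    (/ self-host clojure)))

(defn transducer-speedup
  "Ratio of sequence time to transducer time for an environment and file count."
  [env files]
  (/ (get-in timings [:sequences files env])
     (get-in timings [:transducers files env])))

;; Print every ratio.
(doseq [style [:sequences :transducers]
        files [1 8]]
  (println style files "self-host slowdown:" (slowdown style files)))

(doseq [env [:clojure :self-host]
        files [1 8]]
  (println env files "transducer speedup:" (transducer-speedup env files)))
```

These are single-run wall-clock figures, so the ratios are only a rough guide.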

Interestingly, Planck ends up using about 1.5 cores without transducers, but only 1 core with them. (Perhaps this reflects JavaScriptCore collecting garbage in the non-transducer case.)
