Add RunFlowTask for 5.x flows #1867

akshaysonvane
Contributor

Add RunLegacyFlowTask for 4.x flows
Add access to FlowRunnerImpl in FlowManager
@aebadirad aebadirad merged commit 51d9fcf into Marklogic-retired:feature/DHFPROD-1710 Feb 12, 2019
aebadirad added a commit that referenced this pull request Mar 1, 2019
* DHFPROD-1710 - stubs

* DHFPROD-1710 add stubs for flowrunnerimpl and flow.sjs

* Remove old e-node code and begin placing in classes

* DHFPROD-1710 - making some minor tweaks to add more than one documentwrite type

* Deploy mapping artifacts even in the absence of entity artifacts

* Creating FlowRunner Class (#1865)

* Creating FlowRunner, Collector and listeners

* Implement the new Collector

* Pass "step" to ml:runFlow endpoint

* Debug export tweak + lion's share of provenance work

Implemented:
  createFlowRecord(flowId, options)  -- currently unused, as I don't believe flow info is tracked, just the use of the flow.
  createJobRecord(jobId, flowId, info)
  createStepRecord(jobId, flowId, stepType, docURI, info)

TODO:
 * Wrap use of provenance API in eval() statement & point to JOBDATABASE
 * Incorporate info.metadata property key/values into provenance data
 * Create query function to search by document uri and return full history
 * Tombstone prov function when a document is deleted (need to determine what this is)

* Add mlLogLevel to the custom tokens and default settings

* Actually check in the prior commit of mlLogLevel custom tokens and default settings

* Add datahub base object, constructors where necessary, stubs in some spots for right now

* DHFPROD-1802 - stub and empty constructor for process.sjs, todo fill out

* create new extensions and transforms for 5, mlcp transform and hubstats/hubversion

* Expose the hub log level variable on the HubConfig interface

* Add defaultConfig if one isn't provided at datahub object creation for override purposes

* Correct spelling of createOn to createdOn

* Default and overridable configs for all the things!

* DHFPROD-1802 - flesh out process.sjs more regarding getting/searching for process artifacts

* Add additional keys to the process object

* Tweaking prov.sjs to use config object, adding delete process to process.sjs, adding default metadata timestamps to mlcp-flow-transforms, and adding a default 'dummy' main.sjs process to test flows with

* Add flow and runflow endpoint extensions, todo: flesh out runflow

* Add RunFlowTask for 5.x flows (#1867)

Add RunLegacyFlowTask for 4.x flows
Add access to FlowRunnerImpl in FlowManager

* DHFPROD-1710 - update for flow running

* Add new gradle task group for running flows

* Create job docs, todo: add batching for running the processor instead of single run in the flow

* UUID generation, queryDocRecords() function implemented

* prov.sjs config tweaks

* Wrapped provenance calls in xdmp.eval()

* Let's remove these, accidentally were checked in.

* Moving uuid() to hub-utils.sjs, refactoring create and delete document methods (#1873)

* Update to prov.sjs: Added metadata, on-the-fly flowTypes via info.status

The library now has the ability to log the status in provenance for Steps, Jobs and Flows.

Pass any info.status for these types to their respective functions:
  createFlowRecord(flowId, info)
  createJobRecord(jobId, flowId, info)
  createStepRecord(jobId, flowId, stepType, docURI, info)

if info.status is defined, a custom on-the-fly flowType will be generated
that we can use to lookup the provenance information.
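
As a rough illustration of the status-driven typing described above, the following is a minimal sketch in plain JavaScript. It is not the prov.sjs implementation: the helper names `recordTypes` and `sketchStepRecord`, the `kind:status` naming scheme, and the shape of the returned object are all assumptions made for this example.

```javascript
// Hypothetical sketch only; prov.sjs may build its types differently.
// Returns the base type plus a status-specific type when info.status is set.
function recordTypes(kind, info) {
  const types = [kind];
  if (info && typeof info.status === "string" && info.status.length > 0) {
    // Assumed naming scheme for the on-the-fly type: "<kind>:<status>"
    types.push(kind + ":" + info.status);
  }
  return types;
}

// Hypothetical stand-in that mirrors the argument order of
// createStepRecord(jobId, flowId, stepType, docURI, info).
function sketchStepRecord(jobId, flowId, stepType, docURI, info) {
  return {
    jobId: jobId,
    flowId: flowId,
    stepType: stepType,
    docURI: docURI,
    types: recordTypes("step", info),
    // info.metadata key/values are carried along unchanged
    metadata: (info && info.metadata) || {}
  };
}

const record = sketchStepRecord("job-1", "flow-1", "mapping", "/doc/1.json", {
  status: "failed",
  metadata: { user: "tester" }
});
console.log(record.types.join(", "));
```

A lookup by status would then match on the generated type rather than scanning record bodies, which is presumably the point of generating it at write time.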

* DHFPROD-1813 - also added language, so that would now appear in the declared artifacts

* DHFPROD-1851 Initial e-node step work (#1895)

* DHFPROD-1851 pre/post hook and ensure MLCP transform compatibility (#1900)

* DHFPROD-1851 Ensure MLCP transform is working properly

* DHFPROD-1851 pre/post hook for flow

* Tweaks based on feedback review in PR from Ryan's fork

* New collector, FlowRunner, post method for jobs and other misc changes (#1904)

* Convert 'options' to json in collector, fix javadocs and other misc changes (#1909)

* Deploy default artifacts (#1908)

* Change Process to Step
Deploy mappings to final as well
Change step artifact uri to steps (align with flows)

* Add LoadHubArtifactsCommand to deploy default mappings and flow

* Deploy default artifacts (#1915)

* Get resource as stream to work in both jar file and via IDE

* Add LoadHubArtifactsCommand to deploy default steps and flows

* Adds capability to iterate over all dirs under flows and steps

* DHFPROD-1853 - WIP

* DHFPROD-1853 - tweaks to make things run

* Better error logging via MLCP

* DHFPROD-1853 - additional cleanup to run the main.sjs

* DHFPROD-1853 - Mocking out the default ingest just so we have something to ingest with, tweaks to mlcp-flow-transform to handle it

* DHFPROD-1853 - accept a mapping response better, actually attempt to cast values according to datatype, todo: still handle arrays

* Add test for hubDeployArtifacts command (#1920)

* DHFPROD-1853 - Tweaking of the mapping code to handle nested entities, arrays of values, and arrays of entities.
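
The casting behaviour mentioned in the DHFPROD-1853 entries (cast by datatype, nested entities, arrays of values, arrays of entities) can be sketched roughly as follows. This is an illustrative sketch in plain JavaScript, not the mapping code from this branch: the property descriptor shape (`datatype`, `items`, `properties`, `from`) and the function names are assumptions.

```javascript
// Hypothetical sketch; the descriptor shape below is assumed for illustration.
// Casts a single source value according to a property descriptor.
function castValue(value, prop) {
  if (value === null || value === undefined) {
    return null;
  }
  switch (prop.datatype) {
    case "array":
      // A lone value is treated as a one-element array
      return (Array.isArray(value) ? value : [value]).map(function (v) {
        return castValue(v, prop.items);
      });
    case "entity":
      // Nested entity: map its properties recursively
      return mapEntity(value, prop.properties);
    case "int":
      return parseInt(value, 10);
    case "decimal":
      return Number(value);
    case "boolean":
      return value === true || value === "true";
    default:
      return String(value);
  }
}

// Builds an entity instance from a source object and property descriptors.
function mapEntity(source, properties) {
  const out = {};
  for (const [name, prop] of Object.entries(properties)) {
    // "from" names the source field; it defaults to the property name
    out[name] = castValue(source[prop.from || name], prop);
  }
  return out;
}

const model = {
  id: { datatype: "int" },
  tags: { datatype: "array", items: { datatype: "string" } },
  owner: { datatype: "entity", properties: { name: { datatype: "string" } } }
};
console.log(JSON.stringify(mapEntity({ id: "7", tags: "a", owner: { name: 5 } }, model)));
```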

* DHFPROD-1854 Initial step prov work (#1930)

* Printing the fullOutput if "options.fullOutput" is set (#1935)

* Setting 'fullOutput' of type Map and returning 'documents' only if 'fullOutput' is set (#1936)

* Update the export jobs task to be for legacy jobs, update our tests that are for the legacy flow running to use runlegacy task instead

* Actually publish the legacy jobs export task might help.

* Update deploy hub artifact test (#1942)

* Remove dummy flow file

* Update hubDeployArtifact test to account for default flows as well

* Because I might be dyslexic.

* DHFPROD-1715 - correction based on @akshaysonvane feedback
aebadirad added a commit that referenced this pull request Mar 12, 2019
* DHFPROD-1710 - stubs

* DHFPROD-1710 add stubs for flowrunnerimpl and flow.sjs

* Remove old e-node code and begin placing in classes

* DHFPROD-1710 - making some minor tweaks to add more than one documentwrite type

* Deploy mapping artifacts even in the absence of entity artifacts

* Creating FlowRunner Class (#1865)

* Creating FlowRunner, Collector and listeners

* Implement the new Collector

* Pass "step" to ml:runFlow endpoint

* Debug export tweak + lion's share of provenance work

Implemented:
  createFlowRecord(flowId, options)  -- currently unused, as I don't believe flow info is tracked, just the use of the flow.
  createJobRecord(jobId, flowId, info)
  createStepRecord(jobId, flowId, stepType, docURI, info)

TODO:
 * Wrap use of provenance API in eval() statement & point to JOBDATABASE
 * Incorporate info.metadata property key/values into provenance data
 * Create query function to search by document uri and return full history
 * Tombstone prov function when a document is deleted (need to determine what this is)

* Add mlLogLevel to the custom tokens and default settings

* Actually check in the prior commit of mlLogLevel custom tokens and default settings

* Add datahub base object, constructors where necessary, stubs in some spots for right now

* DHFPROD-1802 - stub and empty constructor for process.sjs, todo fill out

* create new extensions and transforms for 5, mlcp transform and hubstats/hubversion

* Expose the hub log level variable on the HubConfig interface

* Add defaultConfig if one isn't provided at datahub object creation for override purposes

* Correct spelling of createOn to createdOn

* Default and overridable configs for all the things!

* DHFPROD-1802 - flesh out process.sjs more regarding getting/searching for process artifacts

* Add additional keys to the process object

* Tweaking prov.sjs to use config object, adding delete process to process.sjs, adding default metadata timestamps to mlcp-flow-transforms, and adding a default 'dummy' main.sjs process to test flows with

* Add flow and runflow endpoint extensions, todo: flesh out runflow

* Add RunFlowTask for 5.x flows (#1867)

Add RunLegacyFlowTask for 4.x flows
Add access to FlowRunnerImpl in FlowManager

* DHFPROD-1710 - update for flow running

* Add new gradle task group for running flows

* Create job docs, todo: add batching for running the processor instead of single run in the flow

* UUID generation, queryDocRecords() function implemented

* prov.sjs config tweaks

* Wrapped provenance calls in xdmp.eval()

* Let's remove these, accidentally were checked in.

* Moving uuid() to hub-utils.sjs, refactoring create and delete document methods (#1873)

* Update to prov.sjs: Added metadata, on-the-fly flowTypes via info.status

The library now has the ability to log the status in provenance for Steps, Jobs and Flows.

Pass any info.status for these types to their respective functions:
  createFlowRecord(flowId, info)
  createJobRecord(jobId, flowId, info)
  createStepRecord(jobId, flowId, stepType, docURI, info)

if info.status is defined, a custom on-the-fly flowType will be generated
that we can use to lookup the provenance information.

* DHFPROD-1813 - also added language, so that would now appear in the declared artifacts

* DHFPROD-1851 Initial e-node step work (#1895)

* DHFPROD-1851 pre/post hook and ensure MLCP transform compatibility (#1900)

* DHFPROD-1851 Ensure MLCP transform is working properly

* DHFPROD-1851 pre/post hook for flow

* Tweaks based on feedback review in PR from Ryan's fork

* New collector, FlowRunner, post method for jobs and other misc changes (#1904)

* Convert 'options' to json in collector, fix javadocs and other misc changes (#1909)

* Deploy default artifacts (#1908)

* Change Process to Step
Deploy mappings to final as well
Change step artifact uri to steps (align with flows)

* Add LoadHubArtifactsCommand to deploy default mappings and flow

* Deploy default artifacts (#1915)

* Get resource as stream to work in both jar file and via IDE

* Add LoadHubArtifactsCommand to deploy default steps and flows

* Adds capability to iterate over all dirs under flows and steps

* DHFPROD-1853 - WIP

* DHFPROD-1853 - tweaks to make things run

* Better error logging via MLCP

* DHFPROD-1853 - additional cleanup to run the main.sjs

* DHFPROD-1853 - Mocking out the default ingest just so we have something to ingest with, tweaks to mlcp-flow-transform to handle it

* DHFPROD-1853 - accept a mapping response better, actually attempt to cast values according to datatype, todo: still handle arrays

* Add test for hubDeployArtifacts command (#1920)

* DHFPROD-1853 - Tweaking of the mapping code to handle nested entities, arrays of values, and arrays of entities.

* DHFPROD-1854 Initial step prov work (#1930)

* Printing the fullOutput if "options.fullOutput" is set (#1935)

* Setting 'fullOutput' of type Map and returning 'documents' only if 'fullOutput' is set (#1936)

* Remove the left over default flow stub

* Convert "rdfjson" triples passed in options to MarkLogic triples (#1952)

* DHFPROD-1924: Metadata and options recording (add option.headers) (#1957)

* DHFPROD-1924: Metadata and options recording (add option.headers)

* DHFPROD-1922 Improve MLCP transform performance by accumulating all docs in batch to process at once (#1960)
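
The idea behind DHFPROD-1922, accumulating documents and processing them as one batch instead of one call per document, can be sketched as below. This is a generic illustration in plain JavaScript and not the MLCP transform code; `BatchAccumulator`, its constructor arguments, and the flush-on-size policy are assumptions for the example.

```javascript
// Hypothetical sketch of batch accumulation; not the actual transform code.
class BatchAccumulator {
  // batchSize: number of docs to collect before processing
  // processBatch: callback that receives an array of docs
  constructor(batchSize, processBatch) {
    this.batchSize = batchSize;
    this.processBatch = processBatch;
    this.pending = [];
  }

  // Queue one document; process the batch once it is full.
  add(doc) {
    this.pending.push(doc);
    if (this.pending.length >= this.batchSize) {
      this.flush();
    }
  }

  // Process whatever is queued, e.g. at the end of the input.
  flush() {
    if (this.pending.length === 0) {
      return;
    }
    const batch = this.pending;
    this.pending = [];
    this.processBatch(batch);
  }
}

const calls = [];
const acc = new BatchAccumulator(2, function (batch) {
  calls.push(batch.map(function (d) { return d.uri; }));
});
["/a.json", "/b.json", "/c.json"].forEach(function (uri) {
  acc.add({ uri: uri });
});
acc.flush();
console.log(JSON.stringify(calls));
```

The gain comes from paying any per-call setup cost once per batch rather than once per document; the final `flush()` is needed so a partial last batch is not dropped.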

* DHFPROD-1924: Metadata and options recording (add option.headers)

* fix a bug and simplify

* DHFPROD-1708 WIP run ingest work

* DHFPROD-1702 - updates to the baked in artifact steps

* DHFPROD-1923 handle xml, json, and binary

* DHFPROD-1924 - add metadata (#1969)

* DHFPROD-1708 - adding a default mapping flow, making some minor modifications to flow to always add step name as a collection, moving some bits around to be universally used, updating mapping step to adhere to new structure

* DHFPROD-1708 - tag metadata as part of the write

* Update test for staging triggers

* DHFPROD-1708 - updating the flow counts due to having now two default flows

* Fixes found in review

* Fix MLCP transform

* Add our metadata properly to anything, including no-writes.

* DHFPROD-1702 - adding default options to be generated on creating a type ingest/mapping/custom

* Update marklogic-data-hub/src/main/resources/ml-modules/root/data-hub/5/impl/flow.sjs

Co-Authored-By: aebadirad <alex@ebadirad.com>

* Add getter/setters in StepImpl for new fields (#1996)

Update test artifacts to better represent the step artifacts