Std/Conf/Std+Conf jobs cannot be run twice as a lib #2165

Closed
dk1844 opened this issue Jan 18, 2023 · 1 comment · Fixed by #2166
Labels
bug Something isn't working Conformance Conformance Job affected priority: low Nice to have Standardization Standardization Job affected

Comments

@dk1844
Contributor

dk1844 commented Jan 18, 2023

Describe the problem

The current implementation of the Enceladus Spark jobs initializes Atum internally and does not explicitly disable Atum's control measurement tracking. Instead, it relies on the tracking being disabled implicitly when the Spark session ends.

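However, if one of these Spark jobs is run from other code (library-like), Atum's disabling of the control measurement (CM) tracking is never called. When a job is then run again in the same process, the tracking is still initialized and the exception `Control framework tracking is already initialized.` is raised.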
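
The failure mode can be illustrated with a small self-contained model. This is only a sketch: `TrackingModel`, `JobModel` and `Repro` are hypothetical stand-ins, not Atum or Enceladus classes, and the exception type is an assumption; only the error message is taken from this issue.

```scala
import scala.util.Try

// Hypothetical stand-in for a tracker that keeps process-wide state.
object TrackingModel {
  private var initialized = false

  def enable(): Unit = {
    if (initialized) {
      // Exception type is an assumption; the message is the one reported above.
      throw new IllegalStateException("Control framework tracking is already initialized.")
    }
    initialized = true
  }

  def disable(): Unit = {
    initialized = false
  }
}

// Hypothetical job that enables tracking and never disables it,
// expecting the process to end right after it finishes.
object JobModel {
  def main(args: Array[String]): Unit = {
    TrackingModel.enable()
    // ... the actual work would happen here ...
  }
}

object Repro {
  // Runs the job twice in the same JVM, as a library caller would.
  def runTwice(): (Try[Unit], Try[Unit]) = {
    val first = Try(JobModel.main(Array.empty[String]))
    val second = Try(JobModel.main(Array.empty[String]))
    (first, second)
  }
}
```

In this model the first run succeeds and the second one fails, because nothing resets the shared state between the two calls.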

To Reproduce

Steps to reproduce the behavior OR commands run:

  1. Run `StandardizationJob.main()` from other code more than once.
  2. See the error `Control framework tracking is already initialized.`

Expected behavior

Running Spark jobs multiple times from other code should work.

Additional context

If this kind of usage is to be supported, Atum's `spark.disableControlMeasuresTracking()` must be called explicitly at the end of every Enceladus Spark job.
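
One way to guarantee that the disabling call always runs is to pair it with the initialization in a `try`/`finally`. The sketch below is not necessarily how the fix is implemented in Enceladus; `PairedJob` and `withTracking` are hypothetical names, and the `enable` and `disable` parameters stand in for Atum's initialization and `spark.disableControlMeasuresTracking()`.

```scala
// Hypothetical helper; only the try/finally pairing is the point here.
object PairedJob {
  // Calls `enable`, evaluates `body`, and always calls `disable` afterwards,
  // whether `body` returns normally or throws.
  def withTracking[A](enable: () => Unit, disable: () => Unit)(body: => A): A = {
    enable()
    try body
    finally disable()
  }
}
```

With this shape, a job that fails halfway still leaves the tracking disabled, so a later job in the same process can initialize it again.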

Temporary Workaround

Until this is fixed and released, anyone using the Enceladus Spark jobs in this as-a-library fashion can explicitly call

spark.disableControlMeasuresTracking()

between individual jobs.
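
On the caller's side, the workaround could be wrapped in a small helper like the one below. This is a minimal sketch with hypothetical names (`ChainedJobs`, `runAll`); the jobs and the cleanup are passed in as plain functions so the snippet does not depend on Enceladus or Atum being on the classpath.

```scala
// Hypothetical caller-side helper for chaining jobs in one process.
object ChainedJobs {
  // Runs the jobs in order and calls `cleanup` after each one.
  // If a job throws, `cleanup` still runs, then the exception
  // propagates and the remaining jobs are not started.
  def runAll(jobs: Seq[() => Unit])(cleanup: () => Unit): Unit = {
    jobs.foreach { job =>
      try job()
      finally cleanup()
    }
  }
}
```

In real code, each job would presumably be something like `() => StandardizationJob.main(args)` and the cleanup `() => spark.disableControlMeasuresTracking()`, assuming `spark` is the active session with Atum's implicit extensions in scope and `args` holds the job's command-line arguments.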

@dk1844 dk1844 added bug Something isn't working priority: undecided Undecided priority to be assigned after discussion labels Jan 18, 2023
dk1844 added a commit that referenced this issue Jan 18, 2023
…the end of each Std/Conf/Std+Conf spark job - so that users are able to chain spark jobs.

`prepareStandardization()` and `finishJob()` are now paired in this sense
 - unit test added
@benedeki
Collaborator

I am not sure it's a bug, rather an improvement (after all, it was never intended to run this way), but we can keep the designation... 😉

@benedeki benedeki added good first issue Good for newcomers Conformance Conformance Job affected Standardization Standardization Job affected priority: low Nice to have and removed priority: undecided Undecided priority to be assigned after discussion labels Jan 18, 2023
dk1844 added a commit that referenced this issue Jan 23, 2023
* #2165 Atum's spark.disableControlMeasuresTracking() is now called at the end of each Std/Conf/Std+Conf spark job - so that users are able to chain spark jobs.
`prepareStandardization()` and `finishJob()` are now paired in this sense
 - unit test added
- review update: commons-TempDirectory used instead of nio-Files

Co-authored-by: David Benedeki <14905969+benedeki@users.noreply.github.com>
@dk1844 dk1844 removed the good first issue Good for newcomers label Jan 23, 2023
dk1844 added a commit that referenced this issue Jan 26, 2023
dk1844 added a commit that referenced this issue Jan 26, 2023
benedeki added a commit that referenced this issue Jan 26, 2023
Co-authored-by: Daniel Kavan <dk1844@gmail.com>