Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(engine): allow exclusive jobs across sub-processes #4146

Merged
merged 37 commits into from
Mar 22, 2024

Conversation

psavidis
Copy link
Contributor

@psavidis psavidis commented Mar 4, 2024

Related-to: #4004

@psavidis psavidis self-assigned this Mar 4, 2024
@psavidis psavidis changed the title Define exclusive jobs across subprocesses new feat(engine): allow exclusive jobs across sub-processes Mar 4, 2024
@psavidis psavidis added the ci:all-db Runs the builds for all databases. label Mar 4, 2024
imports, static imports, spaces, newLines
- create / drop
- upgrade
@psavidis psavidis force-pushed the define-exclusive-jobs-across-subprocesses-new branch from c9d161f to 590308d Compare March 6, 2024 08:17
- create / drop
- upgrade
- use small letter for DDL
`enableExclusivenessAcrossProcessInstances` controls the feature
- Parameterize the query `selectNextJobsToExecute` base on the config `enableExclusivenessAcrossProcessInstances`
- When the feature is disabled, the select columns and the selection of exclusive jobs will not consider the new column at all
- The legacy query select queries will be preserved when the feature is turned off.
Rename config to `jobExecutorAcquireExclusiveOverProcessHierarchies`

- Respects the `jobExecutorAcquire` convention
- Emphasizes the feature is about process hierarchies and the application of exclusive jobs there
@psavidis psavidis force-pushed the define-exclusive-jobs-across-subprocesses-new branch from 590308d to 675a832 Compare March 7, 2024 11:01
What:

- Rename all test cases to better reflect their purpose
- Add test for feature disabled
- Add java doc and comments
- Reformat all test class
- For testing the default legacy acquisition behaviour (parallel execution) of 7.20 processes with multiple hierarchies
- For testing the new behaviour of sequential behaviour of 7.20 processes with multiple hierarchies

- 7.20 processes will have the new column `rootProcessId` with `null` value in the batch table
# Conflicts:
#	engine/src/main/resources/org/camunda/bpm/engine/db/upgrade/mssql_engine_7.20_to_7.21.sql
- MariaDB
- MySQL
- SQLServer
@psavidis psavidis force-pushed the define-exclusive-jobs-across-subprocesses-new branch from 937f133 to 8eb176f Compare March 14, 2024 07:51
@psavidis psavidis added ci:mariadb Runs the builds for the MariaDB database. and removed ci:mariadb Runs the builds for the MariaDB database. labels Mar 14, 2024
@psavidis
Copy link
Contributor Author

Update

Context: Manual testing the feature
What: camunda-run & the new feature-flag

  • Deployed a root process having a multi-instance call activity with cardinality=1000 and a subprocess with a groovy script task which prints a "hello world from thread:" + Thread.currentThread().name
  • Given the feature is disabled & a new process instance is started
    • The "hello world" message gets printed numerous times on the console & there are various logs with different task executor threads printing OptimisticLockingException instances; This is the reproduction of the legacy behaviour
  • Given the feature is enabled & a new process instance is started
    • The are no more OptimisticLockingException instances in the logs and the print statements follow sequential thread execution according to the DefaultJobExecutor's queueSize=3 & corePoolSize=3
Screenshot 2024-03-14 at 5 43 32 PM Screenshot of logs with feature disabled
Screenshot of logs with feature enabled Screenshot 2024-03-14 at 5 52 28 PM

@psavidis
Copy link
Contributor Author

psavidis commented Mar 15, 2024

Update

All Pipelines passed:

There is no need to wait for the last build since the commit concerns fixing a typo & homogenising the comments in the unit tests

@psavidis psavidis marked this pull request as ready for review March 15, 2024 08:46
@psavidis psavidis added ci:db2 Runs the builds for the DB2 database. ci:sqlserver Runs the builds for the MS SQL Server database. and removed ci:all-db Runs the builds for all databases. labels Mar 20, 2024
- Extract tests into ExclusiveJobAcquisitionTest & refactor setup / tearDown logic
- Extract job executor wait logic into separate JobExecutorWaitUtils class
@psavidis
Copy link
Contributor Author

psavidis commented Mar 20, 2024

Update

Context: The Unit tests were expected to fail with db2 and mssql
Why: The adjustment on the SQL for h2 & db2 was not commited when the CI was executed.
Problem: The tests were h2-specific and they were using a custom ProcessEngineConfiguration setup. Thus, they wouldn't run against all databases.
Solution: This commit makes the tests db-agnostic instead of h2-specific; that means the ExclusiveJobAcquisitionTest will be executed in h2 or any other db profile including db2 & mssql, allowing to reveal any vendor-specific SQL issues

@psavidis psavidis added ci:all-db Runs the builds for all databases. and removed ci:db2 Runs the builds for the DB2 database. ci:sqlserver Runs the builds for the MS SQL Server database. labels Mar 20, 2024
@psavidis
Copy link
Contributor Author

psavidis commented Mar 20, 2024

Update

The Code Review can proceed in the meantime; the last two points (pipeline build with all-dbs & revert label) can be done in parallel (the approval can wait till the build passes)

Assigning to @yanavasileva for a second review round.

@psavidis psavidis added ci:oracle Runs the builds for the Oracle database. and removed ci:all-db Runs the builds for all databases. labels Mar 20, 2024
Oracle returns group batches in lists which contain elements in different order.
This commit changes the groups from list into sets to avoid the ordering issue for databases which might return elements in different order.
Detected flakiness during oracle run due to clock discrepancy (1h).
Adding the clock reset in the class to make it stable
@psavidis
Copy link
Contributor Author

psavidis commented Mar 21, 2024

Update

The last build using Oracle failed in DecisionDefinitionTest with a clock discrepancy that looks unrelated to the adjustments of this PR (ExclusiveJobAcquisitionTest does not depend on Clock). For the sake of making the build pass, added a clock reset on DecisionDefinitionTest to fix the flakiness

Screenshot Screenshot 2024-03-21 at 10 06 59 AM

Context: The HTTL Period expressions in days are affected by the date the tests run.
Problem: The expected date are calculated and have discrepancies with DST
Solution: Make the Test consider a fixed date that belongs to the past and make sure the data is bounded and unaffected by the DST.
Link: Commit that fixes the issue

@psavidis psavidis force-pushed the define-exclusive-jobs-across-subprocesses-new branch from 07cf324 to 19605e8 Compare March 21, 2024 12:14
@yanavasileva yanavasileva added the ci:cockroachdb Runs the builds for the CockroachDB database. label Mar 21, 2024
Copy link
Member

@yanavasileva yanavasileva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Nice. I only have suggestions for improvements in the tests.

this.repositoryService = engineRule.getRepositoryService();
this.decisionService = engineRule.getDecisionService();
this.historyService = engineRule.getHistoryService();

SimpleDateFormat sdf = new SimpleDateFormat("dd/MM/yyyy hh:mm:ss.SSS");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context: Fixes Flaky DST issues with this test
Why: The change is not related to this PR but will be committed together since it was identified

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to put a note for the fix in the commit message upon merge.
I think the fix for the test is good to go.

psavidis and others added 2 commits March 21, 2024 17:28
Co-authored-by: yanavasileva <yanavasileva@users.noreply.github.com>
- Rename asArrayOfSets method
- Remove ProcessEngineState
- Reorder & group instance fields to be symmetric
@psavidis
Copy link
Contributor Author

psavidis commented Mar 21, 2024

Update

All Code review points are addressed. Will assign after a successful build for the final approval.

Next steps after approval: Revert the commit in the stage-types.yml file.

@yanavasileva yanavasileva removed the ci:cockroachdb Runs the builds for the CockroachDB database. label Mar 21, 2024
Copy link
Member

@yanavasileva yanavasileva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Thank you for considering my feedback. Please wait for the CI before merging.

@psavidis
Copy link
Contributor Author

psavidis commented Mar 22, 2024

Update

All pipelines passed*

  • The failure of CockroachDB is not related to this PR
Screenshot Screenshot 2024-03-22 at 10 12 54 AM

Proceeding to revert the change on stage files and merge.

@psavidis psavidis merged commit 94024a0 into master Mar 22, 2024
2 of 3 checks passed
@psavidis psavidis deleted the define-exclusive-jobs-across-subprocesses-new branch March 22, 2024 10:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci:migration Runs the process instance migration builds. ci:oracle Runs the builds for the Oracle database.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants