Allow to run new "data" upgrade tasks #24681

Closed
jgambarios opened this issue Apr 19, 2023 · 4 comments · Fixed by #24710

Comments

@jgambarios
Contributor

jgambarios commented Apr 19, 2023

Parent Issue

#24093

User Story

Some upgrade tasks need to call our APIs. Running those tasks together with the regular "schema" upgrade tasks can cause multiple problems, so we need to be able to run these "data" upgrade tasks after the regular "schema" upgrade tasks have finished.

Acceptance Criteria

Allow creating "Data" upgrade tasks that are called after the regular "Schema" upgrade tasks are done.

Proposed Objective

Core Features

Proposed Priority

Priority 2 - Important

@jgambarios
Contributor Author

Solution

The code allows defining a new group of upgrade tasks, the Data Upgrade Tasks. These are mostly tasks that fix data issues using our existing APIs. To avoid conflicts with the Schema (regular) upgrade tasks, the Data Upgrade Tasks run at the very end of the startup process.

Two version tables will now exist: the existing db_version and the new data_version, which tracks the version for the Data Upgrade Tasks.
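As a rough illustration of the ordering and version tracking described above, here is a minimal in-memory sketch. It is not the actual dotCMS implementation: `UpgradeTask`, `TwoPhaseUpgradeRunner`, and the schema task number 230328 are hypothetical, and the two integer fields only stand in for the db_version and data_version tables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for an upgrade task; not the dotCMS interface.
interface UpgradeTask {
    int version();
    void execute();

    // Small factory so the demo can build tasks from a number and a body.
    static UpgradeTask of(int taskVersion, Runnable body) {
        return new UpgradeTask() {
            @Override public int version() { return taskVersion; }
            @Override public void execute() { body.run(); }
        };
    }
}

// Hypothetical model of a startup that runs schema tasks first, data tasks last.
class TwoPhaseUpgradeRunner {
    // In-memory stand-ins for the db_version and data_version tables.
    int dbVersion = 0;
    int dataVersion = 0;
    final List<String> executed = new ArrayList<>();

    void runStartup(List<UpgradeTask> schemaTasks, List<UpgradeTask> dataTasks) {
        // Phase 1: regular (schema) tasks, tracked by dbVersion.
        dbVersion = runPending("schema", schemaTasks, dbVersion);
        // Phase 2: data tasks start only after every schema task has finished.
        dataVersion = runPending("data", dataTasks, dataVersion);
    }

    private int runPending(String group, List<UpgradeTask> tasks, int currentVersion) {
        List<UpgradeTask> ordered = new ArrayList<>(tasks);
        ordered.sort(Comparator.comparingInt(UpgradeTask::version));
        int version = currentVersion;
        for (UpgradeTask task : ordered) {
            if (task.version() <= version) {
                continue; // already applied on a previous startup
            }
            task.execute();
            version = task.version();
            executed.add(group + ":" + version);
        }
        return version;
    }
}

class TwoPhaseUpgradeDemo {
    public static void main(String[] args) {
        TwoPhaseUpgradeRunner runner = new TwoPhaseUpgradeRunner();
        // The data task has the lower number but still runs after the schema task.
        List<UpgradeTask> schema = Arrays.asList(UpgradeTask.of(230328, () -> {}));
        List<UpgradeTask> data = Arrays.asList(UpgradeTask.of(230320, () -> {}));
        runner.runStartup(schema, data);
        runner.runStartup(schema, data); // second startup: nothing is pending
        System.out.println(runner.executed
                + " db=" + runner.dbVersion + " data=" + runner.dataVersion);
    }
}
```

Keeping a separate counter per group is what lets a data task with a lower number than the latest schema task still be picked up, since it is only compared against the data version.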

To define a new data upgrade task, we just need to create a regular upgrade task, but instead of registering the class inside TaskLocatorUtil.getStartupRunOnceTaskClasses, it needs to be registered in TaskLocatorUtil.getStartupRunOnceDataTaskClasses.
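A self-contained sketch of what that registration could look like follows. Only the two method names come from the description above. The `StartupTaskSketch` interface, the `TaskLocatorSketch` class, the example task `Task990101ExampleDataFix`, and the `versionOf` helper (which assumes the version is the number embedded in the class name) are hypothetical stand-ins, and the real signatures in dotCMS may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the startup task contract.
interface StartupTaskSketch {
    boolean forceRun();
    void executeUpgrade();
}

// Hypothetical data upgrade task: written like any regular upgrade task.
class Task990101ExampleDataFix implements StartupTaskSketch {
    @Override public boolean forceRun() { return true; }
    @Override public void executeUpgrade() {
        System.out.println("Fixing data through the APIs (placeholder)");
    }
}

// Hypothetical locator mirroring the two registration points named above.
class TaskLocatorSketch {
    // Regular (schema) run-once tasks; the data task is NOT added here.
    static List<Class<? extends StartupTaskSketch>> getStartupRunOnceTaskClasses() {
        return new ArrayList<>();
    }

    // Data run-once tasks; this is where the new task gets registered.
    static List<Class<? extends StartupTaskSketch>> getStartupRunOnceDataTaskClasses() {
        List<Class<? extends StartupTaskSketch>> tasks = new ArrayList<>();
        tasks.add(Task990101ExampleDataFix.class);
        return tasks;
    }

    // Assumption: the task version is the number that follows "Task" in the class name.
    static int versionOf(Class<?> taskClass) {
        Matcher m = Pattern.compile("^Task(\\d+)").matcher(taskClass.getSimpleName());
        if (!m.find()) {
            throw new IllegalArgumentException("No version in " + taskClass.getSimpleName());
        }
        return Integer.parseInt(m.group(1));
    }
}

class RegistrationDemo {
    public static void main(String[] args) {
        for (Class<? extends StartupTaskSketch> c
                : TaskLocatorSketch.getStartupRunOnceDataTaskClasses()) {
            System.out.println(c.getSimpleName()
                    + " -> data version " + TaskLocatorSketch.versionOf(c));
        }
    }
}
```

The task class itself carries nothing that marks it as a data task; in this sketch, the list it is registered in is the only thing that decides which group it runs with.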

@jgambarios
Contributor Author

PR: #24710

@nollymar nollymar linked a pull request Apr 21, 2023 that will close this issue
nollymar pushed a commit that referenced this issue Apr 21, 2023
* #24681 Adding support to run data upgrade tasks

* #24681 Adding integration tests

* #24681 Applying code style
@nollymar nollymar reopened this Apr 21, 2023
@fabrizzio-dotCMS
Contributor

fabrizzio-dotCMS commented May 10, 2023

We had a situation where a new upgrade task (A) was built using some of our existing APIs, making it prone to failure when new changes are introduced to those APIs.
For example, task (A) was using our Content-Type API, but that API had been modified to depend on a new column created by another upgrade task (B), which needs to be applied first. Therefore, upgrade task (A) cannot run, because it depends on task (B), which is scheduled to run after it.

I stumbled onto this situation while working on upgrade task B.
I was able to verify that the updated code prevents the error from happening. Therefore, I'm passing it.

@bryanboza
Member

Fixed. Tested running the upgrade task on release-23.06 // Docker, and this is working as expected:

testing_pg-dotcms-1         | 21:45:39.089  INFO  startup.StartupTasksExecutor - 
testing_pg-dotcms-1         | 21:45:39.089  INFO  startup.StartupTasksExecutor - Running Data Upgrade Tasks
testing_pg-dotcms-1         | 21:45:39.089  INFO  startup.StartupTasksExecutor - Database data version: 0
testing_pg-dotcms-1         | 21:45:39.107  INFO  startup.StartupTasksExecutor - Running Data Upgrade Tasks: Task230320FixMissingContentletAsJSON
testing_pg-dotcms-1         | 21:45:39.107  INFO  runonce.Task230320FixMissingContentletAsJSON - Running upgrade Task230320FixMissingContentletAsJSON
testing_pg-dotcms-1         | 21:45:39.168  INFO  impl.StdSchedulerFactory - Using default implementation for ThreadExecutor
testing_pg-dotcms-1         | 21:45:39.170  INFO  simpl.SimpleThreadPool - Job execution threads will use class loader of thread: main
testing_pg-dotcms-1         | 21:45:39.189  INFO  core.SchedulerSignalerImpl - Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
testing_pg-dotcms-1         | 21:45:39.192  INFO  core.QuartzScheduler - Quartz Scheduler v.1.8.6 created.
testing_pg-dotcms-1         | 21:45:39.196  INFO  quartz.DotJobStore - Using db table-based data access locking (synchronization).
testing_pg-dotcms-1         | 21:45:39.198  INFO  quartz.DotJobStore - JobStoreCMT initialized.
testing_pg-dotcms-1         | 21:45:39.199  INFO  core.QuartzScheduler - Scheduler meta-data: Quartz Scheduler (v1.8.6) 'dotCMSQuartz' with instanceId 'NON_CLUSTERED'
testing_pg-dotcms-1         |   Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
testing_pg-dotcms-1         |   NOT STARTED.
testing_pg-dotcms-1         |   Currently in standby mode.
testing_pg-dotcms-1         |   Number of jobs executed: 0
testing_pg-dotcms-1         |   Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 10 threads.
testing_pg-dotcms-1         |   Using job-store 'com.dotmarketing.quartz.DotJobStore' - which supports persistence. and is clustered.
testing_pg-dotcms-1         | 
testing_pg-dotcms-1         | 21:45:39.199  INFO  impl.StdSchedulerFactory - Quartz scheduler 'dotCMSQuartz' initialized from an externally provided properties instance.
testing_pg-dotcms-1         | 21:45:39.199  INFO  impl.StdSchedulerFactory - Quartz scheduler version: 1.8.6
testing_pg-dotcms-1         | 21:45:39.226  INFO  json.PopulateContentletAsJSONUtil - Populate Contentlet as JSON task started for asset subtype [Host]
testing_pg-dotcms-1         | 21:45:39.272  INFO  json.PopulateContentletAsJSONUtil - -- Records found to process: 0
testing_pg-dotcms-1         | 21:45:39.274  INFO  json.PopulateContentletAsJSONUtil - Updating records with missing Contentlet as JSON
testing_pg-dotcms-1         | 21:45:39.275  INFO  json.PopulateContentletAsJSONUtil - -- total updates: 0
testing_pg-dotcms-1         | 21:45:39.275  INFO  json.PopulateContentletAsJSONUtil - Updated records with missing Contentlet as JSON
testing_pg-dotcms-1         | 21:45:39.277  INFO  json.PopulateContentletAsJSONUtil - Contentlet as JSON migration task DONE for assetSubtype: [Host] / excludingAssetSubtype [null].
testing_pg-dotcms-1         | 21:45:39.282  INFO  json.PopulateContentletAsJSONUtil - Call for class: com.dotcms.util.content.json.PopulateContentletAsJSONUtil#populate, duration:48 millis
testing_pg-dotcms-1         | 21:45:39.283  INFO  startup.StartupTasksExecutor - Data upgraded to version: 230320

After this UT runs once, it does not run again.

The data_version table was created, with just one record in it.
