Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#12048] V9 migration and verification script optimization #12896

Merged

Conversation

FergusMok
Copy link
Contributor

@FergusMok FergusMok commented Mar 13, 2024

Part of #12048

  1. Migration: Increase the hibernate batch size to 50.
  2. Verification: Batched the datastore queries, instead of doing singular queries. Also ordered by id instead of createdAt to take advantage of SQL's indexes.

Copy link

Hi @FergusMok, thank you for your interest in contributing to TEAMMATES!
However, your PR does not appear to follow our contribution guidelines:

  • Title must start with the issue number the PR is fixing in square brackets, e.g. [#<issue-number>]
  • Description must reference the issue number the PR is fixing, e.g. Fixes #<issue-number> (or Part of #<issue-number> if the PR does not address the issue fully)

Please address the above before we proceed to review your PR.

Comment on lines -49 to -53
// 100 is the optimal batch size as there won't be too much time interval
// between read and save (if transaction is not used)
// cannot set number greater than 300
// see
// https://stackoverflow.com/questions/41499505/objectify-queries-setting-limit-above-300-does-not-work
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've verified that this is no longer true. Running it on staging with 1000 does not lead to any crashes. However, it's slower on average compared to the original 100.

@@ -110,6 +110,7 @@ public static void buildSessionFactory(String dbUrl, String username, String pas

Configuration config = new Configuration()
.setProperty("hibernate.dialect", "org.hibernate.dialect.PostgreSQLDialect")
.setProperty("hibernate.jdbc.batch_size", "50")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we want to commit this to master, as it's only for migration purposes. Maybe we could comment this line for master and put another comment on the migration script to remember to uncomment this line.

Copy link
Contributor

@ziqing26 ziqing26 Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to change the config in the script? Otherwise I think we can put a comment in script and comment this line out on master.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. I've added the notes

@FergusMok FergusMok changed the title v9 migration and verification script optimization [#12048] V9 migration and verification script optimization Mar 13, 2024
@FergusMok FergusMok self-assigned this Mar 13, 2024
@FergusMok FergusMok marked this pull request as ready for review March 13, 2024 10:05
Copy link
Contributor

@ziqing26 ziqing26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to the verification script LGTM! But it might be better the HibernateUtil change to be commented out first and put a note

Comment on lines +145 to -148
HibernateUtil.beginTransaction();
for (Map.Entry<String, Instant> entry : oldAccount.getReadNotifications().entrySet()) {
HibernateUtil.beginTransaction();
UUID notificationId = UUID.fromString(entry.getKey());
Notification newNotification = HibernateUtil.get(Notification.class, notificationId);
HibernateUtil.commitTransaction();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this cause a problem if one read notification migration fail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the migration does the early escape at if (newNotification == null) {? I think probably not, because it's outside of the entire loop

@FergusMok
Copy link
Contributor Author

Running it on staging (Singapore for datastore and SQL) for 10000 AccountRequest migration and verification:

Before: 294s
After: 15s

@ziqing26
Copy link
Contributor

ziqing26 commented Mar 14, 2024

Yep the notes looks good, thanks for the optimization! Now just need to fix the lint

Copy link
Contributor

@NicolasCwy NicolasCwy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM hope this speeds up the migration overall ✨

One note on readability though. If the output is readable lets merge this in

List<T> sqlEntities = lookupSqlEntitiesByPageNumber(currPageNum);
long endTimeForSql = System.currentTimeMillis();
log("Querying for SQL for page " + currPageNum + " took "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these timings (here and DataMigrationEntitiesBaseScriptSql.java) affect how readable the logged output is? If it is perhaps we can remove it from the PR or add a flag/ log option to only output on debug or something

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's less verbose because I've adjusted batch size to be 1000 instead of 100. The log statements can also show us what to further optimize if it's still slow during migration.

@FergusMok FergusMok merged commit 2342189 into TEAMMATES:master Mar 14, 2024
11 checks passed
FergusMok added a commit to ziqing26/teammates that referenced this pull request Mar 18, 2024
…EAMMATES#12896)

* Add changes

* Add changes for migration

* Revert the illegals

* Linting

* Add additional batching

* Add notes

* Fix linting errors

---------

Co-authored-by: Zhang Ziqing <69516975+ziqing26@users.noreply.github.com>
@cedricongjh cedricongjh added this to the V9.0.0-beta.1 milestone Mar 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants