Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scrub speedup #1654

Merged
merged 2 commits into from
Jun 3, 2023
Merged

Conversation

christopherlam
Copy link
Contributor

@christopherlam christopherlam commented Jun 1, 2023

Instead of #1651 which does reduce the number of scrubs dramatically, this branch will augment xaccSplitScrub to take a dry_run option returning true/false if a scrub must take place.

Most splits wouldn't need scrubbing therefore the slow Begin/Commit can be skipped.

it's defined in Scrub.h anyway
@christopherlam
Copy link
Contributor Author

Some benchmarks for scrub_all from main account tree:

@christopherlam christopherlam marked this pull request as ready for review June 2, 2023 15:00
@code-gnucash-org code-gnucash-org merged commit 81902ba into Gnucash:stable Jun 3, 2023
3 checks passed
@christopherlam christopherlam deleted the scrub-speedup branch June 3, 2023 07:41
@christopherlam
Copy link
Contributor Author

christopherlam commented Jun 12, 2023

I found an issue with this branch: the refactor will use QofQuery to in get_all_transactions whereas the previous code used xaccAccountGetSplitList. The refactor unfortunately includes the blank transaction.

As a result, the blank transaction will be scrubbed, which results in unwanted Orphan accounts.

We can fix this by redefining get_all_transactions as follows, which is slightly ugly, but it is still faster than the workload from prior this branch because it skips duplicate transactions. Any better ideas?

modified   libgnucash/engine/Scrub.c
@@ -96,13 +96,13 @@ get_all_transactions (Account *account, bool descendants)
 {
     GList *accounts = descendants ? gnc_account_get_descendants (account) : NULL;
     accounts = g_list_prepend (accounts, account);
-    QofQuery *q = qof_query_create_for (GNC_ID_TRANS);
-    QofBook *book = qof_session_get_book (gnc_get_current_session ());
-    qof_query_set_book (q, book);
-    xaccQueryAddAccountMatch (q, accounts, QOF_GUID_MATCH_ANY, QOF_QUERY_AND);
-    GList *transactions = g_list_copy (qof_query_run (q));
-    qof_query_destroy (q);
-    return transactions;
+    GHashTable *ht = g_hash_table_new (g_direct_hash, g_direct_equal);
+    for (GList *n = accounts; n; n = g_list_next (n))
+        for (GList *m = xaccAccountGetSplitList (n->data); m; m = g_list_next (m))
+            g_hash_table_add (ht, xaccSplitGetParent (m->data));
+    GList *rv = g_hash_table_get_keys (ht);
+    g_hash_table_destroy (ht);
+    return rv;
 }

@jralls
Copy link
Member

jralls commented Jun 12, 2023

Not just ugly, it'll be slow as heck on a large book with a lot of accounts. The blank transaction should have a non-zero edit level: You could test that and skip the scrub. That would also skip scrubbing any other transaction being edited and ISTM that's right.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants