Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize uuid building #4279

Merged
merged 3 commits into from Mar 11, 2021
Merged

Conversation

bradymiller
Copy link
Sponsor Member

@bradymiller bradymiller commented Mar 10, 2021

Just seeing if:

  1. Can speed things up a bit
  2. Avoid breakage on huge updates

example of testing script:

<?php
// halved on each subsequent run (there are 4 runs)
$records_run_start = 1000000;

set_time_limit(0);
$ignoreAuth = true; // no login required
require_once('interface/globals.php');

use OpenEMR\Common\Uuid\UuidRegistry;

foreach (['one', 'two', 'three', 'four'] as $run) {
    if ($run == 'one') {
        $records_this_run = $records_run_start;
    } elseif ($run == 'two') {
        $records_this_run = $records_run_start / 2;
    } elseif ($run == 'three') {
        $records_this_run = $records_run_start / 4;
    } else { // $run == 'four'
        $records_this_run = $records_run_start / 8;
    }
    echo "number of records on this run: " . $records_this_run . "<br>";
    error_log("number of records on this run: " . $records_this_run);
    $start = microtime(true);
    for ($i=0; $i<$records_this_run; $i++) {
        sqlStatement("INSERT INTO `procedure_result` (`date`) VALUES (NOW())");
    }
    echo "time (minutes) to complete data insert for run " . $run . ": " . round((microtime(true) - $start) / 60) . "<br>";
    error_log("time (minutes) to complete data insert for run " . $run . ": " . round((microtime(true) - $start) / 60));
    $start = microtime(true);
    (new UuidRegistry(['table_name' => 'procedure_result', 'table_id' => 'procedure_result_id']))->createMissingUuids();
    echo "time (minutes) to complete uuid insert for run " . $run . ": " . round((microtime(true) - $start) / 60) . "<br>";
    error_log("time (minutes) to complete uuid insert for run " . $run . ": " . round((microtime(true) - $start) / 60));
    echo "uuids per minute for run " . $run . ": " . round($records_this_run / ((microtime(true) - $start) / 60)) . "<br><br>";
    error_log("uuids per minute for run " . $run . ": " . round($records_this_run / ((microtime(true) - $start) / 60)));
}
?>

Output for 100,000 run (in 1000 blocks):

number of records on this run: 100000
time (minutes) to complete data insert for run one: 3
time (minutes) to complete uuid insert for run one: 14
uuids per minute for run one: 7393

number of records on this run: 50000
time (minutes) to complete data insert for run two: 3
time (minutes) to complete uuid insert for run two: 8
uuids per minute for run two: 6263

number of records on this run: 25000
time (minutes) to complete data insert for run three: 2
time (minutes) to complete uuid insert for run three: 4
uuids per minute for run three: 6296

number of records on this run: 12500
time (minutes) to complete data insert for run four: 1
time (minutes) to complete uuid insert for run four: 2
uuids per minute for run four: 6143

Output for 100,000 run (in 100 blocks):

number of records on this run: 100000
time (minutes) to complete data insert for run one: 3
time (minutes) to complete uuid insert for run one: 13
uuids per minute for run one: 7506

number of records on this run: 50000
time (minutes) to complete data insert for run two: 3
time (minutes) to complete uuid insert for run two: 8
uuids per minute for run two: 6183

number of records on this run: 25000
time (minutes) to complete data insert for run three: 2
time (minutes) to complete uuid insert for run three: 4
uuids per minute for run three: 6001

number of records on this run: 12500
time (minutes) to complete data insert for run four: 1
time (minutes) to complete uuid insert for run four: 2
uuids per minute for run four: 6001

Output for 1,000,000 run (in 1000 blocks):

number of records on this run: 1000000
time (minutes) to complete data insert for run one: 61
time (minutes) to complete uuid insert for run one: 163
uuids per minute for run one: 6130

number of records on this run: 500000
time (minutes) to complete data insert for run two: 32
time (minutes) to complete uuid insert for run two: 83
uuids per minute for run two: 6058

number of records on this run: 250000
time (minutes) to complete data insert for run three: 17
time (minutes) to complete uuid insert for run three: 40
uuids per minute for run three: 6283

number of records on this run: 125000
time (minutes) to complete data insert for run four: 8
time (minutes) to complete uuid insert for run four: 20
uuids per minute for run four: 6146

$stringQuery .= " AND `" . $column . "` = ? ";
$done = false;
while (!$done) {
// just maximum of 1000 at a time to attempt to speed things up and not break when inserting a large number of uuids
Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plan to try 100, 500, 1000, 10000 to see what is best (or if there is even any difference)

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just gonna stick with 1000 for now. this is not really speeding things up (or slowing things down), but it definitely prevents breaking when doing very large updates (when having to do more than 500,000 uuid updates on a table).

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addams_family

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants