Add populateHashField.php #3611

Merged
merged 1 commit into from Jan 13, 2019

mwjames
Contributor

@mwjames mwjames commented Jan 13, 2019

This PR is made in reference to: #

This PR addresses or contains:

  • During an upgrade, some elements of a table may need to be updated and, depending on their importance, this has to happen before any access to that table occurs. Due to the sheer size of the table, however, the update could hog resources and make update.php run for a considerable time.
  • This PR provides a mechanism to postpone such tasks and to display a message block about outstanding tasks.
  • The PR includes an example implementation for the smw_hash conversion task. The initial update of smw_hash (Add IDs warmUpCache #3080) only happens once, when switching to 3.0. With a set execution threshold of 200000 (meaning rows still missing the conversion), the conversion is part of the normal update.php / setupStore.php routine; beyond that, the process is postponed (the status is saved in .smw.json) so that administrators can run the upgrade quickly, while a message starts to appear to remind users and administrators that the upgrade hasn't finished yet and requires immediate action.
  • The message cannot be turned off until follow-up actions remove the outstanding obstacles. In the case of smw_hash, the populateHashField.php script is provided to finalize the conversion and set "populate.smw_hash_field_complete": true.
  • As previously noted (Add smwgConfigFileDir, refs 3506, 3563 #3596 (comment)), .smw.json will contain more than the upgrade key, allowing the system to store state information and thereby providing a method to check its software, upgrade, and components.
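
The postponement decision described above can be sketched roughly as follows. This is an illustrative Python sketch, not the PHP code in this PR: the function names `plan_hash_population` and `record_status` are hypothetical, and treating a count exactly equal to the threshold as "inline" is an assumption. Only the threshold value and the status key are taken from the description.

```python
import json
from pathlib import Path

# Value quoted in this PR; the constant name follows the discussion.
COUNT_SCRIPT_EXECUTION_THRESHOLD = 200_000
STATUS_KEY = "populate.smw_hash_field_complete"


def plan_hash_population(missing_rows, threshold=COUNT_SCRIPT_EXECUTION_THRESHOLD):
    """Decide where the conversion runs (hypothetical helper).

    Returns "inline" when the number of unconverted rows is small enough
    to handle during the regular update, otherwise "postponed".
    The behaviour at exactly `threshold` rows is an assumption.
    """
    return "inline" if missing_rows <= threshold else "postponed"


def record_status(setup_file, wiki_id, complete):
    """Store the completion flag under the wiki's entry (hypothetical helper).

    Existing keys such as "upgrade_key" are preserved.
    """
    path = Path(setup_file)
    data = json.loads(path.read_text()) if path.exists() else {}
    data.setdefault(wiki_id, {})[STATUS_KEY] = complete
    path.write_text(json.dumps(data, indent=4))
    return data
```

In this sketch a postponed task would be recorded with `record_status(path, wiki_id, False)`, and the follow-up script would flip the flag to `True` once every row has been converted.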

This PR includes:

  • Tests (unit/integration)
  • CI build passed

Example

image

{
    "mw-31-00": {
        "upgrade_key": "a10c426c369eddc1bb5bffda68e565c430d2d96a",
        "populate.smw_hash_field_complete": true
    },
    "mw-31-00-sunittest_": {
        "upgrade_key": "01bae0a6f5c40524001a7fa4912de253b5a2f6da",
        "populate.smw_hash_field_complete": true
    }
}

@mwjames
Contributor Author

mwjames commented Jan 13, 2019

execution threshold of 200000

200000 (defined by PopulateHashField::COUNT_SCRIPT_EXECUTION_THRESHOLD) is more or less a ballpark figure I picked so that, on normal server hardware, the task should finish within 2 minutes tops; beyond that, it switches to the populateHashField.php script.

This is mainly of interest to users who have a huge smw_object_ids table (with millions of rows), where update.php (the SMW part) would otherwise monopolize the time required to finish the update.

We may trust people to run populateHashField.php when we tell them to, but we cannot be sure, and reminding them persistently (== annoying users) is the only guarantee we have that administrators actually carry out the required task.

@mwjames
Contributor Author

mwjames commented Jan 13, 2019

@kghbln On some thread (can't find the ref) you mentioned that it may take up to 15 min to run update.php. This PR should minimize the impact, yet give the software a way to inform users about missing tasks and continue functioning.

The crucial part is the information we store in .smw.json; the file is therefore no longer just a simple key file. It gives us a way to store the state and compare its components in a performant way (no DB connection or table access, just a simple file read like the one that happens on other .json message files or when PHP loads a .php file).
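
To illustrate the "simple file read" idea, here is a minimal Python sketch (not SMW's PHP implementation) that inspects the content of .smw.json and reports which tasks are still outstanding for a given wiki ID. The function name `incomplete_tasks` is hypothetical, and treating a missing key or a missing wiki entry as "not complete" is an assumption made for the sketch.

```python
import json

STATUS_KEY = "populate.smw_hash_field_complete"


def incomplete_tasks(setup_json, wiki_id, task_keys=(STATUS_KEY,)):
    """Return the task keys not marked as true for `wiki_id`.

    `setup_json` is the raw text of the setup information file.
    Assumption: an absent key or an unknown wiki ID counts as incomplete.
    No database access is needed, only the parsed file content.
    """
    data = json.loads(setup_json)
    entry = data.get(wiki_id, {})
    return [key for key in task_keys if entry.get(key) is not True]
```

A caller could show the reminder message whenever the returned list is non-empty, for example `incomplete_tasks(text, "mw-31-00")` with the file content shown in the example above.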

@kghbln kghbln added enhancement Alters an existing functionality or behaviour wikidocu missing Code changes (mostly features) what have not yet been documented labels Jan 13, 2019
@mwjames mwjames merged commit 7363a1c into master Jan 13, 2019
@mwjames mwjames deleted the populate-hash-field branch January 13, 2019 12:52
@kghbln
Member

kghbln commented Jan 13, 2019

I will "rename" the file from "update key file" to "setup key file" in the docu since this appears to be more appropriate in the light of this enhancement.

@kghbln
Member

kghbln commented Jan 13, 2019

"setup key file"

Second thoughts, now "setup information file".

@mwjames
Contributor Author

mwjames commented Jan 13, 2019

I will "rename" the file from "update key file" to "setup key file"

Fine by me, I hadn't any name in mind so I generally refer to it as .smw.json. For technical reasons that matches the $GLOBALS['smw.json'] key that holds the content of the file once loaded by Installer:loadSchema.

@kghbln
Member

kghbln commented Jan 13, 2019

Fine by me ...

Fair enough. I just figured something a bit more descriptive and more intuitive is good for the end users "marketing wise".

@mwjames
Contributor Author

mwjames commented Jan 13, 2019

docu since this appears to be more appropriate in the light of this enhancement.

At first it just contained the upgrade key, but I needed a place to store certain information without relying on a DB connection (also, we cannot manipulate LocalSettings.php to set some variables), and since .smw.json was already in place it seemed a natural choice to extend its application.

@mwjames
Contributor Author

mwjames commented Jan 13, 2019

the main page of the wiki which will be accessible:

To be clear, the message will be shown on every page to really "annoy" users, so that the administrator gets their act together and runs the job. It is also a warning to the end user that something hasn't been finished and that they should avoid data-intensive work.

kghbln pushed a commit that referenced this pull request Jan 18, 2019
@kghbln kghbln added this to the SMW 3.0.1 milestone Jan 18, 2019
@kghbln
Member

kghbln commented Jan 18, 2019

Back-ported with 8f1177a

@kghbln
Member

kghbln commented Jan 19, 2019

Documented

@kghbln kghbln removed the wikidocu missing Code changes (mostly features) what have not yet been documented label Jan 19, 2019
@kghbln
Member

kghbln commented May 16, 2019

Just upgraded a rather big instance:

Checking 'smw_hash' field consistency ...
   ... missing 344933 rows ...
   ... updating document no.       14077114 (100%)
   ... writing the status to the setup information file ... 
   ... done.

For 4 Cores, 10 GB RAM with PHP 7.0.x it took 7 minutes + another 5 minutes for the first 200,000 rows.

@kghbln
Member

kghbln commented Jun 3, 2019

@mwjames Not sure if related to the changes connected with this feature, but the size of the database connected to the wiki mentioned in my previous post dropped from 3.3 GB to 1.9 GB [!sic]. Talking about more efficient data storage. :)

@mwjames
Contributor Author

mwjames commented Jun 3, 2019 via email
