Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Big Kahuna re-factoring #10

Merged
merged 11 commits into from Feb 22, 2014
Merged

The Big Kahuna re-factoring #10

merged 11 commits into from Feb 22, 2014

Conversation

mwjames
Copy link
Contributor

@mwjames mwjames commented Feb 9, 2014

  • Split responsibilties
  • Clean-up invocation
  • Use appropriate service registration
  • Added PropertyRegistryTest and PredefinedPropertyAnnotatorTest
  • Moved Exif handling in its own class
  • Moved ShortUrl handling in its own class

@mwjames
Copy link
Contributor Author

mwjames commented Feb 9, 2014

Travis needs to be setup correctly

Configuration read from /home/travis/build/SemanticMediaWiki/SemanticExtraSpecialProperties/phpunit.xml.dist
PHP Fatal error:  Class 'SESP\PredefinedPropertyAnnotator' not found in /home/travis/build/SemanticMediaWiki/SemanticExtraSpecialProperties/tests/phpunit/PredefinedPropertyAnnotatorTest.php on line 43

@mwjames
Copy link
Contributor Author

mwjames commented Feb 9, 2014

I have not made any attempt to clean-up some general design issues (this is for someone else to figure out and clean-up)

@mwjames
Copy link
Contributor Author

mwjames commented Feb 9, 2014

@SemanticMediaWiki/testers @rotsee please feel free to test. Functionality have not been altered (as far as I know) but in order to prove this its needs testing. Classes have been re-factored in such a way that is possible to do constructor injection, eliminate the use of GLOBALS and it introduces the "smallest" DI container available which amounts to the possibility to run proper unit tests.

As far as I'm concerned, I see SESP as a case study (that's why I did the re-factoring) so other SMW related extensions can see how to setup Travis, use phpunit and regression tests to ensure that if something changes in core it can (or should) be detected by the extension when running those tests.

@mwjames mwjames added this to the SESP 0.3 milestone Feb 9, 2014
@JeroenDeDauw
Copy link
Member

@mwjames Does this really require 1.9.1, as in, 1.9.0 will not work? If so, then we are not doing the SMW releases properly. There should be no compat changes in minor releases.

@mwjames
Copy link
Contributor Author

mwjames commented Feb 9, 2014

I didn't delete the SESP.php file as people might want to poke and compare before and after but it should eventually vanish.

@JeroenDeDauw
Copy link
Member

Code definitely looks better then before now :)

@mwjames
Copy link
Contributor Author

mwjames commented Feb 9, 2014

Adds $sespUseAsFixedTables to enable the use of fixed tables as each of the properties are high volume targets by being potentially included for each page.

@kghbln
Copy link
Member

kghbln commented Feb 13, 2014

Just wanted to dive into testing but got an error while doing the data refresh

My SES configuration:

$sespSpecialProperties = array(
    '_EUSER', '_CUSER', '_REVID', '_VIEWS', '_NREV', '_TNREV',
    '_SUBP', '_SHORTURL', '_METADATA', '_USERREG'
    );
$sespUseAsFixedTables = true;
$wgSESPExcludeBots = true;

@kghbln
Copy link
Member

kghbln commented Feb 13, 2014

I just copied over the setting from the README which obviously has a typo. Just tried and everything seems fluffy. Now doing some testing and will build up some data for the new concept cache script.

@mwjames
Copy link
Contributor Author

mwjames commented Feb 15, 2014

pageId vs. revisionId

I realized an obvious blunder in the original SESP [0]. It stores the pageId (getId returns page Id not the revision id) and not the revisionId (getLatest) as expected when _REVID is selected.

This needs to be addressed in the docs.

Identifier

The next PR stops with the mumbo jumbo of assigning double meaning to a single identifier. Currently _METADATA means _EXIFDATETIME, _EXIFSOFTWARE or _MIMETYPE also means _MEDIATYPE. Things should be explicit in the sespSpecialProperties. If one wants to see _EXIFDATETIME then _EXIFDATETIME should be noted in the sespSpecialProperties. This makes the interface less complicate and avoids logic being tacit within the code.

[0] https://github.com/SemanticMediaWiki/SemanticExtraSpecialProperties/blob/master/src/SESP.php#L138

@kghbln
Copy link
Member

kghbln commented Feb 17, 2014

95cc2b4 now creates the tables when setting $sespUseAsFixedTables to "true" except for the "smw_ftp_sesp_metadata" table which is probably expected according to #10 (comment). Will try this with the individual "subidentifiers"

About pageId vs. revisionId: Shouldn't is be rectified and have a _PAGEID as well as a _LATESTID and depreciate _REVID. I guess both information may be useful.

@mwjames
Copy link
Contributor Author

mwjames commented Feb 17, 2014

Shouldn't is be rectified and have a _PAGEID as well as

Already added a _PAGEID while _REVID depicts the expected information now.

@kghbln
Copy link
Member

kghbln commented Feb 17, 2014

Ah, this tiny "Some clean-up". :) Breaking but better! Thanks for this great work!

@kghbln
Copy link
Member

kghbln commented Feb 17, 2014

Jep, now also getting the pageid, exifsoftware and exifdatetime tables. Did not bother to add mimetype and mediatype. Working like charm I still need to add the I18n though.

@kghbln
Copy link
Member

kghbln commented Feb 17, 2014

Just discovered an "issue": _NTREV (Anzahl der Diskussionsbearbeitungen) is at 1 though the talk page has not been created.

@jthingelstad
Copy link

Whoa! _PAGEID can now be done automatically by SESP? That's awesome! I've been manually setting that.

@JeroenDeDauw
Copy link
Member

@mwjames can you poke me when this is ready for review?

@mwjames
Copy link
Contributor Author

mwjames commented Feb 18, 2014

an "issue": _NTREV (Anzahl der Diskussionsbearbeitungen) is at 1 though the talk

I haven't changed to logic which means that this was probably broken before.

git.wikimedia.org

I just had a look at git.wikimedia.org-SemanticExtraSpecialProperties which shows a rather larger differences (the next time this happens argh... I will sent some github-忍者 ) especially with the 'EXIFDATA' implementation therefore I will once again make a PR (this will be the last) which implements the already existing _EXIFDATETIME and _EXIFDATETIME as _EXIFDATA subobject.

The PR will not implement the remaining EXIF data properties but it will provided the foundation to make those work at a later point.

All EXIF implementation details have been moved from SESP into its own ExifAnnotator class allowing for changes to be deployed in a follow-up without the need to alter the core-class.

@kghbln Probably the git.wikimedia.org and github.com i18n version may show some differences.

@s7eph4n Maybe of interest

@kghbln
Copy link
Member

kghbln commented Feb 18, 2014

No worries about the _NTREV. Will file an issue for this. No holdup here.

@JeroenDeDauw Prior to moving the code over to wikimedia I will copy over the I18n to GiHub to save the translator's effort. I guess this is how it should be done best.

@mwjames I am not sure where the messages for the new special properties ended up. Adding these to the I18n can be done later. When it comes to "exposure time" and "software" and possible new implementations the name should proably be "exposure time from Exif data" etc. to avoid property naming collisions. Since the new version will be breaking anyway ...

@mwjames
Copy link
Contributor Author

mwjames commented Feb 18, 2014

I am not sure where the messages for the new special properties ended up

I think that all the exif messages are retrieved from mw-core such as exif-software etc.

Speaking of which, after I looked at possible exif properties (30+), I have decided that maintaining those in an php-array comes close to insanity therefore the definitions have been removed from PHP and put into a json file with the likely form of:

{
    "_REVID": {
        "id": "___REVID",
        "type": 1,
        "msgkey": "sesp-property-revision-id",
        "alias": "Revision ID"
    },
    "_EXIF": {
        "DATETIME": {
            "id": "___EXIFDATETIME",
            "type": 6,
            "msgkey": "exif-datetime",
            "alias": "Exposure date"
        },
        "SOFTWARE": {
            "id": "___EXIFSOFTWARE",
            "type": 2,
            "msgkey": "exif-software",
            "alias": "Software"
        }
        ...
    }
}

Any additional exif properties should to be added to the json. Profiling showed that there is no difference in loading the data from json (loading and decoding takes 2 ms) or from an php-array.

Identifiers such as _REVID or DATETIME are to be in capital letters. A type refers to the dataitem constant such as DataItem::TYPE_BLOB or DataItem::TYPE_WIKIPAGE.

@kghbln
Copy link
Member

kghbln commented Feb 18, 2014

I see. Looks like a good option to retrieve the names for the properties existing translations for core. So there is absolute consistency about them. I am not sure if this will avoid property name collision but it may be a matter of just documenting in the README that these names are taboo if you want to activate properties regarding "exif" information. Consistency would take precedence over collision which is ok in this case.

Just installed the new version and identified two issues. Neither "exif software" nor "exif datetime" are stored, i.e. vanished even though the configuration in LocalSettings.php did not change. Same applies to "ntrev". All other special properties are still there including "rev". As far as I see the configuration did not change.

@mwjames
Copy link
Contributor Author

mwjames commented Feb 18, 2014

Neither "exif software" nor "exif datetime" are stored, i.e. vanished

Since those are now stored within a subobject, _EXIFDATA indicates and activate those that are maintained in the json file and supported by the ExifDataAnnotator.

As for the fixed tables, only _EXIFDATA will be created as a fixed table all other exif related properties that may or may not be present will not be considered for its own table because you don't want to end up having 30+ tables just to manage exif.

@kghbln
Copy link
Member

kghbln commented Feb 18, 2014

Yay, that's utterly cool! Now I get it. Just added _EXIFDATA to LocalSettings.php deleted the backend, recreated it and did a nice data refresh and here we are: paradise regained! Even _NTREV works now, so if there is no talk you do not get the special property at all. Great!

Since we are now talking about subobjects we can forget about the name collision.

Are there any further blockers? I do not see any. Are we ready to poke for review?

@mwjames
Copy link
Contributor Author

mwjames commented Feb 18, 2014

The implementation now comes close to the features that were available on the git.wikimedia.org version (the 30+ exif properties are not included in the json file and if someone wants to tackle this, feel free).

I'm now reasonably satisfied. It runs (35 tests, 68 assertions) which should roughly account for 70% coverage and scrutinizer is pushed from 5 to 8.4.

Are there any further blockers? I do not see any. Are we ready to poke for review?

@kghbln I think we are good to go.

@JeroenDeDauw made the last sprint

@kghbln
Copy link
Member

kghbln commented Feb 19, 2014

I just created pull request #13 to sync with the translations provided by twn. Hopefully all of this can be mashed together somehow by @JeroenDeDauw

@kghbln
Copy link
Member

kghbln commented Feb 19, 2014

So the docu is now nitpicked with #15 and fluffy. Ready to merge and release?

mwjames and others added 11 commits February 23, 2014 00:00
- Split responsibilties
- Clean-up invocation
- Use appropriate service registration
- Added PropertyRegistryTest and PredefinedPropertyAnnotatorTest
- Moved Exif handling in its own class
- Moved ShortUrl handling in its own class
Added _NREV test to show case how to use DBConnection injection
Introduce $sespUseAsFixedTables to enable the use of fixed
tables as each of the properties are high volume targets by being
potentially included for each page.

If $sespUseAsFixedTables = true then update.php + SMW_refresh.php needs
to be executed.
This only verifies that the invoked hook works in accordance within
its specification.
JeroenDeDauw added a commit that referenced this pull request Feb 22, 2014
@JeroenDeDauw JeroenDeDauw merged commit b29f16f into master Feb 22, 2014
@JeroenDeDauw JeroenDeDauw deleted the big-kahuna branch February 22, 2014 23:07
@JeroenDeDauw
Copy link
Member

@kghbln
Copy link
Member

kghbln commented Feb 23, 2014

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants