Add keyword (_keyw) type #3020

Merged
merged 1 commit into from Feb 24, 2018

@mwjames (Contributor) commented Feb 18, 2018

This PR is made in reference to: #

This PR addresses or contains:

  • As the name suggests, this PR adds a keyword type that differs from the standard text type in that it:
    • is restricted in length (150 for ASCII, 85 for UTF-8, to ensure that it fits into the 255 char o_hash field)
    • is normalized (lowercased, diacritics removed)
    • keeps the original annotation value but uses the normalized value during a query request to match the broadest possible result set, which means that whether you use teSt, TEST, or test, the normalized version will always match the keyword test, independent of any SMW_FIELDT_CHAR_NOCASE or full-text index settings
    • allows adding a formatter rule (Add SMW_NS_RULE ns, refs 2273 #3019, [[Formatter rule:...]]) that links any keyword to a link_to target (either Special:SearchByProperty or Special:Ask) to give immediate access to entities annotated with the same normalized keyword
  • Keyword as a property has been added to smwgDataTypePropertyExemptionList so that Property:Keyword is not automatically occupied and doesn't override any existing user property with the same name
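
The normalization and length restriction described above can be sketched roughly as follows. This is a minimal Python illustration, not the code added by this PR: the function names are hypothetical, and it assumes the limits are counted in characters and that "removed diacritics" means dropping Unicode combining marks.

```python
import unicodedata

# Limits quoted in the PR description; how SMW enforces them is not shown here.
MAX_ASCII_CHARS = 150
MAX_NON_ASCII_CHARS = 85


def normalize_keyword(value: str) -> str:
    """Lowercase the value and strip diacritics (hypothetical helper)."""
    # NFD splits "é" into "e" plus a combining accent, which is then dropped.
    decomposed = unicodedata.normalize("NFD", value)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped).lower()


def fits_length_limit(value: str) -> bool:
    """Assumption: the limit is counted in characters, not bytes."""
    limit = MAX_ASCII_CHARS if value.isascii() else MAX_NON_ASCII_CHARS
    return len(value) <= limit
```

With this sketch, teSt, TEST, and test all normalize to the same string, which is the matching behaviour the description aims for.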

This PR includes:

  • Tests (unit/integration)
  • CI build passed

Why?

Users apply the #lc parser function to normalize text and, in a later step, format it via a template to help generate query links. Both #lc and templates can lengthen the parse time of a page, so the keyword type (and any property defined with it) should help minimize that work.

Formatter rule

Any keyword-typed property can have its own formatter rule assigned, which defines how links should be generated. #3017 was added to allow those links to become compact and easier to handle from a user's point of view.

Link to Special:Ask with format list

{
    "description": "Specifies a formatting rule for a keyword type",
    "type": "LINK_FORMAT_RULE",
    "rule": {
        "link_to": "Special:Ask",
        "parameters": {
            "format": "list"
        }
    },
    "tags": [
        "formatter",
        "link formatter",
        "keyword"
    ]
}

Link to Special:Ask with format table and show Has description as printout.

{
    "description": "Specifies a formatting rule for a keyword type",
    "type": "LINK_FORMAT_RULE",
    "rule": {
        "link_to": "Special:Ask",
        "parameters": {
            "format": "table",
            "printouts": [
                "Has description"
            ]
        }
    },
    "tags": [
        "formatter",
        "link formatter",
        "keyword"
    ]
}

Link to Special:SearchByProperty with no parameters required (the link definition is clear by just being a property and a value).

{
    "description": "Specifies a formatting rule for a keyword type",
    "type": "LINK_FORMAT_RULE",
    "rule": {
        "link_to": "Special:SearchByProperty"
    },
    "tags": [
        "formatter",
        "link formatter",
        "keyword"
    ]
}
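
To make the Special:Ask rules above more concrete, here is a rough Python sketch of the #ask query text that such a rule would approximately correspond to for a given property and normalized keyword. It does not reproduce how the PR generates links; the function name and the property name `Has keyword` are hypothetical, and the mapping of `parameters` onto query arguments is an assumption.

```python
def ask_query_for(rule: dict, prop: str, keyword: str) -> str:
    """Build #ask-style query text for a rule linking to Special:Ask.

    Hypothetical helper: the real link generation lives in SMW and may differ.
    """
    target = rule["rule"]["link_to"]
    if target != "Special:Ask":
        raise ValueError(f"unsupported link_to target: {target}")
    params = rule["rule"].get("parameters", {})
    # Condition first, then printouts, then remaining parameters such as format.
    parts = [f"[[{prop}::{keyword}]]"]
    parts += [f"?{printout}" for printout in params.get("printouts", [])]
    parts += [f"{key}={val}" for key, val in params.items() if key != "printouts"]
    return "{{#ask: " + " |".join(parts) + " }}"


# The "format table" rule from above, reduced to the fields the sketch reads.
table_rule = {
    "type": "LINK_FORMAT_RULE",
    "rule": {
        "link_to": "Special:Ask",
        "parameters": {"format": "table", "printouts": ["Has description"]},
    },
}
```

Calling `ask_query_for(table_rule, "Has keyword", "test")` yields a query with the condition `[[Has keyword::test]]`, the printout `?Has description`, and `format=table`.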

Example

@mwjames added the "new feature" label (A new, or altered behaviour of an existing functionality that fundamentally impacts behaviour) Feb 18, 2018
@mwjames added this to the SMW 3.0.0 milestone Feb 18, 2018
@mwjames (Contributor, Author) commented Feb 18, 2018

@kghbln This is where the minimal framework of #3019 comes into play. It is not a strict production rule; here, LINK_FORMAT_RULE defines what sort of rules need to be enacted. Using JSON allows defining structured elements, with a schema able to enforce elements and types.
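
As an illustration of what enforcing elements and types could look like, here is a minimal Python sketch that checks a rule document by hand. The constraints are hypothetical and inferred only from the three examples in this PR; the actual schema used by SMW is not shown in this thread.

```python
import json

# Hypothetical constraints, inferred only from the examples in this PR.
ALLOWED_TARGETS = {"Special:Ask", "Special:SearchByProperty"}


def validate_rule(text: str) -> list:
    """Return a list of problems; an empty list means the rule passed."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(doc, dict):
        return ["top level must be an object"]

    problems = []
    if not isinstance(doc.get("type"), str):
        problems.append("'type' must be a string")
    rule = doc.get("rule")
    if not isinstance(rule, dict):
        problems.append("'rule' must be an object")
    elif rule.get("link_to") not in ALLOWED_TARGETS:
        allowed = ", ".join(sorted(ALLOWED_TARGETS))
        problems.append(f"'rule.link_to' must be one of: {allowed}")
    if not isinstance(doc.get("tags", []), list):
        problems.append("'tags' must be a list")
    return problems
```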

Of course, if the property has no formatter rule then it just displays the annotated value.

I needed this type for better performance in the upcoming feature, in order to decide when to use a term instead of a phrase_search: a keyword type is always normalized, which just means I can use a not_analyzed field without having to care about how the query value is formatted.
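
The term versus phrase decision can be sketched as below, assuming an Elasticsearch-style query DSL. The function and field names are hypothetical, and mapping phrase_search to a `match_phrase` query is an assumption made for illustration, not something this thread confirms.

```python
def build_query(field: str, value: str, is_keyword_type: bool) -> dict:
    """Pick an exact term query for keyword-typed values (hypothetical helper).

    Assumes `value` has already been normalized for keyword-typed properties.
    """
    if is_keyword_type:
        # Exact match against a field that is not analyzed at index time.
        return {"query": {"term": {field: value}}}
    # Other text goes through analysis, so a phrase match is used instead.
    return {"query": {"match_phrase": {field: value}}}
```

Because the stored keyword and the query value are normalized the same way, the exact match needs no analysis step at query time.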

@JeroenDeDauw (Member)

Cool

@mwjames merged commit dfd2b43 into master Feb 24, 2018
@mwjames deleted the keyw-type branch February 24, 2018 19:40
@kghbln added the "wikidocu missing" label (Code changes (mostly features) what have not yet been documented) Feb 26, 2018
@kghbln mentioned this pull request Feb 26, 2018
wmfgerrit pushed a commit to wikimedia/translatewiki that referenced this pull request Feb 26, 2018
References new datatype keyword:
* SemanticMediaWiki/SemanticMediaWiki#3020

Change-Id: I44b4f9855f8c2511e485a2a35c0f20e7f9cc289d
This was referenced Feb 26, 2018
@krabina (Contributor) commented Dec 12, 2018

Added wiki documentation at https://www.semantic-mediawiki.org/wiki/Help:Type_Keyword

@krabina (Contributor) commented Nov 1, 2022

note: in the examples above, LINK_FORMAT_RULE must be changed to LINK_FORMAT_SCHEMA
