<a href="https://colab.research.google.com/github/donaldziff/kgqa-ucb-210/blob/main/training/summarization/AskWiki_Wikidata_feature_generation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AskWiki Problem Definition
AskWiki must perform two tasks in sequence: first, construct a SPARQL query from the question; second, verbalize the query results into a natural-language answer.

Author: shrinivasbjoshi@berkeley.edu

# Data Extraction and Feature Generation
To generate a natural-language (NL) answer to a question, AskWiki must extract structured, relevant information from a data source and feed it to a model that generates and summarizes an answer for the user.

For the W210 capstone project, we use Wikidata as our structured data source.

Intuition behind choosing Wikidata
1. It supports SPARQL.
2. It provides access to structured information on varied topics.
3. It imposes no domain limitation, which is important for AskWiki because it aims to showcase applicability across multiple domains of data and to reduce large-scale training requirements.
4. It provides hierarchical information organized as a knowledge graph.
5. It provided a ready-to-use dataset on which SPARQL generation was tested.


Approach to feature generation

1. Any SPARQL query may return
  
  1.1 A single answer [number, string, boolean, etc.]
  
  1.2 A single-row, multiple-column answer [providing the answer, contextual information, and the wikiobject]
  
  1.3 A multiple-row, multiple-column answer [providing multiple wikiobjects and their respective summaries]
  
  1.4 No results [if the SPARQL query finds no data]

2. We want to write generic functions using the Python sparqlwrapper and wikidata packages to
  
  2.1 Execute the SPARQL query against Wikidata and gather the results
  
  2.2 Extract embedded Wikidata objects from the result set, if present
  
  2.3 Parse the properties of the Wikidata objects [referred to as wikiprop in the rest of the notebook]
  
  2.4 Generate TRIPLES from the above information, indicating positional embeddings of subject, predicate, and object from Wikidata
  
  2.5 Aggregate and format the triples so that they can be fed into the NLG & Summarization model

3. Given the limitations of infrastructure and project timelines, we have made certain assumptions [explained throughout the notebook] to ensure that our code executes in an acceptable time frame. Note that the Wikidata Query Service also enforces certain limits: https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual

4. Entity profiling and custom label generation are out of scope for the AskWiki MVP.
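
As a rough illustration of points 1 and 2.2 above, the sketch below labels a result set with one of the four shapes and pulls out any embedded Wikidata object IDs. It assumes the standard SPARQL 1.1 JSON results layout (`head.vars`, `results.bindings`, or a top-level `boolean` for ASK queries), which is what the endpoint returns when JSON is requested. The function names `classify_result` and `extract_entity_ids` and the shape labels are hypothetical and are not the notebook's own helpers.

```python
ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def classify_result(results):
    """Label a SPARQL JSON result with one of the four shapes listed above."""
    # ASK queries return a top-level "boolean" instead of bindings
    if "boolean" in results:
        return "single_answer"
    bindings = results.get("results", {}).get("bindings", [])
    if not bindings:
        return "no_results"
    columns = results.get("head", {}).get("vars", [])
    if len(bindings) == 1:
        return "single_answer" if len(columns) == 1 else "single_row"
    # Assumption: any result with more than one row is treated as case 1.3
    return "multiple_rows"


def extract_entity_ids(results):
    """Collect the Q-ids of Wikidata objects embedded in the bindings."""
    ids = []
    for row in results.get("results", {}).get("bindings", []):
        for cell in row.values():
            value = cell.get("value", "")
            if cell.get("type") == "uri" and value.startswith(ENTITY_PREFIX + "Q"):
                ids.append(value[len(ENTITY_PREFIX):])
    return ids


# Hand-written sample in the SPARQL JSON results format (not real query output)
sample = {
    "head": {"vars": ["item", "itemLabel"]},
    "results": {"bindings": [{
        "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q42"},
        "itemLabel": {"type": "literal", "value": "Douglas Adams"},
    }]},
}
print(classify_result(sample), extract_entity_ids(sample))
```

Keeping the shape check separate from entity extraction means the "no results" case can be handled before any further Wikidata lookups are attempted.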





# Install Packages and Import libraries



In [None]:
!pip install sparqlwrapper
!pip install Wikidata
import sys

import pandas as pd
from SPARQLWrapper import SPARQLWrapper, JSON
from wikidata.client import Client

# Public Wikidata SPARQL endpoint queried throughout the notebook
endpoint_url = "https://query.wikidata.org/sparql"



In [None]:
# pip install these as needed for testing with openai
!pip install openai


# OpenAI functions and code

In [None]:
import time
import openai
import pandas as pd
import os
import subprocess

The API key has been removed from this code for security reasons.

In [None]:
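import os

# The real key has been removed for security reasons; supply your own value here
# (or set OPENAI_API_KEY in the environment before running this cell).
os.environ['OPENAI_API_KEY'] = 'removed'
os.environ['OPENAI_ORGANIZATION'] = 'org-6Tm9wvTU2DAyCUVamArcvPxV'
openai.organization = os.environ['OPENAI_ORGANIZATION']
openai.api_key = os.getenv("OPENAI_API_KEY")
# Sanity check: raises AuthenticationError if the key is missing or invalid
openai.Model.list()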

AuthenticationError: ignored

run_prompt runs NLG against the fine-tuned OpenAI model. For now it caps the number of tokens generated in the answer at either 50 or 200. For performance reasons, we benchmarked 200 as the maximum. To strike a balance for single scalar answers, short prompts (fewer than 100 words) use the lower limit of 50 tokens.

In [None]:
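def run_prompt(prompt="", model="text-davinci-003", temperature=0.4, stop=None):
    # Short prompts (fewer than 100 whitespace-separated words) usually expect a
    # single scalar answer, so cap the completion at 50 tokens; otherwise allow 200.
    if len(prompt.split()) < 100:
        max_tokens = 50
    else:
        max_tokens = 200
    # Completion endpoint of the openai package installed above (0.27.x)
    response = openai.Completion.create(
        model=model,
        prompt=prompt,
        temperature=temperature,
        max_tokens=max_tokens,
        stop=stop
    )
    return response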

In [None]:
#get the model from openai AskWiki Organization
askwiki_davinci_fine_tune = 'davinci:ft-askwiki-2023-04-10-05-41-22'

The generate_response function performs NLG by calling run_prompt.

In [None]:
def generate_response(input, model=askwiki_davinci_fine_tune, stop=None):
    # The " ->" suffix is appended to every prompt sent to the fine-tuned model
    response = run_prompt(f"{input} ->", model=model, stop=stop)
    # print(response)
    translation = response['choices'][0]['text']
    # print(translation)
    # Return None for a missing or empty completion so callers can detect it
    if not translation:
        return None
    return translation

In [None]:
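# NOTE: `input` must be bound to the prompt string (the formatted triples) before
# this call; otherwise the name refers to Python's built-in input function.
generate_response(input,  model=askwiki_davinci_fine_tune, stop=[" \n"])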

# Approach for feature generation
1. Wikidata supports more than 10K properties that can describe an object; see https://www.wikidata.org/wiki/Wikidata:Database_reports/List_of_properties/all for additional information.

2. All NLG models have tokenizer limits. To control contextual answer generation, we organize triples into 2 levels

 2.1 Entity-level descriptions [these are prioritized first]

 2.2 Wikiprop-level descriptions [these are prioritized next, based on wikiprop values]


3. For the AskWiki capstone we use 2000 wikiprop values [about 20% of the total property space]. For the capstone phase we take a sampling approach and baseline all entities to the standard 2K properties.

4. Every Wikidata object is an entity, and ideally a structured knowledge-based question answering system should perform entity profiling to generate appropriate features and labels. In this phase we extract only the entity label; other entity information is extracted from the baselined wikiprop values.

5. The sampling approach does limit feature generation: certain random features might be picked up along with the actual answer from the SPARQL query.

6. To further improve the performance of feature generation, for every wikiobject we take only the first 5 property values [which might be any 5 from the baseline of 2K properties]. This controls the number of triples fed into the NLG tokenizer in the pipeline.

7. Our NLG model has not seen all 2K properties; it was few-shot trained on 256 samples. The fine-tuned NLG model is expected to handle language generation for the unseen property values.
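
The two-level ordering in point 2 and the cap in point 6 could be sketched roughly as follows. This is a minimal illustration, not the notebook's implementation. The input shape for `claims` (property ID mapped to a list of already-resolved value strings), the function names, the `"description"` predicate, and the ` | ` separator are all assumptions made for the example. It also reads "first 5 property values" as "first 5 baseline properties found on the object".

```python
from itertools import islice

MAX_PROPS_PER_ENTITY = 5  # cap described in point 6


def build_triples(entity_label, claims, baseline_props,
                  description=None, max_props=MAX_PROPS_PER_ENTITY):
    """Return (subject, predicate, object) triples for one entity.

    claims: property ID -> list of value strings (hypothetical input shape)
    baseline_props: property ID -> label, e.g. the wikiprop dict
    """
    triples = []
    # Level 1: entity-level description comes first
    if description:
        triples.append((entity_label, "description", description))
    # Level 2: wikiprop-level triples, limited to baseline properties
    known = (pid for pid in claims if pid in baseline_props)
    for pid in islice(known, max_props):
        for value in claims[pid]:
            triples.append((entity_label, baseline_props[pid], value))
    return triples


def format_triples(triples):
    """Join triples into one string for the NLG prompt (separator is assumed)."""
    return " | ".join(f"{s} : {p} : {o}" for s, p, o in triples)


# Toy data: P9999 is not in the baseline, so it is skipped
props = {"P19": "place of birth", "P106": "occupation"}
claims = {"P19": ["Cambridge"], "P9999": ["ignored"], "P106": ["writer", "humorist"]}
triples = build_triples("Douglas Adams", claims, props, description="English writer")
print(format_triples(triples))
```

Filtering against the baseline before applying the cap means properties outside the 2K set do not use up any of the five slots.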

### Wikiprop code values

In [None]:
wikiprop = {
  "P6": "head of government",
  "P7": "brother",
  "P9": "sister",
  "P10": "video",
  "P14": "highway marker",
  "P15": "route map",
  "P16": "highway system",
  "P17": "country",
  "P18": "image",
  "P19": "place of birth",
  "P20": "place of death",
  "P21": "sex or gender",
  "P22": "father",
  "P25": "mother",
  "P26": "spouse",
  "P27": "country of citizenship",
  "P30": "continent",
  "P31": "instance of",
  "P35": "head of state",
  "P36": "capital",
  "P37": "official language",
  "P38": "currency",
  "P39": "position held",
  "P40": "child",
  "P41": "flag image",
  "P43": "stepfather",
  "P44": "stepmother",
  "P47": "shares border with",
  "P50": "author",
  "P51": "audio",
  "P53": "noble family",
  "P54": "member of sports team",
  "P57": "director",
  "P58": "screenwriter",
  "P59": "constellation",
  "P61": "discoverer or inventor",
  "P65": "site of astronomical discovery",
  "P66": "ancestral home",
  "P69": "educated at",
  "P78": "top-level internet domain",
  "P81": "connecting line",
  "P84": "architect",
  "P85": "anthem",
  "P86": "composer",
  "P87": "librettist",
  "P88": "commissioned by",
  "P91": "sexual orientation",
  "P92": "main regulatory text",
  "P94": "coat of arms image",
  "P97": "noble title",
  "P98": "editor",
  "P101": "field of work",
  "P102": "member of political party",
  "P103": "native language",
  "P105": "taxon rank",
  "P106": "occupation",
  "P108": "employer",
  "P109": "signature",
  "P110": "illustrator",
  "P111": "measured physical quantity",
  "P112": "founder",
  "P113": "airline hub",
  "P114": "airline alliance",
  "P115": "home venue",
  "P117": "chemical structure",
  "P118": "league",
  "P119": "place of burial",
  "P121": "item operated",
  "P122": "basic form of government",
  "P123": "publisher",
  "P126": "maintained by",
  "P127": "owned by",
  "P128": "regulates (molecular biology)",
  "P129": "physically interacts with",
  "P131": "located in the administrative territorial entity",
  "P134": "has dialect",
  "P135": "movement",
  "P136": "genre",
  "P137": "operator",
  "P138": "named after",
  "P140": "religion",
  "P141": "IUCN conservation status",
  "P143": "imported from",
  "P144": "based on",
  "P149": "architectural style",
  "P150": "contains administrative territorial entities",
  "P154": "logo image",
  "P155": "follows",
  "P156": "followed by",
  "P157": "killed by",
  "P158": "seal image",
  "P159": "headquarters location",
  "P161": "cast member",
  "P162": "producer",
  "P163": "flag",
  "P166": "award received",
  "P167": "structure replaced by",
  "P169": "chief executive officer",
  "P170": "creator",
  "P171": "parent taxon",
  "P172": "ethnic group",
  "P175": "performer",
  "P176": "manufacturer",
  "P177": "crosses",
  "P178": "developer",
  "P179": "series",
  "P180": "depicts",
  "P181": "taxon range map image",
  "P183": "endemic to",
  "P184": "doctoral advisor",
  "P185": "doctoral student",
  "P186": "material used",
  "P189": "location of discovery",
  "P190": "sister city",
  "P193": "main building contractor",
  "P194": "legislative body",
  "P195": "collection",
  "P196": "minor planet group",
  "P197": "adjacent station",
  "P199": "business division",
  "P200": "lake inflows",
  "P201": "lake outflow",
  "P205": "basin country",
  "P206": "located next to body of water",
  "P207": "bathymetry image",
  "P208": "executive body",
  "P209": "highest judicial authority",
  "P210": "party chief representative",
  "P212": "ISBN-13",
  "P213": "ISNI",
  "P214": "VIAF ID",
  "P215": "spectral class",
  "P217": "inventory number",
  "P218": "ISO 639-1 code",
  "P219": "ISO 639-2 code",
  "P220": "ISO 639-3 code",
  "P221": "ISO 639-6 code",
  "P223": "galaxy morphological type",
  "P225": "taxon name",
  "P227": "GND ID",
  "P229": "IATA airline designator",
  "P230": "ICAO airline designator",
  "P231": "CAS registry number",
  "P232": "EINECS number",
  "P233": "canonical SMILES",
  "P234": "InChI",
  "P235": "InChIKey",
  "P236": "ISSN",
  "P237": "coat of arms",
  "P238": "IATA airport code",
  "P239": "ICAO airport code",
  "P240": "FAA airport code",
  "P241": "military branch",
  "P242": "locator map image",
  "P243": "OCLC control number",
  "P244": "LCAuth ID",
  "P245": "ULAN ID",
  "P246": "element symbol",
  "P247": "COSPAR ID",
  "P248": "stated in",
  "P249": "ticker symbol",
  "P263": "official residence",
  "P264": "record label",
  "P267": "ATC code",
  "P268": "BnF ID",
  "P269": "SUDOC authorities",
  "P270": "CALIS",
  "P271": "CiNii author ID",
  "P272": "production company",
  "P274": "chemical formula",
  "P275": "license",
  "P276": "location",
  "P277": "programming language",
  "P278": "GOST 7.75-97 code",
  "P279": "subclass of",
  "P281": "postal code",
  "P282": "writing system",
  "P286": "head coach",
  "P287": "designer",
  "P289": "vessel class",
  "P291": "place of publication",
  "P296": "station code",
  "P297": "ISO 3166-1 alpha-2 code",
  "P298": "ISO 3166-1 alpha-3 code",
  "P299": "ISO 3166-1 numeric code",
  "P300": "ISO 3166-2 code",
  "P301": "category's main topic",
  "P303": "EE breed number",
  "P304": "page",
  "P305": "IETF language tag",
  "P306": "operating system",
  "P344": "director of photography",
  "P345": "IMDb ID",
  "P347": "Joconde ID",
  "P348": "software version",
  "P349": "NDL ID",
  "P350": "RKDimages",
  "P351": "Entrez Gene ID",
  "P352": "UniProt ID",
  "P353": "HGNC gene symbol",
  "P354": "HGNC ID",
  "P355": "subsidiary",
  "P356": "DOI",
  "P357": "(OBSOLETE) title (use P1476, \"title\")",
  "P358": "discography",
  "P359": "Rijksmonument ID",
  "P360": "is a list of",
  "P361": "part of",
  "P364": "original language of work",
  "P366": "use",
  "P367": "astronomic symbol image",
  "P368": "Sandbox-CommonsMediaFile",
  "P369": "Sandbox-Item",
  "P370": "Sandbox-String",
  "P371": "presenter",
  "P373": "Commons category",
  "P374": "INSEE municipality code",
  "P375": "space launch vehicle",
  "P376": "located on astronomical body",
  "P377": "SCN",
  "P380": "Mérimée ID",
  "P381": "PCP reference number",
  "P382": "CBS municipality code",
  "P393": "edition number",
  "P395": "licence plate code",
  "P396": "SBN ID",
  "P397": "parent astronomical body",
  "P398": "child astronomical body",
  "P399": "companion of",
  "P400": "platform",
  "P402": "OpenStreetMap Relation identifier",
  "P403": "mouth of the watercourse",
  "P404": "game mode",
  "P405": "taxon author",
  "P406": "soundtrack album",
  "P407": "language of work or name",
  "P408": "software engine",
  "P409": "NLA (Australia) ID",
  "P410": "military rank",
  "P411": "canonization status",
  "P412": "voice type",
  "P413": "position played on team / speciality",
  "P414": "stock exchange",
  "P415": "radio format",
  "P416": "quantity symbol",
  "P417": "patron saint",
  "P418": "seal description",
  "P421": "located in time zone",
  "P423": "shooting handedness",
  "P424": "Wikimedia language code",
  "P425": "field of this occupation",
  "P426": "aircraft registration",
  "P427": "taxonomic type",
  "P428": "botanist author abbreviation",
  "P429": "dantai code",
  "P432": "callsign of airline",
  "P433": "issue",
  "P434": "MusicBrainz artist ID",
  "P435": "MusicBrainz work ID",
  "P436": "MusicBrainz release group ID",
  "P437": "distribution",
  "P439": "German municipality key",
  "P440": "German district key",
  "P442": "China administrative division code",
  "P443": "pronunciation audio",
  "P444": "review score",
  "P447": "score by",
  "P448": "location of spacecraft launch",
  "P449": "original network",
  "P450": "astronaut mission",
  "P451": "partner",
  "P452": "industry",
  "P453": "character role",
  "P454": "Structurae ID (structure)",
  "P455": "Emporis building ID",
  "P457": "foundational text",
  "P458": "IMO ship number",
  "P459": "determination method",
  "P460": "said to be the same as",
  "P461": "opposite of",
  "P462": "color",
  "P463": "member of",
  "P464": "NOR",
  "P465": "sRGB color hex triplet",
  "P466": "occupant",
  "P467": "legislated by",
  "P468": "dan/kyu rank",
  "P469": "lakes on river",
  "P470": "Eight Banner register",
  "P473": "local dialing code",
  "P474": "country calling code",
  "P476": "CELEX number",
  "P477": "Canadian Register of Historic Places ID",
  "P478": "volume",
  "P479": "input device",
  "P480": "FilmAffinity ID",
  "P481": "Palissy ID",
  "P483": "recorded at",
  "P484": "IMA Number, broad sense",
  "P485": "archives at",
  "P486": "MeSH ID",
  "P487": "Unicode character",
  "P488": "chairperson",
  "P489": "currency symbol description",
  "P490": "provisional designation",
  "P491": "orbit diagram",
  "P492": "OMIM ID",
  "P493": "ICD-9",
  "P494": "ICD-10",
  "P495": "country of origin",
  "P496": "ORCID",
  "P497": "CBDB ID",
  "P498": "ISO 4217 code",
  "P500": "exclave of",
  "P501": "enclave within",
  "P502": "HURDAT identifier",
  "P503": "ISO standard",
  "P504": "home port",
  "P505": "general manager",
  "P506": "ISO 15924 alpha-4 code",
  "P507": "Swedish county code",
  "P508": "BNCF Thesaurus",
  "P509": "cause of death",
  "P511": "honorific prefix",
  "P512": "academic degree",
  "P513": "(OBSOLETE) birth name (use P1477)",
  "P514": "interleaves with",
  "P515": "phase of matter",
  "P516": "powerplant",
  "P517": "interaction",
  "P518": "applies to part",
  "P520": "armament",
  "P521": "scheduled service destination",
  "P522": "type of orbit",
  "P523": "temporal range start",
  "P524": "temporal range end",
  "P525": "Swedish municipality code",
  "P527": "has part",
  "P528": "catalog code",
  "P529": "runway",
  "P530": "diplomatic relation",
  "P531": "diplomatic mission sent",
  "P532": "port of registry",
  "P533": "target",
  "P534": "streak color",
  "P535": "Find a Grave grave ID",
  "P536": "ATP ID",
  "P537": "twinning",
  "P538": "fracturing",
  "P539": "Museofile",
  "P541": "office contested",
  "P542": "officially opened by",
  "P543": "oath made by",
  "P545": "torch lit by",
  "P546": "docking port",
  "P547": "commemorates",
  "P548": "version type",
  "P549": "Mathematics Genealogy Project ID",
  "P550": "chivalric order",
  "P551": "residence",
  "P552": "handedness",
  "P553": "website account on",
  "P554": "website username",
  "P555": "doubles record",
  "P556": "crystal system",
  "P557": "DiseasesDB",
  "P558": "unit symbol",
  "P559": "terminus",
  "P560": "direction",
  "P561": "NATO reporting name",
  "P562": "central bank/issuer",
  "P563": "ICD-O",
  "P564": "singles record",
  "P565": "crystal habit",
  "P566": "basionym",
  "P567": "underlies",
  "P568": "overlies",
  "P569": "date of birth",
  "P570": "date of death",
  "P571": "inception",
  "P574": "date of taxon name publication",
  "P575": "time of discovery",
  "P576": "dissolved or abolished",
  "P577": "publication date",
  "P578": "Sandbox-TimeValue",
  "P579": "IMA status and/or rank",
  "P580": "start time",
  "P582": "end time",
  "P585": "point in time",
  "P586": "IPNI author ID",
  "P587": "MMSI",
  "P588": "coolant",
  "P589": "point group",
  "P590": "GNIS ID",
  "P591": "EC number",
  "P592": "ChEMBL ID",
  "P593": "HomoloGene ID",
  "P594": "Ensembl Gene ID",
  "P595": "IUPHAR ID",
  "P597": "WTA ID",
  "P598": "commander of",
  "P599": "ITF ID",
  "P600": "Wine AppDB-ID",
  "P604": "MedlinePlus ID",
  "P605": "NUTS code",
  "P606": "first flight",
  "P607": "conflict",
  "P608": "exhibition history",
  "P609": "terminus location",
  "P610": "highest point",
  "P611": "religious order",
  "P612": "mother house",
  "P613": "OS grid reference",
  "P617": "yard number",
  "P618": "source of energy",
  "P619": "time of spacecraft launch",
  "P620": "time of spacecraft landing",
  "P621": "time of spacecraft orbit decay",
  "P622": "spacecraft docking/undocking date",
  "P624": "guidance system",
  "P625": "coordinate location",
  "P626": "Sandbox-GeoCoordinateValue",
  "P627": "IUCN-ID",
  "P628": "E number",
  "P629": "edition or translation of",
  "P630": "Paris city digital code",
  "P631": "structural engineer",
  "P632": "cultural properties of Belarus reference number",
  "P633": "Québec cultural heritage directory ID",
  "P634": "captain",
  "P635": "ISTAT ID",
  "P636": "route of administration",
  "P637": "RefSeq Protein ID",
  "P638": "PDB ID",
  "P639": "RefSeq RNA ID",
  "P640": "Léonore ID",
  "P641": "sport",
  "P642": "of",
  "P644": "genomic start",
  "P645": "genomic end",
  "P646": "Freebase ID",
  "P647": "drafted by",
  "P648": "Open Library ID",
  "P649": "NRHP reference number",
  "P650": "RKDartists",
  "P651": "Biografisch Portaal number",
  "P652": "UNII",
  "P653": "PubMed Health",
  "P654": "direction relative to location",
  "P655": "translator",
  "P656": "RefSeq",
  "P657": "RTECS number",
  "P658": "tracklist",
  "P659": "genomic assembly",
  "P660": "EC classification",
  "P661": "ChemSpider ID",
  "P662": "PubChem ID (CID)",
  "P663": "DSM IV",
  "P664": "organizer",
  "P665": "KEGG ID",
  "P667": "ICPC 2 ID",
  "P668": "GeneReviews ID",
  "P669": "located on street",
  "P670": "street number",
  "P671": "Mouse Genome Informatics ID",
  "P672": "MeSH Code",
  "P673": "eMedicine",
  "P674": "characters",
  "P675": "Google Books ID",
  "P676": "lyrics by",
  "P677": "ÚSOP code",
  "P678": "incertae sedis",
  "P679": "ZVG number",
  "P680": "molecular function",
  "P681": "cell component",
  "P682": "biological process",
  "P683": "ChEBI ID",
  "P684": "ortholog",
  "P685": "NCBI Taxonomy ID",
  "P686": "Gene Ontology ID",
  "P687": "BHL Page ID",
  "P688": "encodes",
  "P689": "afflicts",
  "P690": "space group",
  "P691": "NKCR AUT ID",
  "P692": "Gene Atlas Image",
  "P693": "cleavage",
  "P694": "replaced synonym (for nom. nov.)",
  "P695": "UN number",
  "P696": "Neurolex ID",
  "P697": "ex taxon author",
  "P698": "PubMed ID",
  "P699": "Disease Ontology ID",
  "P700": "Kemler ID",
  "P701": "Dodis",
  "P702": "encoded by",
  "P703": "found in taxon",
  "P704": "Ensembl Transcript ID",
  "P705": "Ensembl Protein ID",
  "P706": "located on terrain feature",
  "P707": "satellite bus",
  "P708": "diocese",
  "P709": "Historic Scotland ID",
  "P710": "participant",
  "P711": "Strunz 8th edition (series ID)",
  "P712": "Nickel-Strunz 9th edition (updated 2009)",
  "P713": "Nickel-Strunz 10th (pending) edition",
  "P714": "Dana 8th edition",
  "P715": "Drugbank ID",
  "P716": "JPL Small-Body Database ID",
  "P717": "Minor Planet Center observatory code",
  "P718": "Canmore ID",
  "P720": "asteroid spectral type",
  "P721": "OKATO ID",
  "P722": "UIC station code",
  "P723": "DBNL ID",
  "P724": "Internet Archive ID",
  "P725": "voice actor",
  "P726": "candidate",
  "P727": "Europeana ID",
  "P728": "GHS hazard statement",
  "P729": "service entry",
  "P730": "service retirement",
  "P731": "Litholex ID",
  "P732": "BGS Lexicon ID",
  "P733": "DINOloket",
  "P734": "family name",
  "P735": "given name",
  "P736": "cover artist",
  "P737": "influenced by",
  "P739": "ammunition",
  "P740": "location of formation",
  "P741": "playing hand",
  "P742": "pseudonym",
  "P744": "asteroid family",
  "P745": "Low German Bibliography and Biography ID",
  "P746": "date of disappearance",
  "P747": "edition(s)",
  "P748": "appointed by",
  "P749": "parent organization",
  "P750": "distributor",
  "P751": "introduced feature",
  "P756": "removed feature",
  "P757": "World Heritage Site ID",
  "P758": "Kulturminne ID",
  "P759": "Alberta Register of Historic Places ID",
  "P760": "DPLA ID",
  "P761": "WISS_ID",
  "P762": "Czech cultural heritage ID",
  "P763": "PEI Register of Historic Places ID",
  "P764": "OKTMO ID",
  "P765": "surface played on",
  "P767": "contributor",
  "P768": "electoral district",
  "P769": "significant drug interaction",
  "P770": "cause of destruction",
  "P771": "Swiss municipality code",
  "P772": "INE municipality code",
  "P773": "ISO 3166-3",
  "P774": "FIPS 55-3 (locations in the US)",
  "P775": "Swedish urban area code",
  "P776": "Swedish minor urban area code",
  "P777": "Swedish civil parish code/ATA code",
  "P778": "Church of Sweden parish code",
  "P779": "Church of Sweden Pastoratskod",
  "P780": "symptoms",
  "P781": "Sikart",
  "P782": "LAU",
  "P783": "hymenium type",
  "P784": "mushroom cap shape",
  "P785": "hymenium attachment",
  "P786": "stipe character",
  "P787": "spore print color",
  "P788": "mushroom ecological type",
  "P789": "edibility",
  "P790": "approved by",
  "P791": "ISIL ID",
  "P792": "chapter",
  "P793": "significant event",
  "P794": "as",
  "P795": "distance along",
  "P796": "geo datum",
  "P797": "authority",
  "P798": "military designation",
  "P799": "Air Ministry specification ID",
  "P800": "notable work",
  "P802": "student",
  "P803": "professorship",
  "P804": "GNIS Antarctica ID",
  "P805": "subject of",
  "P806": "Italian cadastre code",
  "P807": "separated from",
  "P808": "code Bien de Interés Cultural",
  "P809": "WDPA id",
  "P811": "academic minor",
  "P812": "academic major",
  "P813": "retrieved",
  "P814": "IUCN protected areas category",
  "P815": "ITIS TSN",
  "P816": "decays to",
  "P817": "decay mode",
  "P818": "arXiv ID",
  "P819": "ADS bibcode",
  "P820": "arXiv classification",
  "P821": "CGNDB Unique ID",
  "P822": "mascot",
  "P823": "speaker",
  "P824": "Meteoritical Bulletin Database ID",
  "P825": "dedicated to",
  "P826": "tonality",
  "P827": "BBC program ID",
  "P828": "has cause",
  "P829": "OEIS ID",
  "P830": "Encyclopedia of Life ID",
  "P831": "parent club",
  "P832": "public holiday",
  "P833": "interchange station",
  "P834": "train depot",
  "P835": "author citation (zoology)",
  "P836": "GSS code (2011)",
  "P837": "day in year for periodic occurrence",
  "P838": "BioLib ID",
  "P839": "IMSLP ID",
  "P840": "narrative location",
  "P841": "feast day",
  "P842": "Fossilworks ID",
  "P843": "SIRUTA code",
  "P844": "UBIGEO code",
  "P845": "Saskatchewan Register of Heritage Property ID",
  "P846": "Global Biodiversity Information Facility ID",
  "P847": "United States Navy aircraft designation",
  "P849": "Japanese military aircraft designation",
  "P850": "World Register of Marine Species ID",
  "P852": "ESRB rating",
  "P853": "CERO rating",
  "P854": "reference URL",
  "P855": "Sandbox-URL",
  "P856": "official website",
  "P858": "ESPN SCRUM ID",
  "P859": "sponsor",
  "P860": "e-archiv.li ID",
  "P861": "premiershiprugby.com ID",
  "P862": "Operational Requirement of the UK Air Ministry",
  "P863": "InPhO ID",
  "P864": "ACM Digital Library author ID",
  "P865": "BMLO",
  "P866": "Perlentaucher ID",
  "P867": "ROME Occupation Code (v3)",
  "P868": "foods traditionally associated",
  "P870": "instrumentation",
  "P872": "printed by",
  "P873": "phase point",
  "P874": "UN class",
  "P875": "UN code classification",
  "P876": "UN packaging group",
  "P877": "NFPA Other",
  "P878": "avionics",
  "P879": "pennant number",
  "P880": "CPU",
  "P881": "type of variable star",
  "P882": "FIPS 6-4 (US counties)",
  "P883": "FIPS 5-2 (code for US states)",
  "P884": "State Water Register Code (Russia)",
  "P885": "origin of the watercourse",
  "P886": "LIR",
  "P887": "based on heuristic",
  "P888": "JSTOR article ID",
  "P889": "Mathematical Reviews ID",
  "P892": "RfC ID",
  "P893": "Social Science Research Network ID",
  "P894": "Zentralblatt MATH",
  "P897": "United States Army and Air Force aircraft designation",
  "P898": "IPA transcription",
  "P901": "FIPS 10-4 (countries and regions)",
  "P902": "HDS ID",
  "P905": "PORT film ID",
  "P906": "SELIBR",
  "P907": "allgame ID",
  "P908": "PEGI rating",
  "P909": "Nova Scotia Register of Historic Places ID",
  "P910": "topic's main category",
  "P911": "South African municipality code",
  "P912": "has facility",
  "P913": "notation",
  "P914": "USK rating",
  "P915": "filming location",
  "P916": "GSRR rating",
  "P917": "GRAU index",
  "P918": "NOC Occupation Code",
  "P919": "SOC Code (2010)",
  "P920": "Spanish subject headings for public libraries",
  "P921": "main subject",
  "P922": "magnetic ordering",
  "P923": "medical examinations",
  "P924": "medical treatment",
  "P925": "presynaptic connection",
  "P926": "postsynaptic connection",
  "P927": "anatomical location",
  "P928": "activating neurotransmitter",
  "P929": "color space",
  "P930": "type of electrification",
  "P931": "place served by airport",
  "P932": "PMCID",
  "P933": "heritagefoundation.ca ID",
  "P935": "Commons gallery",
  "P937": "work location",
  "P938": "FishBase species ID",
  "P939": "KSH code",
  "P940": "GHS precautionary statements",
  "P941": "inspired by",
  "P942": "theme music",
  "P943": "programmer",
  "P944": "Code of nomenclature",
  "P945": "allegiance",
  "P946": "ISIN",
  "P947": "RSL ID (person)",
  "P948": "Wikivoyage banner",
  "P949": "National Library of Israel ID",
  "P950": "BNE ID",
  "P951": "NSZL ID",
  "P952": "ISCO occupation code",
  "P953": "full text available at",
  "P954": "IBNR ID",
  "P957": "ISBN-10",
  "P958": "section, verse, or paragraph",
  "P959": "MSW ID",
  "P960": "Tropicos taxon name ID",
  "P961": "IPNI plant ID",
  "P962": "MycoBank taxon name ID",
  "P963": "streaming media URL",
  "P964": "Austrian municipality key",
  "P965": "burial plot reference",
  "P966": "MusicBrainz label ID",
  "P967": "guest of honor",
  "P968": "e-mail",
  "P969": "located at street address",
  "P970": "neurological function",
  "P971": "category combines topics",
  "P972": "catalog",
  "P973": "described at URL",
  "P974": "tributary",
  "P980": "code for weekend and holiday homes (Sweden)",
  "P981": "BAG-code for Dutch locations",
  "P982": "MusicBrainz area ID",
  "P984": "IOC country code",
  "P988": "Philippine Standard Geographic Code",
  "P989": "spoken text audio",
  "P990": "audio recording of the subject's spoken voice",
  "P991": "successful candidate",
  "P993": "NFPA Health",
  "P994": "NFPA Fire",
  "P995": "NFPA Reactivity",
  "P996": "scanned file on Wikimedia Commons",
  "P998": "dmoz ID",
  "P999": "ARICNS",
  "P1000": "record held",
  "P1001": "applies to jurisdiction",
  "P1002": "engine configuration",
  "P1003": "NLR (Romania) ID",
  "P1004": "MusicBrainz place ID",
  "P1005": "PTBNP ID",
  "P1006": "National Thesaurus for Author Names ID",
  "P1007": "Lattes Platform number",
  "P1010": "Iran statistics ID",
  "P1011": "excluding",
  "P1012": "including",
  "P1013": "criterion used",
  "P1014": "AAT ID",
  "P1015": "BIBSYS ID",
  "P1016": "asteroid taxonomy",
  "P1017": "BAV ID",
  "P1018": "language regulatory body",
  "P1019": "feed URL",
  "P1021": "KldB-2010 occupation code",
  "P1022": "CNO-11 occupation code",
  "P1023": "SBC-2010 occupation code",
  "P1024": "SBFI occupation code",
  "P1025": "SUDOC editions",
  "P1026": "doctoral thesis",
  "P1027": "conferred by",
  "P1028": "donated by",
  "P1029": "crew member",
  "P1030": "light characteristic of a lighthouse",
  "P1031": "legal citation of this text",
  "P1032": "Digital Rights Management system",
  "P1033": "GHS signal word",
  "P1034": "main food source",
  "P1035": "honorific suffix",
  "P1036": "Dewey Decimal Classification",
  "P1037": "manager/director",
  "P1038": "relative",
  "P1039": "type of kinship",
  "P1040": "film editor",
  "P1041": "sockets supported",
  "P1042": "ZDB ID",
  "P1043": "IDEO Job ID",
  "P1044": "SWB editions",
  "P1045": "Sycomore ID",
  "P1046": "discovery method",
  "P1047": "Catholic Hierarchy person ID",
  "P1048": "NCL ID",
  "P1049": "deity of",
  "P1050": "medical condition",
  "P1051": "PSH ID",
  "P1052": "Portuguese Job Code CPP-2010",
  "P1053": "ResearcherID",
  "P1054": "NDL bib id",
  "P1055": "NLM Unique ID",
  "P1056": "product",
  "P1057": "chromosome",
  "P1058": "ERA Journal ID",
  "P1059": "CVR number",
  "P1060": "pathogen transmission process",
  "P1064": "track gauge",
  "P1065": "archive URL",
  "P1066": "student of",
  "P1067": "Thailand central administrative unit code",
  "P1068": "instruction set",
  "P1069": "Statistics Denmarks classification of occupation (DISCO-08)",
  "P1070": "PlantList-ID",
  "P1071": "location of final assembly",
  "P1072": "readable file format",
  "P1073": "writable file format",
  "P1074": "fictional analog of",
  "P1075": "rector",
  "P1076": "ICTV virus ID",
  "P1077": "KOATUU identifier",
  "P1078": "valvetrain configuration",
  "P1079": "launch contractor",
  "P1080": "from fictional universe",
  "P1081": "Human Development Index",
  "P1082": "population",
  "P1083": "maximum capacity",
  "P1084": "EUL editions",
  "P1085": "LibraryThing work ID",
  "P1086": "atomic number",
  "P1087": "Elo rating",
  "P1088": "Mohs' hardness",
  "P1090": "redshift",
  "P1092": "total produced",
  "P1093": "gross tonnage",
  "P1096": "orbital eccentricity",
  "P1097": "g-factor",
  "P1098": "number of speakers",
  "P1099": "number of masts",
  "P1100": "number of cylinders",
  "P1101": "floors above ground",
  "P1102": "flattening",
  "P1103": "number of platform tracks",
  "P1104": "number of pages",
  "P1106": "sandbox-quantity",
  "P1107": "proportion",
  "P1108": "electronegativity",
  "P1109": "refractive index",
  "P1110": "attendance",
  "P1111": "votes received",
  "P1112": "Pokédex number",
  "P1113": "number of episodes",
  "P1114": "quantity",
  "P1115": "ATVK ID",
  "P1116": "ELSTAT geographical code",
  "P1117": "pKa",
  "P1120": "number of deaths",
  "P1121": "oxidation state",
  "P1122": "spin quantum number",
  "P1123": "parity",
  "P1124": "TEU",
  "P1125": "Gini coefficient",
  "P1126": "isospin quantum number",
  "P1127": "isospin z-component",
  "P1128": "employees",
  "P1129": "national team caps",
  "P1132": "number of participants",
  "P1133": "DGO4 identifier",
  "P1135": "nomenclatural status",
  "P1136": "solved by",
  "P1137": "fossil found in this unit",
  "P1138": "Kunstindeks Danmark Artist ID",
  "P1139": "floors below ground",
  "P1140": "EHAK id",
  "P1141": "number of processor cores",
  "P1142": "political ideology",
  "P1143": "BN (Argentine) editions",
  "P1144": "LCOC LCCN (bibliographic)",
  "P1145": "Lagrangian point",
  "P1146": "IAAF ID",
  "P1148": "neutron number",
  "P1149": "Library of Congress Classification",
  "P1150": "Regensburg Classification",
  "P1151": "topic's main Wikimedia portal",
  "P1153": "Scopus Author ID",
  "P1154": "Scopus EID",
  "P1155": "Scopus Affiliation ID",
  "P1156": "Scopus Source ID",
  "P1157": "US Congress Bio ID",
  "P1158": "location of landing",
  "P1159": "CODEN",
  "P1160": "ISO 4 abbreviation",
  "P1161": "Z39.5 abbreviation",
  "P1162": "Bluebook abbreviation",
  "P1163": "Internet media type",
  "P1164": "cardinality of the group",
  "P1165": "home world",
  "P1167": "USB ID",
  "P1168": "municipality code (Denmark)",
  "P1170": "transmitted signal",
  "P1171": "approximation algorithm",
  "P1172": "Geokod",
  "P1174": "visitors per year",
  "P1181": "numeric value",
  "P1182": "LIBRIS editions",
  "P1183": "Gewässerkennzahl",
  "P1184": "handle",
  "P1185": "Rodovid ID",
  "P1186": "MEP directory ID",
  "P1187": "Dharma Drum Buddhist College person ID",
  "P1188": "Dharma Drum Buddhist College place ID",
  "P1189": "Chinese Library Classification",
  "P1190": "Universal Decimal Classification",
  "P1191": "first performance",
  "P1192": "connecting service",
  "P1193": "prevalence",
  "P1194": "received signal",
  "P1195": "file extension",
  "P1196": "manner of death",
  "P1198": "unemployment rate",
  "P1199": "mode of inheritance",
  "P1200": "bodies of water basin category",
  "P1201": "space tug",
  "P1202": "carries scientific instrument",
  "P1203": "Finnish municipality number",
  "P1204": "Wikimedia portal's main topic",
  "P1207": "NUKAT (WarsawU) authorities",
  "P1208": "ISMN",
  "P1209": "CN",
  "P1210": "supercharger",
  "P1211": "fuel system",
  "P1212": "Atlas ID",
  "P1213": "NLC authorities",
  "P1214": "Riksdagen person-id",
  "P1215": "apparent magnitude",
  "P1216": "National Heritage List for England number",
  "P1217": "Internet Broadway Database venue ID",
  "P1218": "Internet Broadway Database production ID",
  "P1219": "Internet Broadway Database show ID",
  "P1220": "Internet Broadway Database person ID",
  "P1221": "compressor type",
  "P1222": "NARA person ID",
  "P1223": "NARA organization ID",
  "P1224": "NARA geographic ID",
  "P1225": "NARA topical subject ID",
  "P1226": "NARA specific records type ID",
  "P1227": "astronomical filter",
  "P1229": "Openpolis ID",
  "P1230": "JSTOR journal code",
  "P1231": "NARA catalog record ID",
  "P1232": "Linguist list code",
  "P1233": "ISFDB author ID",
  "P1234": "ISFDB publication ID",
  "P1235": "ISFDB series ID",
  "P1236": "Parsons code",
  "P1237": "Box Office Mojo film ID",
  "P1238": "Swedish Football Association ID",
  "P1239": "ISFDB publisher ID",
  "P1240": "Danish Bibliometric Research Indicator level",
  "P1241": "Swiss Football Association Club Number",
  "P1242": "Theatricalia play ID",
  "P1243": "International Standard Recording Code",
  "P1244": "phone number (URL)",
  "P1245": "OmegaWiki Defined Meaning",
  "P1246": "patent number",
  "P1247": "compression ratio",
  "P1248": "KulturNav-id",
  "P1249": "time of earliest written record",
  "P1250": "Danish Bibliometric Research Indicator (BFI) SNO/CNO",
  "P1251": "ABS ASCL code",
  "P1252": "AUSTLANG code",
  "P1253": "BCU Ecrivainsvd",
  "P1254": "Slovenska biografija ID",
  "P1255": "Helveticarchives ID",
  "P1256": "Iconclass notation",
  "P1257": "depicts Iconclass notation",
  "P1258": "Rotten Tomatoes ID",
  "P1259": "coordinates of the point of view",
  "P1260": "Cultural heritage database in Sweden",
  "P1261": "Rundata",
  "P1262": "RAÄ-nummer",
  "P1263": "NNDB people ID",
  "P1264": "valid in period",
  "P1265": "AlloCiné Movie ID",
  "P1266": "AlloCiné person ID",
  "P1267": "AlloCiné series ID",
  "P1268": "represents organisation",
  "P1269": "facet of",
  "P1270": "Norway Database for Statistics on Higher education periodical ID",
  "P1271": "Norway Database for Statistics on Higher education publisher ID",
  "P1272": "Norway Import Service and Registration Authority periodical code",
  "P1273": "CANTIC-ID",
  "P1274": "ISFDB title ID",
  "P1275": "Norway Import Service and Registration Authority publisher code",
  "P1276": "Dictionnaire du Jura ID",
  "P1277": "Jufo ID",
  "P1278": "Legal Entity ID",
  "P1279": "inflation rate",
  "P1280": "CONOR ID",
  "P1281": "WOEID",
  "P1282": "OpenStreetMap tag or key",
  "P1283": "filmography",
  "P1284": "Munzinger IBA",
  "P1285": "Munzinger Sport number",
  "P1286": "Munzinger Pop ID",
  "P1287": "KDG Komponisten der Gegenwart",
  "P1288": "KLG Kritisches Lexikon der Gegenwartsliteratur",
  "P1289": "KLfG Kritisches Lexikon der fremdsprachigen Gegenwartsliteratur",
  "P1290": "godparent",
  "P1291": "Association Authors of Switzerland ID",
  "P1292": "DNB editions",
  "P1293": "Royal Aero Club Aviator's Certificate ID",
  "P1294": "WWF ecoregion code",
  "P1295": "emissivity",
  "P1296": "Gran Enciclopèdia Catalana ID",
  "P1297": "IRS Employer ID",
  "P1299": "depicted by",
  "P1300": "bibcode",
  "P1301": "number of elevators",
  "P1302": "primary destinations",
  "P1303": "instrument",
  "P1304": "central bank",
  "P1305": "Skyscraper Center ID",
  "P1307": "Swiss parliament ID",
  "P1308": "officeholder",
  "P1309": "EGAXA ID",
  "P1310": "statement disputed by",
  "P1311": "lostbridges.org ID",
  "P1312": "has facet polytope",
  "P1313": "office held by head of government",
  "P1314": "number of spans",
  "P1315": "People Australia ID",
  "P1316": "SMDB ID",
  "P1317": "floruit",
  "P1318": "proved by",
  "P1319": "earliest date",
  "P1320": "OpenCorporates ID",
  "P1321": "place of origin (Switzerland)",
  "P1322": "dual to",
  "P1323": "Terminologia Anatomica 98",
  "P1324": "source code repository",
  "P1325": "external data available at",
  "P1326": "latest date",
  "P1327": "professional or sports partner",
  "P1329": "phone number",
  "P1330": "MusicBrainz instrument ID",
  "P1331": "PACE member ID",
  "P1332": "coordinate of northernmost point",
  "P1333": "coordinate of southernmost point",
  "P1334": "coordinate of easternmost point",
  "P1335": "coordinate of westernmost point",
  "P1336": "territory claimed by",
  "P1338": "EPSG ID",
  "P1339": "number of injured",
  "P1340": "eye color",
  "P1341": "Italian Chamber of Deputies ID",
  "P1342": "number of seats",
  "P1343": "described by source",
  "P1344": "participant of",
  "P1345": "number of victims made by killer",
  "P1346": "winner",
  "P1347": "military casualty classification",
  "P1348": "AlgaeBase URL",
  "P1349": "ploidy",
  "P1350": "number of matches played",
  "P1351": "number of points/goals scored",
  "P1352": "ranking",
  "P1353": "original spelling",
  "P1354": "shown with features",
  "P1355": "wins",
  "P1356": "losses",
  "P1357": "matches/games drawn/tied",
  "P1358": "points for",
  "P1359": "number of points/goals conceded",
  "P1360": "Monte Carlo Particle Number",
  "P1362": "Theaterlexikon der Schweiz online ID",
  "P1363": "points/goal scored by",
  "P1364": "ITTF ID",
  "P1365": "replaces",
  "P1366": "replaced by",
  "P1367": "Art UK artist ID",
  "P1368": "LNB ID",
  "P1369": "Iranian National Heritage registration number",
  "P1370": "IHSI ID",
  "P1371": "ASI Monument ID",
  "P1372": "binding of software library",
  "P1373": "daily ridership",
  "P1375": "NSK ID",
  "P1376": "capital of",
  "P1377": "MTR station code",
  "P1378": "China railway TMIS station code",
  "P1380": "uglybridges.com ID",
  "P1381": "bridgehunter.com ID",
  "P1382": "coincident with",
  "P1383": "contains settlement",
  "P1385": "Enciclopédia Açoriana ID",
  "P1386": "Japanese High School Code",
  "P1387": "political alignment",
  "P1388": "German regional key",
  "P1389": "product certification",
  "P1390": "match time of score (minutes)",
  "P1391": "Index Fungorum ID",
  "P1392": "ComicBookDB ID",
  "P1393": "proxy",
  "P1394": "Glottolog code",
  "P1395": "National Cancer Institute ID",
  "P1396": "Linguasphere code",
  "P1397": "State Catalogue of Geographical Names (Russia) ID",
  "P1398": "structure replaces",
  "P1399": "convicted of",
  "P1400": "FCC Facility ID",
  "P1401": "bug tracking system",
  "P1402": "Foundational Model of Anatomy ID",
  "P1403": "original combination",
  "P1404": "World Glacier Inventory ID",
  "P1406": "script directionality",
  "P1407": "MusicBrainz series ID",
  "P1408": "licensed to broadcast to",
  "P1409": "Cycling Archives ID (cyclist)",
  "P1410": "number of representatives in an organization/legislature",
  "P1411": "nominated for",
  "P1412": "languages spoken or written",
  "P1414": "GUI toolkit or framework",
  "P1415": "Oxford Biography Index Number",
  "P1416": "affiliation",
  "P1417": "Encyclopædia Britannica Online ID",
  "P1418": "orbits completed",
  "P1419": "shape",
  "P1420": "taxon synonym",
  "P1421": "GRIN URL",
  "P1422": "Sandrart.net person ID",
  "P1423": "template's main topic",
  "P1424": "topic's main template",
  "P1425": "ecoregion (WWF)",
  "P1427": "start point",
  "P1428": "Lost Art-ID",
  "P1429": "pet",
  "P1430": "OpenPlaques subject ID",
  "P1431": "executive producer",
  "P1432": "b-side",
  "P1433": "published in",
  "P1434": "describes the fictional universe",
  "P1435": "heritage status",
  "P1436": "collection or exhibition size",
  "P1437": "plea",
  "P1438": "Jewish Encyclopedia ID (Russian)",
  "P1439": "Norsk filmografi ID",
  "P1440": "Fide ID",
  "P1441": "present in work",
  "P1442": "image of grave",
  "P1443": "score method",
  "P1444": "destination point",
  "P1445": "fictional universe described in",
  "P1446": "number of missing",
  "P1447": "Sports Reference ID",
  "P1448": "official name",
  "P1449": "nickname",
  "P1450": "Sandbox Monolingual text",
  "P1451": "motto text",
  "P1453": "catholic.ru ID",
  "P1454": "legal form",
  "P1455": "list of works",
  "P1456": "list of monuments",
  "P1457": "absolute magnitude",
  "P1458": "color index",
  "P1459": "Cadw Building ID",
  "P1460": "NIEA building ID",
  "P1461": "Patientplus ID",
  "P1462": "standards body",
  "P1463": "PRDL Author ID",
  "P1464": "category for people born here",
  "P1465": "category for people who died here",
  "P1466": "WALS lect code",
  "P1467": "WALS genus code",
  "P1468": "WALS family code",
  "P1469": "FIFA player code",
  "P1470": "maximum glide ratio",
  "P1471": "reporting mark",
  "P1472": "Commons Creator page",
  "P1473": "Nupill Literatura Digital - Author",
  "P1474": "Nupill Literatura Digital - Document",
  "P1476": "title",
  "P1477": "birth name",
  "P1478": "has immediate cause",
  "P1479": "has contributing factor",
  "P1480": "sourcing circumstances",
  "P1481": "vici.org ID",
  "P1482": "Stack Exchange tag",
  "P1483": "kulturnoe-nasledie.ru ID",
  "P1529": "Glad identifier",
  "P1531": "parent(s) of this hybrid",
  "P1532": "country for sport",
  "P1533": "family name identical to this first name",
  "P1534": "end cause",
  "P1535": "used by",
  "P1536": "immediate cause of",
  "P1537": "contributing factor of",
  "P1538": "number of households",
  "P1539": "female population",
  "P1540": "male population",
  "P1541": "Cycling Quotient (cyclist, man) ID",
  "P1542": "cause of",
  "P1543": "monogram",
  "P1544": "Federal Register Document Number",
  "P1545": "series ordinal",
  "P1546": "motto",
  "P1547": "depends on software",
  "P1548": "maximum Strahler number",
  "P1549": "demonym",
  "P1550": "Orphanet ID",
  "P1551": "Exceptional heritage of Wallonia ID",
  "P1552": "has quality",
  "P1553": "Yandex.Music artist ID",
  "P1554": "UBERON ID",
  "P1555": "Executive Order number",
  "P1556": "zbMATH author ID",
  "P1557": "manifestation of",
  "P1558": "tempo marking",
  "P1559": "name in native language",
  "P1560": "given name version for other gender",
  "P1561": "number of survivors",
  "P1562": "AllMovie Movie ID",
  "P1563": "MacTutor id (biographies)",
  "P1564": "At the Circulating Library ID",
  "P1565": "Enciclopedia de la Literatura en México ID",
  "P1566": "GeoNames ID",
  "P1567": "NIS/INS code",
  "P1568": "domain",
  "P1571": "codomain",
  "P1573": "BBC Genome ID",
  "P1574": "exemplar of",
  "P1575": "RISS catalog",
  "P1576": "lifestyle",
  "P1577": "Gregory-Aland-Number",
  "P1578": "Gmelin number",
  "P1579": "Beilstein Registry Number",
  "P1580": "University of Barcelona authority ID",
  "P1581": "official blog",
  "P1582": "natural product of taxon",
  "P1583": "MalaCards ID",
  "P1584": "Pleiades ID",
  "P1585": "Brazilian municipality code",
  "P1586": "Catalan object of cultural interest ID",
  "P1587": "Slovene Cultural Heritage Register ID",
  "P1588": "Desa code of Indonesia",
  "P1589": "deepest point",
  "P1590": "number of casualties",
  "P1591": "defendant",
  "P1592": "prosecutor",
  "P1593": "defender",
  "P1594": "judge",
  "P1595": "charge",
  "P1596": "penalty",
  "P1598": "consecrator",
  "P1599": "Cambridge Alumni Database ID",
  "P1600": "code Inventari del Patrimoni Arquitectònic de Catalunya",
  "P1601": "Esperantist ID",
  "P1602": "Art UK venue ID",
  "P1603": "number of cases",
  "P1604": "biosafety level",
  "P1605": "has natural reservoir",
  "P1606": "natural reservoir of",
  "P1607": "Dialnet author ID",
  "P1608": "Dialnet book",
  "P1609": "Dialnet journal",
  "P1610": "Dialnet article",
  "P1611": "NATO code for grade",
  "P1612": "Commons Institution page",
  "P1613": "IRC channel",
  "P1614": "History of Parliament ID",
  "P1615": "CLARA-ID",
  "P1616": "SIREN number",
  "P1617": "BBC Things ID",
  "P1618": "sport number",
  "P1619": "date of official opening",
  "P1620": "plaintiff",
  "P1621": "detail map",
  "P1622": "driving side",
  "P1624": "MarineTraffic Port ID",
  "P1625": "has melody",
  "P1626": "Thai cultural heritage ID",
  "P1627": "Ethnologue.com code",
  "P1628": "equivalent property",
  "P1629": "subject item of this property",
  "P1630": "formatter URL",
  "P1631": "China Vitae ID",
  "P1632": "Hermann-Mauguin notation",
  "P1635": "religious name",
  "P1636": "date of baptism in early childhood",
  "P1637": "undercarriage",
  "P1638": "working title",
  "P1639": "pendant of",
  "P1640": "curator",
  "P1641": "port",
  "P1642": "acquisition transaction",
  "P1643": "departure transaction",
  "P1644": "EgliseInfo ID",
  "P1645": "NIST/CODATA ID",
  "P1646": "mandatory qualifier",
  "P1647": "subproperty of",
  "P1648": "Dictionary of Welsh Biography ID",
  "P1649": "KMDb person ID",
  "P1650": "BBF ID",
  "P1651": "YouTube video ID",
  "P1652": "referee",
  "P1653": "TERYT municipality code",
  "P1654": "wing configuration",
  "P1656": "unveiled by",
  "P1657": "MPAA film rating",
  "P1658": "number of faces",
  "P1659": "see also",
  "P1660": "has index case",
  "P1661": "Alexa rank",
  "P1662": "DOI Prefix",
  "P1663": "ProCyclingStats ID (cyclist)",
  "P1664": "Cycling Database ID",
  "P1665": "Chess Games ID",
  "P1666": "Chess Club ID",
  "P1667": "TGN ID",
  "P1668": "ATCvet",
  "P1669": "CONA ID",
  "P1670": "LAC ID",
  "P1671": "route number",
  "P1672": "this taxon is source of",
  "P1673": "general formula",
  "P1674": "number confirmed",
  "P1675": "number probable",
  "P1676": "number suspected",
  "P1677": "index case of",
  "P1678": "has vertex figure",
  "P1679": "Art UK artwork ID",
  "P1680": "subtitle",
  "P1683": "quote",
  "P1684": "inscription",
  "P1685": "Pokémon browser number",
  "P1686": "for work",
  "P1687": "Wikidata property",
  "P1688": "AniDB ID",
  "P1689": "central government debt as a percent of GDP",
  "P1690": "ICD-10-PCS",
  "P1691": "operations and procedures key (OPS)",
  "P1692": "ICD-9-CM",
  "P1693": "Terminologia Embryologica",
  "P1694": "Terminologia Histologica",
  "P1695": "NLP ID",
  "P1696": "inverse of",
  "P1697": "total valid votes",
  "P1699": "SkyscraperPage building id",
  "P1700": "SIPA ID",
  "P1702": "IGESPAR ID",
  "P1703": "is pollinated by",
  "P1704": "is pollinator of",
  "P1705": "native label",
  "P1706": "together with",
  "P1707": "DAAO ID",
  "P1708": "LfDS object ID",
  "P1709": "equivalent class",
  "P1710": "Sächsische Biografie",
  "P1711": "British Museum person-institution",
  "P1712": "Metacritic ID",
  "P1713": "biography at the Bundestag of Germany",
  "P1714": "Journalisted ID",
  "P1715": "RKD/ESD (Slovenia) ID",
  "P1716": "brand",
  "P1717": "SANDRE ID",
  "P1721": "pinyin transliteration",
  "P1725": "beats per minute",
  "P1726": "Florentine musea Inventario 1890  ID",
  "P1727": "Flora of North America taxon ID",
  "P1728": "AllMusic artist ID",
  "P1729": "AllMusic album ID",
  "P1730": "AllMusic song ID",
  "P1731": "Fach",
  "P1732": "Naturbase ID",
  "P1733": "Steam Application ID",
  "P1734": "oath of office date",
  "P1735": "Comedien.ch ID",
  "P1736": "Information Center for Israeli Art artist ID",
  "P1738": "Merck Index monograph",
  "P1739": "CiNii book ID",
  "P1740": "category for films shot at this location",
  "P1741": "GTAA id",
  "P1743": "Bradley and Fletcher checklist number",
  "P1744": "Agassiz et al checklist number",
  "P1745": "VASCAN ID",
  "P1746": "ZooBank nomenclatural act",
  "P1747": "Flora of China ID",
  "P1748": "NCI Thesaurus ID",
  "P1749": "Parlement & Politiek ID",
  "P1750": "name day",
  "P1751": "Art UK collection ID",
  "P1752": "scale",
  "P1753": "list related to category",
  "P1754": "category related to list",
  "P1755": "Aviation Safety Network accident ID",
  "P1760": "Aviation Safety Network Wikibase Occurrence",
  "P1761": "Watson & Dallwitz family ID",
  "P1762": "Hornbostel-Sachs classification",
  "P1763": "National Pipe Organ Register ID",
  "P1764": "Flemish organization for Immovable Heritage relict ID",
  "P1766": "place name sign",
  "P1769": "denkXweb identifier",
  "P1770": "Romania LMI code",
  "P1771": "Integrated Postsecondary Education Data System ID",
  "P1772": "USDA PLANTS ID",
  "P1773": "attributed to",
  "P1774": "workshop of",
  "P1775": "follower of",
  "P1776": "circle of",
  "P1777": "manner of",
  "P1778": "forgery after",
  "P1779": "possible creator",
  "P1780": "school of",
  "P1782": "courtesy name",
  "P1785": "temple name",
  "P1786": "posthumous name",
  "P1787": "art-name",
  "P1788": "DVN ID",
  "P1789": "chief operating officer",
  "P1791": "category of people buried here",
  "P1792": "category of associated people",
  "P1793": "format as a regular expression",
  "P1794": "bureau du patrimoine de Seine-Saint-Denis ID",
  "P1795": "Smithsonian American Art Museum: person/institution thesaurus id",
  "P1796": "International Standard Industrial Classification code",
  "P1798": "ISO 639-5 code",
  "P1799": "Maltese Islands National Inventory of Cultural Property ID",
  "P1800": "Wikimedia database name",
  "P1801": "commemorative plaque image",
  "P1802": "EMLO person ID",
  "P1803": "Masaryk University person ID",
  "P1804": "DNF film ID",
  "P1806": "ABoK number",
  "P1807": "Great Aragonese Encyclopedia ID",
  "P1808": "senat.fr ID",
  "P1809": "choreographer",
  "P1810": "named as",
  "P1811": "list of episodes",
  "P1813": "short name",
  "P1814": "name in kana",
  "P1815": "RSL scanned book's identifier",
  "P1816": "National Portrait Gallery (London) person ID",
  "P1817": "addressee",
  "P1818": "Kaiserhof ID",
  "P1819": "genealogics.org person ID",
  "P1820": "Open Food Facts food additive slug",
  "P1821": "Open Food Facts food category slug",
  "P1822": "DSH object ID",
  "P1823": "BAnQ ID",
  "P1824": "road number",
  "P1825": "Baseball-Reference.com major league player ID",
  "P1826": "Baseball-Reference.com minor league player ID",
  "P1827": "ISWC",
  "P1828": "IPI number",
  "P1829": "Roud Folk Song Index number",
  "P1830": "owner of",
  "P1831": "electorate",
  "P1832": "GrassBase ID",
  "P1833": "number of registered users/contributors",
  "P1836": "draft pick number",
  "P1837": "Gaoloumi ID",
  "P1838": "PSS-archi ID",
  "P1839": "US Federal Election Commission ID",
  "P1840": "investigated by",
  "P1841": "Swedish district code",
  "P1842": "Global Anabaptist Mennonite Encyclopedia Online ID",
  "P1843": "taxon common name",
  "P1844": "HathiTrust id",
  "P1845": "anti-virus alias",
  "P1846": "distribution map",
  "P1847": "Nasjonalbiblioteket photographer ID",
  "P1848": "INPN Code",
  "P1849": "SSR WrittenForm ID",
  "P1850": "SSR Name ID",
  "P1851": "input set",
  "P1852": "Perry Index",
  "P1853": "blood type",
  "P1854": "Kiev street code",
  "P1855": "Wikidata property example",
  "P1866": "Catholic Hierarchy diocese ID",
  "P1867": "eligible voters",
  "P1868": "ballots cast",
  "P1869": "Hall of Valor ID",
  "P1870": "Name Assigning Authority Number",
  "P1871": "CERL ID",
  "P1872": "minimum number of players",
  "P1873": "maximum number of players",
  "P1874": "Netflix ID",
  "P1875": "represented by",
  "P1876": "spacecraft",
  "P1877": "after a work by",
  "P1878": "Vox-ATypI classification",
  "P1879": "income classification (Philippines)",
  "P1880": "measured by",
  "P1881": "list of characters",
  "P1882": "Web Gallery of Art ID",
  "P1883": "Declarator.org ID",
  "P1884": "hair color",
  "P1885": "cathedral",
  "P1886": "Smithsonian volcano ID",
  "P1888": "Dictionary of Medieval Names from European Sources entry",
  "P1889": "different from",
  "P1890": "BNC ID",
  "P1891": "signatory",
  "P1893": "OpenPlaques plaque ID",
  "P1894": "Danish urban area code",
  "P1895": "Fauna Europaea ID",
  "P1896": "source website for the property",
  "P1897": "highest note",
  "P1898": "lowest note",
  "P1899": "Librivox author ID",
  "P1900": "EAGLE id",
  "P1901": "BALaT person/organisation id",
  "P1902": "Spotify artist ID",
  "P1903": "volcanic explosivity index",
  "P1905": "FundRef registry name",
  "P1906": "office held by head of state",
  "P1907": "Australian Dictionary of Biography ID",
  "P1908": "CWGC person ID",
  "P1909": "side effect",
  "P1910": "decreased expression in",
  "P1911": "increased expression in",
  "P1912": "deletion association with",
  "P1913": "gene duplication association with",
  "P1914": "gene insertion association with",
  "P1915": "gene inversion association with",
  "P1916": "gene substitution association with",
  "P1917": "posttranslational modification association with",
  "P1918": "altered regulation leads to",
  "P1919": "Ministry of Education of Chile school ID",
  "P1920": "CWGC burial ground ID",
  "P1921": "URI pattern for RDF resource",
  "P1922": "first line",
  "P1923": "participating teams",
  "P1924": "vaccine for",
  "P1925": "VIOLIN ID",
  "P1928": "Vaccine Ontology ID",
  "P1929": "ClinVar accession",
  "P1930": "DSM V",
  "P1931": "NIOSH Pocket Guide ID",
  "P1932": "stated as",
  "P1933": "MobyGames ID",
  "P1934": "Animator.ru film ID",
  "P1935": "DBCS ID",
  "P1936": "Digital Atlas of the Roman Empire ID",
  "P1937": "UN/LOCODE",
  "P1938": "Project Gutenberg author ID",
  "P1939": "Dyntaxa ID",
  "P1940": "conifers.org ID",
  "P1942": "McCune-Reischauer romanization",
  "P1943": "location map",
  "P1944": "relief location map",
  "P1945": "street key",
  "P1946": "National Library of Ireland authority",
  "P1947": "Mapillary ID",
  "P1948": "BerlPap identifier",
  "P1949": "CulturaItalia ID",
  "P1950": "second surname in Spanish name",
  "P1951": "investor",
  "P1952": "Encyclopaedia Metallum band ID",
  "P1953": "Discogs artist ID",
  "P1954": "Discogs master ID",
  "P1955": "Discogs label ID",
  "P1956": "takeoff and landing capability",
  "P1957": "Wikisource index page",
  "P1958": "Trismegistos Geo ID",
  "P1959": "Dutch Senate person ID",
  "P1960": "Google Scholar ID",
  "P1961": "Comité des travaux historiques et scientifiques ID",
  "P1962": "patron",
  "P1963": "properties for this type",
  "P1966": "Biblioteca Nacional de Chile catalogue number",
  "P1967": "BoxRec ID",
  "P1968": "Foursquare venue ID",
  "P1969": "MovieMeter director ID",
  "P1970": "MovieMeter Movie ID",
  "P1971": "number of children",
  "P1972": "Open Hub ID",
  "P1973": "RSL editions",
  "P1976": "INEGI locality ID",
  "P1977": "lesarchivesduspectacle ID",
  "P1978": "USDA NDB number",
  "P1979": "Righteous Among The Nations ID",
  "P1980": "PolSys ID",
  "P1981": "FSK film rating",
  "P1982": "Anime News Network person ID",
  "P1983": "Anime News Network company ID",
  "P1984": "Anime News Network manga ID",
  "P1985": "Anime News Network anime ID",
  "P1986": "Dizionario Biografico degli Italiani",
  "P1987": "MCN code",
  "P1988": "Delarge ID",
  "P1989": "Encyclopaedia Metallum artist ID",
  "P1990": "species kept",
  "P1991": "LPSN URL",
  "P1992": "Plazi ID",
  "P1993": "TeX string",
  "P1994": "AllMusic composition ID",
  "P1995": "medical specialty",
  "P1996": "parliament.uk ID",
  "P1997": "Facebook Places ID",
  "P1998": "UCI code",
  "P1999": "UNESCO language status",
  "P2000": "CPDL ID",
  "P2001": "Revised Romanisation",
  "P2002": "Twitter username",
  "P2003": "Instagram username",
  "P2004": "NALT ID",
  "P2005": "Catalogus Professorum Halensis",
  "P2006": "ZooBank author ID",
  "P2007": "ZooBank publication ID",
  "P2008": "IPNI publication ID",
  "P2009": "Exif model",
  "P2010": "Exif make",
  "P2011": "Cooper-Hewitt Person ID",
  "P2012": "cuisine",
  "P2013": "Facebook ID",
  "P2014": "MoMA artwork id",
  "P2015": "Hansard ID",
  "P2016": "Catalogus Professorum Academiae Groninganae id",
  "P2017": "isomeric SMILES",
  "P2018": "Teuchos ID",
  "P2019": "AllMovie artist ID",
  "P2020": "worldfootball.net ID",
  "P2021": "Erdős number",
  "P2024": "German cattle breed ID",
  "P2025": "Find A Grave cemetery ID",
  "P2026": "Avibase ID",
  "P2027": "Colour Index International constitution ID",
  "P2028": "United States Armed Forces service number",
  "P2029": "Dictionary of Ulster Biography ID",
  "P2030": "NASA biographical ID",
  "P2031": "work period (start)",
  "P2032": "work period (end)",
  "P2033": "Category for pictures taken with camera",
  "P2034": "Project Gutenberg ebook ID",
  "P2035": "LinkedIn personal profile URL",
  "P2036": "African Plant Database",
  "P2037": "GitHub username",
  "P2038": "ResearchGate person ID",
  "P2040": "CITES Species+ ID",
  "P2041": "National Gallery of Victoria artist ID",
  "P2042": "Artsy artist ID",
  "P2043": "length",
  "P2044": "elevation above sea level",
  "P2045": "orbital inclination",
  "P2046": "area",
  "P2047": "duration",
  "P2048": "height",
  "P2049": "width",
  "P2050": "wingspan",
  "P2051": "M sin i",
  "P2052": "speed",
  "P2053": "watershed area",
  "P2054": "density",
  "P2055": "electrical conductivity",
  "P2056": "heat capacity",
  "P2057": "HMDB ID",
  "P2058": "depositor",
  "P2060": "luminosity",
  "P2061": "aspect ratio",
  "P2062": "HSDB ID",
  "P2063": "LIPID MAPS ID",
  "P2064": "KNApSAcK ID",
  "P2065": "NIAID ChemDB ID",
  "P2066": "fusion enthalpy",
  "P2067": "mass",
  "P2068": "thermal conductivity",
  "P2069": "magnetic moment",
  "P2070": "Fellow of the Royal Society",
  "P2071": "Mémoire des hommes",
  "P2072": "CDB Chemical ID",
  "P2073": "vehicle range",
  "P2074": "internetmedicin.se ID",
  "P2075": "speed of sound",
  "P2076": "temperature",
  "P2077": "pressure",
  "P2078": "user manual link",
  "P2079": "fabrication method",
  "P2080": "AcademiaNet",
  "P2081": "BLDAM object ID",
  "P2082": "M.49 code",
  "P2083": "Leadscope ID",
  "P2084": "ZINC ID",
  "P2085": "Nikkaji",
  "P2086": "CDD Public ID",
  "P2087": "CrunchBase person ID",
  "P2088": "CrunchBase organisation ID",
  "P2089": "Library of Congress JukeBox ID",
  "P2090": "Power of 10 athlete ID",
  "P2091": "FISA ID",
  "P2092": "Bildindex der Kunst und Architektur ID",
  "P2093": "short author name",
  "P2094": "competition class",
  "P2095": "co-driver",
  "P2096": "image legend",
  "P2097": "term length of office",
  "P2098": "substitute/deputy/replacement of office/officeholder",
  "P2099": "BC Geographical Names ID",
  "P2100": "Banque de noms de lieux du Québec id",
  "P2101": "melting point",
  "P2102": "boiling point",
  "P2103": "size of team at start",
  "P2105": "size of team at finish",
  "P2106": "RXNO Ontology",
  "P2107": "decomposition point",
  "P2108": "Kunstindeks Danmark artwork ID",
  "P2109": "power output",
  "P2112": "wing area",
  "P2113": "sublimation temperature",
  "P2114": "half-life",
  "P2115": "NDF-RT ID",
  "P2116": "enthalpy of vaporization",
  "P2117": "combustion enthalpy",
  "P2118": "kinematic viscosity",
  "P2119": "vapor pressure",
  "P2120": "radius",
  "P2121": "prize money",
  "P2123": "YerelNet village ID",
  "P2124": "membership",
  "P2125": "Revised Hepburn romanization",
  "P2126": "Georgian national system of romanization",
  "P2127": "International Nuclear Event Scale",
  "P2128": "flash point",
  "P2129": "IDLH",
  "P2130": "cost",
  "P2131": "nominal GDP",
  "P2132": "nominal GDP per capita",
  "P2133": "total debt",
  "P2134": "total reserves",
  "P2135": "total exports",
  "P2136": "total imports",
  "P2137": "total equity",
  "P2138": "total liabilities",
  "P2139": "total revenue",
  "P2140": "foreign direct investment net outflow",
  "P2141": "foreign direct investment net inflow",
  "P2142": "box office",
  "P2143": "genome size",
  "P2144": "frequency",
  "P2145": "explosive energy equivalent",
  "P2146": "orbital period",
  "P2147": "rotation period",
  "P2148": "distance from river mouth",
  "P2149": "clock speed",
  "P2150": "FSB speed",
  "P2151": "focal length",
  "P2152": "antiparticle",
  "P2153": "PubChem Substance ID (SID)",
  "P2154": "binding energy",
  "P2155": "Solid solution series with",
  "P2156": "Pseudo crystal habit",
  "P2157": "lithography",
  "P2158": "Cell line ontology ID",
  "P2159": "solves",
  "P2160": "mass excess",
  "P2161": "Guthrie code",
  "P2162": "Deutsche Ultramarathon-Vereinigung ID",
  "P2163": "FAST-ID",
  "P2164": "SIGIC author ID",
  "P2165": "SIGIC group ID",
  "P2166": "SIGIC institution ID",
  "P2167": "UNSPSC Code",
  "P2168": "SFDb person ID",
  "P2169": "PublicWhip ID",
  "P2170": "Hansard (currents session) ID",
  "P2171": "They Work for You ID",
  "P2172": "Parliamentary record identifier",
  "P2173": "BBC News Democracy Live ID",
  "P2174": "MoMA artist id",
  "P2175": "medical condition treated",
  "P2176": "drug used for treatment",
  "P2177": "solubility",
  "P2178": "solvent",
  "P2179": "ACM Classification Code (2012)",
  "P2180": "Kansallisbiografia ID",
  "P2181": "Finnish MP ID",
  "P2182": "Finnish Ministers database ID",
  "P2183": "ISO 9:1995",
  "P2184": "history of topic",
  "P2185": "DLI ID",
  "P2186": "Wiki Loves Monuments ID",
  "P2187": "BiblioNet publication ID",
  "P2188": "BiblioNet author ID",
  "P2189": "BiblioNet publisher ID",
  "P2190": "C-SPAN identifier of a person",
  "P2191": "NILF author id",
  "P2192": "endangeredlanguages.com ID",
  "P2193": "Soccerbase player id",
  "P2194": "PSS-Archi architect id",
  "P2195": "Soccerbase manager id",
  "P2196": "students count",
  "P2197": "production rate",
  "P2198": "average gradient",
  "P2199": "autoignition temperature",
  "P2200": "electric charge",
  "P2201": "electric dipole moment",
  "P2202": "lower flammable limit",
  "P2203": "upper flammable limit",
  "P2204": "minimum explosive concentration",
  "P2205": "Spotify album ID",
  "P2206": "Discogs release ID",
  "P2207": "Spotify track ID",
  "P2208": "average shot length",
  "P2209": "SourceForge project",
  "P2210": "relative to",
  "P2211": "position angle",
  "P2212": "angular distance",
  "P2213": "longitude of ascending node",
  "P2214": "parallax",
  "P2215": "proper motion",
  "P2216": "radial velocity",
  "P2217": "cruise speed",
  "P2218": "net worth",
  "P2219": "real gross domestic product growth rate",
  "P2220": "household wealth",
  "P2221": "flux",
  "P2222": "gyromagnetic ratio",
  "P2223": "decay width",
  "P2224": "spectral line",
  "P2225": "discharge",
  "P2226": "market capitalization",
  "P2227": "metallicity",
  "P2228": "maximum thrust",
  "P2229": "thermal design power",
  "P2230": "torque",
  "P2231": "explosive velocity",
  "P2232": "cash",
  "P2233": "semi-major axis",
  "P2234": "volume as quantity",
  "P2235": "external superproperty",
  "P2236": "external subproperty",
  "P2237": "units used for this property",
  "P2238": "official symbol",
  "P2239": "first aid measures",
  "P2240": "median lethal dose",
  "P2241": "reason for deprecation",
  "P2242": "Florentine musea catalogue ID",
  "P2243": "apoapsis",
  "P2244": "periapsis",
  "P2248": "argument of periapsis",
  "P2249": "Refseq Genome ID",
  "P2250": "life expectancy",
  "P2252": "NGA artist id",
  "P2253": "DfE URN",
  "P2254": "maximum operating altitude",
  "P2255": "Debrett's People of Today ID",
  "P2257": "frequency of event",
  "P2258": "mobile country code",
  "P2259": "mobile network code",
  "P2260": "ionization energy",
  "P2261": "beam",
  "P2262": "draft",
  "P2263": "ISOCAT id",
  "P2264": "mix'n'match catalogue ID",
  "P2266": "Fashion Model Directory model ID",
  "P2267": "Politifact Personality ID",
  "P2268": "Musée d'Orsay artist ID",
  "P2270": "Emporis building complex ID",
  "P2271": "Wikidata property example for properties",
  "P2272": "Hederich encyclopedia article ID",
  "P2273": "Heidelberg Academy for Sciences and Humanities member ID",
  "P2275": "World Health Organisation International Nonproprietary Name",
  "P2276": "UEFA player code",
  "P2277": "Magdeburger Biographisches Lexikon",
  "P2278": "Member of the Hellenic Parliament ID",
  "P2279": "ambitus",
  "P2280": "Austrian Parliament ID",
  "P2281": "iTunes album ID",
  "P2282": "Groeningemuseum work PID",
  "P2283": "uses",
  "P2284": "price",
  "P2285": "periapsis date",
  "P2286": "arterial supply",
  "P2287": "CRIStin ID",
  "P2288": "lymphatic drainage",
  "P2289": "venous drainage",
  "P2290": "Danish parish code",
  "P2291": "charted in",
  "P2292": "consumption rate",
  "P2293": "genetic association",
  "P2294": "balance of trade",
  "P2295": "net profit",
  "P2296": "money supply",
  "P2297": "employment by economic sector",
  "P2298": "NSDAP membership number (1925–1945)",
  "P2299": "PPP GDP per capita",
  "P2300": "minimal lethal dose",
  "P2302": "property constraint",
  "P2303": "exception to constraint",
  "P2304": "group by",
  "P2305": "qualifier of property constraint",
  "P2306": "property",
  "P2307": "namespace",
  "P2308": "class",
  "P2309": "relation",
  "P2310": "min date",
  "P2311": "max date",
  "P2312": "max quantity",
  "P2313": "min quantity",
  "P2315": "comment",
  "P2316": "constraint status",
  "P2317": "call sign",
  "P2318": "debut participant",
  "P2319": "elector",
  "P2320": "aftershocks",
  "P2321": "general classification",
  "P2322": "article ID",
  "P2323": "Swedish Olympic Committee athlete ID",
  "P2324": "quantity buried",
  "P2325": "mean anomaly",
  "P2326": "GNS Unique Feature ID",
  "P2327": "ProCyclingStats ID (race)",
  "P2328": "ProCyclingStats ID (team)",
  "P2329": "antagonist muscle",
  "P2330": "Cycling Archives ID (race)",
  "P2331": "Cycling Archives ID (team)",
  "P2332": "Dictionary of Art Historians ID",
  "P2333": "Norwegian organization number",
  "P2334": "SFDb movie ID",
  "P2335": "SFDb company ID",
  "P2336": "SFDb soundtrack ID",
  "P2337": "SFDb group ID",
  "P2338": "Musopen composer ID",
  "P2339": "BoardGameGeek ID",
  "P2340": "CESAR person ID",
  "P2341": "indigenous to",
  "P2342": "AGORHA person/institution ID",
  "P2343": "playing range image",
  "P2344": "AGORHA work ID",
  "P2345": "AGORHA event identifier",
  "P2346": "Elonet movie ID",
  "P2347": "YSO ID",
  "P2348": "period",
  "P2349": "Stuttgart Database of Scientific Illustrators ID",
  "P2350": "Speedskatingbase.eu ID",
  "P2351": "number of graves",
  "P2352": "applies to taxon",
  "P2353": "statistical unit",
  "P2354": "has list",
  "P2355": "UNESCO endangered language ID",
  "P2357": "Classification of Instructional Programs code",
  "P2358": "Roman praenomen",
  "P2359": "Roman nomen gentilicium",
  "P2360": "intended public",
  "P2361": "online service",
  "P2362": "time to altitude",
  "P2363": "NMHH film rating",
  "P2364": "production code",
  "P2365": "Roman cognomen",
  "P2366": "Roman agnomen",
  "P2367": "Australian Stratigraphic Units Database ID",
  "P2368": "Sandbox-Property",
  "P2369": "Soccerway player ID",
  "P2370": "conversion to SI unit",
  "P2371": "FAO risk status",
  "P2372": "ODIS ID",
  "P2373": "Genius artist ID",
  "P2374": "natural abundance",
  "P2375": "has superpartner",
  "P2376": "superpartner of",
  "P2377": "MediaWiki hooks used",
  "P2378": "issued by",
  "P2379": "deprecated in version",
  "P2380": "French Sculpture Census ID",
  "P2381": "Academic Tree ID",
  "P2382": "Chemins de mémoire ID",
  "P2383": "CTHS person ID",
  "P2384": "statement describes",
  "P2385": "French diocesan architects ID",
  "P2386": "diameter",
  "P2387": "Elonet actor ID",
  "P2388": "office held by head of the organisation",
  "P2389": "organisation directed from the office",
  "P2390": "Ballotpedia ID",
  "P2391": "OKPO ID",
  "P2392": "teaching method",
  "P2393": "NCBI Locus tag",
  "P2394": "MGI gene symbol",
  "P2396": "image of function",
  "P2397": "YouTube channel ID",
  "P2398": "MLSSoccer.com ID",
  "P2399": "British Council artist ID",
  "P2400": "JMDb film ID",
  "P2401": "Six Degrees of Francis Bacon ID",
  "P2402": "total expenditure",
  "P2403": "total assets",
  "P2404": "time-weighted average exposure limit",
  "P2405": "ceiling exposure limit",
  "P2406": "maximum peak exposure limit",
  "P2407": "short-term exposure limit",
  "P2408": "set in period",
  "P2409": "NII Article ID",
  "P2410": "WikiPathways ID",
  "P2411": "Artsy gene",
  "P2412": "Fashion Model Directory designer ID",
  "P2413": "Fashion Model Directory magazine ID",
  "P2414": "substrate of",
  "P2415": "personal best",
  "P2416": "sports discipline competed in",
  "P2417": "stage classification",
  "P2418": "Structurae person ID",
  "P2421": "Prosopographia Attica",
  "P2422": "number of awards",
  "P2423": "FIE ID",
  "P2424": "Berlin cultural heritage ID",
  "P2425": "service ribbon image",
  "P2426": "Xeno-canto species ID",
  "P2427": "grid global research id",
  "P2428": "RePEc Short-ID",
  "P2429": "expected completeness",
  "P2430": "takeoff roll",
  "P2431": "Thyssen-Bornemisza artist id",
  "P2432": "J. Paul Getty Museum artist id",
  "P2433": "gender of a scientific name of a genus",
  "P2434": "Panarctic Flora ID",
  "P2435": "PORT person ID",
  "P2436": "voltage",
  "P2437": "number of seasons",
  "P2438": "narrator",
  "P2439": "language",
  "P2440": "transliteration",
  "P2441": "literal translation",
  "P2442": "conversion to standard unit",
  "P2443": "stage reached",
  "P2444": "homoglyph",
  "P2445": "metasubclass of",
  "P2446": "transfermarkt.com footballer id",
  "P2447": "transfermarkt manager id",
  "P2448": "Turkish Football Federation player ID",
  "P2449": "Turkish Football Federation manager ID",
  "P2450": "Encyclopædia Britannica contributor ID",
  "P2451": "MAME ROM",
  "P2452": "GeoNames feature code",
  "P2453": "nominee",
  "P2454": "KNAW past member ID",
  "P2455": "Species Profile and Threats Database ID",
  "P2456": "dblp ID",
  "P2457": "Australian National Shipwreck Database Shipwreck ID number",
  "P2458": "Mackolik.com footballer ID",
  "P2459": "IBU biathlete ID",
  "P2460": "Persons of Ancient Athens",
  "P2461": "ComLaw ID",
  "P2462": "member of the deme",
  "P2463": "elibrary.ru organisation ID",
  "P2464": "BugGuide ID",
  "P2465": "allcinema film ID",
  "P2467": "Global Geoparks Network ID",
  "P2468": "Theatricalia theatre ID",
  "P2469": "Theatricalia person ID",
  "P2470": "Talouselämän vaikuttajat ID",
  "P2471": "Models.com person ID",
  "P2472": "ACMA Register of Radiocommunications Licences Client Identifier",
  "P2473": "Asset of Local Relevance ID",
  "P2474": "CDLI ID",
  "P2475": "NAVA ID",
  "P2476": "HNI person/institution ID",
  "P2477": "TBRC Resource ID",
  "P2478": "Railways Archive event ID",
  "P2479": "SPDX ID",
  "P2480": "IHO Hydrographic Dictionary (S-32) Number",
  "P2481": "Eliteprospects.com player ID",
  "P2482": "SABR ID",
  "P2483": "NCES District ID",
  "P2484": "NCES School ID",
  "P2485": "Fashion Model Directory photographer ID",
  "P2486": "Fashion Model Directory brand ID",
  "P2487": "page at website of Belarus Geocenter",
  "P2488": "page at Belarus Globe website",
  "P2489": "page at hram.by",
  "P2490": "page at OSTIS Belarus Wiki",
  "P2491": "Radzima.org ID",
  "P2492": "MTMT author ID",
  "P2493": "OM institution ID",
  "P2494": "Latvian cultural heritage register ID",
  "P2496": "Latvian toponymic names database ID",
  "P2497": "Latvian National Address Register ID",
  "P2498": "Catalan Biographical Dictionary of Women ID",
  "P2499": "level above",
  "P2500": "level below",
  "P2501": "results",
  "P2502": "classification of",
  "P2503": "Genealogical Gazetteer (GOV) ID",
  "P2504": "Norwegian municipality number",
  "P2505": "carries",
  "P2506": "INSEE canton code",
  "P2507": "Corrigendum / Erratum",
  "P2508": "KINENOTE film ID",
  "P2509": "Movie Walker ID",
  "P2510": "National Discography of Italian Song ID",
  "P2511": "MSK Gent work PID",
  "P2512": "spin-off",
  "P2513": "Jamendo album ID",
  "P2514": "Jamendo artist ID",
  "P2515": "costume designer",
  "P2516": "Australian Wetlands Database Australian Ramsar site number",
  "P2517": "category for recipients of this award",
  "P2518": "Scope.dk film ID",
  "P2519": "Scope.dk person ID",
  "P2520": "UNESCO Biosphere Reserve url",
  "P2521": "female form of label",
  "P2522": "victory",
  "P2524": "SEED number",
  "P2525": "Ramsar Sites Information Service ID",
  "P2526": "National Historic Sites of Canada ID",
  "P2527": "moment magnitude scale",
  "P2528": "Richter magnitude scale",
  "P2529": "ČSFD film ID",
  "P2530": "Box Office Mojo franchise ID",
  "P2531": "Box Office Mojo studio ID",
  "P2532": "lowest atmospheric pressure",
  "P2533": "WomenWriters ID",
  "P2534": "defining formula",
  "P2535": "Sandbox-Mathematical expression",
  "P2536": "Sandbox-External identifier",
  "P2537": "Free Software Directory entry",
  "P2538": "Nationalmuseum Sweden artist ID",
  "P2539": "Nationalmuseum Sweden artwork ID",
  "P2540": "Aarne–Thompson–Uther Tale Type Index",
  "P2541": "operating area",
  "P2542": "acceptable daily intake",
  "P2545": "bowling style",
  "P2546": "sidekick of",
  "P2547": "perimeter",
  "P2548": "strand orientation",
  "P2549": "Italian Senate of the Republic ID",
  "P2550": "recording or performance of",
  "P2551": "used metre",
  "P2552": "quantitative metrical pattern",
  "P2553": "in work",
  "P2554": "production designer",
  "P2555": "fee",
  "P2556": "bore",
  "P2557": "stroke",
  "P2558": "autores.uy database id",
  "P2559": "Wikidata usage instructions",
  "P2560": "GPU",
  "P2561": "name",
  "P2562": "married name",
  "P2563": "superhuman feature or ability",
  "P2564": "Köppen climate classification",
  "P2565": "global-warming potential",
  "P2566": "ECHA InfoCard ID",
  "P2567": "amended by",
  "P2568": "repealed by",
  "P2570": "Saros cycle of eclipse",
  "P2571": "uncertainty corresponds to",
  "P2572": "Twitter hashtag",
  "P2573": "number of out of school children",
  "P2574": "National-Football-Teams.com player ID",
  "P2575": "measures",
  "P2576": "UCSC Genome Browser assembly ID",
  "P2577": "admissible rule in",
  "P2578": "studies",
  "P2579": "studied by",
  "P2580": "Baltisches Biographisches Lexikon digital ID",
  "P2581": "BabelNet id",
  "P2582": "J. Paul Getty Museum object id",
  "P2583": "distance from Earth",
  "P2584": "Australian Wetlands Database Directory of Important Wetlands Reference Code",
  "P2585": "INSEE region code",
  "P2586": "INSEE department code",
  "P2587": "has phoneme",
  "P2588": "administrative code of Indonesia",
  "P2589": "Statistics Indonesia ethnicity code",
  "P2590": "Statistics Indonesia language code",
  "P2591": "grammatical option indicates",
  "P2592": "Québec cultural heritage directory people identifier",
  "P2593": "Latvian Olympic Committee athlete ID",
  "P2595": "maximum gradient",
  "P2596": "culture",
  "P2597": "Gram staining",
  "P2598": "serial number",
  "P2599": "block size",
  "P2600": "Geni.com profile ID",
  "P2601": "Eurohockey.com player ID",
  "P2602": "Hockeydb.com player ID",
  "P2603": "Kinopoisk film ID",
  "P2604": "Kinopoisk person ID",
  "P2605": "ČSFD person ID",
  "P2606": "PlayStation ID",
  "P2607": "BookBrainz creator ID",
  "P2608": "Valencian Property of Local Relevance id",
  "P2610": "thickness",
  "P2611": "TED speaker ID",
  "P2612": "TED topic ID",
  "P2613": "TED talk ID",
  "P2614": "World Heritage criteria (2005)",
  "P2618": "SHOWA ID",
  "P2619": "Hungarian company ID",
  "P2620": "ISO 15924 numeric code",
  "P2621": "Site of Special Scientific Interest (England) ID",
  "P2622": "Companies House ID",
  "P2623": "MEK ID",
  "P2624": "MetroLyrics ID",
  "P2625": "PASE ID",
  "P2626": "DNF person ID",
  "P2627": "ISO 9362 SWIFT/BIC code",
  "P2628": "German tax authority ID",
  "P2629": "BBFC rating",
  "P2630": "cost of damage",
  "P2631": "Turner Classic Movies film ID",
  "P2632": "place of detention",
  "P2633": "geography of topic",
  "P2634": "model",
  "P2635": "number of parts of a work of art",
  "P2636": "Minkultury Film ID",
  "P2637": "RARS rating",
  "P2638": "TV.com ID",
  "P2639": "Filmportal ID",
  "P2640": "Swimrankings.net swimmer ID",
  "P2641": "Davis Cup player ID",
  "P2642": "FedCup player ID",
  "P2643": "Carnegie Classification of Institutions of Higher Education",
  "P2645": "mean lifetime",
  "P2646": "mirTarBase ID",
  "P2647": "source of material",
  "P2648": "Cycling Quotient identifier (races for men)",
  "P2649": "Cycling Quotient url (mens team)",
  "P2650": "interested in",
  "P2651": "CRICOS Provider Code",
  "P2652": "partnership with",
  "P2655": "Estyn ID",
  "P2656": "FIFA World Ranking",
  "P2657": "EU transparency register ID",
  "P2658": "Scoville grade",
  "P2659": "topographic isolation",
  "P2660": "topographic prominence",
  "P2661": "target interest rate",
  "P2662": "consumption rate per capita",
  "P2663": "tier 1 capital ratio (CETI)",
  "P2664": "units sold",
  "P2665": "alcohol by volume",
  "P2666": "Datahub page",
  "P2667": "corresponding template",
  "P2668": "stability of property value",
  "P2669": "discontinued date",
  "P2670": "has parts of the class",
  "P2671": "Google Knowledge Graph identifier",
  "P2672": "SOATO ID",
  "P2673": "next crossing upstream",
  "P2674": "next crossing downstream",
  "P2675": "reply to",
  "P2676": "rating certificate ID",
  "P2677": "relative position within image",
  "P2678": "Russiancinema.ru film ID",
  "P2679": "author of foreword",
  "P2680": "author of afterword",
  "P2681": "is recto of",
  "P2682": "is verso of",
  "P2683": "Bekker Number",
  "P2684": "Kijkwijzer rating",
  "P2685": "Basketball-Reference.com NBA player ID",
  "P2686": "Opensecrets Identifier",
  "P2687": "NDL JPNO",
  "P2688": "Box Office Mojo person ID",
  "P2689": "BARTOC ID",
  "P2690": "New York Times Semantic Concept: Person",
  "P2691": "New York Times Semantic Concept: Organization",
  "P2692": "New York Times Semantic Concept: Location",
  "P2693": "New York Times Semantic Concept: Descriptor",
  "P2694": "ISU figure skater identifier",
  "P2695": "type locality",
  "P2696": "FIG gymnast identifier",
  "P2697": "ESPNcricinfo player ID",
  "P2698": "CricketArchive player ID",
  "P2699": "URL",
  "P2700": "protocol",
  "P2701": "file format",
  "P2702": "dataset distribution",
  "P2703": "British Film Institute identifier",
  "P2704": "EIDR identifier",
  "P2705": "Karate Records ID",
  "P2708": "Cycling Quotient identifier (women races)",
  "P2709": "Cycling Quotient identifier (cyclist, woman)",
  "P2710": "minimal lethal concentration",
  "P2712": "median lethal concentration",
  "P2713": "sectional view",
  "P2715": "elected in",
  "P2716": "collage image",
  "P2717": "no-observed-adverse-effect level",
  "P2718": "lowest-observed-adverse-effect level",
  "P2719": "Hungarian-style transcription",
  "P2720": "embed URL",
  "P2721": "Encyclopaedia Metallum release ID",
  "P2722": "Deezer artist ID",
  "P2723": "Deezer album ID",
  "P2724": "Deezer track ID",
  "P2725": "GOG application ID",
  "P2726": "UIPM ID",
  "P2727": "United World Wrestling ID",
  "P2728": "CageMatch worker ID",
  "P2729": "Badminton World Federation ID",
  "P2730": "ISSF ID",
  "P2731": "Projeto Excelências ID",
  "P2732": "Persée author ID",
  "P2733": "Persée journal ID",
  "P2734": "UNZ author identifier",
  "P2735": "UNZ journal identifier",
  "P2736": "Biographical Directory of Federal Judges id",
  "P2737": "union of",
  "P2738": "disjoint union of",
  "P2739": "typeface/font",
  "P2740": "ResearchGate institute ID",
  "P2741": "Tate artist identifier",
  "P2742": "Australian Geological Provinces Database Identifier",
  "P2743": "this zoological name is coordinate with",
  "P2744": "PASE name",
  "P2745": "DNZB",
  "P2746": "production statistics",
  "P2747": "Filmiroda rating",
  "P2748": "PRONOM file format identifier",
  "P2749": "PRONOM software identifier",
  "P2750": "Photographers' Identities Catalog ID",
  "P2751": "Roller Coaster Database ID",
  "P2752": "New Zealand Organisms Register ID",
  "P2753": "Dictionary of Canadian Biography ID",
  "P2754": "production year",
  "P2755": "exploitation visa number",
  "P2756": "EIRIN film rating",
  "P2758": "CNC film rating",
  "P2759": "AUSNUT Food Identifier",
  "P2760": "NUTTAB Food Identifier",
  "P2761": "Research Papers in Economics Series handle",
  "P2762": "Skyscraper Center building complex ID",
  "P2763": "Danish protected area ID",
  "P2764": "Wrestlingdata person id",
  "P2765": "blue-style.com ID",
  "P2766": "ISO 4063 process number",
  "P2767": "JudoInside.com ID",
  "P2768": "BNE journal ID",
  "P2769": "budget",
  "P2770": "source of income",
  "P2771": "D-U-N-S",
  "P2772": "FIS alpine skier ID",
  "P2773": "FIS cross-country skier ID",
  "P2774": "FIS freestyle skier ID",
  "P2775": "FIS ski jumper ID",
  "P2776": "FIS Nordic combined skier ID",
  "P2777": "FIS snowboarder ID",
  "P2778": "IAT triathlete ID",
  "P2779": "IAT weightlifter ID",
  "P2780": "IAT diver ID",
  "P2781": "race time",
  "P2782": "Models.com client ID",
  "P2783": "Danish listed buildings case ID",
  "P2784": "Mercalli intensity scale",
  "P2786": "aerodrome reference point",
  "P2787": "longest span",
  "P2788": "Czech neighbourhood ID code",
  "P2789": "connects with",
  "P2790": "net tonnage",
  "P2791": "power consumed",
  "P2792": "ASF KID Cave Tag Number",
  "P2793": "clearance",
  "P2794": "Index Hepaticarum ID",
  "P2795": "directions",
  "P2796": "3DMet ID",
  "P2797": "sound power level",
  "P2798": "Loop ID",
  "P2799": "BVMC person ID",
  "P2800": "Beach Volleyball Database ID",
  "P2801": "FIVB beach volleyball player ID",
  "P2802": "fleet or registration number",
  "P2803": "Wikidata time precision",
  "P2804": "International Sailing Federation ID",
  "P2805": "goratings ID",
  "P2806": "vibration",
  "P2807": "molar volume",
  "P2808": "wavelength",
  "P2809": "Australasian Pollen and Spore Atlas Code",
  "P2810": "LPGA Tour ID",
  "P2811": "PGA Tour ID",
  "P2812": "MathWorld identifier",
  "P2813": "mouthpiece",
  "P2814": "P-number",
  "P2815": "ESR station code",
  "P2816": "HowLongToBeat ID",
  "P2817": "appears in the heritage monument list",
  "P2818": "Sherdog ID",
  "P2819": "Yandex.Music album ID",
  "P2820": "cardinality of this set",
  "P2821": "by-product",
  "P2822": "by-product of",
  "P2823": "Belgian Football ID",
  "P2824": "Gazetteer of Planetary Nomenclature ID",
  "P2825": "via",
  "P2826": "Megogo ID",
  "P2827": "flower color",
  "P2828": "corporate officer",
  "P2829": "Internet Wrestling Database ID",
  "P2830": "Online World of Wrestling ID",
  "P2831": "totem",
  "P2832": "Joint Electronics Type Designation Automated System designation",
  "P2833": "ARKive ID",
  "P2834": "individual tax rate",
  "P2835": "lowest income threshold",
  "P2836": "highest income threshold",
  "P2837": "month number",
  "P2838": "professional name (Japan)"
}

# Define Functions 

sparql_to_df executes a query against a SPARQL endpoint and returns the results as a dataframe, together with the list of wikidata objects found in those results.

In [None]:
#This function executes the sparql query against the endpoint,
#extracts structured information from wikidata, and
#returns a dataframe plus the list of wikidata objects found in the results
def sparql_to_df(endpoint_url, query):
    #identify this client to the wikidata sparql endpoint
    user_agent = "Wikidata-Service Python/%s.%s" % (sys.version_info[0], sys.version_info[1])
    #execute the query and extract the output
    sparql = SPARQLWrapper(endpoint_url, agent=user_agent)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    rows = []
    columns = []  #stays empty for boolean results, which have no result columns
    wikiobjects = []
    entity_prefix = 'http://www.wikidata.org/entity/'
    #handle boolean results first: they carry a single true/false value
    if 'boolean' in results:
      wikiobjects.append(results['boolean'])
    else:
      columns = results['head']['vars']
      for result in results["results"]["bindings"]:
        row = {}
        for col in result:
          value = result[col]['value']
          row[col] = value
          #collect the id of every wikidata entity that appears in the results
          if entity_prefix in value:
            wikiobjects.append(value.replace(entity_prefix, ''))
        rows.append(row)
    return pd.DataFrame.from_records(rows, columns=columns), wikiobjects
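
The branch on the `boolean` key exists because the endpoint returns JSON in one of two shapes. The sketch below parses hand-written sample payloads offline, with no network call. `parse_sparql_json` and both sample payloads are illustrative assumptions, not part of the notebook, and the sketch matches the entity prefix only at the start of a value.

```python
# Illustrative sketch: parse SPARQL JSON results without calling the endpoint.
# The payloads are hand-written samples in the SPARQL JSON results layout.
ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def parse_sparql_json(results):
    """Return (rows, wikiobjects) from a SPARQL JSON results dict."""
    rows, wikiobjects = [], []
    if "boolean" in results:  # boolean results carry a single true/false value
        wikiobjects.append(results["boolean"])
        return rows, wikiobjects
    for binding in results["results"]["bindings"]:
        row = {}
        for col, cell in binding.items():
            row[col] = cell["value"]
            # Entity URIs are reduced to their bare id, e.g. "Q90"
            if cell["value"].startswith(ENTITY_PREFIX):
                wikiobjects.append(cell["value"][len(ENTITY_PREFIX):])
        rows.append(row)
    return rows, wikiobjects


# Sample SELECT payload: one row, one column holding an entity URI
select_sample = {
    "head": {"vars": ["capital"]},
    "results": {"bindings": [
        {"capital": {"type": "uri", "value": ENTITY_PREFIX + "Q90"}}
    ]},
}
# Sample boolean payload: no bindings at all
ask_sample = {"head": {}, "boolean": True}

select_rows, select_objects = parse_sparql_json(select_sample)
ask_rows, ask_objects = parse_sparql_json(ask_sample)
print(select_rows, select_objects)
print(ask_rows, ask_objects)
```

Working from canned payloads like these makes it possible to check the parsing logic without depending on endpoint availability.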

generate_input controls the number of tokens fed into the NLG model. The current limit is at most 1500 tokens [OpenAI allows up to 2K tokens for the model].

In [None]:
#This function limits the input to the NLG model to token_limit tokens (capped at 1500);
#tokens are approximated here by whitespace-separated words
def generate_input(input_str,token_limit=1500):
  output=''
  # For performance reasons limiting the overall response to 1500 tokens
  if token_limit > 1500:
    token_limit = 1500
  for i in input_str.split(' && '):
    if len((output).split())+len ((i).split()) < token_limit:
      if output: 
        output =output+' && '+i
      else:
        output = i
    else:
      break
  return output
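
Because generate_input counts whitespace-separated words, every `|` separator and every `&&` joiner consumes part of the budget. The sketch below restates that rule on a small example. `word_cost` and `keep_within_budget` are hypothetical names used only for illustration, and the sample segments are hand-written.

```python
# Illustrative sketch of the word-budget rule; the names are hypothetical.
def word_cost(segment):
    """Cost of a segment as counted by str.split(): pipes count as words."""
    return len(segment.split())


def keep_within_budget(segments, limit):
    """Keep leading segments while the running word count stays below limit."""
    kept = []
    for seg in segments:
        # Words already used, including the '&&' joiners between segments
        used = len(" && ".join(kept).split())
        if used + word_cost(seg) >= limit:
            break  # stop at the first segment that does not fit
        kept.append(seg)
    return kept


segments = [
    "Paris | Description | capital of France",  # 7 words
    "Paris | GeoNames ID | 2968815",            # 6 words
    "Paris | VIAF ID | 158822968",              # 6 words
]
kept = keep_within_budget(segments, limit=14)
print(kept)
```

With a limit of 14 the first two segments fit (7 + 6 = 13), while the third does not. Segments are dropped whole, so the output never ends in a partial triple.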

getwikiprop_3 performs baseline entity feature and label generation and joins the results across both levels, i.e. entity descriptions and wikidata object property values.

In [None]:
#Function to convert query results into ' | '-delimited triples, joined by ' && ', as input to the NLG model
def getwikiprop_3(wikidf,wikiobjects,token_limit=1500):
    client = Client()  
    descriptions =[] # labels for wikidata level 1 results
    descriptions_str=''
    details =[]
    details_str=''
    #Sparql gave no results ,df and wikiobjects are both empty 
    if wikidf.empty and not wikiobjects:
      descriptions.append('Wikidata | Answer | No matching records found')
      descriptions_str = (' && ').join(descriptions)
    elif not wikidf.empty and not wikiobjects:
      #use the function argument, not the notebook-level wd_df3 variable
      label=list(wikidf.columns)[0]
      value=str(wikidf.iloc[0, 0])
      descriptions.append('Answer | '+label+' | '+value)
      descriptions_str = (' && ').join(descriptions)
    else:
      #Extract wikidata objects summary
      #get level 1 descriptions first, so it covers all the answers 
      for w in wikiobjects:
        entity = client.get(w, load=True)
        entity_label = str(entity.label)
        descriptions.append(entity_label+' | Description | '+str(entity.description))
        #get additional details about the wikidata object (at most 10 properties)
        j = 0
        x = iter(entity.iterlists())
        while j < 10:
          j = j+1
          try:
            i = next(x)
            label = wikiprop.get(str(i[0]).replace('<wikidata.entity.Entity','').replace('>','').strip())
            if label and str(i[1][0]).find('<wikidata.') < 0:
              details.append(entity_label+' | '+label+' | '+str(i[1][0]))
          except Exception:
            #skip values the wikidata package cannot decode (e.g. some date formats)
            continue
      details_str = (' && ').join(details)
      descriptions_str = (' && ').join(descriptions)
    if details_str:
      input_str = descriptions_str+' && '+details_str
    else:
      input_str = descriptions_str
    askwiki_openai_input = generate_input(input_str,token_limit)
    askwiki_nlg_input = 'AskWiki NLG: '+ askwiki_openai_input
    return askwiki_openai_input,askwiki_nlg_input
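
To make the output format concrete, the sketch below assembles the same kind of string for a single entity from plain Python data, without the Wikidata client. `build_triples` is a hypothetical helper and the sample values are hand-written. In getwikiprop_3 itself, the descriptions of all entities come first and the property details follow.

```python
# Illustrative sketch: assemble "subject | property | value" segments for one
# entity from plain Python data (no Wikidata client involved).
def build_triples(label, description, properties):
    """Join one description segment and the property segments with ' && '."""
    descriptions = [f"{label} | Description | {description}"]
    details = [f"{label} | {prop} | {value}" for prop, value in properties]
    return " && ".join(descriptions + details)


paris_input = build_triples(
    "Paris",
    "capital and most populous city of France",
    [("GeoNames ID", "2968815"), ("VIAF ID", "158822968")],
)
print(paris_input)
# The second return value of getwikiprop_3 carries this prefix
print("AskWiki NLG: " + paris_input)
```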

# Testing

## Multiple rows and columns: eye color query

In [None]:
# Put queries here for testing. Be aware that the wikidata query service enforces usage limits.
query_eye = """SELECT ?eyeColor ?eyeColorLabel ?rgb (COUNT(?human) AS ?count)
WHERE
{
  ?human wdt:P31 wd:Q5.
  ?human wdt:P1340 ?eyeColor.
  OPTIONAL { ?eyeColor wdt:P465 ?rgb. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?eyeColor ?eyeColorLabel ?rgb
ORDER BY DESC(?count)"""

In [None]:
#Run the query against wikidata; all wikidata objects returned by the query are captured
start = time.time()
wd_dfeye,wikiobjects_eye = sparql_to_df(endpoint_url, query_eye)
askwiki_openai_input,askwiki_nlg_input = getwikiprop_3(wd_dfeye,wikiobjects_eye)

In [None]:
# askwiki_openai_input

In [None]:
print(generate_response(askwiki_openai_input,  model=askwiki_davinci_fine_tune, stop=[" \n"]))
end = time.time()
print(end - start)

 Brown is a polygenic phenotypic character. Eyes can be dark brown, brown, light brown, blue-green, dark green, light green, gold, amber, azure, teal, emerald, purple, red or yellow. The color of the unclouded sky at noon reflecting off a metallic surface is sky blue. Maroon is a brownish-red color. Gold is a color. The sRGB color hex triplet for gold is FFD700. The sRGB color hex triplet for sky blue is 77B5FE. The sRGB color hex triplet for light green is 704030. The sRGB color hex triplet for dark brown is 600000. The sRGB color hex triplet for dark green is 800000. The sRGB color hex triplet for maroon is 800000. The sRGB color hex triplet for purple is 50C878. The sRGB color hex triplet for amber is 608070.
26.82313585281372


## Single output scalar query

In [None]:
another_query = """SELECT ?answer WHERE { wd:Q169794 wdt:P26 ?X . ?X wdt:P22 ?answer}"""
start=time.time()
wd_df3,wikiobjects_3 = sparql_to_df(endpoint_url, another_query)
askwiki_openai_input, askwiki_nlg_input = getwikiprop_3(wd_df3,wikiobjects_3,20)
print(generate_response(askwiki_openai_input,  model=askwiki_davinci_fine_tune, stop=[" \n"]))
end = time.time()
print(end - start)

 Andrianampoinimerina is the name of a King of Imerina and is in the Commons category.
3.074942111968994


## Paris query
Wikidata supports multiple calendar and date-time formats in property values, but the Python wikidata package does not support all of them, so we hit a package limitation. There is an open issue on the wikidata package about a date-time data type: https://github.com/dahlia/wikidata/issues/54. Our workaround uses exception handling and limits the number of properties queried per object.
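
The general shape of that workaround can be sketched as follows. `FlakyEntity` is a made-up stand-in for an entity with one undecodable value. It is not the wikidata package API, and the sketch shows the pattern rather than the notebook's exact loop.

```python
# Illustrative sketch of the workaround pattern. FlakyEntity is a stand-in
# for an entity whose date-valued property cannot be decoded; it is NOT the
# wikidata package API.
class FlakyEntity:
    def __init__(self, values, broken):
        self._values = values
        self._broken = set(broken)

    def property_ids(self):
        return list(self._values)

    def value_of(self, prop_id):
        if prop_id in self._broken:
            raise ValueError(f"cannot decode value of {prop_id}")
        return self._values[prop_id]


def collect_properties(entity, max_props=10):
    """Read at most max_props properties, skipping those that raise."""
    collected = {}
    for prop_id in entity.property_ids()[:max_props]:
        try:
            collected[prop_id] = entity.value_of(prop_id)
        except ValueError:
            continue  # skip the undecodable value and keep going
    return collected


entity = FlakyEntity(
    {"P1566": "2968815", "P571": "<unsupported date>", "P214": "158822968"},
    broken=["P571"],
)
props = collect_properties(entity)
print(props)
```

Capping the number of properties keeps the lookup time bounded, and catching the error per property means one bad value does not discard the rest.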

In [None]:
start=time.time()
crash_query= """SELECT ?capital WHERE {wd:Q142 wdt:P36 ?capital .}"""
wd_df3,wikiobjects_3 = sparql_to_df(endpoint_url, crash_query)
askwiki_openai_input, askwiki_nlg_input = getwikiprop_3(wd_df3,wikiobjects_3)
print(generate_response(askwiki_openai_input,  model=askwiki_davinci_fine_tune, stop=[" \n"]))
end = time.time()
print(end - start)

 Capital and most populous city of France, Paris has a TGN ID of 7008038, NDL ID of 00629026, VIAF ID of 158822968 and a GeoNames ID of 2968815. It has a
3.8584134578704834


## Operating income question

In [None]:
query3="""SELECT ?operating_income WHERE { wd:Q32491 wdt:P3362 ?operating_income . }"""
start=time.time()
wd_df3,wikiobjects_3 = sparql_to_df(endpoint_url, query3)
askwiki_openai_input, askwiki_nlg_input = getwikiprop_3(wd_df3,wikiobjects_3)
print(generate_response(askwiki_openai_input,  model=askwiki_davinci_fine_tune, stop=[" \n"]))
end = time.time()
print(end - start)

 Operating income in 2012 was $1370000000.
1.2037980556488037


## Athlete question

In [None]:
query3="""SELECT ?athlete_id WHERE { wd:Q235975 wdt:P3171 ?athlete_id . }"""
start=time.time()
wd_df3,wikiobjects_3 = sparql_to_df(endpoint_url, query3)
askwiki_openai_input, askwiki_nlg_input = getwikiprop_3(wd_df3,wikiobjects_3)
print(generate_response(askwiki_openai_input,  model=askwiki_davinci_fine_tune, stop=[" \n"]))
end = time.time()
print(end - start)

 The athlete id of Mary Lou Retton is mary-lou-retton.
1.7096307277679443


In [None]:
askwiki_openai_input

'Answer | athlete_id | mary-lou-retton'

## Number of kids question

In [None]:
query3="""SELECT (COUNT(?children) as ?count) WHERE { wd:Q1339 wdt:P40 ?children . }"""
start=time.time()
wd_df3,wikiobjects_3 = sparql_to_df(endpoint_url, query3)
askwiki_openai_input, askwiki_nlg_input = getwikiprop_3(wd_df3,wikiobjects_3)
print(generate_response(askwiki_openai_input,  model=askwiki_davinci_fine_tune, stop=[" \n"]))
end = time.time()
print(end - start)

 The answer has 20 counts.
1.1108102798461914


In [None]:
askwiki_openai_input

'Answer | count | 20'

## Multiple rows, wikiobjects, and hierarchical relationships: band query

In [None]:
sample1 ="""SELECT ?participant WHERE { ?participant wdt:P1344 wd:Q54554872 . }"""
start=time.time()
wd,wikiobjects = sparql_to_df(endpoint_url, sample1)
askwiki_openai_input, askwiki_nlg_input = getwikiprop_3(wd,wikiobjects,200)
print(generate_response(askwiki_openai_input,  model=askwiki_davinci_fine_tune, stop=[" \n"]))
end = time.time()
print(end - start)

 Malvina Reynolds, 1900-08-23, died 1978-03-17 was a Canadian folk singer. John Hammond, 66552549, was an American record producer, civil rights activist and music critic. Steve Goodman, xx0151343, was an American folk music singer-songwriter. Bill Reid, NKCR AUT, was a Canadian sculptor, jeweler, painter (
14.895061016082764


## Multiple rows but repetitive information: hepatitis query

In [None]:
# Instances (P31) of Q18123741 linked to Q9368 via P689, restricted to
# English labels that start with 'h'; at most 25 rows are returned.
sample1 = """SELECT DISTINCT ?sbj ?sbj_label WHERE { ?sbj wdt:P31 wd:Q18123741 . ?sbj wdt:P689 wd:Q9368 . ?sbj rdfs:label ?sbj_label . FILTER(STRSTARTS(lcase(?sbj_label), 'h')) . FILTER (lang(?sbj_label) = 'en') } LIMIT 25"""
start = time.time()
wd, wikiobjects = sparql_to_df(endpoint_url, sample1)
askwiki_openai_input, askwiki_nlg_input = getwikiprop_3(wd, wikiobjects, 200)
print(generate_response(askwiki_openai_input, model=askwiki_davinci_fine_tune, stop=[" \n"]))
end = time.time()
print(end - start)  # elapsed wall-clock seconds

 Human viral infections include hepatitis B, hepatitis C and hepatitis A. Hepatitis B has the ID number 22624 at BNCF. Thesaurus of Diseases includes 5765 references about hepatitis B. eMedicine has the reference 177632 about hepatitis B. The NDL ID for hepatitis B is 00986940. Freebase has the reference m/0jdvt about hepatitis B. The GND ID is 4262007-7 about hepatitis B. The patientplus reference for hepatitis-c-pro is also about hepatitis B. Thesaurus 44499 includes hepatitis C. DiseasesDB 5783 includes hepatitis C. The ICD-9 code for hepatitis C is 070.2. The ICD-10 code for hepatitis C is B16. The reference for hepatitis C at eMedicine is 177792. The NDL ID for hepatitis C is 00986947. The freebase reference for hepatitis C is /m/0jdvt. The
10.076843023300171


## SPARQL queries that return no results

In [None]:
# Two-hop query that returns no rows, used to exercise the empty-result path.
nomatch = """SELECT ?answer WHERE { wd:Q675176 wdt:P515 ?X . ?X wdt:P156 ?answer}"""
wd_nomatch, wikiobjects_nomatch = sparql_to_df(endpoint_url, nomatch)

In [None]:
askwiki_openai_input, askwiki_nlg_input = getwikiprop_3(wd_nomatch,wikiobjects_nomatch)

In [None]:
askwiki_openai_input

'Wikidata | Answer | No matching records found'

In [None]:
askwiki_nlg_input

'AskWiki NLG: Wikidata | Answer | No matching records found'
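
The two strings above suggest that an empty result set maps to a fixed sentinel triple, and that the NLG input is the same triple with an `AskWiki NLG: ` prefix. A minimal sketch under that assumption follows. The constant and function names are hypothetical, and a list of dicts stands in for the DataFrame returned by `sparql_to_df`.

```python
# Sentinel strings copied from the outputs above.
NO_MATCH = "Wikidata | Answer | No matching records found"
NLG_PREFIX = "AskWiki NLG: "


def inputs_for_empty_result(rows):
    """Return (openai_input, nlg_input) when `rows` is empty, else None.

    A None return signals that the caller should fall through to the
    normal triple-generation path for non-empty results.
    """
    if rows:
        return None
    return NO_MATCH, NLG_PREFIX + NO_MATCH


openai_input, nlg_input = inputs_for_empty_result([])
print(openai_input)
print(nlg_input)
```

Returning an explicit sentinel rather than an empty string means the downstream model always receives a well-formed triple, which is presumably why the fine-tuned model can still produce a sensible sentence for unanswerable questions.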

# Appendix

## AskWiki sample triples

In [None]:
triples

['Malvina Reynolds | Description | American folk singer ',
 ' John Hammond | Description | American record producer, civil rights activist and music critic ',
 ' Steve Goodman | Description | American folk music singer-songwriter ',
 ' Bill Reid | Description | Canadian sculptor, jeweler, painter (1920-1998) ',
 ' Alanis Obomsawin | Description | Abenaki artist and filmmaker in Montreal ',
 ' Willie Dunn | Description | Canadian politician, writer, filmmaker, and musician ',
 ' Gilles Losier | Description | Canadian pianist ',
 ' Joe Hickerson | Description | folklorist, songleader, librarian ',
 ' Robert Davidson | Description | Canadian artist ',
 ' Shelley Posen | Description | Curator of Canadian Folklife ',
 ' Duke Redbird | Description | Canadian poet, journalist, activist, actor ',
 ' Guy Sioui Durand | Description | sociologist and art critic ',
 ' Guy Sioui Durand | VIAF ID | 23493785 ',
 ' Guy Sioui Durand | ISNI | 0000 0000 6347 7309 ',
 ' Guy Sioui Durand | MusicBrainz arti

## AskWiki baselined input to NLG based on triples

In [None]:
askwiki_openai_input

['Malvina Reynolds | Description | American folk singer ',
 ' Malvina Reynolds | VIAF ID | 23493785 ',
 ' Malvina Reynolds | ISNI | 0000 0000 6347 7309 ',
 ' Malvina Reynolds | MusicBrainz artist ID | 33253c93-a45a-4f56-96fc-0897fd95d8db ',
 ' Malvina Reynolds | date of birth | 1900-08-23 ',
 ' Malvina Reynolds | date of death | 1978-03-17 ',
 ' Malvina Reynolds | Freebase ID | /m/0512nz ',
 ' Malvina Reynolds | FAST-ID | 100541 ',
 ' Malvina Reynolds | LCAuth ID | n82139133 ',
 ' Malvina Reynolds | NNDB people ID | 607/000205989 ',
 ' Malvina Reynolds | Discogs artist ID | 615903 ',
 ' Malvina Reynolds | lesarchivesduspectacle ID | 208099 ',
 ' Malvina Reynolds | IMDb ID | nm0961618 ',
 ' Malvina Reynolds | BnF ID | 139572890 ',
 ' Malvina Reynolds | National Thesaurus for Author Names ID | 072760079 ']
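
Each entry above is a `subject | property | value` string padded with spaces. When inspecting what the model is fed, it can help to split these back into their parts and group them by subject. The sketch below is one way to do that; `parse_triple` and `group_by_subject` are hypothetical helpers and are not part of the notebook.

```python
from collections import defaultdict


def parse_triple(text):
    """Split 'subject | property | value' into a stripped 3-tuple."""
    # maxsplit=2 keeps any further '|' characters inside the value.
    parts = [part.strip() for part in text.split("|", 2)]
    if len(parts) != 3:
        raise ValueError(f"not a triple: {text!r}")
    return tuple(parts)


def group_by_subject(triples):
    """Map each subject to a {property: value} dict.

    If a subject repeats a property, the last value seen wins.
    """
    grouped = defaultdict(dict)
    for text in triples:
        subject, prop, value = parse_triple(text)
        grouped[subject][prop] = value
    return dict(grouped)


sample = [
    "Malvina Reynolds | Description | American folk singer ",
    " Malvina Reynolds | date of birth | 1900-08-23 ",
]
print(group_by_subject(sample))
```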

## AskWiki sample SPARQL results with Wikidata object links

In [None]:
wd

Unnamed: 0,participant
0,http://www.wikidata.org/entity/Q268478
1,http://www.wikidata.org/entity/Q549141
2,http://www.wikidata.org/entity/Q585159
3,http://www.wikidata.org/entity/Q615962
4,http://www.wikidata.org/entity/Q637195
5,http://www.wikidata.org/entity/Q2581437
6,http://www.wikidata.org/entity/Q3106360
7,http://www.wikidata.org/entity/Q3808659
8,http://www.wikidata.org/entity/Q7343412
9,http://www.wikidata.org/entity/Q7493815
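
Each `participant` value is a Wikidata entity URI, so the trailing QID has to be pulled out before an object's properties can be looked up. A minimal sketch, assuming the `http://www.wikidata.org/entity/Q...` form shown above; `extract_qids` is a hypothetical name, not a function from this notebook.

```python
import re

# Matches entity URIs of the form shown in the table above.
ENTITY_URI = re.compile(r"^http://www\.wikidata\.org/entity/(Q\d+)$")


def extract_qids(values):
    """Return the QIDs of all values that are Wikidata entity URIs.

    Non-URI values (labels, numbers, dates) are skipped.
    """
    qids = []
    for value in values:
        match = ENTITY_URI.match(str(value))
        if match:
            qids.append(match.group(1))
    return qids


participants = [
    "http://www.wikidata.org/entity/Q268478",
    "http://www.wikidata.org/entity/Q549141",
    "Canadian pianist",  # a literal, ignored by the extractor
]
print(extract_qids(participants))  # ['Q268478', 'Q549141']
```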


## AskWiki sample SPARQL output with multiple rows and columns

In [None]:
wd_dfeye

Unnamed: 0,eyeColor,eyeColorLabel,rgb,count
0,http://www.wikidata.org/entity/Q17122705,brown,800000,3622
1,http://www.wikidata.org/entity/Q17122834,blue,608090,1883
2,http://www.wikidata.org/entity/Q17122854,green,707020,841
3,http://www.wikidata.org/entity/Q17244894,dark brown,600000,657
4,http://www.wikidata.org/entity/Q17122740,hazel,5A391C,494
5,http://www.wikidata.org/entity/Q17244465,black,201010,449
6,http://www.wikidata.org/entity/Q17245659,gray,707070,171
7,http://www.wikidata.org/entity/Q3375649,blue-green,608070,114
8,http://www.wikidata.org/entity/Q79399164,light brown,704030,54
9,http://www.wikidata.org/entity/Q42845936,blue-gray,708090,40
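
This table is an instance of the multiple-rows, multiple-columns case. The result shapes handled in this notebook (no results, single answer, single row with several columns, multiple rows) can be told apart from row and column counts alone. A rough sketch follows, with hypothetical names and a list of dicts standing in for the DataFrame; the notebook's own dispatch logic may differ.

```python
def classify_result(rows):
    """Label a result set by its shape.

    `rows` is a list of dicts, one dict per result row, keyed by column name.
    """
    if not rows:
        return "no_results"
    n_cols = len(rows[0])
    if len(rows) == 1:
        # One row: either a bare answer or an answer plus context columns.
        return "single_answer" if n_cols == 1 else "single_row_multi_column"
    return "multiple_rows"


eye_rows = [
    {"eyeColorLabel": "brown", "rgb": "800000", "count": 3622},
    {"eyeColorLabel": "blue", "rgb": "608090", "count": 1883},
]
print(classify_result(eye_rows))         # multiple_rows
print(classify_result([{"count": 20}]))  # single_answer
print(classify_result([]))               # no_results
```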
