In [1]:
import os
import json
from dotenv import load_dotenv

load_dotenv()  # take environment variables from .env.

True

In [2]:
# establish a connection to the MongoDB database
from pymongo import MongoClient

# connect to your Atlas cluster
client = MongoClient(os.environ["MONGODB_URI"])

In [99]:
# establish a connection to the PostgreSQL database
import psycopg2 as pg

conn = pg.connect(
    dbname=os.environ["POSTGRES_DB"],
    user=os.environ["POSTGRES_USER"],
    password=os.environ["POSTGRES_PASSWORD"],
    host=os.environ["POSTGRES_HOST"]
)
cursor = conn.cursor()

# LangChain: Models, Prompts and Output Parsers


## Outline

 * Direct API calls to OpenAI
 * API calls through LangChain:
   * Prompts
   * Models
   * Output parsers

In [100]:
import openai

openai.api_key = os.environ['OPENAI_APIKEY']
llm_model = "gpt-4o-mini"
# llm_model = "chatgpt-4o-latest"


In [101]:
from langchain.chat_models import ChatOpenAI

In [102]:
# temperature=0.0 minimizes sampling randomness, so repeated runs
# return output that is as consistent as possible
chat = ChatOpenAI(temperature=0.0, model=llm_model, openai_api_key=openai.api_key)


### Clean up property names

In [103]:
template_string_properties = """
You are a highly skilled OWL (Web Ontology Language) ontology engineer. Your task is to assist in the creation, validation, and optimization of OWL ontologies, which are formal representations of knowledge. You have expert knowledge in knowledge engineering, description logic, and semantic web technologies. You also excel at defining classes, properties, and relationships between entities, ensuring logical consistency, and facilitating the sharing of knowledge across different domains.

An OWL ontology is a structured framework used to represent and share knowledge about a particular domain. It consists of classes (concepts), properties (relationships and attributes), and individuals (instances). OWL ontologies allow for the modeling of rich, complex relationships between data in a machine-readable format, enabling advanced reasoning, querying, and inference over that data.
Core OWL Concepts:

    Classes: These represent sets or collections of individuals, typically abstract concepts or types (e.g., "Character," "Weapon," "GameLevel").
    Individuals: Instances of classes (e.g., "Mario" is an individual of the class "Character"; "Sword of Flames" is an individual of the class "Weapon").
    Object Properties: Define relationships between two individuals (e.g., "wieldsWeapon" linking a character to a weapon they use, or "locatedIn" linking a character to a particular game level).
    Datatype Properties: Define relationships between an individual and a data value (e.g., "hasHealthPoints" linking a character to a numeric value representing their health).
    SubClassOf: A relation where one class is a subclass of another, inheriting properties (e.g., "BossCharacter" is a subclass of "Character").
    Equivalence: Used to state that two classes or properties are equivalent (e.g., "MagicWeapon" may be declared equivalent to "SpecialWeapon").
    Disjoint Classes: These are classes that cannot share instances (e.g., "Weapon" and "ConsumableItem" are disjoint classes, meaning an item cannot be both a weapon and a consumable).

OWL Inference and Reasoning:

One of the powerful aspects of OWL is that it allows for reasoning over data. Inference engines can deduce new facts based on the relationships and properties defined in the ontology. For example, if "BossCharacter" is a subclass of "Character" and "Bowser" is an individual of "BossCharacter," it can be inferred that "Bowser" is also an individual of "Character." Additionally, if a property like "wieldsWeapon" is defined, you could infer that "Bowser wields a FireballWeapon" if such an individual and relationship are defined.

Your role is to organize and structure a list of entity property names derived from web data. Several names may refer to the same property but differ slightly in spelling, capitalization, wording, abbreviation or plurality. First, you need to clean up these properties and provide a list of final properties. The final properties should be unique.

Second, you need to merge properties into a single property if that makes ontological sense. For example, hasLocation and locatedIn are the same property and should be merged into a single property.

You are given a list of property names as input. You need to provide a list of tuples as output: ("original property name", "cleaned up and merged property name"). For example, if the input is ["hasLocation", "location", "LocatedIn"], the output should be [("hasLocation", "hasLocation"), ("location", "hasLocation"), ("LocatedIn", "hasLocation")].

Original property names are case-insensitive. Cleaned up names should be in camel case with the first letter in lowercase.
Very important! Only output tuples where the new property name is different from the original property name.

Input properties: ```{properties}```

"""

In [104]:
from langchain.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_template(template_string_properties)
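
The purely mechanical part of the prompt (lower camel case, and reporting only names that changed) can be sketched without an LLM. This is a minimal illustration, not the notebook's method: `to_lower_camel` and `changed_pairs` are hypothetical helpers, the splitting rule is an assumption, and the semantic merging the prompt asks for (for example `location` into `hasLocation`) is out of its reach.

```python
import re


def to_lower_camel(name: str) -> str:
    """Normalize a raw property name to lower camel case (illustrative rule only)."""
    # Split on whitespace, underscores and hyphens, then drop any other
    # punctuation (such as "#" or "'") from each word.
    words = [re.sub(r"[^0-9A-Za-z]", "", w) for w in re.split(r"[\s_\-]+", name.strip())]
    words = [w for w in words if w]
    if not words:
        return ""
    # Lowercase only the first character of the first word and uppercase the
    # first character of every later word; inner capitals are left untouched.
    first = words[0][0].lower() + words[0][1:]
    rest = [w[0].upper() + w[1:] for w in words[1:]]
    return first + "".join(rest)


def changed_pairs(names):
    """Return (original, cleaned) tuples only where the cleaned name differs."""
    pairs = [(n, to_lower_camel(n)) for n in names]
    # Exact string comparison is an assumption about what "different" means.
    return [(orig, new) for orig, new in pairs if orig != new]


print(changed_pairs(["storedIn", "Stored in", "Story"]))
```

A helper like this could serve as a cheap pre-pass or as a sanity check on the model's answer, leaving only the genuinely ontological merges to the LLM.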

In [105]:
prompt_template.messages[0].prompt

PromptTemplate(input_variables=['properties'], input_types={}, partial_variables={}, template='\nYou are a highly skilled OWL (Web Ontology Language) ontology engineer. Your task is to assist in the creation, validation, and optimization of OWL ontologies, which are formal representations of knowledge. You have expert knowledge in knowledge engineering, description logic, and semantic web technologies. You also excel at defining classes, properties, and relationships between entities, ensuring logical consistency, and facilitating the sharing of knowledge across different domains.\n\nAn OWL ontology is a structured framework used to represent and share knowledge about a particular domain. It consists of classes (concepts), properties (relationships and attributes), and individuals (instances). OWL ontologies allow for the modeling of rich, complex relationships between data in a machine-readable format, enabling advanced reasoning, querying, and inference over that data.\nCore OWL Conc

In [106]:
prompt_template.messages[0].prompt.input_variables

['properties']
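
To see why `input_variables` reports `properties`, it can help to look at how named placeholders are discovered in a format-style string. The sketch below uses only the standard library and is an analogy, not LangChain's internal implementation; `find_placeholders` is a hypothetical helper.

```python
from string import Formatter


def find_placeholders(template: str) -> list[str]:
    """Return the distinct {named} placeholders in a format-style template, in order."""
    names = []
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
    for _literal, field_name, _spec, _conversion in Formatter().parse(template):
        if field_name and field_name not in names:
            names.append(field_name)
    return names


template = "Input properties: {properties}"
print(find_placeholders(template))
print(template.format(properties="hasLocation\nlocation"))
```

With `str.format`, literal braces in the template itself must be doubled (`{{` and `}}`), while braces inside the substituted value are passed through untouched.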

In [125]:
cursor.execute("SELECT distinct property_name FROM fandom_properties_clean WHERE property_name not in (select property from fandom_properties) ORDER BY property_name")
properties_list = cursor.fetchall()

# classes_1 = '\n'.join([c[0] for c in classes_list[:2000]])
# classes_2 = '\n'.join([c[0] for c in classes_list[2000:]])
properties = '\n'.join([p[0] for p in properties_list])
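
The commented-out lines above split an earlier list at index 2000 by hand. A more general version of that idea, sketched under the assumption that rows arrive as one-element tuples as `cursor.fetchall()` returns them, is to batch the names so each prompt stays at a manageable size. `chunked` is a hypothetical helper and the sample rows are made up for illustration.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical rows shaped like cursor.fetchall() output: one-element tuples.
rows = [("hasLocation",), ("location",), ("LocatedIn",), ("story",), ("Story",)]

# One newline-separated string per batch, ready to pass as `properties`.
batches = ["\n".join(r[0] for r in batch) for batch in chunked(rows, 2)]
print(batches)
```

One trade-off to keep in mind: the model only sees one batch at a time, so two names that should be merged but land in different batches will not be merged unless a later pass reconciles them.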

In [126]:
# conn.rollback()

In [127]:
len(properties_list)

1

In [119]:
fandom_input = prompt_template.format_messages(properties=properties)

In [120]:
# Call the LLM to clean up and merge the property names
entity_info = chat.invoke(fandom_input)
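
The reply comes back as free text, so it has to be parsed before it can be used. The sketch below assumes the model answers with a Python list of two-element tuples, optionally wrapped in a fenced block; `parse_mapping` is a hypothetical helper. It uses `ast.literal_eval` because parenthesised tuples are not valid JSON, and because `literal_eval` only accepts literals rather than executing arbitrary code.

```python
import ast
import re

FENCE = "`" * 3  # built this way to avoid a literal fence inside the example


def parse_mapping(reply: str):
    """Extract a list of (original, cleaned) tuples from an LLM reply."""
    # Prefer the contents of a fenced block if present, else use the whole reply.
    pattern = re.escape(FENCE) + r"(?:python)?\s*(.*?)" + re.escape(FENCE)
    fenced = re.search(pattern, reply, flags=re.DOTALL)
    body = fenced.group(1) if fenced else reply

    # Keep only the outermost list literal.
    start, end = body.find("["), body.rfind("]")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no list literal found in reply")

    pairs = ast.literal_eval(body[start:end + 1])
    if not all(isinstance(p, tuple) and len(p) == 2 for p in pairs):
        raise ValueError("expected a list of 2-tuples")
    return [(str(a), str(b)) for a, b in pairs]


sample_reply = (
    "Here is the list:\n\n"
    + FENCE + "python\n"
    + '[\n    ("Starts in", "startsIn"),\n    ("Story", "story"),\n]\n'
    + FENCE + "\n"
)
print(parse_mapping(sample_reply))
```

A reply that is cut off before the closing bracket will raise an error here rather than return a partial list, which makes truncation visible instead of silently dropping entries.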

In [121]:
print(entity_info.content)

Here is the cleaned-up and merged list of property names based on the provided input:

```python
[
    ("hasCV", "hasCV"),
    ("keyPlotPoint", "keyPlotPoint"),
    ("number", "number"),
    ("onPlanet", "onPlanet"),
    ("resultsInReward", "resultsInReward"),
    ("Starts in", "startsIn"),
    ("stopOnHit", "stopOnHit"),
    ("stoppingPercentage", "stoppingPercentage"),
    ("storage", "storage"),
    ("storageCapacity", "storageCapacity"),
    ("storageCategory", "storageCategory"),
    ("storageLocation", "storageLocation"),
    ("storageType", "storageType"),
    ("storedIn", "storedIn"),
    ("Stored in", "storedIn"),
    ("stores", "stores"),
    ("storesIn", "storesIn"),
    ("story", "story"),
    ("Story", "story"),
    ("storyboardBy", "storyboardBy"),
    ("story_boarded_by", "storyboardedBy"),
    ("storyboarded_by", "storyboardedBy"),
    ("storyboardedBy", "storyboardedBy"),
    ("storyBoardedBy", "storyboardedBy"),
    ("storyCharacter", "storyCharacter"),
    ("storyRel

In [122]:
# output = json.loads((entity_info.content[:-5] + "]").split('```python')[1])
output = [
    ("hasCV", "hasCV"),
    ("keyPlotPoint", "keyPlotPoint"),
    ("number", "number"),
    ("onPlanet", "onPlanet"),
    ("resultsInReward", "resultsInReward"),
    ("Starts in", "startsIn"),
    ("stopOnHit", "stopOnHit"),
    ("stoppingPercentage", "stoppingPercentage"),
    ("storage", "storage"),
    ("storageCapacity", "storageCapacity"),
    ("storageCategory", "storageCategory"),
    ("storageLocation", "storageLocation"),
    ("storageType", "storageType"),
    ("storedIn", "storedIn"),
    ("Stored in", "storedIn"),
    ("stores", "stores"),
    ("storesIn", "storesIn"),
    ("story", "story"),
    ("Story", "story"),
    ("storyboardBy", "storyboardBy"),
    ("story_boarded_by", "storyboardedBy"),
    ("storyboarded_by", "storyboardedBy"),
    ("storyboardedBy", "storyboardedBy"),
    ("storyBoardedBy", "storyboardedBy"),
    ("storyCharacter", "storyCharacter"),
    ("storyRelevance", "storyRelevance"),
    ("storyRole", "storyRole"),
    ("stream", "stream"),
    ("streamingLink", "streamLink"),
    ("streamLink", "streamLink"),
    ("strength", "strength"),
    ("Strength", "strength"),
    ("strengthRequirement", "strengthRequirement"),
    ("strengths", "strengths"),
    ("strikes", "strikes"),
    ("stringID", "stringID"),
    ("strongAgainst", "strongAgainst"),
    ("structure", "structure"),
    ("stunnable", "stunnable"),
    ("style", "style"),
    ("subAffiliation", "subAffiliation"),
    ("subEvents", "subEvents"),
    ("subject", "subject"),
    ("subjectOf", "subjectOf"),
    ("sublocation", "subLocation"),
    ("subLocation", "subLocation"),
    ("subMap", "subMap"),
    ("subModel", "subModel"),
    ("subordinate", "subordinate"),
    ("subordinateTo", "subordinateTo"),
    ("subordinateUnits", "subordinateUnits"),
    ("subtitles", "subtitles"),
    ("subWeapon", "subWeapon"),
    ("succeededBy", "succeededBy"),
    ("succeedingQuest", "succeedingQuest"),
    ("succeeds", "succeeds"),
    ("succeedsQuest", "succeedsQuest"),
    ("successor", "successor"),
    ("successorAbility", "successorAbility"),
    ("successorChapter", "successorChapter"),
    ("successorEvent", "successorEvent"),
    ("successorLevel", "successorLevel"),
    ("successorVersion", "successorVersion"),
    ("successRate", "successRate"),
    ("Success Rate", "successRate"),
    ("succumbedTo", "succumbedTo"),
    ("suckingAir", "suckingAir"),
    ("suit power", "suitPower"),
    ("suit style", "suitStyle"),
    ("suitStyleAvailable", "suitStyleAvailable"),
    ("summonAbility", "summonAbility"),
    ("summonCode", "summonCode"),
    ("summons", "summons"),
    ("sungBy", "sungBy"),
    ("Super admin", "superAdmin"),
    ("superchargeEffect", "superchargeEffect"),
    ("supplies", "supplies"),
    ("supply", "supply"),
    ("supplyBins", "supplyBins"),
    ("supplyPoints", "supplyPoints"),
    ("supportedByPot", "supportedByPot"),
    ("supportedByRod", "supportedByRod"),
    ("supportedByTrawl", "supportedByTrawl"),
    ("supportedGameMode", "supportedGameMode"),
    ("supportedGameModes", "supportedGameModes"),
    ("supportedSkillsIncreaseCriticalStrikeChance", "supportedSkillsIncreaseCriticalStrikeChance"),
    ("supportedSkillsIncreaseLightningDamage", "supportedSkillsIncreaseLightningDamage"),
    ("supports", "supports"),
    ("supportsFTPServer", "supportsFTPServer"),
    ("supportsGameMode", "supportsGameMode"),
    ("supportsHTTPServer", "supportsHTTPServer"),
    ("supportsLoadout", "supportsLoadout"),
    ("supportsMultiplayer", "supportsMultiplayer"),
    ("supportsSkills", "supportsSkills"),
    ("supportsSMTPServer", "supportsSMTPServer"),
    ("supportsSSH", "supportsSSH"),
    ("suspect", "suspect"),
    ("suspects", "suspects"),
    ("swingDelay", "swingDelay"),
    ("symbol", "symbol"),
    ("symbolCount", "symbolCount"),
    ("symbols", "symbols"),
    ("Symbols", "symbols"),
    ("synergyBurstDuration", "synergyBurstDuration"),
    ("system_requirements", "systemRequirements"),
    ("systemRequirements", "systemRequirements"),
    ("tacticalAbility", "tacticalAbility"),
    ("tag", "tag"),
    ("tags", "tags"),
    ("Tags", "tags"),
    ("takenBy", "takenBy"),
    ("takesDamageFrom", "takesDamageFrom"),
    ("takesDamageType", "takesDamageType"),
    ("takesPlaceAt", "takesPlaceAt"),
    ("takes_place_in", "takesPlaceIn"),
    ("takesPlaceIn", "takesPlaceIn"),
    ("takesPlaceInDistrict", "takesPlaceInDistrict"),
    ("takesPlaceInLocation", "takesPlaceInLocation"),
    ("takesPlaceInRegion", "takesPlaceInRegion"),
    ("takesPlaceInWorld", "takesPlaceInWorld"),
    ("takesPlaceOnMap", "takesPlaceOnMap"),
    ("talks", "talks"),
    ("target", "target"),
    ("Targeting Range", "targetingRange"),
    ("targetRange", "targetRange"),
    ("targets", "targets"),
    ("targetsType", "targetsType"),
    ("targetType", "targetType"),
    ("Target type", "targetType"),
    ("task", "task"),
    ("taskGiver", "taskGiver"),
    ("tasks", "tasks"),
    ("tasks_base_score", "tasksBaseScore"),
    ("tasteProfile", "tasteProfile"),
    ("taughtBy", "taughtBy"),
    ("taxonomyClade", "taxonomyClade"),
    ("taxonomyFamily", "taxonomyFamily"),
    ("taxonomyKingdom", "taxonomyKingdom"),
    ("taxonomyOrder", "taxonomyOrder"),
    ("team", "team"),
    ("teamMembersCount", "teamMembersCount"),
    ("teamName", "teamName"),
    ("techLevel", "techLevel"),
    ("techtree", "techtree"),
    ("Techtree", "techtree"),
    ("temperament", "temperament"),
    ("Temperament", "temperament"),
    ("temperature", "temperature"),
    ("Temperature", "temperature"),
    ("temperatureRange", "temperatureRange"),
    ("tendedBy", "tendedBy"),
    ("tenureEnd", "tenureEnd"),
    ("tenureStart", "tenureStart"),
    ("territory", "territory"),
    ("tertiaryAbility", "tertiaryAbility"),
    ("Tertiary Ability", "tertiaryAbility"),
    ("tertiaryType", "tertiaryType"),
    ("Tertiary Type", "tertiaryType"),
    ("theme", "theme"),
    ("Theme", "theme"),
    ("theoreticalAffiliation", "theoreticalAffiliation"),
    ("threatLevel", "threatLevel"),
    ("threeStarScore", "threeStarScore"),
    ("throwable", "throwable"),
    ("thrownCanFlinch", "thrownCanFlinch"),
    ("thrownHeadDamage", "thrownHeadDamage"),
    ("thrownLegDamage", "thrownLegDamage"),
    ("thrownStoneDamage", "thrownStoneDamage"),
    ("thrownTorsoDamage", "thrownTorsoDamage"),
    ("thrownWoodDamage", "thrownWoodDamage"),
    ("tier", "tier"),
    ("Tier", "tier"),
    ("tier3Effect", "tier3Effect"),
    ("tier6Effect", "tier6Effect"),
    ("tileID", "tileID"),
    ("time", "time"),
    ("Time", "time"),
    ("timeAvailable", "timeAvailable"),
    ("timed", "timed"),
    ("timeDuration", "timeDuration"),
    ("timeFound", "timeFound"),
    ("Time Found", "timeFound"),
    ("timeLimit", "timeLimit"),
    ("timeNeeded", "timeNeeded"),
    ("timeOfDay", "timeOfDay"),
    ("timeOfYear", "timeOfYear"),
    ("timePeriod", "timePeriod"),
    ("timePiecesNeeded", "timePiecesNeeded"),
    ("timePiecesRewarded", "timePiecesRewarded"),
    ("timeRequired", "timeRequired"),
    ("timeSignature", "timeSignature"),
    ("timeToResearch", "timeToResearch"),
    ("tinkerDifficulty", "tinkerDifficulty"),
    ("tinkererTier", "tinkererTier"),
    ("tinkeringLevel", "tinkeringLevel"),
    ("tinkeringParts", "tinkeringParts"),
    ("tinkeringRank", "tinkeringRank"),
    ("tipOffGiver", "tipOffGiver"),
    ("titaneseName", "titaneseName"),
    ("title", "title"),
    ("titles", "titles"),
    ("to", "to"),
    ("toolRequired", "toolRequired"),
    ("Tool Required", "toolRequired"),
    ("tooltip", "tooltip"),
    ("tooltipDescription", "tooltipDescription"),
    ("topicNumber", "topicNumber"),
    ("topSpeed", "topSpeed"),
    ("torque", "torque"),
    ("torsoDamage", "torsoDamage"),
    ("Total", "total"),
    ("totalAmmo", "totalAmmo"),
    ("totalAmount", "totalAmount"),
    ("totalAppearances", "totalAppearances"),
    ("totalBots", "totalBots"),
    ("totalCollectibles", "totalCollectibles"),
    ("totalCost", "totalCost"),
    ("totalCount", "totalCount"),
    ("totalEnemies", "totalEnemies"),
    ("totalGems", "totalGems"),
    ("Total Hull Integrity", "totalHullIntegrity"),
    ("total_length", "totalLength"),
    ("totalLength", "totalLength"),
    ("totalNumberOfMissions", "totalNumberOfMissions"),
    ("totalProfit", "totalProfit"),
    ("totalSecrets", "totalSecrets"),
    ("totalTreasurePods", "totalTreasurePods"),
    ("tourLength", "tourLength"),
    ("Tower HP", "towerHP"),
    ("towerLevel", "towerLevel"),
    ("towerNumber", "towerNumber"),
    ("Tower Number", "towerNumber"),
    ("town", "town"),
    ("Town", "town"),
    ("toxicity", "toxicity"),
    ("Toxicity", "toxicity"),
    ("toys", "toys"),
    ("toysCount", "toysCount"),
    ("TP_Cost", "tpCost"),
    ("Trace time", "traceTime"),
    ("track", "track"),
    ("Track #", "trackNumber"),
    ("trackCount", "trackCount"),
    ("trackLength", "trackLength"),
    ("trackNumber", "trackNumber"),
    ("TrackNumber", "trackNumber"),
    ("Track Number", "trackNumber"),
    ("trackPlayed", "trackPlayed"),
    ("tradable", "tradable"),
    ("Tradable", "tradable"),
    ("tradeability", "tradeability"),
    ("tradeCost", "tradeCost"),
    ("tradeName", "tradeName"),
    ("Trailer Stores", "trailerStores"),
    ("trainedBy", "trainedBy"),
    ("trainingLocation", "trainingLocation"),
    ("trait", "trait"),
    ("traits", "traits"),
    ("Traits", "traits"),
    ("transcript", "transcript"),
    ("transformType", "transformType"),
    ("transitionsTo", "transitionsTo"),
    ("transitionsToAreas", "transitionsToAreas"),
    ("transitionTo", "transitionTo"),
    ("transmissionType", "transmissionType"),
    ("transmutedHealthPoints", "transmutedHealthPoints"),
    ("transmutedHP", "transmutedHP"),
    ("Transport", "transport"),
    ("transportMode", "transportMode"),
    ("Transport Mode", "transportMode"),
    ("transportsTo", "transportsTo"),
    ("traveler", "traveler"),
    ("treatmentLocation", "treatmentLocation"),
    ("treatmentNeeded", "treatmentNeeded"),
    ("treatmentStaff", "treatmentStaff"),
    ("Tribe", "tribe"),
    ("triggerDelay", "triggerDelay"),
    ("triggeredBy", "triggeredBy"),
    ("triggeringEvent", "triggeringEvent"),
    ("triggerRange", "triggerRange"),
    ("triggersEvent", "triggersEvent"),
    ("triggersOn", "triggersOn"),
    ("trophy", "trophy"),
    ("Trophy", "trophy"),
    ("trophyAwarded", "trophyAwarded"),
    ("trophyLevel", "trophyLevel"),
    ("trophyRequirement", "trophyRequirement"),
    ("trophyTime", "trophyTime"),
    ("trophyType", "trophyType"),
    ("trueVaultHunterModeLevel", "trueVaultHunterModeLevel"),
    ("trueVaultHunterModeRewards", "trueVaultHunterModeRewards"),
    ("tuner", "tuner"),
    ("tuner_origin", "tunerOrigin"),
    ("turf", "turf"),
    ("turnCapX", "turnCapX"),
    ("turnCapY", "turnCapY"),
    ("turnDelay", "turnDelay"),
    ("turn in to", "turnInTo"),
    ("turnInTo", "turnInTo"),
    ("Turn in to", "turnInTo"),
    ("turquoise", "turquoise"),
    ("turquoiseCollectibles", "turquoiseCollectibles"),
    ("turrets", "turrets"),
    ("Turrets", "turrets"),
    ("Tutorial", "tutorial"),
    ("twitter", "twitter"),
    ("Twitter", "twitter"),
    ("twitterLink", "twitterLink"),
    ("twitterPage", "twitterPage"),
    ("type", "type"),
    ("Type", "type"),
    ("typeOfAttack", "typeOfAttack"),
    ("typeOfRelationship", "typeOfRelationship"),
    ("typeOfResourceProduced", "typeOfResourceProduced"),
    ("typical_mass", "typicalMass"),
    ("typicalMass", "typicalMass"),
    ("typical_value", "typicalValue"),
    ("typicalValue", "typicalValue"),
    ("ultimateAbility", "ultimateAbility"),
    ("ultimateVaultHunterModeLevel", "ultimateVaultHunterModeLevel"),
    ("ultimateVaultHunterModeRewards", "ultimateVaultHunterModeRewards"),
    ("ultimate_weapon", "ultimateWeapon"),
    ("umbranClimax", "umbranClimax"),
    ("uncle", "uncle"),
    ("uniqueAbilitiesCount", "uniqueAbilitiesCount"),
    ("uniqueFeatures", "uniqueFeatures"),
    ("uniqueTrait", "uniqueTrait"),
    ("uniqueWares", "uniqueWares"),
    ("Unit", "unit"),
    ("unitCost", "unitCost"),
    ("unitTier", "unitTier"),
    ("universe", "universe"),
    ("unlock", "unlock"),
    ("Unlock", "unlock"),
    ("unlockables", "unlockables"),
    ("unlockAction", "unlockAction"),
    ("unlockBy", "unlockBy"),
    ("unlockColor", "unlockColor"),
    ("unlock_condition", "unlockCondition"),
    ("unlockCondition", "unlockCondition"),
    ("unlockCost", "unlockCost"),
    ("unlockCostParts", "unlockCostParts"),
    ("unlockCriteria", "unlockCriteria"),
    ("unlocked", "unlocked"),
    ("Unlocked", "unlocked"),
    ("unlockedAt", "unlockedAt"),
    ("unlockedAtLevel", "unlockedAtLevel"),
    ("unlockedAtRank", "unlockedAtRank"),
    ("unlockedAtScore", "unlockedAtScore"),
    ("unlockedBy", "unlockedBy"),
    ("Unlocked by", "unlockedBy"),
    ("unlockedFacility", "unlockedFacility"),
    ("unlockedIn", "unlockedIn"),
    ("unlockedInChapter", "unlockedInChapter"),
    ("unlockedInGame", "unlockedInGame"),
    ("unlockedInLevel", "unlockedInLevel"),
    ("unlockedInLocation", "unlockedInLocation"),
    ("unlockedItems", "unlockedItems"),
    ("unlockedMessage", "unlockedMessage"),
    ("unlockedPowers", "unlockedPowers"),
    ("unlockedRecipes", "unlockedRecipes"),
    ("unlockedResearch", "unlockedResearch"),
    ("unlockedThrough", "unlockedThrough"),
    ("unlockingRequirement", "unlockingRequirement"),
    ("unlockLevel", "unlockLevel"),
    ("unlockMethod", "unlockMethod"),
    ("unlockObjective", "unlockObjective"),
    ("unlockRequirement", "unlockRequirement"),
    ("unlockRequirements", "unlockRequirements"),
    ("unlocks", "unlocks"),
    ("Unlocks", "unlocks"),
    ("unlocksAbility", "unlocksAbility"),
    ("unlocksAfter", "unlocksAfter"),
    ("unlocksAnalysis", "unlocksAnalysis"),
    ("unlocksAnalysisFor", "unlocksAnalysisFor"),
    ("unlocksAt", "unlocksAt"),
    ("unlocksAtLevel", "unlocksAtLevel"),
    ("unlocksBuilding", "unlocksBuilding"),
    ("unlocksBy", "unlocksBy"),
    ("unlocksByDefeating", "unlocksByDefeating"),
    ("unlocksChapter", "unlocksChapter"),
    ("unlocksCharacter", "unlocksCharacter"),
    ("unlocksConsultantUpgrades", "unlocksConsultantUpgrades"),
    ("unlocksFacility", "unlocksFacility"),
    ("unlocksFastTravelIn", "unlocksFastTravelIn"),
    ("unlocksFor", "unlocksFor"),
    ("unlocksHelp", "unlocksHelp"),
    ("unlocksHelpsUnlock", "unlocksHelpsUnlock"),
    ("unlocksIn", "unlocksIn"),
    ("unlocksItem", "unlocksItem"),
    ("unlocksItems", "unlocksItems"),
    ("unlocksLaw", "unlocksLaw"),
    ("unlocksMission", "unlocksMission"),
    ("unlocksPower", "unlocksPower"),
    ("unlocksPowerUp", "unlocksPowerUp"),
    ("unlocksProduction", "unlocksProduction"),
    ("unlocksProject", "unlocksProject"),
    ("unlocksProvingGroundProject", "unlocksProvingGroundProject"),
    ("unlocksProvingGroundProjects", "unlocksProvingGroundProjects"),
    ("unlocksRecipe", "unlocksRecipe"),
    ("unlocksStructure", "unlocksStructure"),
    ("unlocksThrough", "unlocksThrough"),
    ("unlockType", "unlockType"),
    ("unused", "unused"),
    ("upgradable", "upgradable"),
    ("upgrade", "upgrade"),
    ("upgrade1", "upgrade1"),
    ("upgrade2", "upgrade2"),
    ("upgradeCost", "upgradeCost"),
    ("upgradedFrom", "upgradedFrom"),
    ("upgradedTo", "upgradedTo"),
    ("upgradedUpkeepCost", "upgradedUpkeepCost"),
    ("upgradeDuration", "upgradeDuration"),
    ("upgradeLevel", "upgradeLevel"),
    ("upgradeOf", "upgradeOf"),
    ("upgradeOptions", "upgradeOptions"),
    ("upgradeRequirement", "upgradeRequirement"),
    ("upgrades", "upgrades"),
    ("Upgrades", "upgrades"),
    ("upgrades_from", "upgradesFrom"),
    ("upgradesFrom", "upgradesFrom"),
    ("upgradesSettlement", "upgradesSettlement"),
    ("upgrades_to", "upgradesTo"),
    ("upgradesTo", "upgradesTo"),
    ("upgradeTime", "upgradeTime"),
    ("upgradeType", "upgradeType"),
    ("upkeep", "upkeep"),
    ("upkeepCost", "upkeepCost"),
    ("upper_tool", "upperTool"),
    ("Upper Tool", "upperTool"),
    ("upper_tool_tip", "upperToolTip"),
    ("Upper Tool Tip", "upperToolTip"),
    ("usableBy", "usableBy"),
    ("usableByLevel", "usableByLevel"),
    ("usableIn", "usableIn"),
    ("usableWhileDead", "usableWhileDead"),
    ("usableWith", "usableWith"),
    ("usage", "usage"),
    ("Usage", "usage"),
    ("usageLimit", "usageLimit"),
    ("usageType", "usageType"),
    ("Usage Type", "usageType"),
    ("use", "use"),
    ("Use", "use"),
    ("Used", "used"),
    ("usedBy", "usedBy"),
    ("Used By", "usedBy"),
    ("usedByClass", "usedByClass"),
    ("usedByNations", "usedByNations"),
    ("usedByOperators", "usedByOperators"),
    ("usedFor", "usedFor"),
    ("usedIn", "usedIn"),
    ("Used in", "usedIn"),
    ("usedInEvent", "usedInEvent"),
    ("usedInGame", "usedInGame"),
    ("usedInLocation", "usedInLocation"),
    ("usedInLocations", "usedInLocations"),
    ("usedInQuest", "usedInQuest"),
    ("usedOn", "usedOn"),
    ("usedToCraft", "usedToCraft"),
    ("usedToMake", "usedToMake"),
    ("usedWealthFor", "usedWealthFor"),
    ("usedWith", "usedWith"),
    ("useEffect", "useEffect"),
    ("usefulTools", "usefulTools"),
    ("useMessage", "useMessage"),
    ("Use Message", "useMessage"),
    ("user", "user"),
    ("User", "user"),
    ("userCount", "userCount"),
    ("username", "username"),
    ("userOfAbility", "userOfAbility"),
    ("userRoles", "userRoles"),
    ("users", "users"),
    ("Users", "users"),
    ("uses", "uses"),
    ("Uses", "uses"),
    ("usesAbility", "usesAbility"),
    ("usesActionPoints", "usesActionPoints"),
    ("usesAmmo", "usesAmmo"),
    ("usesAmmoType", "usesAmmoType"),
    ("usesAmmunition", "usesAmmunition"),
    ("usesAmmunitionType", "usesAmmunitionType"),
    ("usesAsVessel", "usesAsVessel"),
    ("usesAttackStyle", "usesAttackStyle"),
    ("usesCaliber", "usesCaliber"),
    ("usesCartridge", "usesCartridge"),
    ("usesCombatStyle", "usesCombatStyle"),
    ("usesCurrency", "usesCurrency"),
    ("usesEngine", "usesEngine"),
    ("usesEquipment", "usesEquipment"),
    ("usesHealthForRepair", "usesHealthForRepair"),
    ("usesIn", "usesIn"),
    ("usesInstrument", "usesInstrument"),
    ("usesItem", "usesItem"),
    ("usesKillerWeapon", "usesKillerWeapon"),
    ("usesLightSource", "usesLightSource"),
    ("usesMainWeapon", "usesMainWeapon"),
    ("usesMeansOfTravel", "usesMeansOfTravel"),
    ("usesModeOfTravel", "usesModeOfTravel"),
    ("usesMoveset", "usesMoveset"),
    ("usesNegation", "usesNegation"),
    ("usesPerRound", "usesPerRound"),
    ("usesPlurality", "usesPlurality"),
    ("usesProcessingTool", "usesProcessingTool"),
    ("usesPronouns", "usesPronouns"),
    ("usesRecyclable", "usesRecyclable"),
    ("usesSlots", "usesSlots"),
    ("usesSpecialWeapon", "usesSpecialWeapon"),
    ("usesSpellType", "usesSpellType"),
    ("usesSubWeapon", "usesSubWeapon"),
    ("usesSystem", "usesSystem"),
    ("usesTechnology", "usesTechnology"),
    ("usesVehicle", "usesVehicle"),
    ("usesVehicles", "usesVehicles"),
    ("usesWeapon", "usesWeapon"),
    ("usesWeapons", "usesWeapons"),
    ("useTime", "useTime"),
    ("Use time", "useTime"),
    ("utilityType", "utilityType"),
    ("utilizesRecycling", "utilizesRecycling"),
    ("validPlacement", "validPlacement"),
    ("value", "value"),
    ("vampiricCorruption", "vampiricCorruption"),
    ("variance", "variance"),
    ("Variance", "variance"),
    ("variant", "variant"),
    ("variantOf", "variantOf"),
    ("variants", "variants"),
    ("Variants", "variants"),
    ("variations", "variations"),
    ("Variations", "variations"),
    ("vehicle", "vehicle"),
    ("vehicleClass", "vehicleClass"),
    ("vehicles", "vehicles"),
    ("vehicleType", "vehicleType"),
    ("velocity", "velocity"),
    ("vendorOffer", "vendorOffer"),
    ("Vendor Offer", "vendorOffer"),
    ("version", "version"),
    ("versionIntroduced", "versionIntroduced"),
    ("Version Introduced", "versionIntroduced"),
    ("versionNumber", "versionNumber"),
    ("versionType", "versionType"),
    ("veryHardHealth", "veryHardHealth"),
    ("vicePresident", "vicePresident"),
    ("victimOf", "victimOf"),
    ("views", "views"),
    ("visibility", "visibility"),
    ("visionariesPresent", "visionariesPresent"),
    ("visionaryLead", "visionaryLead"),
    ("visitedBy", "visitedBy"),
    ("vocalist", "vocalist"),
    ("vocation", "vocation"),
    ("voice", "voice"),
    ("Voice", "voice"),
    ("voice actor", "voiceActor"),
    ("voice_actor", "voiceActor"),
    ("voiceActor", "voiceActor"),
    ("VoiceActor", "voiceActor"),
    ("Voice Actor", "voiceActor"),
    ("voice_actor_english", "voiceActorEnglish"),
    ("voice_actor_japanese", "voiceActorJapanese"),
    ("voiceActors", "voiceActors"),
    ("voiceActress", "voiceActress"),
    ("voiceAppearsIn", "voiceAppearsIn"),
    ("voiced", "voiced"),
    ("voiced by", "voicedBy"),
    ("voiced_by", "voicedBy"),
    ("voicedBy", "voicedBy"),
    ("voicedCharacter", "voicedCharacter"),
    ("volts", "volts"),
    ("volume", "volume"),
    ("Volume", "volume"),
    ("volumeNumber", "volumeNumber"),
    ("volumes", "volumes"),
    ("Volumes", "volumes"),
    ("waitTimeReduction", "waitTimeReduction"),
    ("wanderingEliteChance", "wanderingEliteChance"),
    ("warden", "warden"),
    ("warePreference", "warePreference"),
    ("warmode", "warmode"),
    ("warnsAbout", "warnsAbout"),
    ("warpZone", "warpZone"),
    ("warrenCost", "warrenCost"),
    ("wasBornIn", "wasBornIn"),
    ("wasBornOn", "wasBornOn"),
    ("wasIntroducedIn", "wasIntroducedIn"),
    ("wasKilledBy", "wasKilledBy"),
    ("wasKilledIn", "wasKilledIn"),
    ("wasLocatedIn", "wasLocatedIn"),
    ("wasPortrayedBy", "wasPortrayedBy"),
    ("wasPreviously", "wasPreviously"),
    ("wastelessSupplyChain", "wastelessSupplyChain"),
    ("Watchtowers", "watchtowers"),
    ("Water Capacity", "waterCapacity"),
    ("waterUsage", "waterUsage"),
    ("waves", "waves"),
    ("weakAgainst", "weakAgainst"),
    ("weakenedVersion", "weakenedVersion"),
    ("weakness", "weakness"),
    ("Weakness", "weakness"),
    ("weaknesses", "weaknesses"),
    ("weakpoint", "weakpoint"),
    ("weakPoint", "weakpoint"),
    ("Weakpoint", "weakpoint"),
    ("Weak Point", "weakpoint"),
    ("weakSpot", "weakSpot"),
    ("weakToElement", "weakToElement"),
    ("weapon", "weapon"),
    ("weaponClass", "weaponClass"),
    ("weaponDamage", "weaponDamage"),
    ("weapon_hardpoints", "weaponHardpoints"),
    ("weaponHardpoints", "weaponHardpoints"),
    ("weaponModifiers", "weaponModifiers"),
    ("weaponModifiersCount", "weaponModifiersCount"),
    ("weaponOfChoice", "weaponOfChoice"),
    ("weaponPointsCount", "weaponPointsCount"),
    ("weaponPower", "weaponPower"),
    ("weapons", "weapons"),
    ("weaponSkills", "weaponSkills"),
    ("weaponSlot", "weaponSlot"),
    ("weaponsUsed", "weaponsUsed"),
    ("weaponType", "weaponType"),
    ("weaponTypes", "weaponTypes"),
    ("Weapon Types", "weaponTypes"),
    ("wearsClothes", "wearsClothes"),
    ("wearsClothing", "wearsClothing"),
    ("wearsDefaultBottom", "wearsDefaultBottom"),
    ("wearsDefaultTop", "wearsDefaultTop"),
    ("wearsEquipment", "wearsEquipment"),
    ("wearsHeadgear", "wearsHeadgear"),
    ("wearsShoes", "wearsShoes"),
    ("weather", "weather"),
    ("Weather", "weather"),
    ("website", "website"),
    ("weight", "weight"),
    ("Weight", "weight"),
    ("weightClass", "weightClass"),
    ("weight_distribution", "weightDistribution"),
    ("weightDistribution", "weightDistribution"),
    ("when", "when"),
    ("where", "where"),
    ("who", "who"),
    ("wickedWeaves", "wickedWeaves"),
    ("width", "width"),
    ("Width", "width"),
    ("wieldedBy", "wieldedBy"),
    ("wieldsArm", "wieldsArm"),
    ("wieldsBy", "wieldsBy"),
    ("wieldsEquipment", "wieldsEquipment"),
    ("wieldsItem", "wieldsItem"),
    ("wieldsKeepsake", "wieldsKeepsake"),
    ("wieldsKillerWeapon", "wieldsKillerWeapon"),
    ("wieldsPower", "wieldsPower"),
    ("wieldsPrimaryWeapon", "wieldsPrimaryWeapon"),
    ("wieldsSecondaryWeapon", "wieldsSecondaryWeapon"),
    ("wieldsSoulDevice", "wieldsSoulDevice"),
    ("wieldsTool", "wieldsTool"),
    ("wieldsWeapon", "wieldsWeapon"),
    ("wieldsWeaponType", "wieldsWeaponType"),
    ("wife", "wife"),
    ("wikipediaLink", "wikipediaLink"),
    ("wildLocation", "wildLocation"),
    ("Wild Stamina", "wildStamina"),
    ("Wild Tree Location", "wildTreeLocation"),
    ("will", "will"),
    ("windup", "windup"),
    ("windupTime", "windupTime"),
    ("winLossOutcomes", "winLossOutcomes"),
    ("winner", "winner"),
    ("Winner", "winner"),
    ("winners", "winners"),
    ("winningCondition", "winningCondition"),
    ("within", "within"),
    ("witnesses", "witnesses"),
    ("wizardModeType", "wizardModeType"),
    ("won't be confiscated", "wontBeConfiscated"),
    ("woodDamage", "woodDamage"),
    ("wordOrder", "wordOrder"),
    ("workedFor", "workedFor"),
    ("workerType", "workerType"),
    ("workingHours", "workingHours"),
    ("workProgressIncrease", "workProgressIncrease"),
    ("worksAgainst", "worksAgainst"),
    ("worksAs", "worksAs"),
    ("worksAt", "worksAt"),
    ("worksFor", "worksFor"),
    ("worksIn", "worksIn"),
    ("worksInDivision", "worksInDivision"),
    ("worksOn", "worksOn"),
    ("worksOnGame", "worksOnGame"),
    ("workstation", "workstation"),
    ("workToMake", "workToMake"),
    ("world", "world"),
    ("worldMap", "worldMap"),
    ("worldMapSize", "worldMapSize"),
    ("worldName", "worldName"),
    ("World Name", "worldName"),
    ("worldSize", "worldSize"),
    ("worldType", "worldType"),
    ("wornBy", "wornBy"),
    ("wornOn", "wornOn"),
    ("worships", "worships"),
    ("worstDrinkTrait", "worstDrinkTrait"),
    ("worstOutcome", "worstOutcome"),
    ("worstOutcomeAchievement", "worstOutcomeAchievement"),
    ("worstOutcomeDrinkTrait", "worstOutcomeDrinkTrait"),
    ("worstOutcomeTrait", "worstOutcomeTrait"),
    ("worth", "worth"),
    ("Worth", "worth"),
    ("wrappingPaper", "wrappingPaper"),
    ("writer", "writer"),
    ("Writer", "writer"),
    ("writers", "writers"),
    ("written_by", "writtenBy"),
    ("writtenBy", "writtenBy"),
    ("writtenOn", "writtenOn"),
    ("xpRequired", "xpRequired"),
    ("year", "year"),
    ("yearlyEarnings", "yearlyEarnings"),
    ("yearlyHealthScaling", "yearlyHealthScaling"),
    ("yearOfManufacture", "yearOfManufacture"),
    ("year_of_production", "yearOfProduction"),
    ("yearsActive", "yearsActive"),
    ("yearsExiled", "yearsExiled"),
    ("yearsOfService", "yearsOfService"),
    ("yields", "yields"),
    ("yieldsHarvest", "yieldsHarvest"),
    ("youtubeChannel", "youtubeChannel"),
    ("zodiac", "zodiac"),
    ("zodiacSign", "zodiacSign"),
    ("zone", "zone"),
    ("zoneCoinCost", "zoneCoinCost"),
    ("zoneCostDetails", "zoneCostDetails")
]


In [123]:
len(output)

735
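
The count above only says how many (original, cleaned) pairs came back. A quick sanity check of the mapping itself can be sketched as below. This is a minimal sketch, not part of the original notebook: `summarize_mapping` and `sample` are hypothetical names, and the sample holds only a few pairs copied from the list above.

```python
from collections import defaultdict


def summarize_mapping(pairs):
    """Summarize a list of (original, cleaned) property-name pairs."""
    by_cleaned = defaultdict(list)   # cleaned name -> originals mapped to it
    by_original = defaultdict(set)   # original name -> cleaned names it maps to
    for original, cleaned in pairs:
        by_cleaned[cleaned].append(original)
        by_original[original].add(cleaned)
    return {
        "pairs": len(pairs),
        "distinct_cleaned": len(by_cleaned),
        # cleaned names that absorb more than one original spelling
        "merged": {c: o for c, o in by_cleaned.items() if len(o) > 1},
        # originals that were mapped to more than one cleaned name
        "conflicts": {o: sorted(c) for o, c in by_original.items() if len(c) > 1},
    }


# A few pairs taken from the output above, for illustration only.
sample = [
    ("weakness", "weakness"),
    ("Weakness", "weakness"),
    ("weapon_hardpoints", "weaponHardpoints"),
    ("weaponHardpoints", "weaponHardpoints"),
    ("zone", "zone"),
]
summary = summarize_mapping(sample)
print(summary["pairs"], summary["distinct_cleaned"])
```

Running `summarize_mapping(output)` on the full list would show how many distinct cleaned names the 735 pairs collapse into, and whether any original name was assigned two different targets.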

In [124]:
# Create properties
from tqdm.notebook import tqdm

for o in tqdm(output):
    cursor.execute("SELECT id FROM fandom_properties WHERE property = %s", (o[1],))
    prop_id = cursor.fetchone()
    if prop_id is None:
        cursor.execute("INSERT INTO fandom_properties (property) VALUES (%s)", (o[1],))
        conn.commit()
    if o[0] != o[1]:
        cursor.execute("UPDATE fandom_properties_clean SET property_name = %s WHERE property_name = %s", (o[1], o[0]))
        # if cursor.rowcount > 0:
        #     print(f"Number of rows affected: {cursor.rowcount}")
        conn.commit()



  0%|          | 0/735 [00:00<?, ?it/s]
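
The loop above issues one SELECT and up to two commits per pair. The same insert-if-missing and rename logic could be batched into a single transaction, as in the sketch below. To keep the example self-contained it uses an in-memory `sqlite3` database as a stand-in for PostgreSQL. The table layouts are assumptions inferred from the queries above, `apply_property_mapping` is a hypothetical helper, and with psycopg2 the placeholders would be `%s` rather than `?`.

```python
import sqlite3


def apply_property_mapping(conn, pairs):
    """Insert missing cleaned names and rename originals in one transaction."""
    cur = conn.cursor()
    # Read the existing property names once instead of once per pair.
    cur.execute("SELECT property FROM fandom_properties")
    existing = {row[0] for row in cur.fetchall()}
    new_names = sorted({cleaned for _, cleaned in pairs} - existing)
    cur.executemany(
        "INSERT INTO fandom_properties (property) VALUES (?)",
        [(name,) for name in new_names],
    )
    # Only pairs whose spelling actually changes need an UPDATE.
    renames = [(cleaned, original) for original, cleaned in pairs if original != cleaned]
    cur.executemany(
        "UPDATE fandom_properties_clean SET property_name = ? WHERE property_name = ?",
        renames,
    )
    conn.commit()  # single commit for the whole batch
    return len(new_names), len(renames)


# Stand-in database; the schemas are assumed, not taken from the notebook.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE fandom_properties (id INTEGER PRIMARY KEY, property TEXT)")
demo.execute("CREATE TABLE fandom_properties_clean (property_name TEXT)")
demo.executemany(
    "INSERT INTO fandom_properties_clean VALUES (?)",
    [("Weight",), ("weight",), ("written_by",)],
)
inserted, renamed = apply_property_mapping(
    demo,
    [("weight", "weight"), ("Weight", "weight"), ("written_by", "writtenBy")],
)
rows = sorted(r[0] for r in demo.execute("SELECT property_name FROM fandom_properties_clean"))
print(inserted, renamed, rows)
```

Committing once keeps the update all-or-nothing: if any statement fails, a `rollback()` leaves both tables as they were.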

In [55]:
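# Update subclassof for each class whose superclass exists in fandom_classes.
# Expects `output` to hold 3-tuples: (original, cleaned, superclass or None).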

for o in output:
    if o[2] is not None:
        cursor.execute("SELECT id FROM fandom_classes WHERE class = %s", (o[2],))
        class_id = cursor.fetchone()
        if class_id is not None:
            cursor.execute("UPDATE fandom_classes SET subclassof = %s WHERE class = %s", (class_id[0], o[1]))
            conn.commit()
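
Before writing superclass links to the database, it may be worth checking the LLM-produced hierarchy for superclasses that are not in the cleaned class list and for circular chains. The sketch below assumes 3-tuples of (original, cleaned, superclass or None), matching how the loop above indexes `o[1]` and `o[2]`. `find_subclass_problems` and the sample triples are hypothetical.

```python
def find_subclass_problems(triples):
    """Return (missing superclasses, classes whose superclass chain hits a cycle)."""
    parent = {}
    for _, cleaned, superclass in triples:
        if superclass is not None:
            parent[cleaned] = superclass
    known = {cleaned for _, cleaned, _ in triples}
    # Superclasses that never appear as a cleaned class name.
    missing = sorted({p for p in parent.values() if p not in known})
    cyclic = []
    for start in parent:
        seen = set()
        node = start
        # Walk up the chain until it ends or revisits a class.
        while node in parent and node not in seen:
            seen.add(node)
            node = parent[node]
        if node in seen:
            cyclic.append(start)
    return missing, sorted(cyclic)


# Illustrative input only: the last two triples form a deliberate cycle.
triples = [
    ("Apple", "Apple", "Fruit"),
    ("apple", "Apple", "Fruit"),
    ("Fruit", "Fruit", None),
    ("Boss", "BossCharacter", "Character"),
    ("A", "A", "B"),
    ("B", "B", "A"),
]
missing, cyclic = find_subclass_problems(triples)
print(missing, cyclic)
```

Classes reported in `missing` are silently skipped by the update loop above, because the lookup in `fandom_classes` returns no row for them.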


In [39]:
# conn.rollback()

### Clean up property target classes

In [1]:
template_string_classes_for_properties = """
You are a highly skilled OWL (Web Ontology Language) ontology engineer. Your task is to assist in the creation, validation, and optimization of OWL ontologies, which are formal representations of knowledge. You have expert knowledge in knowledge engineering, description logic, and semantic web technologies. You also excel at defining classes, properties, and relationships between entities, ensuring logical consistency, and facilitating the sharing of knowledge across different domains.

An OWL ontology is a structured framework used to represent and share knowledge about a particular domain. It consists of classes (concepts), properties (relationships and attributes), and individuals (instances). OWL ontologies allow for the modeling of rich, complex relationships between data in a machine-readable format, enabling advanced reasoning, querying, and inference over that data.

Core OWL Concepts:

    Classes: These represent sets or collections of individuals, typically abstract concepts or types (e.g., "Character," "Weapon," "GameLevel").
    Individuals: Instances of classes (e.g., "Mario" is an individual of the class "Character"; "Sword of Flames" is an individual of the class "Weapon").
    Object Properties: Define relationships between two individuals (e.g., "wieldsWeapon" linking a character to a weapon they use, or "locatedIn" linking a character to a particular game level).
    Datatype Properties: Define relationships between an individual and a data value (e.g., "hasHealthPoints" linking a character to a numeric value representing their health).
    SubClassOf: A relation where one class is a subclass of another, inheriting properties (e.g., "BossCharacter" is a subclass of "Character").
    Equivalence: Used to state that two classes or properties are equivalent (e.g., "MagicWeapon" may be declared equivalent to "SpecialWeapon").
    Disjoint Classes: These are classes that cannot share instances (e.g., "Weapon" and "ConsumableItem" are disjoint classes, meaning an item cannot be both a weapon and a consumable).

OWL Inference and Reasoning:

One of the powerful aspects of OWL is that it allows for reasoning over data. Inference engines can deduce new facts based on the relationships and properties defined in the ontology. For example, if "BossCharacter" is a subclass of "Character" and "Bowser" is an individual of "BossCharacter," it can be inferred that "Bowser" is also an individual of "Character." Additionally, if a property like "wieldsWeapon" is defined, you could infer that "Bowser wields a FireballWeapon" if such an individual and relationship are defined.

Your role is to organize and structure a list of classes derived from web data. Classes might be the same but differ slightly in spelling, capitalization, wording, abbreviation, or plurality. First, you need to clean up these classes and provide a list of final classes. The final classes should be unique.

Second, you need to merge some classes into a single class if that makes ontological sense. For example, GameLevel and Level are the same class and should be merged into a single class. All classes that have an In-Game- or similar prefix should drop the prefix and be merged with the class without the prefix.

Third, you need to map which classes are subclasses of other classes. For example, BossCharacter is a subclass of Character. 

You are given a list of classes as input. You need to provide a list of tuples as output: ("original class name", "cleaned up class name", "superclass of the class if it exists in the list of classes" or None). For example, if the input is ["Apple", "apple", "Banana", "Ba-Na-Na", "Fruit"], the output should be [("Apple", "Apple", "Fruit"), ("apple", "Apple", "Fruit"), ("Banana", "Banana", "Fruit"), ("Ba-Na-Na", "Banana", "Fruit"), ("Fruit", "Fruit", None)].

Original classes are case-insensitive. Cleaned up classes should be capitalized and have no special characters or spaces.

Input classes: ```{classes}```

"""
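
The prompt asks the model to answer with a Python-style list of 3-tuples. One way to turn such a reply into data is `ast.literal_eval`, which handles `None` and never executes code. The sketch below is an assumption about how the reply could be parsed, not the notebook's own parsing step. `parse_class_tuples` and the sample `reply` are hypothetical.

```python
import ast

FENCE = "`" * 3  # Markdown code fence marker, built here to keep this example readable


def parse_class_tuples(response_text):
    """Parse an LLM reply containing a list of (original, cleaned, superclass) tuples."""
    text = response_text.strip()
    # Drop a surrounding Markdown code fence if the model added one.
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]
        if lines and lines[-1].strip().startswith(FENCE):
            lines = lines[:-1]
        text = "\n".join(lines)
    parsed = ast.literal_eval(text)  # raises ValueError/SyntaxError on malformed input
    result = []
    for item in parsed:
        if not (isinstance(item, (tuple, list)) and len(item) == 3):
            raise ValueError(f"expected a 3-tuple, got {item!r}")
        original, cleaned, superclass = item
        result.append((original, cleaned, superclass))
    return result


# Hypothetical reply, shaped like the example given in the prompt.
reply = FENCE + 'python\n[("Apple", "Apple", "Fruit"), ("Fruit", "Fruit", None)]\n' + FENCE
classes = parse_class_tuples(reply)
print(classes)
```

Validating the tuple length here catches replies where the model drops the superclass element, which would otherwise surface later as an `IndexError` on `o[2]`.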