# A Schema of Groups
---

In this notebook we present a typology for modelling groups from natural language.

The schema is structured around the "five group problems" identified by Martin, which any group must resolve to be successful. As group attributes of the schema, they are applied across eight different classifications of ideology. At the highest level of abstraction these features are:

1. Identity: the identification of who is in the group and who is out of the group to determine who should, versus who should not, benefit from the advantages of group living.
  * identity: named groups
  * ingroup: groups identifying an ingroup
  * outgroup: groups identifying an outgroup
  * entities: entities identifying the group context
2. Hierarchy: a system of governance to establish group leadership and resolve problems caused by status seeking to enable resource distribution.
  * Title: titles given to people within a particular context
  * People: who are the figures a group reveres
3. Disease control: the control of diseases for the maintenance of group health.
  * this feature is considered as a social category rather than a feature
4. Trade: independent of hierarchy, systems constituting the fair terms of exchange between individuals towards developing an underpinning concept of altruism in a society.
  * Good: terms of exchange considered to be good
  * Bad: terms of exchange considered to be bad
5. Punishment: the group jurisprudence for moderating the systems seeking to resolve the other group problems.
  * Right: considered to be just
  * Wrong: considered to be unjust and worthy of punishment
  
We identified sets of seed words as linguistic representations of religion and ideology. Each set was classified by a named concept, then placed in the schema according to attribute and group ideology.

This process identified eight classifications for group ideologies referred to by each orator: social, academia, medical, geopolitics, religion, economic, justice and military.

## The Schema

The next cell contains the schema and the seed terms classified by named concept.

In [1]:
## Create a json object of the group typology

import json
import os
from datetime import datetime

group_typology = {
    "social" : {
        "identity" : {
            "SOCGROUP" : ["ladies", "gentlemen", "men", "women", "Women", "boys", "girls", "youth", "society", "people", "children", "minority", \
                          "mankind", "passengers", "stranger", "group", "community", "organization", "brotherhood", "network", "alliance", "brethren", \
                          "tribe", "population", "ummah", "Ummah", "human", "personnel", "person", "man", "Woman", "woman", "boy", "girl", "child", \
                          "humankind", "countryman", "volunteer", "individual", "freeman", "humanity", "society"]},
        "ingroup" : {
            "SELF" : ["we", "our", "i"],
            "FAMILY" : ["family", "parent", "children", "forefather", "spouse", "mother", "father", "husband", "wife", "mom", "mum", "dad", "son", \
                        "daughter", "brother", "sister", "grandson", "granddaughter", "descendent", "ancestor"],
            "AFFILIATE" : ["ourselves", "collaborator", "friend", "ally", "associate", "partner", "companion", "fellow", "kinship"]},
        "outgroup" : {
            "OUTCAST":  ["imbecile", "critic", "wolf", "snake", "dog", "hypocrite"]},
        "entity" : {
            "CAUSE" : ["goal", "cause", "struggle"],
            "CREDO" : ["philosophy", "Philosophy", "ideology", "Belief", "belief", "creed"],
            "LOCATION" : ["ground", "homeland", "sanctuary", "safe haven", "land", "sea", "site", "underworld"],
            "SOCFAC" : ["installation", "home", "shelter", "institution", "facility", "infrastructure", "refuge", "tower"],
            "SOCWORKOFART" : ["poetry", "song", "picture", "art"]},
        "hierarchy" : {
            "TITLE" : ["mr", "mrs", "miss", "ms", "leader"],
            "LEADERSHIP" : ["leadership"]},
        "trade" : {
            "BENEVOLANCE" : ["love", "kinship", "honesty", "tolerance", "patience", "decency", "sympathy", "peace", "good", "best", "great", \
                             "goodness", "hope", "courage", "resolve", "friendship", "loving", "peaceful", "rightness", "brave", \
                             "strong", "peaceful", "fierce", "honesty", "kind", "generous", "resourceful", "truth", "pride", "defiance", "strength", \
                             "comfort", "solace", "respect", "dignity", "honor", "danger", "freedom", "honorable", "grateful", "compassion", "condolence", \
                             "sympathy", "fulfillment", "dedication", "dignity", "noble", "truthfulness", "happy", "enthusiasm", "perseverance", \
                             "persistence", "toughness", "beauty", "beautiful", "commendable", "praiseworthy", "destiny", "generosity", "supremecy", \
                             "obedience", "superior"],
            
            "MALEVOLANCE" : ["grief", "sorrow", "tradegy", "damage", "bad", "misinformation", "confusion", "falsehood", "humiliation", "catastrophe", \
                             "terror", "fear", "threat", "cruelty", "danger", "anger", "harm", "suffering", "harrassment", "deceit", "death", "anger", \
                             "hate", "hatred", "adversity", "chaos", "loneliness", "sadness", "misery", "prejudice", "horrifying", "cynicism", "despair", \
                             "rogue", "hostile", "dangerous", "tears", "peril", "unfavourable", "vile", "sad", "cowardly", "grieve", "shame", "ugly", \
                             "insanity", "arrogance", "hypocrisy", "horror", "monstrous", "suspicious", "disaster", "malice", "menace", "repressive", \
                             "malicious", "nightmare", "isolation", "debauchery", "greedy"]},
        "punishment" : {
            "PERMITTED" : ["acceptable", "right", "laugh", "deed"],
            "FORBIDDEN" : ["unacceptable", "wrong", "catastrophic", "disastrous", "catastrophe", "mischief", "disappoint", "humiliate", "deceive", "lie"]}
    },
    
    "academia" : {
        "identity" : {
            "ACADEMICGROUP" : ["students", "graduates", "graduate", "scholar"]},
        "ingroup" : {},
        "outgroup" : {},
        "entity" : {
            "ACADEMICENTITY" : ["school", "university"]
        },
        "hierarchy" : {
            "ACADEMICTITLE" : ["teacher", "dr", "professor"]
        },
        "trade" : {},
        "punishment" : {}
        
    },
    
    "medical" : {
        "identity" : {
            "MEDICALGROUP" : ["blood donors", "disabled", "injured"]},
        "ingroup" : {},
        "outgroup" : {
            "VERMIN" : ["vermin", "parasite"]},
        "entity" : {
            "MEDICALENT" : ["heart", "soul", "skin graft", "tomb", "palm", "limb", "drug", "chemical", "biological", "heroin", "vaccine", "health", \
                            "blood", "body", "medicine", "remedy", "organ", "intestine", "tongue"],
            "SEXUALITY" : ["fornication", "homosexuality", "sex"],
            "MEDICALFAC" : ["hospital"],
            "INTOXICANT" : ["intoxicant", "sarin", "anthrax", "nerve agent", "nerve gas"],
            "DISEASE" : ["AIDS"]},
        "hierarchy" : {
            "MEDTITLE" : ["doctor", "nurse"]},
        "trade" : {
            "HEALTHY" : ["life"],
            "UNHEALTHY" : ["death", "unhealed", "illness", "disease", "filthy", "wound", "injury", "scar", "suffocation"]},
        "punishment" : {
            "CLEANSE" : ["cure", "remedy", "cleanse"],
            "POISON" : ["poison",  "pollute", "bleed"]}
    },
    
    "geopolitics" : {
        "identity" : {
            "GPEGROUP" : ["ministry", "Ministry", "government", "civilian", "nation", "Union", "civilization", "congress", "Congress", "alliance", \
                          "patriot", "citizen", "journalist", "diplomat", "agency", "delegate", "coalition", "axis", "compatriots", "administration", \
                          "monarchy", "political party", "communist"]},
        "ingroup" : {},
        "outgroup" : {
            "GPEOUTGROUP" : ["regime", "opponent"]},
        "entity" : {
            "GPEENTITY" : ["human rights", "unity", "diplomatic", "citizenship", "legislation", "senate", "secretary", "elect", "election", "reign", \
                           "embassy", "policy", "diplomacy", "media", "power", "edict"],
            "TERRITORY" : ["homeland", "territory", "planet", "land", "peninsula", "city", "country", "neighborhood", "home", "region", "place", \
                           "area", "peninsula", "continent", "reform", "kingdom", "empire"],
            "GPEFAC" : [],
            "GPEWORKOFART" : []},
        "hierarchy" : {
            "GPETITLE" : ["president", "minister", "speaker", "prime minister", "senator", "mayor", "governor", "President", "Minister", "Prime Minister", \
                          "Speaker", "Senator", "Mayor", "Governor", "King", "king", "Prince", "Leader", "commander-in-chief", "ruler", "chairman", \
                          "congressman", "amir", "pharaoh"]},
        "trade" : {
            "GPEIDEOLOGY" : ["liberty", "sovereignty", "pluralism", "patriotism", "democracy", "communism", "bipartisanship"],
            "AUTHORITARIANISM" : ["nationalism", "fascism", "nazism", "totalitarianism", "nazi", "tyranny", "sectarianism", "anti-semitism"],
            "CONFRONTATION" : ["jihad", "confrontation", "feud", "partisanship", "division", "dissociation"]},
        "punishment" : {
            "JUST" : [],
            "UNJUST" : ["inequality", "usury", "poverty", "slavery", "injustice", "oppression", "subdual", "misrule", "occupation", "usurpation", \
                        "starvation", "starving", "servitude", "surpression"]} 
    },
    
    "religion" : {
        "identity" : {
            "RELGROUP" : ["ulamah", "Ulamah", "ulema", "moguls", "Moguls", "Sunnah", "Seerah", "Christian", "sunnah", "christian", "muslims", "islamic", \
                          "sunnah", "seerah", "polytheist", "houries", "the people of the book", "merciful"]},
        "ingroup" : {
            "BELIEVER" : ["believer"]},
        "outgroup" : {
            "APOSTATE" : ["kufr", "Kufr", "kuffaar", "infidel", "evildoer", "cult", "mushrik", "unbeliever", "disbeliever", "mushrikeen", "pagan", \
                          "idolater", "apostate"]},
        "entity" : {
            "RELENTITY" : ["faith", "pray", "prayer", "mourn", "vigil", "prayer", "remembrance", "praise", "bless", "last rites", "angel", "soul", \
                           "memorial", "revelation", "sanctify", "grace", "religion", "repentance", "exalted", "repent", "seerah", "confession", \
                           "exaltation", "praise", "commandment", "wonderment", "supplication", "worship", "testament"],
            "RELIGIOUSLAW" : ["shari'ah", "Shari'ah", "shari'a", "Shari'a", "fatawa", "Fatawa", "fatwa", "Fatwas"],
            "FAITH" : ["da'ees", "piety", "creationism"],
            "RELFAC" : ["Mosque", "mosque", "sanctity", "cathedral", "Cathedral"],
            "RELWORKOFART" : ["sheerah"],
            "RELPLACE" : ["heaven", "paradise"]}, 
        "hierarchy" : {
            "DEITY" : ["all-wighty"],
            "RELFIGURE" : ["Apostle", "Prophet", "apostle", "prophet", "lord", "priest"],
            "RELTITLE" : ["priest", "cleric", "Immam", "immam", "saint", "st.", "sheikh"]},
        "trade" : {
            "HOLY" : ["pious", "devout", "holy", "righteous", "serve", "sacrifice", "forgive", "martyrdom", "piety", "polytheism", "divine", "miracle"],
            "UNHOLY" : ["hell", "hellish", "unholy", "unrighteous", "evil", "devil", "devilish", "demon", "demonic", "evildoe", "satan", "immoral",  \
                        "immorality", "non-righteous", "blasphemy"]},
        "punishment" : {
            "VIRTUE" : ["Grace", "grace", "halal", "forgiveness", "mercy", "righteous", "mercy", "righteousness", "purity"],
            "SIN" : ["iniquity", "haram", "adultery", "sin", "blashpeme"]}
    },
    
    "economic" : {
        "identity" : {
            "ECONGROUP" : ["merchant", "employee", "economist", "worker", "entrepreneur", "shopkeeper", "servant", "company", "shareholder", \
                           "contractor", "merchant", "consumer", "rich", "passenger", "corporation"]},
        "ingroup" : {},
        "outgroup" : {           
            "COMPETITOR" : ["competitor"]},
        "entity" : {
            "ECONENTITY" : ["economic", "tax", "trade", "work", "currency", "bank", "business", "economy", "asset", "fund", "sponsor", "shop", \
                            "financing", "innovation", "micromanage", "export", "job", "budget", "spending", "paycheck", "market", "growth", \
                            "investment", "factory", "welfare", "pension", "accounting", "retirement", "industry", "agriculture", "income", \
                            "spending", "expensive", "purchase", "wealth", "economical", "booty", "buy", "sell", "imprisonment", "denunciation", \
                            "price", "monie", "gambling", "production", "advertising", "tool", "cargo","reconstruction", "contract", "tax", \
                            "financial", "wealth", "power", "prosperity"],
            "COMMODITY" : ["money", "oil", "water", "energy", "currency", "jewel", "jewellery", "gold", "industry", "agriculture", "inflation", "debt", "income", "price"],
            "ECONFAC" : ["airport", "subway", "farm", "charity"],
            "ECONWORKOFART" : []},
        "hierarchy" : {
            "ECONTITLE" : ["ceo"],
            "ECONFIGURE" : []},
        "trade" : {
            "ECONOMICAL" : ["capitalism", "communism", "economical", "economic", "tourism", "commercial"],
            "UNECONOMICAL" : ["uncommercial", "boycott", "bankruptcy"]},
        "punishment" : {
            "EQUITABLE" : ["reward"],
            "UNEQUITABLE" : ["recession", "unemployment"]}            
    },
        
    "justice" : {
        "identity" : {
            "SECGROUP" : ["police", "officers", "policeman", "law enforcement", "firefighter", "rescuer", "lawyer", "agent", "authority", \
                          "protector", "guardian", "captive", "marshal"],
            "VICTIM" : ["victim", "dead", "casualty", "innocent", "persecuted", "slave"]},
        "ingroup" : {},
        "outgroup" : {
            "CRIMEGROUP" : ["criminal", "mafia", "prisoner", "murderer", "terrorist", "hijacker", "outlaw", "violator", "killer", "executioner", "thief"]},
        "entity": {
            "SECENTITY" : ["trial", "security", "law", "law-enforcement", "intelligence", "arrest", "decree",  "sailor", "surveillance", "warrant", \
                           "penalty", "statute", "investigate", "enforce", "attorney", "treaty", "duty",  "jail", "imprisonment", "justification", \
                           "judge", "prohibit", "custody", "shield"],
            "LAW" : [],
            "SECFAC" : ["prison", "sanctuary"],
            "LEGALWORKOFART" : []},
        "hierarchy" : {
            "LEGALPERSON" : [],
            "SECTITLE" : []},
        "trade" : {
            "LAWFUL" : ["duty", "justice", "morality"],
            "UNLAWFUL" : ["crime", "terrorism", "extremism", "murderous", "imtimidation", "harrassment", "trafficking", "criminal", "suicide", "plot", \
                          "brutality", "coercion", "subversion", "bioterrorism", "propaganda", "corrupt", "corruption", "scandal", "betray", "betrayal", \
                          "misappropriation"]},
        "punishment" : {
            "LEGAL" : ["legal", "protect", "legitimate", "liberation", "liberate"],
            "ILLEGAL" : ["illegal", "counterfeit", "money-laundering", "guilty", "blackmail", "threaten", "punishment", "conspiracy", "illegitimate", \
                         "infringement", "adultery", "wicked", "tricked", "harm", "incest"],
            "PHYSICALVIOLENCE" : ["murder", "hijack", "kill"]}
    },
    
    "military" : {
        "identity" : {
            "ARMEDGROUP" : ["commander", "vetran", "Vetran", "occupier", "guard", "invader", "military", "Mujahideen", "mujahideen", "army", "navy", \
                            "air force", "troops", "defender", "recruit", "guerrilla", "knight", "special forces", "fatality", "martyr", "vanguard"],
            "BELIGERENT" : ["aggressor", "troop", "fighter", "soldier", "warrior", "Mujahid", "mujahid", "soldier"]},
        "ingroup" : {},
        "outgroup" : {
            "ENEMY" : ["traitor", "oppressor", "enemy", "crusader", "aggressor", "invader", "occupier"]},
        "entity" : {
            "MILENTITY" : ["battlefield", "beachead", "campaign", "military", "training camp", "armed", "force", "uniform", "chain of command", "target", \
                           "defense", "nuclear", "mission", "battleship", "aircraft carriers", "infantry", "tank", "air power", "shooter", "cavalry", \
                           "arming", "enlist", "conquest", "base", "buy", "surrender", "soldier", "airman", "marine", "biodefense"],
            "WEAPON" : ["weapon", "weaponry", "bomb", "missile", "munition", "explosive", "arms", "bullet", "sword", "spear", "gun", "rocket"],
            "MILFAC" : ["fortress"],
            "MILWORKOFART" : []},
        "hierarchy" : {
            "MILRANK" : ["lieutenant", "commander", "adjutant", "mujahid"],
            "MILFIGURE" : []},
        "trade" : {
            "WARFARE" : ["victory", "war", "warfare", "battle", "blockade"],
            "MILACTION" : ["destruction", "violence", "conflict", "slain", "besiege", "massacre", "atrocity", "aggression", "attack", "assault", "fight",\
                           "explosion", "combat", "invasion", "ruin", "bombardment", "expel", "fighting", "defeat", "expel", "ambush", "overthrow", "destabilize"]},
        "punishment" : {
            "BARBARY" : ["genocide", "brutalize", "holocaust", "torture", "slaughter"]}
    }
}

filepath = "C:/Users/Steve/OneDrive - University of Southampton/CulturalViolence/KnowledgeBases/data/"

with open(os.path.join(filepath, "group_typology.json"), "wb") as f:
    f.write(json.dumps(group_typology).encode("utf-8"))
    
print("complete at: ", datetime.now().strftime("%d/%m/%Y - %H:%M:%S"))

complete at:  27/02/2020 - 22:31:54
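
To apply the schema to text, a seed term has to be traced back to its place in the typology. The following is a minimal sketch of one way to do that, not the notebook's own method: it inverts the nesting (ideology, attribute, named concept, seed terms) into a lookup keyed by term. The `sample_typology` excerpt and the `build_term_index` helper are hypothetical names introduced for illustration, and lower-casing the terms is an assumption.

```python
from collections import defaultdict

# Toy excerpt with the same nesting as group_typology:
# ideology -> attribute -> named concept -> seed terms
sample_typology = {
    "social": {
        "ingroup": {"AFFILIATE": ["friend", "ally", "kinship"]},
        "trade": {"BENEVOLANCE": ["love", "kinship"]},
    },
    "military": {
        "outgroup": {"ENEMY": ["enemy", "invader"]},
    },
}


def build_term_index(typology):
    """Map each lower-cased seed term to every (ideology, attribute, concept) it appears under."""
    index = defaultdict(list)
    for ideology, attributes in typology.items():
        for attribute, concepts in attributes.items():
            for concept, terms in concepts.items():
                for term in terms:
                    location = (ideology, attribute, concept)
                    # skip repeats so duplicated seed terms are recorded once
                    if location not in index[term.lower()]:
                        index[term.lower()].append(location)
    return dict(index)


term_index = build_term_index(sample_typology)
print(term_index["kinship"])
```

Returning a list of locations rather than a single one matters because some seed terms sit under more than one named concept, as "kinship" does in the excerpt.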


## Structure of the Schema

The following cell displays the structure for the schema and how each named concept has been classified.

In [8]:
## https://www.datacamp.com/community/tutorials/joining-dataframes-pandas 
## Display a DataFrame of the Typology

import pandas as pd

## map each ideology to its attributes and the named concepts under each attribute
typology = {ideology: {subcat: list(terms.keys()) for (subcat, terms) in value.items()} 
             for (ideology, value) in group_typology.items()}

## attribute names, taken from the first ideology
keys = list(next(iter(typology.values())).keys())

## create one frame per attribute, with one column per ideology
frames = []
for frame in keys:
    frames.append(pd.DataFrame.from_dict({k : v[frame] for k, v in typology.items()}, orient = 'index').fillna("").T)

# display table
display(pd.concat(frames , keys = keys))

<class 'dict'>


| attribute | row | social | academia | medical | geopolitics | religion | economic | justice | military |
|---|---|---|---|---|---|---|---|---|---|
| identity | 0 | SOCGROUP | ACADEMICGROUP | MEDICALGROUP | GPEGROUP | RELGROUP | ECONGROUP | SECGROUP | ARMEDGROUP |
| identity | 1 |  |  |  |  |  |  | VICTIM | BELIGERENT |
| ingroup | 0 | SELF |  |  |  | BELIEVER |  |  |  |
| ingroup | 1 | FAMILY |  |  |  |  |  |  |  |
| ingroup | 2 | AFFILIATE |  |  |  |  |  |  |  |
| outgroup | 0 | OUTCAST |  | VERMIN | GPEOUTGROUP | APOSTATE | COMPETITOR | CRIMEGROUP | ENEMY |
| entity | 0 | CAUSE | ACADEMICENTITY | MEDICALENT | GPEENTITY | RELENTITY | ECONENTITY | SECENTITY | MILENTITY |
| entity | 1 | CREDO |  | SEXUALITY | TERRITORY | RELIGIOUSLAW | COMMODITY | LAW | WEAPON |
| entity | 2 | LOCATION |  | MEDICALFAC | GPEFAC | FAITH | ECONFAC | SECFAC | MILFAC |
| entity | 3 | SOCFAC |  | INTOXICANT | GPEWORKOFART | RELFAC | ECONWORKOFART | LEGALWORKOFART | MILWORKOFART |
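
The structure table lists named concepts but not how many seed terms each one holds, and several concepts in the schema are still empty lists. As a rough sketch (the `summarise_concepts` helper and the abbreviated `sample_typology` excerpt are illustrative assumptions, not part of the notebook), the distinct seed terms per concept could be counted like this:

```python
# Abbreviated excerpt with the same nesting as group_typology
sample_typology = {
    "geopolitics": {
        "entity": {
            "TERRITORY": ["homeland", "territory", "peninsula", "peninsula"],
            "GPEFAC": [],
        },
        "outgroup": {"GPEOUTGROUP": ["regime", "opponent"]},
    },
}


def summarise_concepts(typology):
    """Return one (ideology, attribute, concept, n_distinct_terms) row per named concept."""
    rows = []
    for ideology, attributes in typology.items():
        for attribute, concepts in attributes.items():
            for concept, terms in concepts.items():
                # set() collapses seed terms that are listed more than once
                rows.append((ideology, attribute, concept, len(set(terms))))
    return rows


summary = summarise_concepts(sample_typology)
empty_concepts = [concept for (_, _, concept, n_terms) in summary if n_terms == 0]

for row in summary:
    print(row)
print("concepts without seed terms:", empty_concepts)
```

The rows are plain tuples, so they can be passed to `pd.DataFrame` with column names if a tabular view is preferred.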


## Results for Each Orator

The following cell shows how each ideology is represented across each orator's texts.

These infographics show, as a percentage of the total number of concepts in each speech, the share of concepts each orator uses for each ideology.

In all but one of the speeches "social" scores highest, because the orators are addressing "the people".

The exception is Bush's speech of 26/10/2001, in which he addresses the signing of the USA PATRIOT Act and "justice" scores highest (42%). As might be expected, "religion" reaches its highest share for Bush (18%) in his speech of 14/09/2001 at the Episcopal National Cathedral for a day of Prayer and Remembrance.

For bin Laden, religion, military and geopolitics feature highly in his second and third speeches. In his speech following 9/11, religion features most highly (25%), as he confers a divine legitimacy on the attacks.


Using these terms, this annotation framework could be extended to create a new topic modelling schema specific for cultural violence.

In [7]:
import os
import json
import pandas as pd

filepath = r"C:/Users/Steve/OneDrive - University of Southampton/CulturalViolence/data/"
files = ['bushideologiesfile.json', 'binladenideologiesfile.json']

## colour map for the background gradient
cmap = "Reds"

for file in files:
    with open(os.path.join(filepath, file), "r") as f:
        table = json.load(f)

    ## fill gaps with a numeric 0 so that the percentage format can be applied
    display(pd.DataFrame.from_dict(table, orient = 'index').fillna(0).T \
        .style.background_gradient(cmap=cmap).format("{:.0%}"))

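| ideology | 2001-09-11 | 2001-09-14 | 2001-09-15 | 2001-09-17 | 2001-09-20 | 2001-10-07 | 2001-10-11 | 2001-10-26 | 2001-11-10 | 2001-12-11 | 2002-01-29 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| social | 43% | 44% | 36% | 56% | 37% | 38% | 40% | 15% | 35% | 51% | 32% |
| academia | 0% | 0% | 0% | 1% | 0% | 0% | 0% | 0% | 0% | 0% | 1% |
| medical | 5% | 9% | 3% | 1% | 3% | 2% | 2% | 2% | 4% | 3% | 5% |
| geopolitics | 12% | 12% | 14% | 14% | 21% | 12% | 16% | 22% | 20% | 13% | 17% |
| religion | 7% | 18% | 5% | 11% | 5% | 6% | 6% | 2% | 5% | 7% | 3% |
| economic | 9% | 4% | 3% | 6% | 4% | 4% | 7% | 8% | 4% | 1% | 17% |
| justice | 14% | 7% | 12% | 6% | 16% | 17% | 13% | 42% | 21% | 10% | 13% |
| military | 11% | 5% | 27% | 6% | 13% | 21% | 15% | 9% | 10% | 14% | 12% |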


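| ideology | 1996-08-23 | 2001-10-07 | 2001-11-09 | 2002-11-24 | 2004-11-01 |
|---|---|---|---|---|---|
| social | 29% | 33% | 37% | 27% | 35% |
| academia | 1% | 0% | 0% | 0% | 0% |
| medical | 2% | 4% | 4% | 4% | 5% |
| geopolitics | 17% | 12% | 13% | 18% | 18% |
| religion | 14% | 25% | 22% | 12% | 5% |
| economic | 9% | 1% | 1% | 10% | 12% |
| justice | 9% | 15% | 13% | 11% | 11% |
| military | 18% | 11% | 10% | 18% | 16% |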
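
The percentages above are each ideology's share of the total number of concepts in a speech. The counting step that produced the JSON files is not shown in this notebook, so the following is only a sketch of that final normalisation: `ideology_shares` is a hypothetical helper and the counts are invented for illustration, not taken from the corpus.

```python
def ideology_shares(concept_counts):
    """Convert raw concept counts per ideology into fractions of the speech total."""
    total = sum(concept_counts.values())
    if total == 0:
        # a speech with no matched concepts gets a zero share everywhere
        return {ideology: 0.0 for ideology in concept_counts}
    return {ideology: count / total for ideology, count in concept_counts.items()}


# Hypothetical counts for a single speech (not real data)
example_counts = {"social": 40, "geopolitics": 20, "justice": 25, "military": 15}

shares = ideology_shares(example_counts)
for ideology, share in shares.items():
    # same "{:.0%}" formatting as the styled tables
    print(f"{ideology}: {share:.0%}")
```

Storing the shares as fractions rather than pre-formatted strings keeps them numeric, which is what the background gradient and the `"{:.0%}"` format in the display cell expect.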
