## Dota 2 Hero Lore Connectivity
---
#### Data:
Dota 2 hero lores including associated heroes.
#### Source: 
Gamepedia [Dota 2 Heroes](https://dota2.gamepedia.com/Dota_2_Wiki) and respective [hero lores](https://dota2.gamepedia.com/Category:Hero_lore).
#### Method[<sup>PS</sup>](#Postscript-down-low):
[MediaWiki API](https://help.gamepedia.com/Bots) for Gamepedia.

See [here](https://help.gamepedia.com/Logging_in_to_third-party_tools#Using_Special:BotPasswords) for API registration and usage. Gamepedia also offers [Cargo data](https://help.gamepedia.com/Extension:Cargo) access. Dota-specific Cargo tables are found [here](https://dota2.gamepedia.com/Special:CargoTables).

#### Visualization:
A [variation on a radial dendrogram](https://beta.observablehq.com/@youmikoh/dota-2-lore-connectivity) using [d3v5.js](https://d3js.org/).

$G = \{V, E\} : $ `heroes` $\mapsto V, $ `connections_set` $\mapsto E$ 
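
As a minimal sketch of this mapping, using a made-up three-hero sample rather than the real wiki data (the names and links below are placeholders), each undirected connection can be stored as a `frozenset` so that A-B and B-A collapse into a single edge:

```python
# Toy adjacency data; hypothetical placeholders, not taken from the wiki.
sample = {
    'Hero A': {'Hero B'},
    'Hero B': {'Hero A', 'Hero C'},
    'Hero C': set(),
}

# V: one vertex per hero.
V = set(sample)

# E: one frozenset per undirected pair, so mirrored links are stored once.
E = {frozenset((hero, other)) for hero, links in sample.items() for other in links}

print(len(V), len(E))
```

Here the A-B link is listed from both sides but contributes only one element to `E`.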

---

#### Postscript down-low
Although Gamepedia's infrastructure lets wiki contributors and users get pretty fancy (see the links above for details), that isn't really warranted for ad-hoc micro projects like this one. Read-only data can be compiled easily with bare-bones MediaWiki API requests and [Cargo table queries](https://dota2.gamepedia.com/api.php?action=help&modules=cargoquery), so skip the extensions and third-party tools.
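
To illustrate how little is needed, here is a rough sketch that only builds a Cargo query URL and sends nothing over the network. The helper name `cargo_url` is hypothetical; the endpoint and parameter names are the ones used later in this notebook.

```python
from urllib.parse import urlencode, urlparse, parse_qs

def cargo_url(tables, fields, limit=200,
              base='https://dota2.gamepedia.com/api.php'):
    """Build a read-only cargoquery URL (hypothetical helper; no request is sent)."""
    params = {
        'action': 'cargoquery',
        'format': 'json',
        'tables': tables,
        'fields': fields,
        'limit': limit,
    }
    return f'{base}?{urlencode(params)}'

url = cargo_url('heroes', 'title,primary_attribute,page_id')

# Decode the query string again to check what would be sent.
sent = parse_qs(urlparse(url).query)
print(sent['tables'], sent['fields'])
```

The resulting URL can be opened in a browser or passed to `requests.get`, which is essentially what the cells below do.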

For a more thorough example, see the similar [Dota 2 Pro Circuit Connectivity](http://nbviewer.jupyter.org/github/youmikoh/dpc-connectivity/blob/master/part1_collect.ipynb) notebook.


---

In [1]:
import re
import requests
import requests_cache
from urllib.parse import urlencode

import time
from datetime import datetime

import wikitextparser as wtp

In [2]:
requests_cache.install_cache(
    cache_name='dota2_hero_lore',
    backend='sqlite',
    expire_after=60*60*24
)

In [3]:
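import os

headers = {
    'User-Agent': 'Dota 2 Hero Lore Connectivity 1.0/youmikoh@github',
    'Accept-Encoding': 'gzip'
}
# The bot password is read from the environment instead of being hard-coded;
# GAMEPEDIA_BOT_PASSWORD is a placeholder variable name.
auth = ('playumiko@bot', os.environ['GAMEPEDIA_BOT_PASSWORD'])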

In [4]:
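def throttle(f):
    """Leave at least 2 seconds between consecutive calls to f."""
    def wrap(*args, **kwds):
        elapsed = (datetime.now() - wrap.last).total_seconds()
        if elapsed < 2:
            # Sleep only for the remainder of the 2-second window.
            time.sleep(2 - elapsed)
        # Record the time of the actual call, after any sleep.
        wrap.last = datetime.now()
        return f(*args, **kwds)
    wrap.last = datetime.now()
    return wrap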

In [5]:
single = lambda c: len(c)==1

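@throttle
def gamepedia_api(query):
    # Send one throttled GET request and return the decoded JSON body.
    api = f'https://dota2.gamepedia.com/api.php?{urlencode(query)}'
    response = requests.get(api, headers=headers, auth=auth)
    response.raise_for_status()  # fail early on HTTP errors
    return response.json()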

def gamepedia_content(query):
    k, v = query
    query = {'action': 'query', 'prop': 'revisions', 'rvprop': 'content', 'format': 'json'}
    query[k] = v
    
    data = gamepedia_api(query)
    pages = list(data['query']['pages'].values())
    
    source = lambda page: page['revisions'].pop().get('*')
    content = [source(p) for p in pages]
    
    if single(content): content = content.pop()
    return content
    
def gamepedia_cargo_content(query):
    # Copy the query so the caller's dict is not modified.
    query = {**query, 'action': 'cargoquery', 'format': 'json'}
    data = gamepedia_api(query)
    return data.get('cargoquery')

In [6]:
heroes_query = {'tables': 'heroes', 'fields': 'title,primary_attribute,page_id', 'where': 'game is null', 'limit': 200}
heroes = gamepedia_cargo_content(heroes_query)

# Each Cargo result row is wrapped as {'title': {...fields...}}; unwrap it and key by hero title.
row = lambda h: h['title']
heroes = {row(h)['title']: row(h) for h in heroes}
heroes_set = set(heroes.keys())

In [7]:
heroes_bio_query = {'tables': 'heroes_bio', 'fields': 'name,title,quote,lore', 'limit': 200}
heroes_bio = gamepedia_cargo_content(heroes_bio_query)

# Unwrap each Cargo row and index the bios by "<title> <name>".
bio_key = lambda row: row['title'] + ' ' + row['name']
heroes_bio = {bio_key(h['title']): h['title'] for h in heroes_bio}

for bio in heroes_bio.values(): bio['quote'] = re.sub('&quot;', '', bio['quote'])

# First pass: attach a bio when exactly one hero name occurs in its key.
finicky = set()

for key, bio in heroes_bio.items():
    matches = [h for h in heroes_set if h in key]
    if len(matches) == 1:
        k = matches.pop()
        heroes[k]['lore'] = bio.pop('lore')
        heroes[k]['quote'] = bio.pop('quote')
    else: finicky.add(key)

# Second pass: match leftover bios word by word against heroes still missing lore.
for f in finicky:
    remaining_without_bio = {k for k in heroes_set if not heroes[k].get('lore')}
    for word in f.split(' '):
        matches = [h for h in remaining_without_bio if word in h]
        if len(matches) == 1:
            k = matches.pop()
            heroes[k]['lore'] = heroes_bio[f].pop('lore')
            heroes[k]['quote'] = heroes_bio[f].pop('quote')
            break  # bio consumed; a second pop would raise KeyError

# Sanity check: no hero is left without lore.
not {k for k in heroes_set if not heroes[k].get('lore')}

True

In [8]:
def hero_factions(faction_page_id):
    faction = gamepedia_content(('pageids', faction_page_id))
    parsed = wtp.parse(faction)
    affiliated = [p for p in parsed.sections if 'Affiliated' in p][-1]
    return re.findall(r'{{H\|(.*?)}}', affiliated.string)

dire_heroes = hero_factions(127924)
radiant_heroes = hero_factions(137875)
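
A quick way to see what the `{{H|...}}` pattern above picks up is to run it on a small wikitext fragment. The fragment below is a hypothetical stand-in with placeholder names; the real faction pages may be laid out differently.

```python
import re

# Hypothetical wikitext fragment with placeholder hero names.
sample = "== Affiliated heroes ==\n* {{H|Hero A}}\n* {{H|Hero B}}\n* [[Some other link]]\n"

# Capture the first argument of every {{H|...}} template; braces are escaped for clarity.
names = re.findall(r'\{\{H\|(.*?)\}\}', sample)
print(names)
```

Only the two template calls are captured; the plain wiki link is ignored.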

In [9]:
def hero_connections(hero):
    lore = gamepedia_content(('titles', f'{hero}/Lore'))
    parsed = wtp.parse(lore)
    parsed = [p for p in parsed.templates if 'Lore infobox' in p].pop()
    parsed = [p for p in parsed.arguments if 'heroes' in p].pop()
    return set(re.findall(r'{{H.?\|(.*?)}}', parsed.string))

connections_list = []

for hero in heroes_set:
    connected = hero_connections(hero)
    heroes[hero]['count'] = len(connected)
    heroes[hero]['connections'] = connected & heroes_set
    connections_list += [frozenset([hero, c]) for c in connected]
    heroes[hero]['faction'] = 'radiant' if hero in radiant_heroes else 'dire'
    
connections_set = set(connections_list)  # drop duplicate undirected pairs

In [13]:
graph = {'nodes':heroes, 'links':connections_set}

import json
from random import shuffle

class DPCEncoder(json.JSONEncoder):
    """JSON encoder that also handles sets, frozensets and datetimes."""
    def default(self, obj):
        if isinstance(obj, set):
            return list(obj)
        if isinstance(obj, frozenset):
            # Shuffle each pair so neither endpoint is consistently listed first.
            listset = list(obj)
            shuffle(listset)
            return listset
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)
        
with open('data/hero_lore_connectivity_data.json', 'w') as outfile:  
    json.dump(graph, outfile, cls=DPCEncoder)
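
As a variation on the encoder above (not the notebook's own method), the same idea can be sketched with the `default` hook of `json.dumps`, sorting instead of shuffling so the output is reproducible. The function name `encode_sets` and the toy graph are hypothetical.

```python
import json

def encode_sets(obj):
    """Fallback for json.dumps: emit sets and frozensets as sorted lists."""
    if isinstance(obj, (set, frozenset)):
        # key=str also gives an order for sets that contain frozensets.
        return sorted(obj, key=str)
    raise TypeError(f'{type(obj).__name__} is not JSON serializable')

# Toy graph with placeholder names, shaped like {'nodes': ..., 'links': ...}.
toy = {
    'nodes': {'Hero A': {'connections': {'Hero B'}}},
    'links': {frozenset(('Hero A', 'Hero B'))},
}

text = json.dumps(toy, default=encode_sets)
restored = json.loads(text)
print(restored['links'])
```

After a round trip through JSON the sets come back as plain lists, which is the shape a node/link visualization would typically consume.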

---
## Continue to [Visualization](https://beta.observablehq.com/@youmikoh/dota-2-lore-connectivity)
___
<br><br>

In [12]:
from IPython.core.display import HTML
HTML(open("css/lore_ipynb.css", "r").read()) #IPYNB STYLING