In [1]:
from gensim.models.word2vec import Word2Vec
from IPython.display import Markdown, display
from tabulate import tabulate

In [4]:
urban_model = Word2Vec.load('../data/models/urban.bin')

In [2]:
rural_model = Word2Vec.load('../data/models/rural.bin')

In [7]:
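def compare(token):
    """Show each model's nearest neighbors for `token`."""
    # Similarity queries live on the KeyedVectors in `.wv`; calling
    # most_similar on the model itself was deprecated, then removed in gensim 4.
    u_similar = urban_model.wv.most_similar(token)
    r_similar = rural_model.wv.most_similar(token)

    # Keep only the tokens (drop the scores), one per table row.
    u_tokens = [[t] for t, _ in u_similar]
    r_tokens = [[t] for t, _ in r_similar]

    display(Markdown(f'# {token}'))
    display(Markdown('#### Urban'))
    print(tabulate(u_tokens))
    display(Markdown('#### Rural'))
    print(tabulate(r_tokens))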
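
`most_similar` raises a `KeyError` when a token is missing from a model's vocabulary, which seems likely for words that only show up in one of the two corpora. Below is a minimal sketch of a guard, assuming that behavior. `safe_neighbors` and `_StubVectors` are hypothetical names, and the stub (with made-up scores) only stands in for the saved models so the example runs on its own.

```python
def safe_neighbors(vectors, token, topn=10):
    """Return the topn neighbor tokens, or [] if token is out of vocabulary."""
    try:
        return [t for t, _ in vectors.most_similar(token, topn=topn)]
    except KeyError:
        return []


class _StubVectors:
    """Hypothetical stand-in for a model's vectors, for illustration only."""

    def __init__(self, table):
        self._table = table

    def most_similar(self, token, topn=10):
        # Mimic the (token, score) pairs and the KeyError on unknown tokens.
        return self._table[token][:topn]


# Scores here are invented placeholders, not values from either model.
stub = _StubVectors({'track': [('field', 0.8), ('tennis', 0.7)]})
print(safe_neighbors(stub, 'track'))
print(safe_neighbors(stub, 'subway'))
```

With something like this, `compare` could print an empty table for one side instead of failing when a token exists in only one model.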

"Black" is about race in cities, just a color in small towns.

In [8]:
compare('black')

# black

#### Urban

--------
white
hispanic
trans
queer
male
gay
asian
blue
brown
colored
--------


#### Rural

-------
white
yellow
blue
purple
red
gray
colored
pink
male
orange
-------
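
One way to put a number on how far apart the two neighborhoods are is to measure their overlap. This is a rough sketch, not part of the original analysis: `neighbor_overlap` is a hypothetical helper, and the two lists are copied by hand from the output above rather than pulled from the models.

```python
def neighbor_overlap(urban, rural):
    """Split two neighbor lists into shared / urban-only / rural-only."""
    urban_set, rural_set = set(urban), set(rural)
    shared = [t for t in urban if t in rural_set]
    urban_only = [t for t in urban if t not in rural_set]
    rural_only = [t for t in rural if t not in urban_set]
    # Jaccard similarity: size of the intersection over size of the union.
    union = urban_set | rural_set
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, urban_only, rural_only, jaccard


# Neighbor lists for "black", transcribed from the tables above.
urban_black = ['white', 'hispanic', 'trans', 'queer', 'male',
               'gay', 'asian', 'blue', 'brown', 'colored']
rural_black = ['white', 'yellow', 'blue', 'purple', 'red',
               'gray', 'colored', 'pink', 'male', 'orange']

shared, urban_only, rural_only, jaccard = neighbor_overlap(urban_black, rural_black)
print(shared)      # ['white', 'male', 'blue', 'colored']
print(urban_only)  # ['hispanic', 'trans', 'queer', 'gay', 'asian', 'brown']
print(rural_only)  # ['yellow', 'purple', 'red', 'gray', 'pink', 'orange']
print(jaccard)     # 0.25
```

The urban-only words are all identity terms and the rural-only words are all colors, which is the split described above. A low Jaccard score could also be used to scan the vocabulary for other tokens worth a look.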


Likewise, in rural areas "race" is about NASCAR and motorsports, not race as a social category.

In [9]:
compare('race')

# race

#### Urban

---------
racial
gender
racism
movement
identity
debate
fascism
politics
narrative
violence
---------


#### Rural

---------
races
nascar
racing
semi
division
sprint
indy
series
match
nationals
---------


"Bond" in rural areas seems heavily used in the sense of social bonds, social interaction, social capital - "contract," "friendship," "connection," "guardian," "agreement," etc. Whereas in cities this is either absent or drowned out. (Though not sure by what?)

In [21]:
compare('bond')

# bond

#### Urban

---------
evans
cameron
james
baldwin
contract
murray
founder
potential
adam
casey
---------


#### Rural

----------
contract
agreement
friendship
divorce
replacing
connection
recovery
parker
increased
guardian
----------


"Track" is a music track in cities, running track in small towns. This is a trend - lots of sports in the rural corpus, especially high school sports.

In [10]:
compare('track')

# track

#### Urban

------------
tracks
album
banger
freestyle
song
ep
instrumental
wave
verse
intro
------------


#### Rural

------------
field
invitational
lax
tennis
volleyball
lacrosse
sectional
hoops
nationals
soccer
------------


Not meaningful, but amusing: in cities "sin" is the Spanish word for "without," whereas in small towns it's sin in the religious sense.

In [11]:
compare('sin')

# sin

#### Urban

-------
fue
mal
ser
mejor
hacer
amor
dios
tiene
fear
siempre
-------


#### Rural

---------
christ
jesus
fear
satan
ye
religion
god
heaven
therefore
unto
---------


"Drama" is essentially "dramatic art" in cities (movies, films), as opposed to "melodrama," essentially, in small towns (bullshit, nonsense).

In [12]:
compare('drama')

# drama

#### Urban

---------
movies
comedy
films
spoilers
shameless
horror
feud
movie
romance
binge
---------


#### Rural

----------
bullshit
bs
politics
shit
negativity
nonsense
stress
arguing
comedy
bullying
----------


"Subway" is transportation in cities, food (the restaurant chain) in small towns!

In [13]:
compare('subway')

# subway

#### Urban

---------
train
bus
freeway
rail
street
airport
terminal
passenger
plane
bridge
---------


#### Rural

----------
mcdonalds
starbucks
chipotle
walmart
pizza
donuts
sushi
restaurant
mall
chinese
----------


"Clear" in an intellectual sense in cities (obvious, correct, simple), as opposed to clear weather in rural areas.

In [14]:
compare('clear')

# clear

#### Urban

----------
obvious
reasonable
valid
simple
consistent
possible
correct
helpful
useful
aware
----------


#### Rural

--------
overcast
fog
cloudy
skies
partly
clouds
dry
moderate
appears
bright
--------


Hannah Montana (and French Montana) in cities, Montana the state in small towns.

In [15]:
compare('montana')

# montana

#### Urban

-------
baker
french
frankie
trey
minaj
terry
hannah
migos
drake
calvin
-------


#### Rural

-----------
virginia
georgia
mississippi
southern
delaware
wisconsin
texas
california
louisiana
vermont
-----------


I think this is interesting: in cities, "newspaper" seems associated with individual media outlets and sources (reuters, bbc, breitbart), whereas in small towns it sits closer to print media in an abstract sense, and maybe also education (magazine, journal, editor, writer, classroom, teacher).

In [17]:
compare('newspaper')

# newspaper

#### Urban

----------
reuters
amid
bbc
uk
terror
report
breitbart
australian
british
canadian
----------


#### Rural

---------
bookstore
writer
editor
magazine
headline
journal
reference
classroom
teacher
history
---------


Keith Urban in the country!

In [18]:
compare('urban')

# urban

#### Urban

------------
contemporary
underground
landscape
industrial
architecture
arts
innovation
innovative
institute
revolution
------------


#### Rural

-----------
toby
keith
rural
cave
bass
southern
metal
electric
rock
underground
-----------


"Turkey" is a country in the city, a food in the country.

In [19]:
compare('turkey')

# turkey

#### Urban

--------
iran
china
crab
syria
germany
iraq
republic
israel
rice
sweden
--------


#### Rural

-------
salmon
pork
bacon
crab
chicken
shrimp
tomato
meat
chili
garlic
-------


Interesting that "lone wolf" is strong in small towns but missing in cities, where the neighbors look mostly like personal names (heather, alice, jean, robin).

In [22]:
compare('wolf')

# wolf

#### Urban

-------
eyed
blue
heather
alice
rebel
jean
walker
ivy
knight
robin
-------


#### Rural

--------
lone
bear
mountain
spider
wolves
shark
deer
survivor
tiger
tornado
--------


In cities, "justice" has some positive connotations (equality, unity, reform), whereas in small towns it's all criminality and law enforcement.

In [33]:
compare('justice')

# justice

#### Urban

--------------
equality
reform
violence
discrimination
terrorism
environmental
unity
criminal
corruption
racial
--------------


#### Rural

-----------
treason
enforcement
doj
racial
violence
corruption
criminal
terrorism
crimes
judges
-----------
