# Discovery and Representation of Open Making Related Terms

This notebook sketches an initial exercise in discovering keywords related to open making. The input text is harvested by a web crawler that identifies and crawls semantically related Wikipedia articles.

In [1]:
from utils import tokenizer
import nltk
from nltk import FreqDist
from math import log
import json, csv

## 1. Loading a reference English language corpus

In [2]:
from nltk.corpus import brown
brown.categories()

['adventure',
 'belles_lettres',
 'editorial',
 'fiction',
 'government',
 'hobbies',
 'humor',
 'learned',
 'lore',
 'mystery',
 'news',
 'religion',
 'reviews',
 'romance',
 'science_fiction']

## 2. Stop words

### 2.1 Standard stop words

In [3]:
with open("data/stopwords_standard.txt", "r") as f:
    STOP_WORDS_STANDARD = set(f.read().strip().split("\n"))
print(STOP_WORDS_STANDARD)

{"you're", 'own', 'not', 'just', 'down', 'those', 'about', "we've", 'has', "aren't", 'r', 'under', 'other', 'get', 'they', "couldn't", 'doing', "he'll", "we're", 'themselves', 'through', 'ours ', 'himself', 'had', "they've", 'any', 'like', 'between', 'on', 'a', 'cannot', "i'm", 'how', 'when', 'further', 'then', 'if', "didn't", 'his', "she'd", "they'll", 'while', 'him', 'your', 'we', 'be', 'should', 'have', 'he', 'could', "don't", 'or', 'would', 'were', 'very', 'www', 'until', "you'd", 'to', 'having', 'why', "who's", 'into', 'our', 'each', 'all', 'again', 'com', 'where', 'am', 'ourselves', 'itself', 'do', 'there', 'most', "won't", "weren't", 'and', 'it', 'so', 'below', 'more', 'than', 'yours', "doesn't", 'out', 'who', 'up', 'too', 'does', "he's", "he'd", "isn't", 'hers', 'an', 'such', "there's", 'only', 'me', 'for', "i've", 'off', "she's", "we'd", "i'd", "where's", 'my', 'can', "can't", 'some', 'did', 'above', 'both', "how's", "hadn't", 'over', "that's", 'being', 'know', 'as', 'http', '

### 2.2 Open-making related stop words

In [4]:
with open("data/stopwords_openmaker.txt", "r") as f:
    STOP_WORDS_OPENMAKER = set(f.read().strip().split("\n"))
print(STOP_WORDS_OPENMAKER)

{'may', 'almost', 'often', 'one', 'well', 'many', 'also'}
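
The standard stop-word set printed in section 2.1 includes the entry `'ours '` with a trailing space, which suggests that one line of the stop-word file carries trailing whitespace. Assuming the tokenizer emits `ours` without the space, that entry would never match a token. The following is a minimal sketch of a loader that trims each line; `load_stopwords` and the demo file contents are hypothetical and not part of the notebook.

```python
import os
import tempfile


def load_stopwords(path):
    """Read one stop word per line, trimming whitespace and skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}


# Demonstration on a throwaway file; the contents are made up for this sketch.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("ours \nown\n\nNot\n")
    demo_path = tmp.name

demo_stopwords = load_stopwords(demo_path)
os.remove(demo_path)
print(sorted(demo_stopwords))
```

Calling `load_stopwords("data/stopwords_standard.txt")` in place of the `split("\n")` expression would be one way to use it in the cells above.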


## 3. Removing stop words from the reference English corpus

In [5]:
# Merge the two stop-word sets.
STOP_WORDS = STOP_WORDS_STANDARD.union(STOP_WORDS_OPENMAKER)
print(STOP_WORDS)

{'just', 'down', 'those', 'about', 'has', 'under', 'they', 'doing', "he'll", 'ours ', 'himself', "they've", 'like', 'between', 'a', "i'm", 'when', 'then', 'if', "didn't", "she'd", "they'll", 'while', 'him', 'we', 'should', 'have', 'he', "don't", 'or', 'were', 'very', 'until', 'why', "who's", 'into', 'each', 'again', 'well', 'ourselves', 'do', 'and', 'it', 'below', "doesn't", 'out', 'who', 'up', 'too', 'does', "he's", "there's", 'only', 'me', 'for', "i've", "she's", "we'd", "i'd", "can't", 'did', 'above', "how's", "that's", "hadn't", 'as', 'yourself', "what's", 'herself', 'many', 'nor', "shan't", 'she', "you'll", 'with', "haven't", 'same', 'here', 'you', 'are', 'them', 'before', "we'll", 'which', 'but', 'i', "let's", 'yourselves', "you've", 'their', 'during', 'myself', 'no', 'because', "she'll", 'that', 'these', "shouldn't", "you're", 'own', 'not', "we've", 'r', "aren't", 'other', 'get', "couldn't", "we're", 'themselves', 'through', 'had', 'any', 'on', 'cannot', 'how', 'further', 'his',

In [6]:
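# Load English words from the Brown corpus, removing stop words.
# NOTE: the stop-word test runs on the original casing, so capitalized
# stop words (e.g. 'The') pass the filter and are counted after lowercasing.
english_freq_dist = FreqDist([w.lower() for w in nltk.corpus.brown.words()
                              if w not in STOP_WORDS])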
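
The order of lowercasing and filtering matters for the reference counts, and therefore for the total `n_english` used later. The sketch below contrasts the two orders on toy data; the word list and stop-word set are made up and only stand in for `brown.words()` and `STOP_WORDS`.

```python
from collections import Counter

# Toy stand-ins for brown.words() and STOP_WORDS (hypothetical values).
words = ["The", "maker", "the", "Maker", "and", "tools"]
stop_words = {"the", "and"}

# Order used in the cell above: test the raw token, then lowercase it.
filter_then_lower = Counter(w.lower() for w in words if w not in stop_words)

# Alternative order: lowercase first, then test membership.
lower_then_filter = Counter(
    w.lower() for w in words if w.lower() not in stop_words
)

print(filter_then_lower)
print(lower_then_filter)
```

In the first variant the capitalized "The" survives as `the`. Switching to the second variant in the notebook would change `n_english` and hence every score in section 9, so the outputs shown here correspond to the first variant.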

## 4. Removing the rare words.

Below we remove rare words from the reference distribution. The code keeps all words with an occurrence frequency above 2; the total count is computed in section 7.

In [7]:
english_freq_dist = {k:v for k,v in english_freq_dist.items() if v > 2}
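
As a small illustration of the threshold, the sketch below applies the same kind of filter to a toy counter. `drop_rare` and the toy counts are hypothetical. Note that a dict comprehension returns a plain `dict`, so after this step the result no longer offers `FreqDist` methods such as `most_common`.

```python
from collections import Counter


def drop_rare(freq, min_count):
    """Keep only the entries whose count is at least min_count."""
    return {word: count for word, count in freq.items() if count >= min_count}


# Made-up counts for demonstration.
toy = Counter({"design": 5, "kludge": 3, "jugaad": 2, "hapax": 1})

# min_count=3 has the same effect as the `v > 2` test used above.
kept = drop_rare(toy, min_count=3)
print(kept)
print(type(kept).__name__)
```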

## 5. Loading the input Open Maker corpus

In [8]:
# Load the text harvested from Wikipedia.
with open("data/wikipedia.json", "r") as f:
    OM_Corpus_text = f.read()
OM_Corpus = json.loads(OM_Corpus_text)

In [9]:
# The total number of wiki articles used:
print(len(OM_Corpus))

152


In [10]:
# Field names of each page record in the corpus.
OM_Corpus[0].keys()

dict_keys(['theme.id', 'title', 'url', 'depth', 'text'])

In [11]:
def display_pages(tid):
    meme = [page for page in OM_Corpus if page['theme.id'] == tid]
    for m in meme:
        print(m['depth'],m['title'], m['url'])
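
Since every record carries a `theme.id`, the pages can also be grouped once instead of being filtered per call. The following is a sketch under the assumption that records look like the `dict_keys` output above; `pages_by_theme` is a hypothetical helper and the three toy records use empty `text` fields.

```python
from collections import defaultdict


def pages_by_theme(corpus):
    """Group (depth, title) pairs by the 'theme.id' field of each page record."""
    grouped = defaultdict(list)
    for page in corpus:
        grouped[page["theme.id"]].append((page["depth"], page["title"]))
    return dict(grouped)


# Toy records shaped like the corpus entries; text is left empty on purpose.
toy_corpus = [
    {"theme.id": 0, "title": "Do it yourself", "depth": 0,
     "url": "https://en.wikipedia.org/wiki/Do_it_yourself", "text": ""},
    {"theme.id": 0, "title": "Kludge", "depth": 1,
     "url": "https://en.wikipedia.org/wiki/Kludge", "text": ""},
    {"theme.id": 1, "title": "Open design", "depth": 0,
     "url": "https://en.wikipedia.org/wiki/Open_design", "text": ""},
]

grouped = pages_by_theme(toy_corpus)
for theme_id, pages in sorted(grouped.items()):
    print(theme_id, pages)
```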

In [12]:
display_pages(0)

0 Do it yourself https://en.wikipedia.org/wiki/Do_it_yourself
1 Edupunk https://en.wikipedia.org/wiki/Edupunk
1 Prosumer https://en.wikipedia.org/wiki/Prosumer
1 How-to https://en.wikipedia.org/wiki/How-to
1 Kludge https://en.wikipedia.org/wiki/Kludge
1 Bricolage https://en.wikipedia.org/wiki/Bricolage
1 Junk box https://en.wikipedia.org/wiki/Junk_box
1 Number 8 wire https://en.wikipedia.org/wiki/Number_8_wire
1 Ready-to-assemble furniture https://en.wikipedia.org/wiki/Ready-to-assemble_furniture
1 Open design https://en.wikipedia.org/wiki/Open_Design
1 Hackerspace https://en.wikipedia.org/wiki/Hackerspace
1 Instructables https://en.wikipedia.org/wiki/Instructables
1 Handyman https://en.wikipedia.org/wiki/Handyman
1 Circuit bending https://en.wikipedia.org/wiki/Circuit_bending
1 Project GreenWorld International https://en.wikipedia.org/wiki/Project_GreenOman
1 3D printing https://en.wikipedia.org/wiki/3D_printing


In [13]:
display_pages(1)

0 Open design https://en.wikipedia.org/wiki/Open_design
1 Knowledge commons https://en.wikipedia.org/wiki/Knowledge_commons
1 Open Source Ecology https://en.wikipedia.org/wiki/Open_Source_Ecology
1 Computer-aided design https://en.wikipedia.org/wiki/Computer-aided_design
1 Open Source Initiative https://en.wikipedia.org/wiki/Open_Source_Initiative
1 Open Architecture Network https://en.wikipedia.org/wiki/Open_Architecture_Network
1 Open-source architecture https://en.wikipedia.org/wiki/Open-source_architecture
1 Commons-based peer production https://en.wikipedia.org/wiki/Commons-based_peer_production
1 Open standard https://en.wikipedia.org/wiki/Open_standard
1 OpenCores https://en.wikipedia.org/wiki/OpenCores
1 Co-creation https://en.wikipedia.org/wiki/Co-creation
1 OpenBTS https://en.wikipedia.org/wiki/OpenBTS
1 Open manufacturing https://en.wikipedia.org/wiki/Open_manufacturing
1 Open-source hardware https://en.wikipedia.org/wiki/Open-source_hardware
1 Open source appropriate techno

In [14]:
display_pages(2)

0 Sustainability https://en.wikipedia.org/wiki/Sustainability
1 Sustainability standards and certification https://en.wikipedia.org/wiki/Sustainability_standards_and_certification
1 Appropriate technology https://en.wikipedia.org/wiki/Appropriate_technology
1 Sustainable development https://en.wikipedia.org/wiki/Sustainable_development
1 Environmental issue https://en.wikipedia.org/wiki/Environmental_issue
1 World Cities Summit https://en.wikipedia.org/wiki/World_Cities_Summit
1 Ecopsychology https://en.wikipedia.org/wiki/Ecopsychology
1 Book:Sustainability https://en.wikipedia.org/wiki/Book:Sustainability
1 Sustainable design https://en.wikipedia.org/wiki/Sustainable_design
1 Circles of Sustainability https://en.wikipedia.org/wiki/Circles_of_Sustainability
1 Sustainability science https://en.wikipedia.org/wiki/Sustainability_science
1 Sustainable living https://en.wikipedia.org/wiki/Sustainable_living
1 Index of sustainability articles https://en.wikipedia.org/wiki/List_of_sustainabil

In [15]:
display_pages(3)

0 Maker culture https://en.wikipedia.org/wiki/Maker_culture
1 Modular design https://en.wikipedia.org/wiki/Modular_design
1 Open-source car https://en.wikipedia.org/wiki/Open-source_car
1 Electric vehicle conversion https://en.wikipedia.org/wiki/Electric_vehicle_conversion
1 Thingiverse https://en.wikipedia.org/wiki/Thingiverse
1 Fab lab https://en.wikipedia.org/wiki/Fab_Lab_(fabrication_laboratory)
1 SparkFun Electronics https://en.wikipedia.org/wiki/SparkFun
1 RepRap project https://en.wikipedia.org/wiki/RepRap
1 Distributed manufacturing https://en.wikipedia.org/wiki/Distributed_manufacturing
1 Craft production https://en.wikipedia.org/wiki/Craft_production
1 Autonomous building https://en.wikipedia.org/wiki/Autonomous_building
1 Open-source hardware https://en.wikipedia.org/wiki/Open_source_hardware
1 Kit car https://en.wikipedia.org/wiki/Kit_car


In [16]:
display_pages(4)

0 Innovation https://en.wikipedia.org/wiki/Innovation
1 Competitive intelligence https://en.wikipedia.org/wiki/Creative_competitive_intelligence
1 Multiple discovery https://en.wikipedia.org/wiki/Multiple_discovery
1 UNDP Innovation Facility https://en.wikipedia.org/wiki/UNDP_Innovation_Facility
1 Open Innovations (event) https://en.wikipedia.org/wiki/Open_Innovations_(Forum_and_Technology_Show)
1 Trans-cultural diffusion https://en.wikipedia.org/wiki/Diffusion_(anthropology)
1 Individual capital https://en.wikipedia.org/wiki/Individual_capital
1 Innovation system https://en.wikipedia.org/wiki/Innovation_system
1 Public domain https://en.wikipedia.org/wiki/Public_domain
1 Ingenuity https://en.wikipedia.org/wiki/Ingenuity
1 Sustainable Development Goals https://en.wikipedia.org/wiki/Sustainable_Development_Goals
1 Participatory design https://en.wikipedia.org/wiki/Participatory_design
1 Innovation management https://en.wikipedia.org/wiki/Innovation_management
1 Information revolution ht

In [17]:
display_pages(5)

0 Collaboration https://en.wikipedia.org/wiki/Collaboration
1 Wikinomics https://en.wikipedia.org/wiki/Wikinomics
1 Collaborative editing https://en.wikipedia.org/wiki/Collaborative_editing
1 Telepresence https://en.wikipedia.org/wiki/Telepresence
1 Knowledge management https://en.wikipedia.org/wiki/Knowledge_management
1 The Culture of Collaboration https://en.wikipedia.org/wiki/The_Culture_of_Collaboration
1 Collaborative governance https://en.wikipedia.org/wiki/Collaborative_governance
1 Community film https://en.wikipedia.org/wiki/Community_film
1 Collaborative innovation network https://en.wikipedia.org/wiki/Collaborative_innovation_network
1 Design thinking https://en.wikipedia.org/wiki/Design_thinking
1 Role-based collaboration https://en.wikipedia.org/wiki/Role-based_collaboration
1 Intranet portal https://en.wikipedia.org/wiki/Intranet_portal
1 Critical thinking https://en.wikipedia.org/wiki/Critical_thinking
1 Facilitation (business) https://en.wikipedia.org/wiki/Facilitation

## 6. Analyzing a specific corpus based on a theme

In [18]:
# Note that theme.id 0 corresponds to the "Do it yourself" theme.
input_text = " ".join([page['text'] for page in OM_Corpus if page['theme.id'] == 0])

In [19]:
# Tokenizing the input text:
tokenized = tokenizer.tokenize_words(input_text)
number_of_words = len(tokenized)
print(number_of_words, OM_Corpus[0]['title'])

30073 Do it yourself

### 6.1 Computing the frequency distribution of each token, i.e. word, term, punctuation, etc.

In [20]:
input_freq_dist = FreqDist(tokenized)

In [21]:
input_freq_dist.most_common(20)

[('\n', 3787),
 ('the', 1257),
 ('and', 776),
 ('of', 771),
 ('a', 661),
 ('to', 642),
 ('in', 563),
 ('"', 429),
 ('is', 303),
 ('as', 276),
 ('for', 257),
 ('that', 224),
 ('or', 206),
 ('by', 186),
 ('with', 182),
 ('on', 156),
 ('are', 151),
 ('3d', 142),
 ('from', 129),
 ('it', 119)]

### 6.2 Removing punctuation and stopwords from the input corpus

In [22]:
for stopword in STOP_WORDS:
    if stopword in input_freq_dist:
        del input_freq_dist[stopword]
        
for punctuation in tokenizer.CHARACTERS_TO_SPLIT:
    if punctuation in input_freq_dist:
        del input_freq_dist[punctuation]

# Re-check the most common words after cleaning:
input_freq_dist.most_common(80)

[('3d', 142),
 ('printing', 94),
 ('design', 75),
 ('used', 72),
 ('open', 65),
 ('new', 56),
 ('kludge', 55),
 ('term', 53),
 ('diy', 52),
 ('manufacturing', 51),
 ('use', 50),
 ('project', 49),
 ('bricolage', 46),
 ('work', 45),
 ('hackerspaces', 44),
 ('handyman', 43),
 ('projects', 38),
 ('using', 38),
 ('parts', 37),
 ('music', 35),
 ('furniture', 34),
 ('people', 33),
 ('production', 33),
 ('software', 33),
 ('kluge', 33),
 ('technology', 32),
 ('home', 31),
 ('circuit', 31),
 ('common', 30),
 ('make', 30),
 ('first', 30),
 ('see', 29),
 ('free', 29),
 ('social', 29),
 ('culture', 29),
 ('process', 29),
 ('additive', 29),
 ('material', 27),
 ('example', 27),
 ('world', 27),
 ('printers', 27),
 ('electronic', 26),
 ('materials', 25),
 ('hackerspace', 25),
 ('prosumer', 25),
 ('digital', 25),
 ('processes', 24),
 ('printed', 24),
 ('repair', 24),
 ('metal', 24),
 ('uses', 23),
 ('part', 23),
 ('time', 23),
 ('include', 23),
 ('products', 23),
 ('layer', 23),
 ('building', 22),
 ('m

### 6.3 Removing rare words from input distribution

In [23]:
input_freq_dist = {k:v for k,v in input_freq_dist.items() if v > 1}

## 7. Comparing input vs English corpus volumes

### 7.1 Total words (after cleaning) 

In [24]:
n_input = sum(input_freq_dist.values())
n_english = sum(english_freq_dist.values())
n_input, n_english

(12668, 679519)

### 7.2 Unique words (after cleaning)

In [25]:
n_unique_word_input = len(input_freq_dist.items())
n_unique_word_brown = len(english_freq_dist.items())
n_unique_word_input, n_unique_word_brown

(2381, 20591)

### 7.3 Cleaned set of input words/terms

The cleaned words of the input corpus are listed below for visual inspection. Such inspections are used to improve both tokenization and filtering.

In [26]:
input_freq_dist

{'uses': 23,
 'see': 29,
 'disambiguation': 5,
 'diy': 52,
 'redirects': 3,
 'article': 21,
 'multiple': 13,
 'issues': 10,
 'please': 9,
 'help': 20,
 'improve': 15,
 'discuss': 4,
 'page': 2,
 'learn': 14,
 'remove': 11,
 'template': 10,
 'possibly': 4,
 'contains': 3,
 'original': 11,
 'research': 17,
 'verifying': 2,
 'claims': 5,
 'made': 19,
 'adding': 11,
 'inline': 2,
 'citations': 7,
 'statements': 2,
 'consisting': 3,
 'removed': 8,
 'november': 3,
 'message': 10,
 'needs': 8,
 'additional': 5,
 'better': 6,
 'verification': 3,
 'reliable': 3,
 'sources': 12,
 'unsourced': 3,
 'material': 27,
 'challenged': 3,
 'september': 2,
 'part': 23,
 'series': 11,
 'individualism': 4,
 'topics': 4,
 'concepts': 2,
 'autonomy': 2,
 'free': 29,
 'love': 6,
 'freethought': 2,
 'human': 12,
 'rights': 8,
 'individual': 13,
 'reclamation': 2,
 'liberty': 4,
 'negative': 3,
 'personal': 7,
 'property': 12,
 'positive': 4,
 'private': 4,
 'self-ownership': 2,
 'mile': 2,
 'armand': 2,
 'alber

### 7.4 Set of terms/words that occur in both corpora

In [27]:
common_words = [w for w in input_freq_dist.keys() & english_freq_dist.keys()]
print(len(common_words))

1940
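
Dictionary key views support set operations directly, which is what the `&` above (and the `-` in section 7.5) relies on. The sketch below uses made-up counts to show both operations and the identity that ties them together: every input word is either shared or input-only. With the notebook's own figures this reads 1940 + 441 = 2381.

```python
# Made-up counts; the words only serve as an illustration.
input_counts = {"printing": 94, "design": 75, "kludge": 55, "makerbot": 4}
reference_counts = {"printing": 7, "design": 120, "government": 400}

# Key views behave like sets: intersection and difference return plain sets.
common = input_counts.keys() & reference_counts.keys()
input_only = input_counts.keys() - reference_counts.keys()

# Sets of strings have no stable iteration order across runs, so sort
# before printing when a reproducible listing is wanted.
print(sorted(common))
print(sorted(input_only))
print(len(common) + len(input_only) == len(input_counts))
```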


In [28]:
for w in common_words: print(w)

previously
marks
door
discourse
chains
even
office
fundamental
address
patents
radical
candy
carpentry
swimming
founding
far
course
economy
welcome
firms
whether
politics
fields
drawn
writing
advanced
covers
against
rig
experimentation
meets
handy
essentially
pieces
sponsored
cutters
serve
inefficient
business
traditional
code
influence
acquisition
set
scanning
opened
akin
university
fence
mess
'
giving
integrate
notes
trusting
areas
melting
smaller
essay
possess
military
journals
album
yes
matter
wider
molding
bent
primary
reconstruction
reporter
nail
simple
wages
august
unpaid
complete
cause
version
photography
teaching
discussing
assessment
application
contest
significantly
sheds
process
week
asking
branch
powder
thinking
french
jim
name
homeowners
particle
motors
awareness
fully
europe
platform
change
related
materials
sectors
data
file
sophisticated
quantities
satisfy
open
myriad
assembled
ever
paper
useful
emerged
capabilities
record
jet
real
stewart
s
vary
definitions
object
gua

authors
post
pioneers
costs
went
greece
worker
kitchen
come
l
paul
love
resistors
usage
classroom
computing
years
mile
c
likely
folklore
initiative
contains
shelf
household
texas
research
invented
home
lower
order
free
money
commercial
steel
tend
india
fashion
technician
go
stores
experimental
theme
brother
spencer
turning
craft
creating
type
male
humming
controlled
citizens
biggest
toys
public
history
wrong
culture
law
engine
reduced
america
compared
companies
spell
murray
introduced
keeping
benefit
rest
always
entirely
concluded
jersey
spreading
sections
practical
electrical
teach
files
poland
cases
list
plastic
tech
mentality
widely
nuts
advantages
instead
improvement
mutual
awarded
finish
family
think
human
german
english
world
organizations
learning
manned
life
achieve
minor
wheel
tips
emma
now
manufacturers
collection
provide
lead
status
character
technology
require
great
vehicle
fledgling
terminology
benefits
societies
although
rebuild
estimate
government
continue
thrift
felt
ci

### 7.5 Set of terms/words that occur in the sample but not in the reference corpus

TO BE EXAMINED: This set needs to be incorporated into the scoring. It may capture the specificity of the content to a great extent, so each word in it needs to be assigned a mapping score.

In [29]:
input_specifics = dict()
for w in input_freq_dist.keys() - english_freq_dist.keys():
    input_specifics[w] = input_freq_dist[w]
    print(w)

hacker
how-tos
wipers
faire
abs
zines
polymer
emissions
kludged
hometalk
uv
atm
disposable
thoreau
poorly-matching
leone
buyer
labs
blurring
commons-based
copyright
hirst
josiah
prosumers'
ac
docking
kotler
kluhj
sub-cultures
collaborate
nepal
uganda
libertarianism
commercialization
module
software
manifesto
repairing
home-improvement
stereolithography
globalization
step-by-step
skylab
commentators
machining
handyman's
workaround
1980s
matrix
internet
klooj
futurist
high-tech
matured
xinchejian
lvi-strauss
infrastructure
wiring
vcrs
sprinkler
innovators
futurologist
armand
consumer-oriented
potentiometers
hams
friedrich
starring
wikipedia
makerbot
implemented
weblog
1960s
laser
vernacular
wiki-based
genre
beijing
anarchism
reuse
cognitive
landscaping
options
stylistic
jeans
greenoman
fiberglass
extrusion
zine
widgets
jugaad
homebrewers
wiki
synthesizers
forums
wiktionary
academics
hobbyists
redirected
console
isbn
pastiche
day'
verifying
dhofar
2010s
categorized
c-base
kluge
top-down
k

In [30]:
print(len(input_specifics))

441
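
One possible way to score these input-only words, offered purely as an assumption and not as the method used in this notebook, is additive (add-alpha) smoothing: a pseudo-count is added to every count so that a reference count of zero no longer makes the ratio undefined. In the sketch below, `smoothed_log_ratio` and `alpha` are hypothetical; the totals come from section 7.1, the vocabulary size is the 20591 reference words plus the 441 input-only words, and 33 is the input count of `kluge` from section 6.2.

```python
from math import log


def smoothed_log_ratio(count_input, n_input, count_ref, n_ref, vocab_size, alpha=1.0):
    """Log ratio of add-alpha smoothed relative frequencies (input vs. reference)."""
    p_input = (count_input + alpha) / (n_input + alpha * vocab_size)
    p_ref = (count_ref + alpha) / (n_ref + alpha * vocab_size)
    return log(p_input / p_ref)


N_INPUT, N_ENGLISH = 12668, 679519   # totals from section 7.1
VOCAB_SIZE = 20591 + 441             # reference words plus input-only words

# 'kluge' occurs 33 times in the input and is absent from the filtered reference.
kluge_score = smoothed_log_ratio(33, N_INPUT, 0, N_ENGLISH, VOCAB_SIZE)
print(kluge_score > 0)
```

Smoothed scores are not directly comparable with the unsmoothed scores of section 9 unless the common words are rescored with the same function.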


## 8. Stemming (if needed)

In [31]:
from nltk.stem.porter import PorterStemmer
stemmer = PorterStemmer()
for k,v in input_freq_dist.items():
    stemmed = stemmer.stem(k)
    if stemmed != k: print(k, "->", stemmed)

uses -> use
disambiguation -> disambigu
redirects -> redirect
article -> articl
multiple -> multipl
issues -> issu
please -> pleas
improve -> improv
remove -> remov
template -> templat
possibly -> possibl
contains -> contain
original -> origin
verifying -> verifi
claims -> claim
adding -> ad
inline -> inlin
citations -> citat
statements -> statement
consisting -> consist
removed -> remov
november -> novemb
message -> messag
needs -> need
additional -> addit
verification -> verif
reliable -> reliabl
sources -> sourc
unsourced -> unsourc
material -> materi
challenged -> challeng
september -> septemb
series -> seri
individualism -> individu
topics -> topic
concepts -> concept
autonomy -> autonomi
rights -> right
individual -> individu
reclamation -> reclam
liberty -> liberti
negative -> neg
personal -> person
property -> properti
positive -> posit
private -> privat
lysander -> lysand
henry -> henri
james -> jame
anarchism -> anarch
anarcho-capitalism -> anarcho-capit
liberalism -> liber
f

distribution -> distribut
consume -> consum
panels -> panel
generating -> gener
electricity -> electr
gas -> ga
innovation -> innov
programme -> programm
leisure -> leisur
pursuits -> pursuit
initial -> initi
combination -> combin
hobbies -> hobbi
rising -> rise
profession -> profess
serious -> seriou
cooking -> cook
dedicated -> dedic
photography -> photographi
trends -> trend
factors -> factor
disposable -> dispos
sectors -> sector
prices -> price
towards -> toward
amateurs -> amateur
beginning -> begin
1980s -> 1980
forums -> forum
experience -> experi
considered -> consid
fence -> fenc
bells -> bell
expanded -> expand
challenge -> challeng
anticipated -> anticip
goods -> good
services -> servic
furthermore -> furthermor
forces -> forc
developed -> develop
marketing -> market
politics -> polit
schools -> school
infoanarchism -> infoanarch
philosophical -> philosoph
theory -> theori
practice -> practic
struggle -> struggl
democracy -> democraci
ecology -> ecolog
association -> associ

kincheloe -> kinchelo
denote -> denot
employed -> employ
foundation -> foundat
researchers -> research
rigorous -> rigor
provide -> provid
sophisticated -> sophist
understanding -> understand
capable -> capabl
sis -> si
competitive -> competit
advantage -> advantag
tinkering -> tinker
allowing -> allow
turkle -> turkl
workspace -> workspac
productivity -> product
advocates -> advoc
conventional -> convent
variety -> varieti
enables -> enabl
fully -> fulli
tasks -> task
successfully -> success
served -> serv
incorporating -> incorpor
purposes -> purpos
candy -> candi
immediately -> immedi
children's -> children'
weapons -> weapon
operators -> oper
hams -> ham
resistors -> resistor
capacitors -> capacitor
screws -> screw
nuts -> nut
bolts -> bolt
homebrewers -> homebrew
boxes -> box
homebrewing -> homebrew
keeping -> keep
provides -> provid
repairs -> repair
removing -> remov
quantities -> quantiti
treasure -> treasur
tracking -> track
commercially -> commerci
gauge -> gaug
entered -> en
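
If stemming is adopted, the counts of words that share a stem would have to be merged. The sketch below shows one way to do that with any stemming function. `merge_by_stem` is hypothetical, and `naive_stem` is only a stand-in so the example runs without NLTK; in the notebook one would pass `stemmer.stem` instead. The four counts are taken from the output in section 6.2.

```python
from collections import Counter


def merge_by_stem(freq, stem):
    """Sum the counts of all words that map to the same stem."""
    merged = Counter()
    for word, count in freq.items():
        merged[stem(word)] += count
    return merged


def naive_stem(word):
    """Demonstration-only stand-in for a real stemmer: strips one trailing 's'."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


counts = {"hackerspaces": 44, "hackerspace": 25, "projects": 38, "project": 49}
merged = merge_by_stem(counts, naive_stem)
print(merged)
```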

## 9. Computing representation power of common words.

In [32]:
# Score every word shared by the input and the reference corpus.
makerness = {}
# common_words = [w[0] for w in common_words]
for w in common_words:
    # Consider only words longer than one character.
    if len(w) > 1:
        # Log ratio of the word's relative frequency in the input corpus
        # to its relative frequency in the reference corpus:
        score = log((input_freq_dist[w] / n_input) / (english_freq_dist[w] / n_english))
        makerness[w] = score
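
The score is the natural logarithm of the ratio between a word's relative frequency in the input and in the reference corpus: positive values mean the word is over-represented in the input, negative values the opposite. As a worked example, `additive` occurs 29 times in the input (section 6.2). Its reference count is not printed in this notebook; a count of 3 is assumed here because it is the minimum that passes the rare-word filter and it reproduces the score listed for `additive` below. `log_ratio` is a hypothetical helper.

```python
from math import log


def log_ratio(count_input, n_input, count_ref, n_ref):
    """Natural log of the ratio of relative frequencies (input vs. reference)."""
    return log((count_input / n_input) / (count_ref / n_ref))


N_INPUT, N_ENGLISH = 12668, 679519   # totals from section 7.1

# 29 is the input count of 'additive'; the reference count 3 is an assumption.
additive_score = log_ratio(29, N_INPUT, 3, N_ENGLISH)
print(round(additive_score, 4))

# A word with the same relative frequency in both corpora scores exactly 0.
print(log_ratio(10, 1000, 10, 1000))
```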

In [33]:
common_words

['previously',
 'marks',
 'door',
 'discourse',
 'chains',
 'even',
 'office',
 'fundamental',
 'address',
 'patents',
 'radical',
 'candy',
 'carpentry',
 'swimming',
 'founding',
 'far',
 'course',
 'economy',
 'welcome',
 'firms',
 'whether',
 'politics',
 'fields',
 'drawn',
 'writing',
 'advanced',
 'covers',
 'against',
 'rig',
 'experimentation',
 'meets',
 'handy',
 'essentially',
 'pieces',
 'sponsored',
 'cutters',
 'serve',
 'inefficient',
 'business',
 'traditional',
 'code',
 'influence',
 'acquisition',
 'set',
 'scanning',
 'opened',
 'akin',
 'university',
 'fence',
 'mess',
 "'",
 'giving',
 'integrate',
 'notes',
 'trusting',
 'areas',
 'melting',
 'smaller',
 'essay',
 'possess',
 'military',
 'journals',
 'album',
 'yes',
 'matter',
 'wider',
 'molding',
 'bent',
 'primary',
 'reconstruction',
 'reporter',
 'nail',
 'simple',
 'wages',
 'august',
 'unpaid',
 'complete',
 'cause',
 'version',
 'photography',
 'teaching',
 'discussing',
 'assessment',
 'application',


In [34]:
# Sorting by scores:
for k,v in sorted(makerness.items(), key=lambda x:x[1], reverse=True): print(k,v)

additive 6.250989607578818
printer 5.879426051146336
printing 5.635229090634293
digital 5.4094224219006
franchise 5.1454568760661346
users 5.0809183549285635
global 5.0809183549285635
citation 5.011925483441612
bending 4.980834896371581
deposition 4.829603926647658
do-it-yourself 4.7707634266247245
manufacturing 4.736077868636834
homeowners 4.6754532468204
computers 4.6754532468204
non-profit 4.6754532468204
template 4.6754532468204
hardware 4.6754532468204
evolutionary 4.675453246820399
junk 4.675453246820399
jargon 4.675453246820399
layer 4.632893632401603
fabrication 4.610914725682828
bug 4.541921854195876
hack 4.493131690026445
lab 4.493131690026445
zealand 4.493131690026445
coined 4.493131690026445
individualist 4.493131690026445
portal 4.493131690026445
computer 4.461879146522341
media 4.461879146522341
commons 4.387771174368618
consumers 4.3500308463857715
maker 4.33061276052867
circuit 4.28079905481645
surname 4.269988138712235
enthusiasts 4.269988138712235
eric 4.2699881387122

restricted 2.372868153826354
drums 2.372868153826354
collage 2.372868153826354
supports 2.372868153826354
skilled 2.372868153826354
learning 2.372868153826354
discourse 2.3728681538263534
akin 2.3728681538263534
launch 2.3728681538263534
employing 2.3728681538263534
diminished 2.3728681538263534
museums 2.3728681538263534
descriptions 2.3728681538263534
vent 2.3728681538263534
clothing 2.3728681538263534
limitation 2.3728681538263534
cites 2.3728681538263534
forecast 2.3728681538263534
insulation 2.3728681538263534
merge 2.3728681538263534
founder 2.3728681538263534
purchased 2.3728681538263534
invention 2.3728681538263534
miscellaneous 2.3728681538263534
mister 2.3728681538263534
instructions 2.3728681538263534
emma 2.3728681538263534
describe 2.348175541235982
contest 2.3336474406730723
perspective 2.3336474406730723
smart 2.3240779896569217
michigan 2.3240779896569217
availability 2.3240779896569217
remove 2.319758328512405
feature 2.317298302671543
founding 2.3083296326887823
satis

useful 1.0204753443821444
liked 1.0204753443821444
neighborhood 1.0204753443821444
unique 1.0204753443821444
status 1.0170330001911714
furthermore 1.011891600690753
defined 1.011891600690753
joint 1.011891600690753
tv 1.011891600690753
holes 1.011891600690753
fast 1.011891600690753
controlled 1.011891600690753
finish 1.011891600690753
exist 1.0033809110228442
parties 1.0033809110228442
british 1.0033809110228442
reduced 0.999152574913323
change 0.9865737927064631
chemical 0.9865737927064631
described 0.9865737927064631
handsome 0.9865737927064628
bob 0.9865737927064628
existed 0.9865737927064628
proof 0.9865737927064628
regarding 0.9865737927064628
automatic 0.9865737927064628
according 0.9865737927064628
studying 0.9865737927064628
exception 0.9865737927064628
campaign 0.9741512727079058
due 0.9723891577145065
showing 0.9700444907552523
supply 0.9667711654102833
opposed 0.9618811801160915
spending 0.9618811801160915
today's 0.9618811801160915
teach 0.9618811801160915
societies 0.96188

thought -0.8794424471280845
probably -0.8890671605022941
city -0.8928912569406976
next -0.8954325543693701
asked -0.9055336503558741
government -0.9545630775961926
yet -0.9569525649935741
quite -0.9629014225133465
act -0.9699936508228383
far -0.9735209913408067
enough -0.9828668537590445
long -1.0309923452552854
three -1.0448585297870123
course -1.0611190506587926
turned -1.0928677489733731
country -1.1052902689719302
mind -1.1083719355093382
want -1.1175603615637444
thing -1.1326892431600444
say -1.1416579131428048
children -1.1966645426550166
said -1.2010084871375217
felt -1.2022825349592399
men -1.2686576039041124
going -1.3135081700694642
must -1.328927525554129
right -1.3374465810076477
nothing -1.3431399676758353
against -1.360028185704357
think -1.3952844811820906
away -1.4470395626939867
always -1.4514159372937856
went -1.553057756770784
go -1.7638971242796992
last -1.840739829222565
man -2.0149748661689677
state -2.0178704214495493
little -2.047176548035049


In [35]:
with open('makerness_diy.csv', 'w', newline='') as csvfile:
    thewriter = csv.writer(csvfile, delimiter=',')
    for k,v in sorted(makerness.items(), key=lambda x:x[1], reverse=True):
        thewriter.writerow([k,v])