![Ethereum Wallet loading screen](https://i.imgur.com/BDs8lGK.png)

# ETHPrize dev interviews analysis

Using manually tagged data

In [1]:
import pandas as pd

## Loading data

In [2]:
df = pd.read_csv('./ETHPrize Tagged Data - 20180923.csv')

In [3]:
# View a snippet of the data
df.head(10)

Unnamed: 0,Name,Who are you and what are you working on?,Topics,Projects,What are your biggest frustrations?,Topics.1,Projects.1,What tools don’t exist at the moment?,Topics.2,Projects.2,...,Projects.10,What are you most excited about in the short term?,Topics.11,Projects.11,Who are the other people you think we should talk to?,Topics.12,Projects.12,Are there any other questions we should be asking?,Topics.13,Projects.13
0,﻿Fabio Berger + Remco Bloemen,0x - Decentralized exchange protocol. It is a ...,"exchange, protocol, standards, transaction, ne...",0x,Getting a simple experimental environment up i...,"experimentation, tracing, profiling, code cove...","Solidity, Remix",,,,...,,,,,,,,,,
1,﻿Leo Logvinov,"Started in blockchain 2 years ago in Berlin, w...","usability, errors","IntelliJ, 0x","Event watching - unreliable, no support for ba...","events, ABI, contracts, debugger, standards, c...","Truffle, Solidity, IntelliJ, Ganache, LLL",Prettier type plugin for solidity. I don’t hav...,,Solidity,...,"Ganache, Solidity, EthereumJS-blockstream",,,,,,,,,
2,﻿Axel Ericsson,I have built 1Protocol\nIt lets smart contract...,"contracts, stake signing, signatures, tokens, ...","MEW, Raiden, 1Protocol",,,,There is no tooling or anything related to sta...,state channels,MEW,...,,Raiden is the golden egg in the space. The exp...,,"Raiden, Ethereum, Counterfactual",,,,,,
3,﻿Mike Goldin,Software developer at Consensys\nKnown for Tok...,"Token Curated Registries, TCRs, design, money,...","Consensys, AdChain",Truffle’s debugger is a bit disappointing. Wor...,"debugger, contracts, proxy, data, production, ...","Truffle, EthPM",Fuzz Testing and formal verification desired.\n,"fuzz testing, formal verification",,...,,Excited for Casper\nApplications implemented i...,state channels,"Casper, Plasma","Infura team, client development\nSpankchain - ...",,"Infura, Spankchain",,,
4,﻿Oleksii,Started working with smart contracts in early ...,"contracts, scalability, tokens, logic, deploym...","Ambisafe, Truffle",Our original vision was to do everything – tes...,"snapshot, deployment, contracts, deployment, t...","built_our_own, Truffle, Testrpc, Remix, Securify",,,,...,"Ethercamp, 4byte.io",,,,,,,,,
5,﻿Brett Sun,Working on Aragon entirely.\nThe end goal is t...,"netowkr, organisation, protocol, money, permis...","Aragon, Ethereum",My biggest general frustration comes with the ...,"security, productivity, ecosystem, best practi...",Ethereum,A nice debugger! Please…\nMore infrastructure ...,"debugger, infrastructure, caching, dapps, events",,...,,,,,,,,,,
6,﻿Jorge Izquierdo,Aragon - Decentralized Governance platform\nWe...,"governance, language, natspec, dapps, Ruby, up...",Aragon,,,,,,,...,"Augur, Solidity, EthereumJS-blockstream",,,,,,,,,
7,﻿Jack Peterson and Sparkle,,,,Lack of a debugger - by far the biggest issue....,"debugger, transactions",,Setting breakpoints in tests!\nSalesforce Deve...,"bounties, documentation, gas, yellow paper","Salesforce, Geth, React, IPFS, EVM, Kauri, Eth...",...,"Augur, Raiden, Airbit",Raiden is coming shortly and will be very cool...,,"Raiden, Proof of Steak, uPort, Augur",,,,,,
8,﻿Joey Krug,Co-Chief Investment Officer at Pantera Capital...,,"Pantera Capital, Augur","At the end of the day, it all comes down to th...","scalability, payments, state channels","Ethereum, L4, EOS","My answers changed over a time, 1 year ago it ...",static analysis,,...,"Ethereum, Bitcoin, Geth, parity","In the short term, he’s most excited about Mak...","stability, scalability, sharding",MakerDAO,,,,,,
9,﻿Mark Beylin,Creator of the Bounties Network. Bounties on a...,"bounties, websockets, transactions","Bounties.network, Ethereum, IPFS, Solidity, Me...",Not being able to upgrade my contracts easily....,"upgradability, data, gas, gas limit, deploymen...",IPFS,Better querying possibilities on the state of ...,"state, contracts, community",,...,,Sharding. That’s all I want.\nuPort is getting...,sharding,uPort,Joseph Vander (learning the be a solidity deve...,,"Gitcoin, Solidity",Curious to know about developer incentives. As...,"incentives, money",Augur


## Reformatting the columns for multi-labelling

The topics and projects can be applied to all questions, so we can train two common models,
one for topics and one for projects, instead of a separate model for each question.

We will also ignore the questions with no answers.

We need to keep the original (row, name, question) triple so that we can map between the "natural" representation
and the one suited to automated multi-labelling.

In summary, we will convert the representation as follows:

```
   Name  | Q0   T0  P0  Q1    T1  P1        Obs  OldRow Name  Question Answer Topics Projects
0  Alice | foo  a,b  x   oof   b   y  ==>     0    0    Alice  Q0      foo     a,b      x
1  Bob   | bar   b  x,y  rab   c   z          1    0    Alice  Q1      oof       b      y
2  Eve   | baz   a   y   zab   a   x          2    1    Bob    Q0      bar       b    x,y
                                              3    1    Bob    Q1      rab       c      z
                                              4    2    Eve    Q0      baz       a      y
                                              5    2    Eve    Q1      zab       a      x
```
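As a rough illustration of the target layout, the conversion in the diagram can be written out by hand on the toy data. This is only a sketch: the column names `Q0`, `T0`, `P0` and so on are the hypothetical ones from the diagram, not the real ETHPrize headers, and the explicit loop is for readability rather than speed.

```python
import pandas as pd

# Toy wide table copied from the diagram (hypothetical data, not the ETHPrize file)
wide = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Eve'],
    'Q0': ['foo', 'bar', 'baz'],
    'T0': ['a,b', 'b', 'a'],
    'P0': ['x', 'x,y', 'y'],
    'Q1': ['oof', 'rab', 'zab'],
    'T1': ['b', 'c', 'a'],
    'P1': ['y', 'z', 'x'],
})

# One output row per (person, question), remembering the original row number
rows = []
for old_row, rec in wide.iterrows():
    for q in ('0', '1'):
        rows.append({
            'OldRow': old_row,
            'Name': rec['Name'],
            'Question': 'Q' + q,
            'Answer': rec['Q' + q],
            'Topics': rec['T' + q],
            'Projects': rec['P' + q],
        })

long = pd.DataFrame(rows)
print(long)
```

Each of the 3 people contributes one row per question, so the long table has 3 x 2 = 6 observations.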

We have 43 columns: the first one is the name, and the remaining 42 form 14 (answer, topics, projects) triplets.

### First, besides the name, we need to group the columns

In [4]:
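def getMultiCol(columns):
    """Group the flat columns into (question, kind) pairs; expects 'Name' first, then triplets."""
    length = len(columns)
    qtp = [('Name', 'Name')] # questions, topics, projects
    # Each question spans 3 consecutive columns: answer, topics, projects
    for col_idx in range(1, length, 3):
        qtp.append((columns[col_idx], 'Answer'))
        qtp.append((columns[col_idx], 'Topics'))
        qtp.append((columns[col_idx], 'Projects'))
    # Level names match the headers shown in the outputs below
    return pd.MultiIndex.from_tuples(qtp, names=('Questions', 'Answers_Topics_Projects'))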
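
To see what this grouping buys us, here is a minimal sketch on a hypothetical 7-column header (one name plus two question triplets). The labels `Q0` and `Q1` and the single data row are made up for illustration; the pairing logic is the same as in the function above.

```python
import pandas as pd

# Hypothetical flat header: 'Name' followed by (answer, topics, projects) triplets
flat = ['Name', 'Q0', 'Topics', 'Projects', 'Q1', 'Topics.1', 'Projects.1']
assert (len(flat) - 1) % 3 == 0  # the layout only makes sense with whole triplets

# Pair every column with the question text that starts its triplet
pairs = [('Name', 'Name')]
for i in range(1, len(flat), 3):
    for kind in ('Answer', 'Topics', 'Projects'):
        pairs.append((flat[i], kind))

columns = pd.MultiIndex.from_tuples(
    pairs, names=('Questions', 'Answers_Topics_Projects'))

toy = pd.DataFrame([['Alice', 'foo', 'a,b', 'x', 'oof', 'b', 'y']], columns=columns)

# Selecting a question now returns its three sub-columns at once
print(toy['Q1'])
```

With two-level columns, a question can be selected as a block, and a single cell is addressed with a tuple such as `toy[('Q0', 'Topics')]`.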
        

In [5]:
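# Check the column grouping only: the values show as NaN because passing
# `columns=` selects columns by label and none of the new labels exist in df yet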
pd.DataFrame(df.drop('Name', axis = 1), columns = getMultiCol(df.columns))

Questions,Name,Who are you and what are you working on?,Who are you and what are you working on?,Who are you and what are you working on?,What are your biggest frustrations?,What are your biggest frustrations?,What are your biggest frustrations?,What tools don’t exist at the moment?,What tools don’t exist at the moment?,What tools don’t exist at the moment?,...,Other domain specific questions?,What are you most excited about in the short term?,What are you most excited about in the short term?,What are you most excited about in the short term?,Who are the other people you think we should talk to?,Who are the other people you think we should talk to?,Who are the other people you think we should talk to?,Are there any other questions we should be asking?,Are there any other questions we should be asking?,Are there any other questions we should be asking?
Answers_Topics_Projects,Name,Answer,Topics,Projects,Answer,Topics,Projects,Answer,Topics,Projects,...,Projects,Answer,Topics,Projects,Answer,Topics,Projects,Answer,Topics,Projects
0,,,,,,,,,,,...,,,,,,,,,,
1,,,,,,,,,,,...,,,,,,,,,,
2,,,,,,,,,,,...,,,,,,,,,,
3,,,,,,,,,,,...,,,,,,,,,,
4,,,,,,,,,,,...,,,,,,,,,,
5,,,,,,,,,,,...,,,,,,,,,,
6,,,,,,,,,,,...,,,,,,,,,,
7,,,,,,,,,,,...,,,,,,,,,,
8,,,,,,,,,,,...,,,,,,,,,,
9,,,,,,,,,,,...,,,,,,,,,,


In [6]:
# Swap in the grouped column labels; here we transpose back and forth to reuse set_index
df = df.T.set_index(getMultiCol(df.columns)).T
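
The double transpose is not the only way to relabel columns: the `columns` attribute of a DataFrame can also be assigned directly. The sketch below compares both routes on a hypothetical two-row frame (the names `toy`, `via_transpose` and `via_assign` are illustrative only) and suggests they give the same result.

```python
import pandas as pd

# Hypothetical flat frame with one question triplet
toy = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Q0': ['foo', 'bar'],
    'Topics': ['a,b', 'b'],
    'Projects': ['x', 'x,y'],
})

new_cols = pd.MultiIndex.from_tuples(
    [('Name', 'Name'), ('Q0', 'Answer'), ('Q0', 'Topics'), ('Q0', 'Projects')],
    names=('Questions', 'Answers_Topics_Projects'))

# Route 1: transpose, set the row index, transpose back
via_transpose = toy.T.set_index(new_cols).T

# Route 2: assign the new labels to a copy directly
via_assign = toy.copy()
via_assign.columns = new_cols

print(via_assign)
```

Direct assignment avoids two transposes and leaves the original frame untouched when applied to a copy.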

In [7]:
# Check
df.head(10)

Questions,Name,Who are you and what are you working on?,Who are you and what are you working on?,Who are you and what are you working on?,What are your biggest frustrations?,What are your biggest frustrations?,What are your biggest frustrations?,What tools don’t exist at the moment?,What tools don’t exist at the moment?,What tools don’t exist at the moment?,...,Other domain specific questions?,What are you most excited about in the short term?,What are you most excited about in the short term?,What are you most excited about in the short term?,Who are the other people you think we should talk to?,Who are the other people you think we should talk to?,Who are the other people you think we should talk to?,Are there any other questions we should be asking?,Are there any other questions we should be asking?,Are there any other questions we should be asking?
Answers_Topics_Projects,Name,Answer,Topics,Projects,Answer,Topics,Projects,Answer,Topics,Projects,...,Projects,Answer,Topics,Projects,Answer,Topics,Projects,Answer,Topics,Projects
0,﻿Fabio Berger + Remco Bloemen,0x - Decentralized exchange protocol. It is a ...,"exchange, protocol, standards, transaction, ne...",0x,Getting a simple experimental environment up i...,"experimentation, tracing, profiling, code cove...","Solidity, Remix",,,,...,,,,,,,,,,
1,﻿Leo Logvinov,"Started in blockchain 2 years ago in Berlin, w...","usability, errors","IntelliJ, 0x","Event watching - unreliable, no support for ba...","events, ABI, contracts, debugger, standards, c...","Truffle, Solidity, IntelliJ, Ganache, LLL",Prettier type plugin for solidity. I don’t hav...,,Solidity,...,"Ganache, Solidity, EthereumJS-blockstream",,,,,,,,,
2,﻿Axel Ericsson,I have built 1Protocol\nIt lets smart contract...,"contracts, stake signing, signatures, tokens, ...","MEW, Raiden, 1Protocol",,,,There is no tooling or anything related to sta...,state channels,MEW,...,,Raiden is the golden egg in the space. The exp...,,"Raiden, Ethereum, Counterfactual",,,,,,
3,﻿Mike Goldin,Software developer at Consensys\nKnown for Tok...,"Token Curated Registries, TCRs, design, money,...","Consensys, AdChain",Truffle’s debugger is a bit disappointing. Wor...,"debugger, contracts, proxy, data, production, ...","Truffle, EthPM",Fuzz Testing and formal verification desired.\n,"fuzz testing, formal verification",,...,,Excited for Casper\nApplications implemented i...,state channels,"Casper, Plasma","Infura team, client development\nSpankchain - ...",,"Infura, Spankchain",,,
4,﻿Oleksii,Started working with smart contracts in early ...,"contracts, scalability, tokens, logic, deploym...","Ambisafe, Truffle",Our original vision was to do everything – tes...,"snapshot, deployment, contracts, deployment, t...","built_our_own, Truffle, Testrpc, Remix, Securify",,,,...,"Ethercamp, 4byte.io",,,,,,,,,
5,﻿Brett Sun,Working on Aragon entirely.\nThe end goal is t...,"netowkr, organisation, protocol, money, permis...","Aragon, Ethereum",My biggest general frustration comes with the ...,"security, productivity, ecosystem, best practi...",Ethereum,A nice debugger! Please…\nMore infrastructure ...,"debugger, infrastructure, caching, dapps, events",,...,,,,,,,,,,
6,﻿Jorge Izquierdo,Aragon - Decentralized Governance platform\nWe...,"governance, language, natspec, dapps, Ruby, up...",Aragon,,,,,,,...,"Augur, Solidity, EthereumJS-blockstream",,,,,,,,,
7,﻿Jack Peterson and Sparkle,,,,Lack of a debugger - by far the biggest issue....,"debugger, transactions",,Setting breakpoints in tests!\nSalesforce Deve...,"bounties, documentation, gas, yellow paper","Salesforce, Geth, React, IPFS, EVM, Kauri, Eth...",...,"Augur, Raiden, Airbit",Raiden is coming shortly and will be very cool...,,"Raiden, Proof of Steak, uPort, Augur",,,,,,
8,﻿Joey Krug,Co-Chief Investment Officer at Pantera Capital...,,"Pantera Capital, Augur","At the end of the day, it all comes down to th...","scalability, payments, state channels","Ethereum, L4, EOS","My answers changed over a time, 1 year ago it ...",static analysis,,...,"Ethereum, Bitcoin, Geth, parity","In the short term, he’s most excited about Mak...","stability, scalability, sharding",MakerDAO,,,,,,
9,﻿Mark Beylin,Creator of the Bounties Network. Bounties on a...,"bounties, websockets, transactions","Bounties.network, Ethereum, IPFS, Solidity, Me...",Not being able to upgrade my contracts easily....,"upgradability, data, gas, gas limit, deploymen...",IPFS,Better querying possibilities on the state of ...,"state, contracts, community",,...,,Sharding. That’s all I want.\nuPort is getting...,sharding,uPort,Joseph Vander (learning the be a solidity deve...,,"Gitcoin, Solidity",Curious to know about developer incentives. As...,"incentives, money",Augur


### Second, unpivot (melt) the dataframe into our desired structure

In [17]:
# Keep the (Questions, Name) index levels as columns: both are needed in the next step
cleaned = df.set_index('Name').unstack().unstack(level=1).reset_index()
cleaned

Answers_Topics_Projects,Questions,Name,Answer,Projects,Topics
0,Are there any other questions we should be ask...,"(﻿Fabio Berger + Remco Bloemen ,)",,,
1,Are there any other questions we should be ask...,"(﻿Leo Logvinov,)",,,
2,Are there any other questions we should be ask...,"(﻿Axel Ericsson,)",,,
3,Are there any other questions we should be ask...,"(﻿Mike Goldin,)",,,
4,Are there any other questions we should be ask...,"(﻿Oleksii,)",,,
5,Are there any other questions we should be ask...,"(﻿Brett Sun,)",,,
6,Are there any other questions we should be ask...,"(﻿Jorge Izquierdo,)",,,
7,Are there any other questions we should be ask...,"(﻿Jack Peterson and Sparkle,)",,,
8,Are there any other questions we should be ask...,"(﻿Joey Krug,)",,,
9,Are there any other questions we should be ask...,"(﻿Mark Beylin,)",Curious to know about developer incentives. As...,Augur,"incentives, money"
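
The two chained `unstack` calls are easier to follow on the Alice/Bob/Eve toy data. The sketch below is an assumption-laden simplification: it places `Name` directly in a plain row index, so it does not depend on how `set_index('Name')` behaves with two-level columns, and the names come out as plain strings rather than 1-tuples.

```python
import pandas as pd

# Two-level columns: (question, kind), as produced by the grouping step
columns = pd.MultiIndex.from_product(
    [['Q0', 'Q1'], ['Answer', 'Topics', 'Projects']],
    names=('Questions', 'Answers_Topics_Projects'))

# Hypothetical toy data, with Name already used as the row index
wide = pd.DataFrame(
    [['foo', 'a,b', 'x', 'oof', 'b', 'y'],
     ['bar', 'b', 'x,y', 'rab', 'c', 'z'],
     ['baz', 'a', 'y', 'zab', 'a', 'x']],
    index=pd.Index(['Alice', 'Bob', 'Eve'], name='Name'),
    columns=columns)

# 1st unstack: everything becomes one Series indexed by (Questions, kind, Name)
as_series = wide.unstack()

# 2nd unstack: move the kind level (level 1) back into columns
long = as_series.unstack(level=1).reset_index()
print(long)
```

The first `unstack` flattens the whole frame into one long Series, and the second spreads only the Answer/Topics/Projects level back out, leaving one row per (question, person) pair.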


The "Name" column still holds 1-tuples left over from the multi-level columns, so we unwrap them and then sort the dataframe.

In [18]:
cleaned['Name'] = cleaned['Name'].str.get(0)
cleaned = cleaned.reindex(columns = ['Name', 'Questions', 'Answer', 'Topics', 'Projects'])
cleaned = cleaned.sort_values(['Name', 'Questions']).reset_index(drop=True)
cleaned

Answers_Topics_Projects,index,Name,Questions,Answer,Topics,Projects
0,56,Christopher Brown,Are there any other questions we should be ask...,,,
1,160,Christopher Brown,How do you handle smart contract verification ...,,,
2,264,Christopher Brown,How do you handle testing?,Just Truffle for tests\nMocha for unit and fun...,"unit tests, testing, functional tests, contact...","Truffle, Mocha, Mthril"
3,368,Christopher Brown,Other bounties?,,,
4,472,Christopher Brown,Other domain specific questions?,,,
5,576,Christopher Brown,Was anything easier than expected?,,,
6,680,Christopher Brown,What are the best educational resources?,,,
7,784,Christopher Brown,What are the tools/libraries/frameworks you use?,"Truffle for building, testing and compiling\nC...","testing, compile, deployment, integration, con...","Open Zeppelin, Truffle, Ganache, Ethereum, not..."
8,888,Christopher Brown,What are you most excited about in the short t...,Proof of Stake overlays will be really interes...,,"Proof of Stake, Casper, eWASM"
9,992,Christopher Brown,What are your biggest frustrations?,,,
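
The table above still contains rows where no answer was given. Since questions with no answers are to be ignored, one possible way to drop them is `dropna` restricted to the `Answer` column. This is a sketch on hypothetical toy rows, not necessarily the filtering used later in the analysis.

```python
import pandas as pd

# Hypothetical long-format rows; None marks a missing value
toy = pd.DataFrame({
    'Name': ['Alice', 'Alice', 'Bob'],
    'Questions': ['Q0', 'Q1', 'Q0'],
    'Answer': ['foo', None, 'bar'],
    'Topics': ['a,b', None, None],
    'Projects': ['x', None, 'x,y'],
})

# Drop only the rows without an answer; answered rows with missing tags are kept
answered = toy.dropna(subset=['Answer']).reset_index(drop=True)
print(answered)
```

Restricting `subset` to `Answer` matters because some answered rows have no topics or no projects, and those rows should survive the filter.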
