# Who Is J?

One of the main goals of the ‘Yes We Tech’ community is to help create an inclusive space where we can celebrate diversity, give visibility to women in tech, and ensure that everybody has an equal chance to learn, share and enjoy technology-related disciplines.

As co-organisers of the JOTB event, we are happy to see that the number of women speakers has more than doubled this year, from 5 to 11, representing 27.5% of the total.

But is this diversity enough? How can we know that we have succeeded in our goal? And, more importantly, how can we make future editions more diverse?

## Analysing JOTB diversity network 

The work that we are sharing here talks about two things: one is a story about data and the other one is a story about people.

The data is pretty simple and very straightforward. We'll start by importing some libraries to help us analyse and visualise it.

In [328]:
import pandas as pd
import numpy as np
import scipy as sp
from iplotter import GCPlotter

plotter = GCPlotter()

### Small data analysis

The data says that in 2016, J engaged more than 40 speakers and 350 attendees in this big data thing.

Unfortunately, since it was our first event, we didn't collect enough information about participation, so we only have approximate data and we don't know how attendees were distributed across companies.

In [329]:
data2016 = pd.read_csv('../input/small_data_2016.csv')
data2016

Unnamed: 0,Tribe,Women,Men,Total
0,speakers,5,43,48
1,attendees,105,245,350
2,independent,0,0,0
3,copmany_teams,0,0,0
4,company_teams_no_women,0,0,0
5,hackathon,0,0,0


This year there are 40 speakers and nearly 400 attendees, big data is bigger than ever, and the event includes workshops and a hackathon.

In [281]:
data2017 = pd.read_csv('../input/small_data_2017.csv')
data2017

Unnamed: 0,Tribe,Women,Men,Total
0,speakers,11,29,40
1,attendees,36,332,368
2,independent,6,65,71
3,copmany_teams,30,267,297
4,company_teams_no_women,0,134,134
5,hackathon,4,21,25


The more the better, right? But there are more numbers behind those: numbers that show a slight sign of diversity.

This year **27.5%** of the speakers at J are women, compared with roughly **10.4%** last year.
However, and this is the worrying thing, the participation of women as attendees has dropped from an acceptable **30%** to a disappointing **9.8%**.


In [331]:
# Share of women and men per tribe, in percent
data2017['Women Rate'] = data2017['Women'] * 100 / data2017['Total']
data2017['Men Rate'] = data2017['Men'] * 100 / data2017['Total']
data2017

Unnamed: 0,Tribe,Women,Men,Total,Women Rate,Men Rate
0,speakers,11,29,40,27.5,72.5
1,attendees,36,332,368,9.782609,90.217391
2,independent,6,65,71,8.450704,91.549296
3,copmany_teams,30,267,297,10.10101,89.89899
4,company_teams_no_women,0,134,134,0.0,100.0
5,hackathon,4,21,25,16.0,84.0


In [332]:
# Same rates for 2016; tribes with a total of 0 come out as NaN
data2016['Women Rate'] = data2016['Women'] * 100 / data2016['Total']
data2016['Men Rate'] = data2016['Men'] * 100 / data2016['Total']
data2016

Unnamed: 0,Tribe,Women,Men,Total,Women Rate,Men Rate
0,speakers,5,43,48,10.416667,89.583333
1,attendees,105,245,350,30.0,70.0
2,independent,0,0,0,,
3,copmany_teams,0,0,0,,
4,company_teams_no_women,0,0,0,,
5,hackathon,0,0,0,,
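
The empty cells in the 2016 table are `NaN` values: pandas yields `NaN` when a zero count is divided by a zero total. A minimal sketch, using a three-row excerpt of the 2016 data rebuilt by hand (the `known` name is only illustrative), of how such rows could be dropped before comparing or plotting rates:

```python
import pandas as pd

# Three rows copied by hand from the 2016 table (illustrative excerpt).
data2016 = pd.DataFrame({
    "Tribe": ["speakers", "attendees", "independent"],
    "Women": [5, 105, 0],
    "Total": [48, 350, 0],
})

# 0 / 0 gives NaN, which is why tribes without data show empty rates.
data2016["Women Rate"] = data2016["Women"] * 100 / data2016["Total"]

# Keep only the tribes for which a rate could actually be computed.
known = data2016.dropna(subset=["Women Rate"])
print(known)
```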


In [340]:
data = [
    ['Tribe', 'Women', 'Men', {"role": 'annotation'}],
    ['2016', data2016['Women Rate'][0], data2016['Men Rate'][0],''],
    ['2017', data2017['Women Rate'][0], data2017['Men Rate'][0],''],
]
options = {
    "title": 'Speakers at JOTB',
    "width": 600,
    "height": 400,
    "legend": {"position": 'top', "maxLines": 3},
    "bar": {"groupWidth": '50%'},
    "isStacked": "true",
    "colors": ['#984e9e', '#ed1c40'],
}

plotter.plot(data,chart_type='ColumnChart',chart_package='corechart', options=options)

In [341]:
data = [
    ['Tribe', 'Women', 'Men', {"role": 'annotation'}],
    ['2016', data2016['Women Rate'][1], data2016['Men Rate'][1],''],
    ['2017', data2017['Women Rate'][1], data2017['Men Rate'][1],''],
]
options = {
    "title": 'Attendees at JOTB',
    "width": 600,
    "height": 400,
    "legend": {"position": 'top', "maxLines": 3},
    "bar": {"groupWidth": '55%'},
    "isStacked": "true",
    "colors": ['#984e9e', '#ed1c40'],
}

plotter.plot(data,chart_type='ColumnChart',chart_package='corechart', options=options)
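
The two charts compare rates across editions. The same comparison can be expressed as a percentage-point change. This is a minimal sketch with the counts copied by hand from the tables above; the `women`, `total`, `rates` and `change` names are illustrative and not part of the original notebook.

```python
import pandas as pd

tribes = ["speakers", "attendees"]

# Counts copied from the 2016 and 2017 tables.
women = pd.DataFrame({"2016": [5, 105], "2017": [11, 36]}, index=tribes)
total = pd.DataFrame({"2016": [48, 350], "2017": [40, 368]}, index=tribes)

# Share of women per tribe and year, in percent.
rates = women * 100 / total

# Percentage-point change between the two editions.
change = rates["2017"] - rates["2016"]
print(rates.round(1))
print(change.round(1))
```

A positive value means the share of women grew between editions; a negative value means it shrank.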

#### Why did this happen?

We really don’t know. We continued looking at the numbers and realised that **30** of the **45** companies that enrolled two or more people didn't include any women on their lists.

These companies account for roughly **31%** of the mass of attendees. We have also observed that there are more women outside, in the sponsors’ area, than inside the conference rooms. And although our ability to draw people to meetups has increased, that engagement hasn’t had a big impact on other events.

In [365]:
companies_team = data2017['Total'][3] + data2017['Total'][4]
mass_represented = pd.Series(data2017['Total'][4]*100/companies_team)
women_represented = pd.Series(100 - mass_represented)
mass_represented

0    31
dtype: int64
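
A note on the denominator. In the 2017 table, `independent` plus `copmany_teams` adds up exactly to the attendee total (71 + 297 = 368), which suggests that `company_teams_no_women` may be a subset of the company teams rather than a separate group. That is an assumption about how the CSV was built, not something the data states. Under that assumption, the sketch below (all variable names are illustrative) compares the figure computed above with two alternative readings, and also works out the share of companies mentioned in the text.

```python
# Counts copied by hand from the 2017 table.
attendees = 368
independent = 71
company_teams = 297
company_teams_no_women = 134

# Consistency check behind the subset assumption.
assert independent + company_teams == attendees

# Figure as computed in the notebook cell above.
share_notebook = company_teams_no_women * 100 / (company_teams + company_teams_no_women)

# Alternative readings, assuming the teams without women are part of company_teams.
share_of_teams = company_teams_no_women * 100 / company_teams
share_of_attendees = company_teams_no_women * 100 / attendees

# Share of companies (not people) that enrolled no women: 30 out of 45.
companies_without_women = 30 * 100 / 45

print(round(share_notebook, 1))           # 31.1
print(round(share_of_teams, 1))           # 45.1
print(round(share_of_attendees, 1))       # 36.4
print(round(companies_without_women, 1))  # 66.7
```

If the subset reading is right, teams without women would represent about 36% of all attendees and about 45% of those who came with a company team.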

In [362]:
# The chart data must be JSON serializable, so convert the
# single-element pandas Series to plain Python numbers first.
women_share = float(women_represented.iloc[0])
no_women_share = float(mass_represented.iloc[0])

data = [
    ['Companies', 'Percentage',  {"role": 'annotation'}],
    ['Any Women', women_share, ''],
    ['No women', no_women_share, ''],
]
options = {
    "title": 'Companies with two or more people',
    "width": 600,
    "height": 400,
    "colors": ['#984e9e', '#ed1c40'],
}

plotter.plot(data,chart_type='PieChart',chart_package='corechart', options=options)

In [327]:
data = [
    ['Tribe', 'Women', 'Men', {"role": 'annotation'}],
    [data2017['Tribe'][2], data2017['Women Rate'][2], data2017['Men Rate'][2],''],
    [data2017['Tribe'][3], data2017['Women Rate'][3], data2017['Men Rate'][3],''],
    [data2017['Tribe'][5], data2017['Women Rate'][5], data2017['Men Rate'][5],''],
]
options = {
    "title": '2017 JOTB Edition',
    "width": 600,
    "height": 400,
    "legend": {"position": 'top', "maxLines": 3},
    "bar": {"groupWidth": '55%'},
    "isStacked": "true",
    "colors": ['#984e9e', '#ed1c40'],
}

plotter.plot(data,chart_type='ColumnChart',chart_package='corechart', options=options)

### Social network analysis

In [90]:
run index.py yeswetech_

In [10]:
whoisj = pd.read_json('../out/yeswetech_.json', orient = 'columns')
whoisj

Unnamed: 0,yeswetech_
favourites_count,506
female_count,125
female_rate,34%
followers_count,622
followers_list,"{u'Aliene_Guzh': {u'lang': u'en', u'favourites..."
friends_count,498
friends_list,"{u'diana_aceves_': {u'lang': u'en', u'favourit..."
gender,undetermined
id,3346368821
lang,es


In [11]:
people = pd.read_json(whoisj['yeswetech_'].to_json())
people

Unnamed: 0,favourites_count,female_count,female_rate,followers_count,followers_list,friends_count,friends_list,gender,id,lang,location,male_count,male_rate,name,nonbinary_count,nonbinary_rate,statuses_count,total_count,undefined_count,undefined_rate
101tvMalaga,506,125,34%,622,,498,"{u'lang': u'es', u'favourites_count': 6765, u'...",undetermined,3346368821,es,"Málaga, Andalucía",59,16%,Yes We Tech,161,44%,1017,360,0,0%
11defebreroES,506,125,34%,622,"{u'lang': u'en', u'favourites_count': 8730, u'...",498,"{u'lang': u'en', u'favourites_count': 8730, u'...",undetermined,3346368821,es,"Málaga, Andalucía",59,16%,Yes We Tech,161,44%,1017,360,0,0%
1Belen_Lorente,506,125,34%,622,,498,"{u'lang': u'es', u'favourites_count': 3436, u'...",undetermined,3346368821,es,"Málaga, Andalucía",59,16%,Yes We Tech,161,44%,1017,360,0,0%
47deg,506,125,34%,622,,498,"{u'lang': u'en', u'favourites_count': 2041, u'...",undetermined,3346368821,es,"Málaga, Andalucía",59,16%,Yes We Tech,161,44%,1017,360,0,0%
ABenton,506,125,34%,622,,498,"{u'lang': u'en', u'favourites_count': 3867, u'...",undetermined,3346368821,es,"Málaga, Andalucía",59,16%,Yes We Tech,161,44%,1017,360,0,0%
APTE_es,506,125,34%,622,"{u'lang': u'es', u'favourites_count': 975, u'n...",498,"{u'lang': u'es', u'favourites_count': 975, u'n...",undetermined,3346368821,es,"Málaga, Andalucía",59,16%,Yes We Tech,161,44%,1017,360,0,0%
AceptaCookies,506,125,34%,622,"{u'lang': u'es', u'favourites_count': 0, u'nam...",498,,undetermined,3346368821,es,"Málaga, Andalucía",59,16%,Yes We Tech,161,44%,1017,360,0,0%
AdaLab_Digital,506,125,34%,622,"{u'lang': u'es', u'favourites_count': 1324, u'...",498,,undetermined,3346368821,es,"Málaga, Andalucía",59,16%,Yes We Tech,161,44%,1017,360,0,0%
Agroprospero,506,125,34%,622,"{u'lang': u'es', u'favourites_count': 386, u'n...",498,,undetermined,3346368821,es,"Málaga, Andalucía",59,16%,Yes We Tech,161,44%,1017,360,0,0%
AlexanderTTcom,506,125,34%,622,"{u'lang': u'es', u'favourites_count': 1, u'nam...",498,,undetermined,3346368821,es,"Málaga, Andalucía",59,16%,Yes We Tech,161,44%,1017,360,0,0%


In [12]:
followers = pd.read_json(people['followers_list'].to_json(), orient = 'index')
followers

Unnamed: 0,favourites_count,followers_count,friends_count,gender,id,lang,location,name,statuses_count
101tvMalaga,,,,,,,,,
11defebreroES,8730,4814,1512,female,794593414763450368,en,facebook.com/dia11defebrero,Día Mujer y Ciencia,3240
1Belen_Lorente,,,,,,,,,
47deg,,,,,,,,,
ABenton,,,,,,,,,
APTE_es,975,3396,1413,undetermined,214108777,es,España,APTE,6111
AceptaCookies,0,3,38,undetermined,822112735764873217,es,,Acepta las cookies,0
AdaLab_Digital,1324,956,567,undetermined,761153947990097920,es,"Madrid, Comunidad de Madrid",AdaLab,1048
Agroprospero,386,60,278,undetermined,872927353,es,,agricultor ilustrado,769
AlexanderTTcom,1,24,78,undetermined,736249613263503360,es,"Barcelona, España",AlexanderTT,27


In [13]:
following = pd.read_json(people['friends_list'].to_json(), orient = 'index')
following

Unnamed: 0,favourites_count,followers_count,friends_count,gender,id,lang,location,name,statuses_count
101tvMalaga,6765,18188,1913,undetermined,516737429,es,"Málaga, España",101Tv Málaga,44622
11defebreroES,8730,4814,1512,female,794593414763450368,en,facebook.com/dia11defebrero,Día Mujer y Ciencia,3240
1Belen_Lorente,3436,285,320,undetermined,4901346731,es,"Málaga, España",Belen Lorente Molina,439
47deg,2041,3188,463,undetermined,187074794,en,Seattle | Spain | London,47 Degrees,1482
ABenton,3867,17200,2309,female,14202541,en,Everywhere I'm At,Angela Benton,12783
APTE_es,975,3396,1413,undetermined,214108777,es,España,APTE,6111
AceptaCookies,,,,,,,,,
AdaLab_Digital,,,,,,,,,
Agroprospero,,,,,,,,,
AlexanderTTcom,,,,,,,,,


In [14]:
sum(following['gender'] == 'male')

20

In [15]:
sum(following['gender'] == 'female')

79

In [16]:
sum(following['gender'] == 'andy')

0
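
The three counts above can also be obtained in a single call, which has the advantage of listing every label present in the column, including any that were not anticipated. A minimal sketch, assuming a `following` table shaped like the one above; the rows here are a small hand-made stand-in, not the real data.

```python
import pandas as pd

# Hypothetical stand-in for `following`: a few rows copied from the output above.
following = pd.DataFrame(
    {"gender": ["undetermined", "female", "undetermined", "undetermined", "female", None]},
    index=["101tvMalaga", "11defebreroES", "1Belen_Lorente", "47deg", "ABenton", "AceptaCookies"],
)

# Absolute counts per label; dropna=False also counts accounts with no gender value.
counts = following["gender"].value_counts(dropna=False)

# Rates in percent over the accounts that do have a label.
rates = following["gender"].value_counts(normalize=True) * 100

print(counts)
print(rates.round(1))
```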