# <img src="http://datapao.com/wp-content/themes/datapao/img/header.svg" style="display:inline"/>  Budapest Data Meetups

This notebook gives an overview of the Budapest data community, based on figures from [meetup.com](https://meetup.com).

#### Data source
If you are interested in the source data, visit [our blog](https://datapao.com/blog/), where we publish our data sources. <br/>
We also share other people's work on the blog, so send us your take on this data!



#### Source code
This notebook and the crawlers are available on our [GitHub site](https://github.com/datapao/budapest-data-community/).


---
<img src="logo.png" alt="Drawing" style="width: 100px;"/>

In [94]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Meetup groups graph

## Edges

In [95]:
members = pd.read_csv("data/members_budapest_data.csv", delimiter=";", names=["country","city","joined","name","visited","id","group"])

In [96]:
members.head()

Unnamed: 0,country,city,joined,name,visited,id,group
0,hu,Budapest,1464271220000,A. Asbot,1500461845000,186451270.0,budapest_data_science
1,hu,Budapest,1432103533000,A.J. Lumijärvi,1478694290000,186768573.0,budapest_data_science
2,hu,Budapest,1474229938000,Ábel Fóthi,1478776809000,151890652.0,budapest_data_science
3,hu,Budapest,1422875944000,Ábel Melinda,1422875944000,183778449.0,budapest_data_science
4,hu,Budapest,1469042635000,Abel Oszwald,1469224448000,105964922.0,budapest_data_science
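
The `joined` and `visited` columns look like Unix timestamps in milliseconds. That is an assumption based on their magnitude; the notebook does not state it. A minimal sketch of turning them into readable datetimes, using two sample rows copied from the output above:

```python
import pandas as pd

# Two sample rows copied from the head() output; the meaning of the
# columns (epoch milliseconds, UTC) is an assumption.
sample = pd.DataFrame({
    "joined": [1464271220000, 1432103533000],
    "visited": [1500461845000, 1478694290000],
})

# Interpret each integer as milliseconds since 1970-01-01 (UTC).
for col in ["joined", "visited"]:
    sample[col] = pd.to_datetime(sample[col], unit="ms")

print(sample)
```

The same loop could be applied to the full `members` frame if date-based analysis is needed later.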


In [97]:
members = members.drop(['country','city'], axis=1)

In [98]:
# Count group memberships per name, keeping only names that contain a space
members[members.name.str.contains(" ")].groupby("name")["group"].count().sort_values(ascending=False).head(15)

name
Arató Bence          25
Andras Nagy          25
Milan Dobri          23
Endre Adam           23
Csaba Peter          22
Zsolt Tóth           22
Zoltán Polgár        22
Andrew John Lowe     22
Zoltan Ludanyi       21
Samu Imre            21
Siddharth Pandit     20
Zoltan C. Toth       20
Dániel Berecz        20
MobilPhone Andras    20
Zoltan Prekopcsak    20
Name: group, dtype: int64
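
The count above groups by `name`, so two different people who share a name would be merged into one row. The following sketch on made-up toy data shows how grouping by the `id` column instead could separate them. It assumes `id` identifies one member account, which the notebook does not confirm.

```python
import pandas as pd

# Hypothetical toy data: ids 1 and 2 are two different people with the same name.
toy = pd.DataFrame({
    "id": [1, 1, 2, 3],
    "name": ["Andras Nagy", "Andras Nagy", "Andras Nagy", "Zsolt Tóth"],
    "group": ["A", "B", "C", "A"],
})

# Grouping by name merges both people into a single count.
by_name = toy.groupby("name")["group"].nunique()

# Grouping by id keeps them apart (assuming id is unique per member).
by_id = toy.groupby("id")["group"].nunique()

print(by_name)
print(by_id)
```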

In [99]:
# Self-join on member name: one row per pair of groups that a name belongs to
c = members.merge(members, on="name")[['name','group_x','group_y']]

In [100]:
# Drop rows that pair a group with itself
c = c[c.group_x != c.group_y]

In [101]:
c["const"] = 1

In [102]:
# Edge weight = number of member names shared by the two groups
edges = c.groupby(['group_x', 'group_y'], as_index=False)['const'].count()

In [103]:
edges = edges.rename(columns={"group_x": "Source", "group_y": "Target", "const": "Weight"})

In [104]:
edges["Type"]="Undirected"

In [105]:
edges.head()

Unnamed: 0,Source,Target,Weight,Type
0,Big-Data-Budapest,Big-Data-Meetup-Budapest,662,Undirected
1,Big-Data-Budapest,Budapest-Analytics-Rockstars,453,Undirected
2,Big-Data-Budapest,Budapest-BI-Meetup,401,Undirected
3,Big-Data-Budapest,Budapest-Cassandra-Users,45,Undirected
4,Big-Data-Budapest,Budapest-Data-Projects-Meetup,307,Undirected


In [106]:
edges.to_csv('data/meetup-groups-graph/edges.csv', header=True, index=False)
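
Because the edge list comes from a self-join, every pair of groups appears twice, once in each direction. If the tool that reads `edges.csv` expects one row per undirected edge, one direction can be dropped. A minimal sketch on made-up toy data (the names `toy`, `toy_edges` and `undirected` are illustrative only):

```python
import pandas as pd

# Hypothetical memberships: Ann and Bob are in both groups, Cid only in A.
toy = pd.DataFrame({
    "name": ["Ann", "Ann", "Bob", "Bob", "Cid"],
    "group": ["A", "B", "A", "B", "A"],
})

# Same steps as in the notebook: self-join, drop self-pairs, count.
pairs = toy.merge(toy, on="name")[["name", "group_x", "group_y"]]
pairs = pairs[pairs.group_x != pairs.group_y]
toy_edges = (
    pairs.groupby(["group_x", "group_y"], as_index=False)["name"]
    .count()
    .rename(columns={"group_x": "Source", "group_y": "Target", "name": "Weight"})
)

# Keep only one direction of each pair, (A, B) but not (B, A).
undirected = toy_edges[toy_edges.Source < toy_edges.Target]

print(toy_edges)
print(undirected)
```

The string comparison is only a convenient way to pick one of the two symmetric rows; the weights are unchanged.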

## Nodes

In [107]:
meetups = pd.read_csv("data/meetup_groups_budapest_data.csv", names=["rating","name","id","urlname","member"],delimiter=";")
meetups = meetups[["urlname","name","member"]]

In [108]:
meetups = meetups.rename(columns={"urlname": "Id", "name": "Label", "member": "Weight"})

In [109]:
meetups.head()

Unnamed: 0,Id,Label,Weight
0,budapest_data_science,Budapest Data Science Meetup,2711
1,HUG-MSSQL,HUG-MSSQL,399
2,Hungarian-nlp,Hungarian Natural Language Processing Meetup,968
3,KURT_Akademia,KÜRT Akadémia meetupok,484
4,Big-Data-Meetup-Budapest,Budapest Big Data Meetup,1906


In [110]:
meetups.to_csv('data/meetup-groups-graph/nodes.csv', header=True, index=False)

# Social fabric graph 

In [111]:
members = pd.read_csv("data/members_budapest_data.csv", delimiter=";", names=["country","city","joined","name","visited","id","group"])

In [112]:
members.head()

Unnamed: 0,country,city,joined,name,visited,id,group
0,hu,Budapest,1464271220000,A. Asbot,1500461845000,186451270.0,budapest_data_science
1,hu,Budapest,1432103533000,A.J. Lumijärvi,1478694290000,186768573.0,budapest_data_science
2,hu,Budapest,1474229938000,Ábel Fóthi,1478776809000,151890652.0,budapest_data_science
3,hu,Budapest,1422875944000,Ábel Melinda,1422875944000,183778449.0,budapest_data_science
4,hu,Budapest,1469042635000,Abel Oszwald,1469224448000,105964922.0,budapest_data_science


In [113]:
# Keep only members whose name contains a space
members = members[members.name.str.contains(" ")]

In [114]:
# Self-join on group: one row per ordered pair of members in the same group
c = members.merge(members, on="group")[['group','name_x', 'name_y']]

In [115]:
# Drop rows that pair a member with themselves
c = c[c.name_x != c.name_y]

In [None]:
len(c)

18658628
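
The self-join yields both `(a, b)` and `(b, a)` for every pair of members, which is part of why the row count is so large. As an alternative sketch, not the notebook's method, shared groups can be counted once per unordered pair with the standard library. The rows below are made-up toy data.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical (name, group) rows standing in for the members table.
rows = [
    ("Ann", "A"), ("Bob", "A"), ("Cid", "A"),
    ("Ann", "B"), ("Bob", "B"),
]

# Collect the set of member names in each group.
by_group = defaultdict(set)
for name, group in rows:
    by_group[group].add(name)

# For each group, count every unordered pair of its members once.
shared = Counter()
for names in by_group.values():
    for pair in combinations(sorted(names), 2):
        shared[pair] += 1

print(shared.most_common())
```

Sorting the names before calling `combinations` ensures that a pair always appears with the same ordering, so the counter never splits one pair across two keys.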

In [None]:
# Count the groups shared by each ordered pair of members
graph_edges = c.groupby(['name_x', 'name_y'], as_index=False).count()

In [None]:
# Keep only pairs that share more than 10 groups
filtered_edges = graph_edges[graph_edges.group > 10]

In [None]:
len(filtered_edges)

In [None]:
filtered_edges.sort_values(by="group", ascending=False).head(23)

In [None]:
filtered_edges = filtered_edges.rename(columns={"name_x": "Source", "name_y": "Target", "group": "Weight"})

In [None]:
filtered_edges['Type'] = 'Undirected'

In [None]:
filtered_edges.head()

In [None]:
filtered_edges.to_csv('data/meetup-members-graph/edges.csv', header=True, index=False)

<span style="font-weight: bold;">*Created with love at Datapao*</span>
