
<img style='float: left;' src='https://www.gesis.org/fileadmin/styles/img/gs_home_logo_en.svg'>

### ``compsoc`` – *Notebooks for Computational Sociology* (alpha)

# Social Network Science (1916-2012): Collaboration and language use in a scholarly field
Authors: [Haiko Lietz](https://www.gesis.org/person/haiko.lietz)

Version: 0.91 (14.09.2020)

Please cite as: Lietz, Haiko (2020). Social Network Science (1916-2012): Collaboration and language use in a scholarly field. Version 0.91 (14.09.2020). *compsoc – Notebooks for Computational Sociology*. GESIS. url:[github.com/gesiscss/compsoc](https://github.com/gesiscss/compsoc)

<div class="alert alert-info">
<big><b>Significance</b></big>

Bla.
</div>

## Introduction
Bla.

This dataset is an early case of behavioral data.

It is an example of traces of behavior harnessed by digital technology: in this case, they were collected by the company Clarivate Analytics and stored in a bibliographic database.

The field was delineated by Lietz (2020) for the purpose of... https://doi.org/10.1007/s11192-020-03527-0

The dataset makes full use of the data model... It serves as a "teaching example" for mapping quantifiable things to the data model: publications map to transactions, and authors, cited references, or words map to facts.



**In this notebook**, bla.

## Dependencies and Settings

In [None]:
import compsoc as cs
import networkx as nx
import pandas as pd

The data is available at https://doi.org/10.7802/1.1954. The cell below expects the tab-separated files in the `data/sns/` directory.

In [None]:
publications = pd.read_csv('data/sns/publications.txt', sep='\t', encoding='utf-8')
authors = pd.read_csv('data/sns/authors.txt', sep='\t', encoding='utf-8')
authorships = pd.read_csv('data/sns/authorships.txt', sep='\t', encoding='utf-8')
words = pd.read_csv('data/sns/words.txt', sep='\t', encoding='utf-8')
usages = pd.read_csv('data/sns/usages.txt', sep='\t', encoding='utf-8')
subfields = pd.read_csv('data/sns/subfields.txt', sep='\t', encoding='utf-8')

The dataset is already normalized. Tables with primary keys contain entities. Their relationships are specified in tables that consist merely of foreign keys.

|<img src='images/data_model_sns.png' style='float: none; width: 640px'>|
|:--|
|<em style='float: center'>**Figure 1**: Entity-relationship model for the Social Network Science dataset</em>|
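
To make the normalized layout concrete, here is a minimal sketch with toy tables. All values and the `toy_*` names are made up for illustration. Only the column names `publication_id`, `time`, `author_id` and `author` are taken from the code in this notebook. The sketch shows how a relationship table of foreign keys can be resolved into readable rows by joining the entity tables back in.

```python
import pandas as pd

# Toy entity tables: each has a primary key column (all values are made up).
toy_publications = pd.DataFrame({
    'publication_id': [0, 1, 2],
    'time': [2009, 2010, 2011],
})
toy_authors = pd.DataFrame({
    'author_id': [0, 1, 2],
    'author': ['AUTHOR_A', 'AUTHOR_B', 'AUTHOR_C'],
})

# Toy relationship table: it contains nothing but foreign keys.
toy_authorships = pd.DataFrame({
    'publication_id': [0, 0, 1, 2, 2],
    'author_id': [0, 1, 1, 1, 2],
})

# Resolve the foreign keys by joining the entity tables back in.
toy_joined = (
    toy_authorships
    .merge(toy_publications, on='publication_id', how='left')
    .merge(toy_authors, on='author_id', how='left')
)
print(toy_joined)
```

Keeping entities and relationships in separate tables means that an author name or a publication year is stored exactly once, and the joins are only needed when a human-readable view is wanted.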

Transactions are the elementary pieces of communication. In this dataset, each publication is a transaction:

In [None]:
publications.head()

In this academic case, a possible translation of "transactions select facts" is that "publications are authored by authors". Authors are the senders of communications to an unspecified set of receivers.

The ``authors`` entity table is simply a list of which author has which identifier, where the identifier is an integer between $0$ and $N$. In the case of an author network, $N$ is the number of nodes.

In [None]:
authors.head()

The information about which publication is authored by which author is stored in the ``authorships`` relationship table. The beauty of these relationship tables is that they can be used directly as edge lists for network construction:

In [None]:
authorships.head()
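
As a minimal sketch of the edge-list idea, the following builds a two-mode (publication-author) graph directly from a few made-up pairs of foreign keys. The pairs and the names `toy_pairs` and `B` are hypothetical and stand in for rows of a relationship table. Because publication and author identifiers may overlap as integers, the sketch assumes that nodes need a prefix to keep the two node sets apart.

```python
import networkx as nx

# Hypothetical (publication_id, author_id) pairs standing in for rows
# of a relationship table.
toy_pairs = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2)]

B = nx.Graph()
for publication_id, author_id in toy_pairs:
    # Prefix the identifiers so that publication 0 and author 0
    # become two distinct nodes.
    publication_node = ('publication', publication_id)
    author_node = ('author', author_id)
    B.add_node(publication_node, bipartite=0)
    B.add_node(author_node, bipartite=1)
    B.add_edge(publication_node, author_node)

print(B.number_of_nodes(), B.number_of_edges())
```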

In [None]:
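# Authorships of publications from 2010 to 2012 (inclusive); .copy() avoids a SettingWithCopyWarning when a column is added later
authorships_2010 = authorships[authorships['publication_id'].isin(publications[publications['time'].between(2010, 2012)]['publication_id'])].copy()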

In [None]:
# Give every authorship a unit weight
authorships_2010['weight'] = 1

In [None]:
_, authors, co_authorships_2010, _ = cs.meaning_structures(
    selections=authorships_2010, 
    transaction_id='publication_id', 
    fact_id='author_id', 
    multiplex=True, 
    transactions=publications, 
    domain_id='subfield_id', 
    facts=authors
)
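
To illustrate what a co-authorship edge list is, the following sketch projects a few made-up authorship pairs onto author pairs using only the standard library. It is not a description of how `cs.meaning_structures` works internally. The weighting used here (the number of joint publications) is an assumption chosen for simplicity, and all `toy_*` names and values are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (publication_id, author_id) pairs.
toy_pairs = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2), (3, 0), (3, 1)]

# Group the authors of each publication.
authors_by_publication = defaultdict(set)
for publication_id, author_id in toy_pairs:
    authors_by_publication[publication_id].add(author_id)

# Every unordered pair of authors on the same publication is a
# co-authorship; the weight counts their joint publications.
co_authorship_weights = defaultdict(int)
for author_ids in authors_by_publication.values():
    for author_from, author_to in combinations(sorted(author_ids), 2):
        co_authorship_weights[(author_from, author_to)] += 1

toy_co_authorships = sorted(
    (author_from, author_to, weight)
    for (author_from, author_to), weight in co_authorship_weights.items()
)
print(toy_co_authorships)  # [(0, 1, 2), (1, 2, 1)]
```

Single-author publications (publication 1 in the toy data) contribute no edges, which is why isolated authors can appear once such a projection is turned into a graph.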

In [None]:
G = cs.construct_graph(
    directed=False, 
    multiplex=True, 
    graph_name='co_authorships_2010', 
    node_list=authors, 
    node_size='degree', 
    edge_list=co_authorships_2010[['author_id_from', 'author_id_to', 'weight', 'subfield_id']], 
    node_label='author'
)

In [None]:
# Restrict the graph to its largest connected component
G_lcc = G.subgraph(max(nx.connected_components(G), key=len))
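
The extraction of the largest connected component can be tried out on a small toy graph. The graph and the `toy_*` names below are made up for illustration. The sketch also shows one design detail of `networkx`: `subgraph` returns a read-only view of the original graph, so a `.copy()` is needed if the component is to be modified independently.

```python
import networkx as nx

# Toy graph: a triangle, a separate pair, and an isolated node.
toy_graph = nx.Graph()
toy_graph.add_edges_from([(0, 1), (1, 2), (2, 0), (3, 4)])
toy_graph.add_node(5)

# connected_components yields one set of nodes per component;
# the largest one is selected by its number of nodes.
largest_component = max(nx.connected_components(toy_graph), key=len)

# subgraph() gives a read-only view that shares data with toy_graph.
toy_lcc_view = toy_graph.subgraph(largest_component)

# An independent copy can be modified without touching toy_graph.
toy_lcc = toy_lcc_view.copy()
toy_lcc.add_edge(0, 99)

print(sorted(largest_component))
```

For drawing, as in this notebook, the view is sufficient because the graph is only read.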

In [None]:
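# Compute node positions with a force-directed (spring) layout; positions vary between runs unless a seed is passed
vp_node_pos = nx.spring_layout(G_lcc)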

In [None]:
cs.draw_graph(
    G_lcc, 
    node_pos=vp_node_pos, 
    node_size_factor=5, 
    edge_width_factor=5, 
    edge_transparency=.5, 
    figsize='large'
)