
<img style='float: left;' src='https://www.gesis.org/fileadmin/styles/img/gs_home_logo_en.svg'>

### ``compsoc`` – *Notebooks for Computational Sociology* (alpha)

# Construct graph: Build networks from standardized data
Authors: [Haiko Lietz](https://www.gesis.org/person/haiko.lietz)

Version: 0.91 (14.09.2020)

Please cite as: Lietz, Haiko (2020). Construct graph: Build networks from standardized data. Version 0.91 (14.09.2020). *compsoc – Notebooks for Computational Sociology*. GESIS. url:[github.com/gesiscss/compsoc](https://github.com/gesiscss/compsoc)

<div class='alert alert-info'>
<big><b>Significance</b></big>

Bla.
</div>

## Introduction
Bla.

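**In this notebook**, the function ``construct_graph`` is defined. It builds a networkx graph, optionally directed and/or multiplex, from a node list and an edge list.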

## Dependencies and settings

In [1]:
import compsoc as cs
import networkx as nx
import numpy as np
import pandas as pd

In [2]:
#import warnings

In [3]:
#warnings.filterwarnings('ignore')

## Constructing a graph from a node list and an edge list

In [None]:
def construct_graph(
    directed, 
    multiplex, 
    graph_name, 
    node_list, 
    edge_list, 
    node_pos=None, 
    node_size=None, 
    node_color=None, 
    node_shape=None, 
    node_border_color=None, 
    node_label=None, 
    attribute_shape={0: 's', 1: 'o', 2: '^', 3: '>', 4: 'v', 5: '<', 6: 'd', 7: 'p', 8: 'h', 9: '8'}, 
    layer_color={0: '#e41a1c', 1: '#377eb8', 2: '#4daf4a', 3: '#984ea3', 4: '#ff7f00', 5: '#ffff33', 6: '#a65628', 7: '#f781bf', 8: '#999999'}
):
    '''
    Description: Constructs a graph from a node list and an edge list
    
    Inputs:
        directed: Boolean parameter specifying if graph should be directed.
        multiplex: Boolean parameter specifying if graph should be multiplex.
        graph_name: Name of the graph (string); must be specified.
        node_list: Dataframe containing the node properties; must contain a continuous 
            index from 0 to N-1 where N is the number of vertices; must contain a column 
            holding the name of each vertex; must contain a column holding an integer that 
            codes the class a vertex belongs to (used to color the vertices).
        node_pos: List of two columns of the dataframe ``node_list`` that hold the x and y 
            positions of each node; must be numerical variables; set to ``None`` by default.
        node_size: Name of the column of the dataframe ``node_list`` that holds the size 
            of each node; must be a numerical variable; set to ``None`` by default.
        node_color: Name of the column of the dataframe ``node_list`` that holds the color 
            of each node; must be a hexadecimal color variable; set to ``None`` by default.
        node_shape: Name of the column of the dataframe ``node_list`` that codes the shape 
            of each node; must be an integer between 0 and 9; set to ``None`` by default.
        node_border_color: Name of the column of the dataframe ``node_list`` that holds 
            the color of each node border; must be a hexadecimal color variable; set to 
            ``None`` by default.
        node_label: Name of the column of the dataframe ``node_list`` that holds the name 
            of each node; must be a string variable; set to ``None`` by default.
        attribute_shape: Dictionary containing the mapping from the integer stored in the 
            ``node_shape`` column to a shape; the matplotlib markers 'so^>v<dph8' are 
            used by default.
        edge_list: Dataframe whose first three columns hold the source node id, the target 
            node id, and the edge weight (in that order); if the graph is multiplex, a 
            fourth column must hold the layer of each edge as an integer between 0 and 
            L-1 where L is the number of edge layers; must be specified.
        layer_color: Dictionary containing the mapping from the layer integer stored in 
            the fourth column of the ``edge_list`` to a hexadecimal color; a dictionary of 
            nine colors that are qualitatively distinguishable is used by default.
    
    Output: networkx graph object, potentially with graph, node, and edge attributes.
    '''
    # create graph object
    import networkx as nx
    if directed:
        if multiplex: g = nx.MultiDiGraph(name=graph_name)
        else: g = nx.DiGraph(name=graph_name)
    else:
        if multiplex: g = nx.MultiGraph(name=graph_name)
        else: g = nx.Graph(name=graph_name)
    
    # populate graph with vertices and their properties
    for i in node_list.index:
        g.add_node(i)
        if node_pos: g.nodes[i]['node_pos'] = node_list[node_pos].values[i]
        if node_size: g.nodes[i]['node_size'] = node_list[node_size][i]
        if node_color: g.nodes[i]['node_color'] = node_list[node_color][i]
        if node_shape: g.nodes[i]['node_shape'] = attribute_shape[node_list[node_shape][i]]
        if node_border_color: g.nodes[i]['node_border_color'] = node_list[node_border_color][i]
        if node_label: g.nodes[i]['node_label'] = node_list[node_label][i]
    
    # populate graph with edges and their properties
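    if multiplex:
        # work on a copy so that the caller's dataframe is not modified
        # and pandas does not raise a SettingWithCopyWarning
        edge_list = edge_list[edge_list.columns[:4]].copy()
        weight_col, layer_col = edge_list.columns[2], edge_list.columns[3]
        edge_list['color'] = [layer_color[identifier] for identifier in edge_list[layer_col].values]
        # one attribute dictionary per edge, holding its weight and its layer color
        edge_list['dict'] = edge_list[[weight_col, 'color']].to_dict(orient='records')
        edge_list = edge_list.drop([weight_col, 'color'], axis=1)
        # remaining columns: source, target, layer (used as edge key), attribute dictionary
        g.add_edges_from(edge_list.values)
    else:
        g.add_weighted_edges_from(edge_list[edge_list.columns[:3]].values)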
    
    return g
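
As a rough illustration of the input format described in the docstring, the following sketch builds a small directed multiplex graph directly with pandas and networkx. The toy dataframes, their column names (``name``, ``class``, ``source``, ``target``, ``weight``, ``layer``), and the graph name are assumptions made up for this example. The sketch mirrors the steps of ``construct_graph`` but does not call it.

```python
import networkx as nx
import pandas as pd

# Hypothetical toy data (not from the notebook): three nodes, two edge layers.
node_list = pd.DataFrame({
    'name': ['a', 'b', 'c'],
    'class': [0, 1, 1],
})
edge_list = pd.DataFrame({
    'source': [0, 1, 0],
    'target': [1, 2, 1],
    'weight': [1.0, 2.0, 0.5],
    'layer': [0, 0, 1],
})
layer_color = {0: '#e41a1c', 1: '#377eb8'}

g = nx.MultiDiGraph(name='toy')

# Nodes are identified by the dataframe index; the name becomes a node attribute.
for i in node_list.index:
    g.add_node(i, node_label=node_list.loc[i, 'name'])

# Each edge is a 4-tuple (source, target, key, attributes). The layer serves as
# the edge key, so parallel edges between the same pair of nodes stay distinct.
edges = [
    (int(u), int(v), int(layer), {'weight': float(w), 'color': layer_color[int(layer)]})
    for u, v, w, layer in edge_list.itertuples(index=False, name=None)
]
g.add_edges_from(edges)

print(g.number_of_nodes(), g.number_of_edges())
print(dict(g[0][1]))
```

Using the layer as the edge key means that the two edges from node 0 to node 1 can be addressed separately, for example as ``g[0][1][0]`` and ``g[0][1][1]``.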