# Paper on Career Sequences of Transnational Corporate Elites

Data are extracted from the Orbis database: www.bvdinfo.com

Snapshot December 2017

* BvD: firm ID
* UCI: individual ID
* type of position that a person holds
* important positions
* current positions
* appointment date of a person
* operating revenue of a firm (only firms with > 100,000,000 US$)

Data preparation steps:
* Take people who currently hold at least one position on a board of directors, executive board, or supervisory board
* Keep only positions with known appointment dates
* Restrict their careers to 2000-2017
* Extract transnational individuals (those who worked in more than one country)
* Assign regions to countries using the 'countrycode' package in R
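
The filtering steps above can be sketched in base R on a toy table. This is only an illustration under assumptions: the column names (`UCI`, `country`, `year`) and the hand-written `region_lookup` vector are hypothetical stand-ins, not the Orbis field names and not the output of the 'countrycode' package.

```r
# Toy positions table; column names are hypothetical, not the Orbis field names
positions <- data.frame(
  UCI     = c("P1", "P1", "P2", "P2", "P3"),
  country = c("IT", "PL", "DE", "DE", "FR"),
  year    = c(2003, 2010, 2005, 2012, 2008),
  stringsAsFactors = FALSE
)

# Keep only appointments inside the 2000-2017 window
positions <- positions[positions$year >= 2000 & positions$year <= 2017, ]

# Count the distinct countries in which each individual held a position
n_countries <- tapply(positions$country, positions$UCI,
                      function(x) length(unique(x)))

# Transnational individuals are those with more than one country
transnational_ids <- names(n_countries)[n_countries > 1]
transnational <- positions[positions$UCI %in% transnational_ids, ]

# Hand-written stand-in for the country-to-region assignment (assumption)
region_lookup <- c(IT = "Europe S", PL = "Europe E",
                   DE = "Europe W", FR = "Europe W")
transnational$region <- unname(region_lookup[transnational$country])

print(transnational)
```

In this toy table only `P1` remains, because `P2` and `P3` each worked in a single country.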

## R

In [23]:
getRversion()

[1] ‘3.4.1’

In [None]:
library(TraMineR)
library(colorspace)
library(cluster)
library(dplyr)
library(ggplot2)
library(psych)
library(WeightedCluster)

## Reading data for sequence analysis

In [30]:
setwd(...)
df <- read.csv('data.csv', header = TRUE, sep = ",", na.strings=c("","NA"))
nrow(df)

In [31]:
head(df)

X,positionUCI,X0,X1,X2,X3,X4,X5,X6,X7,X8,X9
0,P000004754,Europe S,Europe E,,,,,,,,
1,P000004818,Europe S,Europe N,,,,,,,,
2,P000025348,Europe S,Europe W,,,,,,,,
3,P000044184,Europe W,Europe S,,,,,,,,
4,P000044407,Europe W,Europe W,,,,,,,,
5,P000063770,Europe E,Europe W,,,,,,,,


## Assign color palette for sequence states

In [20]:
# Colors are color-blind friendly, based on this palette:
# http://colorbrewer2.org/#type=sequential&scheme=Greens&n=4

# Melanesia + Micronesia => merged into one state
# N Africa + W Africa => merged into one state
# C Asia + W Asia => merged into one state

## Europe
# green
EuropeN <- '#edf8e9'
EuropeW <- '#bae4b3'
EuropeE <- '#74c476'
EuropeS <- '#238b45'

## America
# blue
AmericaN <- '#2171b5'
AmericaS <- '#bdd7e7'
AmericaC <- '#6baed6'
Caribbean <- '#eff3ff'

## Asia
# orange
AsiaCW <- '#fdbe85'
AsiaE <- '#fd8d3c'
AsiaSE <- '#e6550d'
AsiaS  <- '#a63603'

## Africa
# pink
AfricaNW  <- '#fbb4b9'
AfricaE <- '#c51b8a'
AfricaS <- '#7a0177'

## Oceania
# violet
MelaMicronesia <- '#bcbddc'
AustrNewZealand <- '#756bb1'

# Palette colors must follow the alphabetical order of the state labels,
# because colors are matched to states by position
palette = c(AfricaE,AfricaNW,AfricaS,
            AmericaC,AmericaN,AmericaS,
            AsiaCW, AsiaE, AsiaS, AsiaSE,
            AustrNewZealand, Caribbean,
            EuropeE,EuropeN,EuropeS, EuropeW,
            MelaMicronesia)

palette
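
A possible way to make the alphabetical ordering less error-prone is to keep the colors in a named vector and reorder them by the sorted state labels. The sketch below assumes, as the comment in the cell above states, that colors are matched to states by position in alphabetical order. It uses only the four European labels visible in the data preview; `state_colors` and `observed_states` are hypothetical names.

```r
# Colors keyed by state label (Europe only, for illustration)
state_colors <- c("Europe N" = "#edf8e9",
                  "Europe W" = "#bae4b3",
                  "Europe E" = "#74c476",
                  "Europe S" = "#238b45")

# States as they might appear in the data, in arbitrary order
observed_states <- c("Europe S", "Europe E", "Europe W", "Europe N", "Europe W")

# Sorted distinct states, i.e. the order the palette has to follow
alphabet <- sort(unique(observed_states))

# Reorder the colors to match that order
pal <- unname(state_colors[alphabet])

# Fail early if a state has no color assigned
stopifnot(!anyNA(pal), length(pal) == length(alphabet))

print(data.frame(state = alphabet, color = pal))
```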

In [None]:
# Assign index to the dataset
rownames(df) <- df$positionUCI

# Delete unnecessary columns
df$positionUCI <- NULL
df$X <- NULL

## Sequencing

In [None]:
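# Create a state sequence object
# right = 'DEL' drops trailing missing values; nr = "*" is the symbol for other missing states
sts.seq1 <- seqdef(df, cpal=palette, right='DEL', nr="*")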

# Assign a cost matrix based on observed transition rates
costmatrix1 <- seqsubm(sts.seq1,
                       method = "TRATE",
                       with.missing = FALSE,
                       time.varying = FALSE,
                       cval = 2,
                       miss.cost = 1)

# Create a dissimilarity matrix, using the Optimal Matching algorithm
diss1 <- seqdist(sts.seq1, method = "OM",
                 indel = 1,
                 with.missing = FALSE,
                 full.matrix = FALSE,
                 sm = costmatrix1,
                 weighted = FALSE)
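
To make the two steps above more concrete, here is a minimal base R sketch on toy sequences. It assumes that transition-rate costs take the form `cval - p(i -> j) - p(j -> i)` with a zero diagonal, and it computes the optimal matching distance with a textbook dynamic program. The function `om_distance` and the toy data are hypothetical illustrations and are not part of TraMineR.

```r
# Toy sequences (rows = individuals, columns = positions); not the Orbis data
seqs <- rbind(c("A", "A", "B"),
              c("A", "B", "B"),
              c("B", "A", "A"))
states <- sort(unique(as.vector(seqs)))

# Count transitions from position t to position t + 1
from <- as.vector(seqs[, -ncol(seqs)])
to   <- as.vector(seqs[, -1])
counts <- matrix(0, length(states), length(states),
                 dimnames = list(states, states))
for (k in seq_along(from)) {
  counts[from[k], to[k]] <- counts[from[k], to[k]] + 1
}

# Transition rates: each row sums to 1
trate <- counts / rowSums(counts)

# Substitution costs from transition rates, zero on the diagonal
cval <- 2
cost <- cval - trate - t(trate)
diag(cost) <- 0

# Optimal matching distance between two sequences by dynamic programming
om_distance <- function(x, y, sm, indel = 1) {
  n <- length(x)
  m <- length(y)
  d <- matrix(0, n + 1, m + 1)
  d[, 1] <- (0:n) * indel   # delete every element of x
  d[1, ] <- (0:m) * indel   # insert every element of y
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      d[i + 1, j + 1] <- min(d[i, j] + sm[x[i], y[j]],  # substitution
                             d[i, j + 1] + indel,       # deletion
                             d[i + 1, j] + indel)       # insertion
    }
  }
  d[n + 1, m + 1]
}

d_xy    <- om_distance(c("A", "A", "B"), c("A", "B", "B"), cost)
d_short <- om_distance(c("A", "B"), c("A", "B", "B"), cost)
print(cost)
print(c(d_xy = d_xy, d_short = d_short))
```

The second call compares sequences of different lengths, which is the situation created by `right='DEL'`: the shorter sequence can only be aligned by paying at least one indel cost.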

## Clustering sequences

In [None]:
# Use Ward clustering algorithm
clward1 <- hclust(diss1, method="ward.D")

In [None]:
# Compute quality statistics for solutions with 2 to 35 clusters
cluQual_ward35 <- as.clustrange(clward1, diss1, ncluster = 35)
cluQual_ward35$stats

# ASW (average silhouette width)
# 0.71 – 1.00 strong structure has been found
# 0.51 – 0.70 reasonable structure has been found
# 0.26 – 0.50 weak structure, could be artificial
# ≤ 0.25 no substantial structure has been found

# A high ASW value means that the clusters are homogeneous (all observations are close to their cluster center)
# and that they are well separated from each other
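
As a small illustration of how the ASW behaves, the sketch below clusters six toy points with Ward's method and computes the average silhouette width with `silhouette()` from the 'cluster' package, which the notebook already loads. The data are invented and form two obvious groups, so the ASW should land in the top band of the scale above. This is not the weighted computation that `as.clustrange()` performs internally, only an unweighted analogue.

```r
library(cluster)

# Six toy observations forming two well-separated groups (not the Orbis data)
x <- c(0, 0.1, 0.2, 5, 5.1, 5.2)
d <- dist(x)

# Ward clustering, cut into two clusters
hc <- hclust(d, method = "ward.D")
cl <- cutree(hc, k = 2)

# Silhouette width per observation, then the average (ASW)
sil <- silhouette(cl, d)
asw <- mean(sil[, "sil_width"])

print(asw)
```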

In [None]:
# Show the two best clustering solutions for each quality measure
summary(cluQual_ward35, max.rank = 2)

## Visualization of clusters

In [None]:
## Visualize all 35 clusters together

# Cut the dendrogram into the chosen number of clusters (k)
clTree1 <- cutree(clward1, k = 35)

# Convert the cluster memberships into a labeled factor
clward35 <- factor(clTree1, labels = paste("Trajectory", 1:35))

# Save as .png
png(filename = "seqdplot35.png", width = 3200, height = 2200, units = "px", pointsize=30)
seqplot(sts.seq1, group = clward35, type = "I", sortv = "from.start",
        with.legend = 'auto', border = NA, use.layout = TRUE, cols = 6)
dev.off()


## Visualize each cluster separately (shown here for Trajectory 1)

png(filename = "Cluster_1.png", width = 650, height = 500, units = "px", pointsize=18)
seqplot(sts.seq1[clward35 == "Trajectory 1", ], type = "I", sortv = "from.start",
        with.legend = 'right', border = NA, use.layout = TRUE)
dev.off()