## Replication: Dion, Sumner, Mitchell (2018): Gendered Citation Patterns across Political Science and Social Science Methodology Fields

### Introduction

In their article in **Political Analysis**, Dion, Sumner and Mitchell (2018) aim to explain gender-biased citation behaviour in the social sciences. Building on 

### Setup

In [13]:
# load necessary packages
# MASS is attached before tidyverse so that MASS::select() does not mask dplyr::select()
library(MASS)
library(tidyverse)
library(foreign)
library(IRdisplay)
library(optimx)
library(rms)
library(kableExtra)


# load datasets necessary for replication
# Data can be found under
# https://www.cambridge.org/core/journals/political-analysis/article/gendered-citation-patterns-across-political-science-and-social-science-methodology-fields/5E8E92DB7454BCAE41A912F9E792CBA7#supplementary-materials-tab
df <- read.dta("Data/DSM2018PAreplication.dta")
df_articles <- read.dta("Data/DSM2018PAreplication_articlesonly.dta")

# load source code necessary for analysis
source("sources/logistic_function.R")
source("sources/execute_logistic_per_journal.R")




### Replicate Summary tables

The article by Dion et al. (2018) contains Tables 1 and 2, which summarise the gender distribution of the authors of published articles and of the authors of the references cited in those articles.

In [22]:
##Table 1

df_articles %>%
  group_by(newjnlid, authorteam) %>%
  summarise(N = n()) %>%
  mutate(Percent = 100 * N / sum(N)) %>%
  kable("html", caption = "Table 1: Distribution of author genders by article, 2007–2016.")  %>%
  as.character() %>%
  display_html()

`summarise()` has grouped output by 'newjnlid'. You can override using the
`.groups` argument.


| newjnlid | authorteam | N | Percent |
|---|---|---|---|
| APSR | Male only | 324 | 69.8275862 |
| APSR | Female only | 67 | 14.4396552 |
| APSR | Mixed | 73 | 15.7327586 |
| Politics & Gender | Male only | 27 | 7.9178886 |
| Politics & Gender | Female only | 266 | 78.0058651 |
| Politics & Gender | Mixed | 47 | 13.7829912 |
| Politics & Gender | | 1 | 0.2932551 |
| Political Analysis | Male only | 220 | 74.5762712 |
| Political Analysis | Female only | 8 | 2.7118644 |
| Political Analysis | Mixed | 67 | 22.7118644 |


In [23]:
##Table 2

df %>%
  filter(refauthcomplete == 1 & !is.na(refteam)) %>%
  group_by(newjnlid, refteam) %>%
  summarise(N = n()) %>%
  mutate(Percent = 100 * N / sum(N)) %>%
  kable("html", caption = "Table 2: Distribution of reference author genders, 2007–2016.")  %>%
  as.character() %>%
  display_html()

`summarise()` has grouped output by 'newjnlid'. You can override using the
`.groups` argument.


| newjnlid | refteam | N | Percent |
|---|---|---|---|
| APSR | Male | 11617 | 74.239519 |
| APSR | Female | 2203 | 14.078477 |
| APSR | Mixed | 1828 | 11.682004 |
| Politics & Gender | Male | 1649 | 27.977604 |
| Politics & Gender | Female | 3405 | 57.770614 |
| Politics & Gender | Mixed | 840 | 14.251781 |
| Political Analysis | Male | 4650 | 78.933967 |
| Political Analysis | Female | 322 | 5.465965 |
| Political Analysis | Mixed | 919 | 15.600068 |
| Econometrica | Male | 9226 | 84.883614 |


Using the data provided in the replication material, I am able to replicate the numbers for both tables.

I now proceed to replicate Table 3 of the article, which consists of six separate logistic regressions with robust standard errors. Each model estimates whether a cited work was authored by women only, as a function of the gender composition of the citing article's author team (male only vs. mixed vs. female only).
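To make the model structure concrete, here is a minimal sketch of one such regression on simulated data. It is not the code in `sources/logistic_function.R`, which is not shown here. The data frame `toy`, the variables `citing_team` and `cited_female`, and the simulated coefficients are hypothetical. The robust standard errors are assumed to be of the plain Huber-White (HC0) type, and the original analysis may use a different variant.

```r
# Simulated stand-in data; every name and number here is hypothetical.
set.seed(42)
n <- 5000
toy <- data.frame(
  citing_team = factor(
    sample(c("Male only", "Female only", "Mixed"), n, replace = TRUE),
    levels = c("Male only", "Female only", "Mixed")  # first level is the baseline
  )
)
eta <- -2 + 0.9 * (toy$citing_team == "Female only") +
  0.1 * (toy$citing_team == "Mixed")
toy$cited_female <- rbinom(n, 1, plogis(eta))

# Logistic regression: is the cited work authored by women only?
fit <- glm(cited_female ~ citing_team, data = toy, family = binomial)

# Huber-White (HC0) sandwich estimator, computed by hand in base R:
# bread = inverse information matrix, meat = cross-product of per-observation scores.
X <- model.matrix(fit)
score <- X * (toy$cited_female - fitted(fit))
bread <- vcov(fit)
robust_vcov <- bread %*% crossprod(score) %*% bread
robust_se <- sqrt(diag(robust_vcov))

result <- data.frame(
  Estimate = round(coef(fit), 2),
  Robust_SE = round(robust_se, 2)
)
print(result)
```

With "Male only" as the reference level, the two slope coefficients correspond to the Female and Mixed rows of Table 3.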

In [14]:
models <- do.call("cbind", lapply(unique(df$newjnlid), logistic_per_journal) )
pooled <- logistic_per_journal("Pooled")

In [15]:
cbind.data.frame(models, pooled) %>% rownames_to_column(" ") %>%
  kable("html", caption = "Table 3: Logistic Regression Estimates: Effect of gender of citing author on gender of cited authors (1=female)")  %>%
  as.character() %>%
  display_html()

| | APSR | Politics & Gender | Political Analysis | Econometrica | Soc. Methods & Res. | Pooled |
|---|---|---|---|---|---|---|
| Intercept | -2.07 (0.05) | -0.01 (0.11) | -2.84 (0.09) | -3.18 (0.06) | -2.46 (0.1) | -2.02 (0.05) |
| Female | 0.99 (0.16) | 0.53 (0.12) | 0.42 (0.38) | 1.14 (0.22) | 0.76 (0.28) | 0.86 (0.1) |
| Mixed | 0.21 (0.13) | -0.15 (0.16) | -0.08 (0.16) | 0.07 (0.14) | 0.06 (0.18) | 0.11 (0.08) |
| P&G | | | | | | 1.73 (0.1) |
| PA | | | | | | -0.89 (0.09) |
| Econ | | | | | | -1.14 (0.07) |
| SMR | | | | | | -0.47 (0.1) |
| Pseudo R2 | -0.026 | -0.0165 | -7e-04 | -0.0106 | -0.0078 | -0.2796 |
| NullLL | -6359 | -4007 | -1249 | -1951 | -1185 | -18566 |
| LL | -6198 | -3942 | -1248 | -1931 | -1175 | -14509 |
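
The Pseudo R2 row reports negative values. Assuming that row is meant to be McFadden's pseudo R2, which is 1 - LL / NullLL and lies between 0 and 1 here, it can be recomputed from the NullLL and LL rows. The sketch below does this with the rounded log-likelihoods from the table; the vector names are my own. The tabulated values agree, up to rounding, with the inverted ratio 1 - NullLL / LL. This suggests that the ratio may be flipped in the helper function, but I cannot confirm it here because `sources/logistic_function.R` is not shown.

```r
# Rounded log-likelihoods copied from the NullLL and LL rows of Table 3.
journals <- c("APSR", "Politics & Gender", "Political Analysis",
              "Econometrica", "Soc. Methods & Res.", "Pooled")
null_ll  <- c(-6359, -4007, -1249, -1951, -1185, -18566)
model_ll <- c(-6198, -3942, -1248, -1931, -1175, -14509)
names(null_ll) <- names(model_ll) <- journals

# McFadden's pseudo R2: 1 - LL / NullLL.
mcfadden <- 1 - model_ll / null_ll

# Inverted ratio, shown only for comparison with the tabulated row.
inverted <- 1 - null_ll / model_ll

print(round(cbind(McFadden = mcfadden, Inverted = inverted), 4))
```

Because the inputs are rounded to whole numbers, the recomputed values can differ slightly from what the unrounded log-likelihoods would give.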
