# Second Author Title and Abstract Review

**Author:** Jack Galbraith-Edge

In this notebook, I will:
- Merge all results into one large results dataframe.
- Create sample dataframes for second author review.
- Export these dataframes to CSVs for review.
- Calculate Kappa values for first and second author title and abstract agreement.

In [2]:
from msc_code.scripts.helpers import *
from msc_code.scripts.notebook_setup import *

In [3]:
# Import google scholar and database search data
google_df = pd.read_csv("/".join([PROC_DATA_DIR, "cleaned_google_scholar_title_abstract_screen_end_1.csv"]))
database_df = pd.read_csv("/".join([PROC_DATA_DIR, "database_search_title_abstract_screen_end.csv"]))

In [9]:
# Merge all Google Scholar and Database search results
all_df = pd.merge(
    database_df, google_df,
    how='outer',
    on=["Publication Year", 
        "First Author", 
        "Authors",
        "Publication Title",
        "Title",
        "Abstract",
        "Database",
        "Exclude",
        "Reason ID",
        "DOI"]
)

# Specify Desired Column Order
desired_column_order = ["Publication Year", "First Author", "Authors", "Publication Title", "Summary", "Database", "DOI", "URL", "Title", "Abstract", "Abstract2", "Exclude", "Reason ID"]

# Apply to dataframe
all_df = all_df[desired_column_order]

# Fix capitalisation of text fields
all_df["Title"] = all_df["Title"].str.title()
all_df["First Author"] = all_df["First Author"].str.title()
all_df["Authors"] = all_df["Authors"].str.title()
all_df["Abstract"] = all_df["Abstract"].str.title()

# Sort by Publication Year ascending.
all_df = all_df.sort_values(by="Publication Year", ascending=True).reset_index(drop=True)

all_df.index.name = 'id' # Name the index 'id', giving each paper a unique ID.
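
The outer merge above keeps every row from both sources and only combines rows whose key columns match exactly. The sketch below illustrates that behaviour on toy data. The frames `db` and `gs` and their contents are hypothetical stand-ins, not the real search results, and `indicator=True` is added here only to show where each row came from.

```python
import pandas as pd

# Hypothetical toy stand-ins for database_df and google_df
db = pd.DataFrame({
    "Title": ["Paper A", "Paper B"],
    "DOI": ["10.1/a", "10.1/b"],
})
gs = pd.DataFrame({
    "Title": ["Paper B", "Paper C"],
    "DOI": ["10.1/b", "10.1/c"],
    "URL": ["https://example.org/b", "https://example.org/c"],
})

# Outer merge on the shared columns; the _merge column records each row's origin
merged = pd.merge(db, gs, how="outer", on=["Title", "DOI"], indicator=True)

# Count rows found only in the left frame, only in the right frame, or in both
source_counts = merged["_merge"].value_counts().to_dict()
print(source_counts)
```

Because every key column must match exactly, two records for the same paper that differ in any key field (for example capitalisation of the title) would be kept as separate rows rather than combined.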

In [10]:
# Export all JGE title and abstract reviewed articles
all_df.to_csv("/".join([PROC_DATA_DIR, "all_results_title_abstract_reviewed_jge_end.csv"]))


In [11]:
# Create clean dataframe to export for other author review
clean_df = all_df.copy() # copy, so that blanking the columns below does not also modify all_df

# Remove reasons for exclusion and previously marked exclusions for clean export and second author review.
clean_df['Exclude'] = None
clean_df['Reason ID'] = None

# Export to CSV
clean_df.to_csv("/".join([RAW_DATA_DIR, "all_results_title_abstract_start.csv"]))

In [12]:
# Create sample of 10% of papers
clean_sample_df = create_results_sample(clean_df, 0.1, 42) # '42' is the random seed used to create the sample

# Export the clean sample to the raw data directory as ms_title_abstract_review_start.csv
clean_sample_df.to_csv("/".join([RAW_DATA_DIR, "ms_title_abstract_review_start.csv"]))

Given sample size: 495
Calculating desired sample size... 10% of 495 = 50
Creating Sample Dataframe
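
The body of `create_results_sample` lives in the helpers module and is not shown in this notebook. As a rough sketch of what a seeded 10% sample could look like, the hypothetical function `sample_fraction` below rounds the sample size up (an assumption, chosen because it is consistent with 10% of 495 being reported as 50) and uses `DataFrame.sample` with a fixed `random_state`.

```python
import math

import pandas as pd


def sample_fraction(df: pd.DataFrame, fraction: float, seed: int) -> pd.DataFrame:
    """Hypothetical stand-in for create_results_sample: draw a reproducible random sample."""
    # Assumption: round the sample size up, so 10% of 495 rows gives 50
    n = math.ceil(len(df) * fraction)
    # A fixed random_state makes the same rows come back on every run
    return df.sample(n=n, random_state=seed).sort_index()


# Toy dataframe with the same number of rows as the merged results
papers = pd.DataFrame({"Title": [f"Paper {i}" for i in range(495)]})
papers.index.name = "id"

sample = sample_fraction(papers, 0.1, 42)
print(len(sample))  # 50
```

Fixing the seed means the second author's sample can be regenerated later and will contain the same papers.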


In [13]:
# Import second author (MS) Title Abstract Review Data
ms_reviewed_df = pd.read_csv("/".join([RAW_DATA_DIR, "ms_title_abstract_review_end.csv"]))

# Check what values are present in Exclude column
print(ms_reviewed_df['Exclude'].unique())


[ True False]


In [14]:
# Import title and abstract review results from JGE.
jge_reviewed_df = pd.read_csv("/".join([PROC_DATA_DIR, "all_results_title_abstract_reviewed_jge_end.csv"]))

# Rename columns before merge with second author (MS) review data.
jge_reviewed_df = jge_reviewed_df.rename(columns={
    'Exclude': 'Exclude_JGE',
    'Reason ID': 'Reason ID_JGE'
})

In [15]:
# Keep only 'id' (for the merge) and the review columns; the rest are already in jge_reviewed_df
ms_reviewed_df = ms_reviewed_df[['id', 'Exclude', 'Reason ID']]

# Rename columns in the reviewed dataset (reassign rather than renaming a column subset in place)
ms_reviewed_df = ms_reviewed_df.rename(columns={
    'Exclude': 'Exclude_MS', 
    'Reason ID': 'Reason ID_MS'
})

# Right merge on 'id', keeping only the papers reviewed by MS alongside JGE's decisions.
jge_ms_reviewed_df = pd.merge(jge_reviewed_df, ms_reviewed_df, on='id', how='right')
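
A right merge like the one above silently produces missing values if an `id` in the second reviewer's file has no counterpart in the first reviewer's file, and duplicates rows if an `id` appears twice. The sketch below shows one way this could be checked on toy data. The frames `jge` and `ms` are hypothetical, and `validate="one_to_one"` is an optional `pd.merge` argument that the notebook itself does not use.

```python
import pandas as pd

# Hypothetical toy decisions from each reviewer
jge = pd.DataFrame({"id": [1, 2, 3], "Exclude_JGE": [True, False, True]})
ms = pd.DataFrame({"id": [1, 3], "Exclude_MS": [True, False]})

# validate="one_to_one" raises pandas.errors.MergeError if an id is duplicated on either side
both = pd.merge(jge, ms, on="id", how="right", validate="one_to_one")

# Any missing first-reviewer decision would indicate an id that failed to match
unmatched = int(both["Exclude_JGE"].isna().sum())
print(len(both), unmatched)
```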

In [16]:
# See how many papers JGE and MS agree on.
jge_ms_agree = jge_ms_reviewed_df[jge_ms_reviewed_df['Exclude_JGE'] == jge_ms_reviewed_df['Exclude_MS']] # create JGE/MS agreement dataframe
print(f"JGE and MS agree on {len(jge_ms_agree)}/{len(jge_ms_reviewed_df)} reviewed by MS.")

jge_ms_disagree = jge_ms_reviewed_df[jge_ms_reviewed_df['Exclude_JGE'] != jge_ms_reviewed_df['Exclude_MS']] # create JGE/MS disagreement dataframe
print(f"JGE and MS disagree on {len(jge_ms_disagree)}/{len(jge_ms_reviewed_df)} reviewed by MS.")

# Export to CSV
jge_ms_disagree.to_csv("/".join([RAW_DATA_DIR, "jge_ms_disagree_title_abstract.csv"]), index=False) # Export JGE/MS disagreements
jge_ms_agree.to_csv("/".join([RAW_DATA_DIR, "jge_ms_agree_title_abstract.csv"]), index=False) # Export JGE/MS agreements

# Calculate Cohen's Kappa
jge_include = jge_ms_reviewed_df[jge_ms_reviewed_df['Exclude_JGE'] == False] # papers reviewed by both authors that JGE wishes to include
jge_exclude = jge_ms_reviewed_df[jge_ms_reviewed_df['Exclude_JGE'] == True] # papers reviewed by both authors that JGE wishes to exclude
ms_include = jge_ms_reviewed_df[jge_ms_reviewed_df['Exclude_MS'] == False]  # papers reviewed by both authors that MS wishes to include
ms_exclude = jge_ms_reviewed_df[jge_ms_reviewed_df['Exclude_MS'] == True]   # papers reviewed by both authors that MS wishes to exclude

ms_reviewed_count = len(ms_reviewed_df) # number of papers reviewed by both JGE and MS
jge_ms_agree_count = len(jge_ms_agree) # number of papers JGE and MS agree on

jge_include_count = len(jge_include) # number of JGE inclusions
jge_exclude_count = len(jge_exclude) # number of JGE exclusions
ms_include_count = len(ms_include) # number of MS inclusions
ms_exclude_count = len(ms_exclude) # number of MS exclusions

# Observed agreement
P_o = jge_ms_agree_count / ms_reviewed_count

# Expected (chance) agreement
P_e =   ((jge_include_count / ms_reviewed_count) * (ms_include_count / ms_reviewed_count)) + \
        ((jge_exclude_count / ms_reviewed_count) * (ms_exclude_count / ms_reviewed_count))

# Cohen's Kappa
kappa = (P_o - P_e) / (1 - P_e)

print(f"Cohen's Kappa: {kappa:.3f}")

JGE and MS agree on 36/50 reviewed by MS.
JGE and MS disagree on 14/50 reviewed by MS.
Cohen's Kappa: 0.467
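
The kappa calculation above, (P_o - P_e) / (1 - P_e), can also be written as a small reusable function that works directly on the two decision columns. The sketch below is an illustration on toy data only: `cohen_kappa`, `jge_toy` and `ms_toy` are hypothetical names, and the toy decisions are not the review data. On the real data, something like `cohen_kappa(jge_ms_reviewed_df['Exclude_JGE'], jge_ms_reviewed_df['Exclude_MS'])` would be expected to match the value printed above.

```python
import pandas as pd


def cohen_kappa(rater_a, rater_b) -> float:
    """Cohen's kappa for two equal-length sequences of categorical decisions."""
    a = pd.Series(list(rater_a))
    b = pd.Series(list(rater_b))
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("Both raters need the same, non-zero number of decisions")

    # Observed agreement: proportion of items given the same label by both raters
    p_o = (a == b).mean()

    # Expected agreement: chance that both raters independently pick the same label
    labels = set(a) | set(b)
    p_e = sum((a == label).mean() * (b == label).mean() for label in labels)

    if p_e == 1:
        # Both raters used a single identical label, so kappa is undefined
        return float("nan")
    return float((p_o - p_e) / (1 - p_e))


# Hypothetical toy decisions: the raters agree on 3 of 4 papers
jge_toy = [True, True, False, False]
ms_toy = [True, False, False, False]
toy_kappa = cohen_kappa(jge_toy, ms_toy)
print(f"{toy_kappa:.3f}")  # 0.500
```

In the toy example P_o = 0.75 and P_e = 0.5 x 0.25 + 0.5 x 0.75 = 0.5, giving kappa = 0.5. If scikit-learn is available, `sklearn.metrics.cohen_kappa_score` offers an independent cross-check of the same statistic.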


At this point, the disagreements were sent to my supervisor (GC) for a third review.

In [17]:
gc_reviewed = pd.read_csv("/".join([RAW_DATA_DIR, "jge_ms_disagree_title_abstract_gc_comments.csv"]))

In [18]:
gc_reviewed

Unnamed: 0,id,Publication Year,First Author,Authors,Publication Title,Summary,Database,DOI,URL,Title,Abstract,Abstract2,Exclude_JGE,Reason ID_JGE,Exclude_MS,Reason ID_MS,Unnamed: 16,GC,Unnamed: 18
0,73,2002,"Moshfeghi, D.","Moshfeghi, Darius M.; Wilson, Matthew W.; Mart...",Ophthalmic Plastic And Reconstructive Surgery,,PubMed,10.1097/00002341-200205000-00011,,Swallowed Ocular Prostheses: Report Of Three C...,Q1211,,False,,True,3.0,,Unclear,No abstract - would need to see full paper
1,204,2014,"Kim, A.","Kim, Andrew; Ahn, Kang-Min",Implant Dentistry,,PubMed,10.1097/id.0000000000000100,,Endoscopic Removal Of An Aspirated Healing Abu...,Purpose: Aspiration Of Foreign Bodies During D...,,False,,True,3.0,,Exclude,"1. Not oral ingestion, 2. Not clearly intentio..."
2,55,1997,"Ozanne-Smith, A.","Ae Altmann, J Ozanne-Smith",Injury Prevention,"AE Altmann, J Ozanne-Smith - Injury Prevention...",Google Scholar,10.1136/ip.3.3.176,https://injuryprevention.bmj.com/content/3/3/1...,Non-Fatal Asphyxiation And Foreign Body Ingest...,Objectives: To Examine The Frequency And Natur...,OBJECTIVES: To examine the frequency and natur...,False,,True,3.0,,Unclear,"Likely to be accidental , but might include so..."
3,320,2019,"Benoist, L.","Benoist, Lbl; Van Der Hoven, B; De Vries, Ac; ...",Acta Oto-Laryngologica Case Reports,,Web of Science,10.1080/23772484.2018.1555436,,A Jackass And A Fish: A Case Of Life-Threateni...,Inspired By Jackass (A Tv-Show About Self-Inju...,,False,,True,3.0,,Exclude,Not intentional self-harm (it was an intention...
4,250,2016,"Lim, C.","Lim, Chun Woo; Park, Min Hwan; Do, Hyun Jeong;...","Pediatric Gastroenterology, Hepatology & Nutri...",,PubMed,10.5223/pghn.2016.19.3.168,,Factors Associated With Removal Of Impactted F...,Purpose: The Management And Clinical Course In...,,False,,True,3.0,,Exclude,Not intentional
5,450,2023,"Alturkmani, O.","Alturkmani, Omar G.; Al-Badawi, Maysa M.; Altu...",Cureus,,PubMed,10.7759/cureus.37684,,A Case Report Of Non-Intentional Foreign Body ...,Foreign Body Ingestion Is A Common Problem Tha...,,False,,True,3.0,,Exclude,Not intentional
6,287,2018,"Sola, R.","Sola, Richard Jr; Rosenfeld, Eric H.; Yu, Yang...",Journal Of Pediatric Surgery,,PubMed,10.1016/j.jpedsurg.2017.08.013,,Magnet Foreign Body Ingestion: Rare Occurrence...,Purpose: To Review The Outcomes Of Magnet Inge...,,False,,True,3.0,,Unclear,"Likely to be accidental , but might include so..."
7,72,2001,"Vicente, Y.","Vicente, Y.; Hernandez-Peredo, G.; Molina, M.;...",Journal Of Pediatric Surgery,,PubMed,10.1053/jpsu.2001.26381,,Acute Food Bolus Impaction Without Stricture I...,Background/Purpose: This Report Identifies A G...,,False,,True,2.0,,Exclude,"Not intentional, food bolus"
8,77,2003,"Janik, J.","Janik, Je; Janik, Js",Journal Of Pediatric Surgery,,Web of Science,10.1053/jpsu.2003.50049,,Magill Forceps Extraction Of Upper Esophageal ...,Background/Purpose: This Study Demonstrates A ...,,False,,True,3.0,,Unclear,Could well include intentional coin ingestion
9,438,2023,"Raju, R.","Raju, Rajkiran S.; Raj, A. John Michael; Shubh...",Pediatric Surgery International,,PubMed,10.1007/s00383-022-05360-3,,Impacted Esophageal Foreign Bodies In Children,Purpose: Foreign Body (Fb) Ingestion In Childr...,,False,,True,3.0,,Unclear,Could well include intentional foreign object ...
