# AI Ethics Researchers

This script builds Scopus search queries from the IDs of authors who have published at either FAccT or AIES, based on bibliographic data exported from Scopus. It serves as an alternative to the Scopus API search capability.

+ **Input data:** Scopus metadata in CSV format  
+ **Output data:** Search queries in txt files (up to 50 author IDs each) for the Scopus advanced search function

Developed by Tyler Reinmund  
Date: 8 June 2021

Department of Science and Technology Studies  
University College London

In [1]:
# Import libraries
import glob
import pandas as pd
import re
import string
import sys
import os

In [8]:
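# Import Scopus data: FAccT and AIES publications
in_path = '' # Update file path here
# os.path.join avoids searching the filesystem root ("/*.csv") when in_path is left empty
all_files = glob.glob(os.path.join(in_path, "*.csv"))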
li = []

for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0)
    li.append(df)

full_df = pd.concat(li, axis=0, ignore_index=True, sort=False)

full_df.shape

(449, 452)

In [9]:
# Drop articles without abstracts, titles, keywords, authors, or author ID
df_clean = full_df.dropna(axis=0, subset=['Abstract', 'Title', 'Author Keywords', 'Authors', 'Author(s) ID'])

df_clean.shape

(364, 452)
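
The filter above removes a row if any one of the listed columns is empty. To see which column accounts for most of the dropped rows, it can help to count missing values per column first. The sketch below uses a small hypothetical table (the titles, keywords, and IDs are made up, and only three of the five columns are shown) rather than the real Scopus export.

```python
import pandas as pd

# Hypothetical toy records; the real export has many more rows and columns
toy = pd.DataFrame({
    'Title': ['Paper A', 'Paper B', 'Paper C'],
    'Author Keywords': ['fairness', None, None],
    'Author(s) ID': ['111;222;', '333;', None],
})

required = ['Title', 'Author Keywords', 'Author(s) ID']

# Number of missing values in each required column
missing_per_column = toy[required].isna().sum()

# Keep only rows where every required column is present
toy_clean = toy.dropna(axis=0, subset=required)

print(missing_per_column.to_dict())
print(toy_clean.shape)
```

Running the same `isna().sum()` check on `full_df` would show how many of the dropped articles lack each field.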

In [10]:
# Create list of author IDs from "Author(s) ID" column
id_list = df_clean['Author(s) ID'].to_list()

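# Concatenate the ID strings of all articles, keeping a delimiter between them
id_list = ';'.join(id_list)

# Split into individual author IDs and strip surrounding whitespace
id_list = [x.strip() for x in id_list.split(';')]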

# Remove duplicates
id_list = list(set(id_list))

# Remove any invalid entries
id_list = [x for x in id_list if x.isdigit()]

len(id_list)

996
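
The parsing step is sensitive to how each cell is delimited. The sketch below uses hypothetical IDs to show that joining cells with an empty string only keeps IDs separate when every cell ends in a semicolon. If an export omits the trailing semicolon (an assumption about other export formats, not something this dataset confirms), neighbouring IDs fuse into one longer number that still passes the `isdigit()` check. Splitting each cell on its own, as in the hypothetical helper `split_ids`, avoids this in both cases.

```python
# Hypothetical author IDs, for illustration only
with_trailing = ['111;222;', '333;']     # every cell ends in ';'
without_trailing = ['111; 222', '333']   # no trailing ';', space after delimiter


def split_ids(cells):
    """Split each cell on ';' separately and keep only unique numeric IDs."""
    ids = set()
    for cell in cells:
        for part in cell.split(';'):
            part = part.strip()
            if part.isdigit():
                ids.add(part)
    return sorted(ids)


# Joining with '' fuses the last ID of one cell with the first ID of the next
fused = ''.join(without_trailing).split(';')

print(fused)
print(split_ids(with_trailing))
print(split_ids(without_trailing))
```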

In [12]:
# Create search query for Scopus search
id_search = ['AU-ID({}) OR '.format(s) for s in id_list] #List comprehension to include search terms for author ID 'AU-ID(string) OR '

# Concatenate all items into strings with 50 author IDs each for search
n = 50

# List comprehension to break list into sublists with 50 elements each
id_separated = [id_search[i * n:(i + 1) * n] for i in range((len(id_search) + n - 1) // n)]

# Save each group of search terms as a text file
out_path = '' # Update file path here to save search strings

for i, chunk in enumerate(id_separated):
    filename = 'auth_id_search_{}.txt'.format(i)
    # Join the terms and strip the trailing ' OR ' (4 characters) left by the last term
    sub_list = ''.join(chunk)[:-4]
    # Use a context manager so each file is closed after writing
    with open(os.path.join(out_path, filename), 'w') as f:
        print(sub_list, file=f)
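
As a compact illustration of the chunking logic, the sketch below builds the query strings in memory instead of writing files. The function name `build_queries` and the example IDs are hypothetical. It joins the terms with `' OR '` directly, which is a variation on the approach above that avoids trimming a trailing separator.

```python
def build_queries(author_ids, chunk_size=50):
    """Return one 'AU-ID(...) OR AU-ID(...)' string per chunk of author IDs."""
    queries = []
    for start in range(0, len(author_ids), chunk_size):
        chunk = author_ids[start:start + chunk_size]
        queries.append(' OR '.join('AU-ID({})'.format(a) for a in chunk))
    return queries


# Hypothetical IDs with a small chunk size so the output is easy to read
example_ids = ['111', '222', '333', '444', '555']
queries = build_queries(example_ids, chunk_size=2)

for q in queries:
    print(q)
```

With 996 author IDs and 50 IDs per chunk, this grouping gives 20 queries: 19 full ones and a final one with 46 IDs.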