# Impersonate and change author

If you want to replace every instance of an author across a set of records, this script impersonates the owner's account, gathers that account's published records, and replaces the author on each one. The account still owns the records afterwards.

If you unpublish the items first, republishing them won't create a new DOI version.


In [1]:
import json
import requests

In [4]:
#Set the token in the header and base URL

with open("./././testing-token.txt", "r") as text_file: #Paste your token in a text file and save it where this notebook is
    TOKEN = text_file.read().strip() #strip() returns a new string, so keep the result; removes hidden spaces and newlines


api_call_headers = {'Authorization': 'token ' + TOKEN}


#Get the author info from this endpoint: https://docs.figshare.com/#private_institution_accounts_list
account_id = ENTER ACCOUNT ID #This is 'id' in the endpoint output and is the account id for impersonation
user_id = ENTER USER ID #this is the user_id in the endpoint output and will be replaced by the new user (author) id later 
 


#Set the base URL
BASE_URL = 'https://api.figshare.com/v2' #Change this to 'https://api.figsh.com/v2' if you want to test in stage
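
A token saved in a text file often ends with a newline, and `str.strip()` returns a new string rather than changing the original, so the cleaned value has to be captured before it goes into the header. The sketch below shows one way to wrap that up. The helper names `clean_token`, `read_token`, and `build_headers` are hypothetical and not part of the original notebook.

```python
from pathlib import Path

def clean_token(raw):
    """Strip surrounding whitespace (a trailing newline is common in saved files)."""
    return raw.strip()

def read_token(path):
    """Read a token file and return the cleaned token (hypothetical helper)."""
    return clean_token(Path(path).read_text())

def build_headers(token):
    """Build the Authorization header in the same 'token <value>' form used above."""
    return {'Authorization': 'token ' + token}

# A trailing newline would otherwise end up inside the header value
print(build_headers(clean_token("abc123\n")))
```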

## Get a list of article ids

Get the ids of the articles the account owns that have a published date.

In [8]:
#Gather the article ids. This request returns only the first page (20 records); raise page_size or request further pages for larger accounts.

s = requests.get(BASE_URL + '/account/articles?page=1&page_size=20&impersonate=' + str(account_id), headers=api_call_headers) 
metadata=json.loads(s.text)

article_ids = []

for i in metadata:
    if i['published_date'] is not None: #if a record has a published date
        article_ids.append(i['id'])

print(len(article_ids),'item ids collected')


15 item ids collected
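
The request above asks for a single page of 20 records, so an account that owns more than that would need further pages. Below is a minimal sketch of a paging loop. It assumes the endpoint returns an empty or short list once the records run out. `fetch_page` is a hypothetical callable standing in for the API request, which lets the loop be tried without a network connection.

```python
def collect_published_ids(fetch_page, page_size=20):
    """Walk through pages until one comes back empty or short.

    fetch_page(page, page_size) is a hypothetical callable that returns
    a list of article dictionaries for the requested page.
    """
    ids = []
    page = 1
    while True:
        batch = fetch_page(page, page_size)
        if not batch:
            break
        # Keep only records that have a published date
        ids.extend(rec['id'] for rec in batch if rec.get('published_date') is not None)
        if len(batch) < page_size:
            break
        page += 1
    return ids

# Stand-in for the API: three made-up records served two per page
_fake_records = [
    {'id': 1, 'published_date': '2019-01-01'},
    {'id': 2, 'published_date': None},
    {'id': 3, 'published_date': '2019-02-01'},
]

def fake_fetch(page, page_size):
    start = (page - 1) * page_size
    return _fake_records[start:start + page_size]

print(collect_published_ids(fake_fetch, page_size=2))
```

In the notebook, `fetch_page` could wrap `requests.get(BASE_URL + '/account/articles', params={'page': page, 'page_size': page_size, 'impersonate': account_id}, headers=api_call_headers).json()`.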


## Retrieve Author information
The author info is collected in the same order as the item ids. This matters because the upload step pairs the two lists by position.

In [11]:
#Creates a list of dictionaries, one per public item. Each dictionary holds the item's author ids and the account id to impersonate.

author_list = []
published_items = []

for item_id in article_ids: 
    s=requests.get(BASE_URL + '/account/articles/' + str(item_id), headers=api_call_headers)
    metadata=json.loads(s.text)
    if metadata['status'] == 'public': #if the record is currently public
        published_items.append(metadata['id'])
        authors = metadata['authors']
        author_id_list = []
        for a in authors:
            id_list = {key: a[key] for key in a.keys() & {'id'}} #extracts just the author ids
            author_id_list.append(id_list)
        author_dict = {}
        author_dict['authors'] = author_id_list
        author_dict['impersonate'] = account_id
        author_list.append(author_dict)

print(len(author_list),'author sets collected')
print('Unpublish these items:',published_items)

12 author sets collected
Unpublish these items: [8324958, 8283932, 8250362, 7658738, 7183908, 7183888, 7183886, 6969378, 6968984, 6968976, 6965540, 5735939]


#### Copy the item ids above (without the square brackets) and paste them into the unpublish field on the administrator page.
That will unpublish them, and when they are republished below they will not receive a new DOI version.
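
To avoid trimming the square brackets by hand, the ids can be printed as a plain string. This is a small sketch. The helper name `format_ids_for_paste` is hypothetical, and the comma-and-space separator is an assumption about what the unpublish field accepts.

```python
def format_ids_for_paste(item_ids, separator=', '):
    """Join item ids into one string with no surrounding brackets.

    The default separator is an assumption; change it if the admin
    field expects something else.
    """
    return separator.join(str(item_id) for item_id in item_ids)

print(format_ids_for_paste([8324958, 8283932, 8250362]))
```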

## Change id for the appropriate author

In [12]:
new_user_id = ENTER USER ID #this is the user_id of the new author in the endpoint output


In [13]:
#Make the author id changes to prepare for upload
for i in author_list: #author_list is a list of dictionaries that contain a list of dictionaries
    for j in i['authors']: #for each dictionary in the list of author dictionaries
        for ids, a_id in j.items(): # for each key value pair 
            if a_id == user_id: #if the value matches the author id/user id
                j[ids] = new_user_id #replace with the new id
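
The same replacement can be written as a function that also reports how many entries it changed, which gives a quick check that the old id was found at all. This is a sketch on made-up sample data. `replace_author_id` is a hypothetical name, and it compares only the `'id'` key, which matches the loop above as long as each author dictionary holds just an `'id'`.

```python
def replace_author_id(author_list, old_id, new_id):
    """Swap old_id for new_id in every payload; the list is modified in place.

    Returns the number of author entries that were changed.
    """
    changed = 0
    for payload in author_list:
        for author in payload['authors']:
            if author.get('id') == old_id:
                author['id'] = new_id
                changed += 1
    return changed

# Made-up payloads in the same shape as author_list
sample = [
    {'authors': [{'id': 111}], 'impersonate': 999},
    {'authors': [{'id': 111}, {'id': 222}], 'impersonate': 999},
]
count = replace_author_id(sample, old_id=111, new_id=333)
print(count, 'author entries replaced')
```

A count of zero would suggest that `user_id` does not appear in any of the collected records.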


In [14]:
#See the format of the author list
author_list

[{'authors': [{'id': 1448415}], 'impersonate': 1128601},
 {'authors': [{'id': 1448415}], 'impersonate': 1128601},
 {'authors': [{'id': 1448415}], 'impersonate': 1128601},
 {'authors': [{'id': 1448415}, {'id': 1002514}, {'id': 1448413}],
  'impersonate': 1128601},
 {'authors': [{'id': 1448415}], 'impersonate': 1128601},
 {'authors': [{'id': 1448415}, {'id': 2155540}], 'impersonate': 1128601},
 {'authors': [{'id': 1448415}], 'impersonate': 1128601},
 {'authors': [{'id': 1448415}, {'id': 1448413}], 'impersonate': 1128601},
 {'authors': [{'id': 1448415}], 'impersonate': 1128601},
 {'authors': [{'id': 1448415}], 'impersonate': 1128601},
 {'authors': [{'id': 1448415}], 'impersonate': 1128601},
 {'authors': [{'id': 1448415}], 'impersonate': 1128601}]

## Upload the new author(s)

This assumes the item id list and the author list are in the same order!

In [76]:
#Change and publish records

record_change_fails = []
record_publish_fails = []
success_count = 0

#published_items and author_list were filled in the same loop, so they line up one-to-one
#(article_ids can be longer, because it also holds items that were not public)
for i, author_data in zip(published_items, author_list):
    
    #Change the authors
    #format as json
    author_json = json.dumps(author_data)
    
    s = requests.put(BASE_URL + '/account/articles/' + str(i), headers=api_call_headers, data = author_json) 
    if s.status_code != 205:
        record_change_fails.append(str(s.content[0:75])) #Add the start of the error message to the list
    else:
        #Publish the record
        body = json.dumps({'impersonate': account_id})
        u = requests.post(BASE_URL + '/account/articles/' + str(i) +'/publish', headers=api_call_headers, data = body)
        if u.status_code != 201:
            record_publish_fails.append(str(u.content[0:75])) #Add the start of the error message to the list
        else:
            success_count += 1
        
print(success_count,'records published. ', len(record_change_fails),'author change fails. ',len(record_publish_fails),'publish fails.')
print('Change failure details:',record_change_fails)
print('Publish failure details:',record_publish_fails)
      
      
      

0 records published.  0 author change fails.  2 publish fails.
Change failure details: []
Publish failure details: ['b\'{"message": "Missing mandatory value: Did you add a README file?", "code": \'', 'b\'{"message": "Missing mandatory value: Did you add a README file?", "code": \'']
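
Because the upload step matches item ids to author payloads by position, a length mismatch between the two lists can misalign records without any obvious error. A small guard is sketched below, alongside building the publish body with `json.dumps` instead of string concatenation. The helper names `pair_items_with_payloads` and `publish_body` are hypothetical, and the ids are made up.

```python
import json

def pair_items_with_payloads(item_ids, payloads):
    """Pair each item id with its payload, refusing lists of different length."""
    if len(item_ids) != len(payloads):
        raise ValueError(
            'Expected one payload per item id, got %d ids and %d payloads'
            % (len(item_ids), len(payloads))
        )
    return list(zip(item_ids, payloads))

def publish_body(account_id):
    """Serialize the impersonation body sent with the publish request."""
    return json.dumps({'impersonate': account_id})

# Made-up ids and payloads in the same shape as the notebook's lists
pairs = pair_items_with_payloads(
    [101, 102],
    [{'authors': [{'id': 1}], 'impersonate': 9},
     {'authors': [{'id': 2}], 'impersonate': 9}],
)
for item_id, payload in pairs:
    print(item_id, json.dumps(payload))
print(publish_body(9))
```

Keeping the item id next to its payload also makes it easy to record which record failed, not only the error message.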


#### To publish all the records at once and automate the review process, delete the code that publishes the record. Then use the Admin Batch Management tool to download the private metadata for your repository, delete all rows except the drafts you'd like to publish, re-upload that set of metadata through the Batch Management tool, and select publish and automatically review.