<a href="https://colab.research.google.com/github/sgbaird/CRediT-statement/blob/main/credit_statement.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CRediT Author Statement Generator

⚠️ WARNING: Google Forms does not support template sharing, and other free platforms such as Microsoft Forms lack the features needed here. As a workaround, **I am sharing the Google Form as a publicly editable template, so please do not edit the form directly**. Instead, follow these steps to make your own copy that you can edit, own, and share:
1. Open `CRediT-template` from [this GDrive folder](https://drive.google.com/drive/folders/1foy9rKJ5BbJVc0YkywhEY-sk7Q9L5MYT?usp=sharing)
2. If not logged in, sign in to your Google account
3. Click the three vertical dots at the top right of the screen and select `Make a Copy`
4. Save the copy somewhere only you have access to
5. *Within the copy*, replace `MANUSCRIPT TITLE` with your own title and delete the warning text
6. Follow the rest of the instructions in the Google Colab notebook
7. Go to the `Responses` tab and click `Link to Sheets`
8. In the linked Google Sheet, click the `Share` button
9. Within the `General Access` section, click the dropdown and change the permissions to `Anyone with the link`
10. Copy the link and paste it below (this is the `sheets_url`)

In [3]:
# Original Google Sheets URL
sheets_url = "https://docs.google.com/spreadsheets/d/1npol09kvtr98gzFP92NpFFUwJT2zusynysnNWzE-mPk/edit?usp=sharing" # @param {type:"string"}

In [54]:
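# Construct the Google Sheets export URL as CSV.
# Splitting on "/edit" also handles links that end differently,
# e.g. "/edit#gid=0" instead of "/edit?usp=sharing".
export_url = sheets_url.split("/edit")[0] + "/export?format=csv"
print(export_url)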

# import the CSV into a Pandas DataFrame
import pandas as pd

categories = ["project", "ideas", "results", "writing"]
raw_df = pd.read_csv(export_url)

raw_df.rename(
    columns={
        "Timestamp": "timestamp",
        "Email Address": "email",
        "Write your name below (e.g. John C. Doe):": "name",
        "Project-Level": "project",
        "Ideas/Methods": "ideas",
        "Results": "results",
        "Writing/Visualization": "writing",
        "Please list any funding sources that need to be acknowledged in the manuscript.": "funding",
        "Please list any conflicts of interest that need to be included in the manuscript.": "conflicts",
        "Comments": "comments",
    },
    inplace=True,
)

raw_df

https://docs.google.com/spreadsheets/d/1npol09kvtr98gzFP92NpFFUwJT2zusynysnNWzE-mPk/export?format=csv


| | timestamp | email | name | project | ideas | results | writing | Comments (optional) |
|---|---|---|---|---|---|---|---|---|
| 0 | 9/5/2023 15:42:25 | sterling.baird@utoronto.ca | Sterling G. Baird | Supervision: Oversight and leadership responsi... | Conceptualization: Ideas; formulation or evolu... | Validation: Verification whether as a part of ... | Writing - Review & Editing: Preparation and/or... | abc |
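
The export URL above is derived from the sharing link by string manipulation. As a minimal sketch of a stricter alternative, the hypothetical helper `to_csv_export_url` below extracts the spreadsheet ID with a regular expression and rebuilds the export URL from it. The helper name is not part of the notebook, and passing `gid` through to select a specific tab is an assumption about the export endpoint rather than something the notebook relies on.

```python
import re


def to_csv_export_url(sheets_url):
    """Build a CSV export URL from a Google Sheets sharing link (illustrative sketch)."""
    # The spreadsheet ID is the path segment that follows "/spreadsheets/d/"
    match = re.search(r"/spreadsheets/d/([a-zA-Z0-9_-]+)", sheets_url)
    if match is None:
        raise ValueError(f"Not a Google Sheets URL: {sheets_url!r}")

    export_url = (
        f"https://docs.google.com/spreadsheets/d/{match.group(1)}/export?format=csv"
    )

    # Assumption: if the link names a tab via "gid=<number>", forward it unchanged
    gid = re.search(r"[#&?]gid=(\d+)", sheets_url)
    if gid is not None:
        export_url += f"&gid={gid.group(1)}"
    return export_url
```

Calling `to_csv_export_url(sheets_url)` with the `sheets_url` defined earlier yields the same export URL printed above, while a link that does not point to a spreadsheet raises a `ValueError` instead of silently producing an unusable URL.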


In [72]:
# Helper function for parsing the Google Sheets strings
def extract_keys(entry):
    """Extracts keys from a string of the form 'key1: value1, key2: value2, ...'

    Parameters
    ----------
    entry : str
        A string of the form 'key1: value1, key2: value2, ...'

    Returns
    -------
    list of str
        A list of keys extracted from the input string.

    Examples
    --------
    >>> entry = 'Supervision: Oversight and leadership responsibility for the research activity planning and execution or including mentorship external to the core team, Project administration: Management and coordination responsibility for the research activity planning and execution, Funding acquisition: Acquisition of the financial support for the project leading to this publication'
    >>> extract_keys(entry)
    ['Supervision', 'Project administration', 'Funding acquisition']
    """
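    if isinstance(entry, str):
        # Keep only fragments that contain a "key: value" pair, so a stray
        # comma inside a description does not produce a spurious key
        keys = [e.split(":")[0].strip() for e in entry.split(",") if ":" in e]
        return keys

    # Non-string entries (e.g., NaN for an unanswered question) have no keys
    return []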

# Apply extract_keys to every row in the project, ideas, results, and writing columns
df = raw_df.copy()
for col in categories:
    df[col] = df[col].apply(extract_keys)

# concatenate the per-category lists into one list per row
# (summing Python lists concatenates them)
df["categories"] = df[categories].apply(lambda x: x.sum(), axis=1)

# drop rows with no categories
df = df[df["categories"].apply(len) > 0]

# # sort the rows so that authors with the most categories are listed first
# df["num_categories"] = df["categories"].apply(len)
# df.sort_values(by=["num_categories", "name"], ascending=False, inplace=True)

# sort the rows by author last name
df["last_name"] = df["name"].apply(lambda x: x.split()[-1])
df.sort_values(by=["last_name", "name"], inplace=True)

| | timestamp | email | name | project | ideas | results | writing | Comments (optional) | categories | num_categories |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9/5/2023 15:42:25 | sterling.baird@utoronto.ca | Sterling G. Baird | [Supervision, Project administration, Funding ... | [Conceptualization, Methodology, Software] | [Validation, Formal analysis] | [Writing - Review & Editing] | abc | [Supervision, Project administration, Funding ... | 9 |
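
Splitting on commas works as long as the option descriptions in the form contain no commas of their own. As a hedged alternative, the sketch below looks for the known CRediT role names directly, so commas inside a description cannot be mistaken for separators. The function name `extract_roles` and the exact label spellings (for example `Writing - Original Draft`) are assumptions chosen to resemble the form's labels; adjust the list if your copy of the form words them differently.

```python
import re

# The 14 CRediT roles; spellings are assumed to match the form's option labels
CREDIT_ROLES = [
    "Conceptualization",
    "Data curation",
    "Formal analysis",
    "Funding acquisition",
    "Investigation",
    "Methodology",
    "Project administration",
    "Resources",
    "Software",
    "Supervision",
    "Validation",
    "Visualization",
    "Writing - Original Draft",
    "Writing - Review & Editing",
]


def extract_roles(entry):
    """Return the known role names found in a form response, in order of appearance."""
    if not isinstance(entry, str):
        return []  # e.g., NaN for an unanswered question

    found = []
    for role in CREDIT_ROLES:
        # A role counts only as "Role:" at the start of the string or after a comma
        pattern = r"(?:^|,\s*)" + re.escape(role) + r"\s*:"
        match = re.search(pattern, entry, flags=re.IGNORECASE)
        if match is not None:
            found.append((match.start(), role))

    # Sort by position so the result follows the order in the response
    return [role for _, role in sorted(found)]
```

Because matching is tied to a fixed list, an unexpected label in the sheet is skipped rather than turned into a bogus role, which may or may not be what you want for a customized form.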


In [71]:
# Create the final authorship statement 
authorship_statement = ''
for i, row in df.iterrows():
    authorship_statement += row['name'] + ': ' + ', '.join(row['categories']) + '. '

# Print the final authorship statement
print(authorship_statement)

Sterling G. Baird: Supervision, Project administration, Funding acquisition, Conceptualization, Methodology, Software, Validation, Formal analysis, Writing - Review & Editing. 
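
To try the statement-building step without a live Google Sheet, here is a minimal sketch that uses plain dictionaries in place of DataFrame rows. The function `build_statement` and the author names are hypothetical; the logic mirrors the cells above (drop authors without roles, sort by last name, join with periods), except that it uses `str.join` so no trailing space is left at the end.

```python
def build_statement(rows):
    """Build a plain-text CRediT statement from dicts with 'name' and 'categories'."""
    # Skip authors who selected no roles, as the notebook does
    rows = [r for r in rows if r["categories"]]

    # Sort by last name (the last whitespace-separated token), then by full name
    rows = sorted(rows, key=lambda r: (r["name"].split()[-1], r["name"]))

    parts = [f"{r['name']}: {', '.join(r['categories'])}." for r in rows]
    return " ".join(parts)


# Hypothetical example data
example_rows = [
    {"name": "Jane A. Roe", "categories": ["Supervision"]},
    {"name": "John C. Doe", "categories": ["Conceptualization", "Software"]},
    {"name": "No Roles", "categories": []},
]
print(build_statement(example_rows))
```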


In [69]:
# Create the TeX-friendly authorship statement.
# LaTeX has no `section` environment, so an unnumbered \section* heading is
# used instead; the heading text is a placeholder to adapt to your journal.
authorship_statement = '\\section*{CRediT authorship contribution statement}\n'
for i, row in df.iterrows():
    # Escape "&" (as in "Writing - Review & Editing"), which is special in TeX
    roles = ', '.join(row['categories']).replace('&', '\\&')
    authorship_statement += '\\textbf{' + row['name'] + '}: ' + roles + '. '
authorship_statement = authorship_statement[:-1] + '\n'

# Print the final authorship statement
print(authorship_statement)

# write to file
with open('credit-statement.tex', 'w', encoding='utf-8') as f:
    f.write(authorship_statement)

\section*{CRediT authorship contribution statement}
\textbf{Sterling G. Baird}: Supervision, Project administration, Funding acquisition, Conceptualization, Methodology, Software, Validation, Formal analysis, Writing - Review \& Editing.
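
Role names such as `Writing - Review & Editing` contain `&`, and author names or comments may contain other characters that TeX treats specially. As a minimal sketch, the hypothetical helper `escape_tex` below maps each special character to its escaped form in a single pass, which avoids escaping a backslash that an earlier replacement introduced. It is not part of the notebook, and it does not handle non-ASCII characters or accents.

```python
# Characters with special meaning in TeX and their escaped forms
TEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_tex(text):
    """Escape TeX special characters one character at a time (illustrative sketch)."""
    return "".join(TEX_SPECIALS.get(ch, ch) for ch in text)


print(escape_tex("Writing - Review & Editing"))
```

Applying `escape_tex` to each name and role before wrapping them in `\textbf{...}` keeps the generated file compilable even when a respondent types something like `R&D` into the name field.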
