# Notebook to parse faculty bios

Requirements:
    
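    !pip install pandas
    !pip install openai
    !pip install wordcloud
    !pip install matplotlib
    !pip install tqdm
    !pip install langchain-core
    !pip install langchain-openai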

Also requires an [OpenAI account](https://openai.com/) and an [OpenAI API key](https://platform.openai.com/account/api-keys). The key must be placed in a text file whose path is set in the setup cell below, which loads it into the `OPENAI_API_KEY` environment variable. Alternatively, set `OPENAI_API_KEY` in your environment directly and skip the file-reading step in the setup cell.

Run scrape_bios.ipynb first.

License: [CC BY-NC 3.0](https://creativecommons.org/licenses/by-nc/3.0/)


### Setup

In [1]:
import pandas as pd
import time
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import ast

from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain_openai import ChatOpenAI

# set paths
bio_inpath = 'data/all_bios_raw.json'
json_outpath = 'data/all_bios_parsed.json'

# NOTE: langchain reads the OpenAI API key from the OPENAI_API_KEY environment variable

# read the API key from a text file (adjust the path for your machine)
with open('d:/AI-Projects/secrets/SECRETS-OPENAI.txt', 'r') as file:
    api_key = file.read().strip()  # strip the trailing newline so the key is sent cleanly
# set api key
import os
os.environ['OPENAI_API_KEY'] = api_key


### Load faculty bios

To update scraped data, run scrape_bios.ipynb first.

In [2]:
# Load data
df = pd.read_json(bio_inpath)
df

Unnamed: 0,faculty,name,title,email,href,bio,listed_research_areas
0,osgoode,Rabiat Akande,Assistant Professor,rakande@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Rabiat Akande works in the fields of...,legal history; law and religion; constitutiona...
1,osgoode,Saptarishi Bandopadhyay,Associate Professor,sbandopadhyay@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,I am an Associate Professor at Osgoode Hall La...,Law; history; and politics of Disasters; Inter...
2,osgoode,Stephanie Ben-Ishai,Professor and York University Distinguished Re...,sbenishai@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Stephanie Ben-Ishai is a Distinguish...,Corporate/Commercial Law
3,osgoode,Benjamin L. Berger,Professor & York Research Chair in Pluralism a...,bberger@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Benjamin L. Berger is Professor and ...,Law and Religion; Criminal and Constitutional ...
4,osgoode,Kate Glover Berger,Associate Professor,kgberger@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Kate Glover Berger joined the facult...,
...,...,...,...,...,...,...,...
370,uottawa-civil,David Robitaille,Vice-doyen aux études et professeur titulaire,david.robitaille@uottawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,David Robitaille est professeur titulaire à la...,
371,uottawa-civil,Terry Skolnik,Professeur agrégé,tskolnik@uottawa.ca,https://uniweb.uottawa.ca/members/4305/profile,Terry Skolnik is an associate professor (tenur...,
372,uottawa-civil,Marie-Eve Sylvestre,"Doyenne, professeure titulaire",Marie-Eve.Sylvestre@uottawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,Marie-Eve Sylvestre est doyenne et professeure...,
373,uottawa-civil,Sophie Thériault,Professeure titulaire,Sophie.Theriault@uOttawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,Sophie Thériault est professeure titulaire et ...,


In [3]:
# Standardize listed research areas
def standardize_research_areas(research_areas):

    if not research_areas:
        return None

    # convert commas to semicolons
    research_areas = research_areas.replace(',', ';')
    
    # split research areas on ; and remove trailing spaces
    research_areas = [x.strip() for x in research_areas.split(';')]

    # capitalize research_areas using titles
    research_areas = [x.title() for x in research_areas]
    
    # convert / to and
    research_areas = [x.replace('/', ' and ') for x in research_areas]
    
    # convert & to and
    research_areas = [x.replace('&', ' and ') for x in research_areas]

    # join into string
    research_areas = '; '.join(research_areas)

    # fix capitalization
    research_areas = research_areas.replace('And', 'and')
    research_areas = research_areas.replace('Of', 'of')
    research_areas = research_areas.replace('In', 'in')
    research_areas = research_areas.replace('For', 'for')
    research_areas = research_areas.replace(' The ', ' the ')
    research_areas = research_areas.replace(' To ', ' to ')

    # remove multiple spaces
    research_areas = ' '.join(research_areas.split())    

    return research_areas

df['listed_research_areas'] = df['listed_research_areas'].apply(standardize_research_areas)

# Convert listed research areas to a list, stripping the space left after each semicolon
df['listed_research_areas'] = df['listed_research_areas'].apply(lambda x: [a.strip() for a in x.split(';')] if x else None)

df


Unnamed: 0,faculty,name,title,email,href,bio,listed_research_areas
0,osgoode,Rabiat Akande,Assistant Professor,rakande@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Rabiat Akande works in the fields of...,"[Legal History, Law and Religion, Constituti..."
1,osgoode,Saptarishi Bandopadhyay,Associate Professor,sbandopadhyay@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,I am an Associate Professor at Osgoode Hall La...,"[Law, History, and Politics of Disasters, i..."
2,osgoode,Stephanie Ben-Ishai,Professor and York University Distinguished Re...,sbenishai@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Stephanie Ben-Ishai is a Distinguish...,[Corporate and Commercial Law]
3,osgoode,Benjamin L. Berger,Professor & York Research Chair in Pluralism a...,bberger@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Benjamin L. Berger is Professor and ...,"[Law and Religion, Criminal and Constitutiona..."
4,osgoode,Kate Glover Berger,Associate Professor,kgberger@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Kate Glover Berger joined the facult...,
...,...,...,...,...,...,...,...
370,uottawa-civil,David Robitaille,Vice-doyen aux études et professeur titulaire,david.robitaille@uottawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,David Robitaille est professeur titulaire à la...,
371,uottawa-civil,Terry Skolnik,Professeur agrégé,tskolnik@uottawa.ca,https://uniweb.uottawa.ca/members/4305/profile,Terry Skolnik is an associate professor (tenur...,
372,uottawa-civil,Marie-Eve Sylvestre,"Doyenne, professeure titulaire",Marie-Eve.Sylvestre@uottawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,Marie-Eve Sylvestre est doyenne et professeure...,
373,uottawa-civil,Sophie Thériault,Professeure titulaire,Sophie.Theriault@uOttawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,Sophie Thériault est professeure titulaire et ...,
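
The capitalization fix in `standardize_research_areas` replaces substrings, so `'In'` also matches the start of words such as "International" (visible as the lowercase `i...` in row 1 above). A minimal sketch of a whole-word alternative, assuming the same list of small words; `lowercase_small_words` and `SMALL_WORDS` are hypothetical names, not part of the notebook:

```python
import re

# Small words to keep lowercase inside a title-cased research area.
# (Assumed list, mirroring the replacements in standardize_research_areas.)
SMALL_WORDS = ("And", "Of", "In", "For", "The", "To")

def lowercase_small_words(area):
    """Lowercase whole-word matches only, and never the first word of the area."""
    # \b ... \b restricts the match to whole words; (?<!^) skips a match at position 0.
    pattern = r"(?<!^)\b(" + "|".join(SMALL_WORDS) + r")\b"
    return re.sub(pattern, lambda m: m.group(1).lower(), area)

print(lowercase_small_words("International Law And Politics Of Disasters"))
# International Law and Politics of Disasters
```

This would be applied to each area separately, before the areas are joined with `'; '`, so that the first word of every area keeps its capital.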


### Use ChatGPT to get consolidated research areas from bios

In [4]:
list_manual_research_areas = [
    "Aboriginal Law",
    "Academic Freedom",
    "Access to Justice",
    "Administrative Law",
    "Animal Law",
    "Anti-Discrimination",
    "Decolonization",
    "Antitrust Law",
    "Anti-Terrorism",
    "Artificial Intelligence",
    "Bail",
    "Banking Law",
    "Bankruptcy and Insolvency",
    "Bioethics",
    "Border Control",
    "Business Associations",
    "Charter of Rights and Freedoms",
    "Children's Rights",
    "Civil Litigation",
    "Civil Procedure",
    "Class Actions",
    "Climate Change",
    "Clinical Legal Education",
    "Commercial Law",
    "Communications Law",
    "Comparative Law",
    "Competition Law",
    "Computational Law",
    "Constitutional Law",
    "Construction Law",
    "Consumer Protection",
    "Contracts",
    "Corporate Governance",
    "Corporate Law",
    "Corporate Social Responsibility",
    "Criminal Law",
    "Criminal Procedure",
    "Critical Race Theory",
    "Disability Law",
    "Disinformation",
    "Dispute Resolution",
    "E-commerce Law",
    "Elder Law",
    "Election Law",
    "Employment Law",
    "Empirical Legal Studies",
    "Energy Law",
    "Entertainment Law",
    "Environmental Law",
    "Estates and Trusts",
    "Equality",
    "Evidence",
    "Family Law",
    "Federalism",
    "Feminist Legal Theory",
    "Financial Regulation",
    "Food Law",
    "Freedom of Expression",
    "Freedom of Information",
    "Gender and the Law",
    "Health Law",
    "Housing",
    "Human Rights Law",
    "Immigration and Refugee Law",
    "Indigenous Law",
    "Insurance Law",
    "Intellectual Property",
    "International Arbitration",
    "International Business Law",
    "International Criminal Law",
    "International Environmental Law",
    "International Human Rights Law",
    "International Law",
    "International Organizations",
    "International Trade Law",
    "Islamic Law",
    "Judicial Decision Making",
    "Judicial Review",
    "Jurisprudence",
    "Labour Law",
    "Land Use and Zoning Law",
    "Language Rights",
    "Law and Art",
    "Law and Development",
    "Law and Film",
    "Law and Globalization",
    "Law and Economics",
    "Law and Literature",
    "Law and Religion",
    "Law and Society",
    "Law and Technology",
    "Legal Anthropology",
    "Legal Education",
    "Legal Ethics",
    "Legal History",
    "Legal Information Technology",
    "Legal Philosophy",
    "Legal Pluralism",
    "Legal Reasoning",
    "Legal Theory",
    "Legal Writing",
    "LGBTQ+ Rights",
    "Maritime Law",
    "Media Law",
    "Mergers and Acquisitions",
    "Migrant Workers",
    "Migration",
    "Military Law",
    "Movement Lawyering",
    "Municipal Law",
    "Natural Resources Law",
    "National Security Law",
    "Negotiation",
    "Pensions",
    "Policing",
    "Political Philosophy",
    "Poverty Law",
    "Privacy Law",
    "Private International Law",
    "Professional Responsibility",
    "Property",
    "Public International Law",
    "Public Law",
    "Race and the Law",
    "Real Estate",
    "Regulatory Law",
    "Reproductive Rights",
    "Rule of Law",
    "Securities",
    "Sentencing",
    "Sexuality and the Law",
    "Social Justice",
    "Social Movements",
    "Space Law",
    "Sports Law",
    "Statutory Interpretation",
    "Tax Law",
    "Talmudic Law",
    "Third World Approaches to International Law",
    "Torts",
    "Transnational Law",
    "Trial Advocacy"
]

# create a new dictionary with the research areas as a key and the list index as the value
research_areas_dict = {research_area: i for i, research_area in enumerate(list_manual_research_areas)}


In [5]:

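def get_extracted_research_areas(row):
    if not row['bio']:
        return None
    #print ('Getting :', row['name'])  # Uncomment to see progress
    # Send at most the first 10,000 characters of the bio to the model
    messages = chat_prompt.format_prompt(bio=row['bio'][:10000], research_areas_dict=research_areas_dict).to_messages()
    response = chat.invoke(messages).content  # .invoke replaces the deprecated chat(...) call
    try:
        # The model should reply with a Python list of integers, e.g. "[3, 27]"
        response = ast.literal_eval(response)
        mapped_response = [inverted_research_areas_dict[i] for i in response]
        time.sleep(0.25)  # brief pause between API calls
        return mapped_response
    except (SyntaxError, ValueError, TypeError, KeyError):
        # The reply was not a parseable list of known indices
        return None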

# invert key value pairs
inverted_research_areas_dict = {v: k for k, v in research_areas_dict.items()}

# Set up the chat prompt using langchain and openai
chat = ChatOpenAI(temperature=0, model = "gpt-4-0125-preview", max_tokens = 200, request_timeout = 30, max_retries=10)

system_message = "You are an automated list consolidation system. "\
    "You follow instructions precisely and literally, step by step. You only return a list of integers." 
system_message_prompt = SystemMessagePromptTemplate.from_template(system_message)

human_message = "*Faculty bio*: {bio} \n"\
        "*Consolidated dictionary*: {research_areas_dict} \n"\
        "*Task*: Return a valid python list with the value associated with each key that is implicitly or explicitly included "\
        "as an area of research expertise in the faculty bio. \n"\
        "*Return list*: "
human_message_prompt = HumanMessagePromptTemplate.from_template(human_message)

chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])
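
The model is asked to return a Python list of integers, which is then mapped back to area names. A minimal sketch of a stricter parsing step, runnable without any API call; `parse_index_list` and the toy mapping are hypothetical and only stand in for `inverted_research_areas_dict`:

```python
import ast

def parse_index_list(response_text, index_to_area):
    """Turn a model reply such as '[0, 2]' into area names, or None if it is not a list."""
    try:
        parsed = ast.literal_eval(response_text.strip())
    except (SyntaxError, ValueError):
        return None  # the reply was not a Python literal
    if not isinstance(parsed, (list, tuple)):
        return None  # e.g. a bare integer or a dict
    # Keep only integers that are real keys; silently drop anything else.
    return [index_to_area[i] for i in parsed if isinstance(i, int) and i in index_to_area]

# Toy mapping standing in for inverted_research_areas_dict
toy_index_to_area = {0: "Aboriginal Law", 1: "Academic Freedom", 2: "Access to Justice"}
print(parse_index_list("[0, 2]", toy_index_to_area))
print(parse_index_list("Sorry, I cannot do that.", toy_index_to_area))
```

Dropping unknown indices instead of discarding the whole reply is a design choice; returning `None` for any unknown index would be the more conservative option.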


In [6]:
##############################################################################
########### CAREFUL: INCURS COSTS TO OPENAI API #################################
##############################################################################

# Apply the extraction to every row, showing a progress bar
from tqdm import tqdm
tqdm.pandas()
df['extracted_research_areas'] = df.progress_apply(get_extracted_research_areas, axis=1)
df

  warn_deprecated(
100%|██████████| 375/375 [10:29<00:00,  1.68s/it]


Unnamed: 0,faculty,name,title,email,href,bio,listed_research_areas,extracted_research_areas
0,osgoode,Rabiat Akande,Assistant Professor,rakande@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Rabiat Akande works in the fields of...,"[Legal History, Law and Religion, Constituti...","[Legal History, Law and Religion, Constitution..."
1,osgoode,Saptarishi Bandopadhyay,Associate Professor,sbandopadhyay@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,I am an Associate Professor at Osgoode Hall La...,"[Law, History, and Politics of Disasters, i...","[Climate Change, Environmental Law, Food Law, ..."
2,osgoode,Stephanie Ben-Ishai,Professor and York University Distinguished Re...,sbenishai@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Stephanie Ben-Ishai is a Distinguish...,[Corporate and Commercial Law],"[Banking Law, Bankruptcy and Insolvency, Comme..."
3,osgoode,Benjamin L. Berger,Professor & York Research Chair in Pluralism a...,bberger@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Benjamin L. Berger is Professor and ...,"[Law and Religion, Criminal and Constitutiona...","[Constitutional Law, Criminal Law, Evidence, L..."
4,osgoode,Kate Glover Berger,Associate Professor,kgberger@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Kate Glover Berger joined the facult...,,"[Administrative Law, Constitutional Law, Publi..."
...,...,...,...,...,...,...,...,...
370,uottawa-civil,David Robitaille,Vice-doyen aux études et professeur titulaire,david.robitaille@uottawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,David Robitaille est professeur titulaire à la...,,"[Administrative Law, Constitutional Law, Envir..."
371,uottawa-civil,Terry Skolnik,Professeur agrégé,tskolnik@uottawa.ca,https://uniweb.uottawa.ca/members/4305/profile,Terry Skolnik is an associate professor (tenur...,,"[Criminal Law, Criminal Procedure, Property, L..."
372,uottawa-civil,Marie-Eve Sylvestre,"Doyenne, professeure titulaire",Marie-Eve.Sylvestre@uottawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,Marie-Eve Sylvestre est doyenne et professeure...,,"[Criminal Law, Anti-Discrimination, Poverty La..."
373,uottawa-civil,Sophie Thériault,Professeure titulaire,Sophie.Theriault@uOttawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,Sophie Thériault est professeure titulaire et ...,,"[Aboriginal Law, Environmental Law, Food Law, ..."
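
Because every call in this cell is billed, it may be worth reusing results when a bio has not changed between runs. A minimal sketch, assuming an in-memory dictionary keyed by a hash of the bio text; `cached_extract` and `fake_extract` are hypothetical, and `fake_extract` merely stands in for the real API call:

```python
import hashlib

def cached_extract(bio, extract_fn, cache):
    """Return extract_fn(bio), reusing a stored result when the same bio was seen before."""
    if not bio:
        return None
    key = hashlib.sha256(bio.encode("utf-8")).hexdigest()
    if key not in cache:
        cache[key] = extract_fn(bio)  # the only place a paid call would happen
    return cache[key]

# Stand-in for the real API call, so the sketch runs without incurring costs
calls = []
def fake_extract(bio):
    calls.append(bio)
    return ["Legal History"]

cache = {}
cached_extract("Professor X works on legal history.", fake_extract, cache)
cached_extract("Professor X works on legal history.", fake_extract, cache)
print(len(calls))  # 1: the second lookup was served from the cache
```

Since the keys are strings and the values are lists, the cache could be written to disk with `json.dump` and reloaded with `json.load` at the start of a later session.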


### Try breaking out by gender (pronouns)

In [21]:
# function to extract pronouns from text

def get_pronouns(text):

    if not text:
        return "other"

    # Create a dictionary to keep track of the count of each pronoun
    pronoun_count = {"he": 0, "him": 0, "his": 0, "she": 0, "her": 0, "hers": 0, "they": 0, "them": 0, "their": 0, "theirs": 0}

    # prepare the text for analysis
    text = text.lower().split()

    # count the pronouns
    for word in text:
        if word in pronoun_count:
            pronoun_count[word] += 1

    # if no pronouns, return "other"
    if sum(pronoun_count.values()) == 0:
        return "other"

    # Get the most common pronouns
    most_common_pronoun = max(pronoun_count, key=pronoun_count.get)
    if most_common_pronoun in ["he", "him", "his"]:
        return "he/him"
    elif most_common_pronoun in ["she", "her", "hers"]:
        return "she/her"
    else:
        return "they/them"

# get pronouns for each bio using apply
df['pronouns'] = df['bio'].apply(get_pronouns)

# print counts of pronouns
print(df['pronouns'].value_counts())
print()
print()

# print ('------------------------')

# # Check they/them pronouns
# for index, row in df.iterrows():
#     if row['pronouns'] == 'they/them':
#         print(row['name'])
#         print(row['faculty'])
#         print(row['bio'])
#         print('')

# NOTE: As of March 2024, no bio in the data actually uses they/them pronouns for its subject.
# The they/them matches come from first-person or French bios, so relabel them as 'other'.

df['pronouns'] = df['pronouns'].replace('they/them', 'other')

# print ('------------------------')

# # Check other pronouns
# for index, row in df.iterrows():
#     if row['pronouns'] == 'other':
#         print(row['name'])
#         print(row['faculty'])
#         print(row['bio'])
#         print('')

# NOTE: As of March 2024, 'other' mostly identifies bios without any pronouns (no content, or written in the first person)

# fix pronouns manually
df.loc[df['name'] == 'Ravi Malhotra', 'pronouns'] = 'he/him'  # error caused by French text; the English bio uses he/him
df.loc[df['name'] == 'Michael Geist', 'pronouns'] = 'he/him'  # error caused by French text; the English bio uses he/him
df.loc[df['name'] == 'Ellen Zweibel', 'pronouns'] = 'other' # first person

# NOT FIXED BECAUSE NO PRONOUNS AVAILABLE
# Saptarishi Bandopadhyay No pronouns used in bio
# Peter Cziraki No bio
# Patricia Peppin First person
# Ruth Kuras No bio
# Margaret Liddle No bio
# Lisa Trabucco No bio
# Jula Hughes No pronouns used in bio
# Amir Attaran No pronouns used in bio
# Jeremy De Beer first person
# Sylvia Rich first person
# Penelope Simons first person
# LIST DOES NOT INCLUDE OTTAWA CIVIL LAW
# HAVE ONLY FIXED PRONOUNS WHERE ALSO SALARY AVAILABLE (see salaries.ipynb)

pronouns
he/him       171
she/her      154
other         46
they/them      2
Name: count, dtype: int64


------------------------
Saptarishi Bandopadhyay
osgoode
I am an Associate Professor at Osgoode Hall Law School. I am also a research Fellow at the Dahdaleh Institute for Global Health Research at York University, and a Senior Fellow at Melbourne University Law School. My first book, All Is Well: Catastrophe and the Making of the Normal State was published by Oxford University Press in 2022. In All Is Well, I offer a history of the mutually constitutive relationship between disasters and states during the eighteenth-century and show the enduring influence of the underlying narratives, instincts, techniques, and practices on global disaster management today. I am currently working on two book projects. The first examines the history of war, environmental degradation/disasters, and human displacement from 1860 to the present. This research is funded by a Social Sciences and Humanities
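
`get_pronouns` splits on whitespace, so a pronoun followed by punctuation (for example "her,") is not counted, and it picks the single most frequent word instead of the most frequent group. A minimal sketch of a variant that tokenizes with a regular expression and sums counts per group; `get_pronoun_group` and `PRONOUN_GROUPS` are hypothetical names, and the results it gives may differ from the counts shown above:

```python
import re

PRONOUN_GROUPS = {
    "he/him": {"he", "him", "his"},
    "she/her": {"she", "her", "hers"},
    "they/them": {"they", "them", "their", "theirs"},
}

def get_pronoun_group(text):
    """Return the pronoun group with the highest total count, or 'other' if none appear."""
    if not text:
        return "other"
    # Runs of letters only, so "her," and "(he)" are counted as "her" and "he"
    words = re.findall(r"[a-z]+", text.lower())
    totals = {
        group: sum(word in members for word in words)
        for group, members in PRONOUN_GROUPS.items()
    }
    if sum(totals.values()) == 0:
        return "other"
    # On a tie, max() returns the first group in dictionary order
    return max(totals, key=totals.get)

print(get_pronoun_group("She joined the faculty in 2010; the award was hers."))
```

First-person and French bios would still need the manual checks described in this section.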

In [17]:
# drop rows with duplicate names
df = df.drop_duplicates(subset=['name'], keep='first')
df

Unnamed: 0,faculty,name,title,email,href,bio,listed_research_areas,extracted_research_areas,pronouns
0,osgoode,Rabiat Akande,Assistant Professor,rakande@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Rabiat Akande works in the fields of...,"[Legal History, Law and Religion, Constituti...","[Legal History, Law and Religion, Constitution...",she/her
1,osgoode,Saptarishi Bandopadhyay,Associate Professor,sbandopadhyay@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,I am an Associate Professor at Osgoode Hall La...,"[Law, History, and Politics of Disasters, i...","[Climate Change, Environmental Law, Food Law, ...",other
2,osgoode,Stephanie Ben-Ishai,Professor and York University Distinguished Re...,sbenishai@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Stephanie Ben-Ishai is a Distinguish...,[Corporate and Commercial Law],"[Banking Law, Bankruptcy and Insolvency, Comme...",she/her
3,osgoode,Benjamin L. Berger,Professor & York Research Chair in Pluralism a...,bberger@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Benjamin L. Berger is Professor and ...,"[Law and Religion, Criminal and Constitutiona...","[Constitutional Law, Criminal Law, Evidence, L...",he/him
4,osgoode,Kate Glover Berger,Associate Professor,kgberger@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Kate Glover Berger joined the facult...,,"[Administrative Law, Constitutional Law, Publi...",she/her
...,...,...,...,...,...,...,...,...,...
369,uottawa-civil,Jennifer Quaid,Professeure agrégée et vice-doyenne à la reche...,Jennifer.Quaid@uottawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,Jennifer Quaid est professeure agrégée et vice...,,"[Antitrust Law, Competition Law, Corporate Gov...",other
370,uottawa-civil,David Robitaille,Vice-doyen aux études et professeur titulaire,david.robitaille@uottawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,David Robitaille est professeur titulaire à la...,,"[Administrative Law, Constitutional Law, Envir...",other
371,uottawa-civil,Terry Skolnik,Professeur agrégé,tskolnik@uottawa.ca,https://uniweb.uottawa.ca/members/4305/profile,Terry Skolnik is an associate professor (tenur...,,"[Criminal Law, Criminal Procedure, Property, L...",he/him
372,uottawa-civil,Marie-Eve Sylvestre,"Doyenne, professeure titulaire",Marie-Eve.Sylvestre@uottawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,Marie-Eve Sylvestre est doyenne et professeure...,,"[Criminal Law, Anti-Discrimination, Poverty La...",other


### Save for future use

In [18]:
# Save to json for future use
df.to_json(json_outpath, orient='records', indent = 4)

### Data verification

In [19]:
# load parsed bios
df = pd.read_json('data/all_bios_parsed.json')
df

Unnamed: 0,faculty,name,title,email,href,bio,listed_research_areas,extracted_research_areas,pronouns
0,osgoode,Rabiat Akande,Assistant Professor,rakande@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Rabiat Akande works in the fields of...,"[Legal History, Law and Religion, Constituti...","[Legal History, Law and Religion, Constitution...",she/her
1,osgoode,Saptarishi Bandopadhyay,Associate Professor,sbandopadhyay@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,I am an Associate Professor at Osgoode Hall La...,"[Law, History, and Politics of Disasters, i...","[Climate Change, Environmental Law, Food Law, ...",other
2,osgoode,Stephanie Ben-Ishai,Professor and York University Distinguished Re...,sbenishai@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Stephanie Ben-Ishai is a Distinguish...,[Corporate and Commercial Law],"[Banking Law, Bankruptcy and Insolvency, Comme...",she/her
3,osgoode,Benjamin L. Berger,Professor & York Research Chair in Pluralism a...,bberger@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Benjamin L. Berger is Professor and ...,"[Law and Religion, Criminal and Constitutiona...","[Constitutional Law, Criminal Law, Evidence, L...",he/him
4,osgoode,Kate Glover Berger,Associate Professor,kgberger@osgoode.yorku.ca,https://www.osgoode.yorku.ca/faculty-and-staff...,Professor Kate Glover Berger joined the facult...,,"[Administrative Law, Constitutional Law, Publi...",she/her
...,...,...,...,...,...,...,...,...,...
368,uottawa-civil,Jennifer Quaid,Professeure agrégée et vice-doyenne à la reche...,Jennifer.Quaid@uottawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,Jennifer Quaid est professeure agrégée et vice...,,"[Antitrust Law, Competition Law, Corporate Gov...",other
369,uottawa-civil,David Robitaille,Vice-doyen aux études et professeur titulaire,david.robitaille@uottawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,David Robitaille est professeur titulaire à la...,,"[Administrative Law, Constitutional Law, Envir...",other
370,uottawa-civil,Terry Skolnik,Professeur agrégé,tskolnik@uottawa.ca,https://uniweb.uottawa.ca/members/4305/profile,Terry Skolnik is an associate professor (tenur...,,"[Criminal Law, Criminal Procedure, Property, L...",he/him
371,uottawa-civil,Marie-Eve Sylvestre,"Doyenne, professeure titulaire",Marie-Eve.Sylvestre@uottawa.ca,https://www.uottawa.ca/faculte-droit/droit-civ...,Marie-Eve Sylvestre est doyenne et professeure...,,"[Criminal Law, Anti-Discrimination, Poverty La...",other


In [12]:
# View most common research areas

results = []
for index, row in df.iterrows():
    extracted_research_areas = row['extracted_research_areas']
    if extracted_research_areas is None:
        continue
    for item in extracted_research_areas:
        results.append(item)
results = sorted(set(results))

results_dict = {}
for result in results:
    results_dict[result] = df['extracted_research_areas'].apply(lambda x: result in x if x else False).sum()

results_df = pd.DataFrame.from_dict(results_dict, orient='index', columns=['count'])
results_df = results_df.rename_axis('research_area').reset_index()
results_df = results_df.sort_values(by=['count'], ascending=False)
results_df.head(40)

Unnamed: 0,research_area,count
61,Human Rights Law,76
27,Constitutional Law,58
33,Criminal Law,58
99,Legal Theory,53
24,Comparative Law,51
29,Contracts,44
71,International Law,43
63,Indigenous Law,40
134,Torts,38
53,Feminist Legal Theory,37
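
The same per-area counts can be sketched with `collections.Counter`, which avoids rescanning the column once per area. This assumes missing values are `None`, as in the loop above; `count_areas` and the toy rows are hypothetical:

```python
from collections import Counter

def count_areas(rows):
    """Count how many rows list each area; rows may contain None."""
    counts = Counter()
    for areas in rows:
        if areas is None:
            continue
        counts.update(set(areas))  # set() so an area counts once per person
    return counts

# Toy data standing in for df['extracted_research_areas']
toy_rows = [["Criminal Law", "Evidence"], None, ["Criminal Law"], ["Evidence", "Evidence"]]
area_counts = count_areas(toy_rows)
print(area_counts["Criminal Law"], area_counts["Evidence"])
```

In the notebook this could be called as `count_areas(df['extracted_research_areas'])`, with `.most_common(40)` on the result to list the top areas.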


In [13]:
# print a list of names from df where area is in the list of extracted_research_areas

area = "Immigration and Refugee Law"
for index, row in df.iterrows():
    if row['extracted_research_areas'] is None:
        continue
    if area in row['extracted_research_areas']:
        print(row['name'])

Saptarishi Bandopadhyay
Amar Bhatia
Janet Mosher
Obiora Chinedu Okafor
Sean Rehaag
Audrey Macklin
Ayelet Shachar
Idil Atak
Hilary Evans Cameron
Sharry Aiken
Colin Grey
Ardi Imseis
Anneke Smit
Vasanthi Venkatesh
Jennifer Bond
Yin-Yuan Chen
Jamie Chai Yun Liew
Yves Le Bouthillier
Delphine Nakache
Joao Velloso
Joao Velloso


In [14]:
# print duplicate names
df[df.duplicated(subset=['name'], keep=False)]

Unnamed: 0,faculty,name,title,email,href,bio,listed_research_areas,extracted_research_areas,pronouns
270,uottawa-common,Kristen Boon,Susan & Perry Dellelce Dean,Decanat.CML.Dean@uottawa.ca,https://www.uottawa.ca/faculty-law/common-law/...,Kristen Boon is the inaugural Susan & Perry De...,,"[Access to Justice, Contracts, Indigenous Law,...",she/her
318,uottawa-common,Kristen Boon,Susan & Perry Dellelce Dean,Decanat.CML.Dean@uottawa.ca,https://www.uottawa.ca/faculty-law/common-law/...,Kristen Boon is the inaugural Susan & Perry De...,,"[Access to Justice, Contracts, Indigenous Law,...",she/her
335,uottawa-common,Joao Velloso,Associate Professor,joao.velloso@uOttawa.ca,https://www.uottawa.ca/faculty-law/common-law/...,João Velloso teaches sentencing and “sanctioni...,,"[Access to Justice, Administrative Law, Crimin...",he/him
374,uottawa-civil,Joao Velloso,Professeur agrégé,joao.velloso@uottawa.ca,https://www.uottawa.ca/faculte-droit/common-la...,Joao Velloso enseigne le droit des peines et d...,,"[Criminal Law, Empirical Legal Studies, Immigr...",other


In [15]:
# print extracted_research_areas where name = XXX

name = 'Sean Rehaag'
for index, row in df.iterrows():
    if row['name'] == name:
        print(row['extracted_research_areas'])
        print(row['bio'])



['Access to Justice', 'Administrative Law', 'Artificial Intelligence', 'Clinical Legal Education', 'Empirical Legal Studies', 'Immigration and Refugee Law', 'Judicial Decision Making']
Professor Sean Rehaag is the Director of the Centre for Refugee Studies and the Director of the Refugee Law Laboratory. He specializes in immigration and refugee law, administrative law, legal process, access to justice, and new legal technologies. He frequently contributes to public debates about immigration and refugee law, and he engages in law reform efforts in these areas. He is also committed to exploring innovative teaching methodologies, with a particular interest in clinical and experiential education. From 2015 to 2018, he served as the Academic Director at Parkdale Community Legal Services. Professor Rehaag’s interdisciplinary academic research focuses on empirical studies of immigration and refugee law decision-making processes. He currently holds an SSHRC grant involving new legal technologi