# Can GPT-3.5 / GPT-4 Pass the Canadian Citizenship Test?

28 February 2024

**Sean Rehaag**\
Director, Centre for Refugee Studies\
Director, Refugee Law Lab\
Associate Professor, Osgoode Hall Law School\
York University

NOTE: This is an update of a prior version of this notebook that used GPT-3.

One of the requirements to obtain Canadian citizenship is to pass a multiple-choice test that involves questions about Canadian history, geography, economy, government, laws, and important symbols.

Details about the test, along with a study guide, are available [here](https://www.canada.ca/en/immigration-refugees-citizenship/services/canadian-citizenship/become-canadian-citizen/citizenship-test.html).

The original notebook, *Can_gpt_pass_citizenship_test.ipynb*, examines whether, as of January 2023, OpenAI's GPT-3 could pass the citizenship test without any fine-tuning, using 60 practice questions made available by the [Toronto Public Library](https://www.torontopubliclibrary.ca/new-to-canada/citizenship.jsp). This notebook does the same using GPT-3.5 and GPT-4 as of February 28, 2024.

You can scrape practice questions and answers from the Toronto Public Library's website by running the *ScrapeCitizenshipTest.py* file from a terminal (scraping uses playwright, which does not work well in Jupyter Notebook):

>pip install pandas
>
>pip install playwright
>
>playwright install
>
>python -m ScrapeCitizenshipTest

Alternatively, you can run the notebook using the Excel file with the scraped data in this repo.

Requirements for the notebook:

>pip install pandas
> 
>pip install openai
>
>pip install tqdm

To run the notebook, you will also need an API key from OpenAI. Details about obtaining an account are [here](https://beta.openai.com/signup), and details about getting an API key are [here](https://help.openai.com/en/articles/4936850-where-do-i-find-my-secret-api-key). Once you have an API key, you can load it locally with python-dotenv or, if you are using Google Colab, through Colab secrets.

To be cited as: Sean Rehaag, "Can GPT-3 Pass the Canadian Citizenship Test?" (9 January 2023), online: \<https://github.com/Refugee-Law-Lab/gpt3-canadian-citizenship-test\>.

License: [CC BY-NC-SA/4.0](https://creativecommons.org/licenses/by-nc-sa/4.0)

### Setup

In [1]:
#! pip install pandas
import pandas as pd

# for progress bar
#!pip install tqdm
from tqdm import tqdm
tqdm.pandas()

import time

# # SET API KEY FOR GOOGLE COLAB
# # first set the secret in colab (call it OPENAI_API_KEY)
# from google.colab import userdata
# import os
# os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')

# SET API KEY FOR LOCAL INSTALL
# first set the secret in .env file (OPENAI_API_KEY = '')
# !pip install python-dotenv
from dotenv import load_dotenv
load_dotenv()
# don't forget to include .env in .gitignore if you are pushing the folder to github

# !pip install openai
from openai import OpenAI
client = OpenAI()

In [2]:
# load practice questions
# source: https://www.torontopubliclibrary.ca/new-to-canada/citizenship.jsp
# Scraped using: ScrapeCitizenshipTest.py

df = pd.read_excel('questions.xlsx')

# get letter for correct answers
def get_correct_letter(x):
    if x['correct_answer'] in x['answerA']:
        return 'a'
    elif x['correct_answer'] in x['answerB']:
        return 'b'
    elif x['correct_answer'] in x['answerC']:
        return 'c'
    elif x['correct_answer'] in x['answerD']:
        return 'd'
    else:
        return ''

df['correct_letter'] = df.apply(get_correct_letter, axis=1)

df.head()

| Unnamed: 0 | question | answerA | answerB | answerC | answerD | correct_answer | correct_letter |
|---|---|---|---|---|---|---|---|
| 0 | Name two fundamental freedoms under Canadian law. | Equality rights and care for the environment | Aboriginal rights and conserving water | Freedom of speech and freedom of religion | The Magna Carta and English common law | Freedom of speech and freedom of religion | c |
| 1 | List four additional rights Canadian citizens ... | The right to be educated in either official la... | The right of be educated in either official la... | The right to enter and leave Canada and the Un... | The right to live and work anywhere in Canada,... | The right to be educated in either official la... | a |
| 2 | Name three responsibilities of Canadian citize... | Serving on a jury, keeping your yard tidy and ... | Obeying the law, voting in elections and worki... | Obeying the law, voting in elections and takin... | Voting in elections, taking responsibility for... | Obeying the law, voting in elections and takin... | c |
| 3 | Give an example of how you can help in the com... | Wear red on Canada Day | Drive to work | Volunteer at a food bank | Wash your car | Volunteer at a food bank | c |
| 4 | What is meant by the "equality of women and men"? | Men and women are treated equally under the law. | Men and women are the same | Men and women are similar | Men and women need to obey the law | Men and women are treated equally under the law. | a |
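
Because `get_correct_letter` uses substring matching (`in`) and returns the first hit, a correct answer whose text is contained in more than one option would silently be mapped to the earliest letter. The following is a minimal sketch of a sanity check and is not part of the original notebook: `matching_letters` is a hypothetical helper, and the sample row is made up for illustration.

```python
def matching_letters(row):
    """Return every option letter whose text contains the correct answer."""
    columns = {"a": "answerA", "b": "answerB", "c": "answerC", "d": "answerD"}
    return [letter for letter, col in columns.items()
            if row["correct_answer"] in row[col]]

# Made-up row for illustration only (not taken from the scraped data).
sample = {
    "correct_answer": "Ottawa",
    "answerA": "Toronto",
    "answerB": "Ottawa",
    "answerC": "Ottawa, Ontario",
    "answerD": "Montreal",
}

hits = matching_letters(sample)
if len(hits) != 1:
    # Zero hits means no option matched; several hits means an ambiguous match.
    print("Check this row by hand; matched letters:", hits)
```

Since the helper only indexes the row by column name, it should also work on the notebook's dataframe, for example `[matching_letters(row) for _, row in df.iterrows()]`; any list whose length is not 1 deserves a manual look.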


### Submit practice questions to GPT-3.5/4

The code uses a few-shot prompting approach (here, with a single worked example). It provides GPT-3.5 and GPT-4 with a prompt containing contextual information (i.e., that the system is being asked to answer questions from a Canadian citizenship test), an example of a question with an answer in the format sought, and the actual practice question with its possible answers. The system returns the letter corresponding to the answer that it judges most likely to be correct.
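
To make the prompt layout concrete, the sketch below assembles a comparable prompt for one practice question without calling the API. It approximates the text that `get_answer` builds (whitespace may differ slightly); `format_question` is a hypothetical helper introduced only for this illustration.

```python
def format_question(number, question, options):
    """Render one question and its four lettered options as plain text."""
    lines = [f"Question {number}: {question}", "Possible answers:"]
    lines += [f" {letter}: {text}" for letter, text in zip("abcd", options)]
    return "\n".join(lines) + "\n\n"

instruction = (
    "Please answer this question from a Canadian citizenship test, "
    "providing only the letter of the answer that is most likely correct.\n\n"
)

# Worked example shown to the model, with its answer filled in.
example = format_question(
    1,
    "Why is the battle of Vimy Ridge important to Canadians?",
    [
        "It was the last battle of the First World War",
        "It was an important victory in the Boer War",
        "It has come to symbolize Canada's coming of age as a nation",
        "Out of it was formed the Canadian Corps",
    ],
) + "Selected answer: c\n\n"

# Target question from the practice set, with the answer left blank.
target = format_question(
    2,
    "Who is the Premier of Ontario?",
    ["Tim Hudak", "Doug Ford", "Andrea Horwath", "Steven Del Duca"],
) + "Selected answer:"

prompt = instruction + example + target
print(prompt)
```

Ending the prompt on `Selected answer:` nudges the model to complete the pattern set by the worked example with a single letter.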

In [3]:
# Helper functions for GPT-3.5/4

def get_completion(user_message,
        system_message="You are a helpful assistant to a Canadian law student",
        model_to_use = "gpt-3.5-turbo-0125",
        temperature = 0,
        time_delay = 0):

    completion = client.chat.completions.create(
        model=model_to_use,
        temperature=temperature,
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message}
        ]
    )
    if time_delay > 0:
        time.sleep(time_delay)

    return completion.choices[0].message.content

def get_answer(x, system_message="You are a helpful assistant to a Canadian law student", model_to_use="gpt-3.5-turbo-0125", time_delay=0):
        
    header = 'Please answer this question from a Canadian citizenship test, providing only the letter of the answer that is most likely correct. \n\n'
    header_question = 'Question 1: Why is the battle of Vimy Ridge important to Canadians?\n'
    header_answers = []
    header_answers.append('Possible answers:\n')
    header_answers.append(' a: It was the last battle of the First World War\n')
    header_answers.append(' b: It was an important victory in the Boer War\n')
    header_answers.append(' c: It has come to symbolize Canada\'s coming of age as a nation\n')
    header_answers.append(' d: Out of it was formed the Canadian Corps\n\n')
    header_answers = ''.join(header_answers)
    header_completion = 'Selected answer: c \n\n'

    question = 'Question 2: ' + x['question'] + '\n'
    answers = []
    answers.append('Possible answers: \n')
    answers.append(' a: ' + x['answerA'] + '\n')
    answers.append(' b: ' + x['answerB'] + '\n')
    answers.append(' c: ' + x['answerC'] + '\n')
    answers.append(' d: ' + x['answerD'] + '\n\n')
    answers = ''.join(answers)
    completion_prompt = 'Selected answer:'

    user_prompt = header + header_question + header_answers + header_completion + question + answers + completion_prompt
       
    completion = get_completion(user_prompt, system_message=system_message, model_to_use=model_to_use, time_delay=time_delay)

    return completion.strip()
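
`get_completion` makes a single API call, so one transient error stops the whole run. The following is a minimal sketch of a generic retry wrapper with exponential backoff, under the assumption that failures surface as exceptions. `call_with_retries` and the `flaky` stand-in are hypothetical names that do not appear in the original notebook, and the wrapper does nothing for a call that hangs without raising.

```python
import time

def call_with_retries(fn, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on an exception, wait and retry with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            # Wait 1x, 2x, 4x, ... the base delay between attempts.
            sleep(base_delay * 2 ** (attempt - 1))

# Stand-in for an API call that fails twice before succeeding.
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "c"

waits = []  # record the delays instead of actually sleeping
result = call_with_retries(flaky, sleep=waits.append)
print(result, waits)
```

In the notebook this could wrap the existing call, for example `call_with_retries(lambda: get_completion(user_prompt, model_to_use="gpt-4"))`. The `openai` v1 client also accepts `timeout` and `max_retries` arguments, which may be a simpler route.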


In [4]:
# Get answers for GPT-3.5 (if this gets stuck, interrupt the kernel and run again)

system_message = """You are an assistant that helps assess possible citizenship test questions in Canada. 
You only return the correct letter corresponding with what you believe is the most likely correct answer."""

model_to_use = "gpt-3.5-turbo-0125"

# set the time delay to 20 seconds if on free tier
time_delay = 1

df['gpt-3.5-answer'] = df.progress_apply(lambda x: get_answer(x,
                        system_message=system_message,
                        model_to_use=model_to_use,
                        time_delay=time_delay), axis=1)

df['gpt_3.5_correct'] = df['gpt-3.5-answer'] == df['correct_letter']
print('Percentage of GPT-3.5 answers that are correct: ', 100 * len(df[df['gpt_3.5_correct']==True])/len(df))

print("These are the ones that GPT-3.5 got wrong:")
df[df['gpt_3.5_correct']==False]


100%|██████████| 60/60 [01:26<00:00,  1.44s/it]

Percentage of GPT-3.5 answers that are correct:  91.66666666666667
These are the ones that GPT-3.5 got wrong:





| Unnamed: 0 | question | answerA | answerB | answerC | answerD | correct_answer | correct_letter | gpt-3.5-answer | gpt_3.5_correct |
|---|---|---|---|---|---|---|---|---|---|
| 8 | What do you promise when you take the oath of ... | To pledge your loyalty to the Sovereign, His M... | To pledge your allegiance to the flag and fulf... | To pledge your allegiance to the Canadian Cons... | To pledge your loyalty to Canada from sea to sea | To pledge your loyalty to the Sovereign, His M... | a | b | False |
| 11 | Who is Canada's Head of State? | The Prime Minister, Justin Trudeau | The Governor General, Mary Simon | The consort of the Queen, Prince Phillip | The Sovereign, His Majesty King Charles III | The Sovereign, His Majesty King Charles III | d | b | False |
| 24 | Name the federal political parties represented... | Conservative Party of Canada (Pierre Poilievre... | Conservative Party of Canada (Pierre Poilievre... | Conservative Party of Canada (Pierre Poilievre... | Conservative Party of Canada (Pierre Poilievre... | Conservative Party of Canada (Pierre Poilievre... | a | b | False |
| 31 | Who is the Premier of Ontario? | Tim Hudak | Doug Ford | Andrea Horwath | Steven Del Duca | Doug Ford | b | d | False |
| 39 | Who are the Québécois? | French-speaking Catholics | Descendants of French colonists | European settlers | People of Quebec | People of Quebec | d | b | False |


In [5]:
# Get answers for GPT-4 (if this gets stuck, interrupt the kernel and run again)

system_message = """You are an assistant that helps assess possible citizenship test questions in Canada. 
You only return the correct letter corresponding with what you believe is the most likely correct answer."""

model_to_use = "gpt-4"

# set the time delay to 20 seconds if on free tier
time_delay = 1

df['gpt-4-answer'] = df.progress_apply(lambda x: get_answer(x,
                        system_message=system_message,
                        model_to_use=model_to_use,
                        time_delay=time_delay), axis=1)

df['gpt_4_correct'] = df['gpt-4-answer'] == df['correct_letter']
print('Percentage of GPT-4 answers that are correct: ', 100 * len(df[df['gpt_4_correct']==True])/len(df))

print("These are the ones that GPT-4 got wrong:")
df[df['gpt_4_correct']==False]

100%|██████████| 60/60 [01:31<00:00,  1.53s/it]

Percentage of GPT-4 answers that are correct:  90.0
These are the ones that GPT-4 got wrong:





| Unnamed: 0 | question | answerA | answerB | answerC | answerD | correct_answer | correct_letter | gpt-3.5-answer | gpt_3.5_correct | gpt-4-answer | gpt_4_correct |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | What do you promise when you take the oath of ... | To pledge your loyalty to the Sovereign, His M... | To pledge your allegiance to the flag and fulf... | To pledge your allegiance to the Canadian Cons... | To pledge your loyalty to Canada from sea to sea | To pledge your loyalty to the Sovereign, His M... | a | b | False | b | False |
| 11 | Who is Canada's Head of State? | The Prime Minister, Justin Trudeau | The Governor General, Mary Simon | The consort of the Queen, Prince Phillip | The Sovereign, His Majesty King Charles III | The Sovereign, His Majesty King Charles III | d | b | False | b | False |
| 15 | How often does an election have to be held acc... | Within three years of the last election | Every four years following the most recent gen... | Within five years of the last election | Whenever the Sovereign decides | Every four years following the most recent gen... | b | b | True | c | False |
| 24 | Name the federal political parties represented... | Conservative Party of Canada (Pierre Poilievre... | Conservative Party of Canada (Pierre Poilievre... | Conservative Party of Canada (Pierre Poilievre... | Conservative Party of Canada (Pierre Poilievre... | Conservative Party of Canada (Pierre Poilievre... | a | b | False | b | False |
| 25 | Which federal political party is in power? | Liberal Party of Canada | Green Party | Conservative Party of Canada | New Democratic Party | Liberal Party of Canada | a | a | True | This question cannot be answered without curre... | False |
| 39 | Who are the Québécois? | French-speaking Catholics | Descendants of French colonists | European settlers | People of Quebec | People of Quebec | d | b | False | b | False |
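
The scoring above is an exact string comparison between the model's reply and `correct_letter`, so a reply such as `B` or `b.` would be marked wrong, and a full-sentence reply (as in row 25) cannot be matched to any letter. The sketch below shows one possible way to normalize replies before comparing them. `extract_letter` is a hypothetical helper that is not part of the original notebook, and the sample replies are made up for illustration.

```python
import re

def extract_letter(reply):
    """Return 'a'-'d' if the reply is essentially just a letter, else ''."""
    # Accept an optional pair of parentheses and trailing '.', ':' or spaces.
    match = re.fullmatch(r"\s*\(?([a-dA-D])\)?[\s.:]*", reply)
    return match.group(1).lower() if match else ""

# Made-up replies for illustration only.
for reply in ["c", "B.", " (d) ", "I cannot answer this."]:
    print(repr(reply), "->", repr(extract_letter(reply)))
```

A reply that is not a bare letter maps to an empty string, so it would still count as incorrect; the only intended change is that harmless formatting differences would no longer be penalized.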


In [6]:
# Export dataframe to Excel
df.to_excel('questions_with_gpt_3.5-4_answers.xlsx', index = False)

# So, can GPT-3.5 / GPT-4 pass the Canadian citizenship test?

Yes, and quite easily. The passing grade is 75% (15 correct answers out of 20 questions), while GPT-3.5 and GPT-4 typically score between 90% and 92% on the 60 practice questions (91.7% and 90.0%, respectively, in the run shown above). By comparison, a 2019 [poll](https://www.ctvnews.ca/canada/nearly-9-in-10-canadians-would-fail-the-citizenship-test-poll-1.4489704) found that nearly 9 out of 10 Canadian citizens would not pass the test.
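
As a rough worked calculation, a 75% passing grade on 20 questions means at least 15 correct answers. Assuming (this is an assumption for illustration, not something the test rules state) that a 20-question test were drawn uniformly at random, without replacement, from these 60 practice questions, the chance of passing follows a hypergeometric distribution. `pass_probability` is a hypothetical helper, and the counts of 55 and 54 correct answers come from the runs reported in this notebook.

```python
from math import comb

def pass_probability(pool_size, num_correct, test_size=20, pass_mark=15):
    """Chance of at least pass_mark correct answers on a random test."""
    num_wrong = pool_size - num_correct
    # Count the draws with k answerable questions and test_size - k misses.
    favourable = sum(
        comb(num_correct, k) * comb(num_wrong, test_size - k)
        for k in range(pass_mark, test_size + 1)
    )
    return favourable / comb(pool_size, test_size)

print("GPT-3.5 (55 of 60 correct):", pass_probability(60, 55))
print("GPT-4   (54 of 60 correct):", pass_probability(60, 54))
```

Under that assumption, a model with 5 wrong answers in the pool cannot fail, since any 20-question draw contains at most 5 misses. A model with 6 wrong answers fails only when all 6 land in the same draw, which happens in under 0.1% of draws.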