# The Popularity of Prompt Engineering Methods

This series of notebooks produces statistics on Semantic Scholar citations per day for all of the prompt engineering approaches listed at "https://www.promptingguide.ai/papers", at "https://en.wikipedia.org/wiki/Prompt_engineering#Text-to-text", and in the citations section of "The Practicality of Prompt Engineering".

This notebook performs the web scraping and makes an initial request for the citation data.


In [23]:
# Print current datetime/run as of date
import datetime
print(datetime.datetime.now())


2023-10-22 12:54:45.361669


In [24]:
# Packages
import requests
from bs4 import BeautifulSoup
import pandas as pd
from datetime import date


In [25]:
# Scrape list of "Approaches" papers on "https://www.promptingguide.ai/papers"
# Use the Internet Archive copy
url = "https://web.archive.org/web/20231022192048/https://www.promptingguide.ai/papers"
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
print(soup)


<!DOCTYPE html>
<html lang="en"><head><script charset="utf-8" src="//web-static.archive.org/_static/js/bundle-playback.js?v=6XRi73ky" type="text/javascript"></script>
<script charset="utf-8" src="//web-static.archive.org/_static/js/wombat.js?v=txqj7nKC" type="text/javascript"></script>
<script>window.RufflePlayer=window.RufflePlayer||{};window.RufflePlayer.config={"autoplay":"on","unmuteOverlay":"hidden"};</script>
<script src="//web-static.archive.org/_static/js/ruffle.js" type="text/javascript"></script>
<script type="text/javascript">
  __wm.init("https://web.archive.org/web");
  __wm.wombat("https://www.promptingguide.ai/papers","20231022192048","https://web.archive.org/","web","//web-static.archive.org/_static/",
	      "1698002448");
</script>
<link href="//web-static.archive.org/_static/css/banner-styles.css?v=S1zqJCYt" rel="stylesheet" type="text/css"/>
<link href="//web-static.archive.org/_static/css/iconochive.css?v=qtvMKcIJ" rel="stylesheet" type="text/css"/>
<!-- End Waybac

In [26]:
# Strip html tags
no_tags = soup.get_text()
print(no_tags)










Papers | Prompt Engineering Guide Prompt Engineering GuidePrompt Engineering CoursePrompt Engineering CourseServicesServicesAboutAboutGitHubGitHub (opens in a new tab)DiscordDiscord (opens in a new tab)Prompt EngineeringIntroductionLLM SettingsBasics of PromptingPrompt ElementsGeneral Tips for Designing PromptsExamples of PromptsTechniquesZero-shot PromptingFew-shot PromptingChain-of-Thought PromptingSelf-ConsistencyGenerate Knowledge PromptingTree of ThoughtsRetrieval Augmented GenerationAutomatic Reasoning and Tool-useAutomatic Prompt EngineerActive-PromptDirectional Stimulus PromptingReActMultimodal CoTGraph PromptingApplicationsProgram-Aided Language ModelsGenerating DataGenerating Synthetic Dataset for RAGTackling Generated Datasets DiversityGenerating CodeGraduate Job Classification Case StudyPrompt FunctionModelsFlanChatGPTLLaMAGPT-4LLM CollectionRisks & MisusesAdversarial PromptingFactualityBiasesPapersToolsNotebooksDatasetsAdditional ReadingsEnglishLightOn This PageOve

In [27]:
# Get approaches by taking text after the line "Approaches" and before the line "Applications"

approaches = no_tags.split("\nApproaches")[1].split("\nApplications")[0]
print(approaches)




Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL (opens in a new tab) (September 2023)
Chain-of-Verification Reduces Hallucination in Large Language Models (opens in a new tab) (September 2023)
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers (opens in a new tab) (September 2023)
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting (opens in a new tab) (September 2023)
Re-Reading Improves Reasoning in Language Models (opens in a new tab) (September 2023)
Graph of Thoughts: Solving Elaborate Problems with Large Language Models (opens in a new tab) (August 2023)
Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding (opens in a new tab) (July 2023)
Focused Prefix Tuning for Controllable Text Generation (opens in a new tab) (June 2023)
Exploring Lottery Prompts for Pre-trained Language Models (opens in a new tab) (May 2023)
Less Likely Brainstorming: Using Language Models to Genera
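
The marker-based split in the cell above can be tried on a small string first. The following is a minimal sketch rather than the notebook's own code: `text_between` is a hypothetical helper and the sample text is made up. It assumes, as the cell does, that each marker starts its own line. Unlike the chained `split` calls, it raises an error when a marker is missing instead of silently returning the rest of the page.

```python
def text_between(text, start_marker, end_marker):
    """Return the text after the first start_marker and before the next end_marker."""
    # partition() splits on the first occurrence only
    _, found_start, rest = text.partition(start_marker)
    if not found_start:
        raise ValueError(f"start marker {start_marker!r} not found")
    section, found_end, _ = rest.partition(end_marker)
    if not found_end:
        raise ValueError(f"end marker {end_marker!r} not found")
    return section


# Made-up page text for illustration
sample = "Overview\nIntro text\nApproaches\nPaper A\nPaper B\nApplications\nPaper C\n"
section = text_between(sample, "\nApproaches", "\nApplications")
print(repr(section))
```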

In [28]:
# Collapse doubled newlines; any empty lines that remain are skipped when the dataframe is built
no_empty_lines = approaches.replace("\n\n", "\n")
print(no_empty_lines)



Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL (opens in a new tab) (September 2023)
Chain-of-Verification Reduces Hallucination in Large Language Models (opens in a new tab) (September 2023)
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers (opens in a new tab) (September 2023)
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting (opens in a new tab) (September 2023)
Re-Reading Improves Reasoning in Language Models (opens in a new tab) (September 2023)
Graph of Thoughts: Solving Elaborate Problems with Large Language Models (opens in a new tab) (August 2023)
Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding (opens in a new tab) (July 2023)
Focused Prefix Tuning for Controllable Text Generation (opens in a new tab) (June 2023)
Exploring Lottery Prompts for Pre-trained Language Models (opens in a new tab) (May 2023)
Less Likely Brainstorming: Using Language Models to Generat
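
As the leading blank lines in the output above suggest, a single `replace("\n\n", "\n")` pass does not remove every empty line when three or more newlines occur in a row. The sketch below, on made-up text, contrasts that with filtering the result of `splitlines()`. It is offered as an alternative under that assumption, not as what the notebook does.

```python
# Made-up text with runs of three newlines
text = "\n\n\nPaper A\n\n\nPaper B\n"

# One pass of replace() shortens runs of newlines but can leave blank lines behind
collapsed_once = text.replace("\n\n", "\n")

# Filtering after splitlines() drops every empty or whitespace-only line
non_empty_lines = [line for line in text.splitlines() if line.strip()]
cleaned = "\n".join(non_empty_lines)

print(repr(collapsed_once))
print(repr(cleaned))
```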

In [29]:
# Create a dataframe of approaches
# First column is title (content before "(opens in a new tab)")
# Second column is month (content after "(opens in a new tab)")

marker = "(opens in a new tab)"

# Collect one record per non-empty line, then build the dataframe in a single step
records = []
for line in no_empty_lines.split("\n"):
    if line != "":
        parts = line.split(marker)
        records.append({"Title": parts[0], "Month": parts[1]})
approaches_papers = pd.DataFrame(records, columns=["Title", "Month"])

print(approaches_papers)


                                                 Title              Month
0    Query-Dependent Prompt Evaluation and Optimiza...   (September 2023)
1    Chain-of-Verification Reduces Hallucination in...   (September 2023)
2    Connecting Large Language Models with Evolutio...   (September 2023)
3    From Sparse to Dense: GPT-4 Summarization with...   (September 2023)
4    Re-Reading Improves Reasoning in Language Models    (September 2023)
..                                                 ...                ...
144                   Learning from Task Descriptions     (November 2020)
145  AutoPrompt: Eliciting Knowledge from Language ...     (October 2020)
146             Language Models are Few-Shot Learners          (May 2020)
147        How Can We Know What Language Models Know?         (July 2020)
148           Scaling Laws for Neural Language Models      (January 2020)

[149 rows x 2 columns]
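
The Month values above keep their surrounding parentheses and leading space. If a cleaner split were wanted, a regular expression could capture the title and the month in one step. This is a sketch only: `LINE_PATTERN` and `parse_paper_line` are hypothetical names, and the pattern assumes every listing line has the "Title (opens in a new tab) (Month Year)" shape seen in the output above.

```python
import re

# Title, then the link marker, then "(Month Year)"
LINE_PATTERN = re.compile(
    r"^(?P<title>.+?)\s*\(opens in a new tab\)\s*\((?P<month>[A-Za-z]+ \d{4})\)\s*$"
)


def parse_paper_line(line):
    """Return a Title/Month record for a listing line, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    return {"Title": match.group("title"), "Month": match.group("month")}


lines = [
    "Re-Reading Improves Reasoning in Language Models (opens in a new tab) (September 2023)",
    "",
    "A line without the expected marker",
]
# Lines that do not match are skipped rather than raising an error
records = [rec for rec in (parse_paper_line(line) for line in lines) if rec is not None]
print(records)
```

The resulting list of records can be passed straight to `pd.DataFrame(records)`.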


In [30]:
# Create a column "Source" that is "Prompt Engineering Guide"
approaches_papers["Source"] = "Prompt Engineering Guide"

# Limit columns to "Title" and "Source"
approaches_papers = approaches_papers[["Title", "Source"]]


In [31]:
# Load in Excel file "Papers From Wikipedia.xlsx"

papers_from_wikipedia = pd.read_excel("Papers From Wikipedia.xlsx")

papers_from_wikipedia


Unnamed: 0,Title,Source
0,Chain-of-Thought Prompting Elicits Reasoning i...,Prompt engineering - Wikipedia
1,Large Language Models are Zero-Shot Reasoners,Prompt engineering - Wikipedia
2,Scaling Instruction-Finetuned Language Models,Prompt engineering - Wikipedia
3,Generated Knowledge Prompting for Commonsense ...,Prompt engineering - Wikipedia
4,Least-to-most prompting enables complex reason...,Prompt engineering - Wikipedia
5,Self-consistency improves chain of thought rea...,Prompt engineering - Wikipedia
6,Active Prompting with Chain-of-Thought for Lar...,Prompt engineering - Wikipedia
7,Complexity-Based Prompting for Multi-Step Reas...,Prompt engineering - Wikipedia
8,Self-Refine: Iterative Refinement with Self-Fe...,Prompt engineering - Wikipedia
9,Large Language Model Guided Tree-of-Thought,Prompt engineering - Wikipedia


In [32]:
# Drop rows with NaN values in the Source column
papers_from_wikipedia = papers_from_wikipedia.dropna(subset=["Source"])

papers_from_wikipedia


Unnamed: 0,Title,Source
0,Chain-of-Thought Prompting Elicits Reasoning i...,Prompt engineering - Wikipedia
1,Large Language Models are Zero-Shot Reasoners,Prompt engineering - Wikipedia
2,Scaling Instruction-Finetuned Language Models,Prompt engineering - Wikipedia
3,Generated Knowledge Prompting for Commonsense ...,Prompt engineering - Wikipedia
4,Least-to-most prompting enables complex reason...,Prompt engineering - Wikipedia
5,Self-consistency improves chain of thought rea...,Prompt engineering - Wikipedia
6,Active Prompting with Chain-of-Thought for Lar...,Prompt engineering - Wikipedia
7,Complexity-Based Prompting for Multi-Step Reas...,Prompt engineering - Wikipedia
8,Self-Refine: Iterative Refinement with Self-Fe...,Prompt engineering - Wikipedia
9,Large Language Model Guided Tree-of-Thought,Prompt engineering - Wikipedia
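
For reference, `dropna(subset=[...])` removes rows, not columns. A small sketch on made-up data shows the effect:

```python
import pandas as pd

# Made-up rows; the second one has no Source
papers = pd.DataFrame(
    {
        "Title": ["Paper A", "Paper B", "Paper C"],
        "Source": ["Wikipedia", None, "Wikipedia"],
    }
)

# subset=["Source"] drops rows whose Source is missing; the columns are untouched
kept = papers.dropna(subset=["Source"])
print(kept)
```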


In [33]:
# Load in CSV file "Zotero Citations.csv"

zotero_citations = pd.read_csv("Zotero Citations.csv")

zotero_citations


Unnamed: 0,Key,Item Type,Publication Year,Author,Title,Publication Title,ISBN,ISSN,DOI,Url,...,Programming Language,Version,System,Code,Code Number,Section,Session,Committee,History,Legislative Body
0,355F8MSY,preprint,2023.0,OpenAI,GPT-4 Technical Report,,,,,http://arxiv.org/abs/2303.08774,...,,,,,,,,,,
1,F37GLVE7,preprint,2023.0,"Cheng, Liying; Li, Xingxuan; Bing, Lidong",Is GPT-4 a Good Data Analyst?,,,,,http://arxiv.org/abs/2305.15038,...,,,,,,,,,,
2,65DI4GGM,newspaperArticle,2023.0,"Roose, Kevin",A Conversation With Bing’s Chatbot Left Me Dee...,The New York Times,,0362-4331,,https://www.nytimes.com/2023/02/16/technology/...,...,,,,,,Technology,,,,
3,HLYY654Y,forumPost,2023.0,Ethan Mollick [@emollick],I have a strong suspicion that “prompt enginee...,Twitter,,,,https://twitter.com/emollick/status/1627804798...,...,,,,,,,,,,
4,H445H9N6,conferencePaper,2022.0,"Wu, Tongshuang; Terry, Michael; Cai, Carrie Jun",AI Chains: Transparent and Controllable Human-...,CHI Conference on Human Factors in Computing S...,978-1-4503-9157-3,,10.1145/3491102.3517582,https://dl.acm.org/doi/10.1145/3491102.3517582,...,,,,,,,,,,
5,U6WJMRWZ,magazineArticle,2023.0,"Acar, Oguz A.",AI Prompt Engineering Isn’t the Future,Harvard Business Review,,0017-8012,,https://hbr.org/2023/06/ai-prompt-engineering-...,...,,,,,,,,,,
6,5YYT9DF3,preprint,2023.0,"Diao, Shizhe; Wang, Pengcheng; Lin, Yong; Zhan...",Active Prompting with Chain-of-Thought for Lar...,,,,,http://arxiv.org/abs/2302.12246,...,,,,,,,,,,
7,4BILARCH,webpage,,,PromptBase,,,,,https://promptbase.com,...,,,,,,,,,,
8,MI32YKHN,preprint,2023.0,"Hebenstreit, Konstantin; Praas, Robert; Kiesew...",An automatically discovered chain-of-thought p...,,,,,http://arxiv.org/abs/2305.02897,...,,,,,,,,,,
9,HBGCI5CT,journalArticle,,"Wei, Jason; Wang, Xuezhi; Schuurmans, Dale; Bo...",Chain-of-Thought Prompting Elicits Reasoning i...,,,,,,...,,,,,,,,,,


In [34]:
# Remove some items that are clearly not of interest:
# - Item Type values of "forumPost", "newspaperArticle", "magazineArticle"
# - Publication Title values of "Business Insider", "IBM Research Blog", "The Conversation"
# - Specific item titles "PromptBase", "How to Write Plain English", "Semantic Scholar | AI-Powered Research Tool", "A New Academic Vocabulary List", "radon: Code Metrics in Python"
# - Items with a non-NaN "Programming Language" value
# Each mask is built from the frame being filtered so that its index matches that frame
zotero_citations_limited = zotero_citations[~zotero_citations["Item Type"].isin(["forumPost", "newspaperArticle", "magazineArticle"])]
zotero_citations_limited = zotero_citations_limited[~zotero_citations_limited["Publication Title"].isin(["Business Insider", "IBM Research Blog", "The Conversation"])]
zotero_citations_limited = zotero_citations_limited[~zotero_citations_limited["Title"].isin(["PromptBase", "How to Write Plain English", "Semantic Scholar | AI-Powered Research Tool", "A New Academic Vocabulary List", "radon: Code Metrics in Python"])]
zotero_citations_limited = zotero_citations_limited[zotero_citations_limited["Programming Language"].isna()]

zotero_citations_limited


Unnamed: 0,Key,Item Type,Publication Year,Author,Title,Publication Title,ISBN,ISSN,DOI,Url,...,Programming Language,Version,System,Code,Code Number,Section,Session,Committee,History,Legislative Body
0,355F8MSY,preprint,2023.0,OpenAI,GPT-4 Technical Report,,,,,http://arxiv.org/abs/2303.08774,...,,,,,,,,,,
1,F37GLVE7,preprint,2023.0,"Cheng, Liying; Li, Xingxuan; Bing, Lidong",Is GPT-4 a Good Data Analyst?,,,,,http://arxiv.org/abs/2305.15038,...,,,,,,,,,,
4,H445H9N6,conferencePaper,2022.0,"Wu, Tongshuang; Terry, Michael; Cai, Carrie Jun",AI Chains: Transparent and Controllable Human-...,CHI Conference on Human Factors in Computing S...,978-1-4503-9157-3,,10.1145/3491102.3517582,https://dl.acm.org/doi/10.1145/3491102.3517582,...,,,,,,,,,,
6,5YYT9DF3,preprint,2023.0,"Diao, Shizhe; Wang, Pengcheng; Lin, Yong; Zhan...",Active Prompting with Chain-of-Thought for Lar...,,,,,http://arxiv.org/abs/2302.12246,...,,,,,,,,,,
8,MI32YKHN,preprint,2023.0,"Hebenstreit, Konstantin; Praas, Robert; Kiesew...",An automatically discovered chain-of-thought p...,,,,,http://arxiv.org/abs/2305.02897,...,,,,,,,,,,
9,HBGCI5CT,journalArticle,,"Wei, Jason; Wang, Xuezhi; Schuurmans, Dale; Bo...",Chain-of-Thought Prompting Elicits Reasoning i...,,,,,,...,,,,,,,,,,
10,LURSUVRC,preprint,2022.0,"Zhang, Zhuosheng; Zhang, Aston; Li, Mu; Smola,...",Automatic Chain of Thought Prompting in Large ...,,,,,http://arxiv.org/abs/2210.03493,...,,,,,,,,,,
11,46IQXEII,preprint,2023.0,"Dhuliawala, Shehzaad; Komeili, Mojtaba; Xu, Ji...",Chain-of-Verification Reduces Hallucination in...,,,,,http://arxiv.org/abs/2309.11495,...,,,,,,,,,,
12,N3RXJ67P,preprint,2022.0,"Liu, Jiacheng; Liu, Alisa; Lu, Ximing; Welleck...",Generated Knowledge Prompting for Commonsense ...,,,,,http://arxiv.org/abs/2110.08387,...,,,,,,,,,,
16,WP3MGPDG,conferencePaper,2022.0,"Zhou, Yongchao; Muresanu, Andrei Ioan; Han, Zi...",Large Language Models are Human-Level Prompt E...,,,,,https://openreview.net/forum?id=92gvk82DE-,...,,,,,,,,,,
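
The exclusions can also be combined into one boolean mask built from a single frame, which keeps the mask and the frame on the same index. This is a sketch on made-up rows, with a shortened exclusion list for the titles. It is not the notebook's code.

```python
import pandas as pd

# Made-up items covering each exclusion rule
items = pd.DataFrame(
    {
        "Item Type": ["preprint", "forumPost", "webpage", "preprint"],
        "Title": ["Paper A", "A post", "PromptBase", "Paper B"],
        "Programming Language": [None, None, None, "Python"],
    }
)

# Build every condition from the same frame so all masks share one index
keep = (
    ~items["Item Type"].isin(["forumPost", "newspaperArticle", "magazineArticle"])
    & ~items["Title"].isin(["PromptBase"])
    & items["Programming Language"].isna()
)
filtered = items[keep]
print(filtered["Title"].tolist())
```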


In [35]:
# Set value of "Source" column to "Zotero Citations"
zotero_citations_limited["Source"] = "Zotero Citations"

# Limit columns to "Title" and "Source"
zotero_citations_limited = zotero_citations_limited[["Title", "Source"]]
zotero_citations_limited


Unnamed: 0,Title,Source
0,GPT-4 Technical Report,Zotero Citations
1,Is GPT-4 a Good Data Analyst?,Zotero Citations
4,AI Chains: Transparent and Controllable Human-...,Zotero Citations
6,Active Prompting with Chain-of-Thought for Lar...,Zotero Citations
8,An automatically discovered chain-of-thought p...,Zotero Citations
9,Chain-of-Thought Prompting Elicits Reasoning i...,Zotero Citations
10,Automatic Chain of Thought Prompting in Large ...,Zotero Citations
11,Chain-of-Verification Reduces Hallucination in...,Zotero Citations
12,Generated Knowledge Prompting for Commonsense ...,Zotero Citations
16,Large Language Models are Human-Level Prompt E...,Zotero Citations


In [43]:
# Concatenate together
all_papers = pd.concat([approaches_papers, papers_from_wikipedia, zotero_citations_limited], ignore_index=True)

# Strip extra whitespace from Title column
all_papers["Title"] = all_papers["Title"].str.strip()

# Replace "Chain-of-Thought Prompting Elicits" with "Chain of Thought Prompting Elicits"
# regex=False makes each replacement a literal string match on every pandas version
all_papers["Title"] = all_papers["Title"].str.replace("Chain-of-Thought Prompting Elicits", "Chain of Thought Prompting Elicits", regex=False)

# Replace "Large Language Models are Human" with "Large Language Models Are Human"
all_papers["Title"] = all_papers["Title"].str.replace("Large Language Models are Human", "Large Language Models Are Human", regex=False)

# Replace "Self-consistency improves chain of thought reasoning in language models" with "Self-Consistency Improves Chain of Thought Reasoning in Language Models"
all_papers["Title"] = all_papers["Title"].str.replace("Self-consistency improves chain of thought reasoning in language models", "Self-Consistency Improves Chain of Thought Reasoning in Language Models", regex=False)

# Collapse items by Title
# Concatenate Source values together, separated by ", "
all_papers = all_papers.groupby("Title").agg({"Source": lambda x: ", ".join(x)}).reset_index()

# Sort by Title
all_papers = all_papers.sort_values(by="Title").reset_index(drop=True)

# Print all rows in the dataframe
pd.set_option("display.max_rows", None)

all_papers


Unnamed: 0,Title,Source
0,A Prompt Pattern Catalog to Enhance Prompt Eng...,Prompt Engineering Guide
1,A Taxonomy of Prompt Modifiers for Text-To-Ima...,Prompt Engineering Guide
2,AI Chains: Transparent and Controllable Human-...,"Prompt Engineering Guide, Zotero Citations"
3,ART: Automatic multi-step reasoning and tool-u...,Prompt Engineering Guide
4,Active Prompting with Chain-of-Thought for Lar...,"Prompt Engineering Guide, Prompt engineering -..."
5,An automatically discovered chain-of-thought p...,Zotero Citations
6,Ask Me Anything: A simple strategy for prompti...,Prompt Engineering Guide
7,Atlas: Few-shot Learning with Retrieval Augmen...,Prompt Engineering Guide
8,AutoPrompt: Eliciting Knowledge from Language ...,Prompt Engineering Guide
9,Automatic Chain of Thought Prompting in Large ...,"Prompt Engineering Guide, Prompt engineering -..."
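
The collapse step above can be checked on a few made-up rows. The sketch below strips whitespace, groups by title, and joins the sources, mirroring the cell's logic on toy data. The titles and source labels are illustrative only.

```python
import pandas as pd

# Made-up rows; "Paper B " carries trailing whitespace on purpose
papers = pd.DataFrame(
    {
        "Title": ["Paper A", "Paper B ", "Paper A", "Paper B"],
        "Source": ["Guide", "Guide", "Wikipedia", "Zotero"],
    }
)

# Strip whitespace first so "Paper B " and "Paper B" fall into the same group
papers["Title"] = papers["Title"].str.strip()

# Groups come back sorted by Title; join keeps sources in first-seen order
collapsed = papers.groupby("Title")["Source"].agg(", ".join).reset_index()
print(collapsed)
```

Titles that differ only in capitalization or hyphenation still form separate groups, which is why the cell above normalizes a few known variants by hand before grouping.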


In [48]:
# Get Semantic Scholar citations

# For sleep to avoid API limit
import time

# Records for matched papers and for papers with no usable search result
semantic_scholar_records = []
no_results_records = []

# Today's date
today = date.today()

# Loop over papers
for paper_title in all_papers["Title"]:
    # Replace hyphens with a space (per documentation)
    query = paper_title.replace("-", " ")
    # Query Semantic Scholar; passing params lets requests URL-encode the query
    r = requests.get(
        "https://api.semanticscholar.org/graph/v1/paper/search",
        params={"query": query, "fields": "title,citationCount,publicationDate,year", "limit": 1},
    )
    # Attempt to read the first returned result (parse the JSON once)
    try:
        result = r.json()["data"][0]
        semantic_scholar_records.append({"paper title": paper_title, "semantic scholar title": result["title"], "ss_publication_date": result["publicationDate"], "ss_year": result["year"], "citation_count": result["citationCount"], "query": query, "day_queried": today})
    # No usable result: missing key, empty result list, or a response that is not JSON
    except (KeyError, IndexError, ValueError):
        no_results_records.append({"paper title": paper_title, "query": query})
    # Wait one second to avoid exceeding API limit
    time.sleep(1)

# Build both dataframes in a single step
semantic_scholar_df = pd.DataFrame(semantic_scholar_records)
no_results_df = pd.DataFrame(no_results_records)
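
A search URL for this endpoint can be built, and a response payload parsed, without touching the network, which makes both steps easy to check in isolation. The sketch below uses only the standard library. `build_search_url` and `first_result` are hypothetical helpers, the title and payloads are made up, and the payload shape (a `data` list of hits) is taken from the `r.json()['data'][0]` access in the cell above.

```python
from urllib.parse import urlencode

# Endpoint used in the cell above
SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_search_url(title):
    """Build the search URL for a title, percent-encoding the query string."""
    params = {
        "query": title.replace("-", " "),
        "fields": "title,citationCount,publicationDate,year",
        "limit": 1,
    }
    return SEARCH_URL + "?" + urlencode(params)


def first_result(payload):
    """Return the first search hit from a decoded JSON payload, or None."""
    data = payload.get("data") or []
    return data[0] if data else None


# Made-up title containing characters that must be encoded
url = build_search_url("Chain-of-Thought & Few-shot Prompting")
# Made-up payloads: one hit, no hits, and an error message without "data"
hit = first_result({"data": [{"title": "Example", "citationCount": 3}]})
miss = first_result({"data": []})
error = first_result({"message": "Too Many Requests"})
print(url)
```

Encoding matters because a title containing `&` or `#` would otherwise cut the query short or be read as a separate parameter.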


In [49]:
semantic_scholar_df


Unnamed: 0,paper title,semantic scholar title,ss_publication_date,ss_year,citation_count,query,day_queried
0,A Prompt Pattern Catalog to Enhance Prompt Eng...,A Prompt Pattern Catalog to Enhance Prompt Eng...,2023-02-21,2023,163,A Prompt Pattern Catalog to Enhance Prompt Eng...,2023-10-22
1,A Taxonomy of Prompt Modifiers for Text-To-Ima...,A Taxonomy of Prompt Modifiers for Text-To-Ima...,2022-04-20,2022,43,A Taxonomy of Prompt Modifiers for Text To Ima...,2023-10-22
2,AI Chains: Transparent and Controllable Human-...,AI Chains: Transparent and Controllable Human-...,2021-10-04,2021,146,AI Chains: Transparent and Controllable Human ...,2023-10-22
3,ART: Automatic multi-step reasoning and tool-u...,ART: Automatic multi-step reasoning and tool-u...,2023-03-16,2023,39,ART: Automatic multi step reasoning and tool u...,2023-10-22
4,Active Prompting with Chain-of-Thought for Lar...,Active Prompting with Chain-of-Thought for Lar...,2023-02-23,2023,34,Active Prompting with Chain of Thought for Lar...,2023-10-22
5,An automatically discovered chain-of-thought p...,An automatically discovered chain-of-thought p...,2023-05-04,2023,6,An automatically discovered chain of thought p...,2023-10-22
6,Ask Me Anything: A simple strategy for prompti...,Ask Me Anything: A simple strategy for prompti...,2022-10-05,2022,65,Ask Me Anything: A simple strategy for prompti...,2023-10-22
7,Atlas: Few-shot Learning with Retrieval Augmen...,Few-shot Learning with Retrieval Augmented Lan...,2022-08-05,2022,202,Atlas: Few shot Learning with Retrieval Augmen...,2023-10-22
8,AutoPrompt: Eliciting Knowledge from Language ...,Eliciting Knowledge from Language Models Using...,2020-10-29,2020,217,AutoPrompt: Eliciting Knowledge from Language ...,2023-10-22
9,Automatic Chain of Thought Prompting in Large ...,Automatic Chain of Thought Prompting in Large ...,2022-10-07,2022,201,Automatic Chain of Thought Prompting in Large ...,2023-10-22
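
The columns above are enough to compute a citations-per-day figure. The sketch below shows one plausible definition, total citations divided by days since publication. That definition is an assumption: the later notebooks in the series may calculate it differently. `citations_per_day` is a hypothetical helper, and the inputs are taken from the first row of the output above.

```python
from datetime import date


def citations_per_day(citation_count, publication_date, day_queried):
    """Average citations per day between publication and the query date."""
    days = (day_queried - publication_date).days
    if days <= 0:
        # Published on or after the query date, so the rate is undefined
        return None
    return citation_count / days


# First row above: 163 citations, published 2023-02-21, queried 2023-10-22
rate = citations_per_day(163, date(2023, 2, 21), date(2023, 10, 22))
print(round(rate, 3))
```

In the dataframe, `ss_publication_date` arrives as a string (or as a missing value), so it would need converting first, for example with `date.fromisoformat`.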


In [50]:
no_results_df


Unnamed: 0,paper title,query
0,Structured Prompting: Scaling In-Context Learn...,Structured Prompting: Scaling In Context Learn...
1,Using Tree-of-Thought Prompting to boost ChatG...,Using Tree of Thought Prompting to boost ChatG...


In [51]:
# Write both dataframes to Excel files
semantic_scholar_df.to_excel("Semantic Scholar Citations - First Pass.xlsx", index=False)
no_results_df.to_excel("No Results Semantic Scholar - First Pass.xlsx", index=False)
