# Chain of Density Summary

Adams et al. paper: https://arxiv.org/pdf/2309.04269.pdf

# GPT-3.5 Prompt

Adapted from the prompt in the Adams et al. paper.

Modifications:
- Include example JSON
- Ask for summaries of similar length rather than identical length

In [51]:
cod_gpt35_prompt_template = """Article: {{ARTICLE}} 

You will generate increasingly concise, entity-dense summaries of the above Article.

Repeat the following 2 steps 5 times.

Step 1. Identify 1-3 informative Entities (";" delimited) from the Article which are missing from the previously generated summary.
Step 2. Write a new, denser summary of similar length which covers every entity and detail from the previous summary plus the Missing Entities.

A Missing Entity is:
- Relevant: to the main story.
- Specific: descriptive yet concise (5 words or fewer).
- Novel: not in the previous summary.
- Faithful: present in the Article.
- Anywhere: located anywhere in the Article.

Guidelines:
- The first summary should be long (4-5 sentences, ~80 words) yet highly non-specific, containing little information beyond the entities marked as missing. Use overly verbose language and fillers (e.g., "this article discusses") to reach ~80 words.
- Make every word count: re-write the previous summary to improve flow and make space for additional entities.
- Make space with fusion, compression, and removal of uninformative phrases like "the article discusses".
- The summaries should become highly dense and concise yet self-contained, e.g., easily understood without the Article.
- Missing entities can appear anywhere in the new summary.
- Never drop entities from the previous summary. If space cannot be made, add fewer new entities.

Answer in JSON. The JSON should be a list of dictionaries whose keys are "Missing Entities" and "Denser Summary".

e.g.

[
  {
    "Missing Entities": "topic 1; topic 2; topic 3",
    "Denser Summary": "..."
  },
  {
    "Missing Entities": "topic 4; topic 5; topic 6",
    "Denser Summary": "..."
  },
  ...
]
"""

## Analyze GPT-3.5 result

Copy and paste the ChatGPT JSON output into the list below.

In [41]:
json_summaries = [
  {
    "Missing Entities": "Heimskringla Sagas;Scandinavian past;skaldic poetry",
    "Denser Summary": "The Heimskringla Sagas, penned by Snorri Sturluson circa 1230, offer a glimpse into King Ólafur's tale. Primary Viking sources are scarce in Scandinavia before the 11th century. Latin and Old Norse documents emerged post-Christianity. Icelandic sagas from the 12th-14th centuries capture Viking traditions. Skaldic poetry, family trees, and ethical values are noteworthy. Old Norse words persist in English, revealing Viking influence. Linguistic studies illuminate Viking interactions. Runestones, inscribed with voyages and conquests, reveal Viking activities. Elfdalian, a unique language, retained runes until the 1920s."
  },
  {
    "Missing Entities": "King Gorm;Jelling Stones;Harald Bluetooth",
    "Denser Summary": "Jelling Stones, raised between 960-985, are significant. King Gorm honored Queen Thyre with a stone. Harald Bluetooth's larger stone celebrates Denmark's conquest, conversion, and Norway's control. The inscriptions mark Gorm's memory and Harald's achievements. Runestones detail Viking voyages, including England, Greece, and Jerusalem. The Elfdalians, in Älvdalen, used runes until the 1920s. Elfdalian, unique to Älvdalen, is distinct from Old Norse. Its last known record is from 1929. Approximately 2,000-3,000 Elfdalian speakers remain today."
  },
  {
    "Missing Entities": "Viking Age inscriptions;Manx runestones;Byzantine (Eastern Roman) Emperor",
    "Denser Summary": "Runestones, essential in Viking history, reveal diverse details. Viking Age inscriptions on Manx runestones document history. The Piraeus Lion depicts Viking warriors, likely Varangians, in Byzantine service. The runic alphabet persisted into the 15th century. The Byzantium territories, Greece, Khwaresm, and Jerusalem are mentioned. England runestones highlight Viking voyages to England. The Vikings left linguistic legacies in modern English. Elfdalian, a unique language, maintained runes until the 1920s. Älvdalen, with 2,000-3,000 speakers, has its own alphabet."
  },
  {
    "Missing Entities": "Runor;Dalecarlian runes;Viking settlements of Eastern Europe",
    "Denser Summary": "Vikings, adept at runor (runes), left rich inscriptions. Dalecarlian runes, related to Elfdalian, were used in Dalarna. Viking settlements in Eastern Europe had limited linguistic impact. The Norse named rapids on the Dnieper. Runestones, spread unevenly, vary by region. Uppland, Sweden, boasts the highest concentration. Viking expeditions to England, Greece, and the East are chronicled. Elfdalian, a unique language, held onto runes until 1920s. Älvdalen's residents, bilingual in Swedish, speak Elfdalian fluently."
  },
  {
    "Missing Entities": "Old Norse connections;Old Norse;Gorm the Old",
    "Denser Summary": "Old Norse connections persist in modern languages. Old Norse, influential in Scandinavia, still echoes today. Gorm the Old, a pagan king, left Jelling Stones. Jelling Stones, erected 960-985, mark significant events. Harald Bluetooth's stone celebrates Denmark's conquest and Christianization. The Vikings influenced English with Old Norse words. Elfdalian, unique to Älvdalen, retained runes until 1920s. Runestones, essential historical markers, reveal Viking exploits. Älvdalen, with 2,000-3,000 speakers, has linguistic distinctiveness."
  }
]


In [48]:
from IPython.display import HTML
import re

def cod_summary_analysis_html(json_summaries):
    """Build HTML fragments that bold each entity and tag it with the summary that introduced it."""
    html = []
    topic_lists = []
    for i, json_summary in enumerate(json_summaries):
        summary_text = json_summary["Denser Summary"]
        new_topics = json_summary["Missing Entities"]
        new_topics_split = [topic.strip() for topic in new_topics.split(";") if topic.strip()]
        topic_lists.append(new_topics_split)
        # Highlight every entity introduced so far; [j+1] is the summary that introduced it.
        for j, topics in enumerate(topic_lists):
            for topic in topics:
                # Match the first letter in either case; escape everything else so regex
                # metacharacters in an entity (e.g. parentheses) are matched literally.
                first, rest = topic[0], topic[1:]
                topic_regex = f"[{first.upper()}{first.lower()}]" if first.isalpha() else re.escape(first)
                topic_regex += r"\s".join(re.escape(word) for word in rest.split(" "))
                summary_text = re.sub(rf"({topic_regex})", rf"<b>\1 [{j+1}]</b>", summary_text)
        html.append(f"<h3>SUMMARY #{i+1}</h3>")
        html.append(f"<div>New Topics: {new_topics}</div><br>")
        html.append(f"<div>{summary_text}</div>")
        html.append("<hr>")
    return html

In [49]:
HTML("\n".join(cod_summary_analysis_html(json_summaries)))
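The prompt's guideline "Never drop entities from the previous summary" can also be checked mechanically rather than by eye. The sketch below is not part of the original notebook: `dropped_entities` is a hypothetical helper, and it assumes a case-insensitive substring match is good enough, so an entity that was paraphrased will be reported as dropped.

```python
def dropped_entities(json_summaries):
    """Return, per summary, the previously introduced entities missing from its text."""
    seen = []    # entities introduced by earlier summaries
    report = []
    for json_summary in json_summaries:
        text = json_summary["Denser Summary"].lower()
        # Only entities from earlier summaries are checked against this summary.
        report.append([entity for entity in seen if entity.lower() not in text])
        seen.extend(
            entity.strip()
            for entity in json_summary["Missing Entities"].split(";")
            if entity.strip()
        )
    return report


# Tiny hand-made example (not model output).
example = [
    {"Missing Entities": "Jelling Stones; King Gorm",
     "Denser Summary": "King Gorm raised the Jelling Stones."},
    {"Missing Entities": "Harald Bluetooth",
     "Denser Summary": "Harald Bluetooth enlarged the Jelling Stones."},
]
print(dropped_entities(example))  # [[], ['King Gorm']]
```

Calling `dropped_entities(json_summaries)` on the list above would give one list per summary; a non-empty list suggests the model did not carry every earlier entity forward.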

# GPT-4 Prompt

Replicated exactly from the Adams et al. paper.

In [6]:
cod_gpt4_prompt_template = """Article: {{ARTICLE}}

You will generate increasingly concise, entity-dense summaries of the above Article.

Repeat the following 2 steps 5 times.

Step 1. Identify 1-3 informative Entities (";" delimited) from the Article which are missing from the previously generated summary.
Step 2. Write a new, denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities.

A Missing Entity is:
- Relevant: to the main story.
- Specific: descriptive yet concise (5 words or fewer).
- Novel: not in the previous summary.
- Faithful: present in the Article.
- Anywhere: located anywhere in the Article.

Guidelines:
- The first summary should be long (4-5 sentences, ~80 words) yet highly non-specific, containing little information beyond the entities marked as missing. Use overly verbose language and fillers (e.g., "this article discusses") to reach ~80 words.
- Make every word count: re-write the previous summary to improve flow and make space for additional entities.
- Make space with fusion, compression, and removal of uninformative phrases like "the article discusses".
- The summaries should become highly dense and concise yet self-contained, e.g., easily understood without the Article.
- Missing entities can appear anywhere in the new summary.
- Never drop entities from the previous summary. If space cannot be made, add fewer new entities.

Remember, use the exact same number of words for each summary.

Answer in JSON. The JSON should be a list (length 5) of dictionaries whose keys are "Missing Entities" and "Denser Summary".
"""

## Fetch and analyze GPT-4 result

In [32]:
from openai import OpenAI
import tiktoken
import json
import os

In [8]:
# os.environ["OPENAI_API_KEY"] = "..."
assert os.getenv("OPENAI_API_KEY") is not None

In [11]:
client = OpenAI()
for model in client.models.list().data:
    if "gpt-4" in model.id:
        print(model)

Model(id='gpt-4', created=1687882411, object='model', owned_by='openai')
Model(id='gpt-4-vision-preview', created=1698894917, object='model', owned_by='system')
Model(id='gpt-4-0314', created=1687882410, object='model', owned_by='openai')
Model(id='gpt-4-0613', created=1686588896, object='model', owned_by='openai')
Model(id='gpt-4-1106-preview', created=1698957206, object='model', owned_by='system')


In [12]:
article = """
Claude Elwood Shannon (April 30, 1916 – February 24, 2001) was an American mathematician, electrical engineer, computer scientist and cryptographer known as the "father of information theory".[1][2][3][4] He is credited alongside George Boole for laying the foundations of the Information Age.[5][6][4]

As a 21-year-old master's degree student at the Massachusetts Institute of Technology (MIT), he wrote his thesis demonstrating that electrical applications of Boolean algebra could construct any logical numerical relationship.[7] Shannon contributed to the field of cryptanalysis for national defense of the United States during World War II, including his fundamental work on codebreaking and secure telecommunications, writing a paper which is considered one of the foundational pieces of modern cryptography.[8]

His mathematical theory of information laid the foundations for the field of information theory,[9] with his famous paper being called the "Magna Carta of the Information Age" by Scientific American.[6][10] He also made contributions to artificial intelligence.[11] His achievements are said to be on par with those of Albert Einstein and Alan Turing in their fields.[2][12][13]

Biography
Childhood
The Shannon family lived in Gaylord, Michigan, and Claude was born in a hospital in nearby Petoskey.[1] His father, Claude Sr. (1862–1934), was a businessman and, for a while, a judge of probate in Gaylord. His mother, Mabel Wolf Shannon (1890–1945), was a language teacher, who also served as the principal of Gaylord High School.[14] Claude Sr. was a descendant of New Jersey settlers, while Mabel was a child of German immigrants.[1] Shannon's family was active in their Methodist Church during his youth.[15]

Most of the first 16 years of Shannon's life were spent in Gaylord, where he attended public school, graduating from Gaylord High School in 1932. Shannon showed an inclination towards mechanical and electrical things. His best subjects were science and mathematics. At home, he constructed such devices as models of planes, a radio-controlled model boat and a barbed-wire telegraph system to a friend's house a half-mile away.[16] While growing up, he also worked as a messenger for the Western Union company.

Shannon's childhood hero was Thomas Edison, whom he later learned was a distant cousin. Both Shannon and Edison were descendants of John Ogden (1609–1682), a colonial leader and an ancestor of many distinguished people.[17][18]

Logic circuits
In 1932, Shannon entered the University of Michigan, where he was introduced to the work of George Boole. He graduated in 1936 with two bachelor's degrees: one in electrical engineering and the other in mathematics.

In 1936, Shannon began his graduate studies in electrical engineering at MIT, where he worked on Vannevar Bush's differential analyzer, an early analog computer.[19] While studying the complicated ad hoc circuits of this analyzer, Shannon designed switching circuits based on Boole's concepts. In 1937, he wrote his master's degree thesis, A Symbolic Analysis of Relay and Switching Circuits.[20] A paper from this thesis was published in 1938.[21] In this work, Shannon proved that his switching circuits could be used to simplify the arrangement of the electromechanical relays that were used during that time in telephone call routing switches. Next, he expanded this concept, proving that these circuits could solve all problems that Boolean algebra could solve. In the last chapter, he presented diagrams of several circuits, including a 4-bit full adder.[20]

Using this property of electrical switches to implement logic is the fundamental concept that underlies all electronic digital computers. Shannon's work became the foundation of digital circuit design, as it became widely known in the electrical engineering community during and after World War II. The theoretical rigor of Shannon's work superseded the ad hoc methods that had prevailed previously. Howard Gardner called Shannon's thesis "possibly the most important, and also the most noted, master's thesis of the century."[22]

Shannon received his PhD in mathematics from MIT in 1940.[17] Vannevar Bush had suggested that Shannon should work on his dissertation at the Cold Spring Harbor Laboratory, in order to develop a mathematical formulation for Mendelian genetics. This research resulted in Shannon's PhD thesis, called An Algebra for Theoretical Genetics.[23]

In 1940, Shannon became a National Research Fellow at the Institute for Advanced Study in Princeton, New Jersey. In Princeton, Shannon had the opportunity to discuss his ideas with influential scientists and mathematicians such as Hermann Weyl and John von Neumann, and he also had occasional encounters with Albert Einstein and Kurt Gödel. Shannon worked freely across disciplines, and this ability may have contributed to his later development of mathematical information theory.[24]

Wartime research
Shannon then joined Bell Labs to work on fire-control systems and cryptography during World War II, under a contract with section D-2 (Control Systems section) of the National Defense Research Committee (NDRC).

Shannon is credited with the invention of signal-flow graphs, in 1942. He discovered the topological gain formula while investigating the functional operation of an analog computer.[25]

For two months early in 1943, Shannon came into contact with the leading British mathematician Alan Turing. Turing had been posted to Washington to share with the U.S. Navy's cryptanalytic service the methods used by the British Government Code and Cypher School at Bletchley Park to break the cyphers used by the Kriegsmarine U-boats in the north Atlantic Ocean.[26] He was also interested in the encipherment of speech and to this end spent time at Bell Labs. Shannon and Turing met at teatime in the cafeteria.[26] Turing showed Shannon his 1936 paper that defined what is now known as the "universal Turing machine".[27][28] This impressed Shannon, as many of its ideas complemented his own.

In 1945, as the war was coming to an end, the NDRC was issuing a summary of technical reports as a last step prior to its eventual closing down. Inside the volume on fire control, a special essay titled Data Smoothing and Prediction in Fire-Control Systems, coauthored by Shannon, Ralph Beebe Blackman, and Hendrik Wade Bode, formally treated the problem of smoothing the data in fire-control by analogy with "the problem of separating a signal from interfering noise in communications systems."[29] In other words, it modeled the problem in terms of data and signal processing and thus heralded the coming of the Information Age.

Shannon's work on cryptography was even more closely related to his later publications on communication theory.[30] At the close of the war, he prepared a classified memorandum for Bell Telephone Labs entitled "A Mathematical Theory of Cryptography", dated September 1945. A declassified version of this paper was published in 1949 as "Communication Theory of Secrecy Systems" in the Bell System Technical Journal. This paper incorporated many of the concepts and mathematical formulations that also appeared in his A Mathematical Theory of Communication. Shannon said that his wartime insights into communication theory and cryptography developed simultaneously, and that "they were so close together you couldn't separate them".[31] In a footnote near the beginning of the classified report, Shannon announced his intention to "develop these results … in a forthcoming memorandum on the transmission of information."[32]

While he was at Bell Labs, Shannon proved that the cryptographic one-time pad is unbreakable in his classified research that was later published in 1949. The same article also proved that any unbreakable system must have essentially the same characteristics as the one-time pad: the key must be truly random, as large as the plaintext, never reused in whole or part, and kept secret.[33]

Information theory
Main article: information theory
In 1948, the promised memorandum appeared as "A Mathematical Theory of Communication", an article in two parts in the July and October issues of the Bell System Technical Journal. This work focuses on the problem of how best to encode the message a sender wants to transmit. Shannon developed information entropy as a measure of the information content in a message, which is a measure of uncertainty reduced by the message. In so doing, he essentially invented the field of information theory.

The book The Mathematical Theory of Communication reprints Shannon's 1948 article and Warren Weaver's popularization of it, which is accessible to the non-specialist. Weaver pointed out that the word "information" in communication theory is not related to what you do say, but to what you could say. That is, information is a measure of one's freedom of choice when one selects a message. Shannon's concepts were also popularized, subject to his own proofreading, in John Robinson Pierce's Symbols, Signals, and Noise.

Information theory's fundamental contribution to natural language processing and computational linguistics was further established in 1951, in his article "Prediction and Entropy of Printed English", showing upper and lower bounds of entropy on the statistics of English – giving a statistical foundation to language analysis. In addition, he proved that treating space as the 27th letter of the alphabet actually lowers uncertainty in written language, providing a clear quantifiable link between cultural practice and probabilistic cognition.

Another notable paper published in 1949 is "Communication Theory of Secrecy Systems", a declassified version of his wartime work on the mathematical theory of cryptography, in which he proved that all theoretically unbreakable cyphers must have the same requirements as the one-time pad. He is also credited with the introduction of sampling theory, which is concerned with representing a continuous-time signal from a (uniform) discrete set of samples. This theory was essential in enabling telecommunications to move from analog to digital transmissions systems in the 1960s and later.

He returned to MIT to hold an endowed chair in 1956.

Teaching at MIT
In 1956 Shannon joined the MIT faculty to work in the Research Laboratory of Electronics (RLE). He continued to serve on the MIT faculty until 1978.

Later life
Shannon developed Alzheimer's disease and spent the last few years of his life in a nursing home; he died in 2001, survived by his wife, a son and daughter, and two granddaughters.[34][35]

Hobbies and inventions

The Minivac 601, a digital computer trainer designed by Shannon
Outside of Shannon's academic pursuits, he was interested in juggling, unicycling, and chess. He also invented many devices, including a Roman numeral computer called THROBAC, and juggling machines.[36][37] He built a device that could solve the Rubik's Cube puzzle.[17]

Shannon designed the Minivac 601, a digital computer trainer to teach business people about how computers functioned. It was sold by the Scientific Development Corp starting in 1961.[38]

He is also considered the co-inventor of the first wearable computer along with Edward O. Thorp.[39] The device was used to improve the odds when playing roulette.

Personal life
Shannon married Norma Levor, a wealthy, Jewish, left-wing intellectual in January 1940. The marriage ended in divorce after about a year. Levor later married Ben Barzman.[40]

Shannon met his second wife, Betty Shannon (née Mary Elizabeth Moore), when she was a numerical analyst at Bell Labs. They were married in 1949.[34] Betty assisted Claude in building some of his most famous inventions.[41] They had three children.[42]

Shannon presented himself as apolitical and an atheist.[43]

Tributes
There are six statues of Shannon sculpted by Eugene Daub: one at the University of Michigan; one at MIT in the Laboratory for Information and Decision Systems; one in Gaylord, Michigan; one at the University of California, San Diego; one at Bell Labs; and another at AT&T Shannon Labs.[44] The statue in Gaylord is located in the Claude Shannon Memorial Park.[45] After the breakup of the Bell System, the part of Bell Labs that remained with AT&T Corporation was named Shannon Labs in his honor.

According to Neil Sloane, an AT&T Fellow who co-edited Shannon's large collection of papers in 1993, the perspective introduced by Shannon's communication theory (now called information theory) is the foundation of the digital revolution, and every device containing a microprocessor or microcontroller is a conceptual descendant of Shannon's publication in 1948:[46] "He's one of the great men of the century. Without him, none of the things we know today would exist. The whole digital revolution started with him."[47] The cryptocurrency unit shannon (a synonym for gwei) is named after him.[48]

A Mind at Play, a biography of Shannon written by Jimmy Soni and Rob Goodman, was published in 2017.[49] They described Shannon as "the most important genius you’ve never heard of, a man whose intellect was on par with Albert Einstein and Isaac Newton".[50]

On April 30, 2016, Shannon was honored with a Google Doodle to celebrate his life on what would have been his 100th birthday.[51][52][53][54][55][56]

The Bit Player, a feature film about Shannon directed by Mark Levinson premiered at the World Science Festival in 2019.[57] Drawn from interviews conducted with Shannon in his house in the 1980s, the film was released on Amazon Prime in August 2020.


"""

In [13]:
prompt = cod_gpt4_prompt_template.replace("{{ARTICLE}}", article)

In [17]:
encoding = tiktoken.encoding_for_model("gpt-4")
num_input_tokens = len(encoding.encode(prompt))

In [20]:
# https://openai.com/pricing#language-models
# GPT-4 prices as of 2023-12, listed in USD per 1K tokens and converted to USD per token

input_cost_per_token = 0.03 / 1_000
output_cost_per_token = 0.06 / 1_000

In [23]:
print(f"Input will cost ${input_cost_per_token * num_input_tokens} ({num_input_tokens} tokens)")

Input will cost $0.09576 (3192 tokens)


In [24]:
completion = client.chat.completions.create(
  model="gpt-4",
  messages=[
    {"role": "user", "content": prompt},
  ]
)

In [25]:
total_cost = input_cost_per_token * completion.usage.prompt_tokens
total_cost += output_cost_per_token * completion.usage.completion_tokens
print(f"Total cost: ${total_cost}")

Total cost: $0.12722999999999998


In [33]:
cod_summary = json.loads(completion.choices[0].message.content)
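`json.loads` raises an error if the reply is not bare JSON, for example when the model wraps its answer in a Markdown code fence. A minimal sketch of a more tolerant parser follows; `parse_json_response` is a hypothetical helper, not part of the notebook, and it assumes the only wrapping to handle is a single optional fence.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker, built up to avoid a literal fence here


def parse_json_response(text):
    """Parse JSON from a model reply, tolerating an optional Markdown code fence."""
    pattern = FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE
    match = re.search(pattern, text, flags=re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)


# Hand-made reply that mimics a fenced answer (not real model output).
fenced_reply = FENCE + 'json\n[{"Missing Entities": "a; b", "Denser Summary": "..."}]\n' + FENCE
print(parse_json_response(fenced_reply))
```

With this helper, the cell above could read `cod_summary = parse_json_response(completion.choices[0].message.content)`; a reply that is still not valid JSON raises `json.JSONDecodeError` as before.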

In [46]:
print(json.dumps(cod_summary, indent=4))

[
    {
        "Missing Entities": "American mathematician; 'father of information theory'; Information Age",
        "Denser Summary": "This article talks about the life and contributions of Claude Elwood Shannon, an American mathematician and electrical engineer who is famously recognized as the 'father of information theory'. Shannon, along with George Boole, is credited for establishing the fundamental groundwork of the Information Age, transforming the realm of communication and digital technology."
    },
    {
        "Missing Entities": "MIT student; cryptanalysis during World War II; 'Magna Carta of the Information Age'",
        "Denser Summary": "Claude Shannon was a brilliant MIT student who demonstrated the potential of Boolean algebra in constructing logical numerical relationships. Contributing significantly to cryptanalysis during World War II, his groundbreaking work established the foundation of modern cryptography. Shannon's 'Magna Carta of the Information Age', a r

In [50]:
HTML("\n".join(cod_summary_analysis_html(cod_summary)))
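The GPT-4 prompt asks for the exact same number of words in every summary, which is easy to verify once the JSON is parsed. The sketch below is an addition to the notebook: `summary_word_counts` is a hypothetical helper, and it assumes that splitting on whitespace is an acceptable definition of a word.

```python
def summary_word_counts(json_summaries):
    """Return the whitespace-delimited word count of each summary."""
    return [len(summary["Denser Summary"].split()) for summary in json_summaries]


# Tiny hand-made example (not model output).
example = [
    {"Missing Entities": "x", "Denser Summary": "one two three four"},
    {"Missing Entities": "y", "Denser Summary": "one two three"},
]
counts = summary_word_counts(example)
print(counts, "spread:", max(counts) - min(counts))  # [4, 3] spread: 1
```

Running `summary_word_counts(cod_summary)` shows how closely the model followed the length instruction; a spread of 0 means the summaries are identical in length.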