# Abstractive Summarization

This notebook demonstrates abstractive summarization with chain-of-density prompting: draft a summary, identify what it is missing, then rewrite it to include the important omissions.

In [2]:
from dotenv import load_dotenv
import logging
import pandas as pd

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

load_dotenv()

True

In [3]:
from biagen.llm import CohereProvider, GroqProvider

llm = GroqProvider.from_env()

In [4]:
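# Load the conservation area data (gzipped TSV; pandas infers the compression from the file extension)

area_df = pd.read_csv('mo_conservation.tsv.gz', sep='\t')

# Select a single record to summarize
data = area_df.loc[4]

# The full area description is the article passed to every prompt below
article = data['area_info']

data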

id                                                           4
area_id                           jaycee-park-lake-cole-county
area_name                       Jaycee Park Lake (Cole County)
area_info     Jaycee Park Lake (Cole County) Jaycee Park La...
Name: 4, dtype: object

In [5]:
summaries = []

In [6]:
prompt_a = f"""Analyze the purpose and main topics of this article. Provide a detailed and accurate analysis spanning all covered topics.

{article}

## Analysis (the most significant topics, with one sentence commentary)
"""

analysis = llm.generate_one(prompt_a, max_tokens=1024, temperature=0.7, stop_sequences=None)

print(analysis)
len(analysis.split())

INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:root:<function GroqProvider.generate at 0x10d800fe0>: Exec: 0.7788562774658203s


1. Jaycee Park Lake (Cole County) location and contact information: The article provides the address, contact details, and a map for Jaycee Park Lake in Cole County.
2. Area regulations: The area is not owned by the Missouri Department of Conservation, and the regulations are based on a cooperative agreement between the owner and the department.
3. Prohibited activities: Boats are prohibited on the lake, and fishing may be prohibited or subject to special regulations.
4. Fishing rules and allowed species: If fishing is allowed, the specific rules and allowed species can be found in Chapter 12 of the Wildlife Code. The article mentions sunfish, catfish, and black bass as potential species.
5. Things to do when visiting: Fishing is one of the main activities, but the availability and regulations may vary.


132

In [7]:
prompt_s = f"""Given the article and the following analysis, provide a detailed, erudite, succinct, and accurate summary.

## Article

```article
{article}
```

## Analysis

{analysis}

## Summary (5 paragraphs)

"""

summary = llm.generate_one(prompt_s, max_tokens=1024, temperature=0.5, stop_sequences=None)

summaries.append(summary)

print(summary)
len(summary.split())

INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:root:<function GroqProvider.generate at 0x10d800fe0>: Exec: 1.0303590297698975s


Jaycee Park Lake, located in Cole County, offers a beautiful natural setting for visitors looking to enjoy the outdoors. The lake is easily accessible, with directions provided in the article, and interested individuals can contact the Cole County Public Works Director for more information or to report any issues. However, it is important to note that the lake is not owned by the Missouri Department of Conservation, and the area regulations are based on a cooperative agreement between the owner and the department.

The regulations for Jaycee Park Lake are outlined in the article and provide specific information on allowed and prohibited activities. One notable prohibition is the use of boats on the lake, which may limit some visitors' plans. Fishing regulations are more complex, as fishing may be prohibited or subject to special regulations depending on the specific area managed under the cooperative agreement. The article directs individuals to Chapter 12 of the Wildlife Code to deter

341

### Improvement Loop

In [8]:
# Get our last summary

summary = summaries[-1]

In [9]:
# Check for missing information in the summary

prompt_m = f"""Given the article and the current summary, identify missing information or ways to improve the summary.

## Article

```article
{article}
```

## Current Summary

{summary}

## Missing Information (provide 12 entries of novel information not contained in the summary)
"""

missing = llm.generate_one(prompt_m, max_tokens=1024, temperature=0.7, stop_sequences=None)

print(missing)
len(missing.split())

INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:root:<function GroqProvider.generate at 0x10d800fe0>: Exec: 0.8801159858703613s


1. Jaycee Park Lake's total acreage is 7.3 acres.
2. The lake is open from 4:00 AM to 10:00 PM every day.
3. To reach the lake, take South Country Club Drive/Fairgrounds Road south for approximately 1.50 miles, then turn onto County Park Road.
4. The lake is managed under a cooperative agreement with the Missouri Department of Conservation, and the regulations for this area can be found in Chapter 12 of the Wildlife Code.
5. There are no specific fishing regulations mentioned in the article, but fishing may be allowed with special provisions.
6. The contact number for the Cole County Public Works Director is 573-636-3614.
7. There is an area map available for download, which can help visitors navigate the lake and its surrounding areas.
8. The lake is located in Jefferson City, which offers other attractions and activities that visitors can enjoy during their stay.
9. The Wildlife Code provides information on the types of fish allowed for fishing and any size or quantity restrictions.


233
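The prompt in the cell above requests 12 entries, yet the response lists 9. Below is a minimal sketch of how the numbered entries could be extracted and counted, so that a shortfall is visible before the text is passed to the next prompt. `parse_numbered_entries` is a hypothetical helper introduced here for illustration and is not part of `biagen`; the sample text is copied from the output above.

```python
import re


def parse_numbered_entries(text: str) -> list[str]:
    """Return the body of each line that starts with a number followed by '.' or ')'."""
    entries = []
    for line in text.splitlines():
        match = re.match(r"^\s*\d+[.)]\s+(.*\S)\s*$", line)
        if match:
            entries.append(match.group(1))
    return entries


# Two entries copied from the model output above, used as sample input
sample = """1. Jaycee Park Lake's total acreage is 7.3 acres.
2. The lake is open from 4:00 AM to 10:00 PM every day.
"""

entries = parse_numbered_entries(sample)
print(len(entries))
```

In the notebook this could be called as `parse_numbered_entries(missing)` and the resulting count compared with the number requested in `prompt_m`.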

In [10]:
# Named prompt_i (importance) so it is not confused with the resummarization prompt in the next cell
prompt_i = f"""Using the current summary and the identified missing information, identify what information is important and unimportant.

## Article

```article
{article}
```

## Current Summary

{summary}

## Missing Information

{missing}

## Most Important Information
"""

important = llm.generate_one(prompt_i, max_tokens=2048, temperature=0.5, stop_sequences=None)

print(important)
len(important.split())

INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:root:<function GroqProvider.generate at 0x10d800fe0>: Exec: 0.834165096282959s


The most important information for the article about Jaycee Park Lake includes:

1. Jaycee Park Lake's total acreage is 7.3 acres.
2. The lake's location is in Cole County, and its specific address is not provided, but it is accessible via South Country Club Drive/Fairgrounds Road and County Park Road.
3. The lake is open from 4:00 AM to 10:00 PM every day.
4. The lake is managed under a cooperative agreement with the Missouri Department of Conservation, and the regulations for this area can be found in Chapter 12 of the Wildlife Code.
5. Boats are prohibited on the lake.
6. Fishing regulations may vary for the lake, so visitors should consult the Wildlife Code for more information.
7. The contact number for the Cole County Public Works Director is 573-636-3614.
8. An area map is available for download.
9. Visitors should follow all the regulations and posted signs in the area to maintain a safe and enjoyable environment for all.

This information provides context for the lake's locati

201

In [11]:
prompt_r = f"""Using the current summary and the identified missing information, create an improved summary of the article.

## Article

```article
{article}
```

## Current Summary

{summary}

## Missing Information

{missing}

## Ideas To Consider

{important}

## Improved Summary (reorganized, expanded, and including new information; 6 paragraphs)
"""

resummarized = llm.generate_one(prompt_r, max_tokens=2048, temperature=0.5, stop_sequences=None)

summaries.append(resummarized)

print(resummarized)
len(resummarized.split())

INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:root:<function GroqProvider.generate at 0x10d800fe0>: Exec: 1.2683651447296143s


Jaycee Park Lake, located in Cole County, is a beautiful natural area that spans 7.3 acres and offers a variety of recreational activities, including fishing. The lake is easily accessible, with directions provided in the article, and interested individuals can contact the Cole County Public Works Director for more information or to report any issues. However, it is important to note that the lake is not owned by the Missouri Department of Conservation, and the area regulations are based on a cooperative agreement between the owner and the department.

To reach Jaycee Park Lake, visitors can take South Country Club Drive/Fairgrounds Road south for approximately 1.50 miles, then turn onto County Park Road. The lake is open from 4:00 AM to 10:00 PM every day, providing ample opportunity for visitors to enjoy the outdoors. An area map is available for download, which can help visitors navigate the lake and its surrounding areas.

Visitors to Jaycee Park Lake should familiarize themselves 

423
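The cells in this section run one pass of the improvement loop by hand. The sketch below shows one way the same three calls (missing information, important information, improved summary) might be wrapped in a function and repeated. The names `improvement_round` and `improve` and the `generate` callable are assumptions made for illustration, not part of `biagen`; the prompt text is copied from the cells above. The demo uses a stub in place of the LLM so that it runs without an API key.

```python
from typing import Callable

FENCE = "`" * 3  # built this way so the literal fence does not appear inside this block


def improvement_round(generate: Callable[[str], str], article: str, summary: str) -> str:
    """Run one missing -> important -> resummarize pass and return the new summary."""
    context = (
        f"## Article\n\n{FENCE}article\n{article}\n{FENCE}\n\n"
        f"## Current Summary\n\n{summary}\n\n"
    )
    missing = generate(
        "Given the article and the current summary, identify missing information "
        "or ways to improve the summary.\n\n" + context
        + "## Missing Information (provide 12 entries of novel information "
        "not contained in the summary)\n"
    )
    context += f"## Missing Information\n\n{missing}\n\n"
    important = generate(
        "Using the current summary and the identified missing information, identify "
        "what information is important and unimportant.\n\n" + context
        + "## Most Important Information\n"
    )
    context += f"## Ideas To Consider\n\n{important}\n\n"
    return generate(
        "Using the current summary and the identified missing information, create "
        "an improved summary of the article.\n\n" + context
        + "## Improved Summary (reorganized, expanded, and including new information; "
        "6 paragraphs)\n"
    )


def improve(generate: Callable[[str], str], article: str, summary: str, rounds: int = 2) -> list[str]:
    """Return the starting summary followed by one improved summary per round."""
    history = [summary]
    for _ in range(rounds):
        history.append(improvement_round(generate, article, history[-1]))
    return history


# Stub generator standing in for the LLM: it only reports the prompt length
def stub_generate(prompt: str) -> str:
    return f"(stub reply to a prompt of {len(prompt)} characters)"


history = improve(stub_generate, "Example article text.", "Example first summary.", rounds=2)
print(len(history))
```

With the provider from this notebook, `generate` could be a small wrapper such as `lambda p: llm.generate_one(p, max_tokens=2048, temperature=0.5, stop_sequences=None)`, called as `improve(generate, article, summaries[-1])`. Each round makes three LLM calls, so the number of rounds directly multiplies cost and latency.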