Imports

In [1]:
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForSeq2SeqLM
from bs4 import BeautifulSoup
import requests


Check whether a GPU is available

In [2]:
# pipeline() takes a CUDA device index (0 = first GPU) or -1 to run on the CPU
device = 0 if torch.cuda.is_available() else -1

Fetch the webpage content

In [3]:
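url = "https://secure.toronto.ca/council/agenda-item.do?item=2024.EC14.4"
response = requests.get(url, timeout=30)  # avoid hanging indefinitely on a slow server
response.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
soup = BeautifulSoup(response.content, 'lxml')  # the 'lxml' parser requires the lxml package
text = soup.get_text()  # visible text only; layout whitespace is preserved
print(text)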

Agenda Item History - 2024.EC14.4

Item - 2024.EC14.4

Share

Share to Facebook
Share to Twitter
Share to LinkedIn

Print

Close

E-mail Item

Submit Comments

Tracking Status

This item was considered by the Economic and Community Development Committee on July 4, 2024 and adopted without amendment. It will be considered by City Council on July 24, 2024.

Expand All 2024.EC14.4

Collapse All 2024.EC14.4

Economic and Community Development Committee consideration on July 4, 2024

EC14.4 - Application to the Imagination, Manufacturing, Innovation and Technology Property Tax Incentive Program

Decision Type: ACTION

Status: Adopted

Ward: 14 - Toronto - Danforth

Committee Recommendations
The Economic and Community Development Committee recommends that:
 
1. City Council approve Imagination, Manufacturing, Innovation
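
Text returned by get_text() on a page like this typically carries layout whitespace (blank lines, tabs, indented labels) that consumes chunk space without adding content. A minimal sketch of a cleanup step that could run before chunking, assuming only whitespace needs normalizing. The helper name `clean_page_text` and the sample string are hypothetical:

```python
import re


def clean_page_text(raw: str) -> str:
    """Collapse whitespace in text extracted from HTML (hypothetical helper).

    Each line has runs of whitespace reduced to one space and is stripped;
    lines that end up empty are dropped.
    """
    lines = (re.sub(r"\s+", " ", line).strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


# Illustrative input shaped like the scraped page, not the real response
sample = "\n\n\nAgenda Item History - 2024.EC14.4\n\n\n\t\t\tPrint \n\n Close\n"
print(clean_page_text(sample))
```

In the notebook this could be applied as `text = clean_page_text(text)` before splitting. It does not remove navigation labels such as "Share to Facebook"; that would need selecting specific page elements, which is not attempted here.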

Function to split text into chunks

In [4]:
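def split_text(text, max_length):
    """Split text into consecutive chunks of at most max_length characters."""
    # Slices by character count, so a chunk boundary can fall in the middle of a word
    return [text[i:i+max_length] for i in range(0, len(text), max_length)]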
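
Because split_text slices at fixed character offsets, a word or sentence can be cut in two at a chunk boundary. One possible alternative, sketched here under the assumption that whitespace can be normalized to single spaces, packs whole words into each chunk. `split_text_on_words` is a hypothetical name:

```python
def split_text_on_words(text: str, max_length: int) -> list[str]:
    """Greedily pack whole words into chunks of at most max_length characters.

    Assumption: original whitespace (including newlines) need not be kept.
    A single word longer than max_length is emitted as its own oversized chunk.
    """
    chunks, current, current_len = [], [], 0
    for word in text.split():
        # +1 accounts for the joining space when the chunk already has words
        extra = len(word) + (1 if current else 0)
        if current and current_len + extra > max_length:
            chunks.append(" ".join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += extra
    if current:
        chunks.append(" ".join(current))
    return chunks


print(split_text_on_words("aa bb cc dd", 5))  # ['aa bb', 'cc dd']
```

The limit is still measured in characters rather than model tokens, so it remains only a rough proxy for the model's input limit.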

Use a meeting summarization LLM

In [5]:
model_name = "knkarthick/MEETING_SUMMARY"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)



Create a summarization pipeline

In [6]:
summarizer = pipeline("summarization", model=model, tokenizer=tokenizer, device=device)

Split text into chunks

In [7]:
chunks = split_text(text, 4000)

Summarize each chunk

In [8]:
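summaries = []
for chunk in chunks:
    # max_length and min_length are counted in tokens; do_sample=False keeps decoding deterministic
    summary = summarizer(chunk, max_length=200, min_length=150, do_sample=False)
    summary = summary[0]['summary_text']  # the pipeline returns a list with one dict per input
    print(f"{summary}\n")
    summaries.append(summary)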

Your max_length is set to 200, but your input_length is only 176. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=88)


On July 4, 2024, the Economic and Community Development Committee adopted an agenda item concerning the Imagination, Manufacturing, Innovation and Technology Property Tax Incentive Program. It will be considered by Toronto City Council on July 24, 2024. In the fourth quarter of 2024, a new City-wide Community Improvement Plan will be prepared. The General Manager, Economic Development and Culture, will negotiate and execute a Financial Incentivist Agreement for the application in the amount of $21.6 million. The application for a film studio complex known as Basin Media Studios in the Port Lands, an area in Ward 14 (Toronto – Danforth) was submitted by Hackman Capital Partners and CreateTO on May 24, 2023. It has an estimated construction value of $150 million.

Councillor Shelley Carroll moved and the motion to adopt an item was carried. Toronto City Clerk is the official keeper of public records in the City of Toronto between 1998 and 2024. Toronto Council is the elected body that ho
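
The warning above appears because the last chunk is only 176 tokens long while min_length=150 and max_length=200 are fixed, which pushes the model to pad a short input into a long summary. A minimal sketch of scaling the bounds to the input, assuming half the input length as the cap (the same ratio the warning suggests with max_length=88). `summary_bounds` and its defaults are hypothetical:

```python
def summary_bounds(n_input_tokens: int, max_cap: int = 200, min_floor: int = 150):
    """Return (min_length, max_length) scaled to the input size.

    max_length is half the input length, capped at max_cap and never below 8.
    min_length is three quarters of max_length, capped at min_floor.
    """
    max_length = max(8, min(max_cap, n_input_tokens // 2))
    min_length = min(min_floor, max_length * 3 // 4)
    return min_length, max_length


print(summary_bounds(176))   # (66, 88)
print(summary_bounds(1000))  # (150, 200)
```

Inside the loop, the token count could be obtained with `len(tokenizer(chunk)["input_ids"])` and the two returned values passed as min_length and max_length. Long chunks then keep the original 150/200 settings, while short ones get proportionally shorter targets.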

Combine the summaries into a single text

In [9]:
combined_summary = " ".join(summaries)

Summarize the combined summary to get a final concise summary

In [10]:
final_summary = summarizer(combined_summary, max_length=200, min_length=150, do_sample=False)
final_summary = final_summary[0]['summary_text']
print(final_summary)

On July 4, 2024, the Economic and Community Development Committee adopted an agenda item concerning the Imagination, Manufacturing, Innovation and Technology Property Tax Incentive Program. It will be considered by Toronto City Council on July 24, 2024. In the fourth quarter of 2024, a new City-wide Community Improvement Plan will be prepared. The General Manager, Economic Development and Culture, will negotiate and execute a Financial Incentivist Agreement for the application in the amount of $21.6 million. The application for a film studio complex known as Basin Media Studios in the Port Lands, an area in Ward 14 (Toronto – Danforth) was submitted by Hackman Capital Partners and CreateTO on May 24, 2023. It has an estimated construction value of $150 million.
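
The chunk, summarize, combine, summarize-again steps above can be wrapped in one function. The sketch below takes the summarizer as a plain callable so the flow can be tried without loading a model. `summarize_in_stages` and `first_sentence` are hypothetical names, and `first_sentence` is only a stand-in, not a real summarizer:

```python
from typing import Callable, List


def summarize_in_stages(chunks: List[str], summarize: Callable[[str], str]) -> str:
    """Summarize each chunk, then summarize the joined partial summaries."""
    partial = [summarize(chunk) for chunk in chunks if chunk.strip()]
    if not partial:
        return ""
    if len(partial) == 1:
        return partial[0]  # nothing to combine, so skip the second pass
    return summarize(" ".join(partial))


def first_sentence(text: str) -> str:
    """Stand-in summarizer for demonstration only: keep the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."


print(summarize_in_stages(["A b. C d.", "E f. G h."], first_sentence))  # A b.
```

With the pipeline from this notebook, the callable could be `lambda t: summarizer(t, max_length=200, min_length=150, do_sample=False)[0]['summary_text']`. Note that the second pass can discard most of the later chunks' content, as the final summary above shows by matching the first chunk's summary.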
