# Text Summarization

**Author: Airc Miao**

This notebook applies several text summarization tools to the same meeting transcript so that their outputs can be compared.

In [45]:
import os
import sys
#import requests
import pandas as pd
import re
#import tensorflow as tf

In [46]:
meeting_text = '''
Sarah (Project Manager): Good morning, everyone! Thank you for joining today's kickoff meeting for our new project. Let's get started. James, could you give us a brief overview of the project scope and objectives?

James (Team Lead): Certainly, Sarah. The project involves developing a mobile application for our client that focuses on enhancing user engagement and streamlining their booking process. Our main goals are to deliver a user-friendly interface and ensure seamless functionality across different devices.

Sarah: Great. Emma, how are we approaching the design aspect?

Emma (Designer): I've already started working on some initial design concepts based on the client's preferences and industry trends. I'll share a presentation later this week to gather feedback before finalizing the design direction.

Sarah: Excellent, Emma. Mark, any technical considerations we should be aware of?

Mark (Developer): The application will be built using React Native to ensure cross-platform compatibility. I've outlined the technical requirements and dependencies in the project documentation, which I'll share with the team after this meeting.

Sarah: Perfect. Lisa, what's the marketing strategy for the launch?

Lisa (Marketing Representative): We're planning a phased approach, starting with teaser campaigns on social media to build anticipation. Once the app is ready for beta testing, we'll initiate a closed beta with selected users to gather feedback and make necessary adjustments before the official launch.

Sarah: Thanks, Lisa. Team, let's establish a timeline. James, can you outline the key milestones?

James: Sure, Sarah. I'll break down the development process into sprints, with a goal to have a beta version ready for testing in six weeks. We'll then allocate time for feedback and adjustments before the final launch in three months.

Sarah: Sounds good. Let's schedule regular check-ins to ensure we're on track. Any final thoughts or questions before we wrap up?

Emma: I just want to confirm the deadline for design feedback from the team.

Sarah: We'll aim to provide feedback by the end of next week. Thank you all for your contributions. Let's make this project a success. Our next meeting is scheduled for Wednesday next week. Have a productive day, everyone!
'''

## 1. Summa


In [47]:
from summa import summarizer
# ratio=0.2 keeps roughly the top 20% of sentences
summa_result = summarizer.summarize(meeting_text, ratio=0.2)
print(summa_result)

Sarah (Project Manager): Good morning, everyone!
James (Team Lead): Certainly, Sarah.
I'll share a presentation later this week to gather feedback before finalizing the design direction.
I've outlined the technical requirements and dependencies in the project documentation, which I'll share with the team after this meeting.
Sarah: Thanks, Lisa.
Sarah: We'll aim to provide feedback by the end of next week.
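
The extractive result above keeps speaker labels and short acknowledgements such as `Sarah: Thanks, Lisa.` One possible preprocessing step, sketched below, strips the leading `Name (Role):` label from each turn before summarizing. The helper name `strip_speaker_labels` and the regular expression are assumptions for illustration and are not part of summa; the pattern assumes every turn starts with a capitalized first name, an optional parenthesized role, and a colon.

```python
import re

# Hypothetical pattern: capitalized name, optional "(Role)", then a colon at line start.
SPEAKER_LABEL = re.compile(r"^[A-Z][a-z]+(?: \([^)]*\))?:\s*", flags=re.MULTILINE)

def strip_speaker_labels(transcript: str) -> str:
    """Remove leading 'Name (Role):' labels from every line of a transcript."""
    return SPEAKER_LABEL.sub("", transcript)

sample = (
    "Sarah (Project Manager): Good morning, everyone!\n"
    "James: Sure, Sarah. I'll break down the development process into sprints."
)
cleaned = strip_speaker_labels(sample)
print(cleaned)
```

The cleaned text could then be passed to the summarizer in place of `meeting_text`. Whether that improves the summary for this transcript has not been checked here.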


## 2. PyTextRank

In [48]:
#!pip install pytextrank

In [49]:
import spacy
import pytextrank

nlp = spacy.load("en_core_web_trf")
nlp.add_pipe("textrank")

<pytextrank.base.BaseTextRankFactory at 0x2982e9550>

In [50]:
phrases = 1 # Summarize based on its top N phrases
sentences = 1 # Yielding its top K sentences

doc = nlp(meeting_text)
tr = doc._.textrank

for sent in tr.summary(limit_phrases=phrases,
                       limit_sentences=sentences):
    print(sent)

I just want to confirm the deadline for design feedback from the team.


In [38]:
phrases = 5 # Summarize based on its top N phrases
sentences = 5 # Yielding its top K sentences

doc = nlp(meeting_text)
tr = doc._.textrank

for sent in tr.summary(limit_phrases=phrases,
                       limit_sentences=sentences):
    print(sent)

Once the app is ready for beta testing, we'll initiate a closed beta with selected users to gather feedback and make necessary adjustments before the official launch.
I just want to confirm the deadline for design feedback from the team.

Sarah (Project Manager): Good morning, everyone!
I'll share a presentation later this week to gather feedback before finalizing the design direction.
We'll then allocate time for feedback and adjustments before the final launch in three months.


## 3. ktrain

ktrain's `TransformerSummarizer` uses a pretrained BART model.

In [39]:
%%time

import ktrain
from ktrain import text

ts = text.TransformerSummarizer()
ts.summarize(meeting_text)

CPU times: user 28.8 s, sys: 20.3 s, total: 49 s
Wall time: 8.82 s


"The project involves developing a mobile application for our client that focuses on enhancing user engagement. The application will be built using React Native to ensure cross-platform compatibility. Once the app is ready for beta testing, we'll initiate a closed beta with selected users to gather feedback and make necessary adjustments before the official launch."

## 4. Hugging Face

In [40]:
from transformers import pipeline

### 4.1 facebook/bart-large-cnn

A BART model pre-trained on English text and fine-tuned on the CNN/Daily Mail dataset.

In [44]:
%%time

hf_fb_bart = pipeline("summarization", model="facebook/bart-large-cnn")

hf_fb_bart(
    meeting_text,
    # max_length=300, min_length=30, do_sample=False
)

CPU times: user 27.6 s, sys: 21.2 s, total: 48.8 s
Wall time: 10 s


[{'summary_text': "The project involves developing a mobile application for our client. Our main goals are to deliver a user-friendly interface and ensure seamless functionality across different devices. The application will be built using React Native to ensure cross-platform compatibility. We're planning a phased approach, starting with teaser campaigns on social media to build anticipation."}]
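
The two BART-based cells above are timed with `%%time`, but the other methods are not. To time every summarizer the same way, a small wrapper around `time.perf_counter` is one option. This is a minimal sketch: the names `time_summarizer` and `first_sentence` are hypothetical, and `first_sentence` is only a stand-in so the example runs without downloading a model.

```python
import time
from typing import Callable, Tuple

def time_summarizer(summarize: Callable[[str], str], text: str) -> Tuple[str, float]:
    """Run a summarizer callable once and return (summary, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    summary = summarize(text)
    elapsed = time.perf_counter() - start
    return summary, elapsed

# Stand-in summarizer for demonstration only: returns the first sentence.
def first_sentence(text: str) -> str:
    return text.split(". ")[0].strip() + "."

summary, seconds = time_summarizer(
    first_sentence, "The app uses React Native. Launch is in three months."
)
print(summary, f"({seconds:.4f} s)")
```

A pipeline from this notebook could be wrapped as `lambda t: hf_fb_bart(t)[0]["summary_text"]`. Note that the helper times only the call itself, while the `%%time` cell for `facebook/bart-large-cnn` also includes model loading, so the numbers would not be directly comparable.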

### 4.2 knkarthick/MEETING_SUMMARY

This model was obtained by fine-tuning `facebook/bart-large-xsum` on the AMI Meeting Corpus and the SAMSum, DialogSum, and XSum datasets.

In [42]:
hf_mt = pipeline("summarization", model="knkarthick/MEETING_SUMMARY")

hf_mt(meeting_text)

[{'summary_text': 'Sarah, James, Emma and Lisa are meeting to discuss a project. The project involves developing a mobile application for their client. The development will be completed in six weeks and ready for beta testing in three months. There will be teaser campaigns on social media to build anticipation. The deadline for design feedback is'}]
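
The summary above stops mid-sentence ("The deadline for design feedback is"), which suggests that generation hit a length limit. A simple heuristic check, sketched below, flags summaries that do not end in terminal punctuation. The helper name `looks_truncated` is hypothetical, and the heuristic is an assumption: it can misfire on summaries that legitimately end without punctuation.

```python
def looks_truncated(summary: str) -> bool:
    """Heuristic: a summary that does not end with ., ! or ? was probably cut off."""
    # Ignore trailing whitespace and closing quotes or parentheses.
    stripped = summary.rstrip().rstrip("\"')")
    return not stripped.endswith((".", "!", "?"))

print(looks_truncated("The deadline for design feedback is"))  # True
print(looks_truncated("The next meeting is scheduled for Wednesday next week."))  # False
```

If the check fires, raising the pipeline's `max_length` argument (shown commented out in section 4.1) is one thing to try.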

### 4.3 aruca/pegasus_x-meeting-summarizer-gpt3.5
This model is a fine-tuned version of `google/pegasus-x-base`. The fine-tuning dataset is not specified (the model description lists it as "None").

In [43]:
hf_pg = pipeline("summarization", model="aruca/pegasus_x-meeting-summarizer-gpt3.5")
hf_pg(meeting_text)

[{'summary_text': 'The project involves developing a mobile application for a client to enhance user engagement and streamline their booking process. The main goals are to deliver a user-friendly interface, ensure seamless functionality across devices, and implement React Native to ensure cross-platform compatibility. The marketing strategy includes teaser campaigns on social media to build anticipation, and a closed beta with selected users to gather feedback and make necessary adjustments before the official launch. The project aims to have a beta version ready for testing in six weeks, with a goal to allocate time for feedback and adjustments before the final launch in three months. The next meeting is scheduled for Wednesday next week.'}]
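
To compare the outputs side by side, a rough first measure is how much each method compresses the transcript. The sketch below computes a word count and a compression ratio for each summary. The function name `compression_stats` and the ratio definition (summary words divided by source words) are assumptions for illustration, and the metric says nothing about factual accuracy or coverage.

```python
from typing import Dict

def compression_stats(source: str, summaries: Dict[str, str]) -> Dict[str, Dict[str, float]]:
    """Return word count and summary/source word ratio for each named summary."""
    source_words = len(source.split())
    stats = {}
    for name, summary in summaries.items():
        words = len(summary.split())
        stats[name] = {
            "words": words,
            "ratio": round(words / source_words, 3) if source_words else 0.0,
        }
    return stats

# Toy input so the example runs on its own.
source = "one two three four five six seven eight nine ten"
demo = compression_stats(source, {"short": "one two", "long": "one two three four five"})
for name, row in demo.items():
    print(name, row)
```

In this notebook, `source` would be `meeting_text` and the dictionary would map method names (for example `"summa"`) to the summary strings produced above.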