# Summary of Lecture Video




### Reference

[GenAI 2024 course at NTU](https://speech.ee.ntu.edu.tw/~hylee/genai/2024-spring.php)

# Part1 - Preparation

## The lecture video provided for this assignment

(1) For ease of processing, it has already been converted to an MP3 file.

(2) If you would like to view the original video, the link is here:

- 李琳山教授 信號與人生 (2023) (Prof. Lin-shan Lee, "Signals and Life", 2023)

  - https://www.youtube.com/watch?v=MxoQV4M0jY8


(3) Since the original lecture video is quite long, this assignment uses only the segment from 1:43:24 to 2:00:49.
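
As a quick arithmetic check on the clip length, the sketch below computes the duration of the 1:43:24 to 2:00:49 segment with the standard library. The helper name `parse_hms` is hypothetical and used only for illustration.

```python
from datetime import timedelta


def parse_hms(stamp: str) -> timedelta:
    """Parse an 'H:MM:SS' timestamp into a timedelta (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


segment_start = parse_hms("1:43:24")
segment_end = parse_hms("2:00:49")
segment_duration = segment_end - segment_start

print(segment_duration)                  # 0:17:25
print(segment_duration.total_seconds())  # 1045.0
```

The audio used in this assignment is therefore about 17 and a half minutes long.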

## Install all necessary packages and import them

The following code block takes about **150** seconds to run, but it may vary slightly depending on the condition of Colab.

In [1]:
# Install packages.
!pip install srt==3.5.3
!pip install datasets==2.20.0
!pip install DateTime==5.5
!pip install OpenCC==1.1.7
!pip install opencv-contrib-python==4.8.0.76
!pip install opencv-python==4.8.0.76
!pip install opencv-python-headless==4.10.0.84
!pip install openpyxl==3.1.4
!pip install openai==1.35.3
!pip install git+https://github.com/openai/whisper.git@ba3f3cd54b0e5b8ce1ab3de13e32122d0d5f98ab
!pip install numpy==1.25.2
!pip install soundfile==0.12.1
!pip install -q -U google-generativeai==0.7.0
!pip install anthropic==0.29.0

Collecting srt==3.5.3
  Downloading srt-3.5.3.tar.gz (28 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: srt
  Building wheel for srt (setup.py) ... done
  Created wheel for srt: filename=srt-3.5.3-py3-none-any.whl size=22428 sha256=e6115ca348c8098edb531f253b2651ad8600498519da4e3e16ad032e977936f0
  Stored in directory: /root/.cache/pip/wheels/d7/31/a1/18e1e7e8bfdafd19e6803d7eb919b563dd11de380e4304e332
Successfully built srt
Installing collected packages: srt
Successfully installed srt-3.5.3
Collecting datasets==2.20.0
  Downloading datasets-2.20.0-py3-none-any.whl (547 kB)
Collecting pyarrow>=15.0.0 (from datasets==2.20.0)
  Downloading pyarrow-16.1.0-cp310-cp310-manylinux_2_28_x86_64.whl (40.8 MB)

The following code block takes about **5** seconds to run, but it may vary slightly depending on the condition of Colab.

In [2]:
# Import packages.
import whisper
import srt
import datetime
import time
import os
import re
import pathlib
import textwrap
import numpy as np
import soundfile as sf
from opencc import OpenCC
from tqdm import tqdm
from datasets import load_dataset
from openai import OpenAI
import google.generativeai as genai
import anthropic

## Download data

The code block below takes about **10** seconds to run, although there might be some slight variation depending on the state of Colab.

In [3]:
# Load dataset.
dataset_name = "kuanhuggingface/NTU-GenAI-2024-HW9"
dataset = load_dataset(dataset_name)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



The code block below takes about **15** seconds to run, although there might be some slight variation depending on the state of Colab.

In [4]:
# Prepare audio.
input_audio = dataset["test"]["audio"][0]
input_audio_name = input_audio["path"]
input_audio_array = input_audio["array"].astype(np.float32)
sampling_rate = input_audio["sampling_rate"]

print(f"Now, we are going to transcribe the audio: 李琳山教授 信號與人生 (2023) ({input_audio_name}).")

Now, we are going to transcribe the audio: 李琳山教授 信號與人生 (2023) (ntu-gen-ai-2024-hw9-16k.mp3).
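
Before transcribing, it can help to confirm that the loaded clip has the expected length. The sketch below is a minimal, standard-library illustration: `audio_duration_seconds` is a hypothetical helper, and the synthetic sample list merely stands in for the real waveform. In the notebook you would pass `input_audio_array` and `sampling_rate` instead.

```python
def audio_duration_seconds(samples, sampling_rate: int) -> float:
    """Return the duration in seconds of a mono waveform (hypothetical helper)."""
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be positive")
    return len(samples) / sampling_rate


# Stand-in for the real waveform: 2 seconds of silence at an assumed 16 kHz rate.
fake_samples = [0.0] * 32000
print(audio_duration_seconds(fake_samples, 16000))  # 2.0
```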


# Part2 - Automatic Speech Recognition (ASR)
The function `speech_recognition` transcribes audio with Whisper and saves the result as a subtitle (SRT) file.

In [5]:
def speech_recognition(model_name, input_audio, output_subtitle_path, decode_options, cache_dir="./"):
    '''
        (1) Objective:
            - This function aims to convert audio to subtitle.

        (2) Arguments:

            - model_name (str):
                The name of the model. There are five model sizes, including tiny, base, small, medium, large-v3.
                For example, you can use 'tiny', 'base', 'small', 'medium', 'large-v3' to specify the model name.
                You can see 'https://github.com/openai/whisper' for more details.

            - input_audio (Union[str, np.ndarray, torch.Tensor]):
                The path to the audio file to open, or the audio waveform
                - For example, if your input audio path is 'input.wav', you can use 'input.wav' to specify the input audio path.
                - For example, if your input audio array is 'audio_array', you can use 'audio_array' to specify the input audio array.

            - output_subtitle_path (str):
                The path of the output subtitle file.
                For example, if you want to save the subtitle file as 'output.srt', you can use 'output.srt' to specify the output subtitle path.

            - decode_options (dict):
                The options for decoding the audio file. This function reads the keys 'language', 'initial_prompt' and 'temperature',
                so all three keys must be present in the dictionary.
                - language (str):
                    The language spoken in the audio, e.g. 'zh'.

                - initial_prompt (str):
                    Optional text to provide as a prompt for the first window. This can be used to provide, or
                    "prompt-engineer" a context for transcription, e.g. custom vocabularies or proper nouns
                    to make it more likely to predict those words correctly.
                    Whisper default: None.

                - temperature (float):
                    The temperature for sampling from the model. Higher values mean more randomness.
                    Use 0.0 for the most deterministic output.

                You can see "https://github.com/openai/whisper/blob/main/whisper/decoding.py" and "https://github.com/openai/whisper/blob/main/whisper/transcribe.py"
                for more details.

            - cache_dir (str):
                The path of the cache directory for saving the model.
                For example, if you want to save the cache files in 'cache' directory, you can use 'cache' to specify the cache directory.
                Default: './'

        (3) Example:

            - If you want to use the 'base' model to convert 'input.wav' to 'output.srt' and save the cache files in 'cache' directory,
            you can call this function as follows:

                speech_recognition(model_name='base', input_audio='input.wav', output_subtitle_path='output.srt',
                                   decode_options={'language': 'zh', 'initial_prompt': None, 'temperature': 0.0}, cache_dir='cache')
    '''

    # Record the start time.
    start_time = time.time()

    print(f"=============== Loading Whisper-{model_name} ===============")

    # Load the model.
    model = whisper.load_model(name=model_name, download_root=cache_dir)

    print(f"Begin to utilize Whisper-{model_name} to transcribe the audio.")

    # Transcribe the audio.
    transcription = model.transcribe(audio=input_audio, language=decode_options["language"], verbose=False,
                                     initial_prompt=decode_options["initial_prompt"], temperature=decode_options["temperature"])

    # Record the end time.
    end_time = time.time()

    print(f"The process of speech recognition costs {end_time - start_time} seconds.")

    subtitles = []
    # Convert the transcription to subtitle and iterate over the segments.
    for i, segment in tqdm(enumerate(transcription["segments"])):

        # Convert the start time to subtitle format.
        start_time = datetime.timedelta(seconds=segment["start"])

        # Convert the end time to subtitle format.
        end_time = datetime.timedelta(seconds=segment["end"])

        # Get the subtitle text.
        text = segment["text"]

        # Append the subtitle to the subtitle list.
        subtitles.append(srt.Subtitle(index=i, start=start_time, end=end_time, content=text))

    # Convert the subtitle list to subtitle content.
    srt_content = srt.compose(subtitles)

    print(f"\n=============== Saving the subtitle to {output_subtitle_path} ===============")

    # Save the subtitle content to the subtitle file.
    with open(output_subtitle_path, "w", encoding="utf-8") as file:
        file.write(srt_content)
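
To make the subtitle format concrete, here is a minimal standard-library sketch of how a time offset in seconds maps onto the `HH:MM:SS,mmm` timestamps used in SRT files. The notebook itself delegates this work to `srt.compose`; `format_srt_timestamp` and `format_srt_block` are hypothetical helpers written only for illustration.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def format_srt_block(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT entry: index line, time range line, then the text."""
    time_range = f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}"
    return f"{index}\n{time_range}\n{text.strip()}\n"


print(format_srt_block(1, 0.0, 4.0, "每次說這個學問是做出來的"))
```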

In the following block, you can set the Whisper parameters and the paths of the output files.

In [6]:
# @title Parameter Setting of Whisper { run: "auto" }

''' In this block, you can modify your desired parameters and the path of input file. '''

# The name of the model you want to use.
# For example, you can use 'tiny', 'base', 'small', 'medium', 'large-v3' to specify the model name.
# @markdown **model_name**: The name of the model you want to use.
model_name = "medium" # @param ["tiny", "base", "small", "medium", "large-v3"]

# Define the suffix of the output file.
# @markdown **suffix**: The output file name is "output-{suffix}.* ", where .* is the file extension (.txt or .srt)
suffix = "信號與人生" # @param {type: "string"}

# Path to the output file.
output_subtitle_path = f"./output-{suffix}.srt"

# Path of the output raw text file from the SRT file.
output_raw_text_path = f"./output-{suffix}.txt"

# Path to the directory where the model and dataset will be cached.
cache_dir = "./"

# The language of the lecture video.
# @markdown **language**: The language of the lecture video.
language = "zh" # @param {type:"string"}

# Optional text to provide as a prompt for the first window.
# The value below, "請用中文", means "Please use Chinese".
# @markdown **initial_prompt**: Optional text to provide as a prompt for the first window.
initial_prompt = "請用中文" #@param {type:"string"}

# The temperature for sampling from the model. Higher values mean more randomness.
# @markdown  **temperature**: The temperature for sampling from the model. Higher values mean more randomness.
temperature = 0 # @param {type:"slider", min:0, max:1, step:0.1}

In [7]:
# Construct DecodingOptions
decode_options = {
    "language": language,
    "initial_prompt": initial_prompt,
    "temperature": temperature
}
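
Because `speech_recognition` indexes `decode_options` directly, a missing key raises `KeyError`. One way to guard against that is to merge user settings over a dictionary of defaults, sketched here under the assumption that `None` for `language` and `initial_prompt` and `0.0` for `temperature` are acceptable fallbacks. The helper `build_decode_options` is hypothetical and not part of the notebook.

```python
# Assumed fallbacks; adjust them to your own needs.
DEFAULT_DECODE_OPTIONS = {
    "language": None,        # assumption: leave the language unspecified
    "initial_prompt": None,  # no prompt for the first window
    "temperature": 0.0,      # most deterministic decoding
}


def build_decode_options(**overrides):
    """Return a complete options dict, rejecting keys the function does not read."""
    unknown = set(overrides) - set(DEFAULT_DECODE_OPTIONS)
    if unknown:
        raise KeyError(f"Unsupported decode options: {sorted(unknown)}")
    return {**DEFAULT_DECODE_OPTIONS, **overrides}


options = build_decode_options(language="zh", initial_prompt="請用中文")
print(options)
```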

In [8]:
# Print message.
message = "Transcribe 李琳山教授 信號與人生 (2023)"
print(f"Setting: (1) Model: whisper-{model_name} (2) Language: {language} (3) Initial Prompt: {initial_prompt} (4) Temperature: {temperature}")
print(message)

Setting: (1) Model: whisper-medium (2) Language: zh (3) Initial Prompt: 請用繁體中文 (4) Temperature: 0
Transcribe 李琳山教授 信號與人生 (2023)


The code block below takes about **90 (240)** seconds to run when using the **base (medium)** model and **a T4 GPU**, although there might be some slight variation depending on the state of Colab.

In [9]:
# Running ASR.
speech_recognition(model_name=model_name, input_audio=input_audio_array, output_subtitle_path=output_subtitle_path, decode_options=decode_options, cache_dir=cache_dir)



100%|██████████████████████████████████████| 1.42G/1.42G [00:13<00:00, 114MiB/s]


Begin to utilize Whisper-medium to transcribe the audio.


100%|██████████| 104500/104500 [02:51<00:00, 610.78frames/s]


The process of speech recognition costs 203.53660106658936 seconds.


370it [00:00, 199369.54it/s]







You can check the result of automatic speech recognition.

In [10]:
''' Open the SRT file and read its content.
The format of SRT is:

[Index]
[Begin time] (hour:minute:second,millisecond) --> [End time] (hour:minute:second,millisecond)
[Transcription]

'''

with open(output_subtitle_path, 'r', encoding='utf-8') as file:
    content = file.read()

print(content)

1
00:00:00,000 --> 00:00:04,000
每次說這個學問是做出來的

2
00:00:06,000 --> 00:00:08,000
什麼意思?

3
00:00:08,000 --> 00:00:12,000
要做才會獲得學問

4
00:00:13,000 --> 00:00:16,000
你如果每天光是坐在那裡聽

5
00:00:17,000 --> 00:00:20,000
學問很可能是左耳進右耳出的

6
00:00:21,000 --> 00:00:23,000
你光是坐在那兒讀

7
00:00:23,000 --> 00:00:26,000
學問可能從眼睛進入腦海之後就忘掉了

8
00:00:26,000 --> 00:00:29,000
如何能夠學問在腦海裡面

9
00:00:31,000 --> 00:00:33,000
真的變成你自己學問

10
00:00:33,000 --> 00:00:35,000
就是要做

11
00:00:36,000 --> 00:00:39,000
可能有很多同學有這個經驗

12
00:00:39,000 --> 00:00:41,000
你如果去修某一門課

13
00:00:41,000 --> 00:00:44,000
或者做某一個實驗

14
00:00:44,000 --> 00:00:47,000
在期末就是要教一個final project

15
00:00:48,000 --> 00:00:50,000
那個final project就是要你把

16
00:00:51,000 --> 00:00:53,000
學到的很多東西

17
00:00:53,000 --> 00:00:56,000
最後整合在你的final project裡面

18
00:00:56,000 --> 00:00:58,000
最後做出來的時候

19
00:00:58,000 --> 00:01:00,000
就是把它們都整合了

20
00:01:00,000 --> 00:01:02,000
當你學期結束

21
00:01:02,000 --> 00:01:04,000
真的把final project做完的時候

22
00:01:04,000 --> 00:01:05,00

# Part3 - Preprocess the results of automatic speech recognition

In [11]:
def extract_and_save_text(srt_filename, output_filename):

    '''
    (1) Objective:
        - This function extracts the text from an SRT file and saves it to a new text file.
        - It also converts the Simplified Chinese to Traditional Chinese.

    (2) Arguments:

        - srt_filename: The path to the SRT file.

        - output_filename: The name of the output text file.

    (3) Example:
        - If your SRT file is named 'subtitle.srt' and you want to save the extracted text to a file named 'output.txt', you can use the function like this:
            extract_and_save_text('subtitle.srt', 'output.txt')

    '''

    # Open the SRT file and read its content.
    with open(srt_filename, 'r', encoding='utf-8') as file:
        content = file.read()

    # Use regular expression to remove the timecode.
    pure_text = re.sub(r'\d+\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\n', '', content)

    # Remove the empty lines.
    pure_text = re.sub(r'\n\n+', '\n', pure_text)

    # Creating an instance of OpenCC for Simplified to Traditional Chinese conversion.
    cc = OpenCC('s2t')
    pure_text_conversion = cc.convert(pure_text)

    # Write the extracted text to a new file.
    with open(output_filename, 'w', encoding='utf-8') as output_file:
        output_file.write(pure_text_conversion)

    print(f'Extracted text has been saved to {output_filename}.\n\n')

    return pure_text_conversion
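
To see what the two regular expressions in `extract_and_save_text` do, the sketch below applies them to a small in-memory SRT string instead of a file. The sample subtitles are made up for illustration, and the OpenCC conversion step is omitted so that the example needs only the standard library.

```python
import re

# A made-up two-entry SRT document.
sample_srt = (
    "1\n00:00:00,000 --> 00:00:04,000\nfirst line\n\n"
    "2\n00:00:06,000 --> 00:00:08,000\nsecond line\n\n"
)

# Same pattern as in extract_and_save_text: index line plus time range line.
TIMECODE_PATTERN = r'\d+\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\n'

# Step 1: drop the index and timecode lines.
without_timecodes = re.sub(TIMECODE_PATTERN, '', sample_srt)

# Step 2: collapse the blank lines left between entries.
pure_text = re.sub(r'\n\n+', '\n', without_timecodes)

print(repr(pure_text))  # 'first line\nsecond line\n'
```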

In [12]:
def chunk_text(text, max_length):
    """
    (1) Objective:
        - This function is used to split a long string into smaller strings of a specified length.

    (2) Arguments:
        - text: str, the long string to be split.
        - max_length: int, the maximum length of each smaller string.

    (3) Returns:
        - split_text: list, a list of smaller strings.

    (4) Example:
        - If you want to split a string named "long_string" into smaller strings of length 100, you can use the function like this:
            chunk_text(long_string, 100)

    """

    return textwrap.wrap(text, max_length)
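
`textwrap.wrap` prefers to break at whitespace and, by default, also splits any single run that is longer than the limit, so every chunk stays within `max_length` characters. It also turns newlines into spaces, which is why the subtitle lines end up separated by spaces in the chunks. The sketch below illustrates these behaviours on toy strings; it restates `chunk_text` so that the example is self-contained.

```python
import textwrap


def chunk_text(text, max_length):
    """Split text into pieces of at most max_length characters."""
    return textwrap.wrap(text, max_length)


# Breaks at whitespace when possible.
print(chunk_text("aaaa bbbb cccc", 9))  # ['aaaa bbbb', 'cccc']

# A run without spaces is cut at the limit.
print(chunk_text("abcdefghij", 4))      # ['abcd', 'efgh', 'ij']

# Newlines are treated as ordinary whitespace.
print(chunk_text("ab\ncd", 10))         # ['ab cd']
```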

In [13]:
''' In this block, you can modify your desired parameters and the path of input file. '''

# The maximum length of each text chunk, in characters.
chunk_length = 512

In [14]:
# Extract the text from the SRT file and save it to a new text file.
pure_text = extract_and_save_text(srt_filename=output_subtitle_path, output_filename=output_raw_text_path)

# Split the long document into smaller chunks of at most chunk_length characters.
chunks = chunk_text(text=pure_text, max_length=chunk_length)

# You can see the number of characters and the contents of each chunk.
print("Review the results of splitting the long text into several short texts.\n")
ordinal_suffixes = {0: "st", 1: "nd", 2: "rd"}
for index, chunk in enumerate(chunks):
    # Use "st", "nd", "rd" for the first three segments and "th" afterwards.
    ordinal_suffix = ordinal_suffixes.get(index, "th")
    print(f"\n========== The {index + 1}-{ordinal_suffix} segment of the split ({len(chunk)} characters) ==========\n\n")
    for text in textwrap.wrap(chunk, 80):
        print(f"{text}\n")

Extracted text has been saved to ./output-信號與人生.txt.


Review the results of splitting the long text into several short texts.




每次說這個學問是做出來的 什麼意思? 要做才會獲得學問 你如果每天光是坐在那裡聽 學問很可能是左耳進右耳出的 你光是坐在那兒讀

學問可能從眼睛進入腦海之後就忘掉了 如何能夠學問在腦海裡面 真的變成你自己學問 就是要做 可能有很多同學有這個經驗 你如果去修某一門課 或者做某一個實驗

在期末就是要教一個final project 那個final project就是要你把 學到的很多東西 最後整合在你的final project裡面

最後做出來的時候 就是把它們都整合了 當你學期結束 真的把final project做完的時候 你會忽然發現 我真的學到很多東西 那就是做出來的學問

也許可以舉另外一個例子 就是你如果學了某一些很複雜的演算法 或者什麼 好像覺得那些不見得在你的腦海裡 可是後來老師出了個習題 那個習題教你寫一個很大的程式

要把所有東西都包進去 當你把這個程式寫完的時候你會發現 你忽然把演算法裡所有東西都弄通了 那就是學問是做出來的 所以我們永遠要記得 盡量多動手多做

在動手跟做的過程之中 學問纔可以變成是自己的 同樣的情形就是說 很多時候這樣動手或者做的表現或者成績 沒有一個成績單上的數字




使得很多人覺得那不重要 很多人甚至覺得這門課要做final project 我就不修了太累了 或者說那門課需要怎麼樣怎麼樣太累 我就不要做了

而不知道其實那個纔是讓你做的機會 然後可以學到最多 也就是說雖然很可能那麼辛苦的做很多事 沒有讓你獲得什麼具體成績 對你的overfitting可能沒有幫助

可是對你的全面學習是很有幫助 是該學的 那不要漏掉這些事 那這是我所說的 那這個課業內可以做的這些事 那剛才我們講到思考的時候 我覺得我漏掉一點

你如果修我的信號課你可能會發現 我上課沒講到一個數學式子的時候 我通常都不推他的 我是在解釋那個數學式子在說什麼話 同樣的呢 沒講到一個什麼什麼事情的時候

我通常就在解釋他在說什麼話 也就是說 我在講的就是我讀到特本那裡的時候 我心裡怎麼想的 也就是我

# Part4 - Summarization


## **You only need to choose one of the following parts.**

## **If you want to use Gemini, begin with this part.**
##### (1) You can refer to https://shorturl.at/X0NDY (Page 35) for obtaining a Gemini API key.
##### (2) You can refer to https://ai.google.dev/models/gemini for more details about which models you can use.

In [67]:
def summarization(summarization_prompt, model_name="gemini-pro", temperature=0.0, top_p=1.0, max_tokens=512):
    """
    (1) Objective:
        - Use the Gemini API to summarize a given text.

    (2) Arguments:
        - summarization_prompt: The summarization prompt.
        - model_name: The model name, default is "gemini-pro". You can refer to "https://ai.google.dev/models/gemini" for more details.
        - temperature: Controls randomness in the response. Lower values make responses more deterministic, default is 0.0.
        - top_p: Controls diversity via nucleus sampling. Higher values lead to more diverse responses, default is 1.0.
        - max_tokens: The maximum number of tokens to generate in the completion, default is 512.

    (3) Return:
        - The summarized text.

    (4) Example:
        - If the summarization prompt is "DEF" (with the text to summarize already inserted), model_name is "gemini-pro",
          temperature is 0.0, top_p is 1.0, and max_tokens is 512, then you can call the function like this:

              summarization(summarization_prompt="DEF", model_name="gemini-pro", temperature=0.0, top_p=1.0, max_tokens=512)

    """

    # The user prompt is the summarization prompt, which already contains the text to summarize.
    user_prompt = summarization_prompt

    # Load the generative model.
    model = genai.GenerativeModel(model_name)

    # Set the generation configuration.
    generation_config = genai.GenerationConfig(temperature=temperature, top_p=top_p, max_output_tokens=max_tokens)

    while True:

        try:
            # Use the Gemini API to summarize the text.
            response = model.generate_content(contents=user_prompt, generation_config=generation_config)

            break

        except Exception as error:
            # If the API call fails, wait for 1 second and try again.
            # Catching Exception instead of using a bare except keeps keyboard and kernel interrupts working.
            print(f"The API call failed ({error}); waiting 1 second before trying again.")
            time.sleep(1)

    return response.text
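
The retry loop in `summarization` repeats forever if the API keeps failing, for example because of an invalid key. As an alternative, here is a minimal sketch of a bounded retry with exponential backoff. `call_with_retries` is a hypothetical helper, not part of the notebook or of any library, and the flaky function only simulates a failing API call.

```python
import time


def call_with_retries(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached (hypothetical helper)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as error:
            if attempt == max_attempts:
                # Give up and surface the last error to the caller.
                raise
            # Wait 1x, 2x, 4x, ... the base delay between attempts.
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({error}); retrying in {delay:.1f} s.")
            sleep(delay)


# Simulated API call that fails twice and then succeeds.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


# The sleep function is replaced by a no-op so the demo runs instantly.
result = call_with_retries(flaky_call, sleep=lambda seconds: None)
print(result)
```

In the notebook, the same idea could be applied by passing a small function that wraps the `model.generate_content(...)` call.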

In [70]:
# @title Parameter Setting of Gemini { run: "auto" }
''' In this block, you can modify your desired parameters and set your api key. '''

# Your Google API key. Do not share a notebook that contains a real key.
# @markdown **google_api_key**: Your Google API key.
google_api_key = "YOUR_GOOGLE_API_KEY" # @param {type:"string"}

# The model name. You can refer to "https://ai.google.dev/models/gemini" for more details.
# @markdown **model_name**: The model name. You can refer to "https://ai.google.dev/models/gemini" for more details.
model_name = "gemini-pro" # @param {type:"string"}

# Controls randomness in the response. Lower values make responses more deterministic
# @markdown **temperature**: Controls randomness in the response. Lower values make responses more deterministic.
temperature = 0.2 # @param {type:"slider", min:0, max:1, step:0.1}

# Controls diversity via nucleus sampling. Higher values lead to more diverse responses
# @markdown **top_p**: Controls diversity via nucleus sampling. Higher values lead to more diverse responses.
top_p = 1.0 # @param {type:"slider", min:0, max:1, step:0.1}
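
The two sampling parameters above can be illustrated with a small, self-contained calculation. The sketch below shows the textbook definitions of temperature scaling and nucleus (top-p) filtering on a toy three-token distribution. It illustrates the general idea only and does not describe how the Gemini service implements sampling internally; the function names are hypothetical. It assumes a temperature greater than 0.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [value / temperature for value in logits]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def nucleus(probs, top_p):
    """Return indices of the smallest set of tokens whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept


toy_logits = [2.0, 1.0, 0.0]
flat = softmax_with_temperature(toy_logits, 1.0)
sharp = softmax_with_temperature(toy_logits, 0.2)

# A lower temperature concentrates probability on the most likely token.
print([round(p, 3) for p in flat])
print([round(p, 3) for p in sharp])

# A lower top_p keeps fewer candidate tokens.
print(nucleus(flat, 0.5))
print(nucleus(flat, 0.9))
```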

In [71]:
# Set Google API key.
genai.configure(api_key=google_api_key)

### We offer the following two methods for summarization.
Reference: https://reurl.cc/VzagLA

#### **If you want to use the method of Multi-Stage Summarization, begin with this part.**


In [72]:
# @title Prompt Setting of Gemini Multi-Stage Summarization: Paragraph { run: "auto" }
''' You can modify the summarization prompt and maximum number of tokens. '''
''' However, DO NOT modify the part of <text>.'''

# The maximum number of tokens to generate in the completion.
# @markdown **max_tokens**: The maximum number of tokens to generate in the completion.
max_tokens = 350 # @param {type:"integer"}

# @markdown #### Changing **summarization_prompt_template**
# @markdown You can modify the summarization prompt and maximum number of tokens. However, **DO NOT** modify the part of `<text>`.
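# In English, the prompt below reads: "Write a summary of this text within 300 characters, including the key points and all important details: <text>"
summarization_prompt_template = "用 300 個字內寫出這段文字的摘要，其中包括要點和所有重要細節：<text>" # @param {type:"string"}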

##### Step1: Split the long text into multiple smaller pieces and obtain summaries for each smaller text piece separately


The code block below takes about **40** seconds to run when using the (1) **gemini-pro** model, (2) length of chunks is 512 and (3) maximum number of tokens is 350, but the actual time may vary depending on the condition of Colab and the status of the Google API.

In [75]:
paragraph_summarizations = []

# First, we summarize each section that has been split up separately.
for index, chunk in enumerate(chunks):
    try:
        # Record the start time.
        start = time.time()

        # Construct summarization prompt.
        summarization_prompt = summarization_prompt_template.replace("<text>", chunk)

        # We summarize each section that has been split up separately.
        response = summarization(summarization_prompt=summarization_prompt, model_name=model_name, temperature=temperature, top_p=top_p, max_tokens=max_tokens)

        # Calculate the execution time and round it to 2 decimal places.
        cost_time = round(time.time() - start, 2)

        # Print the summary and its length.
        print(f"----------------------------Summary of Segment {index + 1}----------------------------\n")
        for text in textwrap.wrap(response, 80):
            print(f"{text}\n")
        print(f"Length of summary for segment {index + 1}: {len(response)}")
        print(f"Time taken to generate summary for segment {index + 1}: {cost_time} sec.\n")

        # Record the result.
        paragraph_summarizations.append(response)
    except Exception as error:
        # Report the failure instead of silently skipping the segment.
        print(f"Skipping segment {index + 1} because summarization failed: {error}")
        continue

----------------------------Summary of Segment 1----------------------------

「做出來的學問」強調學習需要實踐。單純聽講或閱讀可能無法讓知識內化。通過動手做，例如完成專題或編寫程式，學生可以將所學知識整合並真正理解。這種實踐過程使知識成為

自己的，即使沒有傳統的成績評分。因此，學生應積極參與實作活動，以最大限度地吸收和保留學到的內容。

Length of summary for segment 1: 128
Time taken to generate summary for segment 1: 4.41 sec.

----------------------------Summary of Segment 2----------------------------

許多學生認為困難的課程或作業不重要，甚至因此放棄修課。然而，這些挑戰正是學習的機會，能培養全面的理解力。

作者建議學生在學習時，不要只關注具體成績，而應專注於理解概念和培養思考能力。通過思考數學公式和文本內容的含義，學生可以培養思考習慣和理解力。

作者強調，讀書時思考是至關重要的，因為它能幫助學生深入理解材料，並培養批判性思維能力。通過持續練習思考，學生可以提高他們的全面學習能力。

Length of summary for segment 2: 194
Time taken to generate summary for segment 2: 5.54 sec.

----------------------------Summary of Segment 3----------------------------

學習不僅限於課業，課外活動也提供了增長、進步和快樂的機會。打球可以培養手腦協調、團隊精神和人際互動。爬山可以增長見識，擴展視野。旅行可以增長見識，擴展事業。任何

能帶來快樂的活動都可以被視為學習，值得投入時間和精力。

Length of summary for segment 3: 107
Time taken to generate summary for segment 3: 4.53 sec.

----------------------------Summary of Segmen

In [76]:
# First, we collect all the summarizations obtained before and print them.

collected_summarization = ""
for index, paragraph_summarization in enumerate(paragraph_summarizations):
    collected_summarization += f"Summary of segment {index + 1}: {paragraph_summarization}\n"

print(collected_summarization)

Summary of segment 1: 「做出來的學問」強調學習需要實踐。單純聽講或閱讀可能無法讓知識內化。通過動手做，例如完成專題或編寫程式，學生可以將所學知識整合並真正理解。這種實踐過程使知識成為自己的，即使沒有傳統的成績評分。因此，學生應積極參與實作活動，以最大限度地吸收和保留學到的內容。
Summary of segment 2: 許多學生認為困難的課程或作業不重要，甚至因此放棄修課。然而，這些挑戰正是學習的機會，能培養全面的理解力。

作者建議學生在學習時，不要只關注具體成績，而應專注於理解概念和培養思考能力。通過思考數學公式和文本內容的含義，學生可以培養思考習慣和理解力。

作者強調，讀書時思考是至關重要的，因為它能幫助學生深入理解材料，並培養批判性思維能力。通過持續練習思考，學生可以提高他們的全面學習能力。
Summary of segment 3: 學習不僅限於課業，課外活動也提供了增長、進步和快樂的機會。打球可以培養手腦協調、團隊精神和人際互動。爬山可以增長見識，擴展視野。旅行可以增長見識，擴展事業。任何能帶來快樂的活動都可以被視為學習，值得投入時間和精力。
Summary of segment 4: 參與課外活動和校內外活動可以帶來許多成長和進步，包括電機工程以外的領域。這些活動提供了寶貴的學習機會，培養團隊合作、領導力和溝通等軟實力。儘管這些活動沒有考試或成績，但它們對於個人和職業發展至關重要。在電機工程領域，團隊合作和軟實力對於成功至關重要，因為很少有事情可以單獨完成。因此，參與課外活動和校內外活動對於培養這些技能並為未來的成功做好準備非常重要。
Summary of segment 5: 軟實力是指人際交往能力，包括溝通、協調、交友、說服、團隊精神和領導力等。這些能力對成功至關重要，即使是電機工程師也需要具備。

軟實力通常不是天生的，而是通過課外活動和學習機會培養的。作者在台大電機系就讀期間，通過努力培養了這些能力，並強調了它們的重要性。

作者認為，軟實力是電機工程師成功的關鍵，並建議學生重視課外活動和學習機會，以培養這些能力。
Summary of segment 6: 職業生涯的黃金時期通常在 35 至 55 歲之間。在此期間，個人通常具備了必要的技能和經驗，並處於事業的巔峰。

電機系畢業生的職業發展軌跡各不相同。有些人從一開

##### Step2: After obtaining summaries for each smaller text piece separately, process these summaries to generate the final summary.
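
The two steps of Multi-Stage Summarization can be sketched end to end without calling any API. In the sketch below, `multi_stage_summarize` is a hypothetical helper and `toy_summarize` is a stand-in for the LLM that simply keeps the first five words of the text, so only the control flow is meaningful here, not the quality of the "summaries".

```python
import textwrap


def multi_stage_summarize(text, summarize, chunk_length, chunk_template, final_template):
    """Summarize each chunk, then summarize the collected chunk summaries."""
    # Step 1: split the text and summarize every chunk separately.
    chunks = textwrap.wrap(text, chunk_length)
    chunk_summaries = [summarize(chunk_template.replace("<text>", chunk)) for chunk in chunks]

    # Step 2: collect the chunk summaries and summarize them once more.
    collected = "".join(
        f"Summary of segment {index + 1}: {summary}\n"
        for index, summary in enumerate(chunk_summaries)
    )
    final_summary = summarize(final_template.replace("<text>", collected))
    return final_summary, chunk_summaries


def toy_summarize(prompt):
    """Stand-in for the LLM: return the first five words after the instruction."""
    body = prompt.split(":", 1)[1]
    return " ".join(body.split()[:5])


text = "one two three four five six seven eight nine ten eleven twelve"
final, parts = multi_stage_summarize(
    text,
    toy_summarize,
    chunk_length=30,
    chunk_template="Summarize:<text>",
    final_template="Combine:<text>",
)
print(parts)
print(final)
```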

In [77]:
# @title Prompt Setting of Gemini Multi-Stage Summarization: Total { run: "auto" }
''' You can modify the summarization prompt and maximum number of tokens. '''
''' However, DO NOT modify the part of <text>.'''

# We set the maximum number of tokens to ensure that the final summary does not exceed 550 tokens.
# @markdown **max_tokens**: We set the maximum number of tokens to ensure that the final summary does not exceed 550 tokens.
max_tokens = 550 # @param {type:"integer"}

# @markdown ### Changing **summarization_prompt_template**
# @markdown You can modify the summarization prompt and maximum number of tokens. However, **DO NOT** modify the part of `<text>`.
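# In English, the prompt below reads: "Write a concise summary of the following text within 500 characters: <text>"
summarization_prompt_template = "在 500 字以內寫出以下文字的簡潔摘要：<text>" # @param {type:"string"}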

The code block below takes about **20** seconds to run when using the (1) **gemini-pro** model, (2) length of chunks is 512 and (3) maximum number of tokens is 550, but the actual time may vary depending on the condition of Colab and the status of the Google API.

In [78]:
# Finally, we compile a final summary from the summaries of each section.

# Record the start time.
start = time.time()

# Run final summarization.
summarization_prompt = summarization_prompt_template.replace("<text>", collected_summarization)
final_summarization = summarization(summarization_prompt=summarization_prompt, model_name=model_name, temperature=temperature, top_p=top_p, max_tokens=max_tokens)

# Calculate the execution time and round it to 2 decimal places.
cost_time = round(time.time() - start, 2)

# Print the summary and its length.
print("----------------------------Final Summary----------------------------\n")
for text in textwrap.wrap(final_summarization, 80):
    print(f"{text}")
print(f"\nLength of final summary: {len(final_summarization)}")
print(f"Time taken to generate the final summary: {cost_time} sec.")

----------------------------Final Summary----------------------------

**摘要**  **段落 1：**學習需要實踐，通過動手做可以真正理解知識。  **段落 2：**困難的課程是學習機會，應專注於理解概念和培養思考能力。
**段落 3：**課外活動提供成長和快樂，任何帶來快樂的活動都可以被視為學習。  **段落
4：**課外活動培養團隊合作、領導力和溝通等軟實力，這些技能對電機工程師至關重要。  **段落
5：**軟實力是電機工程師成功的關鍵，應通過課外活動和學習機會培養這些能力。  **段落 6：**職業發展受實力、努力、大智和自我技能影響，並可能呈現不同模式。
**段落 7：**實力、努力和自我技能是職業發展的關鍵要素，設定長期的目標可以提供方向和激勵。  **段落
8：**長程目標應具有意義、可行、具體，並設定現實的時間框架。

Length of final summary: 339
Time taken to generate the final summary: 8.25 sec.


In [79]:
''' In this block, you can modify your desired output path of final summary. '''

output_path = f"./final-summary-{suffix}-gemini-multi-stage.txt"

# If you need to convert Simplified Chinese to Traditional Chinese, please set this option to True; otherwise, set it to False.
convert_to_tradition_chinese = False

if convert_to_tradition_chinese:
    # Creating an instance of OpenCC for Simplified to Traditional Chinese conversion.
    cc = OpenCC('s2t')
    final_summarization = cc.convert(final_summarization)

# Output your final summary. UTF-8 is set explicitly so that the Chinese text is written correctly on any platform.
with open(output_path, "w", encoding="utf-8") as fp:
    fp.write(final_summarization)

print(f"Final summary has been saved to {output_path}")

Final summary has been saved to ./final-summary-信號與人生-gemini-multi-stage.txt


#### **If you want to use the method of Refinement, begin with this part.**



In [81]:
# @title Prompt Setting of Gemini Refinement { run: "auto" }
''' You can modify the summarization prompt and maximum number of tokens. '''
''' However, DO NOT modify the part of <text>.'''

# We set the maximum number of tokens.
# @markdown **max_tokens**: We set the maximum number of tokens.
max_tokens = 550 # @param {type:"integer"}

# @markdown ### Changing **summarization_prompt_template** and **summarization_prompt_refinement_template**
# @markdown You can modify the summarization prompt and maximum number of tokens. However, **DO NOT** modify the part of `<text>`.

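# Initial prompt.
# In English: "Please provide a concise summary of the following text within 300 characters: <text>"
# @markdown **summarization_prompt_template**: Initial prompt.
summarization_prompt_template = "請在 300 字以內，提供以下文字的簡潔摘要:<text>" # @param {type:"string"}

# Refinement prompt.
# In English: "Please combine the previous summary with the new content and provide a concise summary within 500 characters: <text>"
# @markdown **summarization_prompt_refinement_template**: Refinement prompt.
summarization_prompt_refinement_template = "請在 500 字以內，結合原先的摘要和新的內容，提供簡潔的摘要:<text>" # @param {type:"string"}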

Pipeline of the method of Refinement.

Step1: It starts by running a prompt on a small portion of the data, generating initial output.

Step2: For each following document, the previous output is fed in along with the new document.

Step3: The LLM is instructed to refine the output based on the new document's information.

Step4: This process continues iteratively until all documents have been processed.
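
The four steps above can be sketched without calling any API. In the sketch below, `refine_summarize` is a hypothetical helper and `toy_summarize` is a stand-in for the LLM that merely keeps the upper-case keywords it finds, so only the iterative control flow is meaningful.

```python
def refine_summarize(chunks, summarize, initial_template, refinement_template):
    """Iteratively refine a running summary, one chunk at a time."""
    running_summary = None
    history = []
    for index, chunk in enumerate(chunks):
        if index == 0:
            # Step 1: summarize the first chunk on its own.
            prompt = initial_template.replace("<text>", chunk)
        else:
            # Step 2: feed the previous summary together with the new chunk.
            combined = (
                f"Summary of the first {index} segments: {running_summary}\n"
                f"Content of segment {index + 1}: {chunk}"
            )
            prompt = refinement_template.replace("<text>", combined)
        # Step 3: ask the model for a refined summary.
        running_summary = summarize(prompt)
        history.append(running_summary)
    # Step 4: after the last chunk, the running summary is the final summary.
    return running_summary, history


def toy_summarize(prompt):
    """Stand-in for the LLM: keep only the upper-case keywords of the prompt body."""
    body = prompt.split(":", 1)[1]
    return " ".join(word for word in body.split() if word.isupper())


toy_chunks = ["PRACTICE matters most", "THINKING while reading", "TEAMWORK outside class"]
final_summary, history = refine_summarize(
    toy_chunks, toy_summarize, "Summarize:<text>", "Refine:<text>"
)
print(history)
print(final_summary)
```

Unlike Multi-Stage Summarization, each call here depends on the previous one, so the chunks cannot be processed in parallel.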

The code block below takes about **45** seconds to run when using the (1) **gemini-pro** model and (2) maximum number of tokens is 550, but the actual time may vary depending on the condition of Colab and the status of the Google API.

In [82]:
paragraph_summarizations = []

# Process the chunks in order, refining the running summary at each step.
for index, chunk in enumerate(chunks):

    if index == 0:
        # Record the start time.
        start = time.time()

        # Construct summarization prompt.
        summarization_prompt = summarization_prompt_template.replace("<text>", chunk)

        # Step1: It starts by running a prompt on a small portion of the data, generating initial output.
        first_paragraph_summarization = summarization(summarization_prompt=summarization_prompt, model_name=model_name, temperature=temperature, top_p=top_p, max_tokens=max_tokens)

        # Record the result.
        paragraph_summarizations.append(first_paragraph_summarization)

        # Calculate the execution time and round it to 2 decimal places.
        cost_time = round(time.time() - start, 2)

        # Print the summary and its length.
        print(f"----------------------------Summary of Segment {index + 1}----------------------------\n")
        for text in textwrap.wrap(first_paragraph_summarization, 80):
            print(f"{text}\n")
        print(f"Length of summary for segment {index + 1}: {len(first_paragraph_summarization)}")
        print(f"Time taken to generate summary for segment {index + 1}: {cost_time} sec.\n")

    else:
        # Record the start time.
        start = time.time()

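        # Step2: For each following document, the previous output is fed in along with the new document.
        # The Chinese labels read "Summary of the first {index} segments" and "Content of segment {index + 1}".
        chunk_text = f"""前 {index} 段的摘要: {paragraph_summarizations[-1]}\n第 {index + 1} 段的內容: {chunk}"""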

        # Construct refinement prompt for summarization.
        summarization_prompt = summarization_prompt_refinement_template.replace("<text>", chunk_text)

        # Step3: The LLM is instructed to refine the output based on the new document's information.
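        try:
            paragraph_summarization = summarization(summarization_prompt=summarization_prompt, model_name=model_name, temperature=temperature, top_p=top_p, max_tokens=max_tokens)

            # Record the result.
            paragraph_summarizations.append(paragraph_summarization)
        except Exception as error:
            # Skip this segment if the API call fails; the next segment refines the last successful summary.
            print(f"Failed to summarize segment {index + 1}: {error}\n")
            continue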
        # Calculate the execution time and round it to 2 decimal places.
        cost_time = round(time.time() - start, 2)

        # print results.
        print(f"----------------------------Summary of the First {index + 1} Segments----------------------------\n")
        for text in textwrap.wrap(paragraph_summarization, 80):
            print(f"{text}\n")
        print(f"Length of summary for the first {index + 1} segments: {len(paragraph_summarization)}")
        print(f"Time taken to generate summary for the first {index + 1} segments: {cost_time} sec.\n")

    # Step4: This process continues iteratively until all documents have been processed.

----------------------------Summary of Segment 1----------------------------

「做出來的學問」強調學習需要實踐。光靠聽講或閱讀無法真正吸收知識。只有透過動手做，將所學整合應用於實際專案或任務中，才能將知識內化為自己的。這種實作過程有助於加

深理解，並將學到的演算法或概念融會貫通。因此，學習過程中應積極參與動手實作，才能真正掌握學問。

Length of summary for segment 1: 127
Time taken to generate summary for segment 1: 4.76 sec.

----------------------------Summary of the First 2 Segments----------------------------

**簡潔摘要**

學習需要實踐，透過動手做才能真正吸收知識。實作過程有助於加深理解，將學到的演算法或概念融會貫通。因此，學習過程中應積極參與動手實作，才能真正掌握學問。

然而，許多人認為實作不重要，甚至因此放棄修課。但實作正是學習的機會，可以學到最多。雖然實作可能很辛苦，但對全面學習很有幫助。

在學習過程中，培養思考能力和習慣至關重要。最好的方法是讀書時思考數學式子和概念的意義。透過不斷思考，可以深入理解所學內容，並培養獨立思考的能力。

Length of summary for the first 2 segments: 223
Time taken to generate summary for the first 2 segments: 5.79 sec.

----------------------------Summary of the First 3 Segments----------------------------

**簡潔摘要**  學習需要實踐，透過動手做才能真正吸收知識。實作有助於加深理解，培養思考能力。因此，學習過程中應積極參與動手實作，才能真正掌握學問。  學習不

僅限於課業內，課業外也有許多值得學習的事物。只要能帶來增長、進步和快樂，任何活動都可以視為學習機會。例如，打球可以增進手腦協調和團隊精神，爬山可以培養毅力，旅行

可以擴展見識。因此，應積極參
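The refinement loop above skips a segment whenever the API call raises an exception, so that segment's content never reaches the final summary. One possible mitigation is to retry the call a few times before giving up. The sketch below is only an illustration under stated assumptions: `call_with_retry` and `flaky_summarization` are hypothetical helpers introduced here, not part of the notebook or of any API client.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(func: Callable[[], T], max_attempts: int = 3, base_delay: float = 1.0) -> T:
    """Call func, retrying with exponential backoff; re-raise after the final attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


# Hypothetical stand-in for an API call that fails twice and then succeeds.
attempts = {"count": 0}


def flaky_summarization() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")
    return "summary text"


result = call_with_retry(flaky_summarization, max_attempts=3, base_delay=0.0)
print(result, attempts["count"])
```

In the notebook, the same helper could wrap the existing call, for example `call_with_retry(lambda: summarization(...))`, with a non-zero `base_delay` so that a rate-limited API has time to recover.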

In [83]:
''' In this block, you can modify your desired output path of final summary. '''

output_path = f"./final-summary-{suffix}-gemini-refinement.txt"

# If you need to convert Simplified Chinese to Traditional Chinese, please set this option to True; otherwise, set it to False.
convert_to_tradition_chinese = False

if convert_to_tradition_chinese:
    # Creating an instance of OpenCC for Simplified to Traditional Chinese conversion.
    cc = OpenCC('s2t')
    paragraph_summarizations[-1] = cc.convert(paragraph_summarizations[-1])

# Write the final summary to disk; UTF-8 is set explicitly because the summary contains Chinese text.
with open(output_path, "w", encoding="utf-8") as fp:
    fp.write(paragraph_summarizations[-1])

# Show the result.
print(f"Final summary has been saved to {output_path}")
# len() counts characters, not words.
print(f"\n===== Below is the final summary ({len(paragraph_summarizations[-1])} characters) =====\n")
for text in textwrap.wrap(paragraph_summarizations[-1], 64):
    print(text)

Final summary has been saved to ./final-summary-信號與人生-gemini-refinement.txt

===== Below is the final summary (297 characters) =====

**簡潔摘要**
學習不僅限於課堂，實作和課外活動對於全面發展至關重要。實作加深理解，培養思考能力，而課外活動增進手腦協調、毅力、見識和人際互動。
課外活動提供體驗人際關係的機會，培養溝通、協調和領導等軟實力，這些軟實力對於電機工程領域的成功至關重要。
個人事業發展的關鍵因素包括實力（全面學習）、努力、大智和自我技能。這些因素比學業成績更能影響個人事業的成功。  實力來自於全面的
學習，努力是不可或缺的，自我技能則透過課外學習和成長機會獲得。個人可以設定長程目標，並透過這些因素的培養，提升個人發展和事業成功
。  長程目標是個人願意花費大量時間和精力去實現的目標，它能激勵個人向上衝刺。


# Part5 - Check the correctness of the submission file


In [84]:
# Check the correctness of the submission file.
import json
import re

your_submission_path = "YOUR_SUBMISSION_PATH"

def check_format(your_submission_path):

    final_score = 0

    # check the extension of the file.
    if not your_submission_path.endswith(".json"):
        print("Please save your submission file in JSON format.")
        return False, final_score
    else:
        try:
            with open(your_submission_path, "r") as fp:
                your_submission = json.load(fp)

            evaluation_result = your_submission["history"][0]["messages"][1]["content"]

            if "總分：" not in evaluation_result:
                # Correct format: 總分：<your score> (note the full-width colon "：").
                print("Please make sure that the correct format of final score is included in the evaluation result.")
                print("The correct format is 總分：<your score> with a full-width colon. For example, 總分：97")
                return False, final_score

            evaluation_result = evaluation_result.strip()
            score_pattern = r"總分：\d+"
            score = re.findall(score_pattern, evaluation_result)

            if score:
                # Use the last match in case the score appears more than once.
                final_score = score[-1].replace("總分：", "")
            else:
                print("Please make sure that the final score is included in the evaluation result.")
                return False, final_score

        except (OSError, json.JSONDecodeError, KeyError, IndexError, TypeError):
            print("Failed to open or parse the file. Please check the file path and make sure the submission file is saved in the correct JSON format.")
            return False, final_score

    return True, final_score

format_correctness, final_score = check_format(your_submission_path)
if format_correctness:
    print("The format of your submission file is correct.")
    print(f"Your final score is {final_score}.")
else:
    print("The format of your submission file is wrong.")
    print("Please check the format of your submission file.")

Please save your submission file in JSON format.
The format of your submission file is wrong.
Please check the format of your submission file.
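The run above fails only because `your_submission_path` is still the placeholder string, which does not end with `.json`. For reference, the checker reads `history[0]["messages"][1]["content"]` and looks for `總分：` (with a full-width colon) followed by digits. The sketch below builds a minimal structure with that shape; the message texts are placeholders invented for illustration, and a real submission file may contain additional fields.

```python
import json
import re

# Hypothetical submission content; only the keys read by the checker are shown.
sample_submission = {
    "history": [
        {
            "messages": [
                {"content": "Evaluation prompt (placeholder)"},
                {"content": "Feedback text (placeholder)\n總分：97"},
            ]
        }
    ]
}

# Round-trip through JSON text, since the checker loads the file with json.load.
loaded = json.loads(json.dumps(sample_submission, ensure_ascii=False))
evaluation_result = loaded["history"][0]["messages"][1]["content"]

# Same pattern as the checker: full-width colon followed by digits.
matches = re.findall(r"總分：\d+", evaluation_result)
final_score = matches[-1].replace("總分：", "") if matches else None
print(final_score)  # → 97

# An ASCII colon followed by a space does not match the pattern.
ascii_matches = re.findall(r"總分：\d+", "總分: 97")
print(ascii_matches)  # → []
```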
