# Building a Summarisation App with FastAPI
Summarisation is the process of condensing a longer text into a shorter version that preserves the key ideas and intent. There are two main types:

Extractive: selects important sentences/phrases from the original.

Abstractive: generates new text that paraphrases the original (what Gemini does).
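To make the contrast concrete, here is a minimal sketch of the extractive approach, assuming a simple word-frequency score. The function name `extractive_summary` and the tiny stop-word list are illustrative choices, not part of any library; an abstractive summary, by comparison, comes from a generative model rather than from sentence selection.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
              "for", "as", "with", "that", "this", "it", "on", "by", "have", "has"}


def _content_words(text):
    """Lower-case the text and return its words, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def extractive_summary(text, num_sentences=2):
    """Pick the highest-scoring sentences and return them in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    freq = Counter(_content_words(text))

    def score(sentence):
        tokens = _content_words(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)
```

Every sentence in the result is copied verbatim from the input, which is the defining property of extractive summarisation.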

# Steps to maintain high-quality summarisation output
1. Preprocess input (clean, remove noise, handle very long text).
2. Decide summarization type (extractive vs abstractive).
3. Build a prompt or model pipeline.
4. Generate summary.
5. Post-process (trim, check length, ensure clarity).
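The preprocessing step can be sketched as follows. This is a minimal example that assumes the input is plain text extracted from a PDF or web page; `clean_text` is a hypothetical helper name. It only repairs hyphenation at line breaks and collapses whitespace, so a genuinely hyphenated word that happens to break across lines would also be joined.

```python
import re


def clean_text(text):
    """Normalise raw extracted text before it is sent to a model."""
    # Re-join words split by a hyphen at a line break, e.g. "lim-\nited" -> "limited".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse every run of whitespace (spaces, tabs, newlines) into one space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()
```

For example, `clean_text("lim-\nited  fertility")` returns `"limited fertility"`.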

# Key factors when running summarization

1. Length control: set word/token limits, or instruct max words.
2. Fidelity: ensure summary doesn’t add facts not in the source.
3. Coverage: capture all major points, not just the first paragraphs.
4. Style: concise, bullet points, executive, etc.
5. Input size: LLMs have context limits; long text may need chunking.
6. Repetition: avoid repeating the same point in the summary.
7. Latency and cost: larger inputs and outputs cost more and are slower.
8. Error handling: handle empty input or model failures gracefully.
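The input-size point can be handled with chunking. The sketch below is one possible approach, not a prescribed one: it splits on words (only a rough stand-in for model tokens), summarises each chunk, then summarises the combined partial summaries. `chunk_words` and `summarize_long` are hypothetical names, and `summarize_fn` is any callable that maps text to a summary.

```python
def chunk_words(text, max_words=800, overlap=50):
    """Split text into word-based chunks that overlap slightly."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def summarize_long(text, summarize_fn, max_words=800, overlap=50):
    """Summarise each chunk, then summarise the combined partial summaries."""
    chunks = chunk_words(text, max_words, overlap)
    if not chunks:
        return ""
    if len(chunks) == 1:
        return summarize_fn(chunks[0])
    partials = [summarize_fn(chunk) for chunk in chunks]
    return summarize_fn("\n".join(partials))
```

The overlap keeps a sentence that straddles a chunk boundary visible to both neighbouring chunks, at the cost of a few repeated words per call.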

In [1]:
# importing required libraries
import os
from dotenv import load_dotenv
import google.generativeai as genai



All support for the `google.generativeai` package has ended. It will no longer be receiving 
updates or bug fixes. Please switch to the `google.genai` package as soon as possible.
See README for more details:

https://github.com/google-gemini/deprecated-generative-ai-python/blob/main/README.md



In [4]:
load_dotenv()  # Load environment variables from .env file
# Set the API key for Google Generative AI
api_key = os.getenv("GEMINI_API_KEY", "").strip()
if not api_key:
    raise RuntimeError("GEMINI_API_KEY is not set")

genai.configure(api_key=api_key)

In [None]:
# Initialize the Gemini model with specific generation configuration
model = genai.GenerativeModel(
    model_name="gemini-2.5-flash-lite",
    generation_config={
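        "temperature": 0.2,  # low randomness for consistent, focused output
        "top_p": 0.8,  # nucleus sampling: sample from tokens covering 80% of the probability mass
        "top_k": 64,  # consider only the 64 most likely tokens at each step
        "max_output_tokens": 1024,  # upper bound on the length of the response
    },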
    },
)

In [6]:
# Function to summarize text using the Gemini model
def summarize(text, max_words=120, style="concise"):
    prompt = (
        "You are a helpful assistant that summarizes text.\n"
        f"Max {max_words} words. Style: {style}.\n"
        "Return only the summary.\n\n"
        f"TEXT:\n{text}"
    )
    response = model.generate_content(prompt)
    return (response.text or "").strip()
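The function above assumes the call always succeeds. A defensive wrapper along the following lines is one way to cover empty input and transient model failures. It is a sketch under assumptions: `safe_summarize` is a hypothetical helper, `summarize_fn` is any callable that takes text and returns a summary, and the broad `except` is deliberate because the exact exception types depend on the SDK in use.

```python
import time


def safe_summarize(text, summarize_fn, retries=2, delay_seconds=1.0):
    """Return '' for empty input or persistently empty output; raise after repeated errors."""
    if not text or not text.strip():
        return ""  # nothing to summarise, so skip the model call entirely
    last_error = None
    for attempt in range(retries + 1):
        try:
            result = summarize_fn(text)
            if result:
                return result
        except Exception as exc:  # broad on purpose: exception types depend on the SDK
            last_error = exc
        if attempt < retries:
            time.sleep(delay_seconds * (attempt + 1))  # simple linear backoff
    if last_error is not None:
        raise RuntimeError("Summarisation failed after retries") from last_error
    return ""
```

It could be used as `safe_summarize(text, lambda t: summarize(t, max_words=50, style="executive"))`.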

In [7]:
# Example usage
sample_text = """
719 © Springer International Publishing AG, part of Springer Nature 2018
J. Van Huylenbroeck (ed.), Ornamental Crops, Handbook of Plant Breeding 11,
https://doi.org/10.1007/978-3-319-90698-0_27
Chapter 27 Rose
Leen Leus, Katrijn Van Laere, Jan De Riek, and Johan Van Huylenbroeck
Abstract
Since ancient times, roses have graced homes and gardens as an ornamental plant.
Roses are used for many different ornamental purposes as cut flower, garden plant,
and pot plant, as well as industrial (perfume), medicinal, and culinary applications.
Roses have the largest economic value of all the ornamental plants
and have a long and well-documented tradition in selection and breeding.
With more than 30,000 cultivars, roses have the largest breeding output among all crops,
yet the demand for new cultivars continues unabated.
The search for novel ornamental traits is still the main breeding goal.
Besides, for cut roses differentiation in the product, increase in production,
and better adaptation to (new) production areas are sought.
In garden roses, breeders select for better adaptation to diseases including new
and important diseases such as rose rosette disease (RRD).
Rose breeding is challenging because of the very narrow genetic background in cultivated roses,
polyploidy and/or differences in ploidy levels, reproductive barriers including limited fertility
and germination challenges, etc. Tools for the breeder include knowledge on the background of traits,
(new) molecular techniques, and genome information.
This chapter gives an overview of the challenges in rose breeding, breeding objectives, the hybridization process,
and conventional and molecular breeding tools. Several breeders were contacted to share an applied and
practical viewpoint on cut rose and garden rose breeding with an eye toward trends and evolutions
in modern rose breeding
"""

# Generating final output 
summarize(sample_text, max_words=50, style="executive")

'Roses are the most economically valuable ornamental plant, with over 30,000 cultivars. Breeding focuses on novel traits, disease resistance, and production improvements. Challenges include narrow genetic backgrounds and reproductive barriers. This chapter explores rose breeding objectives, processes, and tools.'
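The prompt asks for a maximum word count, but a model is not guaranteed to respect it, so a post-processing check can be useful. The sketch below is a hypothetical helper (`enforce_word_limit` is not part of any SDK) intended for paragraph-style summaries: it keeps whole sentences while they fit and falls back to a hard cut otherwise.

```python
import re


def enforce_word_limit(summary, max_words):
    """Trim a summary to max_words, preferring to cut at a sentence boundary."""
    words = summary.split()
    if len(words) <= max_words:
        return summary.strip()
    sentences = re.split(r"(?<=[.!?])\s+", summary.strip())
    kept, count = [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if count + n > max_words:
            break
        kept.append(sentence)
        count += n
    if kept:
        return " ".join(kept)
    # Even the first sentence is too long: fall back to a hard cut.
    return " ".join(words[:max_words]) + "..."
```

Because the kept sentences are re-joined with spaces, this version flattens line breaks when it has to trim, so it is not suited to bulleted output as written.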

As seen above, with careful prompt engineering and with model parameters tuned for low-randomness (temperature=0.2) and focused (top_p=0.8, top_k=64) output, the model delivers a crisp summary that retains the context of the source material.
For summarization, reasonable starting ranges are:

temperature: 0.2–0.4

top_p: 0.7–0.9

top_k: 20–64
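A small helper can keep these settings in one place and catch obviously invalid values before a model is created. This is only a sketch: `build_generation_config` is a hypothetical name, the checks are basic sanity bounds rather than limits documented for any specific model, and the defaults simply mirror the values used earlier.

```python
def build_generation_config(temperature=0.2, top_p=0.8, top_k=64, max_output_tokens=1024):
    """Validate sampling settings and return them as a plain dict."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if not isinstance(top_k, int) or top_k < 1:
        raise ValueError("top_k must be a positive integer")
    if not isinstance(max_output_tokens, int) or max_output_tokens < 1:
        raise ValueError("max_output_tokens must be a positive integer")
    return {
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "max_output_tokens": max_output_tokens,
    }
```

The returned dict has the same shape as the `generation_config` passed to `genai.GenerativeModel` earlier, so a slightly more varied configuration would be `build_generation_config(temperature=0.4, top_p=0.9)`.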

Some educational and official tasks call for a bulleted summary. Let's try that here.

In [15]:
summary=summarize(sample_text, max_words=60, style="bullet symbols")
print(summary)

*   Roses are highly valued ornamental plants with diverse uses and significant economic impact.
*   Breeding aims for novel ornamental traits, improved production, and disease resistance.
*   Challenges include narrow genetic backgrounds and reproductive barriers.
*   Modern breeding utilizes conventional and molecular tools, with input from breeders on current trends.


The output is bulleted with "*" rather than numbered. In this run, a numbered list can be obtained by passing "bullet points" as the style when calling `summarize`. Let's check this.

In [18]:
summary=summarize(sample_text, max_words=60, style="bullet points")
print(summary)

1. Roses are highly valued ornamental plants with diverse uses and significant economic impact.
2. Rose breeding aims for novel ornamental traits, improved production, and disease resistance.
3. Challenges in rose breeding include narrow genetic backgrounds and reproductive barriers.
4. Modern rose breeding utilizes conventional and molecular tools, incorporating practical insights.


LLMs are good at generating fluent new text by default, so if we need the summary to be more "abstractive" we can change the prompt as follows.

In [17]:
def summarize_abstractive(text, max_words=120, style="concise"):
    prompt = (
        "You are a helpful assistant that summarizes text.\n"
        "Produce an abstractive summary: paraphrase the content in new wording, "
        "do not copy full sentences from the original.\n"
        f"Max {max_words} words. Style: {style}.\n"
        "Return only the summary.\n\n"
        f"TEXT:\n{text}"
    )
    response = model.generate_content(prompt)
    return (response.text or "").strip()

summary = summarize_abstractive(sample_text, max_words=60, style="bullet points")
print(summary)

*   Roses are highly valued ornamental plants with diverse uses and significant economic impact.
*   Despite extensive breeding, demand for new cultivars with novel traits persists.
*   Challenges in rose breeding include genetic limitations and reproductive complexities.
*   Modern breeding employs conventional and molecular techniques to address these challenges and meet market demands.


As seen above, asking for an abstractive summary yields fluent, reworded points. Note, however, that this time the points come back with "*" bullets rather than numbers: the same style string does not guarantee the same list format. To get a numbered list reliably, we have to request it explicitly by making the following changes.

In [19]:
def summarize_abstractive(text, max_words=120, style="concise"):
    if style == "bullet points":
        format_rule = "Return the summary as a numbered list, one item per line, formatted like '1. ...'"
    else:
        format_rule = f"Style: {style}."

    prompt = (
        "You are a helpful assistant that summarizes text.\n"
        "Produce an abstractive summary: paraphrase the content in new wording, "
        "do not copy full sentences from the original.\n"
        f"Max {max_words} words. {format_rule}\n"
        "Return only the summary.\n\n"
        f"TEXT:\n{text}"
    )
    response = model.generate_content(prompt)
    return (response.text or "").strip()

summary = summarize_abstractive(sample_text, max_words=60, style="bullet points")
print(summary)


1. Roses are highly valued ornamental plants with diverse uses and significant economic impact.
2. Breeding aims for new ornamental traits, improved production, disease resistance, and adaptation.
3. Rose breeding faces challenges due to genetic limitations and reproductive complexities.
4. Modern breeding utilizes conventional and molecular techniques, incorporating breeder insights.


This gives the numbered list we wanted.
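Since list formatting can vary between runs and prompts, a post-processing step is another option alongside the prompt change. The following sketch assumes each summary point sits on its own line; `to_numbered_list` is a hypothetical helper that strips any leading bullet symbol or number and renumbers the lines.

```python
import re

# Matches a leading bullet symbol (*, -, •) or an existing number such as "1." or "2)".
_MARKER = re.compile(r"^\s*(?:[*\-•]|\d+[.)])\s+")


def to_numbered_list(summary):
    """Rewrite every non-empty line as '1. ...', '2. ...', and so on."""
    items = []
    for line in summary.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between items
        items.append(_MARKER.sub("", line))
    return "\n".join(f"{i}. {item}" for i, item in enumerate(items, start=1))
```

For example, `to_numbered_list("*   First point.\n*   Second point.")` returns `"1. First point.\n2. Second point."`.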

Different tasks call for different summary styles. Common styles you can request include:

1. Concise – short paragraph, minimal words
2. Executive – highlights key decisions/outcomes
3. Bullet points – list of key ideas
4. Numbered list – ordered items (1., 2., 3.)
5. Simple – plain language, easy reading
6. Technical – preserves terminology and details
7. Highlights – emphasizes only the most important points
8. Action items – extracts tasks or next steps
9. Key takeaways – 3–5 core points
10. TL;DR – one‑sentence summary
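One way to support several of these styles is to map each style name to an explicit formatting instruction instead of passing the name straight into the prompt. The sketch below is illustrative: `STYLE_RULES`, `build_prompt`, and the wording of each rule are assumptions, and here "bullet points" and "numbered list" are kept as two distinct formats.

```python
# Hypothetical mapping from a style name to an explicit formatting instruction.
STYLE_RULES = {
    "concise": "Write one short paragraph using as few words as possible.",
    "executive": "Write one short paragraph that highlights key decisions and outcomes.",
    "bullet points": "Return a bulleted list, one item per line, each starting with '- '.",
    "numbered list": "Return a numbered list, one item per line, formatted like '1. ...'.",
    "simple": "Use plain language that is easy to read.",
    "technical": "Preserve technical terminology and important details.",
    "key takeaways": "Return 3 to 5 core points, one per line.",
    "tl;dr": "Return exactly one sentence.",
}


def build_prompt(text, max_words=120, style="concise"):
    """Build a summarisation prompt with an explicit rule for the chosen style."""
    rule = STYLE_RULES.get(style.lower())
    if rule is None:
        raise ValueError(f"Unknown style: {style!r}")
    return (
        "You are a helpful assistant that summarizes text.\n"
        f"Max {max_words} words. {rule}\n"
        "Return only the summary.\n\n"
        f"TEXT:\n{text}"
    )
```

The resulting string can be passed to `model.generate_content` in the same way as the prompts built earlier; rejecting unknown styles early avoids sending a vague instruction to the model.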