In [2]:
text = (
    "Solar energy is a renewable and sustainable power source that harnesses the energy of the sun to generate electricity or heat. It is one of the most promising alternatives to fossil fuels, primarily due to its abundance and environmental benefits. Solar energy can be captured through photovoltaic (PV) cells, which convert sunlight directly into electricity, or through solar thermal systems, which use mirrors or lenses to concentrate sunlight to heat water or other fluids, producing steam to drive turbines. Unlike conventional energy sources, solar energy produces little to no greenhouse gas emissions, making it an essential tool in mitigating climate change. The development of solar technologies has significantly reduced the cost of solar panels, making it more accessible for homes, businesses, and even large-scale power plants. Moreover, solar power systems can be deployed in diverse settings, from remote off-grid locations to urban rooftops, enabling energy independence and security. However, challenges such as energy storage, efficiency, and the intermittent nature of sunlight remain hurdles that researchers are actively working to overcome. Despite these challenges, solar energy continues to grow as a cornerstone of global efforts toward sustainability and a low-carbon future, with governments, corporations, and individuals increasingly investing in solar technologies to reduce their carbon footprints and energy costs. Solar energy is one of the most promising and rapidly growing sources of renewable energy in the world today, playing a critical role in the transition away from fossil fuels toward more sustainable, clean energy solutions. At its core, solar energy works by capturing the sun's abundant radiation and converting it into usable forms of energy, such as electricity or heat. "
    "Solar panels, made up of photovoltaic (PV) cells, absorb sunlight and convert it directly into electricity, while solar thermal systems use mirrors or lenses to concentrate sunlight to produce heat, often used in industrial applications or to warm water for residential use. Solar energy has become increasingly cost-competitive with traditional energy sources, with technological advances making solar power more affordable for residential, commercial, and industrial applications. Moreover, solar energy has the distinct advantage of being a clean energy source, producing no emissions or pollutants during electricity generation, which significantly reduces its environmental impact compared to fossil fuels like coal, oil, and natural gas. This makes solar energy a crucial part of global efforts to combat climate change and reduce the carbon footprint of energy production. Another major benefit of solar energy is its scalability and versatility; from small rooftop installations to large solar farms, it can be tailored to fit diverse needs, whether in urban areas or remote, off-grid locations. Solar power can also contribute to energy independence by reducing reliance on imported fossil fuels, providing local communities with greater control over their energy supply. Despite these benefits, challenges remain, such as the intermittency of sunlight, which makes energy storage systems like batteries essential for ensuring a reliable power supply. Furthermore, while solar energy's potential is vast, the efficiency of current solar technologies still leaves room for improvement. Nevertheless, as research and innovation continue, the outlook for solar energy remains optimistic, with the global solar market projected to expand significantly in the coming years. "
    "With increased government incentives, private sector investments, and growing public awareness of climate change, solar energy is set to play an even more vital role in the future energy landscape, offering a pathway to cleaner, more sustainable energy systems worldwide. Solar energy stands at the forefront of the global renewable energy revolution, offering an endless, eco-friendly, and increasingly efficient solution to the world’s growing energy demands. The sun provides an astonishing amount of energy each day—far more than humanity could ever consume—making solar power an inexhaustible resource when properly harnessed. Over the past few decades, technological advancements in photovoltaic cells, solar tracking systems, and energy storage have dramatically improved the efficiency and affordability of solar installations. Modern solar panels can now convert a greater portion of sunlight into electricity, while innovations in battery technology allow for the storage of excess power generated during the day for use at night or during cloudy periods. This has made solar energy more reliable and practical for both urban and rural settings, particularly in developing regions where access to electricity is limited. In addition to its environmental and economic advantages, solar power contributes to job creation and technological innovation. The solar industry employs millions of people worldwide in manufacturing, installation, maintenance, and research, stimulating economic growth while promoting environmental stewardship. Furthermore, governments across the globe are implementing policies and incentives—such as tax credits, feed-in tariffs, and subsidies—to encourage the adoption of solar energy systems. Beyond electricity generation, solar energy has diverse applications, including powering transportation through solar-powered vehicles, desalinating water in arid regions, and supporting agriculture through solar irrigation systems. "
    "These wide-ranging uses highlight the versatility of solar technology in addressing global challenges like energy poverty, water scarcity, and climate change. While the transition to a fully solar-powered world requires continued investment in infrastructure, storage, and grid modernization, the progress made so far demonstrates that solar energy is not just an alternative—it is becoming a cornerstone of a cleaner, greener, and more sustainable future for all. Solar energy represents a transformative force in the quest for sustainable development, bridging the gap between modern energy needs and environmental preservation. As global populations grow and industrialization accelerates, the demand for clean, reliable, and affordable energy becomes increasingly urgent. Solar power addresses this demand by offering an energy source that is renewable, accessible, and virtually limitless. The environmental benefits of solar energy extend beyond its zero-emission operation—it also helps reduce air pollution, minimize water usage in energy production, and decrease dependence on depleting fossil fuel reserves. Large-scale solar farms are now being established worldwide, feeding clean electricity directly into national grids, while smaller decentralized systems empower rural communities, allowing homes, schools, and businesses to operate independently of traditional infrastructure. In addition, the integration of solar energy with smart grids and Internet of Things (IoT) technologies has opened new possibilities for efficient energy management, enabling consumers to monitor and optimize their energy use in real time. Solar-powered innovations, such as portable solar chargers, solar streetlights, and even solar-powered desalination plants, demonstrate the adaptability and creative potential of this renewable resource. "
    "On a broader scale, the widespread adoption of solar technology contributes to global climate goals by reducing carbon emissions and supporting international commitments like the Paris Agreement. The continuous research into next-generation solar cells—such as perovskite and organic photovoltaics—promises even higher efficiency and lower production costs, paving the way for a future where solar energy could become the dominant source of global power. Ultimately, solar energy is not just an alternative solution; it is a symbol of progress and hope for a cleaner, more equitable, and resilient planet, guiding humanity toward a sustainable energy future that harmonizes technology with nature."
)

In [3]:
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def dynamic_summarize(text, max_chunk_tokens=800, summary_max_length=200, summary_min_length=50):
    """
    Dynamically summarizes long text by chunking and recursive summarization.
    
    Args:
        text (str): The text to summarize.
        max_chunk_tokens (int): Approximate chunk size, counted in whitespace-separated
            words (not model tokens, despite the name).
        summary_max_length (int): Max length, in tokens, of each generated summary.
        summary_min_length (int): Min length, in tokens, of each generated summary.
        
    Returns:
        str: Final combined summary.
    """

    # Split into sentences to avoid breaking context abruptly
    sentences = text.split(". ")
    
    # Create chunks
    chunks = []
    current_chunk = []
    current_len = 0
    
    for sentence in sentences:
        sentence_len = len(sentence.split())
        if current_len + sentence_len <= max_chunk_tokens:
            current_chunk.append(sentence)
            current_len += sentence_len
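        else:
            # Flush the current chunk (if any) so an over-long first sentence
            # does not produce an empty chunk
            if current_chunk:
                chunks.append(". ".join(current_chunk))
            current_chunk = [sentence]
            current_len = sentence_len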
    if current_chunk:
        chunks.append(". ".join(current_chunk))
    
    # Summarize each chunk
    summaries = []
    for chunk in chunks:
        summary = summarizer(
            chunk,
            max_length=summary_max_length,
            min_length=summary_min_length,
            do_sample=False
        )[0]['summary_text']
        summaries.append(summary)
    
    # Combine and summarize again (hierarchical)
    combined_summary = " ".join(summaries)
    final_summary = summarizer(
        combined_summary,
        max_length=summary_max_length,
        min_length=summary_min_length,
        do_sample=False
    )[0]['summary_text']
    
    return final_summary


# Example usage
summary = dynamic_summarize(text)
print(summary)




Device set to use cpu
Your max_length is set to 200, but your input_length is only 128. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=64)


Solar energy represents a transformative force in the quest for sustainable development. Large-scale solar farms are now being established worldwide. The integration of solar energy with smart grids and Internet of Things (IoT) technologies has opened new possibilities for efficient energy management.
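
The warning above is triggered by the second pass: the combined chunk summaries are only 128 tokens long, yet the call still allows up to 200 output tokens. One possible way to avoid it is to derive the length bounds from the input size before each call. The helper below is a minimal sketch; `clamp_summary_lengths` and its `ratio` parameter are hypothetical names, and the token count is assumed to come from something like `len(summarizer.tokenizer(text)["input_ids"])`.

```python
def clamp_summary_lengths(input_tokens, max_length=200, min_length=50, ratio=0.5):
    """Return (max_length, min_length) scaled so the summary stays shorter than the input."""
    # Cap the maximum at a fraction of the input length (assumed default: half).
    capped_max = max(1, min(max_length, int(input_tokens * ratio)))
    # The minimum must always stay below the maximum.
    capped_min = max(0, min(min_length, capped_max - 1))
    return capped_max, capped_min
```

For the 128-token input reported in the warning, this returns a maximum of 64, which matches the value the warning itself suggests.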


In [6]:
points = summary.split(".")
for point in points:
    print(point)

Solar energy represents a transformative force in the quest for sustainable development
 Large-scale solar farms are now being established worldwide
 The integration of solar energy with smart grids and Internet of Things (IoT) technologies has opened new possibilities for efficient energy management
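
Splitting on a bare `"."` leaves a leading space on every point after the first, emits an empty trailing item, and would also break decimals such as `3.5`. A small alternative, sketched here with the hypothetical helper name `split_points`, splits only where sentence punctuation is followed by whitespace and trims each piece:

```python
import re

def split_points(summary):
    """Split a summary into trimmed, non-empty sentences."""
    # Break after ., ! or ? only when whitespace follows, so decimals
    # such as 3.5 are left intact.
    parts = re.split(r"(?<=[.!?])\s+", summary.strip())
    return [p.strip() for p in parts if p.strip()]
```

This is a heuristic rather than a full sentence tokenizer; abbreviations followed by a space (for example "e.g. ") would still be split.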



In [7]:
from transformers import pipeline

# Load summarization model
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def adaptive_summarize(text):
    """
    Summarizes long text dynamically with adaptive chunking and summary length.
    """

    # Adjust parameters based on text length
    word_count = len(text.split())
    max_chunk_size = 800  # words per chunk (adjust if model supports longer context)
    
    # Determine summary length dynamically
    if word_count < 800:
        max_len = 180
        min_len = 80
    elif word_count < 2000:
        max_len = 250
        min_len = 100
    elif word_count < 4000:
        max_len = 400
        min_len = 150
    else:
        max_len = 600
        min_len = 200

    # Split text into chunks
    sentences = text.split(". ")
    chunks = []
    current_chunk = []
    current_len = 0

    for sentence in sentences:
        sent_len = len(sentence.split())
        if current_len + sent_len <= max_chunk_size:
            current_chunk.append(sentence)
            current_len += sent_len
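        else:
            # Flush the current chunk (if any) so an over-long first sentence
            # does not produce an empty chunk
            if current_chunk:
                chunks.append(". ".join(current_chunk))
            current_chunk = [sentence]
            current_len = sent_len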
    if current_chunk:
        chunks.append(". ".join(current_chunk))

    # Summarize each chunk
    chunk_summaries = []
    for i, chunk in enumerate(chunks):
        summary = summarizer(
            chunk,
            max_length=max_len,
            min_length=min_len,
            do_sample=False
        )[0]['summary_text']
        chunk_summaries.append(summary)

    # Combine chunk summaries and summarize again (for coherence)
    combined_summary = " ".join(chunk_summaries)
    final_summary = summarizer(
        combined_summary,
        max_length=max_len,
        min_length=min_len,
        do_sample=False
    )[0]['summary_text']

    return final_summary


# Example usage
summary = adaptive_summarize(text)
print(summary)

Device set to use cpu
Your max_length is set to 250, but your input_length is only 213. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=106)


Solar energy is one of the most promising alternatives to fossil fuels. It produces little to no greenhouse gas emissions, making it an essential tool in mitigating climate change. Large-scale solar farms are now being established worldwide, feeding clean electricity directly into national grids. The integration of solar energy with smart grids and Internet of Things (IoT) technologies has opened new possibilities for efficient energy management. Next-generation solar cells promise even higher efficiency and lower production costs, paving the way for a future where solar energy could become the dominant source of global power.


In [8]:
points = summary.split(".")
for point in points:
    print(point)

Solar energy is one of the most promising alternatives to fossil fuels
 It produces little to no greenhouse gas emissions, making it an essential tool in mitigating climate change
 Large-scale solar farms are now being established worldwide, feeding clean electricity directly into national grids
 The integration of solar energy with smart grids and Internet of Things (IoT) technologies has opened new possibilities for efficient energy management
 Next-generation solar cells promise even higher efficiency and lower production costs, paving the way for a future where solar energy could become the dominant source of global power
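
The chunk limit in the cells above is measured in whitespace-separated words, while the model's input limit is measured in tokens, and a word often maps to more than one token. The sketch below packs sentences against a budget using a pluggable size function, so the same loop can count words or tokens. `chunk_by_budget` and `count_tokens` are hypothetical names introduced for illustration; the default simply counts words, as the notebook does.

```python
def chunk_by_budget(sentences, budget, count_tokens=lambda s: len(s.split())):
    """Greedily pack sentences into chunks whose measured size stays within budget."""
    chunks, current, used = [], [], 0
    for sentence in sentences:
        size = count_tokens(sentence)
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and used + size > budget:
            chunks.append(" ".join(current))
            current, used = [], 0
        current.append(sentence)
        used += size
    if current:
        chunks.append(" ".join(current))
    return chunks
```

To count real tokens, one option would be to pass `count_tokens=lambda s: len(summarizer.tokenizer(s, add_special_tokens=False)["input_ids"])` and choose a budget somewhat below the 1024 positions that `facebook/bart-large-cnn` accepts, leaving room for special tokens. A single sentence larger than the budget still becomes its own oversized chunk.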



In [10]:
def safe_detailed_summary(text):
    """
    Summarizes very long text safely (no IndexError),
    by chunking adaptively and merging summaries.
    """

    words = text.split()
    chunk_size = 800  # words per chunk; BART-large-CNN accepts at most 1024 tokens, so 800 words can still overflow
    chunks = [' '.join(words[i:i+chunk_size]) for i in range(0, len(words), chunk_size)]

    partial_summaries = []

    for i, chunk in enumerate(chunks):
        try:
            # skip empty or very short chunks
            if len(chunk.strip().split()) < 50:
                continue

            summary = summarizer(
                chunk,
                max_length=350,   # how long each chunk summary can be
                min_length=120,
                do_sample=False
            )[0]['summary_text']

            partial_summaries.append(summary)

        except IndexError:
            print(f"⚠️ Skipping chunk {i+1}: too long for model.")
            continue
        except Exception as e:
            print(f"Error on chunk {i+1}: {e}")
            continue

    if not partial_summaries:
        return "⚠️ Text too short or all chunks failed."

    # Combine the chunk summaries
    combined_summary = " ".join(partial_summaries)

    # Summarize again for final coherence
    final_summary = summarizer(
        combined_summary,
        max_length=500,
        min_length=200,
        do_sample=False
    )[0]['summary_text']

    return final_summary


# Example usage
summary = safe_detailed_summary(text)
print(summary)


Your max_length is set to 500, but your input_length is only 250. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=125)


Solar energy is a renewable and sustainable power source that harnesses the energy of the sun to generate electricity or heat. It is one of the most promising alternatives to fossil fuels, primarily due to its abundance and environmental benefits. The development of solar technologies has significantly reduced the cost of solar panels, making it more accessible for homes, businesses, and even large-scale power plants. Despite these benefits, challenges remain, such as the intermittency of sunlight, which makes energy storage systems essential for ensuring a reliable power supply. Nevertheless, as research and innovation continue, the outlook for solar energy remains optimistic, with the global solar market projected to expand significantly. The transition to a fully solar-powered world requires continued investment in infrastructure, storage, and grid modernization. But the progress made so far demonstrates that solar energy is not just an alternative—it is becoming a cornerstone of a 

In [11]:
points = summary.split(".")
for point in points:
    print(point)

Solar energy is a renewable and sustainable power source that harnesses the energy of the sun to generate electricity or heat
 It is one of the most promising alternatives to fossil fuels, primarily due to its abundance and environmental benefits
 The development of solar technologies has significantly reduced the cost of solar panels, making it more accessible for homes, businesses, and even large-scale power plants
 Despite these benefits, challenges remain, such as the intermittency of sunlight, which makes energy storage systems essential for ensuring a reliable power supply
 Nevertheless, as research and innovation continue, the outlook for solar energy remains optimistic, with the global solar market projected to expand significantly
 The transition to a fully solar-powered world requires continued investment in infrastructure, storage, and grid modernization
 But the progress made so far demonstrates that solar energy is not just an alternative—it is becoming a cornerstone of a 
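
The two-pass pattern used in these cells (summarize each chunk, then summarize the joined results) does not depend on the model. A minimal sketch, assuming a hypothetical `summarize_fn` callable that maps a string to its summary, also shows how to skip the second pass when there is only one partial summary:

```python
def hierarchical_summarize(chunks, summarize_fn):
    """Summarize each chunk, then summarize the joined partial summaries once more."""
    partials = [summarize_fn(chunk) for chunk in chunks if chunk.strip()]
    if not partials:
        return ""
    # A single partial summary needs no second pass.
    if len(partials) == 1:
        return partials[0]
    return summarize_fn(" ".join(partials))
```

With the pipeline from this notebook, `summarize_fn` could be something like `lambda t: summarizer(t, max_length=200, min_length=50, do_sample=False)[0]["summary_text"]`.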

In [13]:
import spacy

# Load spaCy English model
nlp = spacy.load("en_core_web_sm")


def safe_detailed_summary(text, chunk_word_limit=800, min_chunk_words=50):
    """
    Summarizes long text safely using chunking and merging, with spaCy for sentence tokenization.
    """
    if not text.strip():
        return "⚠️ Input text is empty."

    # Tokenize sentences using spaCy
    doc = nlp(text)
    sentences = [sent.text.strip() for sent in doc.sents if sent.text.strip()]

    chunks = []
    current_chunk = []
    word_count = 0

    # Build chunks respecting sentence boundaries
    for sentence in sentences:
        sentence_word_count = len(sentence.split())
        if word_count + sentence_word_count > chunk_word_limit and current_chunk:
            chunks.append(" ".join(current_chunk))
            current_chunk = []
            word_count = 0
        current_chunk.append(sentence)
        word_count += sentence_word_count
    if current_chunk:
        chunks.append(" ".join(current_chunk))

    partial_summaries = []

    for i, chunk in enumerate(chunks):
        if len(chunk.split()) < min_chunk_words:
            continue  # skip tiny chunks
        try:
            summary = summarizer(
                chunk,
                max_length=350,
                min_length=120,
                do_sample=False
            )[0]['summary_text']
            partial_summaries.append(summary)
        except Exception as e:
            print(f"⚠️ Skipping chunk {i+1} due to error: {e}")
            continue

    if not partial_summaries:
        return "⚠️ Text too short or all chunks failed."

    # Only combine and summarize if multiple partial summaries exist
    if len(partial_summaries) > 1:
        combined_summary = " ".join(partial_summaries)
        try:
            final_summary = summarizer(
                combined_summary,
                max_length=500,
                min_length=200,
                do_sample=False
            )[0]['summary_text']
        except Exception as e:
            print(f"⚠️ Final summarization failed: {e}")
            final_summary = combined_summary
        return final_summary

    return partial_summaries[0]

# Example usage:
summary = safe_detailed_summary(text)
print(summary)


Your max_length is set to 500, but your input_length is only 250. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=125)


Solar energy is a renewable and sustainable power source that harnesses the energy of the sun to generate electricity or heat. It is one of the most promising alternatives to fossil fuels, primarily due to its abundance and environmental benefits. Large-scale solar farms are now being established worldwide, feeding clean electricity directly into national grids. Next-generation solar cells promise even higher efficiency and lower production costs, paving the way for a future where solar energy could become the dominant source of global power. The transition to a fully solar-powered world requires continued investment in infrastructure, storage, and grid modernization. The integration of solar energy with smart grids and Internet of Things (IoT) technologies has opened new possibilities for efficient energy management. The outlook for solar energy remains optimistic, with the global solar market projected to expand significantly. The development of solar technologies has significantly r

In [14]:
points = summary.split(".")
for point in points:
    print(point)

Solar energy is a renewable and sustainable power source that harnesses the energy of the sun to generate electricity or heat
 It is one of the most promising alternatives to fossil fuels, primarily due to its abundance and environmental benefits
 Large-scale solar farms are now being established worldwide, feeding clean electricity directly into national grids
 Next-generation solar cells promise even higher efficiency and lower production costs, paving the way for a future where solar energy could become the dominant source of global power
 The transition to a fully solar-powered world requires continued investment in infrastructure, storage, and grid modernization
 The integration of solar energy with smart grids and Internet of Things (IoT) technologies has opened new possibilities for efficient energy management
 The outlook for solar energy remains optimistic, with the global solar market projected to expand significantly
 The development of solar technologies has significantly r

In [15]:
import spacy

# Load spaCy
nlp = spacy.load("en_core_web_sm")


def chunk_text(text, max_words=800, overlap_sentences=2):
    """Chunk text into manageable pieces using sentence boundaries with optional overlap"""
    doc = nlp(text)
    sentences = [sent.text.strip() for sent in doc.sents if sent.text.strip()]
    chunks = []
    current_chunk = []
    word_count = 0

    for i, sentence in enumerate(sentences):
        sentence_word_count = len(sentence.split())
        if word_count + sentence_word_count > max_words and current_chunk:
            chunks.append(" ".join(current_chunk))
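            # Overlap: carry the last few sentences into the next chunk.
            # Guard against 0, because current_chunk[-0:] would keep the whole list.
            current_chunk = current_chunk[-overlap_sentences:] if overlap_sentences > 0 else []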
            word_count = sum(len(s.split()) for s in current_chunk)
        current_chunk.append(sentence)
        word_count += sentence_word_count

    if current_chunk:
        chunks.append(" ".join(current_chunk))

    return chunks

def summarize_text(text, max_chunk_words=800, min_chunk_words=50):
    chunks = chunk_text(text, max_words=max_chunk_words)
    partial_summaries = []

    for i, chunk in enumerate(chunks):
        if len(chunk.split()) < min_chunk_words:
            continue
        try:
            summary = summarizer(chunk, max_length=350, min_length=120, do_sample=False)[0]['summary_text']
            partial_summaries.append(summary)
        except Exception as e:
            print(f"⚠️ Skipping chunk {i+1} due to error: {e}")

    if not partial_summaries:
        return "⚠️ Text too short or all chunks failed."

    # If multiple partial summaries, summarize them again
    if len(partial_summaries) > 1:
        combined_summary = " ".join(partial_summaries)
        try:
            final_summary = summarizer(
                combined_summary,
                max_length=500,
                min_length=200,
                do_sample=False
            )[0]['summary_text']
            return final_summary
        except Exception as e:
            print(f"⚠️ Final summarization failed, returning combined partial summaries: {e}")
            return combined_summary

    return partial_summaries[0]

# Example usage:
summary = summarize_text(text)
print(summary)


Your max_length is set to 500, but your input_length is only 256. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=128)


Solar energy is becoming a cornerstone of a cleaner, greener, and more sustainable future for all. The transition to a fully solar-powered world requires continued investment in infrastructure, storage, and grid modernization. Large-scale solar farms are now being established worldwide, feeding clean electricity directly into national grids. The integration of solar energy with smart grids and Internet of Things (IoT) technologies has opened new possibilities for efficient energy management. Ultimately, solar energy is not just an alternative solution; it is a symbol of progress and hope for a clean, more equitable, and resilient planet, guiding humanity toward a sustainable energy future. It is one of the most promising alternatives to fossil fuels, primarily due to its abundance and environmental benefits. The outlook for solar energy remains optimistic, with the global solar market projected to expand significantly. The development of solar technologies has significantly reduced the

In [16]:
points = summary.split(".")
for point in points:
    print(point)

Solar energy is becoming a cornerstone of a cleaner, greener, and more sustainable future for all
 The transition to a fully solar-powered world requires continued investment in infrastructure, storage, and grid modernization
 Large-scale solar farms are now being established worldwide, feeding clean electricity directly into national grids
 The integration of solar energy with smart grids and Internet of Things (IoT) technologies has opened new possibilities for efficient energy management
 Ultimately, solar energy is not just an alternative solution; it is a symbol of progress and hope for a clean, more equitable, and resilient planet, guiding humanity toward a sustainable energy future
 It is one of the most promising alternatives to fossil fuels, primarily due to its abundance and environmental benefits
 The outlook for solar energy remains optimistic, with the global solar market projected to expand significantly
 The development of solar technologies has significantly reduced the

In [17]:

def chunk_text(text, max_words=800, overlap_sentences=2):
    """Chunk text into sentence-bound pieces with optional overlap"""
    doc = nlp(text)
    sentences = [sent.text.strip() for sent in doc.sents if sent.text.strip()]
    chunks = []
    current_chunk = []
    word_count = 0

    for i, sentence in enumerate(sentences):
        sentence_word_count = len(sentence.split())
        if word_count + sentence_word_count > max_words and current_chunk:
            chunks.append(" ".join(current_chunk))
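            # Keep overlap sentences for the next chunk.
            # Guard against 0, because current_chunk[-0:] would keep the whole list.
            current_chunk = current_chunk[-overlap_sentences:] if overlap_sentences > 0 else []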
            word_count = sum(len(s.split()) for s in current_chunk)
        current_chunk.append(sentence)
        word_count += sentence_word_count

    if current_chunk:
        chunks.append(" ".join(current_chunk))
    return chunks

def bullet_summarize(text, max_chunk_words=800, min_chunk_words=50):
    """
    Summarizes text into detailed bullet points.
    """
    chunks = chunk_text(text, max_words=max_chunk_words)
    bullet_points = []

    for i, chunk in enumerate(chunks):
        if len(chunk.split()) < min_chunk_words:
            continue
        try:
            # Summarize each chunk
            summary = summarizer(chunk, max_length=350, min_length=120, do_sample=False)[0]['summary_text']
            # Split into sentences and format as bullets
            sentences = [s.strip() for s in summary.replace("\n", " ").split(". ") if s.strip()]
            bullets = [f"• {s}." if not s.endswith('.') else f"• {s}" for s in sentences]
            bullet_points.extend(bullets)
        except Exception as e:
            print(f"⚠️ Skipping chunk {i+1} due to error: {e}")

    # Combine all bullets into a final string
    if not bullet_points:
        return "⚠️ Text too short or all chunks failed."

    # Optional: summarize the bullets again if very long
    combined_bullets_text = " ".join([b.replace("• ", "") for b in bullet_points])
    if len(combined_bullets_text.split()) > 800:  # if too long, summarize again
        try:
            final_summary = summarizer(
                combined_bullets_text,
                max_length=500,
                min_length=200,
                do_sample=False
            )[0]['summary_text']
            # Split into bullets again
            sentences = [s.strip() for s in final_summary.replace("\n", " ").split(". ") if s.strip()]
            bullet_points = [f"• {s}." if not s.endswith('.') else f"• {s}" for s in sentences]
        except Exception as e:
            print(f"⚠️ Final summarization failed: {e}")

    return "\n".join(bullet_points)

# Example usage:
summary = bullet_summarize(text)
print(summary)


• Solar energy is a renewable and sustainable power source that harnesses the energy of the sun to generate electricity or heat.
• It is one of the most promising alternatives to fossil fuels, primarily due to its abundance and environmental benefits.
• The development of solar technologies has significantly reduced the cost of solar panels, making it more accessible for homes, businesses, and even large-scale power plants.
• Despite these benefits, challenges remain, such as the intermittency of sunlight, which makes energy storage systems essential for ensuring a reliable power supply.
• Nevertheless, as research and innovation continue, the outlook for solar energy remains optimistic, with the global solar market projected to expand significantly.
• Solar energy is becoming a cornerstone of a cleaner, greener, and more sustainable future for all.
• The transition to a fully solar-powered world requires continued investment in infrastructure, storage, and grid modernization.
• Large-
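
Because `chunk_text` repeats the last sentences of one chunk at the start of the next, the same point can be summarized more than once. A dependency-free way to drop such repeats is sketched below. It uses word-set (Jaccard) overlap rather than embeddings, so it only catches bullets that share most of their wording; `jaccard`, `drop_near_duplicates`, and the `threshold` default are assumptions made for illustration.

```python
import re

def jaccard(a, b):
    """Word-set overlap between two strings, in [0, 1]."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)

def drop_near_duplicates(bullets, threshold=0.8):
    """Keep the first occurrence of each bullet; drop later ones that overlap too much."""
    kept = []
    for bullet in bullets:
        # Compare against everything already kept, preserving original order.
        if all(jaccard(bullet, k) < threshold for k in kept):
            kept.append(bullet)
    return kept
```

The comparison is quadratic in the number of bullets, which is fine for the handful of points a summary produces.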

In [None]:
from sentence_transformers import SentenceTransformer, util
embedder = SentenceTransformer('all-MiniLM-L6-v2')  # lightweight, fast

def chunk_text(text, max_words=800, overlap_sentences=2):
    """Chunk text into sentence-bound pieces with optional overlap"""
    doc = nlp(text)
    sentences = [sent.text.strip() for sent in doc.sents if sent.text.strip()]
    chunks = []
    current_chunk = []
    word_count = 0

    for i, sentence in enumerate(sentences):
        sentence_word_count = len(sentence.split())
        if word_count + sentence_word_count > max_words and current_chunk:
            chunks.append(" ".join(current_chunk))
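            # Overlap; guard against 0, because current_chunk[-0:] would keep the whole list
            current_chunk = current_chunk[-overlap_sentences:] if overlap_sentences > 0 else []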
            word_count = sum(len(s.split()) for s in current_chunk)
        current_chunk.append(sentence)
        word_count += sentence_word_count

    if current_chunk:
        chunks.append(" ".join(current_chunk))
    return chunks

def remove_similar_bullets(bullets, similarity_threshold=0.85):
    """Remove bullets that are too similar to each other"""
    embeddings = embedder.encode(bullets, convert_to_tensor=True)
    keep = []
    for i, emb in enumerate(embeddings):
        is_duplicate = False
        for j in keep:
            sim = util.cos_sim(emb, embeddings[j]).item()
            if sim > similarity_threshold:
                is_duplicate = True
                break
        if not is_duplicate:
            keep.append(i)
    return [bullets[i] for i in keep]

def bullet_summarize_unique(text, max_chunk_words=800, min_chunk_words=50):
    """Hierarchical bullet-point summarization with duplicate removal"""
    chunks = chunk_text(text, max_words=max_chunk_words)
    bullet_points = []

    for i, chunk in enumerate(chunks):
        if len(chunk.split()) < min_chunk_words:
            continue
        try:
            summary = summarizer(chunk, max_length=350, min_length=120, do_sample=False)[0]['summary_text']
            sentences = [s.strip() for s in summary.replace("\n", " ").split(". ") if s.strip()]
            bullets = [f"• {s}." if not s.endswith('.') else f"• {s}" for s in sentences]
            bullet_points.extend(bullets)
        except Exception as e:
            print(f"⚠️ Skipping chunk {i+1} due to error: {e}")

    if not bullet_points:
        return "⚠️ Text too short or all chunks failed."

    # Remove duplicates / very similar points
    bullet_points = remove_similar_bullets(bullet_points)

    # Optional: summarize again if too long
    combined_text = " ".join([b.replace("• ", "") for b in bullet_points])
    if len(combined_text.split()) > 800:
        try:
            final_summary = summarizer(
                combined_text,
                max_length=500,
                min_length=200,
                do_sample=False
            )[0]['summary_text']
            sentences = [s.strip() for s in final_summary.replace("\n", " ").split(". ") if s.strip()]
            bullet_points = [f"• {s}." if not s.endswith('.') else f"• {s}" for s in sentences]
            bullet_points = remove_similar_bullets(bullet_points)
        except Exception as e:
            print(f"⚠️ Final summarization failed: {e}")

    return "\n".join(bullet_points)

# Example usage:
summary = bullet_summarize_unique(text)
print(summary)
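
The duplicate-removal step in `remove_similar_bullets` is a greedy filter: a bullet survives only if it is not too close to any bullet already kept, so the result depends on input order. A minimal sketch of that logic, assuming toy 2-D vectors in place of real sentence embeddings (the vectors and the `greedy_dedup` and `cosine` names are made up for illustration):

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def greedy_dedup(items, vectors, threshold=0.85):
    """Keep an item only if no previously kept item exceeds the similarity threshold."""
    keep = []
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) <= threshold for j in keep):
            keep.append(i)
    return [items[i] for i in keep]

# Toy "embeddings" chosen by hand, not produced by any model.
bullets = ["Solar is cheap.", "Solar is affordable.", "Storage is hard."]
vectors = [(1.0, 0.0), (0.98, 0.2), (0.0, 1.0)]
unique = greedy_dedup(bullets, vectors)
print(unique)
```

Because the first occurrence always wins, sorting bullets by some quality score before filtering would change which near-duplicate is retained.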

In [19]:
from sentence_transformers import util
from collections import defaultdict


# Predefined topics with keywords (can be extended)
TOPICS = {
    "Overview": ["overview", "introduction", "summary", "general", "background"],
    "Benefits": ["benefit", "advantage", "positive", "impact", "gain"],
    "Challenges": ["challenge", "problem", "issue", "difficulty", "limitation", "hurdle"],
    "Technology": ["technology", "method", "technique", "system", "innovation", "process"],
    "Applications": ["application", "use", "deployment", "implementation", "utilization"],
    "Future Outlook": ["future", "prospect", "trend", "growth", "development", "potential"]
}

def chunk_text(text, max_words=800, overlap_sentences=2):
    """Chunk text into sentence-bound pieces with optional overlap"""
    doc = nlp(text)
    sentences = [sent.text.strip() for sent in doc.sents if sent.text.strip()]
    chunks = []
    current_chunk = []
    word_count = 0

    for sentence in sentences:
        sentence_word_count = len(sentence.split())
        if word_count + sentence_word_count > max_words and current_chunk:
            chunks.append(" ".join(current_chunk))
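            # Carry the tail forward; [-0:] would keep the whole chunk, so guard zero.
            current_chunk = current_chunk[-overlap_sentences:] if overlap_sentences > 0 else []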
            word_count = sum(len(s.split()) for s in current_chunk)
        current_chunk.append(sentence)
        word_count += sentence_word_count

    if current_chunk:
        chunks.append(" ".join(current_chunk))
    return chunks

def remove_similar_bullets(bullets, similarity_threshold=0.85):
    embeddings = embedder.encode(bullets, convert_to_tensor=True)
    keep = []
    for i, emb in enumerate(embeddings):
        is_duplicate = False
        for j in keep:
            sim = util.cos_sim(emb, embeddings[j]).item()
            if sim > similarity_threshold:
                is_duplicate = True
                break
        if not is_duplicate:
            keep.append(i)
    return [bullets[i] for i in keep]

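def assign_topic(bullet):
    """Return the first topic whose keyword list matches the bullet.

    Matching is by substring, so "use" also matches "because" and "gain"
    matches "against"; topics are tried in TOPICS order and the first hit wins.
    """
    bullet_lower = bullet.lower()
    for topic, keywords in TOPICS.items():
        if any(k in bullet_lower for k in keywords):
            return topic
    return "Miscellaneous"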

def structured_bullet_summarize(text, max_chunk_words=800, min_chunk_words=50):
    """Bullet-point summarization grouped under the predefined TOPICS headings"""
    chunks = chunk_text(text, max_words=max_chunk_words)
    bullet_points = []

    # Summarize each chunk into bullets
    for i, chunk in enumerate(chunks):
        if len(chunk.split()) < min_chunk_words:
            continue
        try:
            summary = summarizer(chunk, max_length=350, min_length=120, do_sample=False)[0]['summary_text']
            sentences = [s.strip() for s in summary.replace("\n", " ").split(". ") if s.strip()]
            bullets = [f"• {s}." if not s.endswith('.') else f"• {s}" for s in sentences]
            bullet_points.extend(bullets)
        except Exception as e:
            print(f"⚠️ Skipping chunk {i+1} due to error: {e}")

    if not bullet_points:
        return "⚠️ Text too short or all chunks failed."

    # Remove duplicates
    bullet_points = remove_similar_bullets(bullet_points)

    # Assign bullets to topics
    structured_summary = defaultdict(list)
    for bullet in bullet_points:
        topic = assign_topic(bullet)
        structured_summary[topic].append(bullet)

    # Format final structured summary
    output_lines = []
    for topic in TOPICS.keys():
        if structured_summary[topic]:
            output_lines.append(f"### {topic}")
            output_lines.extend(structured_summary[topic])
            output_lines.append("")  # empty line for spacing

    # Add miscellaneous bullets at the end
    if structured_summary["Miscellaneous"]:
        output_lines.append("### Miscellaneous")
        output_lines.extend(structured_summary["Miscellaneous"])

    return "\n".join(output_lines)

# Example usage:
summary = structured_bullet_summarize(text)
print(summary)


### Benefits
• It is one of the most promising alternatives to fossil fuels, primarily due to its abundance and environmental benefits.
• Despite these benefits, challenges remain, such as the intermittency of sunlight, which makes energy storage systems essential for ensuring a reliable power supply.

### Technology
• Nevertheless, as research and innovation continue, the outlook for solar energy remains optimistic, with the global solar market projected to expand significantly.

### Future Outlook
• The development of solar technologies has significantly reduced the cost of solar panels, making it more accessible for homes, businesses, and even large-scale power plants.
• Solar energy is becoming a cornerstone of a cleaner, greener, and more sustainable future for all.
• Ultimately, solar energy is not just an alternative solution; it is a symbol of progress and hope for a clean, more equitable, and resilient planet, guiding humanity toward a sustainable energy future.

### Miscellan
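
In the output above, the bullet about challenges is filed under Benefits and the outlook bullet under Technology, which is consistent with `assign_topic` returning on the first substring hit. One possible alternative, sketched here under assumptions, is to count whole-word keyword hits per topic and pick the highest count. The trimmed `TOPIC_KEYWORDS` table and the `score_topics` and `assign_topic_scored` names are hypothetical and not part of the notebook:

```python
import re

# Hypothetical, shortened keyword table used only for this illustration.
TOPIC_KEYWORDS = {
    "Benefits": ["benefit", "advantage"],
    "Challenges": ["challenge", "hurdle", "limitation"],
    "Applications": ["application", "use", "deployment"],
}

def score_topics(bullet, topics=TOPIC_KEYWORDS):
    """Count whole-word keyword hits per topic (an optional trailing 's' is allowed)."""
    text = bullet.lower()
    scores = {}
    for topic, keywords in topics.items():
        pattern = r"\b(?:" + "|".join(map(re.escape, keywords)) + r")s?\b"
        scores[topic] = len(re.findall(pattern, text))
    return scores

def assign_topic_scored(bullet, topics=TOPIC_KEYWORDS):
    """Pick the topic with the most hits; ties go to the earlier topic in the table."""
    scores = score_topics(bullet, topics)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Miscellaneous"

print(assign_topic_scored("Despite these benefits, challenges and hurdles remain."))
print(assign_topic_scored("Costs fell because panels improved."))
```

The word-boundary pattern keeps "use" from matching inside "because", and counting hits lets a bullet that mentions benefits once but challenges twice land under Challenges.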

In [22]:
from sentence_transformers import util
from sklearn.cluster import AgglomerativeClustering
from collections import defaultdict


def chunk_text(text, max_words=800, overlap_sentences=2):
    """Chunk text into sentence-bound pieces with optional overlap"""
    doc = nlp(text)
    sentences = [sent.text.strip() for sent in doc.sents if sent.text.strip()]
    chunks = []
    current_chunk = []
    word_count = 0

    for sentence in sentences:
        sentence_word_count = len(sentence.split())
        if word_count + sentence_word_count > max_words and current_chunk:
            chunks.append(" ".join(current_chunk))
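            # Carry the tail forward; [-0:] would keep the whole chunk, so guard zero.
            current_chunk = current_chunk[-overlap_sentences:] if overlap_sentences > 0 else []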
            word_count = sum(len(s.split()) for s in current_chunk)
        current_chunk.append(sentence)
        word_count += sentence_word_count

    if current_chunk:
        chunks.append(" ".join(current_chunk))
    return chunks

def remove_similar_bullets(bullets, similarity_threshold=0.85):
    embeddings = embedder.encode(bullets, convert_to_tensor=True)
    keep = []
    for i, emb in enumerate(embeddings):
        is_duplicate = False
        for j in keep:
            sim = util.cos_sim(emb, embeddings[j]).item()
            if sim > similarity_threshold:
                is_duplicate = True
                break
        if not is_duplicate:
            keep.append(i)
    return [bullets[i] for i in keep]

def cluster_bullets(bullets, n_clusters=None):
    """
    Cluster bullets based on semantic similarity.
    If n_clusters is None, it uses a distance threshold automatically.
    """
    embeddings = embedder.encode(bullets)
    
    if len(bullets) == 1:
        return {0: [bullets[0]]}  # Single bullet

    # scikit-learn >= 1.2 takes `metric`, which replaced the older `affinity` argument
    if n_clusters is None:
        clustering = AgglomerativeClustering(
            n_clusters=None,
            metric='cosine',
            linkage='average',
            distance_threshold=0.4
        )
    else:
        clustering = AgglomerativeClustering(
            n_clusters=n_clusters,
            metric='cosine',
            linkage='average'
        )

    labels = clustering.fit_predict(embeddings)
    clustered = defaultdict(list)
    for label, bullet in zip(labels, bullets):
        clustered[label].append(bullet)
    return clustered

def generate_topic_names(clustered_bullets):
    """
    Generate a simple topic name for each cluster by extracting common keywords.
    """
    topics = {}
    for label, bullets in clustered_bullets.items():
        words = []
        for bullet in bullets:
            doc = nlp(bullet)
            words.extend([token.lemma_ for token in doc if token.is_alpha and not token.is_stop])
        if words:
            word_freq = {}
            for w in words:
                word_freq[w] = word_freq.get(w, 0) + 1
            top_word = max(word_freq, key=word_freq.get)
            topics[label] = top_word.capitalize()
        else:
            topics[label] = f"Topic {label+1}"
    return topics

def structured_bullet_summarize_auto(text, max_chunk_words=800, min_chunk_words=50):
    """Hierarchical bullet-point summarization with auto topic detection"""
    chunks = chunk_text(text, max_words=max_chunk_words)
    bullet_points = []

    # Summarize each chunk into bullets
    for i, chunk in enumerate(chunks):
        if len(chunk.split()) < min_chunk_words:
            continue
        try:
            summary = summarizer(chunk, max_length=350, min_length=120, do_sample=False)[0]['summary_text']
            sentences = [s.strip() for s in summary.replace("\n", " ").split(". ") if s.strip()]
            bullets = [f"• {s}." if not s.endswith('.') else f"• {s}" for s in sentences]
            bullet_points.extend(bullets)
        except Exception as e:
            print(f"⚠️ Skipping chunk {i+1} due to error: {e}")

    if not bullet_points:
        return "⚠️ Text too short or all chunks failed."

    # Remove duplicates
    bullet_points = remove_similar_bullets(bullet_points)

    # Cluster bullets into topics
    clustered = cluster_bullets(bullet_points)

    # Generate topic names automatically
    topic_names = generate_topic_names(clustered)

    # Format final structured summary
    output_lines = []
    for label, bullets in clustered.items():
        output_lines.append(f"### {topic_names[label]}")
        output_lines.extend(bullets)
        output_lines.append("")  # spacing

    return "\n".join(output_lines)


# Example usage:
# text = "Your long text here..."
summary = structured_bullet_summarize_auto(text)
print(summary)


### Solar
• Solar energy is a renewable and sustainable power source that harnesses the energy of the sun to generate electricity or heat.
• The development of solar technologies has significantly reduced the cost of solar panels, making it more accessible for homes, businesses, and even large-scale power plants.
• Nevertheless, as research and innovation continue, the outlook for solar energy remains optimistic, with the global solar market projected to expand significantly.
• Solar energy is becoming a cornerstone of a cleaner, greener, and more sustainable future for all.
• The transition to a fully solar-powered world requires continued investment in infrastructure, storage, and grid modernization.
• Ultimately, solar energy is not just an alternative solution; it is a symbol of progress and hope for a clean, more equitable, and resilient planet, guiding humanity toward a sustainable energy future.

### Promising
• It is one of the most promising alternatives to fossil fuels, prima
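
To make the role of `distance_threshold=0.4` more concrete, here is a naive pure-Python sketch of average-linkage agglomerative clustering on cosine distance. It is intended to mirror the idea behind the `AgglomerativeClustering` call in `cluster_bullets`, not its implementation; the toy vectors and the `threshold_clusters` name are assumptions, and all vectors are assumed non-zero:

```python
import math
from itertools import combinations

def cosine_distance(u, v):
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def average_linkage(c1, c2, vectors):
    """Mean pairwise cosine distance between the members of two clusters."""
    total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def threshold_clusters(vectors, distance_threshold=0.4):
    """Merge the closest pair of clusters until no pair is closer than the threshold."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        (a, b), dist = min(
            (((a, b), average_linkage(clusters[a], clusters[b], vectors))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if dist >= distance_threshold:
            break  # remaining clusters are all at least the threshold apart
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because combinations guarantees a < b
    return clusters

# Hand-picked toy vectors: two pointing roughly along x, two roughly along y.
vectors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
groups = threshold_clusters(vectors)
print(groups)
```

A lower threshold stops merging earlier and yields more, smaller clusters; a higher one yields fewer, broader clusters.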

In [23]:
from sentence_transformers import util
from sklearn.cluster import AgglomerativeClustering
from collections import defaultdict

def chunk_text(text, max_words=800, overlap_sentences=2):
    """Chunk text into sentence-bound pieces with optional overlap"""
    doc = nlp(text)
    sentences = [sent.text.strip() for sent in doc.sents if sent.text.strip()]
    chunks = []
    current_chunk = []
    word_count = 0

    for sentence in sentences:
        sentence_word_count = len(sentence.split())
        if word_count + sentence_word_count > max_words and current_chunk:
            chunks.append(" ".join(current_chunk))
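            # Carry the tail forward; [-0:] would keep the whole chunk, so guard zero.
            current_chunk = current_chunk[-overlap_sentences:] if overlap_sentences > 0 else []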
            word_count = sum(len(s.split()) for s in current_chunk)
        current_chunk.append(sentence)
        word_count += sentence_word_count

    if current_chunk:
        chunks.append(" ".join(current_chunk))
    return chunks

def remove_similar_bullets(bullets, similarity_threshold=0.85):
    embeddings = embedder.encode(bullets, convert_to_tensor=True)
    keep = []
    for i, emb in enumerate(embeddings):
        is_duplicate = False
        for j in keep:
            sim = util.cos_sim(emb, embeddings[j]).item()
            if sim > similarity_threshold:
                is_duplicate = True
                break
        if not is_duplicate:
            keep.append(i)
    return [bullets[i] for i in keep]

def cluster_bullets(bullets, n_clusters=None):
    """
    Cluster bullets based on semantic similarity.
    If n_clusters is None, uses distance_threshold automatically.
    """
    embeddings = embedder.encode(bullets)
    if len(bullets) == 1:
        return {0: [bullets[0]]}

    # Correct usage for sklearn >=1.2
    if n_clusters is None:
        clustering = AgglomerativeClustering(
            n_clusters=None,
            metric='cosine',       # fixed from 'affinity'
            linkage='average',
            distance_threshold=0.4
        )
    else:
        clustering = AgglomerativeClustering(
            n_clusters=n_clusters,
            metric='cosine',
            linkage='average'
        )

    labels = clustering.fit_predict(embeddings)
    clustered = defaultdict(list)
    for label, bullet in zip(labels, bullets):
        clustered[label].append(bullet)
    return clustered

def generate_topic_names(clustered_bullets):
    """
    Generate a simple topic name for each cluster by extracting common keywords.
    """
    topics = {}
    for label, bullets in clustered_bullets.items():
        words = []
        for bullet in bullets:
            doc = nlp(bullet)
            words.extend([token.lemma_ for token in doc if token.is_alpha and not token.is_stop])
        if words:
            word_freq = {}
            for w in words:
                word_freq[w] = word_freq.get(w, 0) + 1
            top_word = max(word_freq, key=word_freq.get)
            topics[label] = top_word.capitalize()
        else:
            topics[label] = f"Topic {label+1}"
    return topics

def structured_bullet_summarize_auto(text, max_chunk_words=800, min_chunk_words=50):
    """Hierarchical bullet-point summarization with auto topic detection"""
    chunks = chunk_text(text, max_words=max_chunk_words)
    bullet_points = []

    # Summarize each chunk into bullets
    for i, chunk in enumerate(chunks):
        if len(chunk.split()) < min_chunk_words:
            continue
        try:
            summary = summarizer(chunk, max_length=350, min_length=120, do_sample=False)[0]['summary_text']
            sentences = [s.strip() for s in summary.replace("\n", " ").split(". ") if s.strip()]
            bullets = [f"• {s}." if not s.endswith('.') else f"• {s}" for s in sentences]
            bullet_points.extend(bullets)
        except Exception as e:
            print(f"⚠️ Skipping chunk {i+1} due to error: {e}")

    if not bullet_points:
        return "⚠️ Text too short or all chunks failed."

    # Remove duplicates
    bullet_points = remove_similar_bullets(bullet_points)

    # Cluster bullets into topics
    clustered = cluster_bullets(bullet_points)

    # Generate topic names automatically
    topic_names = generate_topic_names(clustered)

    # Format final structured summary
    output_lines = []
    for label, bullets in clustered.items():
        output_lines.append(f"### {topic_names[label]}")
        output_lines.extend(bullets)
        output_lines.append("")  # spacing

    return "\n".join(output_lines)


# Example usage:
# text = "Paste your long text here..."
summary = structured_bullet_summarize_auto(text)
print(summary)


### Solar
• Solar energy is a renewable and sustainable power source that harnesses the energy of the sun to generate electricity or heat.
• The development of solar technologies has significantly reduced the cost of solar panels, making it more accessible for homes, businesses, and even large-scale power plants.
• Nevertheless, as research and innovation continue, the outlook for solar energy remains optimistic, with the global solar market projected to expand significantly.
• Solar energy is becoming a cornerstone of a cleaner, greener, and more sustainable future for all.
• The transition to a fully solar-powered world requires continued investment in infrastructure, storage, and grid modernization.
• Ultimately, solar energy is not just an alternative solution; it is a symbol of progress and hope for a clean, more equitable, and resilient planet, guiding humanity toward a sustainable energy future.

### Promising
• It is one of the most promising alternatives to fossil fuels, prima
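
The chunking logic in `chunk_text` can be exercised on its own with small inputs. The sketch below assumes a naive regex sentence splitter as a stand-in for the spaCy pipeline (`split_sentences` and `chunk_sentences` are illustrative names), and it guards the zero-overlap case, since a slice like `current[-0:]` returns the whole list rather than an empty one:

```python
import re

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def chunk_sentences(sentences, max_words=800, overlap_sentences=2):
    """Pack sentences into chunks of roughly max_words, carrying over a tail."""
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if count + n > max_words and current:
            chunks.append(" ".join(current))
            # Guard the zero case: current[-0:] would return the whole list.
            current = current[-overlap_sentences:] if overlap_sentences > 0 else []
            count = sum(len(s.split()) for s in current)
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks

sample = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
sents = split_sentences(sample)
with_overlap = chunk_sentences(sents, max_words=6, overlap_sentences=1)
no_overlap = chunk_sentences(sents, max_words=6, overlap_sentences=0)
print(with_overlap)
print(no_overlap)
```

With an overlap of one sentence, each chunk repeats the last sentence of the previous chunk, which gives the summarizer some shared context at chunk boundaries at the cost of more near-duplicate bullets later.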

In [25]:
from sentence_transformers import util
from sklearn.cluster import AgglomerativeClustering
from collections import defaultdict, Counter


def chunk_text(text, max_words=800, overlap_sentences=2):
    """Chunk text into sentence-based pieces."""
    doc = nlp(text)
    sentences = [sent.text.strip() for sent in doc.sents if sent.text.strip()]
    chunks, current_chunk, word_count = [], [], 0

    for sentence in sentences:
        sentence_word_count = len(sentence.split())
        if word_count + sentence_word_count > max_words and current_chunk:
            chunks.append(" ".join(current_chunk))
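            # Carry the tail forward; [-0:] would keep the whole chunk, so guard zero.
            current_chunk = current_chunk[-overlap_sentences:] if overlap_sentences > 0 else []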
            word_count = sum(len(s.split()) for s in current_chunk)
        current_chunk.append(sentence)
        word_count += sentence_word_count

    if current_chunk:
        chunks.append(" ".join(current_chunk))
    return chunks


def remove_similar_bullets(bullets, threshold=0.85):
    """Remove bullets that are semantically too similar."""
    embeddings = embedder.encode(bullets, convert_to_tensor=True)
    keep = []
    for i, emb in enumerate(embeddings):
        if any(util.cos_sim(emb, embeddings[j]).item() > threshold for j in keep):
            continue
        keep.append(i)
    return [bullets[i] for i in keep]


def cluster_texts(texts, distance_threshold=0.4):
    """Cluster texts using embeddings and agglomerative clustering."""
    if len(texts) == 1:
        return {0: texts}

    embeddings = embedder.encode(texts)

    clustering = AgglomerativeClustering(
        n_clusters=None,         # auto cluster based on distance threshold
        metric='cosine',         # <- correct usage in sklearn >=1.2
        linkage='average',
        distance_threshold=distance_threshold
    )
    labels = clustering.fit_predict(embeddings)
    clusters = defaultdict(list)
    for label, text in zip(labels, texts):
        clusters[label].append(text)
    return clusters


def generate_topic_name(texts):
    """Generate a simple topic name from frequent keywords."""
    words = []
    for t in texts:
        doc = nlp(t)
        words.extend([token.lemma_.capitalize() for token in doc if token.is_alpha and not token.is_stop])
    if words:
        most_common = Counter(words).most_common(1)[0][0]
        return most_common
    return "Miscellaneous"


def summarize_to_nested_bullets(text, max_chunk_words=800, min_chunk_words=50):
    """Hierarchical nested bullet-point summarization."""
    chunks = chunk_text(text, max_words=max_chunk_words)
    bullet_points = []

    # Summarize chunks into bullets
    for i, chunk in enumerate(chunks):
        if len(chunk.split()) < min_chunk_words:
            continue
        try:
            summary = summarizer(chunk, max_length=350, min_length=120, do_sample=False)[0]['summary_text']
            sentences = [s.strip() for s in summary.replace("\n", " ").split(". ") if s.strip()]
            bullets = [f"{s}." if not s.endswith('.') else s for s in sentences]
            bullet_points.extend(bullets)
        except Exception as e:
            print(f"⚠️ Skipping chunk {i+1} due to error: {e}")

    if not bullet_points:
        return "⚠️ Text too short or all chunks failed."

    # Remove duplicates
    bullet_points = remove_similar_bullets(bullet_points)

    # Cluster bullets into main topics
    main_clusters = cluster_texts(bullet_points, distance_threshold=0.35)
    nested_summary = {}

    for label, bullets in main_clusters.items():
        topic_name = generate_topic_name(bullets)
        # Cluster bullets within main topic for sub-bullets
        sub_clusters = cluster_texts(bullets, distance_threshold=0.2)
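        # NOTE: sub_bullets and nested_summary are keyed by generated names, so
        # clusters that receive the same name overwrite earlier entries.
        sub_bullets = {}
        for sub_label, sub_bullets_list in sub_clusters.items():
            sub_topic_name = generate_topic_name(sub_bullets_list)
            sub_bullets[sub_topic_name] = sub_bullets_list
        nested_summary[topic_name] = sub_bullets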

    # Format nested bullets
    output_lines = []
    for topic, sub_dict in nested_summary.items():
        output_lines.append(f"### {topic}")
        for subtopic, bullets in sub_dict.items():
            if subtopic != topic:
                output_lines.append(f"  - {subtopic}")
            for bullet in bullets:
                output_lines.append(f"    • {bullet}")
        output_lines.append("")  # spacing
    return "\n".join(output_lines)


# Example usage:
# text = "Paste your long text here..."
summary = summarize_to_nested_bullets(text)
print(summary)


### Energy
    • The integration of solar energy with smart grids and Internet of Things (IoT) technologies has opened new possibilities for efficient energy management.

### Promising
    • It is one of the most promising alternatives to fossil fuels, primarily due to its abundance and environmental benefits.

### Solar
    • Nevertheless, as research and innovation continue, the outlook for solar energy remains optimistic, with the global solar market projected to expand significantly.
  - Transition
    • The transition to a fully solar-powered world requires continued investment in infrastructure, storage, and grid modernization.

### Despite
    • Despite these benefits, challenges remain, such as the intermittency of sunlight, which makes energy storage systems essential for ensuring a reliable power supply.

### Large
    • Large-scale solar farms are now being established worldwide, feeding clean electricity directly into national grids.
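
Several bullets that appear in the earlier flat and clustered outputs are absent from the nested output above. One plausible cause is that `nested_summary` and `sub_bullets` are plain dicts keyed by the generated topic name, so two clusters that both receive a name such as "Solar" share a key and the later one replaces the earlier. A small sketch of the effect and of one way to merge instead, using hypothetical data and a hypothetical `group_by_name` helper:

```python
from collections import defaultdict

def group_by_name(named_clusters):
    """Merge clusters that share a name instead of overwriting earlier ones."""
    merged = defaultdict(list)
    for name, bullets in named_clusters:
        merged[name].extend(bullets)
    return dict(merged)

# Hypothetical (name, bullets) pairs, as a naming step might produce them.
named = [
    ("Solar", ["Solar energy is renewable."]),
    ("Storage", ["Batteries smooth out intermittency."]),
    ("Solar", ["Solar costs have fallen."]),
]

overwritten = {}
for name, bullets in named:
    overwritten[name] = bullets  # plain assignment: the last "Solar" cluster wins

merged = group_by_name(named)
print(len(overwritten["Solar"]), len(merged["Solar"]))
```

Merging keeps every bullet but blurs cluster boundaries; an alternative is to keep a list of `(name, bullets)` pairs, or to append a suffix to repeated names, so that distinct clusters stay distinct.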



In [28]:
from sentence_transformers import util
from sklearn.cluster import AgglomerativeClustering
from collections import defaultdict, Counter
# import spacy

# # Load models
# nlp = spacy.load("en_core_web_sm")
# embedder = SentenceTransformer('all-MiniLM-L6-v2')
# from transformers import pipeline
# summarizer = pipeline("summarization")


def chunk_text(text, max_words=800, overlap_sentences=2):
    doc = nlp(text)
    sentences = [sent.text.strip() for sent in doc.sents if sent.text.strip()]
    chunks, current_chunk, word_count = [], [], 0
    for sentence in sentences:
        sentence_word_count = len(sentence.split())
        if word_count + sentence_word_count > max_words and current_chunk:
            chunks.append(" ".join(current_chunk))
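            # Carry the tail forward; [-0:] would keep the whole chunk, so guard zero.
            current_chunk = current_chunk[-overlap_sentences:] if overlap_sentences > 0 else []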
            word_count = sum(len(s.split()) for s in current_chunk)
        current_chunk.append(sentence)
        word_count += sentence_word_count
    if current_chunk:
        chunks.append(" ".join(current_chunk))
    return chunks


def remove_similar_bullets(bullets, threshold=0.85):
    embeddings = embedder.encode(bullets, convert_to_tensor=True)
    keep = []
    for i, emb in enumerate(embeddings):
        if any(util.cos_sim(emb, embeddings[j]).item() > threshold for j in keep):
            continue
        keep.append(i)
    return [bullets[i] for i in keep]


def cluster_texts(texts, distance_threshold=0.4):
    if len(texts) == 1:
        return {0: texts}

    embeddings = embedder.encode(texts)

    clustering = AgglomerativeClustering(
        n_clusters=None,
        metric='cosine',  # <- FIXED for sklearn >=1.2
        linkage='average',
        distance_threshold=distance_threshold
    )
    labels = clustering.fit_predict(embeddings)
    clusters = defaultdict(list)
    for label, text in zip(labels, texts):
        clusters[label].append(text)
    return clusters


def generate_topic_name(texts):
    words = []
    for t in texts:
        doc = nlp(t)
        words.extend([token.lemma_.capitalize() for token in doc if token.is_alpha and not token.is_stop])
    if words:
        most_common = Counter(words).most_common(1)[0][0]
        return most_common
    return "Miscellaneous"


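def rank_bullets_by_importance(bullets, full_text):
    """Order bullets by cosine similarity to the embedding of the full text."""
    bullet_embeddings = embedder.encode(bullets, convert_to_tensor=True)
    # The embedder truncates long inputs, so this vector reflects only the
    # opening portion of full_text when the text exceeds the model's limit.
    text_embedding = embedder.encode([full_text], convert_to_tensor=True)[0]
    scores = [util.cos_sim(b, text_embedding).item() for b in bullet_embeddings]
    # Highest score first; ties fall back to comparing the bullet strings.
    ranked = [b for _, b in sorted(zip(scores, bullets), reverse=True)]
    return ranked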


def summarize_to_nested_bullets_ranked(text, max_chunk_words=800, min_chunk_words=50):
    """Nested bullet-point summarization with bullets ranked by similarity to the full text."""
    chunks = chunk_text(text, max_words=max_chunk_words)
    bullet_points = []

    for i, chunk in enumerate(chunks):
        if len(chunk.split()) < min_chunk_words:
            continue
        try:
            summary = summarizer(chunk, max_length=350, min_length=120, do_sample=False)[0]['summary_text']
            sentences = [s.strip() for s in summary.replace("\n", " ").split(". ") if s.strip()]
            bullets = [f"{s}." if not s.endswith('.') else s for s in sentences]
            bullet_points.extend(bullets)
        except Exception as e:
            print(f"⚠️ Skipping chunk {i+1} due to error: {e}")

    if not bullet_points:
        return "⚠️ Text too short or all chunks failed."

    bullet_points = remove_similar_bullets(bullet_points)
    bullet_points = rank_bullets_by_importance(bullet_points, text)

    main_clusters = cluster_texts(bullet_points, distance_threshold=0.35)
    nested_summary = {}

    for label, bullets in main_clusters.items():
        topic_name = generate_topic_name(bullets)
        sub_clusters = cluster_texts(bullets, distance_threshold=0.2)
        # NOTE: sub_bullets and nested_summary are keyed by generated names, so
        # clusters that receive the same name overwrite earlier entries.
        sub_bullets = {}
        for sub_label, sub_bullets_list in sub_clusters.items():
            sub_topic_name = generate_topic_name(sub_bullets_list)
            sub_bullets_list = rank_bullets_by_importance(sub_bullets_list, text)
            sub_bullets[sub_topic_name] = sub_bullets_list
        nested_summary[topic_name] = sub_bullets

    output_lines = []
    for topic, sub_dict in nested_summary.items():
        output_lines.append(f"### {topic}")
        for subtopic, bullets in sub_dict.items():
            if subtopic != topic:
                output_lines.append(f"  - {subtopic}")
            for bullet in bullets:
                output_lines.append(f"    • {bullet}")
        output_lines.append("")
    return "\n".join(output_lines)


# Example usage:
# text = "Paste your text here"
summary = summarize_to_nested_bullets_ranked(text)
print(summary)


### Energy
    • The integration of solar energy with smart grids and Internet of Things (IoT) technologies has opened new possibilities for efficient energy management.

### Solar
  - Transition
    • The transition to a fully solar-powered world requires continued investment in infrastructure, storage, and grid modernization.
    • Nevertheless, as research and innovation continue, the outlook for solar energy remains optimistic, with the global solar market projected to expand significantly.

### Despite
    • Despite these benefits, challenges remain, such as the intermittency of sunlight, which makes energy storage systems essential for ensuring a reliable power supply.

### Large
    • Large-scale solar farms are now being established worldwide, feeding clean electricity directly into national grids.

### Promising
    • It is one of the most promising alternatives to fossil fuels, primarily due to its abundance and environmental benefits.
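
`rank_bullets_by_importance` scores each bullet against a single embedding of the whole text. Sentence-embedding models truncate long inputs, so for a long document that reference vector may describe only its opening. A different technique, swapped in here purely as a sketch, is to rank against the centroid (mean) of the bullet or sentence vectors. The toy vectors and the `rank_by_centroid` name are assumptions:

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(column) / n for column in zip(*vectors))

def rank_by_centroid(items, vectors):
    """Sort items by similarity to the mean vector, most central first."""
    center = centroid(vectors)
    scored = [(cosine(vec, center), idx) for idx, vec in enumerate(vectors)]
    # Negate the score for descending order; the index breaks ties stably.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [items[idx] for _, idx in scored]

# Hand-picked toy vectors standing in for bullet embeddings.
bullets = ["A", "B", "C"]
vectors = [(1.0, 0.0), (0.7, 0.7), (0.2, 1.0)]
ranked = rank_by_centroid(bullets, vectors)
print(ranked)
```

Centroid ranking favors bullets that resemble the average content, so it can push distinctive but important points down the list; whether that trade-off is acceptable depends on the text.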

