# Document Analysis AI Models

This notebook explores the AI/ML models used in our document analysis app for summarization, sentiment analysis, content rephrasing, key idea extraction, and generating actionable recommendations. Each model is designed to use the structure, context, and meaning of the text to produce relevant outputs for users.

## Summarization Model

The summarization model generates concise summaries of large documents. It uses a sequence-to-sequence architecture that encodes the input text and decodes a coherent summary of it.

### Approach

The model is a transformer-based sequence-to-sequence network trained on large datasets of documents paired with reference summaries. The goal is a summary that retains the essential information while dropping redundant detail.

### Code Example

```python
from transformers import pipeline

# Load pre-trained summarization model
summarizer = pipeline("summarization")

def generate_summary(document_text, max_length=150):
    """
    Generates a summary for the provided document text.

    Args:
        document_text (str): The full text of the document to summarize.
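        max_length (int): The maximum length of the summary, in tokens.

    Returns:
        str: A summarized version of the document.
    """
    # truncation=True clips input that exceeds the model's maximum input length;
    # min_length is capped so it never exceeds max_length.
    summary = summarizer(
        document_text,
        max_length=max_length,
        min_length=min(50, max_length),
        do_sample=False,
        truncation=True,
    )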
    return summary[0]['summary_text']

# Example usage
document_text = "Your document text goes here..."
summary = generate_summary(document_text)
print("Generated Summary:", summary)
```
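
Summarization models accept only a bounded number of input tokens, so a long document generally has to be split before `generate_summary` can cover all of it. The sketch below shows one possible way to do that and is not necessarily what the app does: `split_into_chunks` and its `max_words` parameter are hypothetical, and the word count is only a rough proxy for the model's token limit.

```python
import re

def split_into_chunks(text, max_words=400):
    """Group whole sentences into chunks of roughly max_words words (hypothetical helper)."""
    # Split after sentence-ending punctuation followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks

# Hypothetical usage with generate_summary from the example above:
# partial = [generate_summary(chunk) for chunk in split_into_chunks(document_text)]
# combined = " ".join(partial)
```

A single sentence longer than `max_words` still ends up as one oversized chunk, so this sketch reduces the problem rather than guaranteeing a fit.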

## Sentiment Analysis Model

The sentiment analysis model evaluates the sentiment of the provided text and returns a score that indicates the overall tone, so users can quickly gauge whether a document's content is positive or negative. The example below assumes a binary POSITIVE/NEGATIVE classifier; reporting a neutral class requires a checkpoint trained with one.

### Approach

This model uses a pre-trained language model fine-tuned on sentiment classification datasets. It categorizes text based on sentiment scores ranging from -1 (very negative) to +1 (very positive).

### Code Example

```python
from transformers import pipeline

# Load pre-trained sentiment analysis model
sentiment_analyzer = pipeline("sentiment-analysis")

def analyze_sentiment(text):
    """
    Analyzes the sentiment of the provided text.

    Args:
        text (str): The text to analyze.

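    Returns:
        dict: A dictionary with a signed score in [-1, 1] and the predicted label.
    """
    # truncation=True keeps long inputs within the model's maximum input length
    result = sentiment_analyzer(text, truncation=True)[0]
    # Sign the model's confidence by the label: POSITIVE -> +score, otherwise -score
    sentiment_score = result['score'] if result['label'] == 'POSITIVE' else -result['score']
    return {
        'score': sentiment_score,
        'description': result['label']
    }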

# Example usage
sentiment = analyze_sentiment("Your text to analyze sentiment goes here...")
print("Sentiment Analysis:", sentiment)
```
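
For documents longer than the classifier's input limit, one option is to score the text piece by piece and average the results into a single value between -1 and +1. The sketch below assumes that approach; `signed_score`, `aggregate_sentiment`, and the optional `weights` argument are hypothetical names, and the input dictionaries only mimic the `label` and `score` fields returned by the pipeline.

```python
def signed_score(label, confidence):
    """Map a POSITIVE/NEGATIVE label and its confidence to a value in [-1, 1]."""
    return confidence if label.upper() == "POSITIVE" else -confidence

def aggregate_sentiment(results, weights=None):
    """Weighted mean of signed scores over per-chunk results (hypothetical helper).

    results: list of dicts with 'label' and 'score' keys.
    weights: optional list of non-negative numbers, e.g. chunk lengths.
    """
    if not results:
        return 0.0
    if weights is None:
        weights = [1.0] * len(results)
    total = sum(weights)
    if total == 0:
        return 0.0
    weighted = sum(w * signed_score(r["label"], r["score"]) for r, w in zip(results, weights))
    return weighted / total

# Two chunks: one fairly positive, one mildly negative
example = [{"label": "POSITIVE", "score": 0.9}, {"label": "NEGATIVE", "score": 0.5}]
print("Document-level score:", aggregate_sentiment(example))
```

Weighting by chunk length keeps a short, strongly worded passage from dominating the score of a long document.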

## Key Idea Extraction

The key idea extraction model identifies key points and core concepts from the document. This is useful for users who want a quick glance at the main ideas without reading the entire document.

### Approach

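The example below uses KeyBERT rather than a Named Entity Recognition (NER) or topic-modeling pipeline. KeyBERT embeds the document and its candidate n-gram phrases with a BERT-based embedding model, then ranks the candidates by cosine similarity to the document embedding. The highest-ranked phrases are returned as the key ideas.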

### Code Example

```python
from keybert import KeyBERT

# Load pre-trained KeyBERT model
keybert_model = KeyBERT()

def extract_key_ideas(text, top_n=5):
    """
    Extracts key ideas or concepts from the provided text.

    Args:
        text (str): The text to extract key ideas from.
        top_n (int): The number of key ideas to extract.

    Returns:
        list: A list of key ideas or concepts.
    """
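    # extract_keywords returns (phrase, score) tuples, ranked by relevance
    key_ideas = keybert_model.extract_keywords(text, keyphrase_ngram_range=(1, 2), stop_words='english', top_n=top_n)
    # Keep only the phrases and drop the scores
    return [idea[0] for idea in key_ideas]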

# Example usage
key_ideas = extract_key_ideas("Your document text goes here...")
print("Key Ideas:", key_ideas)
```
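
KeyBERT, used in the example above, ranks candidate phrases by how similar their embeddings are to the embedding of the whole document. The sketch below illustrates only that ranking step under a strong simplification: plain word counts stand in for BERT embeddings, and `rank_candidates`, `tokenize`, and the small `STOP_WORDS` set are hypothetical helpers rather than part of KeyBERT.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not KeyBERT's list)
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_candidates(text, top_n=3):
    """Rank unigram and bigram candidates by similarity to the document vector."""
    tokens = tokenize(text)
    doc_vec = Counter(tokens)
    candidates = set(tokens) | {" ".join(pair) for pair in zip(tokens, tokens[1:])}
    scored = [(c, cosine(Counter(c.split()), doc_vec)) for c in candidates]
    # Highest similarity first; ties broken alphabetically for stable output
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]

print(rank_candidates("Budget review. The budget is late. Budget approval pending."))
```

With real embeddings, phrases that never appear verbatim but are semantically close to the document can also score well, which count vectors cannot capture.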

## Content Rephrasing Model

The content rephrasing model rewrites a section of the document based on a specified style or tone. This helps users adapt content for different contexts or audiences.

### Approach

The rephrasing model is built on a transformer-based architecture that can be fine-tuned for various tones, such as formal, casual, or storytelling. This allows the model to rephrase text to match the desired tone or style.

### Code Example

```python
from transformers import pipeline

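# Load a pre-trained text-to-text model for rephrasing.
# Note: the stock t5-base checkpoint is not trained on this kind of instruction,
# so useful output requires a checkpoint fine-tuned for rephrasing or style transfer.
rephraser = pipeline("text2text-generation", model="t5-base")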

def rephrase_text(text, style="formal"):
    """
    Rephrases the given text based on the specified style.

    Args:
        text (str): The text to rephrase.
        style (str): The desired style for rephrasing (e.g., "formal", "casual").

    Returns:
        str: The rephrased text.
    """
    prompt = f"Rephrase this text in a {style} style: {text}"
    rephrased = rephraser(prompt, max_length=200, do_sample=False)[0]['generated_text']
    return rephrased

# Example usage
rephrased_text = rephrase_text("Your text to rephrase goes here...", style="formal")
print("Rephrased Text:", rephrased_text)
```
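
`rephrase_text` above interpolates any style string directly into the prompt. If the model is fine-tuned for a fixed set of tones, it can help to validate the style and keep the prompt wording in one place. The sketch below assumes the three tones mentioned in the approach (formal, casual, storytelling); `STYLE_INSTRUCTIONS`, `build_rephrase_prompt`, and the instruction wording are hypothetical.

```python
# Hypothetical instruction text for each supported tone
STYLE_INSTRUCTIONS = {
    "formal": "Rephrase the text in a formal, professional tone.",
    "casual": "Rephrase the text in a casual, conversational tone.",
    "storytelling": "Rephrase the text as a short narrative.",
}

def build_rephrase_prompt(text, style="formal"):
    """Return a prompt for a supported style, or raise ValueError for an unknown one."""
    key = style.strip().lower()
    if key not in STYLE_INSTRUCTIONS:
        supported = ", ".join(sorted(STYLE_INSTRUCTIONS))
        raise ValueError(f"Unsupported style {style!r}; expected one of: {supported}")
    return f"{STYLE_INSTRUCTIONS[key]}\n\nText: {text.strip()}"

print(build_rephrase_prompt("Your text to rephrase goes here...", style="casual"))
```

Rejecting unknown styles early avoids sending the model a tone it was never fine-tuned on.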

## Actionable Recommendations

The actionable recommendations model analyzes a document and identifies follow-up actions or next steps. It is designed to aid in project management and task tracking.

### Approach

This model uses a combination of text classification and summarization techniques to detect key decisions, tasks, and responsibilities within a document. The model is fine-tuned to recognize phrases that imply actions or follow-ups.

### Code Example

```python
from transformers import pipeline

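# Load a pre-trained text-to-text model as a stand-in for the recommendation model.
# Note: the stock t5-base checkpoint is not fine-tuned for this task; a checkpoint
# fine-tuned to detect actions and follow-ups is needed for useful output.
recommendation_model = pipeline("text2text-generation", model="t5-base")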

def generate_recommendations(text):
    """
    Generates actionable recommendations or next steps based on the provided text.

    Args:
        text (str): The text to analyze for recommendations.

    Returns:
        str: The generated recommendations.
    """
    prompt = f"Based on this text, what are the next steps or actionable recommendations? {text}"
    recommendations = recommendation_model(prompt, max_length=150, do_sample=False)[0]['generated_text']
    return recommendations

# Example usage
recommendations = generate_recommendations("Your document text goes here...")
print("Actionable Recommendations:", recommendations)
```
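
The approach describes recognizing phrases that imply actions or follow-ups. Before any fine-tuning, a rule-based baseline can make that idea concrete and provide a sanity check for model output. The sketch below is such a baseline, not the app's model: `ACTION_CUES` and `find_action_sentences` are hypothetical, and the cue list is a small illustrative sample.

```python
import re

# Illustrative cue phrases that often signal a task or follow-up (an assumption)
ACTION_CUES = re.compile(
    r"\b(?:will|should|must|needs? to|follow[- ]up|action item|todo)\b",
    re.IGNORECASE,
)

def find_action_sentences(text):
    """Return the sentences that contain at least one action cue."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if s and ACTION_CUES.search(s)]

notes = (
    "The meeting went well. Alice will send the report by Friday. "
    "We discussed the budget. Bob needs to follow up with legal."
)
for sentence in find_action_sentences(notes):
    print("-", sentence)
```

A keyword baseline like this misses implicit commitments and flags some non-actions, which is the gap a fine-tuned classifier is meant to close.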

## Conclusion

In this notebook, we explored the AI/ML models used in the document analysis app. Each one targets a specific task: summarization, sentiment analysis, key idea extraction, content rephrasing, or generating actionable recommendations. Together, they let users analyze and interact with their documents efficiently.