# NLP Toolkit Demo

This notebook demonstrates the key features of our NLP Toolkit. We'll explore text analysis, document processing, and language tools through practical examples.

In [None]:
import sys
sys.path.append('..')

from src.text_analyzer import TextAnalyzer
from src.document_processor import DocumentProcessor
from src.language_tools import LanguageTools

# Initialize components
analyzer = TextAnalyzer()
processor = DocumentProcessor()
lang_tools = LanguageTools()
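
The `sys.path.append('..')` call adds a relative entry, and it adds it again each time the cell is re-run. A minimal sketch of an alternative follows, assuming the `src` package sits one directory above the folder the notebook runs in. That layout is an assumption, and `add_project_root` is a hypothetical helper, not part of the toolkit. It resolves the path to an absolute one and skips duplicates.

```python
import sys
from pathlib import Path


def add_project_root(levels_up: int = 1) -> Path:
    """Put an ancestor of the current working directory on sys.path.

    Hypothetical helper: assumes the `src` package lives `levels_up`
    directories above the directory the notebook runs in.
    """
    root = Path.cwd().resolve()
    for _ in range(levels_up):
        root = root.parent
    root_str = str(root)
    # Avoid adding the same entry again when the cell is re-run.
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root


project_root = add_project_root()
print(f"Added to sys.path: {project_root}")
```

This still depends on the working directory at launch time, so it only helps if the notebook is started from its own folder.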

## 1. Text Analysis

Let's run sentiment analysis, named-entity recognition, key-phrase extraction, and summarization on a sample text.

In [None]:
sample_text = """
Artificial Intelligence has transformed the technology landscape dramatically. 
Companies like OpenAI, Google, and Microsoft are investing heavily in AI research. 
The development of large language models has opened new possibilities in natural 
language processing and understanding. However, concerns about AI safety and ethics 
remain important considerations for the future of this technology.
"""

# Analyze text
analysis_results = analyzer.analyze(sample_text)

# Display results
print("Sentiment Analysis:")
print(analysis_results['sentiment'])
print("\nNamed Entities:")
print(analysis_results['entities'])
print("\nKey Phrases:")
print(analysis_results['key_phrases'][:3])
print("\nSummary:")
print(analysis_results['summary'])
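
The cell above indexes the result directly, so a missing field would raise a `KeyError`. Below is a sketch of a more tolerant display helper, assuming `analyze` returns a plain dictionary with the four keys used above. `format_analysis` is a hypothetical name, and the `placeholder` dictionary is illustrative input only, not real toolkit output.

```python
from typing import Any, Mapping

# Section titles mapped to the result keys used in the cell above.
SECTIONS = {
    "Sentiment Analysis": "sentiment",
    "Named Entities": "entities",
    "Key Phrases": "key_phrases",
    "Summary": "summary",
}


def format_analysis(results: Mapping[str, Any], max_phrases: int = 3) -> str:
    """Render analysis results as text, tolerating missing keys."""
    lines = []
    for title, key in SECTIONS.items():
        value = results.get(key, "(not available)")
        # Only the first few key phrases are shown, as in the cell above.
        if key == "key_phrases" and isinstance(value, list):
            value = value[:max_phrases]
        lines.append(f"{title}:\n{value}")
    return "\n\n".join(lines)


# Placeholder input for illustration only; not real toolkit output.
placeholder = {"sentiment": "n/a", "key_phrases": ["a", "b", "c", "d"]}
print(format_analysis(placeholder))
```

With the toolkit loaded, the call would be `print(format_analysis(analysis_results))`.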

## 2. Document Processing

Now, let's process a longer document and extract its topics, structure, and a summary.

In [None]:
# Create a sample document
import textwrap

# Dedent the text so the Markdown headings and list items start at column 0
sample_document = textwrap.dedent("""\
    # Introduction to Machine Learning

    Machine Learning is a subset of artificial intelligence that focuses on developing
    systems that can learn from and make decisions based on data. Unlike traditional
    programming, where rules are explicitly coded, machine learning algorithms improve
    through experience.

    ## Types of Machine Learning

    1. Supervised Learning: The algorithm learns from labeled training data.
    2. Unsupervised Learning: The algorithm finds patterns in unlabeled data.
    3. Reinforcement Learning: The algorithm learns through trial and error.

    ## Applications

    Machine learning has numerous applications across industries:
    - Image and Speech Recognition
    - Natural Language Processing
    - Recommendation Systems
    - Autonomous Vehicles

    The future of machine learning looks promising as more data becomes available
    and computing power continues to increase.
    """)

# Write with an explicit encoding so the result does not depend on the platform default
with open('sample_document.txt', 'w', encoding='utf-8') as f:
    f.write(sample_document)

# Process document
doc_results = processor.process_document('sample_document.txt')

print("Document Topics:")
for topic in doc_results['topics']:
    print(f"Topic {topic['id']}: {', '.join(topic['words'])}")

print("\nDocument Structure:")
print(doc_results['structure'])

print("\nDocument Summary:")
print(doc_results['summary'])
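
The cell above leaves `sample_document.txt` in the working directory. The sketch below writes the text to a temporary directory instead and cleans up afterwards. It assumes the processing function takes a file path, as `processor.process_document` does above. `process_text_as_file` and `count_lines` are hypothetical names, and `count_lines` is only a stand-in so the example runs without the toolkit.

```python
import tempfile
from pathlib import Path
from typing import Any, Callable


def process_text_as_file(text: str, process: Callable[[str], Any]) -> Any:
    """Write `text` to a temporary file, process it, then clean up.

    `process` is any callable taking a file path, for example
    `processor.process_document` from the cell above.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = Path(tmp_dir) / "sample_document.txt"
        path.write_text(text, encoding="utf-8")
        # The directory and file are deleted when the with-block exits.
        return process(str(path))


def count_lines(path: str) -> int:
    """Stand-in processor for illustration: counts lines in the file."""
    return len(Path(path).read_text(encoding="utf-8").splitlines())


print(process_text_as_file("# Title\n\nBody text.\n", count_lines))
```

With the toolkit loaded, passing `processor.process_document` as `process` should work, as long as it finishes reading the file before returning.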

## 3. Language Tools

Let's explore translation, grammar checking, and text correction features.

In [None]:
# Translation
text_to_translate = "Artificial Intelligence is changing the world."
translation = lang_tools.translate(text_to_translate, target_lang='es')
print("Translation:")
print(f"Original: {translation['original']}")
print(f"Translated: {translation['translated']}")

# Grammar Check
text_with_errors = "The company have many employee who works hard."
grammar_check = lang_tools.check_grammar(text_with_errors)
print("\nGrammar Check:")
print(f"Score: {grammar_check['score']}")
print("Issues:")
for issue in grammar_check['issues']:
    print(f"- {issue['text']}: {issue.get('suggestion', 'No suggestion')}")

# Text Correction
correction = lang_tools.correct_text(
    text_with_errors,
    fix_spelling=True,
    fix_grammar=True,
    fix_punctuation=True
)
print("\nText Correction:")
print(f"Original: {correction['original']}")
print(f"Corrected: {correction['corrected']}")
print(f"Improvement Score: {correction['improvement_score']}")
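
To translate one sentence into several languages, the call above can be wrapped in a small loop. This sketch assumes `translate(text, target_lang=...)` returns a mapping with a `'translated'` key, as the cell above suggests. `translate_many` and `fake_translate` are hypothetical names. The stand-in translator only tags the text so the example runs without the toolkit, and the language codes are illustrative.

```python
from typing import Callable, Dict, Iterable, Mapping


def translate_many(
    translate: Callable[..., Mapping[str, str]],
    text: str,
    target_langs: Iterable[str],
) -> Dict[str, str]:
    """Translate `text` into each language and collect the results.

    Assumes `translate(text, target_lang=...)` returns a mapping with a
    'translated' key, as the cell above suggests.
    """
    return {
        lang: translate(text, target_lang=lang)["translated"]
        for lang in target_langs
    }


def fake_translate(text: str, target_lang: str) -> Dict[str, str]:
    """Stand-in translator for illustration; it only tags the text."""
    return {"original": text, "translated": f"[{target_lang}] {text}"}


print(translate_many(fake_translate, "Hello", ["es", "fr"]))
```

With the toolkit loaded, pass `lang_tools.translate` in place of `fake_translate`. Which language codes it supports is not stated in this notebook.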

## Conclusion

This demo showcases the main features of our NLP Toolkit:
1. Text Analysis: Sentiment analysis, entity recognition, key phrase extraction, and summarization
2. Document Processing: Topic modeling, structure analysis, and document summarization
3. Language Tools: Translation, grammar checking, and text correction

Feel free to explore more features and customize the toolkit for your specific needs!