# NotioNLPToolkit Demo

This notebook demonstrates the core features of the Notion NLP library (imported as `notionlp`), including:
- Authentication with Notion API
- Listing and accessing documents
- Processing text with NLP capabilities
- Building document hierarchies
- Automatic tagging

First, let's import the required modules:

In [ ]:
import os
from notionlp import (
    NotionClient,
    Hierarchy,
    Tagger,
    TextProcessor,
)

## 1. Authentication

We initialize the Notion client with an API token read from the `NOTION_API_TOKEN` environment variable:

In [2]:
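# Get token from environment variable and fail early if it is missing
notion_token = os.environ.get('NOTION_API_TOKEN')
if not notion_token:
    raise RuntimeError("NOTION_API_TOKEN environment variable is not set")
client = NotionClient(notion_token)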

# Verify authentication
auth_status = client.authenticate()
print(f"Authentication status: {'Successful' if auth_status else 'Failed'}")

Authentication status: Successful


## 2. Listing Documents

Let's retrieve and display the list of available documents:

In [3]:
documents = client.list_documents()
print(f"Found {len(documents)} documents:")
for doc in documents:
    print(f"- {doc.title} (ID: {doc.id})")

Found 3 documents:
- Project Documentation (ID: abc123def456)
- Meeting Notes (ID: ghi789jkl012)
- Research Summary (ID: mno345pqr678)
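
Indexing into the list by position is fragile when the workspace changes, so it can help to look a document up by title instead. The following is a minimal sketch, not part of the library: `Doc` is a hypothetical stand-in that only mirrors the `title` and `id` attributes used above, and `find_by_title` is an illustrative helper name.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in exposing the two attributes used above."""
    id: str
    title: str


def find_by_title(docs, title):
    """Return the first document whose title matches, ignoring case, else None."""
    wanted = title.casefold()
    return next((d for d in docs if d.title.casefold() == wanted), None)


# Sample data copied from the listing above.
sample_docs = [
    Doc("abc123def456", "Project Documentation"),
    Doc("ghi789jkl012", "Meeting Notes"),
    Doc("mno345pqr678", "Research Summary"),
]

match = find_by_title(sample_docs, "meeting notes")
print(match.id if match else "not found")
```

In the notebook, the same helper could be called on the `documents` list returned by `client.list_documents()`, assuming its items expose `title` as shown.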


## 3. Processing Document Content

Now, let's fetch and process the content of the first document:

In [4]:
if documents:
    # Get document content
    doc = documents[0]
    print(f"Processing document: {doc.title}")
    blocks = client.get_document_content(doc.id)
    
    # Process text
    processor = TextProcessor()
    processed_blocks = processor.process_blocks(blocks)
    
    # Display results
    print("\nProcessed content:")
    for block in processed_blocks:
        print(f"\nBlock type: {block['type']}")
        print(f"Entities found: {block['entities']}")
        print(f"Keywords: {block['keywords']}")

Processing document: Project Documentation

Processed content:

Block type: heading_1
Entities found: ['Project X', 'Documentation']
Keywords: ['project', 'documentation', 'overview']

Block type: paragraph
Entities found: ['Project X', 'ML', 'NLP']
Keywords: ['project', 'ai', 'machine learning', 'development']
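
Per-block results are easier to read once they are aggregated over the whole document. The sketch below assumes each processed block is a dictionary with `entities` and `keywords` lists, as the output above suggests; `count_field` is an illustrative helper, and the sample data is copied from that output.

```python
from collections import Counter

# Sample data copied from the output above; in the notebook this would be
# the `processed_blocks` list.
sample_blocks = [
    {
        "type": "heading_1",
        "entities": ["Project X", "Documentation"],
        "keywords": ["project", "documentation", "overview"],
    },
    {
        "type": "paragraph",
        "entities": ["Project X", "ML", "NLP"],
        "keywords": ["project", "ai", "machine learning", "development"],
    },
]


def count_field(blocks, field):
    """Count how often each value of `field` appears across all blocks."""
    counts = Counter()
    for block in blocks:
        # Blocks without the field contribute nothing.
        counts.update(block.get(field, []))
    return counts


keyword_counts = count_field(sample_blocks, "keywords")
entity_counts = count_field(sample_blocks, "entities")

print(keyword_counts.most_common(1))  # [('project', 2)]
print(entity_counts.most_common(1))   # [('Project X', 2)]
```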


## 4. Building Document Hierarchy

Let's analyze the document's structure:

In [ ]:
if documents:
    # Build hierarchy
    hierarchy = Hierarchy()
    root = hierarchy.build_hierarchy(blocks)
    
    # Convert to dictionary for visualization
    structure = hierarchy.to_dict()
    print("Document structure:")
    print(structure)
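
To give a feel for what a heading-based hierarchy looks like, here is a self-contained sketch of one common approach: nest each block under the nearest preceding heading of a lower level. It illustrates the idea only and is not the `Hierarchy` implementation. The dictionary layout (`type`, `content`, `children`) and the sample blocks after the first two are assumptions made for the example.

```python
def build_outline(blocks):
    """Nest blocks under the nearest preceding heading of a lower level."""
    root = {"type": "root", "content": "", "children": []}
    stack = [(0, root)]  # (heading level, node); the root sits at level 0
    for block in blocks:
        node = {"type": block["type"], "content": block["content"], "children": []}
        if block["type"].startswith("heading_"):
            level = int(block["type"].split("_")[1])
            # Close every open heading at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            stack[-1][1]["children"].append(node)
            stack.append((level, node))
        else:
            # Non-heading blocks attach to the innermost open heading.
            stack[-1][1]["children"].append(node)
    return root


def render(node, depth=0):
    """Print one line per block, indented by nesting depth."""
    for child in node["children"]:
        print("  " * depth + f"{child['type']}: {child['content']}")
        render(child, depth + 1)


# Hypothetical flat block list; only the first two entries echo this notebook.
sample_blocks = [
    {"type": "heading_1", "content": "Project X Documentation"},
    {"type": "paragraph", "content": "This document provides information about Project X"},
    {"type": "heading_2", "content": "Setup"},
    {"type": "paragraph", "content": "Example setup notes"},
    {"type": "heading_1", "content": "Appendix"},
]

outline = build_outline(sample_blocks)
render(outline)
```

The stack keeps only the currently open headings, so each block is placed in a single pass over the list.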

## 5. Automatic Tagging

Finally, let's generate tags for the document content:

In [6]:
if documents:
    # Initialize tagger
    tagger = Tagger()
    
    # Add some custom tags
    tagger.add_custom_tags(["important", "review", "followup"])
    
    print("Generated tags:")
    for block in blocks:
        tags = tagger.generate_tags(block)
        print(f"\nBlock content: {block.content[:50]}...")
        print(f"Tags: {[tag.name for tag in tags]}")
        
        # Analyze sentiment
        sentiment = tagger.analyze_sentiment(block.content)
        print(f"Sentiment: {sentiment}")

Generated tags:

Block content: Project X Documentation...
Tags: ['project', 'documentation', 'important']
Sentiment: {'positive': 0.65, 'negative': 0.05, 'neutral': 0.3}

Block content: This document provides information about Project X...
Tags: ['project', 'information', 'document']
Sentiment: {'positive': 0.45, 'negative': 0.0, 'neutral': 0.55}
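
The sentiment dictionaries above can be reduced to a single label by taking the highest-scoring class. This is a small sketch that assumes the result is a plain mapping from label to score, as printed above; `dominant_sentiment` is an illustrative helper, not a library function.

```python
def dominant_sentiment(scores):
    """Return the (label, score) pair with the highest score."""
    label = max(scores, key=scores.get)
    return label, scores[label]


# Scores copied from the output above.
samples = [
    {"positive": 0.65, "negative": 0.05, "neutral": 0.3},
    {"positive": 0.45, "negative": 0.0, "neutral": 0.55},
]

for scores in samples:
    label, score = dominant_sentiment(scores)
    print(f"{label} ({score:.2f})")
# positive (0.65)
# neutral (0.55)
```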
