### Text Chunking Demo Notebook Overview

This notebook demonstrates the text chunking functionality for financial complaint analysis.

### What This Notebook Shows:
  - **Basic Chunking**: Simple text splitting with configurable parameters
  - **Multiple Methods**: Different chunking strategies (recursive character, sentence-based, paragraph-based)
  - **Parameter Experimentation**: Finding optimal chunking parameters for your data
  - **Real Data Integration**: Processing actual financial complaint narratives
  - **Performance Analysis**: Metrics and insights about chunking effectiveness
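
As a rough illustration of what "configurable parameters" can mean for basic chunking, the sketch below splits text into fixed-size character windows that overlap by a set number of characters. The function name `chunk_text` and the parameters `chunk_size` and `overlap` are hypothetical and are not taken from the notebook's `text_chunking` module, whose implementation is not shown here.

```python
def chunk_text(text: str, chunk_size: int = 300, overlap: int = 50) -> list[str]:
    """Split text into character windows of at most chunk_size,
    where consecutive windows share `overlap` characters.

    Hypothetical helper for illustration only; it is not the
    notebook's actual chunking implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if not text.strip():
        # Empty or whitespace-only input yields no chunks
        return []

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            # The current window already reaches the end of the text
            break
    return chunks


sample = "a" * 250
print([len(c) for c in chunk_text(sample, chunk_size=100, overlap=20)])
# → [100, 100, 90]
```

A larger overlap preserves more context across chunk boundaries at the cost of producing more chunks for the same input.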

### Key Features:
  - Import and use the text chunking module from the `scripts` directory
  - Run various demonstrations of chunking strategies
  - Process actual complaint data from CSV files
  - Analyze chunking performance and save results
  - Export chunked data for further analysis

### Notebook Structure:
The notebook is organized into cells that progressively build up the demonstration:
  1. **Setup**: Import modules and configure paths
  2. **Basic Demo**: Simple text chunking examples
  3. **Method Comparison**: Different chunking approaches
  4. **Parameter Tuning**: Finding optimal settings
  5. **Real Data**: Processing actual complaint narratives
  6. **Analysis**: Performance metrics and results

### Expected Outputs:
  - Chunked narratives saved as CSV files
  - Performance summaries and metrics
  - Visualizations of chunking effectiveness
  - Recommendations for optimal parameters


### Adding the `scripts` Directory to the Import Path

In [4]:
import sys
import os

# Look for the scripts directory one level above the notebook first,
# then fall back to a scripts folder inside the current working directory.
current_dir = os.getcwd()
scripts_path = os.path.join(current_dir, '..', 'scripts')
if not os.path.exists(scripts_path):
    scripts_path = os.path.join(current_dir, 'scripts')

if os.path.exists(scripts_path):
    # Avoid adding a duplicate entry when the cell is re-run
    if scripts_path not in sys.path:
        sys.path.append(scripts_path)
else:
    print("Warning: Could not find the scripts directory. Please ensure the path is correct.")


## Text Chunking Demo Notebook

This notebook demonstrates the text chunking functionality for financial complaint analysis. It shows how to:
  - Import and use the text chunking module
  - Run various demonstrations of chunking strategies
  - Process actual complaint data
  - Analyze chunking performance

The notebook is organized into cells that progressively build up the demonstration, from basic functionality to full integration with real data.


In [5]:
# Import the demo module
from demo_text_chunking import (
    demo_basic_chunking,
    demo_chunking_methods,
    demo_parameter_experimentation,
    demo_optimal_strategy,
    demo_integration,
    NarrativeChunkingStrategy,
    experiment_parameters,
    create_optimal_strategy
)

print("Text Chunking Demo Module imported successfully!")



Text Chunking Demo Module imported successfully!


### Running the Text Chunking Demonstrations

The next cell runs all the text chunking demonstrations to show the module in action. This includes:

 - **Basic chunking**: Simple text splitting with default parameters
 - **Different methods**: Testing various chunking approaches (recursive character, sentence-based, paragraph-based)
 - **Parameter experimentation**: Finding optimal chunking parameters for our data
 - **Optimal strategy creation**: Automatically determining the best chunking configuration
 - **Integration demo**: Processing actual complaint data and saving results

The demonstrations show both the functionality and the performance metrics of the chunking process.
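
To make the sentence-based approach more concrete, here is a minimal sketch that splits text at sentence boundaries and greedily packs whole sentences into chunks up to a size limit. The function `chunk_by_sentence` and its `chunk_size` parameter are assumptions made for illustration; the `custom_sentence` method in the demo module may work differently.

```python
import re


def chunk_by_sentence(text: str, chunk_size: int = 200) -> list[str]:
    """Group whole sentences into chunks of at most chunk_size characters.

    Hypothetical sketch: sentences are detected with a simple regex that
    splits on whitespace following '.', '!' or '?'. A single sentence
    longer than chunk_size is kept as its own oversized chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > chunk_size:
            # Adding this sentence would exceed the limit: close the chunk
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


print(chunk_by_sentence("One two. Three four. Five six.", chunk_size=20))
# → ['One two. Three four.', 'Five six.']
```

Unlike fixed-size character windows, this keeps each sentence intact, which tends to matter when chunks are later read or embedded as standalone units.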


In [6]:
# Define a main() function that runs every demonstration in order
def main():
    """Run all demonstrations."""
    print("Text Chunking Module Demo")
    print("=" * 40)
    
    demo_basic_chunking()
    demo_chunking_methods()
    demo_parameter_experimentation()
    demo_optimal_strategy()
    demo_integration()
    
    print("\nDemo completed!")

# Now call the main function
main()


INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully processed 1 chunks from 1 narratives
INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully chunked narrative into 0 chunks
INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully processed 1 chunks from 2 narratives
INFO:text_chunking:Successfully chunked narrative into 0 chunks
INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully processed 1 chunks from 2 narratives
INFO:text_chunking:Successfully chunked narrative into 0 chunks
INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully processed 1 chunks from 2 narratives
INFO:text_chunking:Successfully chunked narrative into 0 chunks
INFO:text_chunking:S

Text Chunking Module Demo
=== Basic Text Chunking ===
Created 1 chunks
Chunk 1: This is a long narrative that needs to be chunked into smaller pieces for better...

=== Different Chunking Methods ===
recursive_character: 1 chunks
custom_sentence: 1 chunks
custom_paragraph: 1 chunks

=== Parameter Experimentation ===
Best: custom_sentence, size=50, overlap=0.1

=== Optimal Strategy Creation ===
Optimal: custom_sentence, size=1000, overlap=200

=== Integration with Actual Complaint Data ===


INFO:text_chunking:Successfully chunked narrative into 2 chunks
INFO:text_chunking:Successfully chunked narrative into 3 chunks
INFO:text_chunking:Successfully chunked narrative into 4 chunks
INFO:text_chunking:Successfully chunked narrative into 10 chunks
INFO:text_chunking:Successfully chunked narrative into 4 chunks
INFO:text_chunking:Successfully chunked narrative into 2 chunks
INFO:text_chunking:Successfully chunked narrative into 3 chunks
INFO:text_chunking:Successfully chunked narrative into 12 chunks
INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully chunked narrative into 8 chunks
INFO:text_chunking:Successfully chunked narrative into 2 chunks
INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully chunked narrative into 3 chunks
INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully chunked narrative into 3 chunks
INFO:text_chunking:Successfully chunke

Loaded 82164 complaints from d:\10Acadamy\Intelligent-Complaint-Analysis-for-Financial-Services\notebooks\..\data\complaints_processed.csv
Using column: Consumer complaint narrative
After filtering empty narratives: 82164 complaints
Processing 100 narratives...


INFO:text_chunking:Successfully chunked narrative into 2 chunks
INFO:text_chunking:Successfully chunked narrative into 2 chunks
INFO:text_chunking:Successfully chunked narrative into 9 chunks
INFO:text_chunking:Successfully chunked narrative into 2 chunks
INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully chunked narrative into 3 chunks
INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully chunked narrative into 3 chunks
INFO:text_chunking:Successfully chunked narrative into 1 chunks
INFO:text_chunking:Successfully chunked narrative into 4 chunks
INFO:text_chunking:Successfully chunked narrative into 2 chunks
INFO:text_chunking:Successfully chunked narrative into 2 chunks
INFO:text_chunking:Successfully chunked narrative into 4 chunks
INFO:text_chunking:Successfully chunked narrative into 4 chunks
INFO:text_chunking:Successfully chunked narrative into 6 chunks
INFO:text_chunking:Successfully chunked 

Processed 100 narratives into 412 chunks
Average chunk length: 301.2 characters
Average word count: 53.8 words
Saved chunked results to CSV: True
Output file: d:\10Acadamy\Intelligent-Complaint-Analysis-for-Financial-Services\notebooks\..\data\chunked_narratives.csv
Saved chunking summary to: d:\10Acadamy\Intelligent-Complaint-Analysis-for-Financial-Services\notebooks\..\data\chunking_summary.csv

Demo completed!
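
The summary figures reported above (number of chunks, average chunk length in characters, average word count) can be computed along the following lines. This is a sketch under assumptions: `summarize_chunks` and the dictionary keys are hypothetical names, and the demo module's own summary logic is not shown in this notebook.

```python
from statistics import mean


def summarize_chunks(chunks: list[str]) -> dict:
    """Return simple size statistics for a list of text chunks.

    Hypothetical helper for illustration; words are counted by
    splitting on whitespace.
    """
    if not chunks:
        return {"num_chunks": 0, "avg_chars": 0.0, "avg_words": 0.0}
    return {
        "num_chunks": len(chunks),
        "avg_chars": mean(len(c) for c in chunks),
        "avg_words": mean(len(c.split()) for c in chunks),
    }


print(summarize_chunks(["a b c", "de"]))
# → {'num_chunks': 2, 'avg_chars': 3.5, 'avg_words': 2}
```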
