Text Processing Microservices

Project Requirements

Core Requirements

  1. Serving Service (FastAPI)

    • Expose a POST /summarize endpoint
    • Accept JSON text payload
    • Forward text to the Processing Service via gRPC (a minimal sketch of this flow appears after this list)
    • Return processed result to client
  2. Processing Service (gRPC)

    • Expose ProcessText gRPC method
    • Perform NLP processing on input text
    • Return processed result to Serving Service
  3. NLP Features Implemented

    • Tokenization (split into words/tokens)
    • Sentence Splitting (split into sentences)
    • Keyword Extraction (top N important words/phrases)
    • Sentiment Analysis (positive/negative polarity)
    • Part-of-Speech Tagging (identifying nouns, verbs, etc.)
    • Named Entity Recognition (people, organizations, locations)
    • Summarization (extractive)
    • Text Classification (categorizing text)
  4. Technical Requirements

    • Python for both services
    • FastAPI for HTTP service
    • grpcio for gRPC service
    • Async I/O where relevant
    • Error handling and logging
    • Containerized setup with Docker/docker-compose
    • Health check endpoints
    • Unit tests for core logic
    • GitHub Actions for CI/CD
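
As a concrete illustration of requirements 1 and 2, the sketch below shows a minimal /summarize endpoint forwarding text to the Processing Service over gRPC. It is not the repository's exact code: the generated module names, the stub/message/field names, and the service address are assumptions inferred from shared/protos/text_processing.proto and the architecture diagram.

# Hedged sketch of the Serving Service's forwarding flow (not the repository's exact code).
# The generated modules (text_processing_pb2, text_processing_pb2_grpc), the stub/message/field
# names, and the address "processing-service:50051" are assumptions.
import grpc
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

from shared.protos import text_processing_pb2, text_processing_pb2_grpc  # assumed output of `make proto`

app = FastAPI()


class SummarizeRequest(BaseModel):
    text: str
    pipelines: list[str] = ["classical"]


@app.post("/summarize")
async def summarize(request: SummarizeRequest):
    # Async channel to the gRPC Processing Service
    async with grpc.aio.insecure_channel("processing-service:50051") as channel:
        stub = text_processing_pb2_grpc.TextProcessingStub(channel)  # assumed stub name
        try:
            reply = await stub.ProcessText(
                text_processing_pb2.ProcessTextRequest(  # assumed message/field names
                    text=request.text,
                    pipelines=request.pipelines,
                )
            )
        except grpc.aio.AioRpcError as exc:
            # Surface gRPC failures to the HTTP client as a 502
            raise HTTPException(status_code=502, detail=exc.details())
    return {"pipeline": reply.pipeline, "results": reply.results_json}  # assumed reply fields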

System Architecture Overview

graph TB
    subgraph "Client Layer"
        C[HTTP Client]
    end
    
    subgraph "API Gateway Layer"
        FS[FastAPI Serving Service<br/>:8000]
    end
    
    subgraph "Processing Layer"
        PS[gRPC Processing Service<br/>:50051]
        
        subgraph "NLP Pipelines"
            CP[Classical Pipeline<br/>spaCy-based]
            MP[Modern Pipeline<br/>Transformers-based]
        end
    end
    
    subgraph "Observability Layer"
        OT[OpenTelemetry<br/>Console Export]
    end
    
    C -->|POST /summarize| FS
    C -->|POST /compare| FS
    FS -->|gRPC ProcessText| PS
    PS --> CP
    PS --> MP
    PS -->|Metrics & Traces| OT
    FS -->|Metrics & Traces| OT

Architecture Components

  1. Serving Service: FastAPI-based HTTP API gateway (Port 8000)
  2. Processing Service: gRPC-based NLP processing engine (Port 50051)
  3. Pipeline System: Pluggable NLP processors selected via a strategy pattern (see the sketch after this list)
  4. Configuration Management: Pydantic-based type-safe configs
  5. Observability: OpenTelemetry instrumentation (console export)
  6. Deployment: Docker containers orchestrated via docker-compose
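
The sketch below illustrates the pluggable-processor idea behind component 3. The real classes live in shared/interfaces/processor.py and processing_service/processors/, and their names and signatures may differ.

# Hedged sketch of the strategy pattern used for pluggable NLP processors;
# class and method names are illustrative, not the repository's actual API.
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseProcessor(ABC):
    """Common interface every NLP processing step implements."""

    name: str = "base"

    @abstractmethod
    def process(self, text: str, **params: Any) -> Dict[str, Any]:
        """Run one NLP step and return a JSON-serializable result."""


class Tokenizer(BaseProcessor):
    name = "tokenizer"

    def process(self, text: str, **params: Any) -> Dict[str, Any]:
        tokens = text.split()  # placeholder for the spaCy-based implementation
        return {"tokens": tokens, "count": len(tokens)}


class Pipeline:
    """Runs an ordered list of processors and aggregates their results."""

    def __init__(self, processors: List[BaseProcessor]) -> None:
        self.processors = processors

    def run(self, text: str, processor_params: Dict[str, Dict[str, Any]] | None = None) -> Dict[str, Any]:
        processor_params = processor_params or {}
        return {
            p.name: p.process(text, **processor_params.get(p.name, {}))
            for p in self.processors
        }


# A pipeline manager would select the "classical" or "modern" processor set here
classical = Pipeline([Tokenizer()])
print(classical.run("The organization shall establish procedures."))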

Project Directory Structure

text-processing-microservices/
├── serving_service/                    # FastAPI HTTP Service
│   ├── main.py                         # FastAPI application
│   ├── api/                            
│   │   └── endpoints/                  
│   │       ├── summarize.py            # /summarize and /compare endpoints
│   │       └── health.py               # Health checks
│   ├── services/                       
│   │   └── grpc_client.py              # gRPC client wrapper
│   ├── models/                         
│   │   ├── requests.py                 # Request models
│   │   └── responses.py                # Response models
│   ├── middleware/
│   │   └── telemetry.py                # OpenTelemetry middleware
│   └── config/                         
│       └── settings.py                 # Service configuration
│
├── processing_service/                 # gRPC Processing Service
│   ├── main.py                         # gRPC server entry
│   ├── grpc_server/                    
│   │   ├── server.py                   # gRPC server implementation
│   │   └── servicer.py                 # ProcessText servicer
│   ├── processors/                     # NLP Processors
│   │   ├── base.py                     # Abstract processor
│   │   ├── classical/                  # Classical NLP
│   │   │   ├── tokenizer.py            # Tokenization
│   │   │   ├── sentence_splitter.py    # Sentence splitting
│   │   │   ├── pos_tagger.py           # POS tagging
│   │   │   ├── ner_extractor.py        # Named Entity Recognition
│   │   │   └── keyword_extractor.py    # TF-IDF keywords
│   │   └── modern/                     # Modern NLP
│   │       ├── summarizer.py           # DistilBART summarization
│   │       ├── sentiment_analyzer.py   # Sentiment analysis
│   │       └── text_classifier.py      # Zero-shot classification
│   ├── pipelines/                      # Pipeline Orchestrators
│   │   ├── classical_pipeline.py       # Classical flow
│   │   ├── modern_pipeline.py          # Modern flow
│   │   └── pipeline_manager.py         # Pipeline selection & comparison
│   ├── interceptors/
│   │   └── telemetry.py                # gRPC telemetry interceptor
│   └── utils/                          # Utilities
│       ├── metrics_collector.py        # Performance metrics
│       ├── pipeline_comparator.py      # Pipeline comparison
│       └── results_aggregator.py       # Results aggregation
│
├── shared/                             # Shared Components
│   ├── protos/                         # gRPC Definitions
│   │   └── text_processing.proto       # Proto definitions
│   ├── interfaces/                     # Abstract Interfaces
│   │   └── processor.py                # Processor interface
│   ├── exceptions/                     # Custom Exceptions
│   │   └── base.py                     # Exception hierarchy
│   ├── utils/                          # Utilities
│   │   ├── logging.py                  # Structured logging
│   │   └── telemetry.py                # OpenTelemetry setup
│   └── config/                         # Shared Configuration
│       └── base.py                     # Base config models
│
├── infrastructure/                     # Deployment & Operations
│   ├── docker/                         # Docker Configuration
│   │   ├── serving.Dockerfile          # FastAPI image
│   │   └── processing.Dockerfile       # gRPC image
│   └── scripts/                        # Utility Scripts
│       ├── generate_proto.sh           # Proto generation
│       ├── benchmark.py                # Performance benchmarking
│       └── download_models.sh          # Model downloads
│
├── tests/                              # Test Suite
│   ├── unit/                           # Unit Tests
│   ├── integration/                    # Integration Tests
│   └── fixtures/                       # Test Data
│
├── .github/                            # GitHub Configuration
│   └── workflows/                      
│       └── ci.yml                      # CI/CD pipeline
│
├── docker-compose.yml                  # Container orchestration
├── docker-compose.override.yml         # Development overrides
├── .env.example                        # Environment template
├── .gitignore                          # Git ignores
├── requirements.txt                    # Python dependencies
├── requirements-dev.txt                # Dev dependencies
├── pyproject.toml                      # Project metadata
├── Makefile                            # Build commands
└── README.md                           # This file

Technology Stack

  • Language: Python 3.10.12
  • HTTP Framework: FastAPI 0.116.1
  • RPC Framework: gRPC 1.74.0 with Protocol Buffers
  • Classical NLP: spaCy 3.8.7
  • Modern NLP: HuggingFace Transformers 4.56.1
  • Configuration: Pydantic 2.11.9
  • Testing: pytest 8.4.2
  • Containerization: Docker, docker-compose
  • Observability: OpenTelemetry 1.37.0
  • CI/CD: GitHub Actions

Quick Start

Prerequisites

  • Docker and docker-compose installed
  • Python 3.10+ (for local development)
  • Make (for convenience commands)

Using Docker

# Clone repository
git clone <repository-url>
cd text-processing-microservices

# Copy environment configuration
cp .env.example .env

# Start services
make docker-up

# Wait for model downloads (first run only, ~1GB)
sleep 60

# Test classical pipeline
curl -X POST http://localhost:8000/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "The organization shall establish procedures.", "pipelines": ["classical"]}'

# Test modern pipeline
curl -X POST http://localhost:8000/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "The organization shall establish procedures.", "pipelines": ["modern"]}'

# Compare pipelines
curl -X POST http://localhost:8000/compare \
  -H "Content-Type: application/json" \
  -d '{"text": "The organization shall establish comprehensive procedures."}'

# Check health
curl http://localhost:8000/health

# Stop services
make docker-down

Installation

Local Development Setup

# Setup Python environment
python3.10 -m venv venv
source venv/bin/activate

# Install dependencies
make install

# Generate proto files with relative imports
make proto

# Download spaCy model
python -m spacy download en_core_web_sm

# Format code
make format

# Run linting
make lint

# Start services locally
python processing_service/main.py &
python serving_service/main.py

System Requirements

  • Python: 3.10.12 or higher
  • Docker: 20.10+ (for containerized deployment)
  • Memory: Minimum 4GB RAM (8GB recommended)
  • Storage: ~7GB for Docker images, ~1GB for models

Development

Available Make Commands

make help          # Show available commands
make install       # Install dependencies
make test          # Run tests
make test-cov      # Run tests with coverage
make lint          # Run linters
make format        # Format code
make clean         # Clean cache files
make proto         # Generate proto files
make docker-build  # Build Docker images
make docker-up     # Start services
make docker-down   # Stop services
make docker-logs   # View service logs

Code Quality

The project uses the following tools for code quality:

  • black: Code formatting
  • isort: Import sorting
  • flake8: Linting
  • mypy: Type checking

Run all checks:

make format
make lint

Protocol Buffers

Proto files must be regenerated after any changes:

make proto

This command generates Python code from proto definitions and fixes imports to use relative paths.
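
The shell script itself is not reproduced here; the sketch below shows the equivalent steps in Python, under the assumption that generate_proto.sh invokes protoc and then rewrites the generated absolute import into a relative one.

# Hedged Python equivalent of `make proto` (the real logic lives in
# infrastructure/scripts/generate_proto.sh and may differ in detail).
import re
from pathlib import Path

from grpc_tools import protoc

PROTO_DIR = "shared/protos"

# 1. Generate text_processing_pb2.py and text_processing_pb2_grpc.py
protoc.main([
    "grpc_tools.protoc",
    f"-I{PROTO_DIR}",
    f"--python_out={PROTO_DIR}",
    f"--grpc_python_out={PROTO_DIR}",
    f"{PROTO_DIR}/text_processing.proto",
])

# 2. Rewrite the absolute import emitted by protoc into a relative import,
#    so the generated code works as part of the shared.protos package
grpc_file = Path(PROTO_DIR) / "text_processing_pb2_grpc.py"
source = grpc_file.read_text()
source = re.sub(
    r"^import text_processing_pb2",
    "from . import text_processing_pb2",
    source,
    flags=re.MULTILINE,
)
grpc_file.write_text(source)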

API Documentation

POST /summarize

Process text through selected NLP pipelines.

Request:

{
  "text": "Text to process",
  "pipelines": ["classical", "modern"],
  "return_metrics": true,
  "processor_params": {
    "keyword_extractor": {"n_keywords": 5}
  }
}

Response:

{
  "pipeline": "classical",
  "results": {
    "tokenizer": {"tokens": [...], "count": 15},
    "sentence_splitter": {"sentences": [...], "count": 2},
    "pos_tagger": {"pos_tags": [["The", "DT"], ...]},
    "ner_extractor": {"entities": [...]},
    "keyword_extractor": {"keywords": [["procedures", 0.95]]}
  },
  "metrics": {
    "total_processing_time": 4.8,
    "processor_count": 5
  }
}
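
A Python equivalent of the curl examples in Quick Start, using the request and response shapes shown above. The requests library is used purely for illustration and is not necessarily a project dependency.

# Hedged client example; field names follow the JSON shapes documented above.
import requests

payload = {
    "text": "The organization shall establish comprehensive procedures.",
    "pipelines": ["classical"],
    "return_metrics": True,
    "processor_params": {"keyword_extractor": {"n_keywords": 5}},
}

response = requests.post("http://localhost:8000/summarize", json=payload, timeout=60)
response.raise_for_status()

result = response.json()
print(result["results"]["keyword_extractor"]["keywords"])
print(result["metrics"]["total_processing_time"])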

POST /compare

Compare multiple pipelines on the same text.

Request:

{
  "text": "Text to compare",
  "pipelines": ["classical", "modern"]
}

Response includes:

  • Results from both pipelines
  • Performance comparison
  • Speed difference metrics

GET /health

Check service health status.

Response:

{
  "status": "healthy",
  "version": "1.0.0",
  "grpc_connected": true,
  "available_pipelines": ["classical", "modern"]
}
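
A minimal sketch of how a health endpoint of this shape could probe the gRPC connection; the repository's actual check in serving_service/api/endpoints/health.py may work differently.

# Hedged health-check sketch; the probe strategy and service address are assumptions.
import asyncio

import grpc
from fastapi import FastAPI

app = FastAPI()


@app.get("/health")
async def health():
    grpc_connected = True
    try:
        async with grpc.aio.insecure_channel("processing-service:50051") as channel:
            # Wait briefly for the channel to reach the READY state
            await asyncio.wait_for(channel.channel_ready(), timeout=2.0)
    except (asyncio.TimeoutError, grpc.RpcError):
        grpc_connected = False
    return {
        "status": "healthy" if grpc_connected else "degraded",
        "version": "1.0.0",
        "grpc_connected": grpc_connected,
        "available_pipelines": ["classical", "modern"],
    }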

Testing

Running Tests

# All tests
make test

# With coverage
make test-cov

# Specific test file
pytest tests/unit/test_processors/classical/test_tokenizer.py -v

# Integration tests only
pytest tests/integration/ -v

Test Coverage

  • Total Tests: 129
  • Coverage: >80%
  • Execution Time: ~45 seconds (includes model loading)

Test Categories

  • Unit tests for individual components (example below)
  • Integration tests for service communication
  • End-to-end tests for complete workflows
  • Performance benchmarks
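
As an example of the unit-test category, a test for the classical tokenizer might look like the sketch below. The import path follows the directory layout shown earlier, while the class name and result keys are assumptions.

# Hedged unit-test sketch (tests/unit/test_processors/classical/test_tokenizer.py);
# Tokenizer and its result keys are assumed names, not verified against the repository.
import pytest

from processing_service.processors.classical.tokenizer import Tokenizer


@pytest.fixture
def tokenizer() -> Tokenizer:
    return Tokenizer()


def test_tokenizer_returns_tokens_and_count(tokenizer: Tokenizer) -> None:
    result = tokenizer.process("The organization shall establish procedures.")
    assert result["count"] == len(result["tokens"])
    assert any(token.lower().startswith("procedure") for token in result["tokens"])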

Performance Benchmarks

Benchmark Results

Pipeline         | Average Time | Min Time | Max Time | Speed Factor
-----------------|--------------|----------|----------|--------------
Classical        | 4.8s         | 4.2s     | 5.1s     | 1x (baseline)
Modern (initial) | 12.6s        | 11.8s    | 13.2s    | 0.38x
Modern (cached)  | 0.28s        | 0.25s    | 0.31s    | 17x

Running Benchmarks

python infrastructure/scripts/benchmark.py --iterations 5 --pipelines classical modern

Performance Metrics

  • Classical vs Modern: Classical is ~2.6x faster than the uncached modern pipeline on average
  • Time Savings: Classical saves ~62% of processing time versus uncached modern runs
  • Model Loading: ~12 seconds for the modern pipeline on first use
  • Cache Impact: ~45x speedup for the modern pipeline once models are cached

Docker Deployment

Building Images

make docker-build

Running Services

# Start all services
make docker-up

# View logs
make docker-logs

# Stop services
make docker-down

Docker Compose Configuration

The project includes:

  • docker-compose.yml: Production configuration
  • docker-compose.override.yml: Development overrides with volume mounts

Container Details

  • Processing Service: 6.8GB (includes ML models)
  • Serving Service: 935MB
  • Model Cache: Persistent volume for model storage

CI/CD Pipeline

GitHub Actions Workflow

The CI/CD pipeline runs on push and pull requests:

jobs:
  test:
    - Run 129 unit tests
    - Run integration tests
    - Coverage report (>80%)
    
  lint:
    - Black formatting
    - isort imports
    - flake8 linting
    - mypy type checking
    
  docker-build:
    - Build serving image
    - Build processing image
    - Cache layers for speed

Running CI Locally

While the full CI runs on GitHub Actions, you can run tests locally:

make test
make lint
make docker-build

Monitoring & Observability

Current Implementation

The system includes OpenTelemetry instrumentation with:

  • Distributed tracing across services
  • Metrics collection (request rate, latency, errors)
  • Structured JSON logging with trace correlation
  • Console export for development

Telemetry Configuration

Enable telemetry by setting in .env:

OTEL_ENABLED=true
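
A minimal sketch of the kind of console-export tracing setup shared/utils/telemetry.py could perform when OTEL_ENABLED=true; the real module may also wire up metrics and FastAPI/gRPC instrumentation.

# Hedged sketch of console-export tracing setup; function and span names are illustrative.
import os

from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter


def setup_tracing(service_name: str) -> None:
    if os.getenv("OTEL_ENABLED", "false").lower() != "true":
        return
    provider = TracerProvider(resource=Resource.create({"service.name": service_name}))
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)


setup_tracing("serving-service")
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("summarize-request"):
    pass  # spans created here are printed to the console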

Metrics Collected

  • Request rate/latency/errors (RED metrics)
  • Pipeline processing times
  • Model inference latency
  • Token processing rate
  • Pipeline comparison metrics

Possible Enhancements

The system is prepared for production observability with:

  • Prometheus metrics export
  • Jaeger distributed tracing
  • Grafana dashboards
  • Log aggregation with Loki

Known Issues

Current Limitations

  1. Docker Image Size: Processing service is 6.8GB due to ML models

    • Workaround: Use volume mounts for models in production
  2. PyTorch Compatibility: Requires typing-extensions==4.8.0

    • Solution: Install typing-extensions before PyTorch
  3. First Run Performance: Initial model download takes several minutes

    • Solution: Pre-download models or use cached images (see the sketch after this list)
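
A sketch of pre-downloading the models (the role of infrastructure/scripts/download_models.sh). The HuggingFace model IDs are assumptions based on the DistilBART and zero-shot descriptions above, not confirmed project settings.

# Hedged model pre-download sketch; model IDs are assumptions, not verified against the repo.
import spacy.cli
from transformers import pipeline

# spaCy model for the classical pipeline
spacy.cli.download("en_core_web_sm")

# Transformer models for the modern pipeline; instantiating a pipeline caches
# the weights locally (typically under ~/.cache/huggingface)
pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
pipeline("sentiment-analysis")
pipeline("zero-shot-classification", model="facebook/bart-large-mnli")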

Dependency Notes

  • PyTorch CPU version is used to avoid CUDA dependencies
  • spaCy models must be downloaded separately
  • Transformer models are downloaded on first use

About

A production-ready microservices architecture for NLP text processing built with FastAPI and gRPC.
