- Project Requirements
- System Architecture Overview
- Project Directory Structure
- Technology Stack
- Quick Start
- Installation
- Development
- API Documentation
- Testing
- Performance Benchmarks
- Docker Deployment
- CI/CD Pipeline
- Monitoring & Observability
- Known Issues
- Contributing
Serving Service (FastAPI)
- Expose a POST /summarize endpoint
- Accept JSON text payload
- Forward text to Processing Service via gRPC
- Return processed result to client
Processing Service (gRPC)
- Expose a ProcessText gRPC method
- Perform NLP processing on input text
- Return processed result to Serving Service
NLP Features Implemented
- Tokenization (split into words/tokens)
- Sentence Splitting (split into sentences)
- Keyword Extraction (top N important words/phrases)
- Sentiment Analysis (positive/negative polarity)
- Part-of-Speech Tagging (identifying nouns, verbs, etc.)
- Named Entity Recognition (people, organizations, locations)
- Summarization (extractive)
- Text Classification (categorizing text)
Technical Requirements
- Python for both services
- FastAPI for HTTP service
- grpcio for gRPC service
- Async I/O where relevant
- Error handling and logging
- Containerized setup with Docker/docker-compose
- Health check endpoints
- Unit tests for core logic
- GitHub Actions for CI/CD
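The gRPC side of these requirements is small in practice. Below is a minimal, illustrative sketch of a ProcessText servicer using grpc.aio; the message and servicer names are assumptions (the actual definitions live in shared/protos/text_processing.proto), not the project's generated code.

```python
# Hypothetical sketch of the Processing Service entry point.
# Message/servicer names are illustrative, not the project's generated code.
import asyncio

import grpc

import text_processing_pb2       # assumed generated module
import text_processing_pb2_grpc  # assumed generated module


class TextProcessingServicer(text_processing_pb2_grpc.TextProcessingServicer):
    async def ProcessText(self, request, context):
        # Real code would dispatch to the classical/modern NLP pipelines here.
        processed = request.text.upper()  # placeholder for actual NLP work
        return text_processing_pb2.ProcessResponse(result=processed)


async def serve() -> None:
    server = grpc.aio.server()
    text_processing_pb2_grpc.add_TextProcessingServicer_to_server(
        TextProcessingServicer(), server
    )
    server.add_insecure_port("[::]:50051")  # port used throughout this README
    await server.start()
    await server.wait_for_termination()


if __name__ == "__main__":
    asyncio.run(serve())
```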
graph TB
subgraph "Client Layer"
C[HTTP Client]
end
subgraph "API Gateway Layer"
FS[FastAPI Serving Service<br/>:8000]
end
subgraph "Processing Layer"
PS[gRPC Processing Service<br/>:50051]
subgraph "NLP Pipelines"
CP[Classical Pipeline<br/>spaCy-based]
MP[Modern Pipeline<br/>Transformers-based]
end
end
subgraph "Observability Layer"
OT[OpenTelemetry<br/>Console Export]
end
C -->|POST /summarize| FS
C -->|POST /compare| FS
FS -->|gRPC ProcessText| PS
PS --> CP
PS --> MP
PS -->|Metrics & Traces| OT
FS -->|Metrics & Traces| OT
- Serving Service: FastAPI-based HTTP API gateway (Port 8000)
- Processing Service: gRPC-based NLP processing engine (Port 50051)
- Pipeline System: Pluggable NLP processors with strategy pattern
- Configuration Management: Pydantic-based type-safe configs
- Observability: OpenTelemetry instrumentation (console export)
- Deployment: Docker containers orchestrated via docker-compose
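A condensed sketch of how the gateway layer can forward a request to the processing layer; the request model and stub/message names below are illustrative assumptions, not the actual code under serving_service/.

```python
# Illustrative gateway flow only; SummarizeRequest and the gRPC stub/message
# names are assumptions, not the project's actual definitions.
import grpc
from fastapi import FastAPI
from pydantic import BaseModel

import text_processing_pb2       # assumed generated module
import text_processing_pb2_grpc  # assumed generated module

app = FastAPI()


class SummarizeRequest(BaseModel):
    text: str
    pipelines: list[str] = ["classical"]


@app.post("/summarize")
async def summarize(payload: SummarizeRequest) -> dict:
    # Forward the text to the gRPC Processing Service on port 50051.
    async with grpc.aio.insecure_channel("processing-service:50051") as channel:
        stub = text_processing_pb2_grpc.TextProcessingStub(channel)
        reply = await stub.ProcessText(
            text_processing_pb2.ProcessRequest(
                text=payload.text, pipelines=payload.pipelines
            )
        )
    return {"pipelines": payload.pipelines, "result": reply.result}
```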
text-processing-microservices/
├── serving_service/ # FastAPI HTTP Service
│ ├── main.py # FastAPI application
│ ├── api/
│ │ └── endpoints/
│ │ ├── summarize.py # /summarize and /compare endpoints
│ │ └── health.py # Health checks
│ ├── services/
│ │ └── grpc_client.py # gRPC client wrapper
│ ├── models/
│ │ ├── requests.py # Request models
│ │ └── responses.py # Response models
│ ├── middleware/
│ │ └── telemetry.py # OpenTelemetry middleware
│ └── config/
│ └── settings.py # Service configuration
│
├── processing_service/ # gRPC Processing Service
│ ├── main.py # gRPC server entry
│ ├── grpc_server/
│ │ ├── server.py # gRPC server implementation
│ │ └── servicer.py # ProcessText servicer
│ ├── processors/ # NLP Processors
│ │ ├── base.py # Abstract processor
│ │ ├── classical/ # Classical NLP
│ │ │ ├── tokenizer.py # Tokenization
│ │ │ ├── sentence_splitter.py # Sentence splitting
│ │ │ ├── pos_tagger.py # POS tagging
│ │ │ ├── ner_extractor.py # Named Entity Recognition
│ │ │ └── keyword_extractor.py # TF-IDF keywords
│ │ └── modern/ # Modern NLP
│ │ ├── summarizer.py # DistilBART summarization
│ │ ├── sentiment_analyzer.py # Sentiment analysis
│ │ └── text_classifier.py # Zero-shot classification
│ ├── pipelines/ # Pipeline Orchestrators
│ │ ├── classical_pipeline.py # Classical flow
│ │ ├── modern_pipeline.py # Modern flow
│ │ └── pipeline_manager.py # Pipeline selection & comparison
│ ├── interceptors/
│ │ └── telemetry.py # gRPC telemetry interceptor
│ └── utils/ # Utilities
│ ├── metrics_collector.py # Performance metrics
│ ├── pipeline_comparator.py # Pipeline comparison
│ └── results_aggregator.py # Results aggregation
│
├── shared/ # Shared Components
│ ├── protos/ # gRPC Definitions
│ │ └── text_processing.proto # Proto definitions
│ ├── interfaces/ # Abstract Interfaces
│ │ └── processor.py # Processor interface
│ ├── exceptions/ # Custom Exceptions
│ │ └── base.py # Exception hierarchy
│ ├── utils/ # Utilities
│ │ ├── logging.py # Structured logging
│ │ └── telemetry.py # OpenTelemetry setup
│ └── config/ # Shared Configuration
│ └── base.py # Base config models
│
├── infrastructure/ # Deployment & Operations
│ ├── docker/ # Docker Configuration
│ │ ├── serving.Dockerfile # FastAPI image
│ │ └── processing.Dockerfile # gRPC image
│ └── scripts/ # Utility Scripts
│ ├── generate_proto.sh # Proto generation
│ ├── benchmark.py # Performance benchmarking
│ └── download_models.sh # Model downloads
│
├── tests/ # Test Suite
│ ├── unit/ # Unit Tests
│ ├── integration/ # Integration Tests
│ └── fixtures/ # Test Data
│
├── .github/ # GitHub Configuration
│ └── workflows/
│ └── ci.yml # CI/CD pipeline
│
├── docker-compose.yml # Container orchestration
├── docker-compose.override.yml # Development overrides
├── .env.example # Environment template
├── .gitignore # Git ignores
├── requirements.txt # Python dependencies
├── requirements-dev.txt # Dev dependencies
├── pyproject.toml # Project metadata
├── Makefile # Build commands
└── README.md # This file
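The tree above references an abstract processor (processors/base.py, shared/interfaces/processor.py) and pluggable pipelines. The sketch below shows the general strategy-pattern shape this implies; names and return formats are illustrative, loosely based on the API examples later in this README.

```python
# Illustrative processor/pipeline shape; the real modules under
# processing_service/processors and pipelines/ may differ.
from abc import ABC, abstractmethod
from typing import Any


class TextProcessor(ABC):
    """Common interface every NLP processor implements."""

    name: str = "base"

    @abstractmethod
    def process(self, text: str, **params: Any) -> dict[str, Any]:
        """Return this processor's results for the given text."""


class Tokenizer(TextProcessor):
    name = "tokenizer"

    def process(self, text: str, **params: Any) -> dict[str, Any]:
        tokens = text.split()  # placeholder for the spaCy-based tokenizer
        return {"tokens": tokens, "count": len(tokens)}


class Pipeline:
    """Runs an ordered list of processors and aggregates their results."""

    def __init__(self, processors: list[TextProcessor]) -> None:
        self.processors = processors

    def run(self, text: str, params: dict[str, dict] | None = None) -> dict[str, Any]:
        params = params or {}
        return {p.name: p.process(text, **params.get(p.name, {})) for p in self.processors}


classical = Pipeline([Tokenizer()])
print(classical.run("The organization shall establish procedures."))
```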
- Language: Python 3.10.12
- HTTP Framework: FastAPI 0.116.1
- RPC Framework: gRPC 1.74.0 with Protocol Buffers
- Classical NLP: spaCy 3.8.7
- Modern NLP: HuggingFace Transformers 4.56.1
- Configuration: Pydantic 2.11.9
- Testing: pytest 8.4.2
- Containerization: Docker, docker-compose
- Observability: OpenTelemetry 1.37.0
- CI/CD: GitHub Actions
- Docker and docker-compose installed
- Python 3.10+ (for local development)
- Make (for convenience commands)
# Clone repository
git clone <repository-url>
cd text-processing-microservices
# Copy environment configuration
cp .env.example .env
# Start services
make docker-up
# Wait for model downloads (first run only, ~1GB)
sleep 60
# Test classical pipeline
curl -X POST http://localhost:8000/summarize \
-H "Content-Type: application/json" \
-d '{"text": "The organization shall establish procedures.", "pipelines": ["classical"]}'
# Test modern pipeline
curl -X POST http://localhost:8000/summarize \
-H "Content-Type: application/json" \
-d '{"text": "The organization shall establish procedures.", "pipelines": ["modern"]}'
# Compare pipelines
curl -X POST http://localhost:8000/compare \
-H "Content-Type: application/json" \
-d '{"text": "The organization shall establish comprehensive procedures."}'
# Check health
curl http://localhost:8000/health
# Stop services
make docker-down

# Setup Python environment
python3.10 -m venv venv
source venv/bin/activate
# Install dependencies
make install
# Generate proto files with relative imports
make proto
# Download spaCy model
python -m spacy download en_core_web_sm
# Format code
make format
# Run linting
make lint
# Start services locally
python processing_service/main.py &
python serving_service/main.py

- Python: 3.10.12 or higher
- Docker: 20.10+ (for containerized deployment)
- Memory: Minimum 4GB RAM (8GB recommended)
- Storage: ~7GB for Docker images, ~1GB for models
make help # Show available commands
make install # Install dependencies
make test # Run tests
make test-cov # Run tests with coverage
make lint # Run linters
make format # Format code
make clean # Clean cache files
make proto # Generate proto files
make docker-build # Build Docker images
make docker-up # Start services
make docker-down # Stop services
make docker-logs # View service logs

The project uses the following tools for code quality:
- black: Code formatting
- isort: Import sorting
- flake8: Linting
- mypy: Type checking
Run all checks:
make format
make lint

Proto files must be regenerated after any changes:
make proto

This command generates Python code from proto definitions and fixes imports to use relative paths.
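For reference, proto generation usually boils down to a single grpc_tools.protoc invocation; the paths below are assumptions based on the directory layout, and the actual infrastructure/scripts/generate_proto.sh may differ (it also rewrites the generated imports to be relative).

```python
# Rough Python equivalent of a proto-generation step (paths are assumptions).
from grpc_tools import protoc

protoc.main([
    "grpc_tools.protoc",                    # argv[0], ignored by protoc itself
    "-Ishared/protos",                      # proto include path
    "--python_out=shared/protos",           # generated message classes
    "--grpc_python_out=shared/protos",      # generated service stubs
    "shared/protos/text_processing.proto",
])
```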
POST /summarize

Process text through selected NLP pipelines.
Request:
{
"text": "Text to process",
"pipelines": ["classical", "modern"],
"return_metrics": true,
"processor_params": {
"keyword_extractor": {"n_keywords": 5}
}
}

Response:
{
"pipeline": "classical",
"results": {
"tokenizer": {"tokens": [...], "count": 15},
"sentence_splitter": {"sentences": [...], "count": 2},
"pos_tagger": {"pos_tags": [["The", "DT"], ...]},
"ner_extractor": {"entities": [...]},
"keyword_extractor": {"keywords": [["procedures", 0.95]]}
},
"metrics": {
"total_processing_time": 4.8,
"processor_count": 5
}
}

POST /compare

Compare multiple pipelines on the same text.
Request:
{
"text": "Text to compare",
"pipelines": ["classical", "modern"]
}

Response includes:
- Results from both pipelines
- Performance comparison
- Speed difference metrics
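If you prefer calling the API from Python instead of curl, a minimal example looks like this (it uses the requests library purely for illustration; requests is not a project dependency):

```python
# Minimal Python client example mirroring the curl calls above.
import requests

BASE_URL = "http://localhost:8000"
payload = {
    "text": "The organization shall establish comprehensive procedures.",
    "pipelines": ["classical", "modern"],
    "return_metrics": True,
}

summary = requests.post(f"{BASE_URL}/summarize", json=payload, timeout=120).json()
comparison = requests.post(
    f"{BASE_URL}/compare", json={"text": payload["text"]}, timeout=120
).json()

print(summary)
print(comparison)
```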
GET /health

Check service health status.
Response:
{
"status": "healthy",
"version": "1.0.0",
"grpc_connected": true,
"available_pipelines": ["classical", "modern"]
}

# All tests
make test
# With coverage
make test-cov
# Specific test file
pytest tests/unit/test_processors/classical/test_tokenizer.py -v
# Integration tests only
pytest tests/integration/ -v

- Total Tests: 129
- Coverage: >80%
- Execution Time: ~45 seconds (includes model loading)
- Unit tests for individual components
- Integration tests for service communication
- End-to-end tests for complete workflows
- Performance benchmarks
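A typical unit test in this suite might look roughly like the sketch below; the import path and asserted fields are assumptions based on the response format shown earlier, not the actual test file contents.

```python
# Hypothetical test sketch (see tests/unit/test_processors/classical/);
# the real tests may import and assert differently.
import pytest

from processing_service.processors.classical.tokenizer import Tokenizer  # assumed path


@pytest.fixture
def tokenizer() -> Tokenizer:
    return Tokenizer()


def test_tokenizer_reports_consistent_count(tokenizer: Tokenizer) -> None:
    result = tokenizer.process("The organization shall establish procedures.")
    assert result["count"] == len(result["tokens"])
    assert "organization" in [token.lower() for token in result["tokens"]]
```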
| Pipeline | Average Time | Min Time | Max Time | Speed Factor |
|---|---|---|---|---|
| Classical | 4.8s | 4.2s | 5.1s | 1x (baseline) |
| Modern (initial) | 12.6s | 11.8s | 13.2s | 0.38x |
| Modern (cached) | 0.28s | 0.25s | 0.31s | 17x |
python infrastructure/scripts/benchmark.py --iterations 5 --pipelines classical modern

- Classical vs Modern: Classical is ~48x faster on average (uncached)
- Time Savings: Classical saves ~98% processing time
- Model Loading: ~12 seconds for modern pipeline
- Cache Impact: 45x speedup with cached models
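Conceptually, the benchmark just times repeated /summarize calls per pipeline and aggregates the durations; a simplified sketch follows (the real infrastructure/scripts/benchmark.py collects more detail):

```python
# Simplified benchmark loop; illustrative only.
import statistics
import time

import requests  # used for illustration; not necessarily a project dependency

TEXT = "The organization shall establish comprehensive procedures."


def time_pipeline(pipeline: str, iterations: int = 5) -> list[float]:
    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        requests.post(
            "http://localhost:8000/summarize",
            json={"text": TEXT, "pipelines": [pipeline]},
            timeout=300,
        )
        durations.append(time.perf_counter() - start)
    return durations


for name in ("classical", "modern"):
    runs = time_pipeline(name)
    print(f"{name}: avg={statistics.mean(runs):.2f}s min={min(runs):.2f}s max={max(runs):.2f}s")
```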
make docker-build

# Start all services
make docker-up
# View logs
make docker-logs
# Stop services
make docker-down

The project includes:
- docker-compose.yml: Production configuration
- docker-compose.override.yml: Development overrides with volume mounts
- Processing Service: 6.8GB (includes ML models)
- Serving Service: 935MB
- Model Cache: Persistent volume for model storage
The CI/CD pipeline runs on push and pull requests:
jobs:
  test:
    - Run 129 unit tests
    - Run integration tests
    - Coverage report (>80%)
  lint:
    - Black formatting
    - isort imports
    - flake8 linting
    - mypy type checking
  docker-build:
    - Build serving image
    - Build processing image
    - Cache layers for speed

While the full CI runs on GitHub Actions, you can run tests locally:
make test
make lint
make docker-build

The system includes OpenTelemetry instrumentation with:
- Distributed tracing across services
- Metrics collection (request rate, latency, errors)
- Structured JSON logging with trace correlation
- Console export for development
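The console-export setup amounts to a few lines of OpenTelemetry SDK wiring; a hedged sketch of what shared/utils/telemetry.py might do (the actual implementation may differ):

```python
# Illustrative OpenTelemetry console-export setup; not the project's exact code.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter


def setup_tracing(service_name: str) -> trace.Tracer:
    provider = TracerProvider(resource=Resource.create({"service.name": service_name}))
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
    return trace.get_tracer(service_name)


tracer = setup_tracing("serving-service")
with tracer.start_as_current_span("summarize-request"):
    pass  # spans wrapping real work get printed to the console
```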
Enable telemetry by setting in .env:
OTEL_ENABLED=true

Key metrics collected:
- Request rate/latency/errors (RED metrics)
- Pipeline processing times
- Model inference latency
- Token processing rate
- Pipeline comparison metrics
The system is prepared for production observability with:
- Prometheus metrics export
- Jaeger distributed tracing
- Grafana dashboards
- Log aggregation with Loki
- Docker Image Size: Processing service is 6.8GB due to ML models
  - Workaround: Use volume mounts for models in production
- PyTorch Compatibility: Requires typing-extensions==4.8.0
  - Solution: Install typing-extensions before PyTorch
- First Run Performance: Initial model download takes several minutes
  - Solution: Pre-download models or use cached images
- PyTorch CPU version is used to avoid CUDA dependencies
- spaCy models must be downloaded separately
- Transformer models are downloaded on first use