A comprehensive, production-ready federated learning simulation framework with advanced efficiency metrics, testing, and visualization capabilities.
Author: Rahul Kavati
Status: Production Ready with Comprehensive Testing
- FL Simulation: Complete federated learning workflow with synthetic health data
- Efficiency Metrics: Comprehensive performance analysis and benchmarking
- Testing Framework: 80%+ code coverage with unit, integration, and performance tests
- Visualization: Advanced metrics analysis and plotting capabilities
- Security Ready: Designed for future integration with CKKS encryption
- CI/CD: Automated testing and quality assurance via GitHub Actions
- Documentation: Comprehensive guides and API documentation
fl_simulation/
├── common/                       # Core utilities and schemas
│   ├── schemas.py                # Data structures and validation
│   └── efficiency_metrics.py     # FL performance analysis
├── simulation/                   # FL simulation engine
│   └── client_simulation.py      # Main simulation logic
├── data/                         # Data generation and storage
│   ├── clients/                  # Client datasets
│   └── simulate_health_data.py
├── updates/                      # Model updates storage
│   ├── json/                     # Human-readable updates
│   └── numpy/                    # Binary updates for processing
├── visualize/                    # Analysis and visualization
│   └── metrics_analysis.py       # Metrics plotting and analysis
├── tests/                        # Comprehensive testing suite
│   ├── test_efficiency_metrics.py
│   ├── test_client_simulation.py
│   └── run_tests.py
├── scripts/                      # Automation and utilities
│   └── run_experiments.py        # Multi-experiment runner
└── .github/workflows/            # CI/CD automation
# Clone the repository
git clone <your-repo-url>
cd fl_simulation
# Install dependencies
pip install -r requirements.txt
# Verify installation
python3 tests/run_tests.py --check-deps

# Run FL simulation with 5 clients, 3 rounds
python3 simulation/client_simulation.py
# Expected output:
# Starting Federated Learning Simulation...
# Loaded data for 5 clients
# Initialized global model with 4 features
# --- Round 1/3 ---
# Training client client_0 with 200 samples...
# ...
# Simulation completed successfully!

# Generate efficiency metrics and visualizations
python3 visualize/metrics_analysis.py
# View generated plots in the metrics/ directory
# (open is macOS-specific; on Linux use xdg-open instead)
open metrics/accuracy_analysis.png
open metrics/efficiency_metrics.png

The framework reports the following FL efficiency metrics:
- Total communication rounds
- Bytes transferred
- Communication overhead percentage
- Training time per round
- Convergence analysis
- Resource utilization
- Accuracy improvement tracking
- Weight convergence analysis
- Loss reduction metrics
- Memory usage optimization
- CPU utilization tracking
- Scalability analysis
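As a rough illustration of how the communication figures in this list might be derived, the sketch below counts one client-to-server upload per client per round and estimates the payload from the number of model parameters. The names `CommunicationEstimate` and `estimate_communication`, the 8-bytes-per-value assumption, and the choice to ignore downloads and serialization overhead are all assumptions made for illustration; they are not the actual calculation in common/efficiency_metrics.py, so the numbers will not necessarily match the framework's reports.

```python
from dataclasses import dataclass


@dataclass
class CommunicationEstimate:
    """Hypothetical container, not the framework's FLEfficiencyMetrics."""
    communication_rounds: int
    bytes_transferred: int


def estimate_communication(num_clients: int, num_rounds: int,
                           num_features: int,
                           bytes_per_value: int = 8) -> CommunicationEstimate:
    """Estimate upload traffic, assuming each client sends one update per round."""
    if num_clients < 0 or num_rounds < 0 or num_features < 0:
        raise ValueError("counts must be non-negative")
    uploads = num_clients * num_rounds
    # Each update carries one delta per feature weight plus one bias delta.
    values_per_update = num_features + 1
    total_bytes = uploads * values_per_update * bytes_per_value
    return CommunicationEstimate(uploads, total_bytes)


estimate = estimate_communication(num_clients=5, num_rounds=3, num_features=4)
print(estimate.communication_rounds)  # 15
print(estimate.bytes_transferred)     # 600
```

With 5 clients, 3 rounds, and 4 features this gives 15 uploads of 5 values each, i.e. 600 bytes of raw payload before any JSON or NumPy file overhead.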
# Run all tests
python3 tests/run_tests.py
# Run with coverage reporting
python3 tests/run_tests.py --coverage
# Run specific test modules
python3 tests/run_tests.py --module test_efficiency_metrics
# Performance testing
python3 tests/run_tests.py --performance

- Unit Tests: Individual component validation
- Integration Tests: Component interaction testing
- Performance Tests: Speed and efficiency benchmarks
- Edge Case Tests: Error handling and boundary conditions
- GitHub Actions: Automated testing on every push/PR
- Code Coverage: Track test coverage over time
- Linting: Code quality and style enforcement
- Security: Vulnerability scanning and dependency checks
# Run multiple experiments with different configurations
python3 scripts/run_experiments.py
# This will run simulations with:
# - 2 rounds, 3 rounds, 4 rounds, 5 rounds
# - Generate comprehensive metrics for comparison
# - Save results for analysis

# Modify simulation parameters in simulation/client_simulation.py
NUM_ROUNDS = 5 # Number of FL rounds
DATA_DIR = "data/clients" # Client data directory
OUTPUT_JSON = "updates/json" # JSON output location
OUTPUT_NPY = "updates/numpy" # NumPy output location

The framework is designed to be easily extended for future collaborative work:
# Updates are saved in standardized formats:
# JSON: Human-readable for debugging
# NumPy: Binary format for encryption processing
# Example update structure:
{
"client_id": "client_0",
"round_id": 0,
"weight_delta": [0.1, -0.2, 0.3, -0.1],
"bias_delta": 0.05,
"num_samples": 200
}

After running simulations, you'll find:
metrics/
├── fl_simulation_3rounds_5clients.json  # Individual experiment
├── metrics_summary.json                 # Aggregated results
├── metrics_history.csv                  # CSV for analysis
├── accuracy_analysis.png                # Accuracy plots
├── efficiency_metrics.png               # Efficiency plots
├── convergence_trends.png               # Convergence analysis
└── analysis_report.json                 # Detailed report
==================================================
FL EFFICIENCY METRICS SUMMARY
==================================================
Communication Rounds: 15
Bytes Transferred: 15.00 KB
Final Accuracy: 0.7980
Accuracy Improvement: 0.2980
Convergence Round: Not reached
Memory Usage: 0.0000 MB
==================================================
- Create Tests First: Follow TDD principles
- Update Documentation: Keep README current
- Run Quality Checks: Ensure all tests pass
- Update Requirements: Add new dependencies
- Test Coverage: Maintain 80%+ coverage
- Linting: Follow PEP 8 standards
- Type Hints: Use type annotations
- Documentation: Comprehensive docstrings
# Before committing:
python3 tests/run_tests.py --coverage
python3 -m flake8 common simulation visualize
python3 -m black --check common simulation visualize

# Optional: Set custom paths
export FL_DATA_DIR="/path/to/data"
export FL_OUTPUT_DIR="/path/to/output"
export FL_LOG_LEVEL="INFO"

# In simulation/client_simulation.py
from sklearn.linear_model import LogisticRegression

global_model = LogisticRegression(
    penalty=None,
    fit_intercept=True,
    solver="lbfgs",
    max_iter=1000,    # Increase for better convergence
    warm_start=True,
    random_state=42   # For reproducibility
)

@dataclass
class FLEfficiencyMetrics:
    timestamp: str
    num_clients: int
    num_rounds: int
    total_samples: int
    # ... and many more metrics

class FLEfficiencyCalculator:
    def calculate_efficiency_metrics(self, clients_data, global_model, num_rounds, training_time=None): ...
    def save_metrics(self, metrics, experiment_name=None): ...
    def calculate_communication_efficiency(self, num_clients, num_rounds): ...

def main():
    """Main FL simulation workflow"""
    # Load client data
    # Initialize global model
    # Run FL rounds
    # Calculate efficiency metrics
    # Save results

- Import Errors: Ensure Python path includes project root
- Missing Dependencies: Run pip install -r requirements.txt
- Test Failures: Check test output for specific error details
- Performance Issues: Verify system resources and data sizes
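For the import-error case, one common workaround is to put the project root on `sys.path` before importing project modules. The helper below is a hypothetical sketch: the name `add_project_root` and the assumption that the calling script sits one directory below the project root (as tests/run_tests.py does in the layout above) are illustrative and not part of the framework.

```python
import sys
from pathlib import Path


def add_project_root(script_path: str, levels_up: int = 1) -> Path:
    """Prepend the directory `levels_up` above the script's folder to sys.path."""
    # parents[0] is the script's own folder; parents[1] is one level above it.
    root = Path(script_path).resolve().parents[levels_up]
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root


# Typical use at the top of a script in tests/: add_project_root(__file__)
root = add_project_root("tests/run_tests.py")
print(root)
```

Setting the `PYTHONPATH` environment variable to the project root before launching Python achieves the same effect without touching the code.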
# Run with maximum verbosity
python3 tests/run_tests.py --verbosity 2
# Single test debugging
python3 -m pytest tests/test_efficiency_metrics.py::TestFLEfficiencyMetrics::test_metrics_creation -s

- Check this README
- Review test output carefully
- Check dependency versions
- Verify test environment
- Create issue with detailed error information
- Distributed Training: Multi-node FL simulation
- Advanced Aggregation: FedProx, FedNova algorithms
- Real-time Monitoring: Live metrics dashboard
- Benchmarking Suite: Compare different FL approaches
- Export Formats: TensorFlow, PyTorch model export
- Paper Reproduction: Standard FL algorithm implementations
- Custom Algorithms: Easy integration of new FL methods
- Performance Analysis: Comprehensive benchmarking tools
- Publication Ready: Generate publication-quality plots
The framework is designed to be easily extended for collaborative research:
- Encryption Layer: Integration with CKKS homomorphic encryption
- Advanced Aggregation: Custom aggregation algorithms
- Multi-Party Computation: Secure multi-party FL protocols
- Blockchain Integration: Decentralized FL coordination
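As a concrete starting point for the aggregation items above, the sketch below applies sample-weighted averaging (FedAvg-style) to updates that use the JSON structure documented earlier. The function name `aggregate_updates` is hypothetical and the framework's own aggregation logic may differ; the field names come from the example update structure, and the second client's values are made up for illustration.

```python
from typing import Dict, List, Tuple


def aggregate_updates(updates: List[Dict]) -> Tuple[List[float], float]:
    """Return the sample-weighted mean weight delta and bias delta."""
    total_samples = sum(u["num_samples"] for u in updates)
    if total_samples <= 0:
        raise ValueError("updates must contain at least one sample")
    dim = len(updates[0]["weight_delta"])
    weight_delta = [0.0] * dim
    bias_delta = 0.0
    for u in updates:
        if len(u["weight_delta"]) != dim:
            raise ValueError("all weight_delta vectors must have equal length")
        # Each client contributes in proportion to its share of the samples.
        share = u["num_samples"] / total_samples
        for i, w in enumerate(u["weight_delta"]):
            weight_delta[i] += share * w
        bias_delta += share * u["bias_delta"]
    return weight_delta, bias_delta


updates = [
    {"client_id": "client_0", "round_id": 0,
     "weight_delta": [0.1, -0.2, 0.3, -0.1], "bias_delta": 0.05,
     "num_samples": 200},
    {"client_id": "client_1", "round_id": 0,  # illustrative values
     "weight_delta": [0.3, 0.0, 0.1, 0.1], "bias_delta": -0.05,
     "num_samples": 200},
]
weights, bias = aggregate_updates(updates)
print(weights, bias)
```

Because both clients hold the same number of samples here, the result is the plain average of the two updates; with unequal `num_samples` the larger client would dominate proportionally. A custom aggregation rule could replace the `share` computation while keeping the same input format.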
We welcome contributions! Please follow our development workflow:
- Fork the repository
- Create a feature branch
- Add comprehensive tests
- Ensure all tests pass
- Update documentation
- Submit a pull request
# Install development dependencies
pip install -r requirements.txt
# Run quality checks
python3 tests/run_tests.py --coverage
python3 -m flake8 common simulation visualize
python3 -m black common simulation visualize
python3 -m isort common simulation visualize

This project is licensed under the MIT License - see the LICENSE file for details.
- Research Community: FL algorithm implementations and research papers
- Open Source: Testing and development tools
- Academic Institutions: Supporting federated learning research
- Industry Partners: Real-world FL applications and use cases
For detailed author information and research background, see AUTHOR.md.
Ready to revolutionize your federated learning research?
Start with python3 simulation/client_simulation.py and explore the comprehensive testing framework with python3 tests/run_tests.py --coverage!