SecureShare demonstrates the implementation of federated learning for sentiment analysis. Federated learning is a machine learning technique that trains an algorithm across multiple decentralized edge devices or servers holding local data samples without exchanging them. This approach addresses critical issues in traditional centralized machine learning, such as data privacy, security, access rights, and access to heterogeneous data.
SecureShare showcases:
- Privacy preservation through local data processing
- Performance comparison between centralized and federated learning
- Communication efficiency in federated settings
- Model convergence across multiple clients
- Handling of data heterogeneity
By leveraging federated learning, SecureShare achieves comparable performance to centralized learning while maintaining data privacy and adapting to non-uniform data distributions across clients.
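The aggregation step behind this approach is federated averaging (FedAvg): clients train on their own data, and the server combines the resulting model parameters weighted by each client's data volume, so raw data never leaves a client. A minimal sketch of that step (the function name is illustrative, not part of SecureShare's API):

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """FedAvg aggregation: weighted mean of client parameter vectors.

    client_weights: one 1-D parameter array per client.
    client_sizes: number of local training samples per client.
    """
    coeffs = np.asarray(client_sizes, dtype=float) / sum(client_sizes)
    return coeffs @ np.stack(client_weights)  # weighted sum over clients

# Three hypothetical clients holding 10%, 20%, and 70% of the data
weights = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
global_w = federated_average(weights, [10, 20, 70])
print(global_w)  # [4.2 5.2]
```

Weighting by sample count keeps clients with more data from being drowned out by clients with very little.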
To set up SecureShare, follow these steps:
- Clone the repository:
git clone https://github.com/vishnux/SecureShare.git
- Navigate to the project directory:
cd SecureShare
- Install the required dependencies:
pip install -r requirements.txt
- Run the main script:
python secureshare.py
Key features of SecureShare:
- Federated Learning Implementation: Custom implementation of a federated learning algorithm using scikit-learn's LogisticRegression as the base model.
- Non-uniform Data Distribution: Simulates realistic scenarios where data is not uniformly distributed among clients.
- Dynamic Learning Rate: Implements learning rate decay to improve convergence.
- Performance Metrics: Tracks accuracy, F1 score, precision, and recall for comprehensive evaluation.
- Centralized vs. Federated Comparison: Compares the performance of federated learning against a centralized model.
- Visualizations: Generates insightful visualizations to demonstrate various aspects of federated learning.
SecureShare illustrates key aspects of federated learning:
- Privacy Preservation: Demonstrates how data is distributed among clients, showing that each client stores its data locally.
- Performance Comparison: Compares the accuracy of federated learning with centralized learning across different numbers of clients.
- Communication Efficiency: Shows how model accuracy improves over communication rounds for varying numbers of clients.
- Model Convergence: Illustrates how individual client models and the global model converge over training rounds.
- Data Heterogeneity: Visualizes the non-uniform distribution of data classes across clients.
These results highlight the effectiveness of SecureShare in maintaining privacy while achieving performance comparable to that of centralized learning.
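One common way to simulate the heterogeneous client data described above is a Dirichlet partition; this sketch (not SecureShare's actual partitioning code, and with illustrative labels and parameters) assigns a skewed class mix to each client:

```python
import numpy as np

rng = np.random.default_rng(42)
labels = rng.integers(0, 2, size=1000)  # stand-in binary sentiment labels

def skewed_partition(labels, n_clients, alpha=0.5):
    """Split sample indices across clients with non-uniform class mixes.

    For each class, the share going to each client is drawn from a
    Dirichlet distribution; smaller alpha means more skew (heterogeneity).
    """
    parts = [[] for _ in range(n_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        props = rng.dirichlet([alpha] * n_clients)        # client shares
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, chunk in enumerate(np.split(idx, cuts)):
            parts[client].extend(chunk.tolist())
    return parts

parts = skewed_partition(labels, n_clients=4)
for i, idx in enumerate(parts):
    counts = np.bincount(labels[np.array(idx, dtype=int)], minlength=2)
    print(f"client {i}: class counts {counts.tolist()}")
```

Plotting the per-client class counts produced here gives exactly the kind of heterogeneity visualization the project describes.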
Contributions to SecureShare are welcome! Please follow these steps:
- Fork the repository
- Create a new branch:
git checkout -b feature-branch-name
- Make your changes and commit them:
git commit -m 'Add some feature'
- Push to the branch:
git push origin feature-branch-name
- Submit a pull request
For major changes, please open an issue first to discuss what you'd like to change.
SecureShare is licensed under the Apache License.