Bringing Recommendations to the Edge

A one-stop solution to build your recommendation models, train them, and deploy them in a privacy-preserving manner, right on the users' devices.

EnvisEdge allows you to easily explore new federated learning algorithms and deploy them into production.

The steps to building an awesome recommendation system are:

🔩 Standard ML training: Pick up any ML model and benchmark it using standard settings.
🎮 Federated Learning Simulation: Once you are satisfied with your model, explore a host of FL algorithms with the simulator.
🏭 Industrial Deployment: After all the testing and simulation, deploy easily using the NimbleEdge suite.
🚀 Edge Computing: Leverage all the benefits of edge computing.
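
As a rough illustration of what an FL algorithm does during simulation, the sketch below implements federated averaging (FedAvg), one well-known algorithm in which a server averages client parameters weighted by how many local examples each client trained on. The function name and the flat-list parameter format are assumptions made for this example, and this is not necessarily how the strategies in fl_strategies are written.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client parameter vectors, weighted by local example counts."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one example count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        # Each client contributes in proportion to its share of the data.
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


if __name__ == "__main__":
    # Two hypothetical clients holding 1 and 3 local examples.
    print(federated_average([[0.0, 4.0], [4.0, 0.0]], [1, 3]))
```

Only the parameter vectors and example counts reach the server in this scheme, which is what lets the raw interaction data stay on the device.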

Repo Structure 🏢

NimbleEdge/EnvisEdge
├── CONTRIBUTING.md           <-- Please go through the contributing guidelines before starting 🤓
├── README.md                 <-- You are here 📌
├── docs                      <-- Tutorials and walkthroughs 🧐
├── experiments               <-- Recommendation models used by our services
├── fedrec                    <-- The whole magic takes place here 😜
│    ├── communications          <-- Modules for communication interfaces, e.g. Kafka
│    ├── multiprocessing         <-- Modules to run parallel worker jobs
│    ├── python_executors        <-- Contains worker modules, e.g. trainer and aggregator
│    ├── serialization           <-- Message serializers
│    └── utilities               <-- Helper modules
├── fl_strategies             <-- Federated learning algorithms for our services
└── notebooks                 <-- Jupyter Notebook examples

QuickStart

Let's train Facebook AI's DLRM on the edge. DLRM has been a standard baseline for neural-network-based recommendation models.

Clone this repo and change the datafile argument in configs/dlrm_fl.yml to the path of your copy of the Criteo dataset.

git clone https://github.com/NimbleEdge/EnvisEdge

model :
  name : 'dlrm'
  ...
  preproc :
    datafile : "<Path to Criteo>/criteo/train.txt"
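
Before starting any job, it can help to confirm that the configured path actually points at a file. The sketch below pulls the datafile value out of the config text with a regular expression so that it needs only the standard library. A real YAML parser would be the more robust choice, and the function name and sample path here are hypothetical.

```python
import os
import re
from typing import Optional

# Matches a line such as:  datafile : "<Path to Criteo>/criteo/train.txt"
_DATAFILE_RE = re.compile(
    r"""^[ \t]*datafile[ \t]*:[ \t]*["']?(.*?)["']?[ \t]*$""",
    re.MULTILINE,
)


def find_datafile(config_text: str) -> Optional[str]:
    """Return the value of the first `datafile` key, or None if absent."""
    match = _DATAFILE_RE.search(config_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical config fragment; in practice read configs/dlrm_fl.yml.
    sample = (
        "model :\n"
        "  name : 'dlrm'\n"
        "  preproc :\n"
        '    datafile : "/data/criteo/train.txt"\n'
    )
    path = find_datafile(sample)
    status = "exists" if path and os.path.isfile(path) else "is missing"
    print(path, status)
```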

Install the dependencies with conda or pip. The example below uses virtualenv and pip, and is run from the repository root:

mkdir env
cd env
virtualenv envisedge
source envisedge/bin/activate
cd ..
pip3 install -r requirements.txt

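Download Kafka from the official Apache Kafka downloads page 👈 and start ZooKeeper and the Kafka server with the following commands, each in its own terminal from the Kafka directory: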

bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties

Create Kafka topics for the job executor:

bin/kafka-topics.sh --create --topic job-request-aggregator --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
bin/kafka-topics.sh --create --topic job-request-trainer --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
bin/kafka-topics.sh --create --topic job-response-aggregator --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
bin/kafka-topics.sh --create --topic job-response-trainer --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
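
The four topics follow a job-<direction>-<role> naming pattern, so the commands above can also be generated instead of typed by hand. The sketch below only builds the argument lists. The helper names are hypothetical, and the broker address is the localhost:9092 default used above.

```python
from itertools import product
from typing import List

DIRECTIONS = ("request", "response")
ROLES = ("aggregator", "trainer")


def topic_names() -> List[str]:
    """List every job-<direction>-<role> topic used by the executor."""
    return [f"job-{d}-{r}" for d, r in product(DIRECTIONS, ROLES)]


def create_topic_command(
    topic: str, bootstrap_server: str = "localhost:9092"
) -> List[str]:
    """Build the kafka-topics.sh invocation for a single topic."""
    return [
        "bin/kafka-topics.sh", "--create",
        "--topic", topic,
        "--bootstrap-server", bootstrap_server,
        "--partitions", "1",
        "--replication-factor", "1",
    ]


if __name__ == "__main__":
    for name in topic_names():
        print(" ".join(create_topic_command(name)))
```

Each list can be passed to subprocess.run from the Kafka directory once the broker is up.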

To start the multiprocessing executor run the following command:

python executor.py --config configs/dlrm_fl.yml

Change the datafile path in configs/dlrm_fl.yml to your data path:

preproc :
    datafile : "<Your path to data>/criteo_dataset/train.txt"

Run data preprocessing with preprocess_data.py and supply the config file. This should generate a per-day split of the entire dataset as well as a processed data file:

python preprocess_data.py --config configs/dlrm_fl.yml --logdir $HOME/logs/kaggle_criteo/exp_1
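
To make the per-day split concrete, here is a minimal sketch. It assumes the rows carry no explicit day column and that a "day" is a contiguous, roughly equal-sized chunk of the file in its original order. The function name is hypothetical, and preprocess_data.py may split the data differently.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_per_day(rows: Sequence[T], num_days: int) -> List[List[T]]:
    """Split rows into num_days contiguous chunks of near-equal size."""
    if num_days <= 0:
        raise ValueError("num_days must be positive")
    base, extra = divmod(len(rows), num_days)
    splits: List[List[T]] = []
    start = 0
    for day in range(num_days):
        # The first `extra` days absorb one leftover row each.
        size = base + (1 if day < extra else 0)
        splits.append(list(rows[start:start + size]))
        start += size
    return splits


if __name__ == "__main__":
    # Ten placeholder rows split across three hypothetical days.
    for day, chunk in enumerate(split_per_day(list(range(10)), 3)):
        print(day, chunk)
```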

Begin training:

python train.py --config configs/dlrm_fl.yml --logdir $HOME/logs/kaggle_criteo/exp_3 --num_eval_batches 1000 --devices 0

Run tensorboard to view training loss and validation metrics at localhost:8888

tensorboard --logdir $HOME/logs/kaggle_criteo --port 8888

Contribute

Please go through our CONTRIBUTING guidelines before starting.
Star, fork, and clone the repo.
Do your work.
Push to your fork.
Submit a PR to NimbleEdge/EnvisEdge.

Join us on Discord with any questions about the library or about contributing in general.
