FleetX

Fully utilize your GPU clusters with FleetX for model pre-training.

What is it?

FleetX is an out-of-the-box toolkit for pre-training models, aimed at cloud users. It can be viewed as an extension package for paddle.distributed.fleet, Paddle's high-level distributed training API.
Chinese documentation | Quick start | Performance baselines

Key Features

Pre-defined Models for Training
- define a commonly used self-supervised model such as Bert-Large or GPT-2 with one line of code.
Friendly to User-defined Datasets
- plug in a user-defined dataset and start training with little effort.
Distributed Training Best Practices
- best-practice configurations for efficient distributed training are provided.

Installation

Install from PyPI

pip install fleet-x==0.0.7

Download the wheel package and install it

# python2.7
wget --no-check-certificate https://fleet.bj.bcebos.com/test/fleet_x-0.0.7-py2-none-any.whl
pip install fleet_x-0.0.7-py2-none-any.whl

# python3
wget --no-check-certificate https://fleet.bj.bcebos.com/test/fleet_x-0.0.7-py3-none-any.whl
pip3 install fleet_x-0.0.7-py3-none-any.whl
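
After either install path, it can be useful to confirm which version ended up in the environment. The following is a minimal sketch using only the standard library (Python 3.8+, so it does not apply to the Python 2.7 wheel). It assumes the distribution is registered under the name `fleet-x` used in the pip command above; `installed_version` is a hypothetical helper, not part of FleetX.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "fleet-x" is taken from the pip command above; treat it as an assumption.
    version = installed_version("fleet-x")
    if version is None:
        print("fleet-x is not installed in this environment")
    else:
        print(f"fleet-x {version} is installed")
```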

A Distributed Resnet50 Training Example

import paddle
import paddle.distributed.fleet as fleet
import fleetx as X

paddle.enable_static()  # needed on Paddle 2.0rc and later, where dynamic graph mode is the default

configs = X.parse_train_configs()
model = X.applications.Resnet50()

downloader = X.utils.Downloader()
imagenet_url = "https://fleet.bj.bcebos.com/small_datasets/yaml_example/imagenet.yaml"
local_path = downloader.download_from_bos(fs_yaml=imagenet_url, local_path='./data')
loader = model.get_train_dataloader(local_path, batch_size=32)

fleet.init(is_collective=True)
dist_strategy = fleet.DistributedStrategy()
dist_strategy.amp = True

optimizer = paddle.fluid.optimizer.SGD(learning_rate=configs.lr)
optimizer = fleet.distributed_optimizer(optimizer, strategy=dist_strategy)
optimizer.minimize(model.loss)

trainer = X.MultiGPUTrainer()
trainer.fit(model, loader, epoch=10)
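
The data loader in the example is created with batch_size=32. Assuming that value applies to each training process, with one process per GPU (the usual arrangement in collective data-parallel training, but not stated in this README), the number of samples consumed per optimizer step grows with the number of GPUs. A rough sketch of that arithmetic follows; `global_batch_size` is a hypothetical helper, not a FleetX function.

```python
def global_batch_size(per_gpu_batch_size: int, num_gpus: int) -> int:
    """Samples processed per step across all GPUs, assuming a per-GPU batch size."""
    if per_gpu_batch_size <= 0 or num_gpus <= 0:
        raise ValueError("batch size and GPU count must both be positive")
    return per_gpu_batch_size * num_gpus


# With batch_size=32 per process on 8 GPUs, each step would see 32 * 8 samples.
print(global_batch_size(32, 8))
```

If the assumption holds, hyperparameters tuned for a single GPU, such as the learning rate, may need revisiting when the GPU count changes.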

How to launch your task

Multiple GPU cards

fleetrun --gpus 0,1,2,3,4,5,6,7 resnet50_app.py
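
When the set of GPUs changes between runs, the command above can be assembled programmatically. This is a small sketch using only the standard library (Python 3.8+). It relies solely on the `--gpus` flag shown above; `build_fleetrun_command` is a hypothetical helper, not part of FleetX or Paddle.

```python
import shlex
from typing import List, Sequence


def build_fleetrun_command(gpu_ids: Sequence[int], script: str) -> List[str]:
    """Build the argument list for a fleetrun launch on the given GPU ids."""
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    if len(set(gpu_ids)) != len(gpu_ids):
        raise ValueError("GPU ids must be unique")
    gpus = ",".join(str(gpu_id) for gpu_id in gpu_ids)
    return ["fleetrun", "--gpus", gpus, script]


if __name__ == "__main__":
    command = build_fleetrun_command(range(8), "resnet50_app.py")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(command))
```

The returned list can be passed directly to `subprocess.run(command, check=True)`, which avoids shell quoting issues.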

Citation

Please cite paddle.distributed.fleet or FleetX in your publications if it helps your research:

@electronic{fleet2020,
 title = {paddle.distributed.fleet: A Highly Scalable Distributed Training Engine of PaddlePaddle},
 url = {https://github.com/PaddlePaddle/FleetX},
}

Community

Slack

To connect with other users and contributors, you are welcome to join our Slack channel.

Contribution

If you want to contribute code to FleetX, please refer to the Contribution Guidelines.

Feedback

For any feedback or to report a bug, please open a GitHub Issue.

License

Apache 2.0 License

