
Enhance device context pool #9293

Merged 1 commit into PaddlePaddle:develop on Mar 22, 2018

Conversation

@reyoung (Collaborator) commented on Mar 21, 2018:

No description provided.

@reyoung requested a review from @dzhwinter on Mar 21, 2018 at 08:31.
namespace paddle {
namespace platform {

DeviceContextPool* DeviceContextPool::pool = nullptr;

-const platform::DeviceContext* DeviceContextPool::Get(
-    const platform::Place& place) {
+platform::DeviceContext* DeviceContextPool::Get(const platform::Place& place) {
A reviewer (Contributor) commented:
platform::DeviceContext* DeviceContextPool::Get(const platform::Place& place) const
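
As context for this suggestion, here is a minimal self-contained sketch of the pattern being proposed: a const accessor that still hands back a mutable context pointer. ToyPool, DummyContext, and the std::map member are stand-ins invented for illustration, not Paddle's actual classes.

```cpp
#include <map>
#include <memory>

struct DummyContext {
  void Wait() {}  // non-const work a caller may need to drive
};

class ToyPool {
 public:
  ToyPool() { contexts_[0] = std::make_unique<DummyContext>(); }

  // const member function: the lookup itself does not mutate the pool,
  // yet the returned pointer is non-const so callers can use the context.
  DummyContext* Get(int place) const {
    auto it = contexts_.find(place);
    return it == contexts_.end() ? nullptr : it->second.get();
  }

 private:
  std::map<int, std::unique_ptr<DummyContext>> contexts_;
};

int main() {
  ToyPool pool;
  if (auto* ctx = pool.Get(0)) ctx->Wait();
  return 0;
}
```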

@@ -65,6 +65,18 @@ bool is_cpu_place(const Place &);
bool places_are_same_class(const Place &, const Place &);
bool is_same_place(const Place &, const Place &);

struct PlaceHash {
  std::size_t operator()(const Place &p) const {
    constexpr size_t num_dev_bits = 4;
A reviewer (Contributor) commented:

4 bits are not enough: the GPU box product has 32 cards in one node, which will lead to an overlap in dev_id << num_dev_bits | p.which().
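
To make the packing scheme concrete, here is a standalone sketch of the key computation from the hunk above; PackPlaceKey and the sample values are illustrative only, not Paddle code.

```cpp
#include <cstddef>
#include <cstdio>

// Packs a device id and the place variant tag into a single hash key,
// mirroring dev_id << num_dev_bits | p.which() from the diff above.
std::size_t PackPlaceKey(std::size_t dev_id, std::size_t which) {
  constexpr std::size_t num_dev_bits = 4;
  // which occupies the low num_dev_bits bits, dev_id the bits above them;
  // keys can only collide if which ever needs more than num_dev_bits bits,
  // so num_dev_bits bounds the variant tag, not the number of devices.
  return (dev_id << num_dev_bits) | which;
}

int main() {
  // A node with 32 GPUs: device ids 0 and 31 with the same variant tag.
  std::printf("%zu %zu\n", PackPlaceKey(0, 1), PackPlaceKey(31, 1));
  return 0;
}
```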

@@ -159,7 +160,7 @@ class DeviceContextPool {
}

/*! \brief Return handle of single device context. */
-const platform::DeviceContext* Get(const platform::Place& place);
+platform::DeviceContext* Get(const platform::Place& place);
A reviewer (Contributor) commented:

Should a const suffix be added here (i.e., make Get a const member function)?

@@ -159,7 +160,7 @@ class DeviceContextPool {
}

/*! \brief Return handle of single device context. */
-const platform::DeviceContext* Get(const platform::Place& place);
+platform::DeviceContext* Get(const platform::Place& place);

template <typename Place>
const typename DefaultDeviceContextType<Place>::TYPE* GetByPlace(
A reviewer (Contributor) commented:

Remove the const prefix here as well, so that GetByPlace also returns a mutable pointer.
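
If that suggestion were applied, the leading const would be dropped from the templated accessor too. A hedged sketch of what that might look like follows; the body is an assumption, since the hunk above truncates it.

```cpp
// Sketch only: GetByPlace returning a mutable typed context, mirroring the
// Get() change above. The static_cast-based body is an assumption; the real
// implementation is not shown in this hunk.
template <typename Place>
typename DefaultDeviceContextType<Place>::TYPE* GetByPlace(const Place& place) {
  return static_cast<typename DefaultDeviceContextType<Place>::TYPE*>(
      Get(place));
}
```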

@dzhwinter (Contributor) left a review:

LGTM.

@dzhwinter dzhwinter merged commit 1d8fe2a into PaddlePaddle:develop Mar 22, 2018
mikeseven added a commit to mikeseven/Paddle that referenced this pull request Mar 22, 2018
* commit '9c35b0dc1ba0ace5acf721685802a21045ea1249': (36 commits)
  Fix dist compile error (PaddlePaddle#9320)
  Fix bug for backward tanspiler when using parallel_do operator. (PaddlePaddle#9282)
  update
  fix transpiler bug
  Update index_en.rst (PaddlePaddle#9286)
  "fix mixed_vector bug" (PaddlePaddle#9319)
  Update index_en.rst (PaddlePaddle#9280)
  Adjust some contents in write_docs_en.rst for Contribue Documentation (PaddlePaddle#9147)
  CMake refine for HIP support. Fix CI.
  Reuduce memory copy when communication between trainer and pserver. (PaddlePaddle#9271)
  Modified build.sh and remove build_doc.sh
  fix doc
  Enhance device context pool (PaddlePaddle#9293)
  Device blobs are created only in training. Added testing attribute
  Shrink batch_norm_grad's inputs
  updates
  prepare and create op before run
  wip
  small fix
  initial commit
  ...

# Conflicts:
#	cmake/external/eigen.cmake