Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix kill fail bug #11635

Merged
merged 6 commits into from
Jun 22, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,8 @@ RUN easy_install -U pip && \
pip install sphinx-rtd-theme==0.1.9 recommonmark

RUN pip install pre-commit 'ipython==5.3.0' && \
pip install 'ipykernel==4.6.0' 'jupyter==1.0.0'
pip install 'ipykernel==4.6.0' 'jupyter==1.0.0' && \
pip install opencv-python
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pip install opencv-python,和apt-get install 并不冲突, @velconia 已经测过,可以正常加载flowers数据集。

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为什么pip install opencv-python会影响进程terminate呢?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

可以参考这个issue的回答


#For docstring checker
RUN pip install pylint pytest astroid isort
Expand Down
18 changes: 9 additions & 9 deletions paddle/scripts/paddle_build.sh
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@
function print_usage() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

paddle_build.sh的gen_dockerfile函数里面需要pip install 么?不然latest等镜像没有这个包。

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gen_dockerfile 里添加了 pip install /*.whl, 我理解会把 paddle python 依赖的所有包都加上, opencv-python也是其中之一

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@typhoonzero could you please take a look at this?

Copy link
Collaborator Author

@velconia velconia Jun 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whl installs packages here, in Paddle/python/setup.py.in

if '${CMAKE_SYSTEM_PROCESSOR}' not in ['arm', 'armv7-a', 'aarch64']:
    setup_requires+=['opencv-python']

echo -e "\n${RED}Usage${NONE}:
${BOLD}${SCRIPT_NAME}${NONE} [OPTION]"

echo -e "\n${RED}Options${NONE}:
${BLUE}build${NONE}: run build for x86 platform
${BLUE}build_android${NONE}: run build for android platform
Expand Down Expand Up @@ -198,7 +198,7 @@ function build_android() {
fi

ANDROID_STANDALONE_TOOLCHAIN=$ANDROID_TOOLCHAINS_DIR/$ANDROID_ARCH-android-$ANDROID_API

cat <<EOF
============================================
Generating the standalone toolchain ...
Expand All @@ -212,13 +212,13 @@ EOF
--arch=$ANDROID_ARCH \
--platform=android-$ANDROID_API \
--install-dir=$ANDROID_STANDALONE_TOOLCHAIN

BUILD_ROOT=${PADDLE_ROOT}/build_android
DEST_ROOT=${PADDLE_ROOT}/install_android

mkdir -p $BUILD_ROOT
cd $BUILD_ROOT

if [ $ANDROID_ABI == "armeabi-v7a" ]; then
cmake -DCMAKE_SYSTEM_NAME=Android \
-DANDROID_STANDALONE_TOOLCHAIN=$ANDROID_STANDALONE_TOOLCHAIN \
Expand Down Expand Up @@ -286,7 +286,7 @@ function build_ios() {
-DWITH_TESTING=OFF \
-DWITH_SWIG_PY=OFF \
-DCMAKE_BUILD_TYPE=Release

make -j 2
}

Expand Down Expand Up @@ -331,14 +331,14 @@ EOF
function bind_test() {
# the number of process to run tests
NUM_PROC=6

# calculate and set the memory usage for each process
MEM_USAGE=$(printf "%.2f" `echo "scale=5; 1.0 / $NUM_PROC" | bc`)
export FLAGS_fraction_of_gpu_memory_to_use=$MEM_USAGE

# get the CUDA device count
CUDA_DEVICE_COUNT=$(nvidia-smi -L | wc -l)

for (( i = 0; i < $NUM_PROC; i++ )); do
cuda_list=()
for (( j = 0; j < $CUDA_DEVICE_COUNT; j++ )); do
Expand Down
5 changes: 2 additions & 3 deletions python/paddle/fluid/tests/unittests/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ if(NOT WITH_DISTRIBUTE)
endif(NOT WITH_DISTRIBUTE)

list(REMOVE_ITEM TEST_OPS test_seq_concat_op) # FIXME(helin): https://github.com/PaddlePaddle/Paddle/issues/8290
list(REMOVE_ITEM TEST_OPS test_modified_huber_loss_op) # FIXME(qijun) https://github.com/PaddlePaddle/Paddle/issues/5184
list(REMOVE_ITEM TEST_OPS test_modified_huber_loss_op) # FIXME(qijun) https://github.com/PaddlePaddle/Paddle/issues/5184
list(REMOVE_ITEM TEST_OPS test_lstm_unit_op) # # FIXME(qijun) https://github.com/PaddlePaddle/Paddle/issues/5185
list(REMOVE_ITEM TEST_OPS test_nce) # FIXME(qijun) https://github.com/PaddlePaddle/Paddle/issues/7778
list(REMOVE_ITEM TEST_OPS test_recurrent_op) # FIXME(qijun) https://github.com/PaddlePaddle/Paddle/issues/6152
Expand Down Expand Up @@ -43,12 +43,11 @@ list(REMOVE_ITEM TEST_OPS test_warpctc_op)
list(REMOVE_ITEM TEST_OPS test_dist_train)
list(REMOVE_ITEM TEST_OPS test_parallel_executor_crf)
list(REMOVE_ITEM TEST_OPS test_parallel_executor_fetch_feed)
# TODO(wuyi): this test hungs on CI, will add it back later
list(REMOVE_ITEM TEST_OPS test_listen_and_serv_op)
foreach(TEST_OP ${TEST_OPS})
py_test_modules(${TEST_OP} MODULES ${TEST_OP})
endforeach(TEST_OP)
py_test_modules(test_warpctc_op MODULES test_warpctc_op ENVS FLAGS_warpctc_dir=${WARPCTC_LIB_DIR} SERIAL)
py_test_modules(test_dist_train MODULES test_dist_train SERIAL)
py_test_modules(test_parallel_executor_crf MODULES test_parallel_executor_crf SERIAL)
py_test_modules(test_parallel_executor_fetch_feed MODULES test_parallel_executor_fetch_feed SERIAL)
set_tests_properties(test_listen_and_serv_op PROPERTIES TIMEOUT 20)
Original file line number Diff line number Diff line change
Expand Up @@ -94,15 +94,15 @@ def test_handle_signal_in_serv_op(self):
self._wait_ps_ready(p1.pid)

# raise SIGTERM to pserver
os.kill(p1.pid, signal.SIGKILL)
os.kill(p1.pid, signal.SIGINT)
p1.join()

# run pserver on CPU in async mode
p2 = self._start_pserver(False, False)
self._wait_ps_ready(p2.pid)

# raise SIGTERM to pserver
os.kill(p2.pid, signal.SIGKILL)
os.kill(p2.pid, signal.SIGTERM)
p2.join()


Expand Down