remove ProtoData #5775

Merged (2 commits) on Nov 21, 2017

@luotao1 luotao1 commented Nov 20, 2017

Fixes #5769

luotao1 commented Nov 20, 2017

Commit 76501c8 removed ProtoData from config_parser.py but not from Chunk.conf. Even so, the unit test in test_Trainer.cpp still passes. The reason is the way the test is registered:

add_test(NAME test_Trainer
  COMMAND ${PADDLE_SOURCE_DIR}/paddle/.set_python_path.sh -d ${PADDLE_SOURCE_DIR}/python/
        ${PYTHON_EXECUTABLE} ${PADDLE_SOURCE_DIR}/paddle/trainer/tests/gen_proto_data.py &&
        ${PADDLE_SOURCE_DIR}/paddle/.set_python_path.sh -d ${PADDLE_SOURCE_DIR}/python/
        ${CMAKE_CURRENT_BINARY_DIR}/test_Trainer
    WORKING_DIRECTORY ${PADDLE_SOURCE_DIR}/paddle/)

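As a result, the command returns as soon as gen_proto_data.py finishes successfully, and the test_Trainer binary that follows is never run. The trace shows the `&&` being passed to gen_proto_data.py as a literal argument rather than being interpreted as a shell operator. The output is as follows: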

72: Test command: /home/luotao02/Paddle/paddle/.set_python_path.sh "-d" "/home/luotao02/Paddle/python/" "/home/luotao02/.jumbo/bin/python2.7" "/home/luotao02/Paddle/paddle/trainer/tests/gen_proto_data.py" "&&" "/home/luotao02/Paddle/paddle/.set_python_path.sh" "-d" "/home/luotao02/Paddle/python/" "/home/luotao02/Paddle/build/paddle/trainer/tests/test_Trainer"
72: Test timeout computed to be: 9.99988e+06
72: + getopts d: opt
72: + case $opt in
72: + PYPATH=/home/luotao02/Paddle/python/
72: + getopts d: opt
72: + shift 2
72: + export PYTHONPATH=/home/luotao02/Paddle/python/:
72: + PYTHONPATH=/home/luotao02/Paddle/python/:
72: + /home/luotao02/.jumbo/bin/python2.7 /home/luotao02/Paddle/paddle/trainer/tests/gen_proto_data.py '&&' /home/luotao02/Paddle/paddle/.set_python_path.sh -d /home/luotao02/Paddle/python/ /home/luotao02/Paddle/build/paddle/trainer/tests/test_Trainer
72: [INFO 2017-11-20 16:15:06,775 gen_proto_data.py:138] column 0 dict size=316, ignored 1277
72: [INFO 2017-11-20 16:15:06,775 gen_proto_data.py:138] column 1 dict size=40, ignored 0
72: [INFO 2017-11-20 16:15:06,775 gen_proto_data.py:138] column 2 dict size=17, ignored 0
72: [INFO 2017-11-20 16:15:06,776 gen_proto_data.py:138] column 3 dict size=307, ignored 1199
72: [INFO 2017-11-20 16:15:06,776 gen_proto_data.py:138] column 4 dict size=317, ignored 1275
72: [INFO 2017-11-20 16:15:06,777 gen_proto_data.py:138] column 5 dict size=316, ignored 1277
72: [INFO 2017-11-20 16:15:06,777 gen_proto_data.py:138] column 6 dict size=300, ignored 1209
72: [INFO 2017-11-20 16:15:06,778 gen_proto_data.py:138] column 7 dict size=283, ignored 1180
72: [INFO 2017-11-20 16:15:06,779 gen_proto_data.py:138] column 8 dict size=154, ignored 3743
72: [INFO 2017-11-20 16:15:06,781 gen_proto_data.py:138] column 9 dict size=145, ignored 3645
72: [INFO 2017-11-20 16:15:06,781 gen_proto_data.py:138] column 10 dict size=40, ignored 2
72: [INFO 2017-11-20 16:15:06,781 gen_proto_data.py:138] column 11 dict size=39, ignored 2
72: [INFO 2017-11-20 16:15:06,781 gen_proto_data.py:138] column 12 dict size=38, ignored 2
72: [INFO 2017-11-20 16:15:06,781 gen_proto_data.py:138] column 13 dict size=39, ignored 2
72: [INFO 2017-11-20 16:15:06,781 gen_proto_data.py:138] column 14 dict size=40, ignored 2
72: [INFO 2017-11-20 16:15:06,782 gen_proto_data.py:138] column 15 dict size=255, ignored 213
72: [INFO 2017-11-20 16:15:06,782 gen_proto_data.py:138] column 16 dict size=263, ignored 215
72: [INFO 2017-11-20 16:15:06,782 gen_proto_data.py:138] column 17 dict size=254, ignored 209
72: [INFO 2017-11-20 16:15:06,782 gen_proto_data.py:138] column 18 dict size=248, ignored 207
72: [INFO 2017-11-20 16:15:06,783 gen_proto_data.py:138] column 19 dict size=440, ignored 1232
72: [INFO 2017-11-20 16:15:06,784 gen_proto_data.py:138] column 20 dict size=440, ignored 1230
72: [INFO 2017-11-20 16:15:06,784 gen_proto_data.py:138] column 21 dict size=421, ignored 1179
72: [INFO 2017-11-20 16:15:06,785 gen_proto_data.py:215] feature_dim=4339
72: [INFO 2017-11-20 16:15:07,835 gen_proto_data.py:240] num_sequences=208
72: [INFO 2017-11-20 16:15:07,835 gen_proto_data.py:215] feature_dim=4339
72: [INFO 2017-11-20 16:15:08,043 gen_proto_data.py:240] num_sequences=35
1/2 Test #72: test_Trainer .....................   Passed    1.78 sec
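
To illustrate the behaviour in the trace above, here is a minimal Python sketch, not part of this PR, of what happens when a command line is executed as a plain argument vector instead of through a shell. The child script, `second_command`, and the helper names `run_without_shell` / `run_with_shell` are hypothetical stand-ins (the child plays the role of gen_proto_data.py), and the shell variant assumes a POSIX `sh`.

```python
import shlex
import subprocess
import sys

# Hypothetical stand-in for gen_proto_data.py: it only reports the
# command-line arguments it received.
CHILD = "import sys; print(sys.argv[1:])"


def run_without_shell():
    """Run the child from an argument vector; no shell parses the command."""
    argv = [sys.executable, "-c", CHILD, "&&", "second_command"]
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def run_with_shell():
    """Run two commands through a shell, where && is an operator."""
    py = shlex.quote(sys.executable)
    first = shlex.quote("print('first')")
    second = shlex.quote("print('second')")
    cmd = f"{py} -c {first} && {py} -c {second}"
    result = subprocess.run(
        cmd, shell=True, capture_output=True, text=True, check=True
    )
    return result.stdout.split()


if __name__ == "__main__":
    # The child sees '&&' and 'second_command' as ordinary arguments,
    # so nothing after '&&' is ever executed.
    print(run_without_shell())
    # Through a shell, both commands run in sequence.
    print(run_with_shell())
```

In the first call only one process runs and it exits successfully, which matches the "Passed" result above even though test_Trainer itself was never started.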

luotao1 commented Nov 20, 2017

Commit e131e96 removes chunk.conf and the corresponding gen_proto_data.py, for two reasons:

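  • It lets the test_Trainer unit test actually run.
  • ProtoDataProvider has not been used by anyone for a long time, and rewriting the logic of gen_proto_data.py with PyDataProvider would be time-consuming. In addition, the test_Trainer unit test compares four configs in total, so dropping one of them does not significantly affect the overall test logic.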

@wangkuiyi wangkuiyi left a comment

Love to see the removal of big chunks of data files!

@luotao1 luotao1 merged commit 3c4d406 into PaddlePaddle:develop Nov 21, 2017
@luotao1 luotao1 deleted the ProtoData branch November 21, 2017 04:04
Successfully merging this pull request may close these issues.

remove ProtoData in config_parser.py