
Add FP16 option in save_combine_op #10471

Merged: 5 commits into PaddlePaddle:develop on May 10, 2018

Conversation

sidgoyal78 (Contributor):

Issue: it fails the check since the 3rd tensor has size > 2050.

sidgoyal78 requested a review from kexinzhao on May 8, 2018 at 00:42.
std::vector<int> lod1 = {0, 1, 2, 3};
int numel1 = 100;
paddle::framework::LoD expect_lod1;
float* expect1 = CreateForSaveCombineOp<float>(10, 10, lod1, "test_var1",
kexinzhao (Contributor):


Not sure if it is related to the error, but the last number in your LoD vector should match the first dimension of your tensor, which is 10 here.
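For illustration only (this is not necessarily the fix that was committed), a hypothetical LoD that satisfies this rule for the [10, 10] tensor above would have 10 as its last offset:

// Hypothetical example: the last LoD offset equals the tensor's first
// dimension (10), so the three sequences together cover all 10 rows.
std::vector<int> lod1 = {0, 2, 5, 10};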

sidgoyal78 (Contributor, Author):


Fixed (though it is not related to the error).

std::vector<int> lod3 = {0, 1, 2, 3};
int numel3 = 2050;
paddle::framework::LoD expect_lod3;
float* expect3 = CreateForSaveCombineOp<float>(1, 2050, lod3, "test_var3",
kexinzhao (Contributor):


Your lod3 = {0, 1, 2, 3}, but the first dimension of your tensor is 1. Not sure if this is right.
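Again for illustration only (the committed fix may differ), one hypothetical way to make this pair consistent is to shrink the LoD to a single sequence covering the one row of the [1, 2050] tensor:

// Hypothetical example: a single sequence whose last offset (1) matches the
// tensor's first dimension.
std::vector<int> lod3 = {0, 1};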

sidgoyal78 (Contributor, Author):


Fixed.

sidgoyal78 (Contributor, Author) left a comment:


Thanks @kexinzhao. I found the reason for the failure: it seems that converting int -> float16 -> float results in loss of information.

From 2048 onward, the values are:
i, float(float16(i))
2048, 2048
2049, 2048
2050, 2050
2051, 2050
2052, 2052
2053, 2052
2054, 2054
2055, 2054
2056, 2056
2057, 2056
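This is expected from the binary16 format: with a 10-bit mantissa (plus the implicit leading bit), every integer up to 2048 is exactly representable, but between 2048 and 4096 only even integers are. The following self-contained sketch reproduces the pattern by truncating the low 13 mantissa bits of a float; Paddle's actual float16 conversion may round rather than truncate, but the collapse onto even integers is the same, which presumably is why the equality check on the 2050-element tensor fails once its values exceed 2048.

#include <cstdint>
#include <cstdio>
#include <cstring>

// Drop the low 13 mantissa bits of a float, keeping the 10 bits that a
// binary16 mantissa stores. This is a truncating stand-in for a
// float -> float16 -> float round trip, not Paddle's actual conversion code.
static float TruncateToFloat16Precision(float in) {
  uint32_t bits;
  std::memcpy(&bits, &in, sizeof(bits));
  bits &= ~0x1FFFu;  // keep sign, exponent, and the top 10 mantissa bits
  float out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}

int main() {
  // Prints the same "i, float(float16(i))" pairs as listed above.
  for (int i = 2048; i <= 2057; ++i) {
    std::printf("%d, %.0f\n", i, TruncateToFloat16Precision(static_cast<float>(i)));
  }
  return 0;
}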

kexinzhao (Contributor) left a comment:


LGTM!

sidgoyal78 merged commit 283c4db into PaddlePaddle:develop on May 10, 2018.