vcmpps is unnecessary #452

Merged
merged 3 commits into intel:master on Apr 22, 2019

@herumi
Contributor

commented Apr 9, 2019

vcvtps2dq(vmm_aux1 | h->T_rd_sae, vmm_src); already rounds down (toward negative infinity).
Then,

  • vcmpps with _cmp_nle_us always returns False
  • vmm_aux3 is always set to zero
  • vsubps does not change vmm_aux1

So these three instructions are not necessary.
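
The reasoning above can be checked with a scalar model. This is only a sketch, not the JIT code from the patch: cvt_round_down, correction_mask, and mask_ever_set are hypothetical helper names, std::floor stands in for the round-down conversion, and the model assumes finite inputs that fit in a 32-bit integer.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <initializer_list>

// Hypothetical scalar stand-in for vcvtps2dq with T_rd_sae: convert to int32,
// rounding toward negative infinity. Assumes x is finite and fits in int32.
inline std::int32_t cvt_round_down(float x) {
    return static_cast<std::int32_t>(std::floor(x));
}

// Hypothetical scalar stand-in for the vcmpps(_cmp_nle_us) mask: true when the
// converted value, turned back into a float, is NOT less-or-equal to the source.
inline bool correction_mask(float x) {
    const float back = static_cast<float>(cvt_round_down(x));
    return !(back <= x);
}

// Returns true if any sample would trigger the "subtract one" correction.
inline bool mask_ever_set() {
    for (float x : {-3.5f, -1.0f, -0.25f, 0.0f, 0.75f, 2.0f, 88.9f}) {
        if (correction_mask(x)) return true;
    }
    return false;
}

// Small self-check: the mask never fires, so the subtraction is a no-op.
inline void self_check() {
    assert(!mask_ever_set());
    assert(cvt_round_down(-0.25f) == -1); // rounds down, not toward zero
}
```

Because the converted value is never greater than the source for these inputs, the mask stays clear and subtracting the masked value leaves the result unchanged.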

@rsdubtso rsdubtso self-assigned this Apr 19, 2019

@rsdubtso

Contributor

commented Apr 19, 2019

Hi @herumi, thanks for spotting this! I'll merge next week / over the weekend.

@rsdubtso

Contributor

commented Apr 19, 2019

I've pushed a fix that I think is better than the original sequence, which could produce incorrect results for values that do not fit into a 32-bit integer. I will merge this (with a squash) if it looks fine to you.

@herumi

Contributor Author

commented Apr 20, 2019

Thank you for checking and improving my patch. It's good!

@rsdubtso rsdubtso merged commit 27b7f13 into intel:master Apr 22, 2019

rsdubtso added a commit that referenced this pull request Apr 22, 2019

mkl-dnn pushed a commit that referenced this pull request Apr 24, 2019
