SVML functions #40

Closed
nemequ opened this issue Apr 1, 2019 · 3 comments
Labels
instruction-set-support Implementing new SIMD ISA extensions portably

Comments

@nemequ
Member

nemequ commented Apr 1, 2019

I've been hesitating on this one because I expect many of them to be a bit painful to implement, but SVML support would be a pretty nice addition to SIMDe:

  • _mm256_acos_pd
  • _mm256_acos_ps
  • _mm256_acosh_pd
  • _mm256_acosh_ps
  • _mm256_asin_pd
  • _mm256_asin_ps
  • _mm256_asinh_pd
  • _mm256_asinh_ps
  • _mm256_atan_pd
  • _mm256_atan_ps
  • _mm256_atan2_pd
  • _mm256_atan2_ps
  • _mm256_atanh_pd
  • _mm256_atanh_ps
  • _mm256_cbrt_pd
  • _mm256_cbrt_ps
  • _mm256_cdfnorm_pd
  • _mm256_cdfnorm_ps
  • _mm256_cdfnorminv_pd
  • _mm256_cdfnorminv_ps
  • _mm256_cexp_ps
  • _mm256_clog_ps
  • _mm256_cos_pd
  • _mm256_cos_ps
  • _mm256_cosd_pd
  • _mm256_cosd_ps
  • _mm256_cosh_pd
  • _mm256_cosh_ps
  • _mm256_csqrt_ps
  • _mm256_div_epi8
  • _mm256_div_epi16
  • _mm256_div_epi32
  • _mm256_div_epi64
  • _mm256_div_epu8
  • _mm256_div_epu16
  • _mm256_div_epu32
  • _mm256_div_epu64
  • _mm256_erf_pd
  • _mm256_erf_ps
  • _mm256_erfc_pd
  • _mm256_erfc_ps
  • _mm256_erfcinv_pd
  • _mm256_erfcinv_ps
  • _mm256_erfinv_pd
  • _mm256_erfinv_ps
  • _mm256_exp_pd
  • _mm256_exp_ps
  • _mm256_exp10_pd
  • _mm256_exp10_ps
  • _mm256_exp2_pd
  • _mm256_exp2_ps
  • _mm256_expm1_pd
  • _mm256_expm1_ps
  • _mm256_hypot_pd
  • _mm256_hypot_ps
  • _mm256_idiv_epi32
  • _mm256_idivrem_epi32
  • _mm256_invcbrt_pd
  • _mm256_invcbrt_ps
  • _mm256_invsqrt_pd
  • _mm256_invsqrt_ps
  • _mm256_irem_epi32
  • _mm256_log_pd
  • _mm256_log_ps
  • _mm256_log10_pd
  • _mm256_log10_ps
  • _mm256_log1p_pd
  • _mm256_log1p_ps
  • _mm256_log2_pd
  • _mm256_log2_ps
  • _mm256_logb_pd
  • _mm256_logb_ps
  • _mm256_pow_pd
  • _mm256_pow_ps
  • _mm256_rem_epi8
  • _mm256_rem_epi16
  • _mm256_rem_epi32
  • _mm256_rem_epi64
  • _mm256_rem_epu8
  • _mm256_rem_epu16
  • _mm256_rem_epu32
  • _mm256_rem_epu64
  • _mm256_sin_pd
  • _mm256_sin_ps
  • _mm256_sincos_pd
  • _mm256_sincos_ps
  • _mm256_sind_pd
  • _mm256_sind_ps
  • _mm256_sinh_pd
  • _mm256_sinh_ps
  • _mm256_svml_ceil_pd
  • _mm256_svml_ceil_ps
  • _mm256_svml_floor_pd
  • _mm256_svml_floor_ps
  • _mm256_svml_round_pd
  • _mm256_svml_round_ps
  • _mm256_svml_sqrt_pd
  • _mm256_svml_sqrt_ps
  • _mm256_tan_pd
  • _mm256_tan_ps
  • _mm256_tand_pd
  • _mm256_tand_ps
  • _mm256_tanh_pd
  • _mm256_tanh_ps
  • _mm256_trunc_pd
  • _mm256_trunc_ps
  • _mm256_udiv_epi32
  • _mm256_udivrem_epi32
  • _mm256_urem_epi32
  • _mm512_acos_pd
  • _mm512_mask_acos_pd
  • _mm512_acos_ps
  • _mm512_mask_acos_ps
  • _mm512_acosh_pd
  • _mm512_mask_acosh_pd
  • _mm512_acosh_ps
  • _mm512_mask_acosh_ps
  • _mm512_asin_pd
  • _mm512_mask_asin_pd
  • _mm512_asin_ps
  • _mm512_mask_asin_ps
  • _mm512_asinh_pd
  • _mm512_mask_asinh_pd
  • _mm512_asinh_ps
  • _mm512_mask_asinh_ps
  • _mm512_atan2_pd
  • _mm512_mask_atan2_pd
  • _mm512_atan2_ps
  • _mm512_mask_atan2_ps
  • _mm512_atan_pd
  • _mm512_mask_atan_pd
  • _mm512_atan_ps
  • _mm512_mask_atan_ps
  • _mm512_atanh_pd
  • _mm512_mask_atanh_pd
  • _mm512_atanh_ps
  • _mm512_mask_atanh_ps
  • _mm512_cbrt_pd
  • _mm512_mask_cbrt_pd
  • _mm512_cbrt_ps
  • _mm512_mask_cbrt_ps
  • _mm512_cdfnorm_pd
  • _mm512_mask_cdfnorm_pd
  • _mm512_cdfnorm_ps
  • _mm512_mask_cdfnorm_ps
  • _mm512_cdfnorminv_pd
  • _mm512_mask_cdfnorminv_pd
  • _mm512_cdfnorminv_ps
  • _mm512_mask_cdfnorminv_ps
  • _mm512_ceil_pd
  • _mm512_mask_ceil_pd
  • _mm512_ceil_ps
  • _mm512_mask_ceil_ps
  • _mm512_cos_pd
  • _mm512_mask_cos_pd
  • _mm512_cos_ps
  • _mm512_mask_cos_ps
  • _mm512_cosd_pd
  • _mm512_mask_cosd_pd
  • _mm512_cosd_ps
  • _mm512_mask_cosd_ps
  • _mm512_cosh_pd
  • _mm512_mask_cosh_pd
  • _mm512_cosh_ps
  • _mm512_mask_cosh_ps
  • _mm512_erf_pd
  • _mm512_mask_erf_pd
  • _mm512_erfc_pd
  • _mm512_mask_erfc_pd
  • _mm512_erf_ps
  • _mm512_mask_erf_ps
  • _mm512_erfc_ps
  • _mm512_mask_erfc_ps
  • _mm512_erfinv_pd
  • _mm512_mask_erfinv_pd
  • _mm512_erfinv_ps
  • _mm512_mask_erfinv_ps
  • _mm512_erfcinv_pd
  • _mm512_mask_erfcinv_pd
  • _mm512_erfcinv_ps
  • _mm512_mask_erfcinv_ps
  • _mm512_exp10_pd
  • _mm512_mask_exp10_pd
  • _mm512_exp10_ps
  • _mm512_mask_exp10_ps
  • _mm512_exp2_pd
  • _mm512_mask_exp2_pd
  • _mm512_exp2_ps
  • _mm512_mask_exp2_ps
  • _mm512_exp_pd
  • _mm512_mask_exp_pd
  • _mm512_exp_ps
  • _mm512_mask_exp_ps
  • _mm512_expm1_pd
  • _mm512_mask_expm1_pd
  • _mm512_expm1_ps
  • _mm512_mask_expm1_ps
  • _mm512_floor_pd
  • _mm512_mask_floor_pd
  • _mm512_floor_ps
  • _mm512_mask_floor_ps
  • _mm512_hypot_pd
  • _mm512_mask_hypot_pd
  • _mm512_hypot_ps
  • _mm512_mask_hypot_ps
  • _mm512_div_epi32
  • _mm512_mask_div_epi32
  • _mm512_div_epi8
  • _mm512_div_epi16
  • _mm512_div_epi64
  • _mm512_invsqrt_pd
  • _mm512_mask_invsqrt_pd
  • _mm512_invsqrt_ps
  • _mm512_mask_invsqrt_ps
  • _mm512_rem_epi32
  • _mm512_mask_rem_epi32
  • _mm512_rem_epi8
  • _mm512_rem_epi16
  • _mm512_rem_epi64
  • _mm512_log10_pd
  • _mm512_mask_log10_pd
  • _mm512_log10_ps
  • _mm512_mask_log10_ps
  • _mm512_log1p_pd
  • _mm512_mask_log1p_pd
  • _mm512_log1p_ps
  • _mm512_mask_log1p_ps
  • _mm512_log2_pd
  • _mm512_mask_log2_pd
  • _mm512_log_pd
  • _mm512_mask_log_pd
  • _mm512_log_ps
  • _mm512_mask_log_ps
  • _mm512_logb_pd
  • _mm512_mask_logb_pd
  • _mm512_logb_ps
  • _mm512_mask_logb_ps
  • _mm512_nearbyint_pd
  • _mm512_mask_nearbyint_pd
  • _mm512_nearbyint_ps
  • _mm512_mask_nearbyint_ps
  • _mm512_pow_pd
  • _mm512_mask_pow_pd
  • _mm512_pow_ps
  • _mm512_mask_pow_ps
  • _mm512_recip_pd
  • _mm512_mask_recip_pd
  • _mm512_recip_ps
  • _mm512_mask_recip_ps
  • _mm512_rint_pd
  • _mm512_mask_rint_pd
  • _mm512_rint_ps
  • _mm512_mask_rint_ps
  • _mm512_svml_round_pd
  • _mm512_mask_svml_round_pd
  • _mm512_sin_pd
  • _mm512_mask_sin_pd
  • _mm512_sin_ps
  • _mm512_mask_sin_ps
  • _mm512_sinh_pd
  • _mm512_mask_sinh_pd
  • _mm512_sinh_ps
  • _mm512_mask_sinh_ps
  • _mm512_sind_pd
  • _mm512_mask_sind_pd
  • _mm512_sind_ps
  • _mm512_mask_sind_ps
  • _mm512_tan_pd
  • _mm512_mask_tan_pd
  • _mm512_tan_ps
  • _mm512_mask_tan_ps
  • _mm512_tand_pd
  • _mm512_mask_tand_pd
  • _mm512_tand_ps
  • _mm512_mask_tand_ps
  • _mm512_tanh_pd
  • _mm512_mask_tanh_pd
  • _mm512_tanh_ps
  • _mm512_mask_tanh_ps
  • _mm512_trunc_pd
  • _mm512_mask_trunc_pd
  • _mm512_trunc_ps
  • _mm512_mask_trunc_ps
  • _mm512_div_epu32
  • _mm512_mask_div_epu32
  • _mm512_div_epu8
  • _mm512_div_epu16
  • _mm512_div_epu64
  • _mm512_rem_epu32
  • _mm512_mask_rem_epu32
  • _mm512_rem_epu8
  • _mm512_rem_epu16
  • _mm512_rem_epu64
  • _mm512_sincos_pd
  • _mm512_mask_sincos_pd
  • _mm512_sincos_ps
  • _mm512_mask_sincos_ps
  • _mm_acos_pd
  • _mm_acos_ps
  • _mm_acosh_pd
  • _mm_acosh_ps
  • _mm_asin_pd
  • _mm_asin_ps
  • _mm_asinh_pd
  • _mm_asinh_ps
  • _mm_atan_pd
  • _mm_atan_ps
  • _mm_atan2_pd
  • _mm_atan2_ps
  • _mm_atanh_pd
  • _mm_atanh_ps
  • _mm_cbrt_pd
  • _mm_cbrt_ps
  • _mm_cdfnorm_pd
  • _mm_cdfnorm_ps
  • _mm_cdfnorminv_pd
  • _mm_cdfnorminv_ps
  • _mm_cexp_ps
  • _mm_clog_ps
  • _mm_cos_pd
  • _mm_cos_ps
  • _mm_cosd_pd
  • _mm_cosd_ps
  • _mm_cosh_pd
  • _mm_cosh_ps
  • _mm_csqrt_ps
  • _mm_div_epi8
  • _mm_div_epi16
  • _mm_div_epi32
  • _mm_div_epi64
  • _mm_div_epu8
  • _mm_div_epu16
  • _mm_div_epu32
  • _mm_div_epu64
  • _mm_erf_pd
  • _mm_erf_ps
  • _mm_erfc_pd
  • _mm_erfc_ps
  • _mm_erfcinv_pd
  • _mm_erfcinv_ps
  • _mm_erfinv_pd
  • _mm_erfinv_ps
  • _mm_exp_pd
  • _mm_exp_ps
  • _mm_exp10_pd
  • _mm_exp10_ps
  • _mm_exp2_pd
  • _mm_exp2_ps
  • _mm_expm1_pd
  • _mm_expm1_ps
  • _mm_hypot_pd
  • _mm_hypot_ps
  • _mm_idiv_epi32
  • _mm_idivrem_epi32
  • _mm_invcbrt_pd
  • _mm_invcbrt_ps
  • _mm_invsqrt_pd
  • _mm_invsqrt_ps
  • _mm_irem_epi32
  • _mm_log_pd
  • _mm_log_ps
  • _mm_log10_pd
  • _mm_log10_ps
  • _mm_log1p_pd
  • _mm_log1p_ps
  • _mm_log2_pd
  • _mm_log2_ps
  • _mm_logb_pd
  • _mm_logb_ps
  • _mm_pow_pd
  • _mm_pow_ps
  • _mm_rem_epi8
  • _mm_rem_epi16
  • _mm_rem_epi32
  • _mm_rem_epi64
  • _mm_rem_epu8
  • _mm_rem_epu16
  • _mm_rem_epu32
  • _mm_rem_epu64
  • _mm_sin_pd
  • _mm_sin_ps
  • _mm_sincos_pd
  • _mm_sincos_ps
  • _mm_sind_pd
  • _mm_sind_ps
  • _mm_sinh_pd
  • _mm_sinh_ps
  • _mm_svml_ceil_pd
  • _mm_svml_ceil_ps
  • _mm_svml_floor_pd
  • _mm_svml_floor_ps
  • _mm_svml_round_pd
  • _mm_svml_round_ps
  • _mm_svml_sqrt_pd
  • _mm_svml_sqrt_ps
  • _mm_tan_pd
  • _mm_tan_ps
  • _mm_tand_pd
  • _mm_tand_ps
  • _mm_tanh_pd
  • _mm_tanh_ps
  • _mm_trunc_pd
  • _mm_trunc_ps
  • _mm_udiv_epi32
  • _mm_udivrem_epi32
  • _mm_urem_epi32
@nemequ nemequ added the GSoC/Outreachy-ideas Ideas for Google Summer of Code or Outreachy projects label Jan 15, 2020
@mr-c mr-c added instruction-set-support Implementing new SIMD ISA extensions portably and removed GSoC/Outreachy-ideas Ideas for Google Summer of Code or Outreachy projects labels May 17, 2020
@mr-c
Collaborator

mr-c commented Jul 6, 2020

Congratulations @himanshi18037 !

@himanshi18037
Contributor

> Congratulations @himanshi18037 !

Thanks :) @mr-c

@nemequ
Member Author

nemequ commented Jul 15, 2020

This is finished; great job @himanshi18037!

@nemequ nemequ closed this as completed Jul 15, 2020