Commit 7b99d43

Merge pull request #915 from jalvesz/intrinsics
intrinsics module with alternative implementations
2 parents 6c2565d + 12612bc commit 7b99d43

13 files changed

+1022
-1
lines changed

doc/specs/stdlib_intrinsics.md

+158
@@ -0,0 +1,158 @@
1+
---
2+
title: intrinsics
3+
---
4+
5+
# The `stdlib_intrinsics` module
6+
7+
[TOC]
8+
9+
## Introduction
10+
11+
The `stdlib_intrinsics` module provides replacements for some well-known Fortran intrinsic functions for which a faster and/or more accurate implementation exists and has proven of interest to the Fortran community.
12+
13+
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->
14+
### `stdlib_sum` function
15+
16+
#### Description
17+
18+
The `stdlib_sum` function can replace the intrinsic `sum` for `real`, `complex` or `integer` arrays. It follows a chunked implementation which maximizes vectorization potential and reduces round-off error. This procedure is recommended when summing large arrays (e.g., more than 2**10 elements); for repeated summation of smaller arrays, consider the intrinsic `sum`.
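
As a rough illustration of the chunked strategy described above, the following standalone sketch accumulates into a small array of partial sums, which a compiler can map onto vector lanes, and reduces the partials at the end. The chunk size of 64, the names `demo_chunked_sum` and `chunked_sum`, and the way the remainder is handled are assumptions made for this example; they are not taken from the stdlib sources.

```fortran
program demo_chunked_sum
    use, intrinsic :: iso_fortran_env, only: sp => real32, dp => real64
    implicit none
    integer, parameter :: n = 100000
    real(sp), allocatable :: x(:)
    real(dp) :: reference, err_chunked

    allocate( x(n) )
    call random_number(x)
    reference = sum(real(x, dp))   ! double-precision reference value

    err_chunked = abs(real(chunked_sum(x), dp) - reference)
    print *, 'intrinsic sum error:', abs(real(sum(x), dp) - reference)
    print *, 'chunked sum error  :', err_chunked

    ! Loose sanity check only; it is not a statement about accuracy bounds.
    if (err_chunked > 1.0e-3_dp * reference) error stop 'chunked sum far from reference'

contains

    pure function chunked_sum(a) result(s)
        real(sp), intent(in) :: a(:)
        real(sp) :: s
        integer, parameter :: chunk = 64   ! assumed chunk size
        real(sp) :: partial(chunk)
        integer :: i, r

        r = mod(size(a), chunk)
        partial = 0.0_sp
        ! The first r elements (the remainder) seed part of the accumulator.
        partial(1:r) = a(1:r)
        ! Every remaining block of `chunk` elements is added lane by lane.
        do i = r + 1, size(a), chunk
            partial = partial + a(i:i+chunk-1)
        end do
        ! Final reduction of the partial sums.
        s = sum(partial)
    end function chunked_sum

end program demo_chunked_sum
```

Because each lane only accumulates a fraction of the elements, the partial sums stay smaller than a single running total would, which is one reason this layout tends to lose fewer low-order bits than a plain sequential loop.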
19+
20+
#### Syntax
21+
22+
`res = ` [[stdlib_intrinsics(module):stdlib_sum(interface)]] ` (x [,mask] )`
23+
24+
`res = ` [[stdlib_intrinsics(module):stdlib_sum(interface)]] ` (x, dim [,mask] )`
25+
26+
#### Status
27+
28+
Experimental
29+
30+
#### Class
31+
32+
Pure function.
33+
34+
#### Argument(s)
35+
36+
`x`: N-D array of either `real`, `complex` or `integer` type. This argument is `intent(in)`.
37+
38+
`dim` (optional): scalar of type `integer` with a value in the range from 1 to n, where n equals the rank of `x`.
39+
40+
`mask` (optional): N-D array of `logical` values, with the same shape as `x`. This argument is `intent(in)`.
41+
42+
#### Output value or Result value
43+
44+
If `dim` is absent, the output is a scalar of the same type and kind as `x`. Otherwise, the output is an array of rank n-1, where n is the rank of `x`, whose shape is that of `x` with dimension `dim` dropped.
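
The result shapes described above are those of the intrinsic `sum`. The sketch below uses the intrinsic so that it compiles without stdlib; per the syntax section, `stdlib_sum` is documented to take the same `x`, `dim` and `mask` arguments. The array contents and the program name `demo_sum_shapes` are made up for illustration.

```fortran
program demo_sum_shapes
    implicit none
    integer :: x(2,3)
    logical :: mask(2,3)

    ! Column-major fill: x(:,1) = [1,2], x(:,2) = [3,4], x(:,3) = [5,6]
    x = reshape([1, 2, 3, 4, 5, 6], shape(x))
    mask = mod(x, 2) == 0              ! keep even entries only

    ! Without dim the result is a scalar.
    if (sum(x) /= 21) error stop 'unexpected total'
    if (sum(x, mask=mask) /= 12) error stop 'unexpected masked total'

    ! With dim the result has rank 1: the summed dimension is dropped.
    if (any(shape(sum(x, dim=1)) /= [3])) error stop 'unexpected shape for dim=1'
    if (any(shape(sum(x, dim=2)) /= [2])) error stop 'unexpected shape for dim=2'
    if (any(sum(x, dim=1, mask=mask) /= [2, 4, 6])) error stop 'unexpected masked column sums'

    print *, 'sum over dim=1:', sum(x, dim=1)
    print *, 'sum over dim=2:', sum(x, dim=2)
end program demo_sum_shapes
```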
45+
46+
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->
47+
### `stdlib_sum_kahan` function
48+
49+
#### Description
50+
51+
The `stdlib_sum_kahan` function can replace the intrinsic `sum` for `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential, complemented by an `elemental` kernel based on the [Kahan summation](https://doi.org/10.1145%2F363707.363723) strategy to reduce round-off error:
52+
53+
```fortran
elemental subroutine kahan_kernel_<kind>(a,s,c)
    type(<kind>), intent(in) :: a
    type(<kind>), intent(inout) :: s
    type(<kind>), intent(inout) :: c
    type(<kind>) :: t, y
    y = a - c
    t = s + y
    c = (t - s) - y
    s = t
end subroutine
```
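
To see what the compensation term `c` buys, here is a self-contained sketch that applies the same update to a single-precision array and compares it with a naive loop. It is written for `real32` only and applies the kernel sequentially rather than in chunks; the program name `demo_kahan`, the array length and the constant value 0.1 are arbitrary choices for this illustration.

```fortran
program demo_kahan
    use, intrinsic :: iso_fortran_env, only: sp => real32, dp => real64
    implicit none
    integer, parameter :: n = 1000000
    real(sp), allocatable :: x(:)
    real(sp) :: naive, s, c
    real(dp) :: reference, err_naive, err_kahan
    integer :: i

    allocate( x(n) )
    x = 0.1_sp
    ! Reference computed in double precision from the stored (rounded) value.
    reference = real(n, dp) * real(x(1), dp)

    ! Plain running sum in single precision.
    naive = 0.0_sp
    do i = 1, n
        naive = naive + x(i)
    end do

    ! Compensated running sum: c carries the rounding error forward.
    s = 0.0_sp
    c = 0.0_sp
    do i = 1, n
        call kahan_kernel(x(i), s, c)
    end do

    err_naive = abs(real(naive, dp) - reference)
    err_kahan = abs(real(s, dp) - reference)
    print *, 'naive error:', err_naive
    print *, 'kahan error:', err_kahan
    if (err_kahan > err_naive) error stop 'compensation did not help'

contains

    elemental subroutine kahan_kernel(a, s, c)
        real(sp), intent(in) :: a
        real(sp), intent(inout) :: s, c
        real(sp) :: t, y
        y = a - c          ! input corrected by the running compensation
        t = s + y          ! new sum; low-order bits of y may be lost here
        c = (t - s) - y    ! recovers what was lost (zero in exact arithmetic)
        s = t
    end subroutine kahan_kernel

end program demo_kahan
```

Value-unsafe optimization flags such as `-ffast-math` allow the compiler to treat `(t - s) - y` as zero and may remove the compensation entirely, so a comparison like this is best compiled without them.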
65+
66+
#### Syntax
67+
68+
`res = ` [[stdlib_intrinsics(module):stdlib_sum_kahan(interface)]] ` (x [,mask] )`
69+
70+
`res = ` [[stdlib_intrinsics(module):stdlib_sum_kahan(interface)]] ` (x, dim [,mask] )`
71+
72+
#### Status
73+
74+
Experimental
75+
76+
#### Class
77+
78+
Pure function.
79+
80+
#### Argument(s)
81+
82+
`x`: N-D array of either `real` or `complex` type. This argument is `intent(in)`.
83+
84+
`dim` (optional): scalar of type `integer` with a value in the range from 1 to n, where n equals the rank of `x`.
85+
86+
`mask` (optional): N-D array of `logical` values, with the same shape as `x`. This argument is `intent(in)`.
87+
88+
#### Output value or Result value
89+
90+
If `dim` is absent, the output is a scalar of the same type and kind as `x`. Otherwise, the output is an array of rank n-1, where n is the rank of `x`, whose shape is that of `x` with dimension `dim` dropped.
91+
92+
#### Example
93+
94+
```fortran
{!example/intrinsics/example_sum.f90!}
```
97+
98+
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->
99+
### `stdlib_dot_product` function
100+
101+
#### Description
102+
103+
The `stdlib_dot_product` function can replace the intrinsic `dot_product` for 1D `real`, `complex` or `integer` arrays. It follows a chunked implementation which maximizes vectorization potential and reduces round-off error. This procedure is recommended for large arrays; for repeated products of smaller arrays, consider the intrinsic `dot_product`.
104+
105+
#### Syntax
106+
107+
`res = ` [[stdlib_intrinsics(module):stdlib_dot_product(interface)]] ` (x, y)`
108+
109+
#### Status
110+
111+
Experimental
112+
113+
#### Class
114+
115+
Pure function.
116+
117+
#### Argument(s)
118+
119+
`x`: 1D array of either `real`, `complex` or `integer` type. This argument is `intent(in)`.
120+
121+
`y`: 1D array of the same type and kind as `x`. This argument is `intent(in)`.
122+
123+
#### Output value or Result value
124+
125+
The output is a scalar of the same type and kind as `x` and `y`.
126+
127+
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->
128+
### `stdlib_dot_product_kahan` function
129+
130+
#### Description
131+
132+
The `stdlib_dot_product_kahan` function can replace the intrinsic `dot_product` for 1D `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential, complemented by the same `elemental` kernel based on [Kahan summation](https://doi.org/10.1145%2F363707.363723) that is used for `stdlib_sum_kahan` to reduce round-off error.
133+
134+
#### Syntax
135+
136+
`res = ` [[stdlib_intrinsics(module):stdlib_dot_product_kahan(interface)]] ` (x, y)`
137+
138+
#### Status
139+
140+
Experimental
141+
142+
#### Class
143+
144+
Pure function.
145+
146+
#### Argument(s)
147+
148+
`x`: 1D array of either `real` or `complex` type. This argument is `intent(in)`.
149+
150+
`y`: 1D array of the same type and kind as `x`. This argument is `intent(in)`.
151+
152+
#### Output value or Result value
153+
154+
The output is a scalar of the same type and kind as `x` and `y`.
155+
156+
```fortran
{!example/intrinsics/example_dot_product.f90!}
```

example/CMakeLists.txt

+1
@@ -13,6 +13,7 @@ add_subdirectory(constants)
1313
add_subdirectory(error)
1414
add_subdirectory(hashmaps)
1515
add_subdirectory(hash_procedures)
16+
add_subdirectory(intrinsics)
1617
add_subdirectory(io)
1718
add_subdirectory(linalg)
1819
add_subdirectory(logger)

example/intrinsics/CMakeLists.txt

+2
@@ -0,0 +1,2 @@
1+
ADD_EXAMPLE(sum)
2+
ADD_EXAMPLE(dot_product)

example/intrinsics/example_dot_product.f90

+18
@@ -0,0 +1,18 @@
program example_dot_product
    use stdlib_kinds, only: sp
    use stdlib_intrinsics, only: stdlib_dot_product, stdlib_dot_product_kahan
    implicit none

    real(sp), allocatable :: x(:), y(:)
    real(sp) :: total_prod(3)

    allocate( x(1000), y(1000) )
    call random_number(x)
    call random_number(y)

    total_prod(1) = dot_product(x,y) !> compiler intrinsic
    total_prod(2) = stdlib_dot_product(x,y) !> chunked summation over inner product
    total_prod(3) = stdlib_dot_product_kahan(x,y) !> chunked kahan summation over inner product
    print *, total_prod(1:3)

end program example_dot_product

example/intrinsics/example_sum.f90

+17
@@ -0,0 +1,17 @@
program example_sum
    use stdlib_kinds, only: sp
    use stdlib_intrinsics, only: stdlib_sum, stdlib_sum_kahan
    implicit none

    real(sp), allocatable :: x(:)
    real(sp) :: total_sum(3)

    allocate( x(1000) )
    call random_number(x)

    total_sum(1) = sum(x) !> compiler intrinsic
    total_sum(2) = stdlib_sum(x) !> chunked summation
    total_sum(3) = stdlib_sum_kahan(x) !> chunked kahan summation
    print *, total_sum(1:3)

end program example_sum

src/CMakeLists.txt

+3
@@ -17,6 +17,9 @@ set(fppFiles
1717
stdlib_hash_64bit_fnv.fypp
1818
stdlib_hash_64bit_pengy.fypp
1919
stdlib_hash_64bit_spookyv2.fypp
20+
stdlib_intrinsics_dot_product.fypp
21+
stdlib_intrinsics_sum.fypp
22+
stdlib_intrinsics.fypp
2023
stdlib_io.fypp
2124
stdlib_io_npy.fypp
2225
stdlib_io_npy_load.fypp

src/stdlib_constants.fypp

+17-1
@@ -1,9 +1,13 @@
11
#:include "common.fypp"
22
#:set KINDS = REAL_KINDS
3+
#:set I_KINDS_TYPES = list(zip(INT_KINDS, INT_TYPES, INT_KINDS))
4+
#:set R_KINDS_TYPES = list(zip(REAL_KINDS, REAL_TYPES, REAL_SUFFIX))
5+
#:set C_KINDS_TYPES = list(zip(CMPLX_KINDS, CMPLX_TYPES, CMPLX_SUFFIX))
6+
37
module stdlib_constants
48
!! Constants
59
!! ([Specification](../page/specs/stdlib_constants.html))
6-
use stdlib_kinds, only: #{for k in KINDS[:-1]}#${k}$, #{endfor}#${KINDS[-1]}$
10+
use stdlib_kinds
711
use stdlib_codata, only: SPEED_OF_LIGHT_IN_VACUUM, &
812
VACUUM_ELECTRIC_PERMITTIVITY, &
913
VACUUM_MAG_PERMEABILITY, &
@@ -60,5 +64,17 @@ module stdlib_constants
6064
real(dp), parameter, public :: u = ATOMIC_MASS_CONSTANT%value !! Atomic mass constant
6165

6266
! Additional constants if needed
67+
#:for k, t, s in I_KINDS_TYPES
68+
${t}$, parameter, public :: zero_${s}$ = 0_${k}$
69+
${t}$, parameter, public :: one_${s}$ = 1_${k}$
70+
#:endfor
71+
#:for k, t, s in R_KINDS_TYPES
72+
${t}$, parameter, public :: zero_${s}$ = 0._${k}$
73+
${t}$, parameter, public :: one_${s}$ = 1._${k}$
74+
#:endfor
75+
#:for k, t, s in C_KINDS_TYPES
76+
${t}$, parameter, public :: zero_${s}$ = (0._${k}$,0._${k}$)
77+
${t}$, parameter, public :: one_${s}$ = (1._${k}$,0._${k}$)
78+
#:endfor
6379

6480
end module stdlib_constants
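
For readers unfamiliar with fypp, the three loops above generate one `zero_*` and one `one_*` parameter per kind. The sketch below writes out by hand what that expansion might look like for a few kinds, inside a hypothetical module `demo_constants`. The suffixes shown (`int32`, `sp`, `dp`, `csp`) are assumptions about what `common.fypp` defines, and the kind parameters come from `iso_fortran_env` instead of `stdlib_kinds` so that the example compiles on its own.

```fortran
module demo_constants
    use, intrinsic :: iso_fortran_env, only: int32, sp => real32, dp => real64
    implicit none
    private

    ! Hand-written stand-ins for the integer loop (one kind shown)
    integer(int32), parameter, public :: zero_int32 = 0_int32
    integer(int32), parameter, public :: one_int32  = 1_int32
    ! Stand-ins for the real loop (two kinds shown)
    real(sp), parameter, public :: zero_sp = 0._sp
    real(sp), parameter, public :: one_sp  = 1._sp
    real(dp), parameter, public :: zero_dp = 0._dp
    real(dp), parameter, public :: one_dp  = 1._dp
    ! Stand-ins for the complex loop (one kind shown, suffix assumed)
    complex(sp), parameter, public :: zero_csp = (0._sp, 0._sp)
    complex(sp), parameter, public :: one_csp  = (1._sp, 0._sp)
end module demo_constants

program demo_constants_usage
    use, intrinsic :: iso_fortran_env, only: sp => real32
    use demo_constants, only: zero_sp, one_sp, one_csp
    implicit none
    real(sp) :: acc
    complex(sp) :: z

    acc = zero_sp          ! kind-correct initial value for an accumulator
    acc = acc + one_sp
    z = one_csp
    if (acc /= 1.0_sp .or. real(z) /= 1.0_sp) error stop 'unexpected constant value'
    print *, acc, z
end program demo_constants_usage
```

Named constants of this sort let templated code initialize accumulators without spelling out a kind-suffixed literal for every type, which is presumably why they were added alongside the summation routines.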
