Skip to content

Commit

Permalink
Merge pull request #426 from jvdp1/median
Browse files Browse the repository at this point in the history
Addition of a subroutine to compute the median of array elements
  • Loading branch information
milancurcic committed Jul 23, 2021
2 parents 88e2219 + a13c700 commit dd81cf5
Show file tree
Hide file tree
Showing 10 changed files with 779 additions and 8 deletions.
94 changes: 94 additions & 0 deletions doc/specs/stdlib_stats.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,10 @@ The Pearson correlation between two rows (or columns), say `x` and `y`, of `arra

`result = [[stdlib_stats(module):corr(interface)]](array, dim [, mask])`

### Class

Generic subroutine

### Arguments

`array`: Shall be a rank-1 or a rank-2 array of type `integer`, `real`, or `complex`.
Expand Down Expand Up @@ -83,6 +87,10 @@ The scaling can be changed with the logical argument `corrected`. If `corrected`

`result = [[stdlib_stats(module):cov(interface)]](array, dim [, mask [, corrected]])`

### Class

Generic subroutine

### Arguments

`array`: Shall be a rank-1 or a rank-2 array of type `integer`, `real`, or `complex`.
Expand Down Expand Up @@ -134,6 +142,10 @@ Returns the mean of all the elements of `array`, or of the elements of `array` a

`result = [[stdlib_stats(module):mean(interface)]](array, dim [, mask])`

### Class

Generic subroutine

### Arguments

`array`: Shall be an array of type `integer`, `real`, or `complex`.
Expand Down Expand Up @@ -166,6 +178,80 @@ program demo_mean
end program demo_mean
```

## `median` - median of array elements

### Status

Experimental

### Description

Returns the median of all the elements of `array`, or of the elements of `array`
along dimension `dim` if provided, and if the corresponding element in `mask` is
`true`.

The median of the elements of `array` is defined as the "middle"
element, after that the elements are sorted in an increasing order, e.g. `array_sorted =
sort(array)`. If `n = size(array)` is an even number, the median is:

```
median(array) = array_sorted( floor( (n + 1) / 2.))
```

and if `n` is an odd number, the median is:

```
median(array) = mean( array_sorted( floor( (n + 1) / 2.):floor( (n + 1) / 2.) + 1 ) )
```

The current implementation is a quite naive implementation that relies on sorting
the whole array, using the subroutine `[[stdlib_sorting(module):ord_sort(interface)]]`
provided by the `[[stdlib_sorting(module)]]` module.

### Syntax

`result = [[stdlib_stats(module):median(interface)]](array [, mask])`

`result = [[stdlib_stats(module):median(interface)]](array, dim [, mask])`

### Class

Generic subroutine

### Arguments

`array`: Shall be an array of type `integer` or `real`.

`dim`: Shall be a scalar of type `integer` with a value in the range from 1 to `n`, where `n` is the rank of `array`.

`mask` (optional): Shall be of type `logical` and either a scalar or an array of the same shape as `array`.

### Return value

If `array` is of type `real`, the result is of type `real` with the same kind as `array`.
If `array` is of type `real` and contains IEEE `NaN`, the result is IEEE `NaN`.
If `array` is of type `integer`, the result is of type `real(dp)`.

If `dim` is absent, a scalar with the median of all elements in `array` is returned. Otherwise, an array of rank `n-1`, where `n` equals the rank of `array`, and a shape similar to that of `array` with dimension `dim` dropped is returned.

If `mask` is specified, the result is the median of all elements of `array` corresponding to `true` elements of `mask`. If every element of `mask` is `false`, the result is IEEE `NaN`.


### Example

```fortran
program demo_median
use stdlib_stats, only: median
implicit none
real :: x(1:6) = [ 1., 2., 3., 4., 5., 6. ]
real :: y(1:2, 1:3) = reshape([ 1., 2., 3., 4., 5., 6. ], [ 2, 3])
print *, median(x) !returns 3.5
print *, median(y) !returns 3.5
print *, median(y, 1) !returns [ 1.5, 3.5, 5.5 ]
print *, median(y, 1,y > 3.) !returns [ NaN, 4.0, 5.5 ]
end program demo_median
```

## `moment` - central moments of array elements

### Status
Expand Down Expand Up @@ -199,6 +285,10 @@ The _k_-th order moment about `center` is defined as :

`result = [[stdlib_stats(module):moment(interface)]](array, order, dim [, center [, mask]])`

### Class

Generic subroutine

### Arguments

`array`: Shall be an array of type `integer`, `real`, or `complex`.
Expand Down Expand Up @@ -264,6 +354,10 @@ The use of the term `n-1` for scaling is called Bessel 's correction. The scalin

`result = [[stdlib_stats(module):var(interface)]](array, dim [, mask [, corrected]])`

### Class

Generic subroutine

### Arguments

`array`: Shall be an array of type `integer`, `real`, or `complex`.
Expand Down
1 change: 1 addition & 0 deletions src/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ set(fppFiles
stdlib_stats_corr.fypp
stdlib_stats_cov.fypp
stdlib_stats_mean.fypp
stdlib_stats_median.fypp
stdlib_stats_moment.fypp
stdlib_stats_moment_all.fypp
stdlib_stats_moment_mask.fypp
Expand Down
6 changes: 6 additions & 0 deletions src/Makefile.manual
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ SRCFYPP =\
stdlib_stats_corr.fypp \
stdlib_stats_cov.fypp \
stdlib_stats_mean.fypp \
stdlib_stats_median.fypp \
stdlib_stats_moment.fypp \
stdlib_stats_moment_all.fypp \
stdlib_stats_moment_mask.fypp \
Expand Down Expand Up @@ -121,6 +122,11 @@ stdlib_stats_mean.o: \
stdlib_optval.o \
stdlib_kinds.o \
stdlib_stats.o
stdlib_stats_median.o: \
stdlib_optval.o \
stdlib_kinds.o \
stdlib_sorting.o \
stdlib_stats.o
stdlib_stats_moment.o: \
stdlib_optval.o \
stdlib_kinds.o \
Expand Down
63 changes: 63 additions & 0 deletions src/common.fypp
Original file line number Diff line number Diff line change
Expand Up @@ -156,4 +156,67 @@ ${prefix + joinstr.join([line.strip() for line in txt.split("\n")]) + suffix}$
#:endcall
#:enddef

#!
#! Generates an array rank suffix for subarrays along a dimension
#!
#! Args:
#! varname (str): Name of the variable to be used as origin
#! rank (int): Rank of the original variable
#! dim (int): Dimension of the variable
#!
#! Returns:
#! Array rank suffix string enclosed in braces
#!
#! E.g.,
#! select_subvector('j', 5, 2)
#! -> (j1, :, j3, j4, j5)
#!
#! Used, e.g., in
#! stdlib_stats_median.fypp
#!
#:def select_subvector(varname, rank, idim)
#:assert rank > 0
#:call join_lines(joinstr=", ", prefix="(", suffix=")")
#:for i in range(1, idim)
${varname}$${i}$
#:endfor
:
#:for i in range(idim + 1, rank + 1)
${varname}$${i}$
#:endfor
#:endcall
#:enddef

#!
#! Generates an array rank suffix for arrays
#!
#! Args:
#! varname (str): Name of the variable to be used as origin
#! rank (int): Rank of the original array variable
#! idim (int): Dimension of the variable dropped
#!
#! Returns:
#! Array rank suffix string enclosed in braces
#!
#! E.g.,
#! reduce_subvector('j', 5, 2)
#! -> (j1, j3, j4, j5)
#!
#! Used, e.g., in
#! stdlib_stats_median.fypp
#!
#:def reduce_subvector(varname, rank, idim)
#:assert rank > 0
#:if rank > 1
#:call join_lines(joinstr=", ", prefix="(", suffix=")")
#:for i in range(1, idim)
${varname}$${i}$
#:endfor
#:for i in range(idim + 1, rank + 1)
${varname}$${i}$
#:endfor
#:endcall
#:endif
#:enddef

#:endmute
67 changes: 61 additions & 6 deletions src/stdlib_stats.fypp
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
#:set RANKS = range(1, MAXRANK + 1)
#:set REDRANKS = range(2, MAXRANK + 1)
#:set RC_KINDS_TYPES = REAL_KINDS_TYPES + CMPLX_KINDS_TYPES
#:set IR_KINDS_TYPES_OUTPUT = list(zip(INT_KINDS,INT_TYPES, ['dp']*len(INT_KINDS))) + list(zip(REAL_KINDS, REAL_TYPES, REAL_KINDS))
module stdlib_stats
!! Provides support for various statistical methods. This includes currently
!! descriptive statistics
Expand All @@ -11,14 +12,14 @@ module stdlib_stats
implicit none
private
! Public API
public :: corr, cov, mean, moment, var
public :: corr, cov, mean, median, moment, var


interface corr
!! version: experimental
!!
!! Pearson correlation of array elements
!! ([Specification](../page/specs/stdlib_stats.html#description))
!! ([Specification](../page/specs/stdlib_stats.html#corr-pearson-correlation-of-array-elements))
#:for k1, t1 in RC_KINDS_TYPES
#:set RName = rname("corr",1, t1, k1)
module function ${RName}$(x, dim, mask) result(res)
Expand Down Expand Up @@ -110,7 +111,7 @@ module stdlib_stats
!! version: experimental
!!
!! Covariance of array elements
!! ([Specification](../page/specs/stdlib_stats.html#description_1))
!! ([Specification](../page/specs/stdlib_stats.html#cov-covariance-of-array-elements))
#:for k1, t1 in RC_KINDS_TYPES
#:set RName = rname("cov",1, t1, k1)
module function ${RName}$(x, dim, mask, corrected) result(res)
Expand Down Expand Up @@ -209,7 +210,7 @@ module stdlib_stats
!! version: experimental
!!
!! Mean of array elements
!! ([Specification](../page/specs/stdlib_stats.html#description_2))
!! ([Specification](../page/specs/stdlib_stats.html#mean-mean-of-array-elements))
#:for k1, t1 in RC_KINDS_TYPES
#:for rank in RANKS
#:set RName = rname("mean_all",rank, t1, k1)
Expand Down Expand Up @@ -305,11 +306,65 @@ module stdlib_stats
end interface mean


interface median
!! version: experimental
!!
!! Median of array elements
!! ([Specification](../page/specs/stdlib_stats.html#median-median-of-array-elements))
#:for k1, t1, o1 in IR_KINDS_TYPES_OUTPUT
#:for rank in RANKS
#:set name = rname("median_all",rank, t1, k1, o1)
module function ${name}$ (x, mask) result(res)
${t1}$, intent(in) :: x${ranksuffix(rank)}$
logical, intent(in), optional :: mask
real(${o1}$) :: res
end function ${name}$
#:endfor
#:endfor

#:for k1, t1, o1 in IR_KINDS_TYPES_OUTPUT
#:for rank in RANKS
#:set name = rname("median",rank, t1, k1, o1)
module function ${name}$(x, dim, mask) result(res)
${t1}$, intent(in) :: x${ranksuffix(rank)}$
integer, intent(in) :: dim
logical, intent(in), optional :: mask
real(${o1}$) :: res${reduced_shape('x', rank, 'dim')}$
end function ${name}$
#:endfor
#:endfor

#:for k1, t1, o1 in IR_KINDS_TYPES_OUTPUT
#:for rank in RANKS
#:set name = rname('median_all_mask',rank, t1, k1, o1)
module function ${name}$(x, mask) result(res)
${t1}$, intent(in) :: x${ranksuffix(rank)}$
logical, intent(in) :: mask${ranksuffix(rank)}$
real(${o1}$) :: res
end function ${name}$
#:endfor
#:endfor

#:for k1, t1, o1 in IR_KINDS_TYPES_OUTPUT
#:for rank in RANKS
#:set name = rname('median_mask',rank, t1, k1, o1)
module function ${name}$(x, dim, mask) result(res)
${t1}$, intent(in) :: x${ranksuffix(rank)}$
integer, intent(in) :: dim
logical, intent(in) :: mask${ranksuffix(rank)}$
real(${o1}$) :: res${reduced_shape('x', rank, 'dim')}$
end function ${name}$
#:endfor
#:endfor

end interface


interface var
!! version: experimental
!!
!! Variance of array elements
!! ([Specification](../page/specs/stdlib_stats.html#description_4))
!! ([Specification](../page/specs/stdlib_stats.html#var-variance-of-array-elements))

#:for k1, t1 in RC_KINDS_TYPES
#:for rank in RANKS
Expand Down Expand Up @@ -418,7 +473,7 @@ module stdlib_stats
!! version: experimental
!!
!! Central moment of array elements
!! ([Specification](../page/specs/stdlib_stats.html#description_3))
!! ([Specification](../page/specs/stdlib_stats.html#moment-central-moments-of-array-elements))
#:for k1, t1 in RC_KINDS_TYPES
#:for rank in RANKS
#:set RName = rname("moment_all",rank, t1, k1)
Expand Down
Loading

0 comments on commit dd81cf5

Please sign in to comment.