Merge pull request #68 from milankl/mk/conversion

Match SoftPosit with 2022 posit standard
milankl · Jun 10, 2022 · d55118a
2 parents aa307d5 + add266c

Showing 40 changed files with 1,129 additions and 1,086 deletions.
diff --git a/Project.toml b/Project.toml
@@ -1,13 +1,13 @@
 name = "SoftPosit"
 uuid = "0775deef-a35f-56d7-82da-cfc52f91364d"
-version = "0.4.0"
+version = "0.5.0"
 
 [deps]
 SoftPosit_jll = "f9aa12f2-fb2a-5e38-99be-91dba0a1f972"
 
 [compat]
-SoftPosit_jll = "0.4.1"
-julia = "1.6"
+SoftPosit_jll = "0.4.1, 0.4.2"
+julia = "1.6, 1.7"
 
 [extras]
 Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

diff --git a/README.md b/README.md
@@ -4,9 +4,14 @@
 
 [Julia](https://julialang.org/) types for the C-based [SoftPosit](https://gitlab.com/cerlane/SoftPosit) library - a posit arithmetic emulator.
 
-Posit numbers are an alternative to floating-point numbers. Posits extend floats by introducing regime bits, that allow for a higher precision around one, yet a wide dynamic range of representable numbers. For further information see https://posithub.org
+Posit numbers are an alternative to floating-point numbers. Posits extend floats by introducing regime bits
+that allow for a higher precision around one, yet a wide dynamic range of representable numbers.
+For further information see [posithub.org](https://posithub.org).
 
-If this library doesn't support a desired functionality or for anything else, please raise an issue.
+If this library doesn't support a desired functionality or for anything else, please
+[raise an issue](https://github.com/milankl/SoftPosit.jl/issues).
+
+Note: This library does not yet conform to the [2022 Standard for Posit Arithmetic](https://posithub.org/docs/posit_standard-2.pdf).
 
 # Installation
 
@@ -20,15 +25,23 @@ where `]` opens the package manager. Then simply `using SoftPosit` which enables
 
 # 8, 16 and 32bit posit formats
 
-SoftPosit.jl emulates the following Posit number formats `Posit(n,es)`, with `n` number of bits including `es` exponent bits: Posit(8,0), Posit(16,1), Posit(32,2) as primitive types called
+SoftPosit.jl emulates the following Posit number formats `Posit(n,es)`, with `n` number of bits including `es` exponent bits:
+Posit(8,0), Posit(16,1), Posit(32,2) as primitive types called
 
     Posit8, Posit16, Posit32
 
-following the [draft standard](https://posithub.org/docs/posit_standard.pdf). Additionally, the following off-standard formats are defined as primitive types, which are internally stored as 32bit (the remaining bits are kept as zeros): Posit(8,1), Posit(8,2), Posit(16,1), Posit(16,2), Posit(24,1), and Posit(24,2) called `Posit8_1`, `Posit8_2`, `Posit16_1`, `Posit16_2`, `Posit24_1`, and `Posit24_2`.
+following the [draft standard](https://posithub.org/docs/posit_standard.pdf) although this will be changed to always 2 exponent
+bits to match the [2022 standard](https://posithub.org/docs/posit_standard-2.pdf). Additionally, the following formats are defined
+as primitive types, which are internally stored as 32bit (the remaining bits are kept as zeros): Posit(8,1), Posit(8,2), Posit(16,1),
+Posit(16,2), Posit(24,1), and Posit(24,2) called `Posit8_1`, `Posit8_2`, `Posit16_1`, `Posit16_2`, `Posit24_1`, and `Posit24_2`.
 
-For all the types `Posit8, Posit16, Posit32, Posit8_2, Posit16_2, Posit24_2` conversions between Integers and Floats and basic arithmetic operations `+`, `-`, `*`, `/` and `sqrt` (among others) are defined. Unfortunately, `Posit8_1, Posit16_1, Posit24_1` are not yet fully supported by the underlying C library.
+For all the types `Posit8, Posit16, Posit32, Posit8_2, Posit16_2, Posit24_2` conversions between Integers and Floats and
+basic arithmetic operations `+`, `-`, `*`, `/` and `sqrt` (among others) are defined.
+Unfortunately, `Posit8_1, Posit16_1, Posit24_1` are not fully supported by the underlying C library.
 
-To support quires, `Quire8`, `Quire16` and `Quire32` are implemented as 32 / 128 / 512bit types for fused multiply-add and fused multiply-subtract. Additional math functions like `exp`,`log`,`sin`,`cos`,`tan` are defined via conversion to `Float64` (no support yet of the C library) and therefore do not have error-free rounding.
+To support quires, `Quire8`, `Quire16` and `Quire32` are implemented as 32 / 128 / 512bit types for fused multiply-add and
+fused multiply-subtract. Additional math functions like `exp`,`log`,`sin`,`cos`,`tan` are defined via conversion to `Float64`
+(no support yet of the C library) and therefore do not have error-free rounding.
 
 # Examples
 
@@ -54,10 +67,12 @@ please read [softposit_examples.ipynb](https://github.com/milankl/SoftPosit.jl/b
 
 # Rounding mode
 
-Following the posit standard, posits should underflow between [-minpos/2,0) and (0,minpos/2] and
-never overflow in (-∞,-maxpos] and [maxpos,∞). Float ±infinity and Not-a-Number are mapped to posit
-Not-a-Real NaR, and NaR is mapped back to NaN. In the current version v0.4 the implementation of this
-rounding mode makes a few changes
+Following the [2022 posit standard](https://posithub.org/docs/posit_standard-2.pdf),
+posits should never underflow nor overflow. Consequently, `(0,minpos/2]` and `[-minpos/2,0)`
+are mapped to `±minpos`, respectively, and `[maxpos,∞)` and `(-∞,-maxpos]` to `±maxpos`.
+Float ±infinity and Not-a-Number are mapped to posit Not-a-Real NaR, and NaR is mapped back to NaN.
+In the current version v0.4 the implementation of this rounding mode deviates from the standard
+in a few ways that will be resolved in upcoming releases.
 
 - **Posit8 and Posit32**: No underflow, all numbers in [-minpos,0) are mapped to -minpos (and (0,minpos] to minpos)
 - **Posit16**: Underflow occurs at minpos/4 and overflow occurs at floatmax(Float32)/4
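
The no-underflow, no-overflow rule described in this README hunk can be sketched on plain `Float64` values. This is a minimal illustration, not the package's conversion code: `saturate` is a hypothetical helper, the `minpos`/`maxpos` values assume an 8-bit posit with 2 exponent bits, and `NaN` stands in for NaR. It only shows the saturation step, not the rounding to the nearest posit.

```julia
# Hypothetical helper (not part of SoftPosit.jl): saturate a Float64 to the
# representable posit range so that nothing underflows to zero or overflows.
function saturate(x::Float64, minpos::Float64, maxpos::Float64)
    isfinite(x) || return NaN               # ±Inf and NaN stand in for NaR
    iszero(x) && return x                   # zero is exact, keep it
    return copysign(clamp(abs(x), minpos, maxpos), x)
end

# Assumed range of an 8-bit posit with 2 exponent bits
minpos, maxpos = 2.0^-24, 2.0^24

saturate(1e-30, minpos, maxpos)     # minpos instead of 0
saturate(-1e30, minpos, maxpos)     # -maxpos instead of -Inf
saturate(Inf, minpos, maxpos)       # NaN, standing in for NaR
```

Values already inside `[minpos, maxpos]` pass through unchanged, so the helper could sit in front of any round-to-nearest step.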

diff --git a/src/SoftPosit.jl b/src/SoftPosit.jl
@@ -1,43 +1,17 @@
 module SoftPosit
 
-    import SoftPosit_jll
-    # For compatibility with previous versions of SoftPosit.jl which used the name
-    # `SoftPositPath`.
-    const SoftPositPath = SoftPosit_jll.softposit
+    # import SoftPosit_jll
+    # # For compatibility with previous versions of SoftPosit.jl which used the name
+    # # `SoftPositPath`.
+    # const SoftPositPath = SoftPosit_jll.softposit
 
-    export AbstractPosit, Posit8, Posit16, Posit32,
-        Posit8_1, Posit16_1, Posit24_1,
-        Posit8_2, Posit16_2, Posit24_2,
-        notareal, minusone,
-        AbstractQuire, Quire8, Quire16, Quire32, fms,
-        Posit16_old, Float32_old
+    export AbstractPosit, Posit8, Posit16, Posit32, Posit16_1,
+        notareal
 
-    import Base: Float64, Float32, Float16, Int32, Int64,
-        UInt8, UInt16, UInt32,
-        (+), (-), (*), (/), (<), (<=), (==), sqrt,
-        bitstring, round, one, zero, promote_rule, eps,
-        floatmin, floatmax, signbit, sign, isfinite,
-        nextfloat, prevfloat, fma,
-        exp, exp2, exp10, log, log2, log10, cos, sin, tan,
-        expm1,log1p
-
-    include("typedef.jl")
-    include("conversionFloatToPosit.jl")
-    include("conversionPositToFloat.jl")
-    include("conversionPositToPosit.jl")
-    include("conversionIntToPosit.jl")
-    include("conversionPositToInt.jl")
-    include("conversionHexBinToPosit.jl")
-    include("conversionBoolToPosit.jl")
-    include("conversionQuire.jl")
-    include("arithmetic.jl")
-    include("comparison.jl")
+    include("type_definitions.jl")
+    include("comparisons.jl")
     include("constants.jl")
-    include("round.jl")
+    include("conversions.jl")
+    include("arithmetics.jl")
     include("print.jl")
-    include("nextprevfloat.jl")
-    include("eps.jl")
-    include("quire.jl")
-    include("explog_trigonometric.jl")
-
 end
diff --git a/src/arithmetic.jl b/src/arithmetic.jl
diff --git a/src/arithmetics.jl b/src/arithmetics.jl
@@ -0,0 +1,49 @@
+# NEGATION via two's complement
+Base.:(-)(x::T) where {T<:AbstractPosit} = reinterpret(T,-unsigned(x))
+
+# SIGNBIT from the corresponding Int
+Base.signbit(x::AbstractPosit) = signbit(signed(x))
+
+# SIGN redefine, not x/|x| as in Base for floats
+function Base.sign(x::T) where {T<:AbstractPosit}
+    iszero(x) && return zero(T)
+    isnan(x) && return notareal(T)
+    return signbit(x) ? minusone(T) : one(T)
+end
+
+# TWO ARGUMENT ARITHMETIC, addition, subtraction, multiplication, division via conversion
+for op in (:(+), :(-), :(*), :(/))
+    @eval begin
+        Base.$op(x::T,y::T) where {T<:AbstractPosit} = convert(T,$op(float(x),float(y)))
+    end
+end
+
+# ONE ARGUMENT ARITHMETIC, sqrt, exp, log, etc. via conversion
+for op in (:sqrt, :exp, :exp2, :exp10, :expm1, :log, :log2, :log10, :log1p,
+            :sin, :cos, :tan)
+    @eval begin
+        Base.$op(x::T) where {T<:AbstractPosit} = convert(T,$op(float(x)))
+    end
+end
+
+Base.sincos(x::AbstractPosit) = sin(x),cos(x)   # not in eval loop because of convert
+
+# complex trigonometric functions, extending the Base methods explicitly
+for P in (:Posit8, :Posit16, :Posit16_1, :Posit32)
+    @eval begin
+        Base.sin(x::Complex{$P}) = Complex{$P}(sin(Complex{Base.floattype($P)}(x)))
+        Base.cos(x::Complex{$P}) = Complex{$P}(cos(Complex{Base.floattype($P)}(x)))
+        Base.exp(x::Complex{$P}) = cos(im*x) - im*sin(im*x)   # exp(x) = cos(ix) - i*sin(ix)
+    end
+end
+
+# nextfloat, prevfloat have a wrap-around behaviour nextfloat(maxpos) = NaR, nextfloat(NaR) = -maxpos
+Base.nextfloat(x::T) where {T<:AbstractPosit} = reinterpret(T,reinterpret(Base.uinttype(T),x)+one(Base.uinttype(T)))
+Base.prevfloat(x::T) where {T<:AbstractPosit} = reinterpret(T,reinterpret(Base.uinttype(T),x)-one(Base.uinttype(T)))
+
+# precision (taken from lookup table), extending Base.eps explicitly
+Base.eps(::Type{Posit8}) = reinterpret(Posit8,0x28)
+Base.eps(::Type{Posit16}) = reinterpret(Posit16,0x0a00)
+Base.eps(::Type{Posit16_1}) = reinterpret(Posit16_1,0x0100)
+Base.eps(::Type{Posit32}) = reinterpret(Posit32,0x00a0_0000)
+Base.eps(x::AbstractPosit) = max(x-prevfloat(x),nextfloat(x)-x)
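
The two's complement negation and the wrap-around of `nextfloat`/`prevfloat` in this file can be checked on raw 8-bit patterns without the package. The sketch below assumes the usual posit encoding (`0x00` zero, `0x40` one, `0x7f` maxpos, `0x80` NaR). The names `negate`, `next` and `prev` are illustrative only and not part of SoftPosit.jl.

```julia
# Raw 8-bit posit patterns (assumed encoding, see lead-in)
zero_bits, one_bits, maxpos_bits, nar_bits = 0x00, 0x40, 0x7f, 0x80

negate(bits::UInt8) = -bits          # two's complement, wraps modulo 2^8
next(bits::UInt8) = bits + 0x01      # next bit pattern, wraps at 0xff
prev(bits::UInt8) = bits - 0x01      # previous bit pattern, wraps at 0x00

negate(one_bits)        # 0xc0, the pattern of -1
negate(nar_bits)        # 0x80, NaR is its own negation
next(maxpos_bits)       # 0x80, stepping past maxpos lands on NaR
next(nar_bits)          # 0x81, which equals negate(maxpos_bits), i.e. -maxpos
```

Zero and NaR are the only two patterns that negate to themselves, which is why no special-casing is needed in the negation.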
diff --git a/src/comparison.jl b/src/comparison.jl
diff --git a/src/comparisons.jl b/src/comparisons.jl
@@ -0,0 +1,9 @@
+# == for posits is true if and only if types and bits match exactly (===, egality)
+Base.:(==)(x::AbstractPosit,y::AbstractPosit) = x === y
+Base.isnan(x::AbstractPosit) = x == notareal(x)         # use isnan for "is NaR?" check
+Base.isfinite(x::AbstractPosit) = ~isnan(x)             # finite if not NaR
+Base.iszero(x::AbstractPosit) = x == zero(x)
+
+# COMPARISONS via the signed-integer ordering of the two's complement bit patterns
+Base.:(<)(x::T,y::T) where {T<:AbstractPosit} = signed(x) < signed(y)
+Base.:(<=)(x::T,y::T) where {T<:AbstractPosit} = signed(x) <= signed(y)
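
Posits are designed so that their bit patterns, read as signed two's complement integers, sort in the same order as the values they encode, with NaR as the smallest pattern. A minimal sketch on raw 8-bit patterns, assuming the usual encoding (`0x80` NaR, `0xc0` minus one, `0x00` zero, `0x40` one, `0x7f` maxpos). `as_signed` and `posit_less` are hypothetical names, not package functions.

```julia
# Reinterpret the raw bits as a signed integer of the same width
as_signed(bits::UInt8) = reinterpret(Int8, bits)

# "Less than" on posit bit patterns via the signed-integer ordering
posit_less(a::UInt8, b::UInt8) = as_signed(a) < as_signed(b)

# NaR, -1, 0, 1, maxpos in increasing order
patterns = [0x80, 0xc0, 0x00, 0x40, 0x7f]
issorted(patterns, lt = posit_less)

# Comparing the patterns as unsigned integers would misplace the negatives
issorted(patterns)
```

The first `issorted` call returns `true` and the second `false`, since an unsigned comparison puts every negative pattern above every positive one.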