Merged
157 commits
459abf6
utils: add macro for assume
bettio Mar 3, 2025
4638a5d
utils: add int*_write_to_ascii_buf functions
bettio Mar 9, 2025
74c9d0b
NIFs: refactor integer_to_binary/list
bettio Mar 9, 2025
9e8dec3
NIFs: refactor `binary_to_integer/1`
bettio Mar 22, 2025
ec96ac3
BIFs: refactor binary arith helpers before introducing bigints
bettio Mar 30, 2025
297a250
Add bigint implementation (`intn.c`)
bettio Mar 30, 2025
e215f63
intn: fix warning in Hacker's Delight code
bettio Mar 30, 2025
e169c7f
intn: optimize nlz function
bettio Mar 30, 2025
7d0b03e
Implement bigint basic conversion and display functions
bettio Mar 30, 2025
6b79c52
BIFs: implement first bigint operation (`erlang:*/2`)
bettio Mar 30, 2025
f30f2df
tests: add first bigint tests (`bigint.erl`)
bettio Mar 30, 2025
e87065e
Add support to bigint to `erlang:binary_to_integer/1`
bettio Mar 30, 2025
279336e
term: use boxed integer sign bit
bettio Apr 1, 2025
a278dc6
term: add integer sign predicates and getter
bettio Apr 1, 2025
c80720c
BIFs: erlang:is_function/2 use new term_is_any_non_neg_integer
bettio Apr 1, 2025
c2b1ddf
Use sign bit for big integers (instead of 2-complement)
bettio Apr 8, 2025
4ca4f5a
BIFs: `neg_boxed_helper` use minimum boxed size (on 32-bit systems)
bettio Apr 14, 2025
04d0fe4
term: implement big integer term_compare
bettio Apr 14, 2025
63dcb77
utils: add functions for uint/int and sign conversions
bettio Apr 16, 2025
063be10
Add big integer to double conversion
bettio Apr 16, 2025
c7e2983
BIFs: refactor double to integer functions
bettio Apr 17, 2025
f1a1e1d
BIFs: implement float to big integer support
bettio Apr 18, 2025
f3a9464
tests: bigint.erl: test limits around +-(2^256 - 1)
bettio Apr 18, 2025
8042870
intn: add intn_from_integer_bytes function
bettio Apr 22, 2025
0a3644b
move intn_to_term_size to term.h
bettio Apr 22, 2025
e5b860c
externalterm: parse big integers
bettio Apr 23, 2025
b85ac96
opcodesswitch: add support to big integer constants
bettio Apr 24, 2025
fcfebff
intn: add intn_to_integer_bytes and intn_required_unsigned_integer_bytes
bettio Apr 28, 2025
2582014
externalterm: encode big integers as SMALL_BIG_EXT
bettio Apr 28, 2025
b63a664
Merge pull request #1552 from bettio/biggerint
bettio Jun 27, 2025
8259320
Merge branch 'main' into feature/bigint
bettio Jun 28, 2025
515ec03
Merge branch 'main' into feature/bigint
bettio Jul 1, 2025
5862806
Merge branch 'main' into feature/bigint
bettio Jul 5, 2025
0e8539f
Merge branch 'main' into feature/bigint
bettio Jul 24, 2025
6bf48ba
feature/bigint: fix small typo
pguyot Jul 25, 2025
1cfed5f
Merge pull request #1775 from pguyot/w30/fix-small-typo
bettio Jul 25, 2025
f86d3a8
Merge branch 'main' into feature/bigint
bettio Sep 10, 2025
d9c4a17
Merge branch 'main' into feature/bigint
bettio Sep 14, 2025
3b7cc48
intn: clang-format Hacker's Delight code
bettio Sep 15, 2025
27ba8d3
Merge pull request #1827 from bettio/clang-format-intn
bettio Sep 15, 2025
790a953
Simplify valgrind-suppressions.sup
bettio Sep 15, 2025
67b161e
Merge pull request #1828 from bettio/update-valgrind-suppressions
bettio Sep 15, 2025
1cd36e8
BIFs: refactor bitwise ops
bettio May 18, 2025
8c8a04c
intn: add or/and/xor bitwise functions
bettio May 25, 2025
d031cb7
BIFs: add support for bigint to `bor`/`band`/`bxor` functions
bettio May 25, 2025
95dd3e1
intn: add functions for left (`bsl`) and right shift (`bsr`)
bettio Jun 9, 2025
2355d0f
BIFs: add support for bigint to `bsl`/`bsr` functions
bettio Jun 10, 2025
4c44e1c
intn: add `intn_bnot` function for bitwise not
bettio Sep 13, 2025
420d878
BIFs: add support for bigint to `bnot` function
bettio Sep 13, 2025
fca024b
bigint: fix some typos
pguyot Sep 24, 2025
62e3833
Merge pull request #1844 from pguyot/w40/feature-bigint-fix-typos
bettio Sep 26, 2025
0dc41d7
Merge pull request #1738 from bettio/biggerint-bitwise
bettio Sep 27, 2025
8d67c22
Merge branch 'main' into feature/bigint
bettio Sep 27, 2025
ad043be
intn: add `intn_submn(u)` functions for subtraction
bettio Sep 26, 2025
bd98068
BIFs: add support for bigint to erlang:'-'/2 function
bettio Sep 26, 2025
4330f24
intn: add signed addition: `intn_addmn`
bettio Sep 27, 2025
5658542
intn: code is much simpler with sub implemented on top of add
bettio Sep 27, 2025
0255eab
BIFs: add support for bigint to erlang:'+'/2 function
bettio Sep 28, 2025
2e58165
intn: add `intn_divmnu` for unsigned division
bettio Sep 28, 2025
1b4385e
BIFs: add support for bigint to erlang:div/2 function
bettio Sep 28, 2025
bb28050
BIFs: add support for bigint to `erlang:rem/2` function
bettio Sep 29, 2025
908ad20
BIFs: add support for bigint to `erlang:abs/1`,`neg/1` functions
bettio Sep 29, 2025
4c53895
doc: update differences-with-beam.md page
bettio Oct 1, 2025
0b00f02
doc: update UPDATING: add information about `bsl` overflows
bettio Oct 1, 2025
a7b168c
doc: programmers-guide: update point about integers
bettio Oct 1, 2025
0221e6e
doc: memory-management.md: update sections about integers
bettio Oct 1, 2025
dc695da
doc: memory-management: fix map boxed tag (that is 0x2C)
bettio Oct 1, 2025
5c09086
doc: memory-management: update some info about match/sub/refc binaries
bettio Oct 1, 2025
2ab8725
CHANGELOG: update it after big integer support
bettio Oct 1, 2025
822b2ba
intn: do not parse integers > 256 bit
bettio Oct 1, 2025
21c271b
Merge pull request #1845 from bettio/biggerint-arith
bettio Oct 3, 2025
829924a
Merge pull request #1859 from bettio/intn_parse-check-overflow
bettio Oct 3, 2025
1bfca44
Merge pull request #1858 from bettio/bigint-doc
bettio Oct 3, 2025
bb8f839
Document and test `erlang:binary_to_term` behavior (regard big integers)
bettio Oct 2, 2025
8a1af38
Merge pull request #1862 from bettio/doc-test-badarg-too-big-int
bettio Oct 5, 2025
99d5041
intn: cleanup `uint16_t` helpers and divmnu constants
bettio Sep 30, 2025
8ccdc98
intn: add `intn_negate_sign` function
bettio Sep 30, 2025
da3fe52
intn: use a table for maximum lengths in `to_string` function
bettio Sep 30, 2025
13b8a57
intn: polish `neg_in_place` (now `neg_and_count_in_place`)
bettio Sep 30, 2025
2cbf3cf
intn: define INTN_BSL_MAX_RES_LEN
bettio Sep 30, 2025
5abfc39
Merge pull request #1856 from bettio/intn-cleanup
bettio Oct 6, 2025
75aaad6
bigint: test: add test for erlang:integer_to_list
bettio Oct 3, 2025
af8bb0c
NIFs: `list_to_integer`: fix it and support for big integers
bettio Oct 6, 2025
d223c4f
term: add `term_is_int` as replacement for `term_is_integer`
bettio Sep 17, 2025
521c207
term: rename all term_is_(neg/pos/non_neg)_integer functions
bettio Oct 7, 2025
83a8df2
term: document existing int functions
bettio Oct 7, 2025
7790e78
Merge pull request #1879 from bettio/bigint-list-conv
bettio Oct 8, 2025
b9f6ad6
Merge pull request #1863 from bettio/term_is_int
bettio Oct 8, 2025
53ef1c6
Merge branch 'main' into merge-jit-into-bigint
bettio Oct 8, 2025
6bf12d7
intn: fix big integer `0x80000000` to `int64` conversion
bettio Oct 15, 2025
60fd614
JIT: add support for big integer encoding
bettio Oct 10, 2025
578371c
tests: bigint: make sure big literals are used only in `big_literals/0`
bettio Oct 13, 2025
9d3a2e5
tests: bigint: improve big literal testing
bettio Oct 13, 2025
da48a88
opcodesswitch.h: fix "error: unused function 'decode_nbits_integer'"
bettio Oct 13, 2025
232dc88
JIT: add support for negative boxed integers
bettio Oct 13, 2025
3fef141
tests: bigint: add test for is_integer and is_number
bettio Oct 13, 2025
1aa06cb
Merge branch 'main' into merge-jit-into-bigint
bettio Oct 14, 2025
006206f
tests: bigint: add test for < and >= guards
bettio Oct 14, 2025
d5cf99b
Merge pull request #1884 from bettio/merge-jit-into-bigint
bettio Oct 16, 2025
a65de52
Merge pull request #1905 from bettio/bigint-fix-uint32-conv
bettio Oct 16, 2025
618592c
externalterm: fix crash due to uninitialized value in SMALL_BIG_EXT
bettio Oct 19, 2025
7565922
Merge pull request #1911 from bettio/bigint-fix-externalterm-bug
bettio Oct 19, 2025
6816741
intn: add mulmn and divmn functions for signed operations
bettio Oct 22, 2025
e57776d
intn: comment out print_num debug function
bettio Oct 22, 2025
32cb1a5
intn: always use intn_digit_t type in public API
bettio Oct 22, 2025
3def2e3
CI: build-and-test.yaml: remove valgrind-suppressions.sup
bettio Oct 23, 2025
a54b9b6
nif.c: clarify make_bigint helper function
bettio Oct 23, 2025
1056586
bif.c: add clarification about comparison against 0
bettio Oct 23, 2025
3c5b2ca
opcodesswitch.h: clarify decode_nbits_integer / large_integer_to_term
bettio Oct 23, 2025
8040db1
programmers-guide.md: rephrase statement about integers
bettio Oct 23, 2025
0afc977
Merge pull request #1921 from bettio/use-recent-valgrind
bettio Oct 24, 2025
5187147
bif.c: remove args_to_bigint function
bettio Oct 24, 2025
75e08ea
bif.c: term_to_bigint: change intn_digit_t parameter type to `const **`
bettio Oct 24, 2025
d2c5528
tests: add bigint_stress test
bettio Oct 14, 2025
3a6309e
Merge pull request #1918 from bettio/intn-consistent-api
bettio Oct 24, 2025
aac7c1b
Move to utils.h from bif.c functions for safe left and right shift
bettio Oct 21, 2025
972c4d1
utils.h: document int utilities
bettio Oct 21, 2025
55864b3
utils: reorder bsl/bsr functions
bettio Oct 21, 2025
b0db34a
utils.h: use size_t for shift size (in bits)
bettio Oct 21, 2025
70519ec
Merge pull request #1916 from bettio/utils-cleanup-and-doc
bettio Oct 24, 2025
7ff1af5
bif.c: avoid out-of-bounds read in `make_bigint` on bsl overflow
bettio Oct 23, 2025
fb09706
Move and rename `size_round_to` from intn.c to utils.h
bettio Oct 23, 2025
d50da15
Fix heap over-allocation in calculate_heap_usage
bettio Oct 23, 2025
7b5409a
Merge pull request #1928 from bettio/remove-bif-args_to_bigint
bettio Oct 25, 2025
f657254
Merge pull request #1914 from bettio/bigint_stress
bettio Oct 25, 2025
9a184e0
Merge pull request #1922 from bettio/bigint-clarifications
bettio Oct 25, 2025
0009c83
intn.c: fix: normalize len before performing intn_cmp
bettio Oct 23, 2025
dcf4c27
intn: make digit_bit_size a constant
bettio Oct 24, 2025
13a0a46
Merge pull request #1920 from bettio/bigint-fixes
bettio Oct 25, 2025
8392ab4
intn: rename (u)int64 utils
bettio Oct 24, 2025
38cdd9b
intn: remove redundant `mn` suffix: e.g.: `addmn` -> `add`
bettio Oct 24, 2025
f416458
intn: add doxygen documentation
bettio Oct 25, 2025
d771d8a
Merge pull request #1929 from bettio/improve-intn-function-names
bettio Oct 26, 2025
545755d
Merge pull request #1930 from bettio/document-intn
bettio Oct 26, 2025
f03f115
Fix documentation about normalized-not normalized, change intn_to_double
bettio Oct 26, 2025
3ae8c65
Remove license file for deleted file
bettio Oct 26, 2025
6da47a1
Merge pull request #1936 from bettio/support-unnormalized-intn
bettio Oct 26, 2025
a1aa48b
Merge pull request #1937 from bettio/remove-unused-file
bettio Oct 26, 2025
1725da5
bif.c: rename term_to_bigint to conv_term_to_bigint
bettio Oct 25, 2025
83fe7b1
Add `term_to_bigint` and `term_is_bigint`
bettio Oct 26, 2025
9808df3
Add new term_initialize_bigint function
bettio Oct 26, 2025
ba2c1a4
Rename term_create_uninitialized_intn and term_intn_to_term_size
bettio Oct 26, 2025
10ce113
term: rename and clarify BOXED_INTN_SIZE macro
bettio Oct 26, 2025
6ce3ea8
bif.c: use understandable names
bettio Oct 26, 2025
7946e89
bif.c: move bigint helpers
bettio Oct 26, 2025
310c0f6
intn: intn_from_integer_bytes: set sign when sign != NULL
bettio Oct 28, 2025
950c9ae
Implement minimal bigint binary pattern matching
bettio Oct 28, 2025
891350b
Do not use _Static_assert in headers
bettio Oct 28, 2025
6fc6d49
jit.erl: Add missing skip_compact_term for big integers
bettio Oct 28, 2025
014c48b
Merge pull request #1933 from bettio/remove-unused-file
bettio Oct 28, 2025
48c8624
Merge pull request #1934 from bettio/bif-bigint-cleanup
bettio Oct 29, 2025
d59f496
Merge pull request #1940 from bettio/add-missing-skip_compact_term
bettio Oct 31, 2025
b5e4511
Merge pull request #1939 from bettio/fix-static_assert-in-header
bettio Oct 31, 2025
f7676a8
Merge pull request #1938 from bettio/minimal-bigint-pattern-matching
bettio Nov 1, 2025
dd0f93a
utils: remove int32/64_is_negative
bettio Nov 1, 2025
bdcfbe0
utils: remove redundant function
bettio Nov 1, 2025
99e48e6
Merge pull request #1952 from bettio/remove-useless-optimization
bettio Nov 2, 2025
8 changes: 8 additions & 0 deletions .github/workflows/build-and-test.yaml
Expand Up @@ -324,8 +324,16 @@ jobs:
run: sudo apt update -y

- name: "Install deps"
if: matrix.container != ''
run: sudo apt install -y ${{ matrix.compiler_pkgs}} cmake gperf zlib1g-dev doxygen valgrind libmbedtls-dev

- name: "Install deps"
if: matrix.container == ''
run: |
sudo apt install -y ${{ matrix.compiler_pkgs}} cmake gperf zlib1g-dev doxygen libmbedtls-dev libc6-dbg
# Get a more recent valgrind
sudo snap install valgrind --classic

- name: "Checkout repo"
uses: actions/checkout@v4
with:
12 changes: 12 additions & 0 deletions CHANGELOG.md
Expand Up @@ -58,6 +58,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Added `erlang:list_to_bitstring`
- Reimplemented `lists:keyfind`, `lists:keymember` and `lists:member` as NIFs
- Added `AVM_PRINT_PROCESS_CRASH_DUMPS` option
- Added support for big integers up to 256-bit (sign + 256-bit magnitude)
- Added support for big integers in `binary_to_term/1` and `term_to_binary/1,2`

### Changed

Expand All @@ -68,6 +70,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Entry point now is `init:boot/1` if it exists. It starts the kernel application and calls `start/0` from the
identified startup module. Users who started kernel application (typically for distribution) must no longer
do it. Starting `net_kernel` is still required.
- All arithmetic operations (`+`, `-`, `*`, `div`, `rem`, `abs`, etc.) now support integers up to 256-bit
- All bitwise operations (`band`, `bor`, `bxor`, `bnot`, `bsl`, `bsr`) now support integers up to 256-bit
- Float conversion functions now support converting to/from big integers
- `bsl` now properly checks for overflow

### Changed
- `binary_to_integer/1` no longer accepts binaries such as `<<"0xFF">>` or `<<" 123">>`
- `binary_to_integer` and `list_to_integer` no longer raise an `overflow` error; they raise
`badarg` instead.

### Fixed

Expand All @@ -78,6 +89,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- packbeam: fix memory leak preventing building with address sanitizer
- Fixed a bug where empty atom could not be created on some platforms, thus breaking receiving a message for a registered process from an OTP node.
- Fix a memory leak in distribution when a BEAM node would monitor a process by name.
- Fix `list_to_integer`, it was likely buggy with integers close to INT64_MAX

## [0.6.7] - Unreleased

6 changes: 6 additions & 0 deletions UPDATING.md
Expand Up @@ -13,6 +13,12 @@ port socket driver, are also represented by a port and some matching code may ne
`is_pid/1` to `is_port/1`.
- Ports and pids can be registered. Function `globalcontext_get_registered_process` result now is
a term that can be a `port()` or a `pid()`.
- `bsl` (bit shift left) now checks for overflow. This should not be a practical issue for existing
code, since integers were previously limited to 64 bits. However, make sure to bitmask values before
left shifts, e.g. `(16#FFFF band 16#F) bsl 252`.
- `binary_to_integer` and `list_to_integer` do not raise `overflow` error anymore, they instead
raise `badarg` when trying to parse an integer that exceeds 256 bits. Update any relevant error
handling code.

## v0.6.4 -> v0.6.5

109 changes: 104 additions & 5 deletions doc/src/differences-with-beam.md
Expand Up @@ -39,11 +39,110 @@ AtomVM does not implement some key features of the BEAM. Some of these limitatio
worked on and this list might be outdated. Do not hesitate to check GitHub issues or contact us
when in doubt.

### Wide precision integers

AtomVM currently only supports 64 bits integers. This is being worked on. However, please note
that AtomVM is unlikely to support arbitrary precision integers as libraries for such support
usually are quite large.
### Integer precision and overflow

AtomVM supports integers up to 256-bit with an additional sign flag, while BEAM supports unlimited
precision integers. This fundamental difference has several implications:

#### Integer limits

- **Maximum value**: `16#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF` (256
ones, which equals `2^256 - 1`)
- **Minimum value**: `-16#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF` (which
equals `-(2^256 - 1)`)

Note that AtomVM does not use two's complement for big integers. The sign is stored as a separate
flag, which means `INTEGER_MAX = -INTEGER_MIN`.
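The symmetric range follows directly from keeping the sign apart from the magnitude. A minimal C sketch of that idea is shown here; the type `sketch_int256_t` and the helper names are hypothetical and chosen for illustration only, they are not AtomVM's actual data structures or API.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_DIGITS 8 /* 8 x 32-bit digits = 256-bit magnitude */

/* Hypothetical type for illustration; not AtomVM's real layout. */
typedef struct
{
    bool negative;                  /* sign flag, stored outside the magnitude */
    uint32_t digits[SKETCH_DIGITS]; /* little-endian: digits[0] is least significant */
} sketch_int256_t;

static bool sketch_is_zero(const sketch_int256_t *v)
{
    for (int i = 0; i < SKETCH_DIGITS; i++) {
        if (v->digits[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Negation only flips the flag, so it can never overflow.
   Zero keeps a positive sign because -0 is not a distinct value. */
static sketch_int256_t sketch_negate(sketch_int256_t v)
{
    v.negative = sketch_is_zero(&v) ? false : !v.negative;
    return v;
}

/* Largest magnitude: all 256 bits set, i.e. 2^256 - 1. */
static sketch_int256_t sketch_max(void)
{
    sketch_int256_t v;
    v.negative = false;
    memset(v.digits, 0xFF, sizeof(v.digits));
    return v;
}
```

Negating `sketch_max()` gives the minimum value with the same magnitude, which is why there is no extra negative value as there would be with two's complement.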

#### Overflow errors

Unlike BEAM, AtomVM raises `overflow` errors when integer operations exceed 256-bit capacity:

```erlang
IntMax = 16#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,
% The following will raise an overflow error on AtomVM, but succeeds on BEAM:
Result = IntMax + 1 % overflow error

% Also applies to subtraction and multiplication:
-IntMax - 1 % overflow error
IntMax * 2 % overflow error
```

Handling overflows:

```erlang
safe_calc(MaybeOvfFun) ->
try MaybeOvfFun() of
I when is_integer(I) -> {ok, I}
catch
error:overflow -> {error, overflow}
end.

% Returns `{ok, Result}`, Result is a 255 bit integer
safe_calc(fun() -> factorial(57) end).

% Returns `{error, overflow}`, since 261 bit integers are not allowed
safe_calc(fun() -> factorial(58) end).
```

Overflow can also occur with:
- Bit shift left operations: `1 bsl 257` raises overflow (shifting beyond the 256-bit boundary).
When shifting values with multiple set bits, mask first to prevent overflow: `16#FFFF bsl 252`
would overflow, but `(16#FFFF band 16#F) bsl 252` succeeds
- Float to integer conversions: `ceil/1`, `round/1`, etc. when the result exceeds 256-bit

Note: While BEAM raises `system_limit` error for operations like
`1 bsl 2000000000000000000000000000000000`, AtomVM consistently uses `overflow` error for all
integer capacity violations.
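The shift rule above can be restated as a bit-length check: a left shift overflows when the number of significant bits plus the shift amount exceeds 256. The following C sketch illustrates that arithmetic under the assumption of a little-endian array of 32-bit digits; the function names are hypothetical and this is not the code AtomVM uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_BITS 256

/* Number of significant bits in a little-endian array of 32-bit digits. */
static size_t sketch_bit_length(const uint32_t *digits, size_t len)
{
    while (len > 0 && digits[len - 1] == 0) {
        len--; /* skip leading zero digits */
    }
    if (len == 0) {
        return 0;
    }
    uint32_t top = digits[len - 1];
    size_t bits = 0;
    while (top != 0) {
        bits++;
        top >>= 1;
    }
    return (len - 1) * 32 + bits;
}

/* True when (magnitude bsl shift) would need more than 256 bits. */
static bool sketch_bsl_overflows(const uint32_t *digits, size_t len, size_t shift)
{
    size_t bits = sketch_bit_length(digits, len);
    if (bits == 0) {
        return false; /* shifting zero always yields zero */
    }
    if (bits > SKETCH_MAX_BITS) {
        return true; /* input is already out of range */
    }
    return shift > SKETCH_MAX_BITS - bits;
}
```

With this check, `16#FFFF` (16 significant bits) overflows when shifted by 252, while `16#F` (4 significant bits) shifted by 252 fits exactly in 256 bits.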

Note: Integer literals larger than 256 bits in source code will compile successfully with
Erlang/Elixir compilers, but the resulting BEAM files will fail to load on AtomVM. This also
applies to compile-time constant expressions that evaluate to integers exceeding 256 bits, such as
`1 bsl 300`. These expressions are evaluated by the compiler and stored as constants in the BEAM
file, causing the same load-time failure. Always ensure that integer constants in your code are
within AtomVM's supported range.

Note: The `erlang:binary_to_term/1,2` function raises a `badarg` error when attempting to
deserialize binary data containing an integer larger than 256 bits. This differs from BEAM, which
can deserialize integers of any size. Applications that exchange serialized terms with BEAM nodes
should be aware of this limitation.

Note: String and binary conversion functions such as `erlang:binary_to_integer/1,2`,
`erlang:list_to_integer/1,2`, and Elixir's `String.to_integer/1,2` raise a `badarg` error when the
input represents an integer exceeding 256 bits. For example,
`erlang:binary_to_integer(<<"10000000000000000000000000000000000000000000000000000000000000000">>, 16)`
will fail with `badarg` on AtomVM, while it succeeds on BEAM. Applications parsing user input or
external data should validate that numeric values fall within AtomVM's supported range.

#### Bitwise operations edge cases

The 256-bit limitation creates specific edge cases with bitwise operations that would require 257
bits:

On BEAM (unlimited precision), returns `-IntMax - 1` (requires 257 bits):

```erlang
1> IntMax = 16#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF.
115792089237316195423570985008687907853269984665640564039457584007913129639935
2> integer_to_binary(-1 bxor IntMax, 16).
<<"-10000000000000000000000000000000000000000000000000000000000000000">>
3> integer_to_binary(bnot IntMax, 16).
<<"-10000000000000000000000000000000000000000000000000000000000000000">>
```

On AtomVM (256-bit limited), returns 0 (cannot represent 257th bit):

```erlang
1> IntMax = 16#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF.
115792089237316195423570985008687907853269984665640564039457584007913129639935
2> -1 bxor IntMax.
0
3> bnot IntMax.
0
```

This occurs because AtomVM cannot create an integer with the 257th bit set to 1 with negative sign.
Since `-0` is not allowed, the result is normalized to `0`.
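To see why the result collapses to `0`, recall that `bnot X` equals `-(X + 1)`. With a 256-bit magnitude, adding one to `IntMax` carries out of the top digit and leaves an all-zero magnitude with a negative sign, which is then normalized. The C sketch below reproduces that arithmetic; the type and function names are hypothetical, and this is an illustration of the effect rather than AtomVM's actual implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SM_DIGITS 8

/* Hypothetical sign + magnitude value, little-endian 32-bit digits. */
typedef struct
{
    bool negative;
    uint32_t digits[SKETCH_SM_DIGITS];
} sketch_sm256_t;

/* bnot on sign + magnitude, dropping any carry out of the 256-bit magnitude. */
static sketch_sm256_t sketch_bnot_truncating(sketch_sm256_t v)
{
    if (!v.negative) {
        /* bnot X = -(X + 1): increment the magnitude, then set the sign. */
        bool carry = true;
        for (int i = 0; i < SKETCH_SM_DIGITS && carry; i++) {
            v.digits[i] += 1;
            carry = (v.digits[i] == 0);
        }
        v.negative = true;
    } else {
        /* bnot -M = M - 1: decrement the magnitude (M >= 1), clear the sign. */
        bool borrow = true;
        for (int i = 0; i < SKETCH_SM_DIGITS && borrow; i++) {
            borrow = (v.digits[i] == 0);
            v.digits[i] -= 1;
        }
        v.negative = false;
    }
    /* Normalize -0 to 0. */
    bool is_zero = true;
    for (int i = 0; i < SKETCH_SM_DIGITS; i++) {
        if (v.digits[i] != 0) {
            is_zero = false;
        }
    }
    if (is_zero) {
        v.negative = false;
    }
    return v;
}
```

For ordinary inputs the sketch behaves as expected (`bnot 5` gives `-6`); only the all-ones magnitude loses its 257th bit and ends up as `0`.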

### Bit syntax

114 changes: 104 additions & 10 deletions doc/src/memory-management.md
Expand Up @@ -180,7 +180,15 @@ loaded) a fixed size table. Management of the global atom table is outside of t

### Integers

An integer is represented as a single word, with the low-order 4 bits having the value `0xF` (`1111b`). The high order word-size-6 bits are used to represent the integer value:
AtomVM supports integers up to 256 bits with an additional sign bit stored outside the numeric
payload. The representation strategy depends on the integer's size and uses canonicalization to
ensure each value has exactly one representation.

#### Immediate Integers

Small integers are represented as a single word, with the low-order 4 bits having the value `0xF`
(`1111b`). The high order word-size-4 bits are used to represent the integer value using two's
complement:

|< 4>|
+===========================+====+
Expand All @@ -189,11 +197,13 @@ An integer is represented as a single word, with the low-order 4 bits having the
| |
|<---------- word-size --------->|

The magnitude of an integer is therefore limited to `2^{word-size - 4}` in an AtomVM program (e.g., on a 32-bit platform, `+- 134,217,728`).
On 32-bit systems, immediate integers can represent signed values in the range `[-2^27, 2^27-1]` (28
bits + 4-bit tag = 32 bits).
On 64-bit systems, immediate integers can represent signed values in the range `[-2^59, 2^59-1]` (60
bits + 4-bit tag = 64 bits).
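The tagging scheme can be sketched in C for a 64-bit word as follows. The constant and function names are hypothetical (they are not the names used in AtomVM's `term.h`), and the decode step sign-extends from 60 bits without relying on implementation-defined signed shifts.

```c
#include <stdint.h>

#define SKETCH_INT_TAG 0xF /* low-order 4 bits of an immediate integer */
#define SKETCH_TAG_BITS 4

/* Pack a value in the range [-2^59, 2^59 - 1] into a 64-bit word. */
static uint64_t sketch_encode_immediate(int64_t value)
{
    /* Conversion to unsigned is modular, so negative values keep their
       two's complement bit pattern in the upper 60 bits. */
    return ((uint64_t) value << SKETCH_TAG_BITS) | SKETCH_INT_TAG;
}

/* Recover the signed value from the upper 60 bits of the word. */
static int64_t sketch_decode_immediate(uint64_t term)
{
    uint64_t payload = term >> SKETCH_TAG_BITS; /* 60-bit two's complement */
    uint64_t sign_bit = UINT64_C(1) << 59;
    /* Sign-extend from 60 bits: flip the sign bit, then subtract its weight. */
    return (int64_t) (payload ^ sign_bit) - (int64_t) sign_bit;
}
```

For example, the value 3 encodes to `0x3F`: the payload `3` shifted left by four bits, with the tag `0xF` in the low nibble.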

```{attention}
Arbitrarily large integers (bignums) are not currently supported in AtomVM.
```
For integers outside these ranges, AtomVM uses boxed representations (see Boxed Integers section
below).

### nil

Expand Down Expand Up @@ -242,6 +252,88 @@ A boxed term pointer is a single-word term that contains the address of the refe

Because terms (and hence the heap) are always aligned on boundaries that are divisible by the word size, the low-order 2 bits of a term address are always 0. Consequently, the high-order word-size - 2 bits (`1,073,741,824`, on a 32-bit platform) are sufficient to address any term address in the AtomVM address space, for 32-bit and greater machine architectures.

### Boxed Integers

AtomVM uses boxed integers for values that exceed the immediate integer range. There are two types
of boxed integer representations: native integers (using int32_t or int64_t) and big integers (using
arrays of uint32_t digits).

#### Native Boxed Integers

For integers that don't fit in immediate representation but can be stored in native C integer
types, AtomVM uses boxed integers with two's complement encoding and a redundant sign bit in the
header.

**On 32-bit systems:**
- Integers in range `[-2^31, -2^27-1] ∪ [2^27, 2^31-1]` are stored as boxed int32_t (single word
payload)
- Integers in range `[-2^63, -2^31-1] ∪ [2^31, 2^63-1]` are stored as boxed int64_t (two word
payload)

**On 64-bit systems:**
- Integers in range `[-2^63, -2^59-1] ∪ [2^59, 2^63-1]` are stored as boxed int64_t (single word
payload)
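The ranges above amount to a small decision procedure. The C sketch below picks a representation for a value that fits in `int64_t`, given the platform word size; the enum and function names are hypothetical and only mirror the ranges listed in this section.

```c
#include <stdint.h>

typedef enum
{
    SKETCH_IMMEDIATE,
    SKETCH_BOXED_INT32,
    SKETCH_BOXED_INT64
} sketch_int_repr_t;

/* word_bits is 32 or 64. Values beyond int64_t (big integers) are out of scope. */
static sketch_int_repr_t sketch_classify(int64_t value, int word_bits)
{
    /* Immediate payload is word_bits - 4 bits wide, two's complement. */
    int64_t imm_max = (INT64_C(1) << (word_bits - 4 - 1)) - 1;
    int64_t imm_min = -imm_max - 1;
    if (value >= imm_min && value <= imm_max) {
        return SKETCH_IMMEDIATE;
    }
    /* Only 32-bit systems have a separate single-word int32_t box. */
    if (word_bits == 32 && value >= INT32_MIN && value <= INT32_MAX) {
        return SKETCH_BOXED_INT32;
    }
    return SKETCH_BOXED_INT64;
}
```

On a 32-bit system `2^27` is the first positive value that no longer fits an immediate and becomes a boxed `int32_t`, while on a 64-bit system the same value is still an immediate.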

The boxed header uses:
- `0x8` (`001000b`) for positive integers (TERM_BOXED_POSITIVE_INTEGER)
- `0xC` (`001100b`) for negative integers (TERM_BOXED_NEGATIVE_INTEGER)

|< 6 >|
+=========================+======+
| boxed-size (1 or 2) |001X00| boxed[0] (X=0 for positive, X=1 for negative)
+-------------------------+------+
| native integer value | boxed[1] (int32_t or int64_t low word)
+--------------------------------+
| high word (if int64_t on | boxed[2] (32-bit systems only)
| 32-bit system) |
+================================+
| |
|<---------- word-size --------->|
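As a rough illustration of the header layout in the diagram, the following C sketch builds and inspects a boxed integer header: the low 6 bits carry the tag and the remaining bits carry the boxed size. The macro and function names are hypothetical stand-ins, not the definitions from AtomVM's headers.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BOXED_POSITIVE_INTEGER 0x8 /* 001000b */
#define SKETCH_BOXED_NEGATIVE_INTEGER 0xC /* 001100b */
#define SKETCH_BOXED_TAG_MASK 0x3F        /* low 6 bits */
#define SKETCH_BOXED_SIZE_SHIFT 6

/* Build boxed[0] for an integer occupying `boxed_size` payload words. */
static uint64_t sketch_make_int_header(uint64_t boxed_size, bool negative)
{
    uint64_t tag = negative ? SKETCH_BOXED_NEGATIVE_INTEGER : SKETCH_BOXED_POSITIVE_INTEGER;
    return (boxed_size << SKETCH_BOXED_SIZE_SHIFT) | tag;
}

/* Number of payload words recorded in the header. */
static uint64_t sketch_header_size(uint64_t header)
{
    return header >> SKETCH_BOXED_SIZE_SHIFT;
}

/* True when the header carries the negative integer tag. */
static bool sketch_header_is_negative_integer(uint64_t header)
{
    return (header & SKETCH_BOXED_TAG_MASK) == SKETCH_BOXED_NEGATIVE_INTEGER;
}
```

A positive single-word boxed integer therefore has header `0x48`, and a negative two-word one has header `0x8C`.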

#### Big Integers

For integers beyond the native int64_t range (up to ±(2^256 - 1)), AtomVM uses an array of uint32_t
digits representing the magnitude, with the sign stored as a flag in the boxed header. These big
integers do NOT use two's complement encoding.

The digits array:
- Stores the absolute value of the integer
- Uses little-endian ordering (digit[0] is least significant)
- Omits leading zero digits to save space
- Includes a dummy zero digit when necessary to avoid ambiguity with native boxed integers

|< 6 >|
+=========================+======+
| boxed-size (n) |001X00| boxed[0] (X=0 for positive, X=1 for negative)
+-------------------------+------+
| digit[0] (lsb) | boxed[1] (uint32_t)
+--------------------------------+
| digit[1] | boxed[2] (uint32_t)
+--------------------------------+
| ... | ...
+--------------------------------+
| digit[k-1] (msb) | boxed[k] (uint32_t)
+--------------------------------+
| 0 (dummy digit if needed) | boxed[n] (uint32_t)
+================================+
| |
|<---------- word-size --------->|

**Canonicalization Rules:**
- AtomVM ensures that integers are always stored in the most compact representation
- Operations that produce results fitting in a smaller representation automatically convert to that
representation
- A dummy digit mechanism ensures that the smallest big integer always occupies more words than the
largest native boxed integer. This is required when storing values such as `UINT64_MAX`
(`0xFFFFFFFFFFFFFFFF`), which would need only 2 digits: the boxed-size field must still make them
distinguishable from native boxed integers (such as `int64_t`)

**Examples:**
- The value 3 is always stored as an immediate integer (never as a boxed integer)
- On a 64-bit system, 2^60 would be stored as a boxed int64_t, not as a big integer
- The value 2^100 would be stored as a big integer with 4 uint32_t digits (plus potentially a dummy
digit)
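Two of the rules above lend themselves to a short C sketch: dropping leading zero digits from a little-endian magnitude, and deciding whether a sign plus magnitude still fits a native `int64_t` (in which case the native representation would be the canonical one). The function names are hypothetical, and the dummy digit rule is deliberately not modeled here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Count significant digits of a little-endian magnitude,
   ignoring leading (most significant) zero digits. */
static size_t sketch_count_digits(const uint32_t *digits, size_t len)
{
    while (len > 0 && digits[len - 1] == 0) {
        len--;
    }
    return len;
}

/* True when the sign + magnitude value is representable as int64_t. */
static bool sketch_fits_int64(const uint32_t *digits, size_t len, bool negative)
{
    size_t n = sketch_count_digits(digits, len);
    if (n > 2) {
        return false; /* more than 64 bits of magnitude */
    }
    uint64_t magnitude = 0;
    if (n > 0) {
        magnitude = digits[0];
    }
    if (n > 1) {
        magnitude |= (uint64_t) digits[1] << 32;
    }
    /* int64_t covers [-2^63, 2^63 - 1], so the limit depends on the sign. */
    uint64_t limit = negative ? (UINT64_C(1) << 63) : (UINT64_C(1) << 63) - 1;
    return magnitude <= limit;
}
```

With these helpers, `2^100` counts 4 significant digits and does not fit `int64_t`, while `UINT64_MAX` needs only 2 digits yet still exceeds the positive `int64_t` limit, which is the case the dummy digit is meant to disambiguate.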

### References

A reference (e.g., created via [`erlang:make_ref/0`](./apidocs/erlang/estdlib/erlang.md#make_ref0)) stores a 64-bit incrementing counter value (a "ref tick"). On 64 bit machines, a Reference takes up two words -- the boxed header and the 64-bit value, which of course can fit in a single word. On 32-bit platforms, the high-order 28 bits are stored in `boxed[1]`, and the low-order 32 bits are stored in `boxed[2]`:
Expand Down Expand Up @@ -278,7 +370,7 @@ Tuples are represented as boxed terms containing a boxed header (`boxed[0]`), a

### Maps

Maps are represented as boxed terms containing a boxed header (`boxed[0]`), a type tag of `0x3C` (`111100b`), followed by:
Maps are represented as boxed terms containing a boxed header (`boxed[0]`), a type tag of `0x2C` (`101100b`), followed by:

* a term pointer to a tuple of arity `n` containing the keys in the map;
* a sequence of `n`-many words, containing the values of the map corresponding (in order) to the keys in the reference tuple.
Expand All @@ -300,7 +392,7 @@ The keys and values are single word terms, i.e., either immediates or pointers t
| ...
| | |< 6 >|
| +=========================+======+
| | boxed-size (n) |111100| boxed[0]
| | boxed-size (n) |101100| boxed[0]
| +-------------------------+------+
+-----------------< keys | boxed[1]
+--------------------------------+
Expand Down Expand Up @@ -446,7 +538,7 @@ to `nil`.
some
binary |< 6 >|
^ +=========================+======+
| | boxed-size (5) |100100| boxed[0]
| | boxed-size (5) |000100| boxed[0]
| +-------------------------+------+
| | match-or-binary-ref | boxed[1]
| +--------------------------------+
Expand All @@ -464,15 +556,15 @@ A reference to a reference-counted binary counts as a reference, in which case t

#### Sub-Binaries

Sub-binaries are represented as boxed terms containing a boxed header (`boxed[0]`), a type tag of `0x28` (`001000b`)
Sub-binaries are represented as boxed terms containing a boxed header (`boxed[0]`), a type tag of `0x28` (`101000b`)

A sub-binary is a boxed term that points to a reference-counted binary, recording the offset into the binary and the length (in bytes) of the sub-binary. An invariant for this term is that the `offset + length` is always less than or equal to the length of the referenced binary.

some
refc
binary |< 6 >|
^ +=========================+======+
| | boxed-size (3) |001000| boxed[0]
| | boxed-size (3) |101000| boxed[0]
| +-------------------------+------+
| | len | boxed[1]
| +--------------------------------+
Expand Down Expand Up @@ -630,6 +722,8 @@ A given process heap and stack occupy a single region of malloc'd memory, and it

Terms stored in the stack, registers, and process dictionary are either single-word terms (like atoms or pids) or term references, i.e., single-word terms that point to boxed terms or list cells in the heap. These terms constitute the "roots" of the memory graph of all "reachable" terms in the process.

Boxed integers, including both native boxed integers and big integers, are simple blob structures that are copied as-is during garbage collection. They do not contain any pointers or addresses that need to be updated during the garbage collection process.
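Because the payload holds no pointers, copying a boxed integer needs only the size recorded in its header. The C sketch below shows the idea, assuming the 6-bit tag layout described earlier; `sketch_copy_boxed_integer` is a hypothetical helper and does not reproduce AtomVM's garbage collector.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a boxed integer (header + payload) to `dst` and return the number
   of words copied. Nothing in the payload needs to be rewritten. */
static size_t sketch_copy_boxed_integer(const uintptr_t *src, uintptr_t *dst)
{
    size_t payload_words = (size_t) (src[0] >> 6); /* boxed-size from the header */
    size_t total_words = payload_words + 1;        /* plus boxed[0] itself */
    memcpy(dst, src, total_words * sizeof(uintptr_t));
    return total_words;
}
```

By contrast, terms such as tuples hold term references in their payload, so a collector has to visit and update each element instead of copying the block blindly.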

### When does garbage collection happen?

Garbage collection typically occurs as the result of a request for an allocation of a multi-word term in the heap (e.g., a tuple, list, or binary, among other types), and when there is currently insufficient space in the free space between the current heap and the current stack to accommodate the allocation.
4 changes: 2 additions & 2 deletions doc/src/programmers-guide.md
Expand Up @@ -19,7 +19,7 @@ Currently, AtomVM implements a strict subset of the BEAM instruction set.
A high level overview of the supported language features include:

* All the major Erlang types, including
* integers (with size limits)
* integers (integers with 256-bit magnitude plus separate sign)
* floats
* tuples
* [lists](./apidocs/erlang/estdlib/lists.md)
Expand Down Expand Up @@ -740,7 +740,7 @@ The following Erlang type specification enumerates this type:
Erlang/OTP uses the Christian epoch to count time units from year 0 in the Gregorian calendar. For example, the value 0 in Gregorian seconds represents the date Jan 1, year 0, and midnight (UTC), or in Erlang terms, `{{0, 1, 1}, {0, 0, 0}}`.

```{attention}
AtomVM is currently limited to representing integers in at most 64 bits, with one bit representing the sign bit.
AtomVM is currently limited to representing time in at most 64 bits, with one bit representing the sign bit.
However, even with this limitation, AtomVM is able to resolve microsecond values in the Gregorian calendar for over
292,000 years, likely well past the likely lifetime of an AtomVM application (unless perhaps launched on a deep
space probe).
4 changes: 4 additions & 0 deletions libs/jit/include/jit.hrl
Expand Up @@ -20,6 +20,10 @@

-define(JIT_FORMAT_VERSION, 1).

% Before adding any new platform to the list below:
% Is it 64-bit big endian? If so, the `put_digits` function in jit.erl must be updated to support
% big endian platforms.

-define(JIT_ARCH_X86_64, 1).
-define(JIT_ARCH_AARCH64, 2).
