Skip to content

HashSetFn's calculation of maxSize can cause Overflow during functor instantiation #279

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
2 of 12 tasks
Skyb0rg007 opened this issue May 25, 2023 · 7 comments
Closed
2 of 12 tasks
Assignees
Labels
bug Something isn't working fixed-in-110.99.4 issues that will be fixed in the 110.99.4 version smlnj-lib General issue with the SML/NJ Library or the Util component

Comments

@Skyb0rg007
Copy link
Contributor

Skyb0rg007 commented May 25, 2023

Version

110.99.3 (Latest)

Operating System

  • Any
  • Linux
  • macOS
  • Windows
  • Other Unix

OS Version

No response

Processor

  • Any
  • Arm (using Rosetta)
  • PowerPC
  • Sparc
  • x86 (32-bit)
  • x86-64 (64-bit)
  • Other

System Component

SML/NJ Library

Severity

Major

Description

The HashSetFn functor (hash-set-fn.sml) calculates maxSize as the largest power of 2 that is still less than Array.maxLen.
However, the calculation does not properly handle systems in which Int.maxInt = SOME Array.maxLen, causing an exception to be raised during functor instantiation.

Note: This bug does not occur on the SML/NJ compiler, since Array.maxLen is much smaller than Int.maxInt.
The issue is reproducible for MLton (latest version: 20230523-gd082c4a36), since there SOME Array.maxLen = Int.maxInt.

Transcript

(* test.sml *)
structure X = HashSetFn(
   struct
      type hash_key = int
      val hashVal = Word.fromInt
      val sameKey: int * int -> bool = op =
   end)
val () = print "Okay!\n"


(* test.mlb *)
$(SML_LIB)/basis/basis.mlb
$(SML_LIB)/smlnj-lib/Util/smlnj-lib.mlb
test.sml


$ mlton -output a.out test.mlb
$ ./a.out
unhandled exception: Overflow


$ # The `-default-type intinf` flag causes Int.int to have infinite precision, but keeps Array.maxLen the same
$ mlton -output a.out -default-type intinf test.mlb
$ ./a.out
Okay!

Expected Behavior

Instantiating the HashSetFn should never raise Overflow.

Steps to Reproduce

Just run the code from the transcript.

Additional Information

This can be fixed by modifying the code at line 37 in hash-set-fn.sml:

(* old *)   
val maxSize = let
     fun f i = let
             val i' = i+i
             in
               if i' < Array.maxLen then f i' else i
             end
     in
       f 0x10000
     end

(* new*)   
val maxSize = let
     fun f i = let
             val i' = i+i
             in
               if i' < Array.maxLen then f i' else i
             end
             handle Overflow => i
     in
       f 0x10000
     end

Email address

ssoss AT uchicago DOT edu

@Skyb0rg007 Skyb0rg007 added the bug Something isn't working label May 25, 2023
@MatthewFluet
Copy link

MLton patches the SML/NJ Library that it ships to avoid a similar problem in structure HashTableRep (see https://github.com/MLton/mlton/blob/b1f1f0f0916d28c0d183fba85549d5bf96f1fa41/lib/smlnj-lib/smlnj-lib.patch#L4498-L4510).

@Skyb0rg007
Copy link
Contributor Author

I'll resubmit this report there. However I do still believe this should be addressed in the SML/NJ repo, if for nothing but future-proofing against a possible Array.maxLen increase or Int.maxInt decrease.

JohnReppy added a commit that referenced this issue May 25, 2023
Overflow during functor instantiation).
@JohnReppy JohnReppy added smlnj-lib General issue with the SML/NJ Library or the Util component fixed-in-110.99.4 issues that will be fixed in the 110.99.4 version labels May 25, 2023
@JohnReppy JohnReppy self-assigned this May 25, 2023
@JohnReppy
Copy link
Contributor

I've pushed a fix that computes floor(log2(Array.maxSize)) to use as the maxSize limit for hash tables.

@MatthewFluet
Copy link

This computation of floor(log2(Array.maxSize)) is not robust when structure Word = Word32 and structure Int = IntInf; in general, if the default word size is smaller than the default integer size.

$ cat z.sml 
val maxSize = let
      fun lp (0w0, k) = Word.toIntX(Word.<<(0w1, k-0w1))
        | lp (w, k) = lp (Word.>>(w, 0w1), k+0w1)
      in
        lp (Word.fromInt Array.maxLen, 0w0)
      end

val _ = print (concat ["maxSize: ", Int.toString maxSize, "\n"])
$ mlton -default-type intinf z.sml 
$ ./z 
maxSize: ~2147483648

JohnReppy added a commit that referenced this issue Jun 14, 2023
size into the internal `MaxHashTableSize` structure.
@JohnReppy
Copy link
Contributor

Rewrote the code to just use the Int module, so it should be portable now.

@MatthewFluet
Copy link

The max-hash-table-size.sml file was not added in the commit.

@JohnReppy
Copy link
Contributor

I missed it in the initial commit, but I added it about 4 hours ago.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working fixed-in-110.99.4 issues that will be fixed in the 110.99.4 version smlnj-lib General issue with the SML/NJ Library or the Util component
Projects
None yet
Development

No branches or pull requests

3 participants