Releases: diku-dk/futhark
nightly
0.25.37
Added
- `futhark fmt` prints module types better.
- New server protocol commands: `rank`, `elemtype`, `new_array`, `set`.
- `futhark lsp` provides inlay hints that show type ascriptions for the
  inferred types of bindings. By VegOwOtenks. (#2398)
- Scan chunk size is now exposed as a tuning parameter in the `cuda` and `hip`
  backends.
- `futhark lsp` offers code actions that insert type ascriptions for
  inferred types. Every named binding has an action.
- `futhark bench` and `futhark test` can now handle entry points that return
  opaque values, as long as there is no expected result.
- Better fusion for `scan` SOACs.
- New prelude function: `exscan`, an exclusive scan.
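As a rough illustration of what an exclusive scan computes, here is a minimal Python sketch. The changelog does not give the signature of `exscan`, so the argument order used below (operator, neutral element, input) and the helper itself are assumptions made for illustration, not the Futhark prelude definition.

```python
from itertools import accumulate
import operator


def exscan(op, ne, xs):
    """Exclusive scan: result[i] combines xs[0..i-1]; result[0] is ne.

    Hypothetical helper; the argument order is an assumption.
    """
    xs = list(xs)
    # accumulate(..., initial=ne) yields len(xs) + 1 prefix results.
    # Dropping the last one (the full reduction) leaves the exclusive scan.
    return list(accumulate(xs, op, initial=ne))[:-1]


if __name__ == "__main__":
    print(exscan(operator.add, 0, [1, 2, 3, 4]))
```

With addition and neutral element 0, the input `[1, 2, 3, 4]` becomes `[0, 1, 3, 6]`: each position holds the sum of the elements strictly before it, whereas an inclusive scan would give `[1, 3, 6, 10]`.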
Fixed
0.25.36
Added
- `futhark lsp` supports running evaluation comments in the interpreter,
  implemented by VegOwOtenks.
- New notation: `let x.f = y in z` as a shortcut for
  `let x = x with f = y in z`, by Aziz Rmadi. (#1160)
- Array and field updates can now be chained, e.g., `x with [i].f = y`,
  including in operator sections. Work by Aziz Rmadi.
- New server protocol commands: `index`, `shape`, and `attributes`.
- FutharkScript now supports (barely) array indexing.
- `futhark autotune` now shows a progress bar when running in a terminal. Work
  by Nader Rahhal.
- The C API now has functions for creating and updating arrays of opaque
  objects. (#2383, #2384)
Fixed
- GPU code generation for atomics on 8-bit scalars.
- `f64.nextafter`, `f64.copysign`, and `f64.isinf` reduced precision. (#2363)
- Equality for `f16` in interpreter did not handle NaN correctly.
- `f16` handling in `multicore` and `ispc` backends.
- Opaque types in C API generated from record patterns in entry points. (#2371)
- Regression in the unrolling of `map`.
- Occasional duplication of entry points leading to compiler crash. (#2374)
- `futhark autotune` no longer tells you to report an issue when the underlying
  program crashes. (#2388)
0.25.35
Added
- `futhark lsp` now provides the `textDocument/formatting` server method.
- `futhark profile`: generation of HTML files with cost centre and source range
  analysis, with heatmap markup according to source range cost.
  (#2340, with VegOwOtenks)
- Per-cost-centre source location listings in the `.summary` file output of
  `futhark profile`. (By VegOwOtenks)
- The `multicore` backend now uses a decoupled lookback `scan` implemented by
  Amirreza Hashemi.
- Custom tuning parameters can now be added with the `#[param(NAME)]` attribute.
  See the documentation for the sharp edges.
- `futhark test` now supports `--tuning`, just like `futhark bench`.
Fixed
- Unit types now behave like records/tuples in the C interface and derived
  interface, such as server-mode and `futhark literate`. (#2332)
- `futhark bench`: JSON output now contains results for all test stanzas that
  use a given entry point, rather than just the last one.
- A bug in internalisation of `while` loops that could occur when one result of
  the loop was syntactically the same as a loop parameter. (#2335)
- Missing inlining for some functions that must be inlined when using GPU
  backends. (#2341)
- An interpreter bug in return size inference for functions returning an
  abstract type that is concrete at the calling size (#2336).
- `futhark bench` would ignore programs with any tag.
- An issue where some array types would not be generated for the C API.
- An edge case in loop size inference. (#2354)
0.25.34
Added
- Support for CUDA 13.
Fixed
- Declaration of zero-length arrays in GPU kernels.
- Handling of size closures of type abbreviations in interpreter (#2316).
- `futhark literate` deleted `CACHEDIR.TAG` files.
- `futhark literate` now prints records and tuples properly.
- Some optimisations would throw away source location information, resulting in
  worse profiling data.
- Tighter source locations for `let x[i] = ...` expressions.
- Oversight in size expressions in let-bindings. (#2322)
- `futhark pkg` is now more robust against errors in package data.
- Incorrect uniqueness inference for functions returning abstract types. (#2324)
- Duplication of entry points in some cases. (#2326)
- A race condition in the creation of `CACHEDIR.TAG` files that could cause
  `futhark bench` and `futhark test` to crash.
0.25.33
Added
- Futhark now implements the cachedir specification, such that e.g. the `data`
  directories created by `futhark test` and `futhark bench` contain a
  `CACHEDIR.TAG` file.
- C types corresponding to records now have nicer names.
- `futhark script` now has a `$restore` procedure, and the `$store`
  procedure has been augmented with support for opaque types.
- The expression guarded by an `assert` may now be any expression, and need no
  longer be an atom.
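For readers unfamiliar with the cachedir specification (the Cache Directory Tagging Specification): a directory is marked as a cache by a file named `CACHEDIR.TAG` whose content starts with a fixed signature line, so that backup and archiving tools can skip it. The Python sketch below shows the idea. The function names are hypothetical, and this is not a description of how Futhark itself writes the file.

```python
import os
import tempfile

# Fixed header required by the Cache Directory Tagging Specification.
CACHEDIR_TAG_SIGNATURE = "Signature: 8a477f597d28d172789f06886806bc55"


def ensure_cachedir_tag(directory):
    """Create CACHEDIR.TAG in `directory` if missing; return True if created."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "CACHEDIR.TAG")
    try:
        # Mode "x" fails if the file already exists, so concurrent callers
        # cannot overwrite each other's file.
        with open(path, "x", encoding="ascii") as f:
            f.write(CACHEDIR_TAG_SIGNATURE + "\n")
            f.write("# This file is a cache directory tag.\n")
        return True
    except FileExistsError:
        return False


def is_tagged_cache_dir(directory):
    """Check whether `directory` contains a valid CACHEDIR.TAG file."""
    path = os.path.join(directory, "CACHEDIR.TAG")
    expected = CACHEDIR_TAG_SIGNATURE.encode("ascii")
    try:
        with open(path, "rb") as f:
            return f.read(len(expected)) == expected
    except OSError:
        return False


def demo():
    """Tag a temporary 'data' directory twice and report what happened."""
    with tempfile.TemporaryDirectory() as root:
        data_dir = os.path.join(root, "data")
        first = ensure_cachedir_tag(data_dir)
        second = ensure_cachedir_tag(data_dir)
        return first, second, is_tagged_cache_dir(data_dir)
```

Creating the file with exclusive mode is one simple way to tolerate several processes tagging the same directory at once: the loser of the race sees `FileExistsError` and moves on.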
Fixed
- `futhark fmt --check` no longer prints the program on failure.
- Use of unsigned types in entry points could cause invalid C to be generated.
  (#2306)
- `futhark script` now frees values before terminating.
- Correct source locations when warning about unused local functions.
- Unpacking a unary sum type directly in a parameter or `let`-binding was
  defective. (#2314)
- The derivative of `x**1` for `x==0` would be NaN.
- `futhark fmt` now prints multi-line `assert` in a less horrible way.
0.25.32
Added
- `futhark doc` now prints warnings about errors in doc comments (mostly
  references to unknown names).
- FutharkScript now permits chained `let` without `in`, just as in Futhark.
- `futhark pkg` now allows the `~` character in package paths.
- `cuda` backend: explicitly support CC 8.9, 9.0, 10.0, 10.1, and 12.0.
- Profiling now provides source locations for the profiled events. Some things
  are not yet accurately tracked, and only the static location is reported
  (i.e., no full call stack).
- The AD transformation is now more diligent about propagating attributes from
  SOACs in the primal code to SOACs in the differentiated code.
Fixed
- `f64` atomics on NVIDIA GPUs with less than CC 6.0 (Maxwell and older).
- Infinite loop in fusion (#2276).
- Rare compiler crash during internalisation of code that does size coercions
  on opaque size-lifted types.
- Missing fusion inside reduction and scan operators (#2283).
- Incorrect aliasing for memory blocks could cause some optimisations to be
  misapplied. (#2288)
- `to_bits`/`from_bits` not handled by AD (#2292).
- For GPU backends, incorrect code generation for accumulator updates (produced
  by AD) for thread-local arrays (#2294).
- Missing consumption check in the termination condition for `while` loops
  (#2300).
0.25.31
Added
- GPU backends: more efficient atomic operations on 8-bit and 16-bit quantities.
  This helps histograms on these types, as well as AD on programs that use
  `f16`.
- Improved handling of long chains of `flatten`/`unflatten`/`transpose`
  operations.
- New attributes: `#[blank]` and `#[scratch]`.
- A module type `with`-refinement may now have an existentially quantified size
  on its right-hand side.
- Value specs in module types can now use section binding notation for symbolic
  names, and in fact this is the preferred form that is also used by
  `futhark fmt`. (#2266)
- `futhark profile` now also prints proportion of total runtime for each cost
  centre.
- Futhark no longer warns about entry points with opaque types.
- Types such as `foo.bar` are now turned into `foo_bar` in the C API, rather
  than an ugly hash.
Fixed
- Interpreter: some tricky aspects of size-lifted types (#2258).
- Incorrect unused-name warning for named parameters in module types.
- Size-lifted abstract types with hidden sizes could result in different sizes
  being incorrectly treated as the same size.
- It was possible to make size-lifted types appear unlifted by using parametric
  types (#2268).
- The same type would be mentioned twice in some type errors.
- The type checker neglected to detect some cases of invalid references from
  return types to names bound in parameter patterns. (#2271)
- Incorrect handling of projections used in size expressions.
- Subtle interactions of modules and sizes in the interpreter and compiler
  (#2273).
0.25.30
Added
- New math functions: `f16.rsqrt`, `f32.rsqrt`, `f64.rsqrt`.
- New math functions: `cospi`, `sinpi`, `tanpi`, `acospi`, `asinpi`,
  `atanpi`, `atan2pi`, in each of the `f16`/`f32`/`f64` modules. (#2243)
- Slight improvements in the ability of the fusion engine to fuse
  across `map` nests separated by `reshape` operations. Only works if
  the innermost return type is purely scalar.
- `futhark pkg` now allows underscores in package paths.
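The changelog does not define the new math functions. Assuming they follow the conventional meaning of these names (as in OpenCL and similar math libraries), they can be sketched in Python as below. These are naive reference formulas for illustration only, not the Futhark implementations.

```python
import math


# Forward functions: the argument is measured in units of pi.
def cospi(x):
    return math.cos(math.pi * x)


def sinpi(x):
    return math.sin(math.pi * x)


def tanpi(x):
    return math.tan(math.pi * x)


# Inverse functions: the result is measured in units of pi.
def acospi(x):
    return math.acos(x) / math.pi


def asinpi(x):
    return math.asin(x) / math.pi


def atanpi(x):
    return math.atan(x) / math.pi


def atan2pi(y, x):
    return math.atan2(y, x) / math.pi


def rsqrt(x):
    """Reciprocal square root, 1 / sqrt(x)."""
    return 1.0 / math.sqrt(x)


if __name__ == "__main__":
    print(cospi(1.0), acospi(-1.0), atan2pi(1.0, 1.0), rsqrt(4.0))
```

The naive formulas lose a little accuracy: `sinpi(1.0)` as written here returns a tiny nonzero value (about 1.2e-16) because `math.pi` is only an approximation of pi. Avoiding that kind of error is one common reason for providing dedicated versions of these functions.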
Fixed
- The interpreter no longer crashes when passing a sum-typed value
  into AD, but it is unlikely to produce a usable result (#2238).
- The partial derivatives of comparisons are now always zero.
  Previously we had some code that made an attempt at giving these
  another interpretation, but it was never mathematically sound, not
  useful, and sometimes buggy. (#2239).
- Out-of-bounds reads in GPU backends when transposing a great many
  matrices in parallel (#2241).
- `vjp` in the interpreter is now asymptotically efficient (#2187,
  #2240). Work by Marcus Jensen.
- The interpreter did not handle `open` correctly.
- Incorrect handling of some size inference edge cases during
  monomorphisation (#2252).
- Incorrect registration of entry point types when mixing type
  abbreviations and arrays (#2253).
- Reverse mode AD now handles sequential streams. (#2256)
0.25.29
Fixed
- AD would in some cases produce code that would cause a compiler
  crash (#2228).
- Slight error in the definition of the partial derivatives for the
  `**` operator could cause NaNs in the interpreter when using
  forward-mode AD (#2229).
- The magical machinery for inferring external API types did not
  handle arrays with uniqueness annotations consistently, resulting in
  incompatible entry point types being generated, leading to a
  compiler crash. (#2231)
- A simplification rule for array slices would in some cases produce
  type-incorrect code. (#2232)
- A bug in the defunctionaliser could cause a compiler crash in code
  that used both higher order functions and size expressions in clever
  ways (#2234).
- Fusion could crash after AD in some circumstances (#2236).
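The changelog does not say what the error in the `**` partial derivatives (#2229) was. As general background, the Python sketch below shows one common way a forward-mode rule for `x**y` can produce NaN: applying the partial with respect to `y`, which involves `log(x)`, even when `y` is a constant. The function names are hypothetical, the sketch assumes `x >= 0`, and it is not a reconstruction of the actual Futhark defect or fix.

```python
import math


def ieee_log(x):
    """Natural log that returns -inf at 0 instead of raising (assumes x >= 0)."""
    return -math.inf if x == 0.0 else math.log(x)


def pow_jvp_naive(x, dx, y, dy):
    """Tangent of x**y, always evaluating both partial derivatives."""
    d_dx = y * x ** (y - 1.0)      # partial with respect to x
    d_dy = x ** y * ieee_log(x)    # partial with respect to y
    return d_dx * dx + d_dy * dy


def pow_jvp_guarded(x, dx, y, dy):
    """Same rule, but a partial is skipped when its tangent is exactly zero."""
    total = 0.0
    if dx != 0.0:
        total += y * x ** (y - 1.0) * dx
    if dy != 0.0:
        total += x ** y * ieee_log(x) * dy
    return total


if __name__ == "__main__":
    # Derivative of x**1 at x = 0, with y held constant (dy = 0).
    print(pow_jvp_naive(0.0, 1.0, 1.0, 0.0))
    print(pow_jvp_guarded(0.0, 1.0, 1.0, 0.0))
```

In the naive version, the `y` partial evaluates to `0 * -inf`, which is NaN, and multiplying by the zero tangent `dy` does not remove it. The guarded version never evaluates that term and returns the expected derivative of 1.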