Releases: JuliaGPU/Metal.jl

v1.1.0

10 Apr 14:31
1ebc4c9

Metal v1.1.0

Diff since v1.0.0

Closed issues:

  • Validation-related back-end crash on macOS Ventura (#34)
  • slow broadcast copy in 2D (#41)
  • Poor performance of mapreduce (#46)
  • Multiplication with SubArrays (#47)
  • Add support to creating MtlArray using a memory allocated by Array (#62)
  • Improve use of unified memory (#86)
  • Use Autoreleasepools with Metal (#103)
  • Unknown RFLT tag generated by macOS 13 Metal compiler (#167)
  • mapreduce allocates a lot on the CPU (#211)
  • Legalization errors with vectorized code (#257)
  • Compilation Failure due to undefined symbols (#276)
  • resize!, append! not defined (#277)
  • tag new version (#278)
  • Panic during profiling tests on 14.4 beta (#281)
  • M3 backend cannot handle atomics with complicated pointer conversions (#282)
  • Int128 does not compile (#287)
  • Two suspicious mtl-related behaviours (#289)
  • LU factorization: add allowsingular keyword argument (#299)
  • Autorelease changes lead to use after free with errors (#301)
  • Reductions don't work on Shared Arrays (#312)

v1.0.0

30 Jan 15:16
f6df13d

Metal v1.0.0

Diff since v0.5.1

Merged pull requests:

  • Matrix batches (#158) (@tgymnich)
  • Add 1.10 CI. (#256) (@maleadt)
  • Update manifest (#258) (@github-actions[bot])
  • CompatHelper: bump compat for GPUCompiler to 0.25, (keep existing compat) (#259) (@github-actions[bot])
  • Bump actions/checkout from 3 to 4 (#260) (@dependabot[bot])
  • Update manifest (#261) (@github-actions[bot])
  • CompatHelper: bump compat for CEnum to 0.5, (keep existing compat) (#262) (@github-actions[bot])
  • Update manifest (#263) (@github-actions[bot])
  • CompatHelper: add new compat entry for Artifacts at version 1, (keep existing compat) (#264) (@github-actions[bot])
  • Reduce launch overhead by generating code to encode arguments. (#265) (@maleadt)
  • Remove unused function argument (#266) (@tgymnich)
  • Introduce application tracing profiler (#267) (@maleadt)
  • Remove content(::MTLBuffer), use convert instead. (#268) (@maleadt)
  • Allow more kwargs syntax with kernel launches (#269) (@maleadt)
  • Don't re-use the IO object when shelling out to Python. (#271) (@maleadt)
  • Preserve storage mode when broadcasting. (#273) (@maleadt)

Closed issues:

  • Support for macOS Sonoma (#201)
  • Error with Julia 1.10 (#274)

v0.5.1

13 Sep 14:34
335704e

Metal v0.5.1

Diff since v0.5.0

Merged pull requests:

  • MPSMatrix improvements (#157) (@tgymnich)
  • Update manifest (#221) (@github-actions[bot])
  • Update manifest (#222) (@github-actions[bot])
  • Update manifest (#224) (@github-actions[bot])
  • Update manifest (#227) (@github-actions[bot])
  • CompatHelper: bump compat for ObjectiveC to 1, (keep existing compat) (#228) (@github-actions[bot])
  • Update manifest (#230) (@github-actions[bot])
  • Fix argument types in sincos (#232) (@fjebaker)
  • Update manifest (#233) (@github-actions[bot])
  • Improve docs (#235) (@christiangnrd)
  • Remove linear algebra section of MPS docs (#237) (@christiangnrd)
  • CompatHelper: bump compat for GPUCompiler to 0.22, (keep existing compat) (#238) (@github-actions[bot])
  • Port openlibm log1pf as log1p (#239) (@sotlampr)
  • Port openlibm erf (#240) (@tgymnich)
  • Remove 1.6-era override mechanism. (#241) (@maleadt)
  • CompatHelper: add new compat entry for Requires at version 1, (keep existing compat) (#242) (@github-actions[bot])
  • Update manifest (#243) (@github-actions[bot])
  • enable dependabot for GitHub actions (#244) (@ranocha)
  • Bump actions/checkout from 2 to 3 (#245) (@dependabot[bot])
  • Bump peter-evans/create-pull-request from 3 to 5 (#246) (@dependabot[bot])
  • Show METAL_CAPTURE_ENABLED in Metal.versioninfo() when the environment variable is set (#248) (@christiangnrd)
  • Update manifest (#249) (@github-actions[bot])
  • Adapt to GPUCompiler.jl, and other small updates. (#250) (@maleadt)
  • Switch to GPUArrays buffer management. (#251) (@maleadt)
  • Update manifest (#252) (@github-actions[bot])
  • Update manifest (#253) (@github-actions[bot])
  • Bump GPUCompiler (#255) (@maleadt)

Closed issues:

  • Random access indexing into MtlArray views cause scalar indexing (#149)
  • Q: How to debug kernels - KA.@print? (#223)
  • Crash during MTLDispatchListApply (#225)
  • Unable to compile trig functions through ForwardDiff (#229)
  • symbol multiply defined! Bug/crash on Julia master, fine on 1.10 (#231)
  • log1p fails on MtlArray{Float32} (#234)
  • When precompiling, UndefVarError: CompilerConfig not defined (#247)

v0.5.0

01 Jul 15:06
9a72b9c

Metal v0.5.0

Diff since v0.4.1

Metal.jl 0.5 is a feature release, bringing initial support for atomic operations (#168).
Low-level atomics that mimic Metal C are supported (atomic_store_explicit,
atomic_load_explicit, etc.), as well as a higher-level Metal.@atomic that can be used to
update array values similar to how CUDA.jl's @atomic works. This uses native atomics when
supported, and falls back to a compare-exchange loop otherwise.
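As a rough illustration of the higher-level macro, the sketch below has every thread
atomically increment a single counter. The kernel name count_threads! and the launch
size are made up for this example, and it assumes a machine with a supported Apple GPU:

```julia
using Metal

# Hypothetical kernel: each thread atomically adds 1 to the same array element.
# Without the atomic update, concurrent increments could overwrite one another.
function count_threads!(counter)
    Metal.@atomic counter[1] += Int32(1)
    return
end

counter = Metal.zeros(Int32, 1)              # one-element Int32 array on the GPU
@metal threads=256 count_threads!(counter)   # launch 256 threads in one threadgroup
Metal.synchronize()                          # wait for the kernel to finish

total = Array(counter)[1]                    # copy the result back to the CPU
@assert total == 256                         # one increment per launched thread
```

Int32 is used here because integer atomics are the most likely to be handled natively;
other element types may take the compare-exchange fallback described above.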

Minor changes include an update for the @device_code_agx disassembler, the addition of a
type variable to MtlArray encoding the storage mode (#194), and support for MPSVector
(#199) which should accelerate matrix/vector multiplications.

Also note that Metal.jl now disallows the construction of Float64 arrays, as these are not
supported by the Metal libraries.

Closed issues:

  • Support for atomics (#79)
  • Make MtlArray storage mode a type parameter (#190)
  • Long stacktrace when trying to create Float64 rand arrays (#205)
  • allowscalar equivalent for Metal.jl (#206)
  • Define map! ? (#219)

Merged pull requests:

  • Implement atomics using compiler intrinsics (#168) (@maleadt)
  • Parameterize MtlArray storage mode (#194) (@christiangnrd)
  • Implement MPSVector (#199) (@tgymnich)
  • Update manifest (#200) (@github-actions[bot])
  • Add Metal 3.1 to MTLLanguageVersion (#202) (@christiangnrd)
  • Update manifest (#203) (@github-actions[bot])
  • CompatHelper: bump compat for GPUCompiler to 0.21, (keep existing compat) (#204) (@github-actions[bot])
  • Update manifest (#207) (@github-actions[bot])
  • Disallow Float64 arrays entirely. (#209) (@maleadt)
  • Adapt to LLVM.jl 6. (#213) (@maleadt)
  • Update manifest (#215) (@github-actions[bot])
  • Bump disassembler. (#216) (@maleadt)

v0.4.1

01 Jun 13:28

Metal v0.4.1

Diff since v0.4.0

Closed issues:

  • Command buffer callbacks can cause bus error during thread adoption (#138)
  • how to set up Project.toml (#185)
  • Metal.rand() creates a CPU array (#187)
  • fill! for Int8 errors when the value is negative (#192)

v0.4.0

22 May 14:25
6047d50

Metal v0.4.0

Diff since v0.3.0

Closed issues:

  • Restore mtlcall (#17)
  • mapreduce has poor performance (#87)
  • Native code reflection (#95)
  • rand! with Bools sometimes fails in tests in 1.9 (#141)
  • LLVM assertion failures (#153)
  • Time macro similar to CUDA.@time (#160)
  • bug in rand!? (#162)
  • Why not support threadIdx().x, blockIdx().x, blockDim().x etc? (#163)
  • Incorrect(?) darwin version in 1.8 with Metal.versioninfo() (#179)

v0.3.0

31 Mar 14:07
20ba6a4

Metal v0.3.0

Diff since v0.2.0

Closed issues:

  • Migrate to metal C++? (#2)
  • Improved errors when calling device functions on CPU (#90)
  • Improve Objective-C interfacing (#104)
  • Rename grid to groups (#116)
  • Add functionality check helper (#121)
  • inputing non-isbits types (#128)
  • @metal docstring out-of-date (#129)
  • mapreduce kernel uses too many threads (#132)
  • Powers don't work with complex floats (#142)

v0.2.0

03 Mar 13:00

Metal v0.2.0

Diff since v0.1.2

Closed issues:

  • Threadgroup memory breaks on small datatypes (#26)
  • Int64 not supported on AMD GPUs? (#38)
  • Base.unsafe_convert is ambiguous (#42)
  • Support for multiple devices (#44)
  • Add CITATION file (#55)
  • XGBoost on Metal.jl (#82)
  • first try at metal (#84)
  • Copysign intrinsic possibly wrong (#89)
  • Metal.jl fails to precompile on Linux (#97)
  • Silent failure with unsupported(?) Intel Iris Graphics (#109)
  • I have 2 question about Metal.jl and Flux.jl (#110)

Merged pull requests:

  • Update manifest (#57) (@github-actions[bot])
  • Add GPU profiling capabilities (#58) (@max-Hawkins)
  • Automatically detect if we need cmt build from source. (#59) (@maleadt)
  • Update manifest (#60) (@github-actions[bot])
  • Add queue kernel launch argument (#61) (@tgymnich)
  • Update manifest (#63) (@github-actions[bot])
  • Switch pipeline to juliaecosystem (#64) (@vchuravy)
  • Update manifest (#65) (@github-actions[bot])
  • Add a function for setting the current device (#66) (@maxwindiff)
  • Add documentation webpage (#67) (@max-Hawkins)
  • Wrap simdgroup matrix functions (#70) (@maxwindiff)
  • Support loading/saving simdgroup matrix from threadgroup memory (#71) (@maxwindiff)
  • Conditionalize the MtlDeviceArray element-type workaround. (#72) (@maleadt)
  • Add basic SIMD shuffle up/down (#73) (@max-Hawkins)
  • Update manifest (#74) (@github-actions[bot])
  • Optimize warp reduction for mapreduce (#75) (@max-Hawkins)
  • Specialize GPUArrays.global_index() to improve broadcast performance (#76) (@maxwindiff)
  • Update manifest (#78) (@github-actions[bot])
  • Add initial performance shader support (matmul) (#80) (@max-Hawkins)
  • Use Ninja to build cmt. (#81) (@maleadt)
  • Update manifest (#83) (@github-actions[bot])
  • Support Julia 1.9 (#85) (@maleadt)
  • Add queue parameter to unsafe_copyto (#88) (@tgymnich)
  • Update manifest (#91) (@github-actions[bot])
  • Add MPS tests. (#92) (@maleadt)
  • Support for writing binary archives (#94) (@maleadt)
  • Support precompilation and loading on non-Apple hardware (#98) (@maleadt)
  • Update manifest (#99) (@github-actions[bot])
  • Improve reduce performance by passing CartesianIndices and length statically (#100) (@maxwindiff)
  • Do not release objects that are autoreleased. (#102) (@habemus-papadum)
  • Fix path to cmt in Hacking Section of the Readme (#105) (@habemus-papadum)
  • Add example showing Metal and Gtk4 integration (#106) (@habemus-papadum)
  • Fix memory leak. (#107) (@habemus-papadum)
  • Add a mtl function for simple recursive data conversions. (#114) (@maleadt)
  • Write profile trace in the current folder. (#115) (@maleadt)

v0.1.2

03 Oct 13:24
18da14d

Metal v0.1.2

Diff since v0.1.1

Closed issues:

  • installation issue (libz.1.dylib not found) [+workaround] (#51)
  • Optimally choosing threads and grid (#54)

Merged pull requests:

  • Use Base.active_project. (#43) (@maleadt)
  • Update manifest (#45) (@github-actions[bot])
  • Add aliases MtlVector and MtlMatrix (#48) (@amontoison)
  • Update manifest (#49) (@github-actions[bot])
  • Wrap at-metal's output in a let block. (#50) (@maleadt)
  • Update manifest (#52) (@github-actions[bot])
  • Update manifest (#56) (@github-actions[bot])

v0.1.1

10 Jul 12:27
71f05d9

Metal v0.1.1

Diff since v0.1.0

Closed issues:

  • Super slow broadcast (#39)