Conversation

@donglixp (Contributor) commented Mar 6, 2017

No description provided.

@soumith soumith merged commit 761d679 into pytorch:master Mar 6, 2017
@soumith (Member) commented Mar 6, 2017

thanks!

mrshenli pushed a commit to mrshenli/pytorch that referenced this pull request Apr 11, 2020
(Tutorials Refresh) Update build.sh, conf.py, add first commit for recipes
jjsjann123 pushed a commit to jjsjann123/pytorch that referenced this pull request Aug 5, 2021
zhuhong61 pushed a commit to zhuhong61/pytorch that referenced this pull request Jun 8, 2022
hubertlu-tw pushed a commit to hubertlu-tw/pytorch that referenced this pull request Nov 1, 2022
Co-authored-by: pbialecki <pbialecki@nvidia.com>
akashveramd pushed a commit to akashveramd/pytorch that referenced this pull request Apr 9, 2025
akashveramd pushed a commit to akashveramd/pytorch that referenced this pull request Apr 9, 2025
…rch#945)

* Add f8 bf8 gemm example

* Add element-wise ops

* Add intrinsics

* Update reference calculation

* Add an additional type option for xdlops gemm

* Fix build process

* Add bf8 to buffer addressing

* Update blockwise op, split typeA and typeB

* Update for compatibility

* Update naming to f8->fp8

* Update naming

* Format

* Update naming (pytorch#937)

* Add a client example

* Add computetypes to device and gridwise ops

* Add instances, update instance factory

* Format

* Fix a flag

* Add ckProfiler mode

* Fix typos

* Add an example

* Add bf8 generator

* Add bf8 mfma; fix type_convert for bf8

* Move verification ahead of timing

* Update reference calculation

* Fix reference

* Narrow down float init range

* Fix bf8 bf8 mfma

* Add bf8 @ fp8 mfma

* Update example

* Update instances

* Update profiler api

* Update for compatibility

* Format

* Remove extra example

* Clean up

* Workaround convert

---------

Co-authored-by: Jing Zhang <jizha@amd.com>
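
The bullets above gesture at two pieces of machinery: an 8-bit bf8 data type and a float-accumulated reference GEMM used for verification. For orientation, here is a minimal self-contained sketch of both, assuming bf8 means the OCP E5M2 format; it is not the composable_kernel code, `float_to_bf8`, `bf8_to_float`, and `reference_gemm` are hypothetical names, and `_Float16` is a Clang/GCC extension used as a convenient binary16 type.

```cpp
// Minimal sketch (not composable_kernel): bf8 (E5M2) round-trip plus a
// float-accumulated reference GEMM for verification.
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>

// E5M2 shares binary16's sign/exponent layout, so a bf8 value is the top
// byte of a half. Round float -> half first (compiler does RNE), then round
// the 10-bit mantissa down to 2 bits with round-to-nearest-even.
static uint8_t float_to_bf8(float x) {
    _Float16 h = static_cast<_Float16>(x);
    uint16_t bits;
    std::memcpy(&bits, &h, sizeof(bits));
    uint16_t keep_lsb = (bits >> 8) & 1;                  // ties-to-even bit
    bits = static_cast<uint16_t>(bits + 0x7F + keep_lsb); // carry may bump exponent
    return static_cast<uint8_t>(bits >> 8);               // NaN/Inf not handled
}

static float bf8_to_float(uint8_t b) {
    uint16_t bits = static_cast<uint16_t>(b) << 8;        // zero low mantissa bits
    _Float16 h;
    std::memcpy(&h, &bits, sizeof(h));
    return static_cast<float>(h);
}

// Reference GEMM in the spirit of the "Update reference calculation"
// bullets: dequantize each 8-bit element and accumulate in float.
static void reference_gemm(int M, int N, int K,
                           const std::vector<uint8_t>& A,  // MxK, row-major
                           const std::vector<uint8_t>& B,  // KxN, row-major
                           std::vector<float>& C) {        // MxN
    for (int m = 0; m < M; ++m)
        for (int n = 0; n < N; ++n) {
            float acc = 0.f;
            for (int k = 0; k < K; ++k)
                acc += bf8_to_float(A[m * K + k]) * bf8_to_float(B[k * N + n]);
            C[m * N + n] = acc;
        }
}

int main() {
    const int M = 2, N = 2, K = 2;
    const float a[] = {1.f, 2.f, -0.5f, 4.f};   // exactly representable in E5M2
    const float b[] = {1.f, 0.f, 0.f, 1.f};     // identity, so C should equal a
    std::vector<uint8_t> A(M * K), B(K * N);
    std::vector<float> C(M * N);
    for (int i = 0; i < M * K; ++i) A[i] = float_to_bf8(a[i]);
    for (int i = 0; i < K * N; ++i) B[i] = float_to_bf8(b[i]);
    reference_gemm(M, N, K, A, B, C);
    for (float c : C) std::printf("%g ", c);    // prints: 1 2 -0.5 4
    std::printf("\n");
    return 0;
}
```

With inputs restricted to exactly representable values, as in main above, the round-trip is lossless and the comparison against the reference can be exact; the "Narrow down float init range" bullet follows the same logic for tolerance-based checks.
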
akashveramd pushed a commit to akashveramd/pytorch that referenced this pull request Apr 9, 2025
* Add f8 bf8 gemm example

* Add element-wise ops

* Add intrinsics

* Update reference calculation

* Add an additional type option for xdlops gemm

* Fix build process

* Add bf8 to buffer addressing

* Update blockwise op, split typeA and typeB

* Update for compatibility

* Update naming to f8->fp8

* Update naming

* Format

* Update naming (pytorch#937)

* Add a client example

* Add computetypes to device and gridwise ops

* Add instances, update instance factory

* Format

* Fix a flag

* Add ckProfiler mode

* Fix typos

* Add an example

* Add bf8 generator

* Add bf8 mfma; fix type_convert for bf8

* Move verification ahead of timing

* Update reference calculation

* Fix reference

* Narrow down float init range

* Fix bf8 bf8 mfma

* Add bf8 @ fp8 mfma

* Update example

* Update instances

* Update profiler api

* Update for compatibility

* Format

* Remove extra example

* Clean up

* Workaround convert

* Add f16_bf8f8 instance and client example

* Fix mfma selector

* Format

---------

Co-authored-by: Rostyslav Geyyer <rosty.geyyer@amd.com>
Co-authored-by: Rostyslav Geyyer <46627076+geyyer@users.noreply.github.com>
Co-authored-by: Jing Zhang <jizha@amd.com>
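
Two bullets here, "Move verification ahead of timing" and the narrowed init range, describe a profiling-harness pattern rather than kernel code. Below is a hedged sketch of that pattern, assuming a host-side check; `run_kernel`, `run_reference`, and `almost_equal` are made-up stand-ins, not ckProfiler APIs.

```cpp
// Sketch of "verify before timing": validate once against the reference and
// only benchmark a kernel that passed, so a broken kernel never reports a
// meaningless (and possibly fast) time.
#include <chrono>
#include <cmath>
#include <cstdio>
#include <vector>

static bool almost_equal(const std::vector<float>& out,
                         const std::vector<float>& ref, float rtol) {
    for (size_t i = 0; i < out.size(); ++i) {
        float denom = std::fmax(std::fabs(ref[i]), 1.f);
        if (std::fabs(out[i] - ref[i]) / denom > rtol) return false;
    }
    return true;
}

int main() {
    const int n = 1 << 10;
    std::vector<float> out(n), ref(n);
    auto run_kernel    = [&] { for (int i = 0; i < n; ++i) out[i] = 2.f * i; };
    auto run_reference = [&] { for (int i = 0; i < n; ++i) ref[i] = 2.f * i; };

    // 1) Verify first; a loose tolerance reflects quantized fp8/bf8 inputs.
    run_kernel();
    run_reference();
    if (!almost_equal(out, ref, 1e-3f)) {
        std::fprintf(stderr, "verification FAILED, skipping timing\n");
        return 1;
    }

    // 2) Only then time it.
    const int iters = 100;
    auto t0 = std::chrono::steady_clock::now();
    for (int it = 0; it < iters; ++it) run_kernel();
    auto t1 = std::chrono::steady_clock::now();
    double us = std::chrono::duration<double, std::micro>(t1 - t0).count() / iters;
    std::printf("verification OK, avg %.3f us/iter\n", us);
    return 0;
}
```
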