This repository was archived by the owner on Aug 7, 2024. It is now read-only.

Conversation


@vkuzo vkuzo commented Jan 11, 2024

No description provided.

@facebook-github-bot added the CLA Signed label on Jan 11, 2024
@vkuzo requested a review from drisspg on January 11, 2024, 22:00
@facebook-github-bot

@vkuzo has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

README.md Outdated
# User API, subject to change

## single GPU
We provide two types of scaling for per-tensor scaling of tensors: dynamic and delayed.
@drisspg (Contributor) commented Jan 11, 2024:

This sounds kind of weird; what about "We provide two scaling strategies: per-tensor dynamic and delayed."?
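For readers skimming this thread, a rough sketch of what the two per-tensor scaling flavors look like from the user side. The `Float8Linear` / `Float8DynamicLinear` classes appear later in this PR's file list; the `swap_linear_with_float8_linear` helper name, import paths, and signatures are assumptions and may differ by version.

```python
import torch.nn as nn

# Assumed import paths; the module layout may differ between versions.
from float8_experimental.float8_linear_utils import swap_linear_with_float8_linear
from float8_experimental.float8_linear import Float8Linear                # delayed scaling
from float8_experimental.float8_dynamic_linear import Float8DynamicLinear  # dynamic scaling

model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 1024))

# Dynamic scaling: scales are recomputed from the current tensor values every iteration.
swap_linear_with_float8_linear(model, Float8DynamicLinear)

# Delayed scaling (alternative): scales come from a running history of amax values,
# which the training loop has to keep in sync (see the delayed-scaling sketch further down).
# swap_linear_with_float8_linear(model, Float8Linear)
```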

README.md Outdated

# optional: use FSDP. Note that workarounds are needed for autocast+compile+FSDP+float8 to work
from float8_experimental import config
config.enable_amax_init = False
Contributor:

Can we do something like: if using FSDP, do the config stuff; otherwise you don't need to?

README.md Outdated

We are using a module swap UX to keep things simple. If the user model has `torch.nn.Linear` modules or their `fairscale` TP/SP equivalents,
we can convert them to float8. `F.linear`, `torch.mm`, `torch.matmul` are not supported at the moment.
# upcoming work
Contributor:

Maybe we make a tracking issue for upcoming work and link it here, just so we don't forget to remove this from the README. Either way is okay though.
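For context on the module swap UX quoted a few lines up, a generic illustration of what a module swap does; this is not the library's actual `swap_linear_with_float8_linear` implementation, and the `from_float` constructor hook is an assumption.

```python
import torch.nn as nn

def swap_linears(module: nn.Module, float8_linear_cls) -> nn.Module:
    """Recursively replace nn.Linear children with a float8 equivalent (illustrative only)."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            # Assumes the float8 class exposes a from_float() constructor that
            # copies the weight/bias from the original nn.Linear.
            setattr(module, name, float8_linear_cls.from_float(child))
        else:
            swap_linears(child, float8_linear_cls)
    return module
```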


## multi GPU

### TP/SP
Contributor:

Maybe we still keep the TP/SP stuff documented, just saying that we know this doesn't work with compile and that we plan to have dtensor integration for this.

Contributor Author (@vkuzo):

hmm, it's not that usable without compile though :( I'm more excited about just deleting this


### Tensor subclasses

We are using tensor subclasses (`Float8Tensor`) to write modular code which satisfies
Contributor:

I think this is still fine, no?

Contributor Author (@vkuzo):

will copy-paste to design
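To make the `Float8Tensor` discussion concrete, a toy sketch of the wrapper-subclass pattern: the subclass stores the float8 payload plus a scale but advertises the original high-precision dtype, which is what lets it satisfy autograd's `x.dtype == x.grad.dtype` restriction mentioned below. This is illustrative only; the real `Float8Tensor` also implements `__torch_dispatch__`, omitted here.

```python
import torch

class ToyFloat8Tensor(torch.Tensor):
    """Toy wrapper: float8 payload + scale, presented to autograd as the original dtype."""

    @staticmethod
    def __new__(cls, fp8_data: torch.Tensor, scale: torch.Tensor, orig_dtype: torch.dtype):
        # Advertise orig_dtype (e.g. bfloat16) so x and x.grad have matching dtypes,
        # even though the stored payload is float8.
        return torch.Tensor._make_wrapper_subclass(
            cls, fp8_data.shape, dtype=orig_dtype, device=fp8_data.device
        )

    def __init__(self, fp8_data: torch.Tensor, scale: torch.Tensor, orig_dtype: torch.dtype):
        self._data = fp8_data
        self._scale = scale

    def __repr__(self):
        return f"ToyFloat8Tensor(shape={tuple(self.shape)}, dtype={self.dtype})"
```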

# the rest of the flow is the same as the single GPU flow

# high level technical design
Contributor:

same with this, I don't mind the high level design

Contributor Author (@vkuzo):

I think all of the below is slightly outdated, and it is also more dev docs than user README. I think we can have an issue about the high level design; it would be nice to edit that without the OSS->Meta PR sync. I can copy-paste this there.

* `float8_experimental/float8_linear.py` - `Float8Linear` (main user facing entry point for delayed scaling)
* `float8_experimental/float8_dynamic_linear.py` - `Float8DynamicLinear` (main user facing entry point for dynamic scaling)
* `float8_experimental/float8_tensor.py` - `Float8Tensor`, which allows `Float8Linear` to abide by the `x.dtype == x.grad.dtype` restriction
* `float8_experimental/tp_linear.py` - `Float8ColumnParallelLinear` / `Float8RowParallelLinear` (TP/SP versions of float8 linear)
Contributor:

same with above
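Since `Float8Linear` (the delayed-scaling entry point) appears in the file list above, a hedged sketch of where the amax/scale sync would sit in a training step; `sync_float8_amax_and_scale_history` is the helper name the library exposed around this time, but treat the import path and exact signature as assumptions.

```python
# Assumed import path; may differ by version.
from float8_experimental.float8_linear_utils import sync_float8_amax_and_scale_history

def train_step(model, optimizer, loss_fn, x, y):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    # Delayed scaling: scales are derived from a history of observed amax values,
    # so the history has to be synced across float8 modules once per iteration.
    sync_float8_amax_and_scale_history(model)
    optimizer.step()
    return loss
```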


@facebook-github-bot

@vkuzo merged this pull request in d0af81a.
