Feature/sparse pooling #43

Flegyas · 2022-03-22T17:30:48Z

This PR adds a new pooling method for subwords (well, two, but the inefficient one is there only for benchmarking purposes).
The sparse one is needed in contexts where CUDA determinism is enabled, since the scatter methods do not support it.
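
As a rough illustration of the two approaches, the sketch below mean-pools subword vectors into word vectors, once with `scatter_add_` and once with a sparse matrix product. It is not the code from this PR: the function names, the `word_ids` layout, and the use of -1 to mark padded subwords are assumptions made for the example.

```python
import torch


def scatter_mean_pool(embeddings, word_ids, num_words):
    """Mean-pool subwords into words by scatter-adding sums and counts."""
    batch, _, hidden = embeddings.shape
    valid = (word_ids >= 0).to(embeddings.dtype)  # assumption: -1 marks padding
    index = word_ids.clamp(min=0)                 # padded rows add zeros to word 0
    sums = embeddings.new_zeros(batch, num_words, hidden)
    sums.scatter_add_(1, index.unsqueeze(-1).expand(-1, -1, hidden),
                      embeddings * valid.unsqueeze(-1))
    counts = embeddings.new_zeros(batch, num_words)
    counts.scatter_add_(1, index, valid)
    return sums / counts.clamp(min=1).unsqueeze(-1)


def sparse_mean_pool(embeddings, word_ids, num_words):
    """Mean-pool subwords into words with one sparse-dense matrix product."""
    batch, seq_len, hidden = embeddings.shape
    b_idx, s_idx = (word_ids >= 0).nonzero(as_tuple=True)  # skip padded subwords
    rows = b_idx * num_words + word_ids[b_idx, s_idx]      # flattened word slot
    cols = b_idx * seq_len + s_idx                         # flattened subword slot
    counts = torch.bincount(rows, minlength=batch * num_words).clamp(min=1)
    values = (1.0 / counts[rows]).to(embeddings.dtype)     # averaging weights
    pool = torch.sparse_coo_tensor(torch.stack([rows, cols]), values,
                                   (batch * num_words, batch * seq_len))
    flat = embeddings.reshape(batch * seq_len, hidden)
    return torch.sparse.mm(pool, flat).reshape(batch, num_words, hidden)


torch.manual_seed(0)
emb = torch.randn(2, 5, 4)
word_ids = torch.tensor([[0, 0, 1, 2, 2], [0, 1, 1, -1, -1]])
a = scatter_mean_pool(emb, word_ids, num_words=3)
b = sparse_mean_pool(emb, word_ids, num_words=3)
print(torch.allclose(a, b, atol=1e-6))
```

In this sketch both variants leave word slots with no subwords at zero. Whether either variant is deterministic on a given CUDA setup is not checked here.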

The script benchmark.py compares them, but I think the approaches do not fully agree, since these are the results (GPU: NVIDIA 2060S | CPU: AMD 3700X):

scatter == sparse (allclose with atol=1e-07): False
scatter == inefficient (allclose with atol=1e-07): False
sparse == inefficient (allclose with atol=1e-07): True
scatter 23.960102558135986s
sparse 23.518492221832275s
inefficient 24.366436004638672s
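
A note on reading timings like these: CUDA kernels launch asynchronously, so wall-clock measurements are only meaningful if the device is synchronized before the clock is read. The helper below is a generic sketch, not the contents of benchmark.py; `time_fn` and its `sync` parameter are hypothetical names.

```python
import time


def time_fn(fn, repeats=10, sync=None):
    """Return the average wall-clock seconds per call of fn()."""
    if sync is not None:
        sync()  # drain pending device work before starting the clock
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if sync is not None:
        sync()  # wait for queued kernels before stopping the clock
    return (time.perf_counter() - start) / repeats
```

On a GPU run, `sync=torch.cuda.synchronize` could be passed in; on CPU the default `sync=None` is enough.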

I wrote the "inefficient" pooling method as a control, and the scatter method does not seem to match its results.
I suspect the mismatch comes from how the padded positions are handled, but I didn't investigate further.
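
One way to check the padding hypothesis is to measure the disagreement separately on real word slots and on padded ones. The helper below is a hypothetical sketch: `diff_by_region` and `word_mask` (True where a word slot holds a real word) are names introduced here, not part of the library.

```python
import torch


def diff_by_region(a, b, word_mask):
    """Return (max |a - b| over real words, max |a - b| over padded slots)."""
    diff = (a - b).abs().amax(dim=-1)  # largest gap per word slot
    real = diff[word_mask]
    pad = diff[~word_mask]
    real_max = real.max().item() if real.numel() else 0.0
    pad_max = pad.max().item() if pad.numel() else 0.0
    return real_max, pad_max


a = torch.zeros(2, 3, 4)
b = a.clone()
b[1, 2] += 0.5  # disagreement placed only in a padded slot
word_mask = torch.tensor([[True, True, True], [True, True, False]])
print(diff_by_region(a, b, word_mask))  # (0.0, 0.5)
```

If the second value is large while the first stays near zero, the methods disagree only on padded slots, which would not affect downstream results that mask padding.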

I could very well have implemented both the control and the sparse methods wrongly, so please double-check everything!

And thank you for the library, it is truly useful!



Riccorl · 2022-03-23T10:39:18Z

Thanks for the PR! I will compare them in some downstream task to check that everything works.

If everything is fine, I will clean up some stuff and release it (2.1 I guess).

Riccorl and others added 13 commits March 9, 2022 21:23

Removed unused functions

ba0dbb5

Merge remote-tracking branch 'origin/main'

c554be7

Merge pull request Riccorl#42 from Riccorl/dependabot/pip/torch-gte-1…

7ae4cc5

….7-and-lt-1.12 Update torch requirement from <1.11,>=1.7 to >=1.7,<1.12

Offset padding is -1 by default now

efc5a10

Merge remote-tracking branch 'origin/main'

a0c0a46

Update version

0b73cf6

word_max_batch_len and subword_max_batch_len updated on Tokenizer…

fa84413

… `__call__`

Update version

7a14075

Change the ModelInput.to method to rely on duck-typing

70db04a

Add compute_bpe_info parameter to Tokenizer.__call__

92b259a

Add new subword pooling strategies to TransformersEmbedder

d02bf8a

Add a simple benchmark for the subword pooling methods

8d12cd6

Riccorl merged commit fdf8d7b into Riccorl:dev Mar 23, 2022

Flegyas deleted the feature/sparse-pooling branch March 23, 2022 10:43
