
Faster attributes #1383

Merged: 5 commits into ContinualAI:master from faster_attributes, May 29, 2023

Conversation

AntonioCarta (Collaborator)

closes #1357

The high-level idea: previously, data attributes were part of the "dataset tree". However, we can manage them independently. Now an AvalancheDataset is a DatasetWithTransforms plus a set of attributes. Overall this is easier to manage, and it speeds up #1357, which is now on par with a normal concat.
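A rough toy sketch of this idea (not the actual Avalanche implementation; the `SimpleDataset` class and its `concat` method are hypothetical names for illustration): attributes live alongside the samples and are merged in one step on concatenation, instead of being recovered by walking a dataset tree.

```python
class SimpleDataset:
    """Hypothetical stand-in for AvalancheDataset: samples plus
    per-sample attributes, managed independently of each other."""

    def __init__(self, samples, attributes):
        # attributes maps a name (e.g. "targets") to a per-sample list
        self.samples = list(samples)
        self.attributes = {k: list(v) for k, v in attributes.items()}

    def concat(self, other):
        # Attributes are concatenated directly, independently of the
        # sample storage -- no recursive walk over a "dataset tree".
        merged_attrs = {
            k: self.attributes[k] + other.attributes[k]
            for k in self.attributes
        }
        return SimpleDataset(self.samples + other.samples, merged_attrs)


d1 = SimpleDataset([0, 1], {"targets": ["a", "b"]})
d2 = SimpleDataset([2], {"targets": ["c"]})
d3 = d1.concat(d2)
print(d3.samples)                # [0, 1, 2]
print(d3.attributes["targets"])  # ['a', 'b', 'c']
```

Because the merge touches each attribute list once, concatenation cost no longer grows with the depth of the dataset composition history.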

@lrzpellegrini as you can see, there are two failing tests:

  • top-k: I think this is a change in the API, so it no longer allows checking k=4
  • I have no idea what the equality check in custom_streams_name_and_length is supposed to do. Shouldn't we be checking equality of the samples?

@neuperc @AlbinSou this fixes the DER slowdown.

@ContinualAI-bot (Collaborator)

Oh no! It seems there are some PEP8 errors! 😕
Don't worry, you can fix them! 💪
Here's a report about the errors and where you can find them:

avalanche/benchmarks/utils/classification_dataset.py:103:81: E501 line too long (84 > 80 characters)
avalanche/benchmarks/utils/classification_dataset.py:107:81: E501 line too long (84 > 80 characters)
avalanche/benchmarks/utils/data.py:238:81: E501 line too long (88 > 80 characters)
avalanche/benchmarks/utils/data.py:241:81: E501 line too long (89 > 80 characters)
avalanche/benchmarks/utils/data.py:410:81: E501 line too long (82 > 80 characters)
avalanche/benchmarks/utils/data.py:417:81: E501 line too long (92 > 80 characters)
avalanche/benchmarks/utils/data.py:512:81: E501 line too long (91 > 80 characters)
avalanche/benchmarks/utils/utils.py:487:81: E501 line too long (88 > 80 characters)
8       E501 line too long (84 > 80 characters)

@lrzpellegrini (Collaborator)

Please correct me if I'm wrong. With this PR, AvalancheDataset will now have two main components/fields:

  • flat_data: in charge of managing transformations
  • data_attributes: in charge of data attributes

with each field managing its own subset operations. If this is the case, it seems reasonable.
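The two-component split described above can be sketched as follows. This is a hedged illustration under assumed names (`FlatData`, `DataAttribute`, and their `subset` methods are hypothetical simplifications, not Avalanche's real API): each component applies subset indices on its own, so the dataset wrapper only delegates.

```python
class FlatData:
    """Hypothetical sample container in charge of transformations."""

    def __init__(self, samples, transform=None):
        self.samples = list(samples)
        self.transform = transform or (lambda x: x)

    def subset(self, indices):
        # subsetting keeps the transform, selects the samples
        return FlatData([self.samples[i] for i in indices], self.transform)

    def __getitem__(self, i):
        return self.transform(self.samples[i])


class DataAttribute:
    """Hypothetical per-sample attribute (e.g. targets, task labels)."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)

    def subset(self, indices):
        return DataAttribute(self.name, [self.values[i] for i in indices])


class Dataset:
    """Wrapper holding both components; subset delegates to each one."""

    def __init__(self, flat_data, attrs):
        self.flat_data = flat_data
        self.attrs = attrs

    def subset(self, indices):
        return Dataset(self.flat_data.subset(indices),
                       [a.subset(indices) for a in self.attrs])


data = Dataset(FlatData([1, 2, 3, 4], transform=lambda x: x * 10),
               [DataAttribute("targets", [0, 0, 1, 1])])
sub = data.subset([0, 3])
print(sub.flat_data[1])     # 40
print(sub.attrs[0].values)  # [0, 1]
```

The design choice is that neither component needs to know about the other: transformations and attributes stay consistent under subsetting because both receive the same index list.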

As for the tests, custom_streams_name_and_length seems to fail correctly: an object should be equal to itself, especially if it is the very same object (with the same memory address). The equality check method may need a fix before merging.

@AntonioCarta (Collaborator, Author)

Yes, you are correct about the PR implementation.

> As for the tests, custom_streams_name_and_length seems to fail correctly: an object should be equal to itself, especially if it is the very same object (with the same memory address). The equality check method may need a fix before merging.

I had misunderstood the test: I broke the equality method. It's fixed now.

@AntonioCarta AntonioCarta merged commit f7aa170 into ContinualAI:master May 29, 2023
5 of 13 checks passed
@AntonioCarta AntonioCarta deleted the faster_attributes branch June 1, 2023 08:38
Successfully merging this pull request may close these issues.

concat_datasets is slow when used for grouped ReservoirSampling buffers with attributed dataset
3 participants