
add CTrL integration #561

Merged (16 commits into ContinualAI:master on Jan 3, 2022)

Conversation

TomVeniat
Contributor

Draft for #377

@vlomonaco marked this pull request as draft April 23, 2021 16:18
@coveralls

coveralls commented Apr 23, 2021

Pull Request Test Coverage Report for Build 1649217012

  • 84 of 87 (96.55%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.1%) to 79.913%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
avalanche/benchmarks/classic/ctrl.py | 42 | 43 | 97.67%
tests/test_ctrl.py | 41 | 43 | 95.35%

Totals
Change from base Build 1648858210: +0.1%
Covered Lines: 11533
Relevant Lines: 14432

💛 - Coveralls

@vlomonaco
Member

vlomonaco commented Apr 27, 2021

Hi @TomVeniat! Thanks for contributing to Avalanche! Let us know when we can start to review this and please add some comments in the PR description so that we can understand the logic behind it!

@TomVeniat
Contributor Author

Hi @vlomonaco,
I think we're almost done. Where could I add a demo of how to use it?
A notebook would be great, but it seems notebooks are only used for the "from zero to hero" series and all other examples are scripts. Is that correct?

@vlomonaco
Member

vlomonaco commented Apr 27, 2021

Hi Tom, you can add a comprehensive example of its usage in the examples directory. Also, please add unit tests and docstrings, and try not only to add the "classic" CTrL benchmark but also to expose its "internal" datasets, so they can be re-used with the Avalanche benchmark generators. Thanks!

@AntonioCarta
Collaborator

Maybe a better place for a notebook would be outside of avalanche's documentation, in our colab repository. We have a collection of notebooks and continual learning examples there.

What do you think?

@TomVeniat
Contributor Author

I think that only exposing the "classic" streams is a good starting point, with a simple script in the examples directory (similar to https://github.com/ContinualAI/avalanche/blob/master/examples/all_mnist.py), just to show how to get started.
Once this is done I'll work on exposing the internals and making a notebook to demonstrate how to create new streams, as it will require a bit more work on the CTrL side. Does this sound good to you?

About the testing, is there a way to add some information about each experience in a dataset_benchmark? I'd like to be able to verify that the same category appears in different tasks, so adding the class names in each experience could be useful.
Another use case would be task descriptors; is it possible to use them in the current version of Avalanche?
Thanks for your help!

@vlomonaco
Member

vlomonaco commented May 3, 2021

> I think that only exposing the "classic" streams is a good starting point, with a simple script in the examples directory (similar to https://github.com/ContinualAI/avalanche/blob/master/examples/all_mnist.py), just to show how to get started.
> Once this is done I'll work on exposing the internals and making a notebook to demonstrate how to create new streams, as it will require a bit more work on the CTrL side. Does this sound good to you?

Yes, it makes sense, thanks! :)

> About the testing, is there a way to add some information about each experience in a dataset_benchmark? I'd like to be able to verify that the same category appears in different tasks, so adding the class names in each experience could be useful.
> Another use case would be task descriptors, is it possible to use them in the current version of Avalanche?

You can check the "classes_in_experience" attribute, which is populated automatically. As for task descriptors, unfortunately I'm not sure we currently support them, as task_labels are only integers across Avalanche at the moment (can you confirm this, @lrzpellegrini?).
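
For example, something along these lines should let you check which classes appear in each experience (untested sketch; train_datasets / test_datasets stand in for the per-experience datasets you already build):

from avalanche.benchmarks.generators import dataset_benchmark


def print_classes_per_experience(train_datasets, test_datasets):
    # Untested sketch: classes_in_this_experience is filled in automatically
    # for benchmarks built with dataset_benchmark.
    benchmark = dataset_benchmark(train_datasets, test_datasets)
    for exp in benchmark.train_stream:
        print(exp.current_experience, exp.classes_in_this_experience)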

@lrzpellegrini
Collaborator

I'm afraid that AvalancheDataset currently supports only int task_labels. Having a task descriptor field in AvalancheDataset (as a separate task_descriptor field?) is a good idea we have already thought of (we also have a pending feature request, #371), but that feature has remained low on the priority list... It shouldn't be too difficult to implement in its plain version, where we have one descriptor per pattern (just like the targets and task_labels fields) and the descriptors are of arbitrary types (Tensors, ndarrays, scalars).

@TomVeniat you need Tensor descriptors, am I correct?

@vlomonaco mentioned this pull request May 18, 2021
@vlomonaco
Member

@TomVeniat any update on this?

@TomVeniat
Contributor Author

Hi @vlomonaco,
Yes, the integration of existing streams, with doc and tests is ready.
I also have a demo script that trains a simple CNN on a whole stream using the Naive strategy. I started working on the task-level cross-validation but got caught up with other things the last few weeks, sorry about that!
The PR will be ready for review as soon as this part is done, I think it is important to provide an example of the correct way to evaluate any model on CTrL.
You can expect an update this week, would that work for you?
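
The demo script is roughly along these lines (untested sketch; the stream name and class count are placeholders, and the rest follows the standard Avalanche training loop):

import torch
from torch.nn import CrossEntropyLoss
from torch.optim import SGD

from avalanche.benchmarks.classic import CTrL
from avalanche.models import SimpleCNN
from avalanche.training.strategies import Naive

# Untested sketch: "s_minus" and num_classes are placeholders.
benchmark = CTrL("s_minus", seed=1)
model = SimpleCNN(num_classes=10)

strategy = Naive(
    model,
    SGD(model.parameters(), lr=0.01),
    CrossEntropyLoss(),
    train_mb_size=32,
    train_epochs=1,
    device="cuda" if torch.cuda.is_available() else "cpu",
)

# Train on each experience in order, evaluating on the whole test stream.
for experience in benchmark.train_stream:
    strategy.train(experience)
    strategy.eval(benchmark.test_stream)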

@vlomonaco
Member

Yes of course, thanks @TomVeniat! :) Let us know if you need any help!

@vlomonaco added the "Benchmarks" label (Related to the Benchmarks module) Jun 6, 2021
@ContinualAI-bot
Collaborator

Oh no! It seems there are some PEP8 errors! 😕
Don't worry, you can fix them! 💪
Here's a report about the errors and where you can find them:

examples/simple_ctrl.py:40:1: E302 expected 2 blank lines, found 1
examples/simple_ctrl.py:68:5: E122 continuation line missing indentation or outdented
examples/simple_ctrl.py:80:5: E303 too many blank lines (2)
examples/simple_ctrl.py:85:81: E501 line too long (104 > 80 characters)
examples/simple_ctrl.py:102:81: E501 line too long (91 > 80 characters)
examples/simple_ctrl.py:104:1: E305 expected 2 blank lines after class or function definition, found 1
1       E122 continuation line missing indentation or outdented
1       E302 expected 2 blank lines, found 1
1       E303 too many blank lines (2)
1       E305 expected 2 blank lines after class or function definition, found 1
2       E501 line too long (104 > 80 characters)

@lrzpellegrini
Collaborator

@lrzpellegrini left a comment

Hi Tom, apart from the comments I left in ctrl.py and simple_ctrl.py, can you add ctrl in the __init__.py file in avalanche/benchmarks/classic (following the alphabetical order of modules)?

avalanche/benchmarks/classic/ctrl.py (comments resolved)
examples/simple_ctrl.py (comments resolved)
@TomVeniat
Contributor Author

Thank you @lrzpellegrini for the review, I'll make the suggested changes.

The only thing missing for the full integration of CTrL is a way to handle longer streams. The current integration loads all experiences as AvalancheTensorDatasets and then creates the benchmark using dataset_benchmark.
The problem is that the long stream of CTrL doesn't necessarily fit in memory, preventing the usage of this method.
Is there a way to lazily generate the Experiences in a benchmark?
An alternative solution would be to write each experience generated with CTrL to disk and only then create the benchmark with a list of paths. Is this possible in the current version? If it isn't, do you think that writing every sample to disk and then using paths_benchmark would work?

I will also open a new issue for the task-level cross-validation; the solution I had in mind isn't that straightforward to implement, so I'd like your input on it to make sure it fits nicely with the framework.

@lrzpellegrini
Collaborator

> Is there a way to lazily generate the Experiences in a benchmark?

Alas, in the current state this is not possible. However, this is a feature we already have in our backlog (see #600).

> An alternative solution would be to write each experience generated with CTrL to disk and only then create the benchmark with a list of paths. Is this possible in the current version? If it isn't, do you think that writing every sample to disk and then using paths_benchmark would work?

Yes, this is already doable. Have a look at:

def create_generic_benchmark_from_paths(
(which is the alias for paths_benchmark). That helper accepts a list of files for each experience and it also supports custom streams (like the validation one). In that case, the best solution is to write any generated data to a temporary folder or some other kind of "working directory".
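
A minimal, untested sketch of what I mean (paths, labels and task labels below are placeholders; each experience is described by a list of (file_path, class_label) entries):

from avalanche.benchmarks.generators import paths_benchmark

# Untested sketch: placeholder paths and labels, one list of files per experience.
train_experiences = [
    [("/tmp/ctrl_wd/exp0/0.png", 0), ("/tmp/ctrl_wd/exp0/1.png", 1)],
    [("/tmp/ctrl_wd/exp1/0.png", 0), ("/tmp/ctrl_wd/exp1/1.png", 1)],
]
test_experiences = [
    [("/tmp/ctrl_wd/exp0_test/0.png", 0)],
    [("/tmp/ctrl_wd/exp1_test/0.png", 1)],
]

benchmark = paths_benchmark(
    train_experiences,
    test_experiences,
    task_labels=[0, 1],  # one task label per experience
    # Custom streams (e.g. validation) can be passed through the dedicated argument.
)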

Avalanche doesn't currently support loading tensor(s) describing a whole experience directly from disk, but I think I can easily implement that if needed!

> I will also open a new issue for the task-level cross-validation; the solution I had in mind isn't that straightforward to implement, so I'd like your input on it to make sure it fits nicely with the framework.

Yes, a separate issue seems the best solution. For that feature we'll need some support from @AntonioCarta on the strategy-side.

@TomVeniat
Contributor Author

TomVeniat commented Jun 15, 2021

> Yes, this is already doable. Have a look at:
>
> def create_generic_benchmark_from_paths(
>
> (which is the alias for paths_benchmark). That helper accepts a list of files for each experience and it also supports custom streams (like the validation one). In that case, the best solution is to write any generated data to a temporary folder or some other kind of "working directory".

I started the implementation following this strategy, but it seems that there is no way to give the correct transform to each experience. The train_transform and eval_transform arguments of paths_benchmark only accept a global transform for the whole benchmark and can't work with lists of transforms (one per experience) in the current version. Am I overlooking something?
I think the best alternative would be to create each AvalancheDataset manually (as done here) with the correct transform and then create the benchmark using dataset_benchmark instead of paths_benchmark.
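
Something like this per experience (untested sketch; the transform_groups format is my assumption from the AvalancheDataset docstring):

from avalanche.benchmarks.generators import dataset_benchmark
from avalanche.benchmarks.utils import AvalancheDataset


def wrap_experience(raw_dataset, train_transform, eval_transform):
    # Untested sketch: one AvalancheDataset per experience, each carrying its
    # own train/eval transforms as (transform, target_transform) pairs.
    return AvalancheDataset(
        raw_dataset,
        transform_groups={
            "train": (train_transform, None),
            "eval": (eval_transform, None),
        },
    )

# train_sets / test_sets: one wrapped dataset per experience, then:
# benchmark = dataset_benchmark(train_sets, test_sets)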

@lrzpellegrini
Collaborator

Yes, if a different transformation is needed for each experience, then dataset_benchmark is the best solution!

@lrzpellegrini
Collaborator

Hi Tom, I recently implemented minimal support for lazily generated streams, see #671.

That approach may make things easier on your side! There is also some memory optimization tuning that can be used to prevent memory usage from exploding by dropping references to previous experiences (I still have to document it decently). That PR still has to be merged, but it should be a matter of hours.

@TomVeniat
Contributor Author

Thanks for the update! I should have posted here, but I went the write-to-disk + dataset_benchmark route for the long stream in this commit.
I think this PR is ready for a review; the last thing to do will be to remove the EarlyStopping feature and replace it with the official one once #670 is merged.

@ContinualAI-bot
Collaborator

Oh no! It seems there are some PEP8 errors! 😕
Don't worry, you can fix them! 💪
Here's a report about the errors and where you can find them:

examples/simple_ctrl.py:127:72: E502 the backslash is redundant between brackets
1       E502 the backslash is redundant between brackets

@ContinualAI-bot
Collaborator

Oh no! It seems there are some PEP8 errors! 😕
Don't worry, you can fix them! 💪
Here's a report about the errors and where you can find them:

tests/test_ctrl.py:58:27: E241 multiple spaces after ','
1       E241 multiple spaces after ','

@TomVeniat
Contributor Author

It seems that the memory consumption is still too high for the long stream; I will need to make some changes on the ctrl package side to make it more efficient (I think I keep a reference to all tasks generated so far during the generation process).
In the meantime, I added a parameter controlling the length of the long stream for more flexibility, and I use it in the tests to make them run faster and relieve the memory pressure.
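
In the tests it ends up looking roughly like this (untested sketch; both the stream name and the parameter name below are placeholders for the new argument):

from avalanche.benchmarks.classic import CTrL

# Placeholder names: cap the long stream at a handful of tasks so the tests
# stay fast and memory-friendly.
benchmark = CTrL("s_long", seed=1, n_tasks=5)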

@TomVeniat marked this pull request as ready for review June 26, 2021 09:54
@TomVeniat
Contributor Author

@lrzpellegrini Any idea of what happened with the python 3.9 tests setup?

@ashok-arjun
Contributor

@TomVeniat This seems to be the error:

Traceback (most recent call last):
  File "/__w/avalanche/avalanche/tests/test_ctrl.py", line 63, in test_determinism
    bench_1 = CTrL(stream, seed=1)
  File "/__w/avalanche/avalanche/avalanche/benchmarks/classic/ctrl.py", line 55, in CTrL
    stream = ctrl.get_stream(stream_name, seed)
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/ctrl/streams/__init__.py", line 144, in get_stream
    return init_component(default_rng(seed), **config)['task_gen']
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/ctrl/streams/__init__.py", line 130, in init_component
    v = init_component(_rnd=_rnd, **v)
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/ctrl/streams/__init__.py", line 130, in init_component
    v = init_component(_rnd=_rnd, **v)
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/ctrl/streams/__init__.py", line 130, in init_component
    v = init_component(_rnd=_rnd, **v)
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/ctrl/streams/__init__.py", line 134, in init_component
    return comp_class(seed=_rnd.integers(0, 1e9), **kwargs)
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/ctrl/instances/image_dataset_tree.py", line 120, in __init__
    super().__init__(*args, **kwargs)
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/ctrl/concepts/concept_tree.py", line 57, in __init__
    super().__init__(*args, **kwargs)
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/ctrl/commons/tree.py", line 27, in __init__
    self.root_node = self.build_tree()
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/ctrl/instances/image_dataset_tree.py", line 129, in build_tree
    self.trainset = ds_class(split='train', **common_params)
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/ctrl/instances/DTD.py", line 41, in __init__
    self._download_and_prepare(root, split, img_size)
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/ctrl/instances/DTD.py", line 47, in _download_and_prepare
    download_and_extract_archive(self.url, download_root=root)
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/torchvision/datasets/utils.py", line 417, in download_and_extract_archive
    extract_archive(archive, extract_root, remove_finished)
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/torchvision/datasets/utils.py", line 383, in extract_archive
    suffix, archive_type, compression = _detect_file_type(from_path)
  File "/opt/conda/envs/avalanche-env/lib/python3.9/site-packages/torchvision/datasets/utils.py", line 305, in _detect_file_type
    raise RuntimeError(
RuntimeError: Archive type and compression detection only works for 1 or 2 suffixes. Got 4 instead.

This comes from the ctrl-benchmark package: its call to download_and_extract_archive no longer works with recent torchvision versions, which prevents downloading the DTD dataset.
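
The failure is torchvision's archive-type detection tripping over the file name, which has more than two suffixes (a minimal illustration, assuming the archive is the standard dtd-r1.0.1.tar.gz release):

from pathlib import Path

# Newer torchvision rejects archives whose name has more than 2 suffixes;
# the usual DTD release file (name assumed here) has 4 of them.
print(Path("dtd-r1.0.1.tar.gz").suffixes)  # ['.0', '.1', '.tar', '.gz']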

I got the same error when running on python 3.9 on multiple machines.

I have a fix that I am using locally; to integrate it, you need to merge the PR I raised at facebookresearch/CTrLBenchmark.

@vlomonaco
Member

Thanks @ashok-arjun for this! :)

@TomVeniat
Contributor Author

Thanks @ashok-arjun, your PR is live! :)
@vlomonaco, is there anything I missed to make the test pipeline pick up the changes made in environment.yml? The failures are due to ctrl-benchmark not being installed when the tests are run.

@ashok-arjun
Contributor

@lrzpellegrini Any idea why these tests aren't passing? I am not able to figure this out.

@AndreaCossu
Collaborator

Hi @ashok-arjun, is your repo in sync with avalanche master branch? If not, can you please update it (there may be conflicts to be solved)?

@ashok-arjun
Contributor

@AndreaCossu Actually it is @TomVeniat's branch that must be checked for conflicts, since he made the PR :)

@ashok-arjun
Contributor

The error related to the unit test failures is ModuleNotFoundError: No module named 'ctrl'.

@TomVeniat
Contributor Author

Thanks @ashok-arjun for looking into this issue!

> [...] Is there anything I missed to make the test pipeline pick up the changes made in environment.yml? The failures are due to ctrl-benchmark not being installed when the tests are run.

It seems the error remains the same: the updates to environment-dev.yml and environment.yml are not taken into account when running the tests.
Is there anything I can do to help with that?

@AndreaCossu
Collaborator

I added the ctrl-benchmark to the docker test. Thanks @TomVeniat and @ashok-arjun !
If @lrzpellegrini is ok with the changes, we can merge this.

@lrzpellegrini
Collaborator

Everything seems in order to me!

@AndreaCossu merged commit 21589ea into ContinualAI:master on Jan 3, 2022
@AndreaCossu linked an issue on Jan 3, 2022 that may be closed by this pull request
Labels: Benchmarks (Related to the Benchmarks module)
Linked issue: Integrating the CtRL Benchmark
8 participants