Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Several small ica bugs (+ first fixes) #5967

Open
jeythekey opened this issue Feb 20, 2019 · 5 comments
Open

[BUG] Several small ica bugs (+ first fixes) #5967

jeythekey opened this issue Feb 20, 2019 · 5 comments

Comments

@jeythekey
Copy link
Contributor

I encountered a few minor bugs in ica:

Already fixed

  • ica._band_pass_filter (called (only) in ica.score_sources to filter sources and targets) uses the sfreq used when fitting the ica instead of the actual sfreq of the sources/targets (i.e. of raw).
  • ica._detect_artifacts is sorting scores in ascending rather than descending order for integer criterion arguments, contrary to what is promised in the docstring. This only affects the experimental detect_artifacts function.

The above 2 issues should be fixed in a first PR.

There are a few other things I would like to discuss first (should that be another issue, then?):

Checks for ica.exclude

Problem

The ica.exclude attribute is supposed to be a list of integers (while the exclude argument is supported to be "array-like"). However, neither is asserted and when providing an np.array instead, it is silently converted into a list.
BUT: Before that, strange things happen, if manually setting the ica.exclude attribute to a np.array (and exclude=None, otherwise both would have to have the same number of elements in order to not erroring): Random permutations of this array get added up numerically (instead of being appended), resulting in completely random components being removed, and for each call a different set of components (because of set().

Suggestions/Options

  1. The main culprit is a redundancy, which I think should be removed:
    a) This is called in the beginning of each apply...-method (by exclude = self._check_exclude(exclude)):
def _check_exclude(self, exclude):
    if exclude is None:
        return list(set(self.exclude))
    else:
        return list(set(self.exclude + exclude))
        # should instead be:
        # return list(set(list(self.exclude) + list(exclude))) # (see 2.b))

Now, self.exclude got permuted and copied to exclude.

b) Later, all the apply...methods call this:

def _pick_sources(self, data, include, exclude):
    """Aux function."""
# The following lines should be removed, since all the calling _apply_... functions do that already:
    if exclude is None:
        exclude = self.exclude
     else:
        exclude = list(set(self.exclude + list(exclude))) 
# ... or should at least be replaced with:
# exclude = self._check_exclude(exclude)

Here, the copied and permuted version in exclude (a list) gets added numerically to self.exclude (a np.array) ...

  1. Check
    a) Either assert self.exclude is a list and throw error or
    b) silently convert it to a list, as suggested above under 1 a)

--> What would be the preferred way?

Side note:

In contrast to exclude, picks has to be a np.array and bads has to be a list of strings. I think this has the potential to cause confusion not only for me. So, for me that would speak for option 2b) and to officially supporting both exclude and ica.exclude being both.

(Puhh, that was long ... ;))

Modify in place

(not a bug)
As I understood from some discussions, it is the general practice to modify large objects in place and providing a copy() method to the user.
I observed, that ica.get_sources always copies the (potentially big ...) data.
Should that be addressed, i.e. should the raw data in the inst object be replaced instead of returning a separate sources object?

OK, I guess I should stop here for this issue ;)

@larsoner
Copy link
Member

Most places we treat ndarray and list interchangably, so I like the first solution, i.e., just always use the _check_exclude and inside that, do whatever logic is necessary to get the correct set. list(set(list(x) + list(y))) seems fine.

@jeythekey
Copy link
Contributor Author

I just added a small convenience PR related to ica plotting. It should be harmless, since it only passes arguments through from ica.plot_components to ica.plot_properties.

@agramfort
Copy link
Member

agramfort commented Feb 23, 2019 via email

@jeythekey
Copy link
Contributor Author

@agramfort The ica.exclude handling is already implemented as you both suggested in this commit

Concerning the 2nd point: Yes, if I understood you right. In the moment, if I want to get the sources for a large raw instance, it doubles the data in memory. I can obviously say raw = ica.get_sources(raw), but it would still copy the data temporarily and this is not the way other ica-functions do it (e.g. score_sources). If I understand it right, in ica._transform the data is temporarily duplicated even 2 times.
But I cannot judge if making everything in-place would actually benefit memory efficiency and would be worth the effort, since maybe data gets copied anyway inside the np.dot functions.

@agramfort
Copy link
Member

agramfort commented Feb 23, 2019 via email

mmagnuski pushed a commit that referenced this issue May 3, 2019
* Fix sorting in descending order in ica._detect_artifacts (was sortin in ascending order).

* Fix sfreq selection in ica._band_pass_filter (called (only) in ica.score_sources to filter sources and targets) to use the raw sfreq rather than the one used for fitting the ica.

* Incorporate missed suggestion from #5620 + summary of discussion

* fix PEP8

* fix typo in doc

* Allow both exclude and ica.exclude to be array_like by always converting to list first.

* ica.detect_artifacts: Corrected doc according to code and changed some default values to yield the same compos as before

* Added test for checking descending sorting in detect_artifacts for integer criterion

* Stylistic improvements thanks to jona-sassenhagen

* Fix test for score-sorting

* pep8 and whats_new.rst

* Remove links to privat funcs in whats_new.rst

* fix links in whats_new

* STY: No end period

* Futher improvements thanks to mmagnuski and a few further corrections in the docstrings of run_ica and detect_artifacts

* Added warning for new sorting-direction in detect_artifacts; added whats_new from #6247

* improve bug section; remove warning again; typo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants