LinAlgError: singular matrix #1502

Open
Kzernobog opened this issue Jul 18, 2018 · 11 comments

@Kzernobog
commented Jul 18, 2018

A 'LinAlgError: singular matrix' error is raised when calling the pairplot() function.

@mwaskom

Owner

commented Jul 19, 2018

Can't help without a reproducible example, sorry.

@bicycle1885

commented Jul 27, 2018

Hi, this is a (simplified) case I encountered while working on seaborn.pairplot.

data: data.1000.txt

singular.py:

import pandas
import seaborn
data = pandas.read_table("data.1000.txt", index_col="cell")
seaborn.pairplot(data.drop(columns="cluster").iloc[:,0:6], hue="batch")

commands:

~/w/experiments $ python3 --version
Python 3.6.6
~/w/experiments $ pip3 show seaborn
Name: seaborn
Version: 0.9.0
Summary: seaborn: statistical data visualization
Home-page: https://seaborn.pydata.org
Author: Michael Waskom
Author-email: mwaskom@nyu.edu
License: BSD (3-clause)
Location: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages
Requires: numpy, scipy, pandas, matplotlib
Required-by:
~/w/experiments $ python3 singular.py
Traceback (most recent call last):
  File "singular.py", line 4, in <module>
    seaborn.pairplot(data.drop(columns="cluster").iloc[:,0:6], hue="batch")
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/seaborn/axisgrid.py", line 2111, in pairplot
    grid.map_diag(kdeplot, **diag_kws)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/seaborn/axisgrid.py", line 1399, in map_diag
    func(data_k, label=label_k, color=color, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/seaborn/distributions.py", line 691, in kdeplot
    cumulative=cumulative, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/seaborn/distributions.py", line 294, in _univariate_kdeplot
    x, y = _scipy_univariate_kde(data, bw, gridsize, cut, clip)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/seaborn/distributions.py", line 366, in _scipy_univariate_kde
    kde = stats.gaussian_kde(data, bw_method=bw)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/scipy/stats/kde.py", line 172, in __init__
    self.set_bandwidth(bw_method=bw_method)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/scipy/stats/kde.py", line 499, in set_bandwidth
    self._compute_covariance()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/scipy/stats/kde.py", line 510, in _compute_covariance
    self._data_inv_cov = linalg.inv(self._data_covariance)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/scipy/linalg/basic.py", line 975, in inv
    raise LinAlgError("singular matrix")
numpy.linalg.linalg.LinAlgError: singular matrix
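
The traceback ends in `linalg.inv(self._data_covariance)`, so one plausible reading (an assumption, not something confirmed in this thread) is that the numeric `batch` column is itself drawn on the diagonal, once per hue level. Within a single level every `batch` value is identical, so its sample variance is zero and the 1×1 covariance matrix has no inverse. A minimal standard-library sketch with made-up rows (the values and the `gene_a` name are hypothetical, not taken from data.1000.txt):

```python
from collections import defaultdict
from statistics import variance

# Hypothetical toy rows standing in for the real data: (batch, gene_a)
rows = [(1, 0.3), (1, 1.7), (1, 0.9), (2, 2.4), (2, 0.1), (2, 1.2)]

# Split the rows by hue level, as a hue-aware diagonal plot would.
groups = defaultdict(list)
for batch, gene_a in rows:
    groups[batch].append((batch, gene_a))


def column_variances(group):
    """Return the sample variance of each column within one hue level."""
    batch_values = [b for b, _ in group]
    gene_values = [g for _, g in group]
    return variance(batch_values), variance(gene_values)


variances = {level: column_variances(g) for level, g in groups.items()}
for level, (var_batch, var_gene) in variances.items():
    # var_batch is zero for every level; var_gene is not.
    print(level, var_batch, var_gene)
```

If this reading is right, only the hue column's own diagonal panel is affected; the other variables still have nonzero variance within each level.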

bicycle1885 added a commit to bicycle1885/seaborn that referenced this issue Jul 30, 2018

@bicycle1885 referenced a pull request that will close this issue Jul 30, 2018: remove hue from pairplot variables (fix #1502) #1525 (Open)

@zhoudaxia233

commented Sep 26, 2018

Hello, I ran into the same error. Do you know how I can make it work without removing the hue parameter?
@bicycle1885 @mwaskom

@mwaskom

Owner

commented Sep 26, 2018

My guess is that it's getting raised when trying to do a KDE on a single observation. You could use a histogram on the diagonal, instead of a kde, which will probably be more robust.
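
A minimal sketch of that workaround, assuming the `diag_kind` parameter of `seaborn.pairplot` (the helper name `pairplot_with_hist_diagonal` is made up for illustration):

```python
def pairplot_with_hist_diagonal(df, hue):
    """Draw a pair plot with histograms, not KDEs, on the diagonal."""
    # Imported lazily so that defining the helper does not require seaborn.
    import seaborn as sns

    # diag_kind="hist" skips the KDE fit that raises the LinAlgError.
    return sns.pairplot(df, hue=hue, diag_kind="hist")
```

With the data from the earlier example this would be called as `pairplot_with_hist_diagonal(data.drop(columns="cluster").iloc[:, 0:6], hue="batch")`.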

@zhoudaxia233

commented Sep 26, 2018

> My guess is that it's getting raised when trying to do a KDE on a single observation. You could use a histogram on the diagonal, instead of a kde, which will probably be more robust.

Now it works! Thank you very much mwaskom 😄

@100518832

commented Sep 26, 2018

I have the same error.

@kevinglasson

commented Oct 9, 2018

I'm having the same problem. I had to install conda to get it to work without the LinAlgError. I'm on a Mac too.

@cemanughian

commented Nov 7, 2018

I'm having this issue with kdeplot too.

@paulbrodersen

commented Nov 9, 2018

I think the kdeplot fails when any of the variables is integer (or discrete with large bin sizes). The average k-nearest distance is then 0 (for not too large k), which then screws over the kernel width estimation of the KDE.
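
For comparison, the traceback earlier in this thread goes through `scipy.stats.gaussian_kde`, whose default bandwidth (Scott's rule) scales the sample covariance rather than using nearest-neighbour distances. Under that assumption, a rough standard-library sketch suggests that integer data alone keeps the kernel variance positive, and that it only collapses to zero when all values are identical (the sample lists and the helper name `scott_kernel_variance` are made up for illustration):

```python
from statistics import variance


def scott_kernel_variance(values):
    """Kernel variance under Scott's rule for 1-D data: var * n**(-2/5)."""
    n = len(values)
    factor = n ** (-1.0 / 5.0)  # Scott's factor n**(-1/(d+4)) with d = 1
    return variance(values) * factor ** 2


integer_sample = [1, 2, 2, 3, 3, 3, 4]  # discrete, but not constant
constant_sample = [2, 2, 2, 2]          # every observation identical

print(scott_kernel_variance(integer_sample))   # positive
print(scott_kernel_variance(constant_sample))  # zero
```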

@emanuelef

commented Nov 30, 2018

Similar issue, but only when using Python 3; Python 2 with the same data works fine.

@mjdv

commented Dec 18, 2018

Same problem. Another simple reproducible example uses the "Eighth-Grade Pupils in the Netherlands" data set, as follows.

import pandas as pd
import seaborn as sns

df = pd.read_csv("http://vincentarelbundock.github.io/Rdatasets/csv/MASS/nlschools.csv", index_col=0)
sns.pairplot(df, hue="COMB")

However, it does work if you remove the COMB column from the data to be plotted.
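
One way to keep the hue colouring while leaving the hue column out of the plotted variables is the `vars` parameter of `seaborn.pairplot`. A minimal sketch, with hypothetical helper names and assuming only numeric columns should be plotted:

```python
def vars_without_hue(columns, hue):
    """Return the column names to plot, leaving out the hue column."""
    return [name for name in columns if name != hue]


def pairplot_excluding_hue(df, hue):
    """Pair plot coloured by `hue`, without plotting the hue column itself."""
    # Imported lazily so that defining the helper does not require seaborn.
    import seaborn as sns

    numeric_columns = df.select_dtypes(include="number").columns
    return sns.pairplot(df, hue=hue, vars=vars_without_hue(numeric_columns, hue))
```

For the example above this would be `pairplot_excluding_hue(df, hue="COMB")`.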

Repository owner locked and limited conversation to collaborators Dec 18, 2018
