LinAlgError: singular matrix #1502

Closed
kamiradi opened this issue Jul 18, 2018 · 12 comments · Fixed by #1525

Comments

@kamiradi

Calling the pairplot() function raises 'LinAlgError: singular matrix'.

@mwaskom
Owner

mwaskom commented Jul 19, 2018

Can't help without a reproducible example, sorry.

@bicycle1885
Contributor

bicycle1885 commented Jul 27, 2018

Hi, this is a (simplified) case I encountered while working on seaborn.pairplot.

data: data.1000.txt

singular.py:

import pandas
import seaborn
data = pandas.read_table("data.1000.txt", index_col="cell")
seaborn.pairplot(data.drop(columns="cluster").iloc[:,0:6], hue="batch")

commands:

~/w/experiments $ python3 --version
Python 3.6.6
~/w/experiments $ pip3 show seaborn
Name: seaborn
Version: 0.9.0
Summary: seaborn: statistical data visualization
Home-page: https://seaborn.pydata.org
Author: Michael Waskom
Author-email: mwaskom@nyu.edu
License: BSD (3-clause)
Location: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages
Requires: numpy, scipy, pandas, matplotlib
Required-by:
~/w/experiments $ python3 singular.py
Traceback (most recent call last):
  File "singular.py", line 4, in <module>
    seaborn.pairplot(data.drop(columns="cluster").iloc[:,0:6], hue="batch")
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/seaborn/axisgrid.py", line 2111, in pairplot
    grid.map_diag(kdeplot, **diag_kws)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/seaborn/axisgrid.py", line 1399, in map_diag
    func(data_k, label=label_k, color=color, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/seaborn/distributions.py", line 691, in kdeplot
    cumulative=cumulative, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/seaborn/distributions.py", line 294, in _univariate_kdeplot
    x, y = _scipy_univariate_kde(data, bw, gridsize, cut, clip)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/seaborn/distributions.py", line 366, in _scipy_univariate_kde
    kde = stats.gaussian_kde(data, bw_method=bw)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/scipy/stats/kde.py", line 172, in __init__
    self.set_bandwidth(bw_method=bw_method)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/scipy/stats/kde.py", line 499, in set_bandwidth
    self._compute_covariance()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/scipy/stats/kde.py", line 510, in _compute_covariance
    self._data_inv_cov = linalg.inv(self._data_covariance)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/scipy/linalg/basic.py", line 975, in inv
    raise LinAlgError("singular matrix")
numpy.linalg.linalg.LinAlgError: singular matrix

@zhoudaxia233

zhoudaxia233 commented Sep 26, 2018

Hello, I ran into the same problem. Do you know how I can make it work without removing the hue parameter?
@bicycle1885 @mwaskom

@mwaskom
Owner

mwaskom commented Sep 26, 2018

My guess is that it's getting raised when trying to do a KDE on a single observation. You could use a histogram on the diagonal, instead of a kde, which will probably be more robust.

@zhoudaxia233

My guess is that it's getting raised when trying to do a KDE on a single observation. You could use a histogram on the diagonal, instead of a kde, which will probably be more robust.

Now it works! Thank you very much mwaskom 😄

@100518832

I have the same error

@elidhu

elidhu commented Oct 9, 2018

I'm having the same problem. I had to install conda to get it to work without the LinAlgError. I'm on a Mac too.

@cemanughian

I'm having this issue with kdeplot too.

@paulbrodersen

I think the kdeplot fails when any of the variables is integer (or discrete with large bin sizes). The average k-nearest distance is then 0 (for not too large k), which then screws over the kernel width estimation of the KDE.
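
For illustration only, here is one way a covariance can end up singular, sketched with NumPy alone. The values are made up, and this is not scipy's actual code; it only mimics the inversion step that appears at the bottom of the traceback above (linalg.inv on the data covariance). The assumption is a group in which every observation has the same value:

```python
import numpy as np

# Hypothetical values for one hue group where every observation is identical.
group = np.array([3.0, 3.0, 3.0, 3.0])

# Sample covariance of the 1-D data, shaped as a 1x1 matrix.
cov = np.atleast_2d(np.cov(group))

# A zero variance makes the 1x1 covariance matrix non-invertible.
try:
    np.linalg.inv(cov)
    outcome = "inverted"
except np.linalg.LinAlgError:
    outcome = "singular"

print(cov, outcome)
```

Whether integer-valued data triggers the same failure depends on how the values are spread within each group; the sketch only covers the zero-variance case.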

@emanuelef

Similar issue, but only when using Python 3; Python 2 works fine with the same data.

@mjdv

mjdv commented Dec 18, 2018

Same problem. Another easy working example is using the "Eighth-Grade Pupils in the Netherlands" data set as follows.

import pandas as pd
import seaborn as sns

df = pd.read_csv("http://vincentarelbundock.github.io/Rdatasets/csv/MASS/nlschools.csv", index_col=0)
sns.pairplot(df, hue="COMB")

However, it does work if you remove the COMB column from the data to be plotted.
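
A rough way to check for this kind of situation, assuming a small stand-in frame rather than the real data set (the values below are invented; only the COMB column name is borrowed from the example above). It lists every column that takes a single value within some hue level, which always includes the hue column itself:

```python
import pandas as pd

# Hypothetical toy frame; "COMB" plays the role of the hue column.
df = pd.DataFrame({
    "lang": [46, 45, 33, 46, 20, 30],
    "IQ": [15.0, 14.5, 9.5, 11.0, 8.0, 9.5],
    "COMB": [0, 0, 0, 1, 1, 1],
})

# (hue level, column) pairs where the column is constant within that level.
degenerate = [
    (level, col)
    for level, sub in df.groupby("COMB")
    for col in df.columns
    if sub[col].nunique() < 2
]

# Columns that could be plotted while still colouring by the hue column.
safe_vars = [c for c in df.columns if c != "COMB"]

print(degenerate, safe_vars)
```

A list like safe_vars could then be passed to pairplot through its vars parameter, so that the hue column is used for colouring but not plotted as a variable.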

Repository owner locked and limited conversation to collaborators Dec 18, 2018
@mwaskom
Owner

mwaskom commented Sep 4, 2019

Fixed in #1823

@mwaskom mwaskom closed this as completed Sep 4, 2019