noDatabaseForPredictions=True fails with KeyError: 'Cluster Labels' #17

@aretaon

Error description

ComplexFinder fails with KeyError: 'Cluster Labels' when run in no-database mode (noDatabaseForPredictions=True).

How to reproduce

Starting from the provided example files, running the following code fails:

from ComplexFinder.src.main import ComplexFinder
X = "ComplexFinder/example-data/D0"
ComplexFinder(analysisName = "ExampleRun_01",
              runName = "ExampleRun_01_noDB",
              idColumn = "Uniprot ID",
              noDatabaseForPredictions=True,
              grouping={'WT': 'D0_aebersold.txt'}).run(X)

It raises:

-> 1672 df = df.sort_values(by="Cluster Labels")
KeyError: 'Cluster Labels'

However, running the same code with noDatabaseForPredictions=False leads to no errors.

System properties

OS: Fedora 40
Python: 3.8
Dependencies:
jupyter = ">=1.1.1,<2"
asteval = "<=0.9.19"
certifi = "<=2022.12.7"
cycler = "<=0.10.0"
cython = "<=0.29.21"
future = "<=0.18.2"
hdbscan = "<=0.8.29"
joblib = "<=1.2.0"
kiwisolver = "<=1.3.1"
llvmlite = "<=0.34.0"
lmfit = "<=1.0.1"
matplotlib = "<=3.3.2"
numba = "<=0.51.2"
numpy = "<=1.22.0"
pandas = "<=1.1.4"
pillow = "<=9.3.0"
pyparsing = "<=2.4.7"
python-dateutil = "<=2.8.1"
pytz = "<=2020.4"
scikit-learn = "<=0.23.2"
scipy = "<=1.5.4"
seaborn = "<=0.11.0"
six = "<=1.15.0"
threadpoolctl = "<=2.1.0"
umap-learn = ">=0.5.0"
uncertainties = "<=3.1.4"
imbalanced-learn = ">=0.7.0,<0.8"

Full traceback:

KeyError Traceback (most recent call last)
/tmp/ipykernel_1245616/3156604555.py in ?()
----> 1 ComplexFinder(analysisName = "ExampleRun_01",
2 runName = "ExampleRun_01_noDB",
3 idColumn = "Uniprot ID",
4 noDatabaseForPredictions=True,

~/nvme_data/Projects/ComplexFinder_debug/ComplexFinder/src/main.py in ?(self, X, maxValueToOne)
2207 groupFileNames = [groupFileNames]
2208 self._clusterInteractions(combinedInteractions,groupFiles = groupFileNames,groupName = groupName)
2209 else:
2210 print("Info :: Cluster Interactions")
-> 2211 self._clusterInteractions(None)
2212
2213
2214 self.params["runTimes"]["Interaction Clustering and Embedding"] = time.time() - endTrainingTime

~/nvme_data/Projects/ComplexFinder_debug/ComplexFinder/src/main.py in ?(self, predInts, clusterMethod, plotEmbedding, groupFiles, combineProbs, groupName)
1668 umapKwargs = self.params["umapDefaultKwargs"],
1669 generateSquareMatrix = True,
1670 )
1671 df = pd.DataFrame().from_dict({"Entry":intLabels,"Cluster Labels({})".format(analysisName):clusterLabels,"reachability":reachability,"core_distances":core_distances})
-> 1672 df = df.sort_values(by="Cluster Labels")
1673 df = df.set_index("Entry")
1674
1675 if pooledDistances is not None:

~/nvme_data/Projects/ComplexFinder_debug/.pixi/envs/default/lib/python3.8/site-packages/pandas/core/frame.py in ?(self, by, axis, ascending, inplace, kind, na_position, ignore_index, key)
5294 else:
5295 from pandas.core.sorting import nargsort
5296
5297 by = by[0]
-> 5298 k = self._get_label_or_level_values(by, axis=axis)
5299
5300 # need to rewrap column in Series to apply key function
5301 if key is not None:

~/nvme_data/Projects/ComplexFinder_debug/.pixi/envs/default/lib/python3.8/site-packages/pandas/core/generic.py in ?(self, key, axis)
1559 values = self.xs(key, axis=other_axes[0])._values
1560 elif self._is_level_reference(key, axis=axis):
1561 values = self.axes[axis].get_level_values(key)._values
1562 else:
-> 1563 raise KeyError(key)
1564
1565 # Check for duplicates
1566 if values.ndim > 1:

KeyError: 'Cluster Labels'
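
Possible cause, judging only from the traceback above: line 1671 of main.py creates the column as "Cluster Labels({})".format(analysisName), while line 1672 sorts by the bare name "Cluster Labels", so the lookup cannot succeed. The snippet below is a minimal sketch that reproduces this mismatch outside ComplexFinder. The names analysis_name and label_column, the sample entries, and the value "ExampleRun_01" are assumptions for illustration, and sorting by the formatted column name is only one possible fix, not necessarily the one the maintainers intend.

```python
import pandas as pd

# Hypothetical stand-in for whatever analysisName holds inside _clusterInteractions.
analysis_name = "ExampleRun_01"
label_column = "Cluster Labels({})".format(analysis_name)

# Small frame shaped like the one built on line 1671 (values are made up).
df = pd.DataFrame({
    "Entry": ["P1", "P2", "P3"],
    label_column: [2, 0, 1],
})

# Sorting by the bare name fails because no column is called "Cluster Labels".
try:
    df.sort_values(by="Cluster Labels")
    raised = False
except KeyError:
    raised = True

# Sorting by the same formatted name that was used to create the column works.
sorted_df = df.sort_values(by=label_column).set_index("Entry")
```

If this reading is right, reusing one variable for the column name in both places would avoid the error regardless of the value of analysisName.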
