Add scikit-fingerprints #359

Hrovatin · 2024-09-03T04:50:36Z

Replace mordred and rdkit fingerprints with scikit-fingerprints and enable other fingerprints from the package. Aim to remove the rdkit and mordred installs.

Dev in: https://github.com/Hrovatin/baybe/tree/feature/scikit_fingerprints

Notes/Discuss:

- Functions that use RDKit but are not fingerprint-related - do we keep RDKit then?
  - is_valid_smiles: not used anywhere
  - get_canonical_smiles
- New automatic fingerprint naming will not be backward-compatible
- mordred check in edbo - can this be used for any fingerprint (before was mordred and rdkit)
- Consider making Fingerprint enum a class to make code prettier (see TODOs in enum code) - EDIT: Not relevant anymore

Scienfitz · 2024-09-03T07:32:22Z

Scienfitz · 2024-09-03T09:48:34Z

It's strange that is_valid_smiles is not used somewhere; we definitely used to validate SMILES at some point.

But in here I see that the values corresponding to SMILES are validated with a different logic in @data.validator and not using is_valid_smiles. @AdrianSosic, any idea why?

Wouldn't value_validator=is_valid_smiles make the most sense? Or was there an issue with lazy loading?

AdrianSosic · 2024-09-03T09:57:49Z

It was refactored at some point to handle SMILES in canonical form, which also does the check internally (see validator method):

But the other function was kept because it's still useful in its own right.

Hrovatin · 2024-09-03T14:54:08Z

@Scienfitz I ran pytest -fast and there are two errors that I am not sure about. If you could provide some guidance, that would be great.

FAILED tests/test_searchspace.py::test_searchspace_memory_estimate[grid5-parameter_names0] - AssertionError: ('Comp: ', 699840, 563760)
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-AtomPairFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...

One more question about the test - do I need to run them separately in env where CHEM is not installed?

Hrovatin · 2024-09-03T15:03:31Z

Also, do I need to do something to re-generate the documentation SVGs or will it be done automatically? I guess examples/Backtesting/full_lookup.py creates some of these - should I run it?

Scienfitz · 2024-09-03T15:38:39Z

please can you use
tox -e fulltest-py310
tox -e coretest-py312
(and also
tox -e lint-py312
tox -e mypy-py312
for other tests)

This will probably give you the same errors, but it rules out any environment misconfiguration.
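
The earlier question about running tests in an environment without the chemistry extras comes down to detecting whether an optional dependency is importable. A minimal sketch using only the standard library follows; the helper `is_installed` and the flag `CHEM_INSTALLED` are illustrative names assumed here, not taken from the code discussed in this thread.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


# Hypothetical flag; "rdkit" stands in for the optional chemistry dependency.
CHEM_INSTALLED = is_installed("rdkit")

if CHEM_INSTALLED:
    print("chemistry extras available: substance-parameter tests can run")
else:
    print("chemistry extras missing: substance-parameter tests should be skipped")
```

A test suite could use such a flag to skip chemistry-specific tests in a core-only environment, which is one plausible reason for keeping separate full and core test environments.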

I have a suspicion for the first error, but impossible to help without seeing the code. You can already open the PR in draft mode

Scienfitz · 2024-09-03T15:39:09Z

Don't worry about the pictures at this moment; they actually shouldn't change much if the fingerprints from the package are implemented identically.

AVHopp · 2024-09-03T16:05:50Z

Regarding pictures: Once everything else is fixed, just ping me about the pictures @Hrovatin . I can then give you a heads-up/we can discuss how to update pictures, but as Martin says, this is not really relevant at the moment.

Hrovatin · 2024-09-04T04:47:32Z

Test results. For mypy I need to do a few updates and will add once finished.

tox -p -e lint-py312
  lint-py312: OK (19.49=setup[1.84]+cmd[0.01,17.63] seconds)
  congratulations :) (19.81 seconds)

tox -p -e coretest-py312
  coretest-py312: OK (269.43=setup[72.26]+cmd[0.01,197.16] seconds)
  congratulations :) (269.77 seconds)

tox -p -e fulltest-py310
=================================================================================================== short test summary info ====================================================================================================
FAILED tests/docs/test_examples.py::test_example[examples/Serialization/basic_serialization.py] - subprocess.CalledProcessError: Command '['python', 'examples/Serialization/basic_serialization.py']' returned non-zero exit status 1.
FAILED tests/test_iterations.py::test_kernels[b3-grid5-i3-AdditiveKernel3] - torch._C._LinAlgError: linalg.eigh: (Batch element 0): The algorithm failed to converge because the input matrix is ill-conditioned or has too many repeated eigenvalues (error code: 2).
FAILED tests/test_searchspace.py::test_searchspace_memory_estimate[grid5-parameter_names0] - AssertionError: ('Comp: ', 699840, 563760)
FAILED tests/test_searchspace.py::test_searchspace_memory_estimate[grid8-parameter_names0] - AssertionError: ('Comp: ', 1119744, 902016)
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-AtomPairFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-AutocorrFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-AvalonFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-E3FPFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-ECFPFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-ERGFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-EStateFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-FunctionalGroupsFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-GETAWAYFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-GhoseCrippenFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-KlekotaRothFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-LaggnerFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-LayeredFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-LingoFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-MACCSFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-MAPFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-MHFPFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-MORSEFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-MQNsFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-MordredFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-PatternFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-PharmacophoreFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-PhysiochemicalPropertiesFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-PubChemFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-RDFFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-RDKit2DDescriptorsFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-RDKitFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-SECFPFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-TopologicalTorsionFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-USRCATFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-USRFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-WHIMFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid5-DefaultFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-AtomPairFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-AutocorrFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-AvalonFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-E3FPFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-ECFPFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-ERGFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-EStateFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-FunctionalGroupsFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-GETAWAYFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-GhoseCrippenFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-KlekotaRothFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-LaggnerFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-LayeredFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-LingoFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-MACCSFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-MAPFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-MHFPFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-MORSEFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-MQNsFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-MordredFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-PatternFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-PharmacophoreFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-PhysiochemicalPropertiesFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-PubChemFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-RDFFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-RDKit2DDescriptorsFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-RDKitFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-SECFPFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-TopologicalTorsionFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-USRCATFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-USRFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-WHIMFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
FAILED tests/test_substance_parameter.py::test_run_iterations[b3-i2-grid8-DefaultFingerprint] - baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...
==================================================================================== 70 failed, 1561 passed, 4 skipped in 405.38s (0:06:45) ====================================================================================
fulltest-py310: exit 1 (410.36 seconds) /Users/karinhrovatin/Documents/code/baybe-Hrovatin> pytest -p no:warnings --cov=baybe --durations=5 pid=77919
  fulltest-py310: FAIL code 1 (413.63=setup[3.27]+cmd[0.00,410.36] seconds)
  evaluation failed :( (414.00 seconds)

Hrovatin · 2024-09-04T05:49:10Z

For mypy I have multiple issues with SubstanceEncoding, for which I would anyway suggest changes, as briefly mentioned above.
So I did not resolve them for now.

baybe/parameters/enum.py:51: error: Unexpected keyword argument "names" for "ParameterEncoding"  [call-arg]
baybe/parameters/substance.py:60: error: Variable "baybe.parameters.enum.SubstanceEncoding" is not valid as a type  [valid-type]
baybe/parameters/substance.py:60: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-aliases
baybe/parameters/substance.py:60: error: No overload variant of "field" matches argument types "Any", "ParameterEncoding"  [call-overload]
baybe/parameters/substance.py:60: note: Possible overload variants:
baybe/parameters/substance.py:60: note:     def field(*, default: None = ..., validator: None = ..., repr: bool | Callable[[Any], str] = ..., hash: bool | None = ..., init: bool = ..., metadata: Mapping[Any, Any] | None = ..., converter: None = ..., factory: None = ..., kw_only: bool = ..., eq: bool | None = ..., order: bool | None = ..., on_setattr: Callable[[Any, Attribute[Any], Any], Any] | list[Callable[[Any, Attribute[Any], Any], Any]] | _NoOpType | None = ..., alias: str | None = ..., type: type | None = ...) -> Any
baybe/parameters/substance.py:60: note:     def [_T] field(*, default: None = ..., validator: Callable[[Any, Attribute[_T], _T], Any] | Sequence[Callable[[Any, Attribute[_T], _T], Any]] | None = ..., repr: bool | Callable[[Any], str] = ..., hash: bool | None = ..., init: bool = ..., metadata: Mapping[Any, Any] | None = ..., converter: Callable[[Any], Any] | Converter[Any, _T] | None = ..., factory: Callable[[], _T] | None = ..., kw_only: bool = ..., eq: bool | Callable[[Any], Any] | None = ..., order: bool | Callable[[Any], Any] | None = ..., on_setattr: Callable[[Any, Attribute[Any], Any], Any] | list[Callable[[Any, Attribute[Any], Any], Any]] | _NoOpType | None = ..., alias: str | None = ..., type: type | None = ...) -> _T
baybe/parameters/substance.py:60: note:     def [_T] field(*, default: _T, validator: Callable[[Any, Attribute[_T], _T], Any] | Sequence[Callable[[Any, Attribute[_T], _T], Any]] | None = ..., repr: bool | Callable[[Any], str] = ..., hash: bool | None = ..., init: bool = ..., metadata: Mapping[Any, Any] | None = ..., converter: Callable[[Any], Any] | Converter[Any, _T] | None = ..., factory: Callable[[], _T] | None = ..., kw_only: bool = ..., eq: bool | Callable[[Any], Any] | None = ..., order: bool | Callable[[Any], Any] | None = ..., on_setattr: Callable[[Any, Attribute[Any], Any], Any] | list[Callable[[Any, Attribute[Any], Any], Any]] | _NoOpType | None = ..., alias: str | None = ..., type: type | None = ...) -> _T
baybe/parameters/substance.py:60: note:     def [_T] field(*, default: _T | None = ..., validator: Callable[[Any, Attribute[_T], _T], Any] | Sequence[Callable[[Any, Attribute[_T], _T], Any]] | None = ..., repr: bool | Callable[[Any], str] = ..., hash: bool | None = ..., init: bool = ..., metadata: Mapping[Any, Any] | None = ..., converter: Callable[[Any], Any] | Converter[Any, _T] | None = ..., factory: Callable[[], _T] | None = ..., kw_only: bool = ..., eq: bool | Callable[[Any], Any] | None = ..., order: bool | Callable[[Any], Any] | None = ..., on_setattr: Callable[[Any, Attribute[Any], Any], Any] | list[Callable[[Any, Attribute[Any], Any], Any]] | _NoOpType | None = ..., alias: str | None = ..., type: type | None = ...) -> Any
baybe/parameters/substance.py:61: error: "ParameterEncoding" has no attribute "DefaultFingerprint"  [attr-defined]
baybe/parameters/substance.py:61: error: Unsupported converter, only named functions, types and lambdas are currently supported  [misc]
baybe/parameters/substance.py:123: error: SubstanceEncoding? has no attribute "name"  [attr-defined]
Found 6 errors in 2 files (checked 102 source files)
mypy-py312: exit 1 (2.70 seconds) /Users/karinhrovatin/Documents/code/baybe-Hrovatin> mypy pid=82438
  mypy-py312: FAIL code 1 (5.70=setup[2.99]+cmd[0.01,2.70] seconds)
  evaluation failed :( (5.99 seconds)

Scienfitz · 2024-09-05T07:10:58Z

Functions that use RDKit but are not fingerprint related - do we keep RDKit then?

I think rdkit is a main dependency of skfp, so we do not have to decide and can keep all other functions.

New automatic fingerprint naming will not be backward-compatible

Ideally it is designed to coincide with the naming scheme, i.e. dropping capitalization and the "Fingerprint" suffix. We might need an alias/deprecation for the Morgan one.
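
One way such a naming scheme with a deprecated alias could look is sketched below. Everything here is an assumption for illustration: the function names, the choice to upper-case names so that lookups ignore capitalization, and the legacy-to-new mapping `MORGAN_FP` -> `ECFP` are hypothetical and not confirmed by this thread.

```python
import warnings

# Hypothetical mapping from a legacy encoding name to its replacement.
LEGACY_ALIASES = {"MORGAN_FP": "ECFP"}


def encoding_name(class_name: str) -> str:
    """Derive an encoding name from a fingerprint class name.

    Assumption: the "Fingerprint" suffix is dropped and the rest is
    upper-cased so that lookups do not depend on capitalization.
    """
    return class_name.removesuffix("Fingerprint").upper()


def resolve_encoding(name: str) -> str:
    """Map a user-supplied name to the current name, warning on legacy aliases."""
    key = name.upper()
    if key in LEGACY_ALIASES:
        new_name = LEGACY_ALIASES[key]
        warnings.warn(
            f"'{name}' is deprecated, use '{new_name}' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_name
    return key


print(encoding_name("ECFPFingerprint"))  # ECFP
```

Keeping the alias table separate from the derived names means old identifiers keep working while emitting a warning, so the backward-incompatible rename can be phased in.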

mordred check in edbo - can this be used for any fingerprint (before was mordred and rdkit)

Yes for now

Consider making Fingerprint enum a class to make code prettier (see TODOs in enum code)

Not sure what you mean, but Adrian raised one point: if we generate the enums automatically, would that destroy the tab completion when I type SubstanceEncoding.<TAB>? Can you check? If so, we should not generate the encoding automatically in this PR and leave it for a potential upcoming solution.
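
The trade-off can be illustrated with the standard library alone. The mypy output earlier in the thread mentions a `names` keyword argument, which suggests (an inference, since the code is not shown here) that the enum was built with the functional `Enum` API. The sketch below contrasts that with an explicit class; `ParameterEncoding` is a stand-in with the same name as in the mypy output, and the member names are a hypothetical subset.

```python
from enum import Enum


class ParameterEncoding(Enum):
    """Stand-in for the base class named in the mypy output (no members)."""


# Hypothetical subset of fingerprint names used for illustration.
FINGERPRINT_NAMES = ["ECFPFingerprint", "MACCSFingerprint", "DefaultFingerprint"]

# Functional API: the members only exist once this call has run.
DynamicEncoding = ParameterEncoding(
    "DynamicEncoding", names={name: name for name in FINGERPRINT_NAMES}
)


class StaticEncoding(ParameterEncoding):
    """Explicit members are visible to static analysis without running code."""

    ECFP = "ECFP"
    MACCS = "MACCS"


# Both variants behave the same at runtime.
print(DynamicEncoding.ECFPFingerprint.value)
print(StaticEncoding.ECFP.value)
print("DefaultFingerprint" in DynamicEncoding.__members__)
```

At runtime, `dir(DynamicEncoding)` lists the generated members, so completion based on live introspection can still work. Tools that analyse the source without executing it cannot know the generated names, which is consistent with the `attr-defined` errors in the mypy output. Whether a given editor completes the dynamic members depends on which of the two approaches it uses.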

Scienfitz · 2024-09-05T07:16:36Z

Regarding Errors

AssertionError: ('Comp: ', 699840, 563760)

I suspect the missing dtype cast messes with the size estimation vs. the actual size in the memory test. E.g., if a fingerprint returns some of its columns as int, the estimation that these are all float32 doesn't hold anymore.
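
The suspected mechanism can be reproduced in miniature with the standard-library `array` module. This is only a sketch of the idea, not the actual memory test: the helper names and the tiny three-column table are hypothetical, and which of the two numbers in the assertion message is the estimate is not stated in the thread.

```python
from array import array


def estimated_bytes(n_rows: int, n_cols: int, itemsize: int = 4) -> int:
    """Estimate assuming every column holds 4-byte (float32-like) values."""
    return n_rows * n_cols * itemsize


def actual_bytes(columns: list[array]) -> int:
    """Sum the real per-column sizes from each column's item size."""
    return sum(col.itemsize * len(col) for col in columns)


# Hypothetical representation: two float columns and one 64-bit integer column.
columns = [
    array("f", [0.1, 0.2, 0.3]),
    array("f", [1.0, 2.0, 3.0]),
    array("q", [0, 1, 1]),  # an integer column that was never cast to float
]

est = estimated_bytes(n_rows=3, n_cols=len(columns))
act = actual_bytes(columns)
print(est, act, est == act)

# Casting the integer column to 4-byte floats restores the agreement.
columns[2] = array("f", (float(v) for v in columns[2]))
print(est == actual_bytes(columns))
```

A single column with a different item size is enough to make the estimate and the measured size diverge, which is why a uniform dtype cast on the computational representation would make such a test deterministic.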

The algorithm failed to converge because the input matrix is ill-conditioned or has too many repeated eigenvalues (error code: 2). (and all other errors regarding numerics like decomposition, ill-defined matrix, etc.)

Ignore, they appear 40% of the time at random.

baybe.exceptions.NotEnoughPointsLeftError: Using the current settings, there are fewer than 3 possible data points left to recommend. This can be either because all data points have been measured at some point (while 'a...

No clear idea. It seems like the overall construction of the parameter's computational representation comp_df is not correct. Did you look at some of those (and compare, e.g., with the one you get from a non-substance parameter)?

Closes #359

Scienfitz assigned Hrovatin Sep 3, 2024

Scienfitz added the new feature New functionality label Sep 3, 2024

Hrovatin mentioned this issue Sep 4, 2024

Enable scikit-fingerprints #364

Merged

Scienfitz closed this as completed in #364 Nov 11, 2024

Scienfitz added a commit that referenced this issue Nov 11, 2024

Merge: Enable scikit-fingerprints (#364)

b7297fa

Closes #359
