Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dm.read_smi() copies a file to an already existing location #150

Closed
cwognum opened this issue Nov 8, 2022 · 6 comments · Fixed by #151
Closed

dm.read_smi() copies a file to an already existing location #150

cwognum opened this issue Nov 8, 2022 · 6 comments · Fixed by #151
Assignees
Labels
bug Something isn't working

Comments

@cwognum
Copy link
Contributor

cwognum commented Nov 8, 2022

When I try to load a remote .smi file using Datamol, I run into the following exception:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
(...)

File ~/mambaforge/envs/mood/lib/python3.10/site-packages/datamol/io.py:320, in read_smi(urlpath)
    318 if not fsspec.utils.can_be_local(str(urlpath)):
    319     active_path = pathlib.Path(tempfile.mkstemp()[1])
--> 320     dm.utils.fs.copy_file(urlpath, active_path)
    322 # Read the molecules
    323 supplier = rdmolfiles.SmilesMolSupplier(str(active_path), titleLine=0)

File ~/mambaforge/envs/mood/lib/python3.10/site-packages/datamol/utils/fs.py:250, in copy_file(source, destination, chunk_size, force, progress, leave_progress)
    247     raise ValueError(f"The file being copied does not exist or is not a file: {source}")
    249 if not force and is_file(destination_file):  # type: ignore
--> 250     raise ValueError(f"The destination file to copy already exists: {destination}")
    252 with source_file as source_stream:
    253     with destination_file as destination_stream:

ValueError: The destination file to copy already exists: /tmp/tmpkzrm7u58

Seems like this function is not up-to-date with recent changes in the dm.utils.fs module? Seems to me like active_path should be set differently to a non-existing file.

@hadim hadim self-assigned this Nov 10, 2022
@hadim hadim added the bug Something isn't working label Nov 10, 2022
@hadim
Copy link
Contributor

hadim commented Nov 10, 2022

Thanks. Using force=True should do the job here.

In general, I don't recommend using this kind of reader/saver and instead use the CSV reader/saver which should work with any .smi files.

@cwognum
Copy link
Contributor Author

cwognum commented Nov 11, 2022

I'm not sure I understand. You're saying .smi files can also be read (or actually; should be read) using dm.read_csv()?

@hadim
Copy link
Contributor

hadim commented Nov 11, 2022

Yes. There is nothing specific to .smi. It's just a csv-like format.

hadim added a commit that referenced this issue Nov 11, 2022
This was linked to pull requests Nov 11, 2022
Merged
This was unlinked from pull requests Nov 11, 2022
Merged
@hadim hadim linked a pull request Nov 11, 2022 that will close this issue
Merged
3 tasks
@cwognum
Copy link
Contributor Author

cwognum commented Nov 11, 2022

Does Datamol implement these read functions separately? Seems like 'read_smi()' should then just be a light wrapper of 'read_csv()'?

@hadim
Copy link
Contributor

hadim commented Nov 11, 2022

Yes I remember hesitating doing that at the very beginning of datamol but I prefer to stick to rdkit here in case the rdkit function is modified. If a user use read_smi he will expect the equivalent rdkit function under the hood.

@hadim
Copy link
Contributor

hadim commented Nov 11, 2022

Thanks for reporting the bug @cwognum !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants