Feature request: compatibility with pathlib.Path #285

Closed
RensDimmendaal opened this issue Jun 17, 2022 · 1 comment

@RensDimmendaal

Hey there, thanks for the great project!

I'm using it now and ran into an issue: the project is not compatible with pathlib.Path (docs).

When I run this:

from pathlib import Path
import pypandoc
fpath = Path("/path/to/my/file.docx")
output = pypandoc.convert_file(fpath,"markdown_strict")

I get this back:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Input In [27], in <cell line: 1>()
----> 1 output = pypandoc.convert_file(fpath,"markdown_strict")

File ~/.pyenv/versions/miniforge3-4.10.3-10/envs/my-project/lib/python3.9/site-packages/pypandoc/__init__.py:155, in convert_file(source_file, to, format, extra_args, encoding, outputfile, filters, verify_format, sandbox, cworkdir)
    150     return _convert_input(discovered_source_files[0], format, 'path', to, extra_args=extra_args,
    151                       outputfile=outputfile, filters=filters,
    152                       verify_format=verify_format, sandbox=sandbox,
    153                       cworkdir=cworkdir)
    154 else: # behavior for multiple  files or file patterns
--> 155     format = _identify_format_from_path(discovered_source_files[0], format)
    156     return _convert_input(discovered_source_files, format, 'path', to, extra_args=extra_args,
    157                       outputfile=outputfile, filters=filters,
    158                       verify_format=verify_format, sandbox=sandbox,
    159                       cworkdir=cworkdir)

I'm pretty sure it's because pypandoc checks whether fpath is either a string or a list. It'd be nice if it also accepted pathlib.Path objects, both for single files and for pathlib generators (e.g. pathlib.Path("my/dir/").glob("*.docx")).

Pathlib has some other nice benefits. For example, you can also check whether the file exists (Path.exists()).

As a workaround I just convert the Paths to strings in my own code.
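For anyone hitting the same error, here is a minimal sketch of that workaround. It only shows the conversion step; the pypandoc call is left as a comment because it needs pypandoc and pandoc installed, and the path is the placeholder from the example above.

```python
import os
from pathlib import Path

fpath = Path("/path/to/my/file.docx")

# os.fspath() (or str()) gives the plain string form of a Path object.
source = os.fspath(fpath)

# The string can then be passed in place of the Path, for example:
# output = pypandoc.convert_file(source, "markdown_strict")
```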

If you're open to this change I'd be happy to submit a pull request.

@JessicaTegner
Owner

Hi there.

Yes I'm very much open to this.
I think the solution here is just to check whether it's a pathlib.Path object and, if it is, convert it to a string.
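A rough sketch of what such a check could look like is below. The helper name `normalize_source` is hypothetical and not part of pypandoc; it assumes the input is a string, a path-like object, or an iterable of those (which would also cover the glob generator case from the request).

```python
import os
from pathlib import Path


def normalize_source(source):
    """Hypothetical helper: turn Path inputs into plain strings.

    A single str or path-like object becomes one string; any other
    iterable (list, tuple, generator) becomes a list of strings.
    """
    if isinstance(source, (str, os.PathLike)):
        return os.fspath(source)
    # Treat everything else as an iterable of paths and materialize it,
    # so generators such as Path.glob() results are consumed only once.
    return [os.fspath(item) for item in source]
```

Checking against `os.PathLike` rather than `pathlib.Path` directly is a design choice in this sketch: it would also accept other objects that implement `__fspath__`.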
