fflores97
Contributor

@fflores97 fflores97 commented Jun 19, 2025

More general solution to pandoc path issues I encountered. I re-use path lookup logic that looks for the quarto binary to find the pandoc binary, and added a setting in case the user wants to hardcode it.

Would love some help with the frequent paths for the pandoc binary and just contributing to this repo in general.

Thanks!

@fflores97 fflores97 changed the title from "Add pandoc path lookup and optional setting" to "feat: Add pandoc path lookup and optional setting" Jun 19, 2025
@cderv
Contributor

cderv commented Jun 19, 2025

Hi! Thanks for your contribution.

Can you re-explain the problem you are trying to solve? I understand this is because of the Nix-specific build and setup for Quarto, right?

To remind you of the context, Quarto works with an expected version of Pandoc. Quarto's installer bundles Pandoc, and the VS Code extension currently looks for it there.
Nix has a specific setup because it rebuilds Quarto by using an external Pandoc.
Quarto allows that through the QUARTO_PANDOC environment variable, but Quarto is still meant to work with the expected version of Pandoc. When used with another version, there is no guarantee that it will work 💯%.

The Nix package does set QUARTO_PANDOC: https://github.com/NixOS/nixpkgs/blob/9cd27c52f82d1d8122b11c5c3556cccc12805d86/pkgs/development/libraries/quarto/default.nix#L62

So I wonder if the "fix" should be to look for the QUARTO_PANDOC env var if Pandoc is not found in the expected place inside the Quarto install folder.

Nix could also add a symlink in the expected place.

I hope it makes sense. I am trying to prevent side effects if the VSCode extension starts to detect unwanted Pandoc.
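
The fallback suggested above could look roughly like the following. This is a minimal sketch, not the extension's actual code: `resolvePandocPath`, its parameters, and the injected `exists` check are hypothetical names chosen for illustration, and the order (bundled copy first, then QUARTO_PANDOC) simply follows the suggestion in this comment.

```typescript
// Hypothetical helper (not the extension's real implementation):
// prefer the Pandoc bundled with Quarto, and fall back to the path in
// QUARTO_PANDOC only when the bundled binary is missing.
type EnvVars = Record<string, string | undefined>;

function resolvePandocPath(
  bundledPath: string,
  env: EnvVars,
  exists: (p: string) => boolean
): string | undefined {
  // 1. The expected location inside the Quarto install folder.
  if (exists(bundledPath)) {
    return bundledPath;
  }
  // 2. The override that packagers such as Nix set.
  const override = env["QUARTO_PANDOC"];
  if (override !== undefined && override.trim() !== "" && exists(override)) {
    return override;
  }
  // 3. Nothing usable was found; the caller decides how to report that.
  return undefined;
}
```

In a Node-based extension the caller would presumably pass `process.env` and `fs.existsSync`; injecting them keeps the function easy to test without touching the real file system.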

@fflores97
Contributor Author

Thank you, Christophe. The problem is that the NixOS quarto package doesn't bundle pandoc inside it like other distributions do. In fact, nixpkgs explicitly deletes it, so the default pandocPath variable won't consider other pandoc paths.

I like your idea better, though. As far as I can tell, currently nothing in the codebase uses QUARTO_PANDOC, but I see it as a good chance to use it. I just tried this and could get visual edit mode to work with just the env variable. Very elegant!

@cderv
Contributor

cderv commented Jun 20, 2025

Yes, I know they remove it; they set the env var to the real location. They could also have used a symlink, I guess.

The env var is the expected advanced way to change the Pandoc used by Quarto, so I would use this instead of trying to detect Pandoc outside of Quarto.

@fflores97
Contributor Author

Check out the current status of my branch. I reset all my changes to go with just the env variable (plus some auto-formatting, apparently). It works for my use case. I would love your guidance to eventually get this merged.

@cderv cderv changed the title from "feat: Add pandoc path lookup and optional setting" to "feat: Consider QUARTO_PANDOC for Pandoc path lookup" Jul 29, 2025
@cderv cderv requested review from cscheid and vezwork July 29, 2025 20:20
@vezwork
Collaborator

vezwork commented Jul 29, 2025

I pulled this PR and tested it locally. I built with yarn run dev-vscode and ran an extension host. Previewing a basic qmd and using the visual editor continue to work; the change does not seem to cause any issues.

Looks good to me.

@vezwork
Collaborator

vezwork commented Jul 29, 2025

@cderv and @cscheid are there security considerations around setting the pandocPath based on an env variable? Could this be used to cause the Quarto extension to open an arbitrary executable pointed to by QUARTO_PANDOC?

@cderv
Contributor

cderv commented Jul 30, 2025

QUARTO_PANDOC is an env var that can already be set for the quarto CLI itself.
This is logic we have for all our tools (e.g. QUARTO_TYPST); the code is here in the CLI:
https://github.com/quarto-dev/quarto-cli/blob/bf038503cd0290ef040288e763174497f5bd7d0b/src/core/resources.ts#L39-L46
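
As a rough illustration of that per-tool convention (a sketch only, not the code behind the link above): `TOOL_ENV_VARS` and `toolBinaryPath` are hypothetical names, only the two variables named in this thread are listed, and the sketch assumes the override wins whenever it is set to a non-empty value.

```typescript
// Hypothetical sketch of a per-tool override lookup. Only the two
// variables mentioned in this thread are included.
const TOOL_ENV_VARS = {
  pandoc: "QUARTO_PANDOC",
  typst: "QUARTO_TYPST",
} as const;

type Tool = keyof typeof TOOL_ENV_VARS;

function toolBinaryPath(
  tool: Tool,
  defaultPath: string,
  env: Record<string, string | undefined>
): string {
  // Assumption: a non-empty override replaces the default location.
  const override = env[TOOL_ENV_VARS[tool]];
  return override !== undefined && override !== "" ? override : defaultPath;
}
```

With `QUARTO_PANDOC` set and `QUARTO_TYPST` unset, this returns the override for `pandoc` and the default path for `typst`.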

We know this is used in Nix packaging (https://github.com/NixOS/nixpkgs/blob/dc9637876d0dcc8c9e5e22986b857632effeb727/pkgs/development/libraries/quarto/default.nix#L62-L66) and in conda-forge packaging (https://github.com/conda-forge/quarto-feedstock).
They use this because they want to create a Quarto bundle that does not bundle its own Pandoc version but instead uses another Pandoc version available in the packaging system. The same goes for other tools.

Nix does that by setting the env var QUARTO_PANDOC at run time, and I now see that conda-forge does that by using QUARTO_PANDOC at build time and maybe at run time (I would need to check that).

Anyhow, neither of them may have the expected version where we look for it:

if (quartoInstall) {
  // use cmd suffix for older versions of quarto on windows
  const windows = os.platform() == "win32";
  const useCmd = windows && semver.lte(quartoInstall.version, "1.1.162");
  let pandocPath = path.join(quartoInstall!.binPath, "tools", "pandoc");
  // more recent versions of quarto use architecture-specific tools dir,
  // if the pandocPath is not found then look in the requisite dir for this arch
  if (!windows && !fs.existsSync(pandocPath)) {
    pandocPath = path.join(
      path.dirname(pandocPath),
      isArm_64() ? "aarch64" : "x86_64",
      path.basename(pandocPath)
    );
  }

So it seems worthwhile to make this configurable for those environment configurations.

Any thoughts now that you know this context?

@cscheid
Contributor

cscheid commented Aug 21, 2025

> @cderv and @cscheid are there security considerations around setting the pandocPath based on an env variable? Could this be used to cause the Quarto extension to open an arbitrary executable pointed to by QUARTO_PANDOC?

The direct answer here is "yes". But this is more of a foot gun than it is a security issue. The difference is due to the threat models at play. quarto-cli works much like a workflow orchestrator, something like a (huge) make replacement that only works for calling Pandoc in (extremely fancy) particular ways.

The reason that analogy is important is that make also calls arbitrary binaries in the way that Quarto does. The closest analogy is the CC env var convention that stands for "C Compiler". Often the actual system C compiler is determined by an environment variable, and make just goes along with it. If someone pollutes your environment and makes CC point to a malicious binary, You Are In Trouble.

That's a real consideration for how you should think about CC when calling make, in the same way that you should think about QUARTO_PANDOC when calling quarto.

We need to fix this by documenting this behavior in https://quarto.org/docs/advanced/environment-vars.html (same for QUARTO_TYPST, etc.).

@cderv
Contributor

cderv commented Aug 25, 2025

Thanks for the clarification, @cscheid!

Still OK to merge this, right?

Contributor

@cscheid cscheid left a comment

Looks good, and would make the behavior of the extension match that of quarto-cli.

@cscheid
Contributor

cscheid commented Aug 27, 2025

@fflores97 Thanks for the contribution; we will merge it. Although we don't need a contributor agreement for a small PR like this, you might want to send one so that there's one fewer step in the future. If you're willing to do that, we have instructions you can follow on our website.

@cderv cderv merged commit 3fe790b into quarto-dev:main Aug 28, 2025

4 participants