add no-attachments extension to ipynb format. #8432

Open
lsmor opened this issue Nov 10, 2022 · 5 comments

Comments

@lsmor

lsmor commented Nov 10, 2022

Describe your proposed improvement and the problem it solves.

By default, pandoc creates attachments for image links when writing ipynb, as stated in the documentation. It does so via the addAttachment function in the Ipynb writer. That function depends on fetchItem, which, as far as I can tell, is meant to read the raw ByteString from the file system.

This has two problems:

  • first: if there is an image reference but the image is not in the filesystem, then the writer fails completely instead of producing a broken link to the image
  • second: when working with pandoc as a library, Pandoc.runPure can't simply work, because fetchItem can't do any IO to grab the raw bytes of the image. This is shown in the example below, in which runPure fails but runIO doesn't
import qualified Text.Pandoc as Pandoc
import Text.Pandoc (Pandoc)

-- The path to some filesystem image
pathToImage :: String
pathToImage = "./path/to/some_image.png"

-- A raw string representing the input data. This is the native output for a markdown file with the following content:
-- ![this is an image](./path/to/some_image.png)
pan :: Pandoc
pan = read $ "Pandoc Meta{unMeta = fromList []} [Para[Image (\"\",[],[]) [Str \"this\",Space,Str \"is\",Space,Str \"an\",Space,Str \"image\"] (\"" <> pathToImage <> "\",\"fig:\")]]"

main :: IO ()
main = do

    -- Fails
    let jupyter_pure = Pandoc.runPure $ Pandoc.writeIpynb Pandoc.def pan
    print jupyter_pure

    putStrLn "****************"

    -- Success
    jupyter_txt <- Pandoc.runIO $ Pandoc.writeIpynb Pandoc.def pan
    print jupyter_txt

I think this could be addressed by changing the extractCells function so that, instead of always calling addAttachment, it first checks whether the extension is enabled. If it is, the Inline is left unmodified; otherwise it is modified as before. This would create a regular markdown image link instead of an attachment.

-- in the extractCells function
-- this line
(newdoc, attachments) <-
      runStateT (walkM addAttachment (Pandoc nullMeta xs)) mempty

-- should become something like
(newdoc, attachments) <-
     if new_extension_enable              -- extractCells receives WriterOptions as input, so this shouldn't be difficult to check
        then  pure (Pandoc nullMeta xs, mempty)  -- Keep the blocks unaltered and record no attachments
        else  runStateT (walkM addAttachment (Pandoc nullMeta xs)) mempty

I think I can implement this change if you confirm this is the way to go.

Describe alternatives you've considered.

I read the documentation looking for other options to achieve this, and considered manipulating the raw Text produced by writeIpynb (I am working with pandoc as a library).

@jgm
Owner

jgm commented Nov 10, 2022

first: if there is an image reference but the image is not in the filesystem, then the writer fails completely instead of producing a broken link to the image

This could be addressed by handling the error raised by fetchItem. Would that be a simpler approach than adding a new extension?

@lsmor
Author

lsmor commented Nov 10, 2022

This could be addressed by handling the error raised by fetchItem. Would that be a simpler approach than adding a new extension?

Well, the error is simply Left (PandocResourceNotFound "path/to/image.png"). You can handle that... but the goal is to produce an ipynb in which images aren't attachments but regular markdown image links, so I don't think handling it helps. Moreover, you still have the problem of not being able to use runPure if using the pandoc library.

I am thinking about a nasty filter which converts images to links with a ! in front. Let me check if that works.

@lsmor
Author

lsmor commented Nov 10, 2022

It turned out that a simple filter can be used. Shame on me! I am closing this, as no changes are needed.

imageToLink :: Pandoc.Block -> Pandoc.Block
imageToLink (Pandoc.Para (Pandoc.Image attrs inl target:is)) = Pandoc.Para $ Pandoc.Str "!":Pandoc.Link attrs inl target:is
imageToLink i = i
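
The filter above only matches an image that is the first inline of a paragraph. Below is a minimal sketch of a variant that rewrites every image wherever it occurs, assuming the Walkable instance for inline lists from pandoc-types. The names imageToLinkInlines, imagesToLinks and example are made up for this illustration and are not part of pandoc.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.Pandoc.Definition
import Text.Pandoc.Walk (walk)

-- Replace each Image inline with a literal "!" followed by a Link
-- that carries the same attributes, description and target.
imageToLinkInlines :: [Inline] -> [Inline]
imageToLinkInlines = concatMap go
  where
    go (Image attr alt target) = [Str "!", Link attr alt target]
    go other                   = [other]

-- Apply the rewrite to every inline list in the document.
imagesToLinks :: Pandoc -> Pandoc
imagesToLinks = walk imageToLinkInlines

-- Hypothetical input: an image that is not at the start of its paragraph.
example :: Pandoc
example =
  Pandoc nullMeta
    [Para [Str "see", Space, Image nullAttr [Str "alt"] ("./path/to/some_image.png", "")]]
```

Whether the markdown writer then renders the pair as an image link depends on how it escapes the literal !, so the output is worth checking on the pandoc version in use.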

@lsmor lsmor closed this as completed Nov 10, 2022
@jgm
Owner

jgm commented Nov 10, 2022

You can handle that... but the goal is to produce an ipynb in which images aren't attachments but regular markdown image links, so I don't think handling it helps.

Why not? We could handle the error by just including a regular image link with that path, and issuing a warning.

you still have the problem of not being able to use runPure if using the pandoc library.

fetchItem can be used with runPure. It won't do any actual IO, but it will still look in the ersatz file system, and it will raise an error if nothing else -- which can be trapped.
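
A minimal sketch of trapping that error under runPure, assuming the error raised for a missing file is PandocResourceNotFound, as reported earlier in this thread. The helper fetchMaybe and the value demo are hypothetical names used only for this illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.Except (catchError, throwError)
import qualified Data.ByteString as B
import Data.Text (Text)
import Text.Pandoc.Class (PandocMonad, fetchItem, runPure)
import Text.Pandoc.Error (PandocError (..))
import Text.Pandoc.MIME (MimeType)

-- Hypothetical helper: like fetchItem, but a missing resource becomes
-- Nothing instead of aborting the computation. Other errors are rethrown.
fetchMaybe :: PandocMonad m => Text -> m (Maybe (B.ByteString, Maybe MimeType))
fetchMaybe src =
  (Just <$> fetchItem src) `catchError` \e ->
    case e of
      PandocResourceNotFound _ -> pure Nothing
      _                        -> throwError e

-- Runs purely: the path is absent from the ersatz file system,
-- so the lookup fails and the error is trapped.
demo :: Either PandocError (Maybe (B.ByteString, Maybe MimeType))
demo = runPure (fetchMaybe "./path/to/some_image.png")
```

This works because every PandocMonad is also a MonadError over PandocError, so catchError is available in both runPure and runIO.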

Let's keep this open.

@jgm jgm reopened this Nov 10, 2022
@lsmor
Author

lsmor commented Nov 10, 2022

Maybe I am a little bit lost: are you proposing to actually change the code so that writeIpynb handles such an error? I am happy to contribute to that. (Note that my actual problem, that all images must be links, can be solved with a simple filter.)

Coming back to writeIpynb, I guess addAttachment can be modified to handle that error and leave the Image unmodified. I am not so sure how to actually handle the error, though. Is there any example within the code base I can look at? (I am not that familiar with pandoc, but I know enough Haskell to follow along with the types.)
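
As a rough illustration of the idea discussed in this thread (keep a regular image link and issue a warning when the resource cannot be fetched), here is a simplified sketch. It is not the writer's actual code: the Attachments map of raw bytes is a hypothetical stand-in for the writer's attachment state, the URI check is omitted, and addAttachmentLenient and demo are made-up names.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.Except (catchError, throwError)
import Control.Monad.State.Strict (StateT, modify, runStateT)
import Control.Monad.Trans.Class (lift)
import qualified Data.ByteString as B
import qualified Data.Map as M
import Data.Text (Text)
import Text.Pandoc.Class (PandocMonad, fetchItem, report, runPure)
import Text.Pandoc.Definition (Inline (..), nullAttr)
import Text.Pandoc.Error (PandocError (..))
import Text.Pandoc.Logging (LogMessage (..))

-- Hypothetical simplification: map each image source to its raw bytes.
type Attachments = M.Map Text B.ByteString

-- Try to turn an image into an attachment. If the resource is missing,
-- log a warning and return the image unchanged as a regular link.
addAttachmentLenient :: PandocMonad m => Inline -> StateT Attachments m Inline
addAttachmentLenient img@(Image attr alt (src, tit)) = do
  fetched <- lift $ (Just <$> fetchItem src) `catchError` \e ->
    case e of
      PandocResourceNotFound _ -> do
        report $ CouldNotFetchResource src "resource not found"
        pure Nothing
      _ -> throwError e
  case fetched of
    Nothing -> pure img
    Just (bytes, _mime) -> do
      modify (M.insert src bytes)
      pure $ Image attr alt ("attachment:" <> src, tit)
addAttachmentLenient other = pure other

-- Pure run on a missing image: the inline comes back untouched
-- and no attachment is recorded.
demo :: Either PandocError (Inline, Attachments)
demo = runPure $ runStateT (addAttachmentLenient missing) M.empty
  where
    missing = Image nullAttr [Str "alt"] ("./path/to/some_image.png", "fig:")
```

Only the not-found error is swallowed here; any other failure still propagates, which keeps unrelated problems visible.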
