Citation reader in Word document issue #9116
Comments
Nothing has changed in this code for a long time, so I doubt previous versions behaved differently, but if you can confirm it that would be interesting. |
Hello! I am most likely mistaken about past behaviour. In that case, though, I think this could be a possible change in the writer. When I look at the XML inside the docx (see image below), I see that the citation has a '"suppress-author":true' field, which looks promising. I don't know Haskell, but it seems to me that a possible modification would be to emit a "-" whenever this field appears in the XML citation, so that '[@1580]' becomes '[-@1580]'. Do you see this as a possible change? |
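A minimal Python sketch of the mapping proposed above (this is not the Haskell reader code). It assumes the citation field carries a JSON payload with a `citationItems` list whose entries have an `id` and an optional `suppress-author` flag; the function name `citation_to_markdown` and the sample payloads are hypothetical.

```python
import json


def citation_to_markdown(field_json: str) -> str:
    """Render a citation JSON payload as pandoc citation syntax.

    Assumed payload shape: {"citationItems": [{"id": ..., "suppress-author": true}]}
    """
    data = json.loads(field_json)
    keys = []
    for item in data.get("citationItems", []):
        # "suppress-author" is absent on normal citations, so default to False.
        dash = "-" if item.get("suppress-author", False) else ""
        keys.append(f"{dash}@{item['id']}")
    return "[" + "; ".join(keys) + "]"


# Hypothetical payloads, reduced to the two fields that matter here.
normal = '{"citationItems": [{"id": 1580}]}'
suppressed = '{"citationItems": [{"id": 1580, "suppress-author": true}]}'

print(citation_to_markdown(normal))      # [@1580]
print(citation_to_markdown(suppressed))  # [-@1580]
```

The flag is read per item, so a field with several citation items could mix normal and author-suppressed keys.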
Yes, it should be possible! |
I believe |
Hm, I can't find any documentation in the CSL spec for |
@bdarcus would probably know more about the current state of things. |
It's not part of the CSL spec; perhaps coming from Zotero/citeproc-js? |
I'm asking in order to better understand the problem. As far as I can see, Pandoc "Cite" objects have a "citationMode" field that can be set to "SuppressAuthor" (see image). So the problem would be that the field in question is not supported by CSL, but it is supported by Pandoc?
I found this. I understand that it is not at all convenient to add arbitrary fields, but would this specification be enough to justify supporting it in Pandoc? I think it could be quite a useful feature, and I can't think of any other way to handle it (for example, with a custom reader or a filter). I also see that citeproc-pandoc handles page numbers as suffixes. What is the reason for this? In citeproc-js the 'label' and 'locator' fields are used for this. Most of the time this is not a problem, but it can be a drawback when working in other languages (as in my case). This is a lower priority, though, because it can easily be handled with a filter, and perhaps it belongs in a separate issue. Thank you very much for your work and dedication. |
So this appears to be a citeproc-js add-on. Page numbers as suffixes: although pandoc's AST doesn't separate locators and labels from the rest of the suffix, there is code in pandoc that does this before calling citeproc. So it should work properly. Standard locator label abbreviations (as defined in citeproc for your locale) should work, but make sure |
I've been trying to understand a bit about how this works. As I said, although I've been reading about Haskell, I must confess that I'm having a hard time with it. To understand how this part works, I've been looking at commits 73fe7c1, 0011c95, 9ef8650, 60caa0a and e07c0e7, and at the related issue #7840. In the docx file, the "suppress-author" field (set to true) contains the necessary information. This field does not appear when the citation is "normal". Given that, as @jgm comments, the problem is in line 526 of Docx.hs, I think adding something similar to this code could help:
I don't know how it will handle |
I tried at the time to modify the Haskell code, but was unsuccessful. With this in mind, I developed a small workaround to use until someone who understands Haskell better can tackle it. The trick is to use a Lua filter that looks at the text of each pandoc.Cite and detects those that start with a number (or with the 'suffix' + a number). For this to work, Zotero in Word must use some kind of (author, year)-based citation style. Here is the filter code; I hope it helps anyone who runs into the same problem.
|
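For readers who want to see the idea behind the workaround above without the original Lua filter, here is a rough Python sketch of the same heuristic applied to pandoc's JSON AST. It is not the filter posted in the thread. It assumes an (author, year) style, skips only leading brackets and spaces (the prefix case is not handled), and marks every citation inside the matching Cite; `make_cite`, `stringify` and `mark_suppressed` are hypothetical helper names.

```python
import re


def make_cite(cid, words):
    """Build a Cite node in pandoc's JSON AST shape from a key and rendered words."""
    inlines = []
    for i, word in enumerate(words):
        if i:
            inlines.append({"t": "Space"})
        inlines.append({"t": "Str", "c": word})
    citation = {
        "citationId": cid,
        "citationPrefix": [],
        "citationSuffix": [],
        "citationMode": {"t": "NormalCitation"},
        "citationNoteNum": 0,
        "citationHash": 0,
    }
    return {"t": "Cite", "c": [[citation], inlines]}


def stringify(inlines):
    """Concatenate the text of Str and Space inlines; other node types are skipped."""
    parts = []
    for node in inlines:
        if node["t"] == "Str":
            parts.append(node["c"])
        elif node["t"] == "Space":
            parts.append(" ")
    return "".join(parts)


def mark_suppressed(cite):
    """Set SuppressAuthor when the rendered citation text starts with a digit."""
    citations, inlines = cite["c"]
    text = stringify(inlines).lstrip("([ ")
    if re.match(r"\d", text):
        for citation in citations:
            citation["citationMode"] = {"t": "SuppressAuthor"}
    return cite


year_only = mark_suppressed(make_cite("1580", ["(2020)"]))
with_author = mark_suppressed(make_cite("1580", ["(Smith,", "2020)"]))
print(year_only["c"][0][0]["citationMode"]["t"])    # SuppressAuthor
print(with_author["c"][0][0]["citationMode"]["t"])  # NormalCitation
```

The heuristic would misfire on works whose rendered author text begins with a digit, which is why it only suits a stopgap.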
Explain the problem.
I'm trying to convert a file from Word to Markdown, but I'm running into problems with citations that don't include the author: in the output, each such citation is interpreted as having an author. I suspect the problem is in the reader, because I can already see it in the intermediate AST. I think this is a bug, and I'm pretty sure it didn't happen in previous versions of pandoc.
I attach a minimal example (EX.zip) with a Word document, as well as the file with the reference (.json).
Thanks!
The command line I am using is:
The result obtained is:
But I would expect to get this in the body:
Pandoc version?
Pandoc 3.1.8