fix for information loss on footnotes/endnotes within XWPFRun.toString #3



Dear Apache POI Team,

Please consider the following problem: whenever an MS Word document with footnotes or endnotes is parsed with XWPFWordExtractor, the locations of the footnote/endnote references are lost. This loss is clearly visible in, for example, Apache Tika output.

To reproduce the problem, insert the following code into TestXWPFWordExtractor.testFootnotes:
    w = new"user.home"), "footnotes.output.txt"));
    try {
    } finally {

Then run the tests and inspect the contents of "footnotes.output.txt". It contains "Eto ochen prostoy text so snoskoy"; there should be a footnote reference between "prostoy" and "text", but it is lost.
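
A minimal sketch of that reproduction step, assuming the extracted text is simply written to a file in the user's home directory. The class name DumpExtractedText and the helper dump are hypothetical and not part of Apache POI; in the test itself the string would come from extractor.getText().

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

/** Hypothetical helper for inspecting extractor output by hand; not part of Apache POI. */
final class DumpExtractedText {

    /** Writes the given text to dir/name and returns the file that was written. */
    static File dump(String extractedText, File dir, String name) throws IOException {
        File out = new File(dir, name);
        Writer w = new FileWriter(out);
        try {
            w.write(extractedText);
        } finally {
            // Close the writer even if the write fails.
            w.close();
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in the real test this string is extractor.getText().
        String text = "Eto ochen prostoy text so snoskoy";
        File home = new File(System.getProperty("user.home"));
        File out = dump(text, home, "footnotes.output.txt");
        System.out.println("Wrote " + out.getAbsolutePath());
    }
}
```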

I suggest introducing additional markup such as [footnoteRef:num] and [endnoteRef:num], which would allow applications to render footnote references correctly.
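
To illustrate how a consuming application might use such markers, here is a minimal sketch that locates them in extracted text and strips them again when plain text is wanted. The class NoteRefMarkers and its methods are hypothetical and not part of Apache POI or Tika; the sketch also assumes the reference ids are non-negative integers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical consumer-side helper; not part of Apache POI. */
final class NoteRefMarkers {

    // Matches the proposed markers: [footnoteRef:N] or [endnoteRef:N].
    private static final Pattern MARKER =
            Pattern.compile("\\[(footnote|endnote)Ref:(\\d+)\\]");

    /** Returns entries such as "footnote:1", in order of appearance. */
    static List<String> findRefs(String text) {
        List<String> refs = new ArrayList<>();
        Matcher m = MARKER.matcher(text);
        while (m.find()) {
            refs.add(m.group(1) + ":" + m.group(2));
        }
        return refs;
    }

    /** Removes all markers, leaving the surrounding text untouched. */
    static String stripRefs(String text) {
        return MARKER.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String text = "Eto ochen prostoy[footnoteRef:1] text so snoskoy";
        System.out.println(findRefs(text));
        System.out.println(stripRefs(text));
    }
}
```

An application that renders footnotes could replace each match with its own reference style (a superscript number, a link) instead of removing it.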

Please see the commit details.


Thanks, committed in r1492308. (That should mirror through to git shortly)

@Gagravarr Gagravarr referenced this pull request from a commit
@Gagravarr Gagravarr Patch from akhikhl from github pull #3 - Extract references from XWPF…
… footnotes

git-svn-id: 13f79535-47bb-0310-9956-ffa450edef68
@ischindl ischindl referenced this pull request from a commit in ischindl/poi
@Gagravarr Gagravarr Patch from akhikhl from github pull #3 - Extract references from XWPF…
… footnotes

git-svn-id: 13f79535-47bb-0310-9956-ffa450edef68
7 src/ooxml/java/org/apache/poi/xwpf/usermodel/
@@ -52,6 +52,7 @@ Licensed to the Apache Software Foundation (ASF) under one or more
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTDrawing;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTEmpty;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTFonts;
+import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTFtnEdnRef;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTHpsMeasure;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTOnOff;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTPTab;
@@ -817,6 +818,12 @@ public String toString() {
+        if (o instanceof CTFtnEdnRef) {
+            CTFtnEdnRef ftn = (CTFtnEdnRef)o;
+            String footnoteRef = ftn.getDomNode().getLocalName().equals("footnoteReference") ?
+                "[footnoteRef:" + ftn.getId().intValue() + "]" : "[endnoteRef:" + ftn.getId().intValue() + "]";
+            text.append(footnoteRef);
+        }
10 src/ooxml/testcases/org/apache/poi/xwpf/extractor/
@@ -166,8 +166,9 @@ public void testHeadersFooters() throws IOException {
    public void testFootnotes() throws IOException {
        XWPFDocument doc = XWPFTestDataSamples.openSampleDocument("footnotes.docx");
        XWPFWordExtractor extractor = new XWPFWordExtractor(doc);
-       assertTrue(extractor.getText().contains("snoska"));
+       String text = extractor.getText();
+       assertTrue(text.contains("snoska"));
+       assertTrue(text.contains("Eto ochen prostoy[footnoteRef:1] text so snoskoy"));
@@ -190,8 +191,9 @@ public void testFormFootnotes() throws IOException {
    public void testEndnotes() throws IOException {
        XWPFDocument doc = XWPFTestDataSamples.openSampleDocument("endnotes.docx");
        XWPFWordExtractor extractor = new XWPFWordExtractor(doc);
-       assertTrue(extractor.getText().contains("XXX"));
+       String text = extractor.getText();
+       assertTrue(text.contains("XXX"));
+       assertTrue(text.contains("tilaka [endnoteRef:2]or 'tika'"));
public void testInsertedDeletedText() throws IOException {