bug: DOCX references are not extracted properly

### Bug
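When a DOCX file is converted, the in-text citations (for example `(Hagman G 1984)`) are missing from the Markdown output, leaving only stray spaces or empty parentheses where they should appear.

Converting the PDF version of the same document preserves the citations.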

### Steps to reproduce

Run the two scripts below against the attached files (linked at the end of this issue).

#### DOCX

*Filename: docling_docx_test.py*

```python
from io import BytesIO
from pathlib import Path

from docling.datamodel.base_models import DocumentStream
from docling.document_converter import DocumentConverter

# Load the DOCX into memory and wrap it in a DocumentStream.
file = Path("Drought_Manuscript_mini.docx")
filename = file.name
buf = BytesIO(file.read_bytes())
source = DocumentStream(name=filename, stream=buf)

# Convert with default settings and print the Markdown export.
converter = DocumentConverter()
result = converter.convert(source)
doc = result.document.export_to_markdown()
print(doc)
```

*Output*

```
Drought is one of the most complex and least understood natural disasters, 
causing significant agricultural, hydrological, and socioeconomic impacts . 
Annually, about 55 million people worldwide experience droughts, posing major 
threats to livestock and crops. Droughts jeopardize livelihoods, increase 
disease and mortality risks, and prompt massive migration . By 2030, water 
scarcity will affect 40% of the global population, with up to 700 million 
people at risk of displacement due to drought . Climate change exacerbates 
these issues, leading to prolonged dry periods, unrest, and population 
movements . Recently, the severity of drought events has intensified, 
amplifying their effects on ecosystems and agriculture as a result of 
climate change consequences . Drought substantially negatively impacts 
agricultural production and income in India, with production decreasing 
by 85% and income by 93% during drought years ().
```
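One possible explanation, not confirmed in this report, is that the citations were inserted by a reference manager and are stored in the DOCX as content controls (`w:sdt`) or field codes (`w:fldSimple`, `w:instrText`) rather than as plain text runs. The following standard-library sketch counts those elements in `word/document.xml`; the helper name `count_citation_containers` and the choice of tags are illustrative assumptions, not part of Docling.

```python
import xml.etree.ElementTree as ET
import zipfile
from collections import Counter

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Assumption: elements that often wrap generated content such as citations.
CANDIDATE_TAGS = ("sdt", "fldSimple", "instrText")


def count_citation_containers(docx_file) -> Counter:
    """Count content-control and field-code elements in a DOCX file.

    `docx_file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    counts = Counter()
    for element in root.iter():
        for tag in CANDIDATE_TAGS:
            if element.tag == f"{{{W_NS}}}{tag}":
                counts[tag] += 1
    return counts
```

Calling `count_citation_containers("Drought_Manuscript_mini.docx")` and finding non-zero counts would be consistent with this hypothesis, but it would not by itself prove where the citations are lost.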

#### PDF

*Filename: docling_pdf_test.py*

```python
from io import BytesIO
from pathlib import Path

from docling.datamodel.base_models import DocumentStream
from docling.document_converter import DocumentConverter

# Load the PDF into memory and wrap it in a DocumentStream.
file = Path("Drought_Manuscript_mini.pdf")
filename = file.name
buf = BytesIO(file.read_bytes())
source = DocumentStream(name=filename, stream=buf)

# Convert with default settings and print the Markdown export.
converter = DocumentConverter()
result = converter.convert(source)
doc = result.document.export_to_markdown()
print(doc)
```

*Output*

```
Drought is one of the most complex and least understood natural disasters, 
causing significant agricultural, hydrological, and socioeconomic 
impacts (Hagman G 1984). Annually, about 55 million people worldwide 
experience droughts, posing major threats to livestock  and crops. Droughts 
jeopardize livelihoods, increase disease and mortality risks, and prompt 
massive migration (VERMA et al. 2023). By 2030, water scarcity will 
affect 40% of the global population, with up to 700 million people at risk 
of displacement due to drought (World Health Organization (WHO) 2024). 
Climate change exacerbates these issues, leading to prolonged dry periods, 
unrest, and population movements (de Bruin et al. 2018). Recently, the 
severity of drought events has intensified, amplifying their effects on 
ecosystems and agriculture as a result of climate change 
consequences (Hammouri 2022). Drought substantially   negatively   
impacts   agricultural   production   and   income   in   India,   
with   production decreasing by 85% and income by 93% during 
drought years ((Prasad et al. 2023)).
```
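To make the difference between the two outputs easy to check, here is a minimal sketch that lists the citations present in one Markdown string but absent from another. The regular expression is a heuristic chosen for the author-year style seen in this manuscript, and the helper names `find_citations` and `missing_citations` are hypothetical.

```python
import re

# Heuristic: a parenthetical ending in a four-digit year, allowing one nested
# pair of parentheses, e.g. "(World Health Organization (WHO) 2024)".
CITATION_RE = re.compile(r"\(([^()]*(?:\([^()]*\))?[^()]*\d{4})\)")


def find_citations(markdown: str) -> list[str]:
    """Return citation-like parentheticals with whitespace normalized."""
    return [" ".join(match.split()) for match in CITATION_RE.findall(markdown)]


def missing_citations(reference_md: str, candidate_md: str) -> list[str]:
    """Return citations found in reference_md but not in candidate_md."""
    found = set(find_citations(candidate_md))
    return [c for c in find_citations(reference_md) if c not in found]


# Short excerpts taken from the two outputs shown in this issue.
pdf_md = "impacts (Hagman G 1984). ... drought years ((Prasad et al. 2023))."
docx_md = "impacts . ... drought years ()."
print(missing_citations(pdf_md, docx_md))
# → ['Hagman G 1984', 'Prasad et al. 2023']
```

Passing the full strings returned by `export_to_markdown()` from both scripts in place of the excerpts gives the same kind of report for the whole document.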

### Docling version
2.28.2

### Python version
3.12

[Drought_Manuscript_mini.docx](https://github.com/user-attachments/files/19482304/Drought_Manuscript_mini.docx)
[Drought_Manuscript_mini.pdf](https://github.com/user-attachments/files/19482305/Drought_Manuscript_mini.pdf)
