Support document versioning: change detection for unstructured data #17

### Description of the new feature

The parser should detect and handle different versions of the same PDF document so that it can generate high-quality metadata for a Retrieval-Augmented Generation (RAG) system. Extracting structured data from unstructured documents lets us filter results by that metadata, which should improve retrieval accuracy.
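To make the request concrete, here is a minimal sketch of what version detection could look like. It assumes the PDF text has already been extracted to plain strings, and it uses only the Python standard library (`hashlib`, `difflib`), not LangExtract. The function names, the returned fields, and the `0.6` similarity threshold are all hypothetical choices for illustration.

```python
import difflib
import hashlib
import re


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so layout-only changes are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def fingerprint(text: str) -> str:
    """Stable content hash of the normalized text."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def compare_versions(old_text: str, new_text: str,
                     same_doc_threshold: float = 0.6) -> dict:
    """Classify new_text relative to old_text.

    Returns a dict with a status ('unchanged', 'new_version' or
    'different_document'), the line-level similarity, and the changed lines.
    The threshold is an assumed value and would need tuning on real documents.
    """
    if fingerprint(old_text) == fingerprint(new_text):
        return {"status": "unchanged", "similarity": 1.0, "changes": []}

    # Compare line by line, skipping blank lines.
    old_lines = [normalize(line) for line in old_text.splitlines() if line.strip()]
    new_lines = [normalize(line) for line in new_text.splitlines() if line.strip()]
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    similarity = matcher.ratio()

    # Keep only the opcodes that describe a difference.
    changes = [
        {"op": tag, "old": old_lines[i1:i2], "new": new_lines[j1:j2]}
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    status = "new_version" if similarity >= same_doc_threshold else "different_document"
    return {"status": status, "similarity": similarity, "changes": changes}
```

The hash gives a cheap "nothing changed" check, and the line diff is only computed when the hashes differ. The `changes` list could then be stored as version metadata alongside the document.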

### Proposed technical implementation details

See this video ([link](https://www.youtube.com/watch?v=RPpGIxmdZYs&t=112s)). The example is based on LangExtract with Gemini, and with Ollama models if required. Refer to the video and the image below for more details.

<img width="1517" height="467" alt="Image" src="https://github.com/user-attachments/assets/52aac82c-71a1-42fb-acc1-613029f040f6" />
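Once version metadata exists, whether produced by LangExtract or by some other extractor, the filtering step on the retrieval side could be sketched roughly as follows. This is an illustration only: the `Chunk` type and its `doc_id` and `version` fields are hypothetical and not part of LangExtract or of this project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str   # hypothetical identifier shared by all versions of one document
    version: int  # hypothetical version number, higher means newer
    text: str


def latest_only(chunks: list[Chunk]) -> list[Chunk]:
    """Keep only the chunks that belong to the newest version of each document."""
    newest: dict[str, int] = {}
    for chunk in chunks:
        newest[chunk.doc_id] = max(chunk.version,
                                   newest.get(chunk.doc_id, chunk.version))
    # Preserve the original order of the surviving chunks.
    return [c for c in chunks if c.version == newest[c.doc_id]]
```

Applying a filter like this before ranking would keep outdated versions of a document out of the retrieved context, while still leaving them in the store for audits or explicit "as of version N" queries.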
