hahongchul/sparrow

 
 


Sparrow - Data extraction from documents with ML

Description

Sparrow helps to extract and process data from scanned documents and pictures. It works with forms, invoices, receipts and other structured data.

It implements several data extraction methods. Version 1 of Sparrow focused on the Donut ML model. Version 2 focuses on on-premise LLMs.

Modules

  • sparrow-data - Sparrow Data module. Data for Donut ML model fine-tuning and OCR services
  • sparrow-ml - Sparrow ML module. Donut ML model fine-tuning and LLM RAG services
  • sparrow-ui - Sparrow UI module. UI for Donut ML model services and dashboard UI for LLM RAG

Inference with Donut ML model

Inference Results

Author

Katana ML, Andrej Baranovskij

License

Licensed under the Apache License, Version 2.0. Copyright 2020-2023 Katana ML, Andrej Baranovskij. Copy of the license.

