GitHub - adacta-io/adacta: Personal Document Archiving

Adacta is a personal document archiving system. It lets you categorize and organize PDF documents for long-term archiving, as needed in personal document management.

Its main features are an inbox concept that lets the user review documents before archiving and a full-text search over all documents for easy retrieval. All of this is achieved while keeping a simple and documented on-disk format, which avoids vendor lock-in and even encourages the use of common filesystem utilities for mass operations on documents.
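Because documents are kept as ordinary files, standard tools can work on many of them at once. The following is only a minimal sketch: the helper names are hypothetical, and it assumes that archived documents carry a `.pdf` extension somewhere below the repository root. The actual layout is defined by Adacta's on-disk format documentation, not by this example.

```shell
#!/bin/sh
# Hypothetical helpers; the ".pdf" extension and the idea of a single
# repository root directory are assumptions, not Adacta's documented format.

# List every PDF file below the given repository root, sorted by path.
list_pdfs() {
    find "$1" -type f -iname '*.pdf' | sort
}

# Count the PDF files below the given repository root.
count_pdfs() {
    list_pdfs "$1" | wc -l | tr -d ' '
}
```

For example, `count_pdfs path/to/your/repository` prints the number of PDF files found. File names containing newlines would be miscounted by this simple approach.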

Features

Note: Not all features are implemented yet.

Adacta concentrates on the following features:

Documents, metadata and all other permanent data are stored in a document repository as ordinary files and folders using common and widely adopted file formats. The filesystem structure and document formats are well documented.
State which is internal to Adacta can be rebuilt from the document repository. The only thing requiring backup is the repository.
Uploaded PDF documents are OCRed if they do not contain extractable text. The original input document is stored alongside the final document, so the OCR process can be improved after the documents have been archived. The whole OCR process runs in a Docker container, which avoids installing a complex and hard-to-maintain OCR software stack.
Documents can be tagged. Tagging is aided by machine learning, which suggests tags based on the document's text.
Ergonomic web interface to review and browse documents.
CLI to interact with documents from the command line.
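Since the repository is the only thing that needs a backup, a plain archive of that directory should be sufficient. The sketch below illustrates this with `tar`; the function name `backup_repo` and both paths are hypothetical, and nothing here is part of Adacta itself.

```shell
#!/bin/sh
# Hypothetical helper: archive a repository directory into a compressed tarball.
# Usage: backup_repo <repository-dir> <output-file.tar.gz>
backup_repo() {
    repo=$1
    out=$2
    # Refuse to run if the repository path is not a directory.
    [ -d "$repo" ] || { echo "not a directory: $repo" >&2; return 1; }
    # -C switches to the parent directory so the archive stores relative paths.
    tar -czf "$out" -C "$(dirname "$repo")" "$(basename "$repo")"
}
```

A call such as `backup_repo path/to/your/repository backup.tar.gz` produces one file that can be copied to any backup location.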

In contrast, Adacta declares the following non-features:

No multi-user or multi-tenant support and no ACL system. There may be multiple accounts to avoid sharing a password, but all users will always see the same data.
No process management or document state / task tracking. This is pure archiving. While the inbox represents some kind of state, this state is only about the technical processing of the document.

Building

The frontend must be built before the backend. See the README.md in the respective folders for further instructions.

Running

Running Adacta requires a running Docker daemon and an Elasticsearch cluster.

After building both the frontend and the backend, start the backend by running:

./backend/target/release/adacta --config path/to/your/adacta.yaml
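Before starting the backend, it can help to confirm that both prerequisites are reachable. The following is a hedged sketch, not part of Adacta: `check_prereqs` is a hypothetical helper, and the default URL `http://localhost:9200` assumes an unauthenticated Elasticsearch node on its default port. Pass the address of your own cluster if it differs.

```shell
#!/bin/sh
# Hypothetical pre-flight check. The Elasticsearch URL is an assumption;
# pass the address of your own cluster as the first argument.
check_prereqs() {
    es_url=${1:-http://localhost:9200}
    # "docker info" fails when the Docker daemon cannot be reached.
    if ! docker info >/dev/null 2>&1; then
        echo "Docker daemon is not reachable" >&2
        return 1
    fi
    # "curl -f" fails on connection errors and on HTTP error responses.
    if ! curl -fsS "$es_url" >/dev/null 2>&1; then
        echo "Elasticsearch is not reachable at $es_url" >&2
        return 2
    fi
}
```

It can be chained with the start command, for example `check_prereqs && ./backend/target/release/adacta --config path/to/your/adacta.yaml`.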

