Fix assisted split
Add different chain for tests
Add tests infrastructure
Some refactor
sbrunner committed Mar 31, 2019
1 parent 764c1e1 commit 4200638
Showing 35 changed files with 884 additions and 387 deletions.
17 changes: 17 additions & 0 deletions .circleci/config.yml
@@ -0,0 +1,17 @@
version: 2
jobs:
  build:
    machine: true
    steps:
      - checkout
      - run: docker build --tag=sbrunner/scan-to-paperless .
      - run: docker build --tag=tests tests
      - run: docker run --rm tests pylint process
      - run: docker run --rm tests python3 -m pyflakes .
      - run: docker run --rm tests bandit -r process
      - run: docker run --rm tests mypy --ignore-missing-imports --disallow-untyped-defs --strict-optional --follow-imports skip /opt
      - run: docker run --rm tests codespell --check-filenames --skip=./Deskew/*,*.pyc,*.png
      - run: docker run --rm --env=PYTHONPATH=/opt/ --volume=`pwd`/results:/results tests bash -c 'mv /opt/process /opt/process.py; cd /tests; pytest --verbose --color=yes .'
      - store_artifacts:
          path: results
          destination: tests-results
3 changes: 3 additions & 0 deletions .dockerignore
@@ -0,0 +1,3 @@
*
!process
!requirements.txt
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
/results/
1 change: 1 addition & 0 deletions .pylintrc
4 changes: 3 additions & 1 deletion Dockerfile
@@ -1,5 +1,7 @@
FROM ubuntu:cosmic

COPY requirements.txt /tmp/

RUN \
. /etc/os-release && \
apt-get update && \
@@ -10,7 +12,7 @@ RUN \
apt-get update && \
apt-get install --assume-yes --no-install-recommends scantailor scantailor-advanced && \
apt-get clean && \
-pip3 install pyyaml numpy scipy opencv-python && \
+pip3 install --requirement=/tmp/requirements.txt && \
rm --recursive --force /var/lib/apt/lists/* && \
curl http://galfar.vevb.net/store/deskew-125.zip > /tmp/deskew-125.zip && \
unzip /tmp/deskew-125.zip -d /opt && \
34 changes: 34 additions & 0 deletions README.md
@@ -35,6 +35,40 @@ The scan utils will rotate and reorder all the sheets to get a good document.

The option `--append-credit-card` appends all the sheets vertically so that both faces of the credit card are on the same page.

### Assisted split

1. Do your scan as usual with the extra option `--assisted-split`.

2. After the process completes its first pass, you will have images with lines and numbers.
The lines represent the detected potential splits of the image; the length of each line indicates the strength of the detection.
In your config you will have something like:

```
assisted_split:
- destinations:
  - 4 # Page number of the left part of the image
  - 1 # Same for the right part of the image
  image: image-1.png # Name of the image
  limits:
  - margin: 0 # Margin around the split
    name: 0 # Number visible on the generated image
    value: 375 # The position of the split (can be manually edited)
    vertical: true # Will split the image vertically
  - ...
  source: /source/975468/7-assisted-split/image-1.png
- ...
```

Edit your config file; you should have one more destination than limits.
If you write a destination like `2.1`, it means that it will be the first part of page 2, and `2.2` will be the second part.

3. Delete the file `REMOVE_TO_CONTINUE`.

4. After the process completes its next pass, you will have the final generated images.

5. If the result is OK, delete the file `REMOVE_TO_CONTINUE`.
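
The two rules from step 2 (one more destination than limits, and `2.1` / `2.2` meaning the first and second part of page 2) can be checked before deleting `REMOVE_TO_CONTINUE`. The following is only a minimal sketch and not part of the `process` script: the helper names `parse_destination` and `check_assisted_split` are hypothetical, and the config is assumed to be already loaded into plain Python lists and dicts.

```python
from typing import Any, Dict, List, Tuple


def parse_destination(destination: Any) -> Tuple[int, int]:
    """Return (page, part); part is 0 when the destination is a whole page.

    Hypothetical helper. YAML loads an unquoted 2.1 as a float, so the value
    is converted to a string first; a part number above 9 would need quoting.
    """
    page, _, part = str(destination).partition(".")
    return int(page), int(part) if part else 0


def check_assisted_split(entries: List[Dict[str, Any]]) -> List[str]:
    """Return the names of the images that break the destinations/limits rule."""
    errors = []
    for entry in entries:
        # N limits cut the image into N + 1 parts, each needing a destination.
        if len(entry["destinations"]) != len(entry["limits"]) + 1:
            errors.append(entry["image"])
    return errors


# Illustrative data only; image-2.png is deliberately missing a destination.
config = [
    {
        "image": "image-1.png",
        "destinations": [4, 1],
        "limits": [{"name": 0, "value": 375, "vertical": True, "margin": 0}],
    },
    {
        "image": "image-2.png",
        "destinations": ["2.1"],
        "limits": [{"name": 0, "value": 200, "vertical": True, "margin": 0}],
    },
]

print(check_assisted_split(config))  # ['image-2.png']
print(parse_destination("2.1"), parse_destination(4))  # (2, 1) (4, 0)
```

An empty list from `check_assisted_split` means every image has a destination for each part produced by its limits.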


## Install

Install in a venv in the home directory:
20 changes: 20 additions & 0 deletions post_db_create.sql
@@ -0,0 +1,20 @@
ALTER TABLE documents_correspondent ALTER COLUMN name TYPE text;
ALTER TABLE documents_correspondent ALTER COLUMN slug TYPE text;
ALTER TABLE documents_correspondent ALTER COLUMN match TYPE text;
ALTER TABLE documents_tag ALTER COLUMN name TYPE text;
ALTER TABLE documents_tag ALTER COLUMN slug TYPE text;
ALTER TABLE documents_tag ALTER COLUMN match TYPE text;
ALTER TABLE documents_document ALTER COLUMN title TYPE text;
ALTER TABLE documents_document ALTER COLUMN title TYPE text;

DROP INDEX documents_document_content_aa150741;
DROP INDEX documents_document_content_aa150741_like;

ALTER TABLE django_admin_log ALTER COLUMN object_repr TYPE text;

CREATE EXTENSION pg_trgm;
DROP INDEX documents_document_content;
CREATE INDEX documents_document_content ON documents_document USING GIN (upper(content) gin_trgm_ops);
CREATE INDEX documents_correspondent_name ON documents_correspondent USING GIN (upper(name) gin_trgm_ops);
CREATE INDEX documents_document_title ON documents_document USING GIN (upper(title) gin_trgm_ops);
CREATE INDEX documents_tag_name ON documents_tag USING GIN (upper(name) gin_trgm_ops);
