Converts scanned documents and stores them on Google Drive

scan2drive

scan2drive screenshot

scan2drive is a Raspberry Pi 3-based appliance (with a web interface) for scanning, converting and uploading physical documents to Google Drive.

During the conversion step, scan2drive skips empty pages and converts the rest from multi-megabyte JPEGs into a kilobyte-sized PDF. A PDF this small can be indexed by Google Drive’s OCR-based full-text search.

Both the originals and the converted PDF are uploaded to Google Drive, so that you can enjoy full text search but still have the full-quality originals just in case.

In comparison to the native Google Drive connectivity which some document scanner vendors provide, scan2drive has these main advantages:

  1. scan2drive integrates with the scan button of your document scanner. You press one button and your documents will end up on Google Drive. Other solutions require you to use a mobile app or software on your PC.
  2. scan2drive is self-hosted and depends only on Google Drive being available, not the scanner vendor’s cloud integration service. Many vendors send documents into their own clouds and then to Google Drive. You are welcome to archive the scan directory of scan2drive to other places you see fit, in case there are any issues with Google Drive.
  3. scan2drive converts the scanned documents into a PDF which is small enough to be full text indexed by Google Drive, but it also retains the original JPEGs in case you need them.

Project status and vision

There are currently a number of open issues, and some functionality might not work well. Use at your own risk!

The project vision is described above. Notably, scan2drive is already feature complete: we do not intend to add any features beyond those it currently has.

scan2drive was published in the hope that it could be useful to others, but the main author has no time to create an active community around it or accept contributions in a timely manner. All support, development and bug fixes are strictly best effort.

Directory structure

The scans directory (-scans_dir flag) contains the following files and directories:

  • <sub>/ is the per-user directory under which scans are placed
  • 2016-05-09-21:05:02+0200/ is a directory for an individual scan
    • page*.jpg are the raw pages obtained by calling scanimage
    • scan.pdf is the converted PDF
    • thumb.png is the first page of the converted PDF for display in the UI
    • COMPLETE.* are empty files recording which individual processing steps are done

Any file in the scans directory can be deleted at will, with the caveat that deleting scans before the COMPLETE.uploadoriginals file is present will result in that scan being irrevocably lost.

The state directory (-state_dir flag) contains the following files and directories:

  • cookies.key is a secret key with which cookies are encrypted
  • sessions/ contains session contents
  • users/ is a directory containing per-user data
  • users/<sub>/ is a directory for an individual user
    • drive_folder.json contains information about the selected destination Google Drive folder. In case this file is deleted, the user will need to re-select the destination folder and scans cannot be uploaded until a new destination folder has been selected.
    • token.json contains the offline OAuth token for accessing Google Drive on behalf of the user. In case this file is deleted, the user will need to log in again. In case this file is leaked, the user should revoke the token.

Installation

First, install gokrazy.

Then, pack the github.com/stapelberg/scan2drive Go package.

As an example, assuming your SD card is accessible as /dev/sdb:

gokr-packer -overwrite=/dev/sdb github.com/stapelberg/scan2drive

Boot your Raspberry Pi 3 from this SD card and connect the Fujitsu ScanSnap iX500 document scanner via USB (no other scanner is supported).

You should be able to access the gokrazy web interface at the URL which gokr-packer printed. To access the scan2drive web interface, switch to port 7120.
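
If you want to derive the scan2drive address programmatically, the following sketch replaces the port of a given URL with 7120 and leaves everything else untouched. The scan2driveURL helper and the host name gokrazy.example are assumptions made for illustration; use the URL that gokr-packer actually printed.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// scan2drivePort is the port of the scan2drive web interface.
const scan2drivePort = "7120"

// scan2driveURL (hypothetical helper) takes the gokrazy web interface URL
// and returns the same URL with the port switched to 7120.
func scan2driveURL(gokrazyURL string) (string, error) {
	u, err := url.Parse(gokrazyURL)
	if err != nil {
		return "", err
	}
	if u.Hostname() == "" {
		return "", fmt.Errorf("no host in %q", gokrazyURL)
	}
	// Hostname strips any existing port; JoinHostPort adds the new one
	// (and brackets for IPv6 addresses).
	u.Host = net.JoinHostPort(u.Hostname(), scan2drivePort)
	return u.String(), nil
}

func main() {
	// gokrazy.example is a placeholder host name.
	s, err := scan2driveURL("http://gokrazy.example/")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(s)
}
```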
