Skip to content

Commit

Permalink
Merge branch 'Stirling-Tools:main' into main
Browse files Browse the repository at this point in the history
  • Loading branch information
jimdouk committed May 28, 2024
2 parents caa5525 + 0f4eb83 commit 52e9689
Show file tree
Hide file tree
Showing 588 changed files with 129,063 additions and 113,597 deletions.
2 changes: 1 addition & 1 deletion .github/FUNDING.yml
Original file line number Diff line number Diff line change
Expand Up @@ -10,4 +10,4 @@ liberapay: # Replace with a single Liberapay username
issuehunt: # Replace with a single IssueHunt username
otechie: # Replace with a single Otechie username
lfx_crowdfunding: # Replace with a single LFX Crowdfunding project-name e.g., cloud-foundry
custom: ['https://paypal.me/froodleplex?country.x=GB&locale.x=en_GB'] # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']
custom: ['https://www.paypal.com/donate/?hosted_button_id=MN7JPG5G6G3JL'] # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']
8 changes: 1 addition & 7 deletions .github/workflows/build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,14 +3,8 @@ name: "Build repo"
on:
push:
branches: ["main"]
paths-ignore:
- ".github/**"
- "**/*.md"
pull_request:
branches: ["main"]
paths-ignore:
- ".github/**"
- "**/*.md"

jobs:
build:
Expand All @@ -36,7 +30,7 @@ jobs:

- uses: gradle/actions/setup-gradle@v3
with:
gradle-version: 7.6
gradle-version: 8.7

- name: Build with Gradle
run: ./gradlew build --no-build-cache
2 changes: 1 addition & 1 deletion .github/workflows/push-docker.yml
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ jobs:

- uses: gradle/actions/setup-gradle@v3
with:
gradle-version: 7.6
gradle-version: 8.7

- name: Run Gradle Command
run: ./gradlew clean build
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/releaseArtifacts.yml
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ jobs:

- uses: gradle/actions/setup-gradle@v3
with:
gradle-version: 7.6
gradle-version: 8.7

- name: Generate jar (With Security=${{ matrix.enable_security }})
run: ./gradlew clean createExe
Expand Down
9 changes: 9 additions & 0 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,15 @@ jobs:
sudo curl -SL "https://github.com/docker/compose/releases/download/v2.26.1/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
# sudo chmod +x /usr/local/bin/docker-compose
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.7"

- name: Pip requirements
run: |
pip install -r ./cucumber/requirements.txt
- name: Run Docker Compose Tests
run: |
chmod +x ./test.sh
Expand Down
5 changes: 4 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -124,4 +124,7 @@ watchedFolders/

# Ignore Mac DS_Store files
.DS_Store
**/.DS_Store
**/.DS_Store

#cucumber
/cucumber/reports/**
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ If you would like to add or modify a translation, please see [How to add new lan

## Docs

Documentation for Stirling-PDF is handled in a seperate repository. Please see [Docs repository](https://github.com/Stirling-Tools/Stirling-Tools.github.io) or use "edit this page"-button at the bottom of each page at [https://stirlingtools.com/docs/](https://stirlingtools.com/docs/).
Documentation for Stirling-PDF is handled in a separate repository. Please see [Docs repository](https://github.com/Stirling-Tools/Stirling-Tools.github.io) or use "edit this page"-button at the bottom of each page at [https://stirlingtools.com/docs/](https://stirlingtools.com/docs/).

## Fixing Bugs or Adding a New Feature

Expand Down
27 changes: 12 additions & 15 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -1,44 +1,42 @@
# Main stage
FROM alpine:20240329
FROM alpine:3.20.0

# Copy necessary files
COPY scripts /scripts
COPY pipeline /pipeline
COPY src/main/resources/static/fonts/*.ttf /usr/share/fonts/opentype/noto/
COPY src/main/resources/static/fonts/*.otf /usr/share/fonts/opentype/noto/
#COPY src/main/resources/static/fonts/*.otf /usr/share/fonts/opentype/noto/
COPY build/libs/*.jar app.jar

ARG VERSION_TAG


# Set Environment Variables
ENV DOCKER_ENABLE_SECURITY=false \
VERSION_TAG=$VERSION_TAG \
JAVA_TOOL_OPTIONS="$JAVA_TOOL_OPTIONS -XX:MaxRAMPercentage=75" \
HOME=/home/stirlingpdfuser \
PUID=1000 \
HOME=/home/stirlingpdfuser \
PUID=1000 \
PGID=1000 \
UMASK=022


# JDK for app
RUN echo "@testing https://dl-cdn.alpinelinux.org/alpine/edge/main" | tee -a /etc/apk/repositories && \
echo "@testing https://dl-cdn.alpinelinux.org/alpine/edge/community" | tee -a /etc/apk/repositories && \
echo "@testing https://dl-cdn.alpinelinux.org/alpine/edge/testing" | tee -a /etc/apk/repositories && \
apk update && \
apk upgrade --no-cache -a && \
apk add --no-cache \
ca-certificates \
tzdata \
tini \
openssl \
openssl-dev \
bash \
curl \
openjdk17-jre \
su-exec \
shadow \
su-exec \
openssl \
openssl-dev \
openjdk21-jre \
# Doc conversion
libreoffice@testing \
libreoffice \
# pdftohtml
poppler-utils \
# OCR MY PDF (unpaper for descew and other advanced featues)
Expand All @@ -60,10 +58,9 @@ openssl-dev \
addgroup -S stirlingpdfgroup && adduser -S stirlingpdfuser -G stirlingpdfgroup && \
chown -R stirlingpdfuser:stirlingpdfgroup $HOME /scripts /usr/share/fonts/opentype/noto /configs /customFiles /pipeline && \
chown stirlingpdfuser:stirlingpdfgroup /app.jar && \
tesseract --list-langs && \
rm -rf /var/cache/apk/*
tesseract --list-langs

EXPOSE 8080
EXPOSE 8080/tcp

# Set user and run command
ENTRYPOINT ["tini", "--", "/scripts/init.sh"]
Expand Down
26 changes: 12 additions & 14 deletions Dockerfile-ultra-lite
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# use alpine
FROM alpine:3.19.1
FROM alpine:3.20.0

ARG VERSION_TAG

Expand All @@ -8,7 +8,7 @@ ENV DOCKER_ENABLE_SECURITY=false \
HOME=/home/stirlingpdfuser \
VERSION_TAG=$VERSION_TAG \
JAVA_TOOL_OPTIONS="$JAVA_TOOL_OPTIONS -XX:MaxRAMPercentage=75" \
PUID=1000 \
PUID=1000 \
PGID=1000 \
UMASK=022

Expand All @@ -18,34 +18,32 @@ COPY scripts/init-without-ocr.sh /scripts/init-without-ocr.sh
COPY pipeline /pipeline
COPY build/libs/*.jar app.jar


# Set up necessary directories and permissions

RUN mkdir /configs /logs /customFiles && \
chmod +x /scripts/*.sh && \
RUN echo "@testing https://dl-cdn.alpinelinux.org/alpine/edge/main" | tee -a /etc/apk/repositories && \
echo "@testing https://dl-cdn.alpinelinux.org/alpine/edge/community" | tee -a /etc/apk/repositories && \
echo "@testing https://dl-cdn.alpinelinux.org/alpine/edge/testing" | tee -a /etc/apk/repositories && \
apk upgrade --no-cache -a && \
apk add --no-cache \
ca-certificates \
tzdata \
tini \
bash \
curl \
su-exec \
shadow \
openjdk17-jre && \
echo "@testing https://dl-cdn.alpinelinux.org/alpine/edge/main" | tee -a /etc/apk/repositories && \
echo "@testing https://dl-cdn.alpinelinux.org/alpine/edge/community" | tee -a /etc/apk/repositories && \
echo "@testing https://dl-cdn.alpinelinux.org/alpine/edge/testing" | tee -a /etc/apk/repositories && \
su-exec \
openjdk21-jre && \
# User permissions
mkdir /configs /logs /customFiles && \
chmod +x /scripts/*.sh && \
addgroup -S stirlingpdfgroup && adduser -S stirlingpdfuser -G stirlingpdfgroup && \
chown -R stirlingpdfuser:stirlingpdfgroup $HOME /scripts /configs /customFiles /pipeline && \
chown stirlingpdfuser:stirlingpdfgroup /app.jar

# Set environment variables
ENV ENDPOINTS_GROUPS_TO_REMOVE=CLI

EXPOSE 8080

ENTRYPOINT ["tini", "--", "/scripts/init-without-ocr.sh"]
EXPOSE 8080/tcp

# Run the application
ENTRYPOINT ["tini", "--", "/scripts/init-without-ocr.sh"]
CMD ["java", "-Dfile.encoding=UTF-8", "-jar", "/app.jar"]
80 changes: 71 additions & 9 deletions LocalRunGuide.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ You could theoretically use a Distrobox/Toolbox, if your Distribution has old or

Install the following software, if not already installed:

- Java 17 or later
- Java 17 or later (21 recommended)

- Gradle 7.0 or later (included within repo so not needed on server)

Expand Down Expand Up @@ -42,17 +42,25 @@ For Debian-based systems, you can use the following command:

```bash
sudo apt-get update
sudo apt-get install -y git automake autoconf libtool libleptonica-dev pkg-config zlib1g-dev make g++ openjdk-17-jdk python3 python3-pip
sudo apt-get install -y git automake autoconf libtool libleptonica-dev pkg-config zlib1g-dev make g++ openjdk-21-jdk python3 python3-pip
```

For Fedora-based systems use this command:

```bash
sudo dnf install -y git automake autoconf libtool leptonica-devel pkg-config zlib-devel make gcc-c++ java-17-openjdk python3 python3-pip
sudo dnf install -y git automake autoconf libtool leptonica-devel pkg-config zlib-devel make gcc-c++ java-21-openjdk python3 python3-pip
```

For non-root users with Nix Package Manager, use the following command:
```bash
nix-channel --update
nix-env -iA nixpkgs.jdk21 nixpkgs.git nixpkgs.python38 nixpkgs.gnumake nixpkgs.libgcc nixpkgs.automake nixpkgs.autoconf nixpkgs.libtool nixpkgs.pkg-config nixpkgs.zlib nixpkgs.leptonica
```

### Step 2: Clone and Build jbig2enc (Only required for certain OCR functionality)

For Debian and Fedora, you can build it from source using the following commands:

```bash
mkdir ~/.git
cd ~/.git &&\
Expand All @@ -64,6 +72,11 @@ make &&\
sudo make install
```

For Nix, you will face `Leptonica not detected`. Bypass this by installing it directly using the following command:
```bash
nix-env -iA nixpkgs.jbig2enc
```

### Step 3: Install Additional Software
Next we need to install LibreOffice for conversions, ocrmypdf for OCR, and opencv for pattern recognition functionality.

Expand Down Expand Up @@ -105,6 +118,13 @@ sudo dnf install -y libreoffice-writer libreoffice-calc libreoffice-impress unpa
pip3 install uno opencv-python-headless unoconv pngquant WeasyPrint
```

For Nix:

```bash
nix-env -iA nixpkgs.unpaper nixpkgs.libreoffice nixpkgs.ocrmypdf nixpkgs.poppler_utils
pip3 install uno opencv-python-headless unoconv pngquant WeasyPrint
```

### Step 4: Clone and Build Stirling-PDF

```bash
Expand All @@ -115,33 +135,38 @@ chmod +x ./gradlew &&\
./gradlew build
```


### Step 5: Move jar to desired location

After the build process, a `.jar` file will be generated in the `build/libs` directory.
You can move this file to a desired location, for example, `/opt/Stirling-PDF/`.
You must also move the Script folder within the Stirling-PDF repo that you have downloaded to this directory.
This folder is required for the python scripts using OpenCV
This folder is required for the python scripts using OpenCV.

```bash
sudo mkdir /opt/Stirling-PDF &&\
sudo mv ./build/libs/Stirling-PDF-*.jar /opt/Stirling-PDF/ &&\
sudo mv scripts /opt/Stirling-PDF/ &&\
echo "Scripts installed."
```

For non-root users, you can just keep the jar in the main directory of Stirling-PDF using the following command:
```bash
mv ./build/libs/Stirling-PDF-*.jar ./Stirling-PDF-*.jar
```

### Step 6: Other files
#### OCR
If you plan to use the OCR (Optical Character Recognition) functionality, you might need to install language packs for Tesseract if running non-english scanning.

##### Installing Language Packs
Easiest is to use the langpacks provided by your repositories. Skip the other steps
Easiest is to use the langpacks provided by your repositories. Skip the other steps.

Manual:

1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tessdata`
3.
Please view [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html) for more info.
3. Please view [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html) for more info.

**IMPORTANT:** DO NOT REMOVE EXISTING `eng.traineddata`, IT'S REQUIRED.

Debian based systems, install languages with this command:
Expand Down Expand Up @@ -171,14 +196,38 @@ dnf search -C tesseract-langpack-
rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
```

Nix:

```bash
nix-env -iA nixpkgs.tesseract
```

**Note:** Nix Package Manager pre-installs almost all the language packs when tesseract is installed.

### Step 7: Run Stirling-PDF

Those who have pushed to the root directory, run the following commands:

```bash
./gradlew bootRun
or
java -jar /opt/Stirling-PDF/Stirling-PDF-*.jar
```

Since libreoffice, soffice, and conversion tools have their dbus_tmp_dir set as `dbus_tmp_dir="/run/user/$(id -u)/libreoffice-dbus"`, you might get the following error when using their endpoints:
```
[Thread-7] INFO s.s.SPDF.utils.ProcessExecutor - mkdir: cannot create directory ‘/run/user/1501’: Permission denied
```
To resolve this, before starting the Stirling-PDF, you have to set the environment variable to a directory you have write access to by using the following commands:

```bash
mkdir temp
export DBUS_SESSION_BUS_ADDRESS="unix:path=./temp"
./gradlew bootRun
or
java -jar ./Stirling-PDF-*.jar
```

### Step 8: Adding a Desktop icon

This will add a modified Appstarter to your Appmenu.
Expand All @@ -202,7 +251,19 @@ EOF

Note: Currently the app will run in the background until manually closed.

### Optional: Run Stirling-PDF as a service
### Optional: Changing the host and port of the application:

To override the default configuration, you can add the following to `/.git/Stirling-PDF/configs/custom_settings.yml` file:

```bash
server:
host: 0.0.0.0
port: 3000
```

**Note:** This file is created after the first application launch. To have it before that, you can create the directory and add the file yourself.

### Optional: Run Stirling-PDF as a service (requires root).

First create a .env file, where you can store environment variables:
```
Expand Down Expand Up @@ -239,6 +300,7 @@ WantedBy=multi-user.target
```

Notify systemd that it has to rebuild its internal service database (you have to run this command every time you make a change in the service file):

```
sudo systemctl daemon-reload
```
Expand Down
Loading

0 comments on commit 52e9689

Please sign in to comment.