
feat(linux): improve dev setup for Linux platform #364

Merged
Davidnet merged 7 commits into dataiku:main from s3bc40:feat/linux-dev-setup on Apr 30, 2026
Conversation

@s3bc40
Contributor

@s3bc40 s3bc40 commented Apr 25, 2026

Summary

  • Adds Linux-specific Makefile targets to streamline development environment setup
  • Updates getting started documentation to include Linux installation steps
  • Expands development guide with Linux-specific workflows and instructions

Changes

  • Makefile: added 23 lines of Linux dev setup targets/commands
  • docs/01-getting-started.md: updated with Linux platform guidance (+13/-7 lines)
  • docs/02-development-guide.md: expanded Linux development instructions (+31/-14 lines)

- Add make setup-tokenizers target that downloads the pre-built
  libtokenizers.a from GitHub Releases using the version pinned in
  go.mod, auto-detecting Linux x86_64 or macOS ARM64/x86_64 —
  no Rust/cargo required. Inspired by src/scripts/build_linux.sh.
- Fix ONNX Runtime install commands in dev guide to include Linux
  .so variants alongside the existing macOS .dylib commands
- Replace outdated tokenizer build steps with make setup-tokenizers
- Add troubleshooting entries for ruff not found and golangci-lint v2

Closes dataiku#362

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread Makefile
@hanneshapke
Collaborator

@Davidnet Do we need PLATFORM="linux-aarch64";?
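
For context on the question above, platform auto-detection of this kind can be sketched in Go as a mapping from OS and architecture to a release-asset label. This is only an illustration under stated assumptions: `platformFor` is a hypothetical helper, and apart from `linux-aarch64` (quoted in the thread) the label strings are guesses rather than the asset names the Makefile actually uses.

```go
package main

import (
	"fmt"
	"runtime"
)

// platformFor maps a Go OS/architecture pair to a release-asset platform label.
// ASSUMPTION: only "linux-aarch64" appears in the thread; the other labels are
// placeholders and the real asset names may differ.
func platformFor(goos, goarch string) (string, error) {
	switch {
	case goos == "linux" && goarch == "amd64":
		return "linux-x86_64", nil
	case goos == "linux" && goarch == "arm64":
		return "linux-aarch64", nil
	case goos == "darwin" && goarch == "arm64":
		return "darwin-aarch64", nil
	case goos == "darwin" && goarch == "amd64":
		return "darwin-x86_64", nil
	default:
		// Fail loudly instead of downloading an archive for the wrong platform.
		return "", fmt.Errorf("unsupported platform: %s/%s", goos, goarch)
	}
}

func main() {
	platform, err := platformFor(runtime.GOOS, runtime.GOARCH)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("PLATFORM =", platform)
}
```

Whether a `linux-aarch64` branch is worth adding depends on whether a pre-built archive is published for that platform, which this sketch does not check.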

@hanneshapke
Collaborator

Thank you @s3bc40 for your PR. We'll review it on Monday and get back to you.

Comment thread Makefile
Comment thread docs/01-getting-started.md Outdated
sudo systemctl status kiji-proxy
```

> **Note (Linux):** The binary resolves the data directory (`~/.kiji-proxy/`) via the home directory of the user it runs as. The `useradd -r` flag creates a system user with no home directory, which causes a startup failure. Create the home directory manually before starting the service:
Member

Actually I was thinking that maybe we could use something like a DynamicUser:

[Service]
ExecStart=/opt/kiji-privacy-proxy/kiji-proxy
DynamicUser=yes
# This automatically creates /var/lib/kiji-proxy with correct permissions
StateDirectory=kiji-proxy
# Tell your app to use that directory
Environment=KIJI_DATA_PATH=/var/lib/kiji-proxy

and then we check for paths in this order:

  • A CLI flag (e.g., --data-dir)

  • An environment variable (e.g., KIJI_DATA_PATH)

  • XDG_DATA_HOME (Defaults to ~/.local/share if unset)

  • ~/.kiji-proxy

  • then /var/lib/kiji-proxy

What do you think? My idea is to avoid creating a user folder on the system.

Contributor Author

I see! I was not aware of the DynamicUser option, and it is much better than manually creating everything in the home directory. Following the XDG convention while keeping the legacy location (~/.kiji-proxy/) would cover a lot of potential issues 👌


For the implementation, I was thinking of updating AppDataDir() in the Go backend (paths/paths.go), if that fits.

Something like this, if I understood correctly:

func AppDataDir() string {
    // macOS
    if runtime.GOOS == "darwin" {
        homeDir, _ := os.UserHomeDir()
        return filepath.Join(homeDir, "Library", "Application Support", "Kiji Privacy Proxy")
    }

    // Linux
    if p := os.Getenv("KIJI_DATA_PATH"); p != "" {
        return p
    }
    if xdg := os.Getenv("XDG_DATA_HOME"); xdg != "" {
        return filepath.Join(xdg, "kiji-proxy")
    }
    if homeDir, err := os.UserHomeDir(); err == nil {
        return filepath.Join(homeDir, ".kiji-proxy")
    }
    return "/var/lib/kiji-proxy"
}

(I'll seriously learn Go soon, to avoid any potential errors and understand the Go best practices)

@github-actions
Contributor

github-actions Bot commented Apr 28, 2026

Warning

This PR touches 3+ distinct areas of the codebase.

Consider splitting into smaller, focused PRs — each covering a single semantic type.
This makes reviews easier and keeps the git history clean.

Categories found:

docs:

  • docs/01-getting-started.md
  • docs/02-development-guide.md

code:

  • src/backend/paths/paths.go

chore:

  • Makefile
  • src/scripts/build_linux.sh

test:

  • src/backend/paths/paths_test.go

@s3bc40
Contributor Author

s3bc40 commented Apr 28, 2026

@hanneshapke & @Davidnet thanks for your feedback. I applied the following changes to make it more suitable to your needs.


go list for tokenizers version (Makefile)

Swapped the awk parse for go list -m -f '{{.Version}}' as suggested.

AppDataDir Linux priority chain (paths/paths.go)

Added the full chain: KIJI_DATA_PATH → XDG_DATA_HOME/kiji-proxy → ~/.kiji-proxy → /var/lib/kiji-proxy.
Darwin untouched.

Systemd DynamicUser (build_linux.sh)

Replaced User=kiji / Group=kiji / WorkingDirectory with

DynamicUser=yes + StateDirectory=kiji-proxy +
Environment="KIJI_DATA_PATH=/var/lib/kiji-proxy"

systemd handles directory creation and ownership at
/var/lib/kiji-proxy, and the priority chain picks it up via KIJI_DATA_PATH.

Docs (docs/01-getting-started.md)

Removed the useradd -r step and the workaround note.


Tested locally on Pop!_OS:

  • Unit tests: all 4 priority cases pass (go test ./src/backend/paths/... -v)
  • Binary smoke test: verified startup log Using SQLite database at <path> for each priority step including
    KIJI_DATA_PATH=/var/lib/kiji-proxy (when possible since $HOME is always set...)
  • bash src/scripts/build_linux.sh → extracted tarball, confirmed service file matches expected output
  • ./run.sh on the packaged binary → correct path resolved at startup

Any feedback is welcome! I'll continue to set up and test locally to get used to the project and help as much as possible on the Linux side or anywhere else.

@Davidnet
Member

Hi @s3bc40, I will be starting the review now.

Comment thread docs/02-development-guide.md Outdated
@@ -104,9 +102,13 @@ source .venv/bin/activate
# Install ONNX Runtime
pip install onnxruntime
Member

Would you mind doing something like:

GO_VER=$(go list -m -f '{{.Version}}' github.com/yalue/onnxruntime_go | sed 's/^v//') && \
pip install "onnxruntime==${GO_VER}"

just to have the versions synced

Member

I will sync the macOS commands in another branch.

Contributor Author

I tried to run the command, but the PyPI version of onnxruntime and the version of the Go module differ.

Right now on PyPI it is v1.25.1 -> https://pypi.org/project/onnxruntime/

But the Go module version resolves to:

-> % echo $GO_VER
1.27.0

Which triggers the following mismatch:

-> % GO_VER=$(go list -m -f '{{.Version}}' github.com/yalue/onnxruntime_go | sed 's/^v//') && \
uv pip install "onnxruntime==${GO_VER}"
  × No solution found when resolving dependencies:
  ╰─▶ Because there is no version of onnxruntime==1.27.0 and you require onnxruntime==1.27.0, we can
      conclude that your requirements are unsatisfiable.

Do you want me to run another command or fix anything on my end? Thanks for your review 👍
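
The mismatch can be reproduced without pip. The Go sketch below mirrors the `sed 's/^v//'` step and checks the resulting requirement against a list of released versions. It is a sketch under assumptions: `pipRequirement` and `isPublished` are hypothetical helpers, and the `published` list contains only the PyPI version quoted in this comment rather than data fetched from PyPI.

```go
package main

import (
	"fmt"
	"strings"
)

// pipRequirement turns a Go module version such as "v1.27.0" into a pip
// requirement string, mirroring the `sed 's/^v//'` step of the shell command.
func pipRequirement(moduleVersion string) string {
	return "onnxruntime==" + strings.TrimPrefix(moduleVersion, "v")
}

// isPublished reports whether the module version, stripped of its "v" prefix,
// appears in the list of released package versions.
func isPublished(moduleVersion string, published []string) bool {
	want := strings.TrimPrefix(moduleVersion, "v")
	for _, v := range published {
		if v == want {
			return true
		}
	}
	return false
}

func main() {
	// ASSUMPTION: illustrative list taken from the thread, not queried from PyPI.
	published := []string{"1.25.1"}

	for _, moduleVersion := range []string{"v1.27.0", "v1.25.1"} {
		fmt.Printf("%s -> %s (published: %t)\n",
			moduleVersion, pipRequirement(moduleVersion), isPublished(moduleVersion, published))
	}
}
```

The point of the sketch is that the Go module's own version number is not guaranteed to match any released Python package version, so deriving one from the other can produce an unsatisfiable requirement.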

Member

Yeah, it seems that the author is suggesting that we hardcode the version:

pkg.go.dev

maybe then:

uv pip install "onnxruntime==1.25.0"

and we leave a link to the above discussion. Let me know, and then we can merge this.

Thanks for the PR.

Contributor Author

@Davidnet Updated as requested! I set the version to v1.25.0 in each command and left a note about the ONNX Runtime library, with the link you sent me and this discussion for context.

@Davidnet Davidnet merged commit 74fb181 into dataiku:main Apr 30, 2026
5 of 6 checks passed
3 participants