A simple Go server that accepts a token, a command, and text, and returns a response from Gemma (Google's 2B-parameter LLM). It uses Redis to cache responses. Easy deployment with Fly.io.

Leverages Gemma CPP.

Developed for the RapidRead feature in GhostRemix.
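The caching behaviour described above can be pictured as a cache-aside lookup: check the cache first and call the model only on a miss. The following is a minimal sketch under stated assumptions, not the server's actual code. The Cache interface, memoryCache, cacheKey, and answer are hypothetical names, the in-memory map stands in for Redis so the example runs without external services, and keying on the command and text (rather than the token) is an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Cache is a hypothetical stand-in for the Redis client used by the server.
type Cache interface {
	Get(key string) (string, bool)
	Set(key, value string)
}

// memoryCache is an in-memory Cache so the sketch runs without Redis.
type memoryCache map[string]string

func (m memoryCache) Get(key string) (string, bool) { v, ok := m[key]; return v, ok }
func (m memoryCache) Set(key, value string)         { m[key] = value }

// cacheKey hashes the command and text so identical requests share one key.
// The zero byte separates the fields so ("ab", "c") and ("a", "bc") differ.
func cacheKey(command, text string) string {
	h := sha256.New()
	h.Write([]byte(command))
	h.Write([]byte{0})
	h.Write([]byte(text))
	return "gemma:" + hex.EncodeToString(h.Sum(nil))
}

// answer returns a cached response when one exists; otherwise it calls
// generate (the model) and stores the result before returning it.
func answer(c Cache, command, text string, generate func(command, text string) (string, error)) (string, error) {
	key := cacheKey(command, text)
	if v, ok := c.Get(key); ok {
		return v, nil
	}
	v, err := generate(command, text)
	if err != nil {
		return "", err
	}
	c.Set(key, v)
	return v, nil
}

func main() {
	calls := 0
	// fakeModel stands in for the Gemma call and counts its invocations.
	fakeModel := func(command, text string) (string, error) {
		calls++
		return "summary of: " + text, nil
	}

	cache := memoryCache{}
	for i := 0; i < 2; i++ {
		out, err := answer(cache, "Summarize this post;", "hello", fakeModel)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(out)
	}
	fmt.Println("model calls:", calls) // prints "model calls: 1"
}
```

The second request is served from the cache, so the model function runs only once. Errors are deliberately not cached, so a failed generation is retried on the next request.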


Local Development

Prerequisites


  • Go 1.22
  • Make
  • Air
  • Tilt

Get the code

Use the template to create your own repository.

GitHub UI

  • Navigate to the repository, click Use this template, and follow the instructions.

GitHub CLI

# Step 1: Clone the template repository

git clone <template-repository-url> new-project

cd new-project

# Step 2: Create a new repository on GitHub

gh repo create username/new-project --private --source=.

# Step 3: Push the cloned contents to the new repository

git push --set-upstream origin main


  1. Create a .env file from .env.example.
  2. Download Gemma from our Google Drive or Kaggle.
  3. Create a libs directory and unpack the zip contents there.
  4. Run tilt up in the project root.
  5. Test with the command below.
curl -X POST -H "Content-Type: application/json" -d '{
  "command": "Summarize this post; Reply only with the summary;",
  "token": "your_token_here",
  "text": "Your input text goes here..."
}' http://localhost:8081/askGemma

Test Docker Build

  1. Build image and run container with make all.

  2. Clean image and container with make clean-all.

Deploy to Fly.io


  1. Create account.

  2. Authenticate with flyctl auth login.

  3. Create app with flyctl launch --no-deploy.

GitHub Actions

  1. Navigate to the newly created application in the dashboard and get a deploy token.

  2. Set secrets in GitHub repository settings.

  3. Trigger a deployment manually: open the Actions tab, select the Deploy workflow, click Run workflow, and enter the name of the branch to deploy.

    • To deploy automatically on every push to main, change the on section of the workflow file to a push trigger with branches: [main].