A simple Go server that takes a token, a command, and text, and returns a response from Gemma (Google's 2B-parameter LLM). Uses Redis to cache responses. Easy deployment with Fly.io.
Leverages Gemma CPP.
Developed for the RapidRead feature in GhostRemix.
- Go 1.22
- Make
- Air
- Tilt
Use the template to create your own repository.
- Navigate to the repository, click `Use this template`, and follow the instructions.
- Get the GitHub CLI
# Step 1: Clone the template repository
git clone https://github.com/mikab-laboratory/go-gemma.git new-project
cd new-project
# Step 2: Create a new repository on GitHub
gh repo create username/new-project --private --source=.
# Step 3: Push the cloned contents to the new repository
git push --set-upstream origin main
- Create a `.env` file from `.env.example`.
- Download Gemma from our Google Drive or Kaggle.
- Create a `libs` directory and unpack the zip content there.
- Run `tilt up` in the project root.
- Test with the command below.
curl -X POST -H "Content-Type: application/json" -d '{
"command": "Summarize this post; Reply only with the summary;",
"token": "your_token_here",
"text": "Your input text goes here..."
}' http://localhost:8081/askGemma
- Build the image and run the container with `make all`.
- Clean up the image and container with `make clean-all`.
- Create a Fly.io account.
- Authenticate with `flyctl auth login`.
- Create the app with `flyctl launch --no-deploy`.
- Navigate to the newly created application in the Fly.io dashboard and get a deploy token.
- Set secrets in the GitHub repository settings.
- Manually trigger a deployment by going to the Actions tab and selecting `Deploy`. Click `Run workflow` and enter the branch name to deploy.
  - You can update this action to trigger on push to `main` by changing the `on` section of the workflow file to `push: { branches: [main] }`.