12 docs site (#13)
* docs: setup documentation site using mkdocs

* docs: add readme for docs package

* docs: add vs code config for mkdocs-material syntax support

* docs: update readmes to reflect docs site

* ci: add action to deploy docs site

* refactor: update docs site for custom url
ae9is committed Jan 26, 2024
1 parent b07cb30 commit a910461
Showing 16 changed files with 1,901 additions and 136 deletions.
52 changes: 52 additions & 0 deletions .github/workflows/pages.yml
@@ -0,0 +1,52 @@
# Deploy mkdocs site to GitHub Pages
name: pages

on:
  # Runs on pushes targeting the default branch
  push:
    branches:
      - main
    # Only run if docs or self updated
    paths:
      - packages/docs/**
      - .github/workflows/pages.yml

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
  contents: write
  pages: write
  id-token: write

# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
concurrency:
  group: "pages"
  cancel-in-progress: false

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Configure Git Credentials
        run: |
          git config user.name github-actions[bot]
          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: 3.11
          architecture: 'x64'
      - name: Install Dependencies
        run: |
          cd packages/docs
          pip install pdm
          pdm install --frozen-lockfile --production
      - name: Build Docs and Deploy
        run: |
          cd packages/docs
          pdm run deploy-pages
1 change: 1 addition & 0 deletions .vscode/extensions.json
@@ -3,5 +3,6 @@
"charliermarsh.ruff",
"ms-python.python",
"ms-python.vscode-pylance",
"redhat.vscode-yaml",
],
}
12 changes: 12 additions & 0 deletions .vscode/settings.json
@@ -16,4 +16,16 @@
"python.analysis.fixAll": [
"source.unusedImports"
],
// ref: https://squidfunk.github.io/mkdocs-material/creating-your-site/#minimal-configuration
"yaml.schemas": {
"https://squidfunk.github.io/mkdocs-material/schema.json": "mkdocs.yml"
},
"yaml.customTags": [
"!ENV scalar",
"!ENV sequence",
"!relative scalar",
"tag:yaml.org,2002:python/name:material.extensions.emoji.to_svg",
"tag:yaml.org,2002:python/name:material.extensions.emoji.twemoji",
"tag:yaml.org,2002:python/name:pymdownx.superfences.fence_code_format"
]
}
45 changes: 42 additions & 3 deletions README.md
@@ -1,4 +1,43 @@
# ezsam
Extract foreground from images or video via text prompt
# ezsam (easy segment anything model)

See the command line tool readme at https://github.com/ae9is/ezsam/tree/main/packages/cli
A tool to segment images and video via text prompts.

Input images and videos, describe the subjects or objects you want to keep, and output new images and videos with the background removed.

**Check out the docs! [ezsam.org](https://www.ezsam.org)**

## Why?

Meta's [Segment Anything](https://github.com/facebookresearch/segment-anything) is a powerful tool for separating parts of images,
but it requires coordinate prompts (either bounding boxes or points),
and manual prompt generation is tedious for large collections of still images or video.

In contrast, a text-based prompt describing the object(s) to segment as foreground can stay constant across inputs.
Inspired by [Grounded-Segment-Anything](https://github.com/IDEA-Research/Grounded-Segment-Anything),
this project tries to package a simpler-to-use tool.

If you're not interested in text-based prompts with Segment Anything,
check out [rembg](https://github.com/danielgatis/rembg).

## How does it work?

The foreground is selected using text prompts to [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO) to detect objects.
Image segments are generated using [Segment Anything](https://github.com/facebookresearch/segment-anything)
or [Segment Anything HQ (SAM-HQ)](https://github.com/SysCV/SAM-HQ).
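
The detect-then-segment flow can be sketched in pure Python. Note that `detect_objects` and `segment_box` below are hypothetical stand-ins for the GroundingDINO and SAM calls, stubbed with fixed data for illustration; they are not ezsam's actual internals:

```python
# Hypothetical sketch of the two-stage pipeline: a text prompt yields
# candidate boxes (GroundingDINO), and each kept box yields a mask (SAM).

def detect_objects(image, prompt):
    # Stand-in for GroundingDINO: text prompt -> (box, confidence) pairs.
    # Stubbed with fixed data for illustration.
    return [((10, 10, 50, 50), 0.72), ((60, 20, 90, 80), 0.31)]

def segment_box(image, box):
    # Stand-in for SAM / SAM-HQ: bounding box -> pixel-accurate mask.
    return {"box": box, "mask": None}

def extract_foreground(image, prompt, box_threshold=0.35):
    # Keep only detections above the box threshold, then segment each one.
    boxes = [box for box, score in detect_objects(image, prompt) if score >= box_threshold]
    return [segment_box(image, box) for box in boxes]

segments = extract_foreground(image=None, prompt="animal")
print(len(segments))  # -> 1 (only the 0.72-confidence box clears the threshold)
```

Raising the threshold discards lower-confidence detections, which is how the tool's object detection box threshold tunes what counts as foreground.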

## Quick start

```bash
# Ubuntu 22.04, Python 3.9 - 3.11
pip install ezsam
sudo apt install ffmpeg imagemagick
ezsam --help
```

For more detailed info, see the documentation site here: [ezsam.org](https://www.ezsam.org)

## Monorepo structure

This repository collocates the following packages:
- [cli](packages/cli): the ezsam command line tool
- [docs](packages/docs): a static documentation site
136 changes: 3 additions & 133 deletions packages/cli/README.md
@@ -1,135 +1,5 @@
# ezsam (easy segment anything model)
# ezsam/cli

A pipeline to extract foreground from images or video via text prompts.
A command line tool to extract foreground from images or video via text prompts.

## Why?

Meta's Segment Anything is a powerful tool for separating parts of images,
but it requires coordinate prompts (either bounding boxes or points).
Manual prompt generation is tedious for large collections of still images or video.
In contrast, a text-based prompt describing the object(s) to segment as foreground can stay constant across inputs.
Inspired by [Grounded-Segment-Anything](https://github.com/IDEA-Research/Grounded-Segment-Anything),
this project tries to package a simpler-to-use tool.

If you're not interested in text-based prompts with Segment Anything,
check out [rembg](https://github.com/danielgatis/rembg).

## How does it work?

The foreground is selected using text prompts to [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO) to detect objects.
Image segments are generated using [Segment Anything](https://github.com/facebookresearch/segment-anything)
or [Segment Anything HQ (SAM-HQ)](https://github.com/SysCV/SAM-HQ).

## Installation

```bash
pip install ezsam
```

For video output, you need FFmpeg installed and available on your `$PATH` as `ffmpeg` for
all encoding options except GIF. GIF output requires ImageMagick; `convert` must be available on your `$PATH`.

```bash
# Examples will be given for apt-based Linuxes like Ubuntu, Debian...
apt install ffmpeg imagemagick
```

For a development install, see [Development](#development).

## Usage

```bash
ezsam --help
```

## Examples

Example images are sourced from [rembg](https://github.com/danielgatis/rembg/tree/main/examples) for easy comparison.

Process images, extracting the foreground specified by the prompt, to `examples/animal*.out.png`.
(For extractions, which require adding an alpha channel, the output image format is always `png`.)

```bash
ezsam examples/animal*.jpg -p animal -o examples
```
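
The `png` requirement follows from the alpha channel: removing a background means making pixels transparent, which JPEG cannot store. A toy pure-Python illustration of the idea (not the tool's actual compositing code):

```python
# Toy illustration: "removing" the background = zeroing alpha outside
# the foreground mask. PNG can store the alpha channel; JPEG cannot.

def apply_mask(pixels, mask):
    # pixels: rows of (r, g, b) tuples; mask: rows of booleans (True = foreground)
    out = []
    for row, mask_row in zip(pixels, mask):
        out.append([(r, g, b, 255 if keep else 0)
                    for (r, g, b), keep in zip(row, mask_row)])
    return out

image = [[(255, 0, 0), (0, 255, 0)]]   # one red pixel, one green pixel
mask = [[True, False]]                 # keep red, drop green
print(apply_mask(image, mask))
# -> [[(255, 0, 0, 255), (0, 255, 0, 0)]]
```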

Multiple objects can be selected as the foreground. The output image `./car-1.out.png` contains the car and the person.

```bash
ezsam examples/car-1.jpg -p car, person
```

Use debug mode to fine-tune or troubleshoot prompts. This writes output with the foreground mask and object detections
annotated over the original image. Here we write out to `test/car-3.debug.jpg`.
(Note that the original image format `jpg` is preserved in debug mode!)

```bash
ezsam examples/car-3.jpg -p white car -o test -s .debug --debug
```

The object detection box threshold parameter can be used to fine-tune which objects are selected.

```bash
ezsam examples/car-3.jpg -p white car -o test --bmin 0.45
```

Writing prompts with specificity can also help.

```bash
ezsam examples/anime-girl-2.jpg -o examples -s .debug -p girl, phone, bag, railway crossing sign post --debug
```

## Models

The tool uses [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO) for object detection.

To perform image segmentation, you can pick SAM or SAM-HQ:
* [Segment Anything](https://github.com/facebookresearch/segment-anything)
* [Segment Anything HQ (SAM-HQ)](https://github.com/SysCV/SAM-HQ)

For the best results, use the biggest model your GPU has memory for. ViT = Vision Transformer, the model type. From best/slowest to worst/fastest: ViT-H > ViT-L > ViT-B > ViT-tiny.

Note: ViT-tiny is for SAM-HQ only; you must use the `--hq` flag.

## Development

This project uses [pdm](https://github.com/pdm-project/pdm) for package management. Example installation:

```bash
pip install pipx
pipx install pdm
git clone https://github.com/ae9is/ezsam.git
cd ezsam/packages/cli
pdm install
pdm start
```

Pre-commit is used for some commit hooks:
```bash
pip install pre-commit
pre-commit install
```

## GPU memory troubleshooting

If you *always* get an error stating "CUDA out of memory", try using a smaller Segment Anything model (vit_tiny, vit_b) or lower-resolution (or fewer) inputs.

If you only get a CUDA OOM error occasionally, or after a while, try freeing up memory by closing processes using the GPU:
```bash
# List commands using nvidia gpu
fuser -v /dev/nvidia*
```

You can also try manually getting the GPU to clear some processes:
```bash
# Clears all processes accounted so far
sudo nvidia-smi -caa
```

If you are using multiple GPUs, and so the GPU you're running CUDA on isn't driving your displays, you can also reset the GPU using:
```bash
# Trigger reset of one or more GPUs
sudo nvidia-smi -r
```

Note: nvidia-smi is in the nvidia-utils package of [NVIDIA's CUDA repo for Ubuntu](https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_network).
Check out the docs at: [ezsam.org](https://www.ezsam.org)
29 changes: 29 additions & 0 deletions packages/docs/README.md
@@ -0,0 +1,29 @@
# ezsam/docs

Static docs site using [Material for MkDocs](https://github.com/squidfunk/mkdocs-material)

## Install

```bash
pdm install
```

## Run

```bash
pdm start
```

## Deployment

Check out the GitHub Action at [.github/workflows/pages.yml](/.github/workflows/pages.yml)

To manually deploy to your local Git repository's gh-pages branch:

```bash
pdm deploy-pages
```
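
The `deploy-pages` script is presumably defined under `[tool.pdm.scripts]` in this package's `pyproject.toml`. A sketch of what such an entry might look like; the exact commands are an assumption, though `mkdocs gh-deploy` is the standard MkDocs command for pushing a built site to a `gh-pages` branch:

```toml
# Hypothetical pyproject.toml fragment; the real script definitions may differ.
[tool.pdm.scripts]
start = "mkdocs serve"                     # local dev server with live reload
deploy-pages = "mkdocs gh-deploy --force"  # build and push to the gh-pages branch
```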

## Development

Check out the repos for [mkdocs-material](https://github.com/squidfunk/mkdocs-material) and [pdm](https://github.com/pdm-project/pdm) for examples of generating versioned docs, API docs from docstrings, and more.
73 changes: 73 additions & 0 deletions packages/docs/mkdocs.yml
@@ -0,0 +1,73 @@
site_name: ezsam
site_url: https://www.ezsam.org
site_author: ae9is
site_description: >-
  Use ezsam to extract foreground from images or video via text prompts
repo_name: ae9is/ezsam
repo_url: https://github.com/ae9is/ezsam
edit_uri: edit/main/packages/docs/src/
docs_dir: src
plugins:
  - search
markdown_extensions:
  - admonition
  - attr_list
  - toc:
      permalink: true
  - pymdownx.emoji:
      emoji_index: !!python/name:material.extensions.emoji.twemoji
      emoji_generator: !!python/name:material.extensions.emoji.to_svg
extra:
  social:
    - icon: fontawesome/brands/github
      link: https://github.com/ae9is/ezsam
  # analytics:
  #   provider: google
  #   property: !ENV GOOGLE_ANALYTICS_KEY
  #   consent:
  #     title: Would you like a free cookie? 🍪
  #     description: It's just to see how this docs site is used and potentially improve it.
  #     actions:
  #       - manage
  #       - reject
  #       - accept
#copyright: <a href="#__consent">Change cookie settings</a>
extra_css:
  - assets/extra.css
theme:
  name: material
  palette:
    - media: '(prefers-color-scheme: dark)'
      scheme: slate
      toggle:
        icon: material/weather-night
        name: Switch to light mode
    - media: '(prefers-color-scheme: light)'
      scheme: default
      toggle:
        icon: material/weather-sunny
        name: Switch to dark mode
  features:
    - content.action.edit
    - content.action.view
    - content.code.annotate
    - content.code.copy
    - content.tooltips
    - navigation.footer
    - navigation.indexes
    - navigation.tracking
    - navigation.path
    - navigation.top
    # - navigation.sections
    # - navigation.tabs
    - search.highlight
    - search.share
    - toc.follow
  icon:
    edit: material/pencil
    view: material/eye
nav:
  - About: index.md
  - install.md
  - usage.md
  - changelog.md
