Digests API

A high-performance RSS/Atom feed parser and enrichment service, available both as an HTTP API and a Go library.

Features

  • 🚀 Fast concurrent feed parsing - Parse multiple feeds in parallel
  • 🎨 Automatic color extraction - Extract dominant colors from images
  • 📰 Metadata enrichment - Extract og:image, descriptions, and more from articles
  • 🎙️ Podcast support - Full support for podcast feeds with iTunes extensions
  • 💾 Flexible caching - In-memory or SQLite cache options
  • 📚 Library mode - Use as a standalone Go library without HTTP dependencies
  • 🔍 Feed discovery - Search for RSS feeds by keyword
  • 🔗 URL sharing - Create shareable collections of URLs

Architecture

The project follows a streamlined architecture with only the essential components:

src/
├── core/               # Business logic
│   ├── domain/         # Domain models (Feed, Item, etc.)
│   ├── feed/           # Feed parsing service
│   ├── services/       # Simple worker pool & enrichment
│   └── errors/         # Simplified error handling
├── api/                # HTTP API layer
│   ├── handlers/       # HTTP handlers
│   └── middleware/     # Rate limiting, logging
├── infrastructure/     # External implementations
│   ├── cache/memory/   # In-memory cache only
│   ├── database/sqlite/ # Simple SQLite client
│   └── logger/         # Basic logging
└── digests-lib/        # Go library interface
    ├── client.go       # Main client API
    └── types.go        # Public types

What's Gone: Redis, complex caching layers, distributed features, heavy middleware, complex worker pools.

Requirements

  • Go 1.21+
  • C compiler (for CGO/SQLite) - TDM-GCC on Windows

Installation

As an HTTP API

# Clone the repository
git clone https://github.com/BumpyClock/digests-api.git
cd digests-api

# Install dependencies
cd src && go mod download

# Run the API
go run cmd/api/main.go

# Or build and run
go build -o digests-api cmd/api/main.go
./digests-api

As a Go Library

go get github.com/BumpyClock/digests-api/digests-lib

Configuration

API Configuration

The API is configured through 15 environment variables:

  • PORT - HTTP port (default: 8080)
  • LOG_LEVEL - Logging level (default: info)
  • DATABASE_PATH - SQLite database path (default: ./data/digests.db)
  • CACHE_TTL - General cache TTL (default: 1h)
  • FEED_CACHE_TTL - Feed cache TTL (default: 30m)
  • METADATA_CACHE_TTL - Metadata cache TTL (default: 24h)
  • RATE_LIMIT_REQUESTS - Rate limit per window (default: 100)
  • RATE_LIMIT_WINDOW - Rate limit window (default: 1m)
  • READ_TIMEOUT - Request read timeout (default: 30s)
  • WRITE_TIMEOUT - Response write timeout (default: 30s)
  • SHUTDOWN_TIMEOUT - Graceful shutdown timeout (default: 5s)
  • WORKER_POOL_SIZE - Worker pool size (default: 10)
  • MAX_REQUEST_SIZE - Max request body size (default: 10MB)
  • FEED_REFRESH_INTERVAL - Background refresh interval (default: 1h)
  • CORS_ORIGINS - CORS allowed origins (default: *)
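
As an illustration of how defaults like these are typically applied, here is a minimal sketch that reads three of the variables above and falls back to the documented default when a value is missing or invalid. The `Config` struct and `loadConfig` function are hypothetical names for this example and are not taken from the project's source.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config holds a subset of the settings listed above (hypothetical type).
type Config struct {
	Port           string
	FeedCacheTTL   time.Duration
	WorkerPoolSize int
}

// loadConfig starts from the documented defaults and overrides each field
// only when getenv returns a usable value. Passing the lookup function in
// keeps the logic testable without touching the real environment.
func loadConfig(getenv func(string) string) Config {
	cfg := Config{Port: "8080", FeedCacheTTL: 30 * time.Minute, WorkerPoolSize: 10}
	if v := getenv("PORT"); v != "" {
		cfg.Port = v
	}
	// time.ParseDuration accepts values such as "30m" or "1h".
	if d, err := time.ParseDuration(getenv("FEED_CACHE_TTL")); err == nil {
		cfg.FeedCacheTTL = d
	}
	if n, err := strconv.Atoi(getenv("WORKER_POOL_SIZE")); err == nil && n > 0 {
		cfg.WorkerPoolSize = n
	}
	return cfg
}

func main() {
	cfg := loadConfig(os.Getenv)
	fmt.Printf("port=%s feed_ttl=%s workers=%d\n", cfg.Port, cfg.FeedCacheTTL, cfg.WorkerPoolSize)
}
```

Invalid values fall back silently in this sketch. A real service may prefer to log or reject them at startup.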

Library Configuration

client, err := digests.NewClient(
    // Basic configuration only
    digests.WithDatabasePath("./feeds.db"),
    digests.WithTimeout(30 * time.Second),
    digests.WithUserAgent("MyApp/1.0"),
)

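API Usage

The examples below use port 8000, while the default PORT is 8080. Change the port in the URLs or set PORT=8000 to match your configuration.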

Parse Multiple Feeds

curl -X POST http://localhost:8000/parse \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [
      "https://news.ycombinator.com/rss",
      "https://feeds.arstechnica.com/arstechnica/index"
    ],
    "enrichment": {
      "extract_metadata": true,
      "extract_colors": true
    }
  }'
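
The same request can be built from Go with only the standard library. This is a minimal sketch: the struct names and the `newParseRequest` helper are hypothetical, and the JSON field names are taken from the curl example above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// enrichment mirrors the "enrichment" object in the request body.
type enrichment struct {
	ExtractMetadata bool `json:"extract_metadata"`
	ExtractColors   bool `json:"extract_colors"`
}

// parseRequest mirrors the POST /parse request body.
type parseRequest struct {
	URLs       []string   `json:"urls"`
	Enrichment enrichment `json:"enrichment"`
}

// newParseRequest builds (but does not send) a POST /parse request.
// The enrich flag toggles both enrichment options together.
func newParseRequest(baseURL string, urls []string, enrich bool) (*http.Request, error) {
	body, err := json.Marshal(parseRequest{
		URLs:       urls,
		Enrichment: enrichment{ExtractMetadata: enrich, ExtractColors: enrich},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/parse", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newParseRequest("http://localhost:8000", []string{"https://xkcd.com/rss.xml"}, true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
	// Send with http.DefaultClient.Do(req) once the API is running.
}
```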

Parse Single Feed

curl "http://localhost:8000/feed?url=https://xkcd.com/rss.xml"
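
The feed URL in the example above is passed as a raw query value, which works for simple URLs but can break when the feed URL contains its own `?` or `&`. A minimal sketch of building the request URL with percent-encoding follows. `feedEndpoint` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/url"
)

// feedEndpoint returns the GET /feed URL for feedURL, with the query value
// percent-encoded so that characters such as '&' and '?' inside the feed URL
// are not interpreted as part of the outer query string.
func feedEndpoint(baseURL, feedURL string) string {
	q := url.Values{}
	q.Set("url", feedURL)
	return baseURL + "/feed?" + q.Encode()
}

func main() {
	fmt.Println(feedEndpoint("http://localhost:8000", "https://example.com/rss?tag=go&lang=en"))
	// Prints: http://localhost:8000/feed?url=https%3A%2F%2Fexample.com%2Frss%3Ftag%3Dgo%26lang%3Den
}
```

With curl, `-G --data-urlencode "url=..."` achieves the same encoding.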

Disable Enrichment for Performance

curl -X POST http://localhost:8000/parse \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://example.com/feed.xml"],
    "enrichment": {
      "extract_metadata": false,
      "extract_colors": false
    }
  }'

Library Usage

Basic Example

package main

import (
    "context"
    "fmt"
    "log"

    // The import path ends in "digests-lib", so alias the package explicitly.
    digests "github.com/BumpyClock/digests-api/digests-lib"
)

func main() {
    // Create client with defaults
    client, err := digests.NewClient()
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()
    
    // Parse a feed
    feed, err := client.ParseFeed(
        context.Background(),
        "https://xkcd.com/rss.xml",
    )
    if err != nil {
        log.Fatal(err)
    }
    
    fmt.Printf("Feed: %s\n", feed.Title)
    for _, item := range feed.Items {
        fmt.Printf("- %s\n", item.Title)
    }
}

Advanced Example

// Parse without enrichment for speed
feeds, err := client.ParseFeeds(
    ctx,
    urls,
    digests.WithoutEnrichment(),
)

// Parse with pagination (feeds and err already exist, so assign with = rather than :=)
feeds, err = client.ParseFeeds(
    ctx,
    urls,
    digests.WithPagination(1, 20), // Page 1, 20 items
)

// Search for feeds
results, err := client.Search(ctx, "technology news")
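
To make the pagination arguments concrete, the sketch below maps a 1-based page number and page size to slice bounds. It assumes pages are 1-based, as the "Page 1, 20 items" comment suggests. `pageBounds` is a hypothetical helper, and the source does not say whether the library paginates items or feeds.

```go
package main

import "fmt"

// pageBounds returns the half-open range [start, end) for a 1-based page.
// Out-of-range pages yield an empty range rather than an error.
func pageBounds(total, page, perPage int) (start, end int) {
	if page < 1 || perPage < 1 {
		return 0, 0
	}
	start = (page - 1) * perPage
	if start > total {
		start = total
	}
	end = start + perPage
	if end > total {
		end = total
	}
	return start, end
}

func main() {
	items := make([]int, 45)
	for page := 1; page <= 3; page++ {
		s, e := pageBounds(len(items), page, 20)
		fmt.Printf("page %d -> items[%d:%d]\n", page, s, e)
	}
}
```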

API Endpoints

Feed Parsing

  • POST /parse - Parse multiple feeds with enrichment options
  • GET /feed - Parse a single feed

Discovery

  • GET /discover - Discover feeds from a website URL

Metadata

  • POST /metadata/extract - Extract metadata from URLs

Validation

  • POST /validate - Validate feed URLs

Response Format

{
  "feeds": [{
    "id": "feed-id",
    "title": "Feed Title",
    "description": "Feed description",
    "url": "https://example.com/feed.xml",
    "feed_type": "article|podcast",
    "items": [{
      "id": "item-id",
      "title": "Article Title",
      "description": "Article description",
      "link": "https://example.com/article",
      "published": "2024-01-01T00:00:00Z",
      "thumbnail": "https://example.com/image.jpg",
      "thumbnail_color": {
        "r": 255,
        "g": 128,
        "b": 0
      }
    }]
  }]
}
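
A client consuming this response could decode it into Go structs like the ones below. This is a sketch based only on the fields shown above. The type names are hypothetical, and making `thumbnail_color` a pointer assumes the field may be absent (for example, when color extraction is disabled).

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Color is an RGB triple as shown in "thumbnail_color".
type Color struct {
	R int `json:"r"`
	G int `json:"g"`
	B int `json:"b"`
}

// Item mirrors one entry of "items".
type Item struct {
	ID             string    `json:"id"`
	Title          string    `json:"title"`
	Description    string    `json:"description"`
	Link           string    `json:"link"`
	Published      time.Time `json:"published"` // RFC 3339 timestamp
	Thumbnail      string    `json:"thumbnail"`
	ThumbnailColor *Color    `json:"thumbnail_color"` // nil when absent (assumption)
}

// Feed mirrors one entry of "feeds".
type Feed struct {
	ID          string `json:"id"`
	Title       string `json:"title"`
	Description string `json:"description"`
	URL         string `json:"url"`
	FeedType    string `json:"feed_type"` // "article" or "podcast"
	Items       []Item `json:"items"`
}

// decodeFeeds unwraps the top-level "feeds" array.
func decodeFeeds(data []byte) ([]Feed, error) {
	var resp struct {
		Feeds []Feed `json:"feeds"`
	}
	if err := json.Unmarshal(data, &resp); err != nil {
		return nil, err
	}
	return resp.Feeds, nil
}

func main() {
	sample := []byte(`{"feeds":[{"title":"Feed Title","feed_type":"article","items":[{"title":"Article Title","published":"2024-01-01T00:00:00Z"}]}]}`)
	feeds, err := decodeFeeds(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, f := range feeds {
		fmt.Printf("%s (%s): %d item(s)\n", f.Title, f.FeedType, len(f.Items))
	}
}
```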

Performance Considerations

  1. Simplified Architecture: Removed complex caching layers and distributed features
  2. Memory-Only Caching: Fast in-memory cache with configurable TTLs
  3. Simple Worker Pool: Fixed-size worker pool (configurable via WORKER_POOL_SIZE)
  4. SQLite Database: Single-file database with performance indexes
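
To illustrate the fixed-size worker pool in item 3, here is a minimal sketch that uses only the standard library. `runPool` is a hypothetical function, not the project's implementation, and the `work` callback stands in for fetching and parsing a feed.

```go
package main

import (
	"fmt"
	"sync"
)

// runPool processes jobs with a fixed number of workers and returns the
// results in the same order as the input.
func runPool(workers int, jobs []string, work func(string) string) []string {
	if workers < 1 {
		workers = 1
	}
	results := make([]string, len(jobs))
	indexes := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range indexes {
				// Each index is received by exactly one worker, so the
				// writes to results never overlap.
				results[i] = work(jobs[i])
			}
		}()
	}

	for i := range jobs {
		indexes <- i
	}
	close(indexes) // lets the workers' range loops finish
	wg.Wait()
	return results
}

func main() {
	urls := []string{"https://a.example/feed", "https://b.example/feed", "https://c.example/feed"}
	out := runPool(2, urls, func(u string) string { return "parsed " + u })
	for _, line := range out {
		fmt.Println(line)
	}
}
```

The number of goroutines stays constant regardless of how many URLs are submitted, which bounds concurrent outbound requests.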

Development

Building

cd src

# Windows (with CGO for SQLite)
set "CGO_ENABLED=1" && go build -o api.exe cmd/api/main.go

# Unix
CGO_ENABLED=1 go build -o api cmd/api/main.go

Development with Live Reload

# Windows (use dev.cmd for simplicity)
dev.cmd

# Unix (use air if available)
./run-air.sh

Docker

docker build -t digests-api .
docker run -p 8000:8000 digests-api

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
