notslop-api

Stateless BYOK scraping proxy for AI agents. Fresh social signal from Reddit, Hacker News, blogs (RSS), and X (Bright Data). Open-source, no user accounts, no key storage, no markup.

Node 20+ | License: MIT

What it is

A small HTTP gateway. You POST a body that says what you want scraped (subreddits, blog URLs, X handles) and whose credentials to use (your Bright Data key for X). The server returns posts and forgets — no DB row written, no audit log of who you are.

It's the data layer for "give my agent the room temperature on a topic before it writes."

Two-line usage

curl -X POST https://api.notslop.dev/v1/scrape \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "RAG",
    "since": "24h",
    "limit": 20,
    "sources": {
      "reddit": ["ClaudeAI", "LocalLLaMA"],
      "blogs":  ["https://simonwillison.net/atom.xml"],
      "hn":     true
    }
  }'

For X scraping, add Bright Data creds in the body:

curl -X POST https://api.notslop.dev/v1/scrape \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "RAG",
    "sources": { "x": ["karpathy", "swyx"] },
    "creds":   { "brightdata": { "api_key": "your_key", "dataset_id": "gd_..." } }
  }'
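
The same calls can be made from an agent written in TypeScript. The sketch below mirrors the curl examples above; the endpoint URL and body fields come from those examples, while the helper names (`buildScrapeInit`, `scrape`) and the HTTP status check are illustrative assumptions, not part of this project. It relies on the global `fetch` available in Node 20+.

```typescript
// Body fields mirror the curl examples; the type names are hypothetical.
type Sources = { reddit?: string[]; blogs?: string[]; x?: string[]; hn?: boolean };
type Creds = { brightdata?: { api_key: string; dataset_id: string } };
type ScrapeBody = {
  topic?: string;
  since?: string;
  limit?: number;
  sources?: Sources;
  creds?: Creds;
};

const SCRAPE_URL = "https://api.notslop.dev/v1/scrape";

// Build the fetch options separately so they can be inspected without a network call.
function buildScrapeInit(body: ScrapeBody) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Send the request and return the parsed JSON response.
// Assumption: a non-2xx status signals a failed request.
async function scrape(body: ScrapeBody): Promise<unknown> {
  const res = await fetch(SCRAPE_URL, buildScrapeInit(body));
  if (!res.ok) throw new Error(`scrape failed: HTTP ${res.status}`);
  return res.json();
}
```

For X sources, pass `creds: { brightdata: { api_key, dataset_id } }` in the same body, as in the second curl example.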

Pair the results with ZeroEntropy for reranking or embedding; the meta.next_step field of the response points to their rerank API.

Endpoints

| Method | Path | What |
| --- | --- | --- |
| GET | /v1/health | Liveness probe. |
| GET | /v1/pricing | Cost transparency: what each source costs you. |
| POST | /v1/scrape | Main entrypoint. Body has topic, sources, creds. |
| GET | /v1/demo?topic=&since= | Free preview from the public demo cache (Reddit/HN/blogs). |
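
As a small illustration of the GET /v1/demo query string, the sketch below builds the URL with the standard `URL` API so that topics containing spaces or punctuation are encoded correctly. The function name `demoUrl` is hypothetical, and it is an assumption that `since` accepts the same values as the /v1/scrape body.

```typescript
// Build a /v1/demo URL; `base` is the gateway origin (hosted or self-hosted).
function demoUrl(base: string, topic: string, since: string): string {
  const url = new URL("/v1/demo", base);
  url.searchParams.set("topic", topic); // handles spaces and special characters
  url.searchParams.set("since", since);
  return url.toString();
}
```

For example, `demoUrl("https://api.notslop.dev", "RAG", "24h")` yields `https://api.notslop.dev/v1/demo?topic=RAG&since=24h`.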

POST /v1/scrape body schema:

{
  topic?: string,
  since?: "1h" | "6h" | "24h" | "7d" | "30d" | "all",
  limit?: number,                   // 1..200, default 50
  sources?: {
    reddit?: string[],              // subreddit slugs, e.g. ["ClaudeAI"]
    blogs?:  string[],              // RSS or website URLs
    x?:      string[],              // X handles, e.g. ["karpathy"]
    hn?:     boolean                // include HN top stories
  },
  creds?: {
    brightdata?: { api_key: string, dataset_id: string }
  }
}
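
A client can check a body against this schema before sending it. The following is a minimal sketch of such a check, assuming `limit` must be a whole number; `validateScrapeBody` and `effectiveLimit` are hypothetical helpers, and the server's own validation may differ.

```typescript
// Allowed `since` values, copied from the schema above.
const SINCE_VALUES: readonly string[] = ["1h", "6h", "24h", "7d", "30d", "all"];

// Return a list of human-readable problems; an empty list means the body looks valid.
function validateScrapeBody(body: { since?: string; limit?: number }): string[] {
  const problems: string[] = [];
  if (body.since !== undefined && !SINCE_VALUES.includes(body.since)) {
    problems.push(`since must be one of ${SINCE_VALUES.join(", ")}`);
  }
  if (body.limit !== undefined) {
    // Assumption: limit is an integer in the documented 1..200 range.
    if (!Number.isInteger(body.limit) || body.limit < 1 || body.limit > 200) {
      problems.push("limit must be an integer between 1 and 200");
    }
  }
  return problems;
}

// The schema documents 50 as the default when limit is omitted.
function effectiveLimit(body: { limit?: number }): number {
  return body.limit ?? 50;
}
```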

Response:

{
  posts: Post[],
  count: number,
  meta: {
    sources_attempted: string[],
    errors: { source, sub_source?, error }[],
    duration_ms: number,
    used_hosted_brightdata_fallback: boolean,
    powered_by: "zeroentropy",
    next_step: { action: "rerank", endpoint, model, get_key }
  }
}
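
Because failures are reported in meta.errors alongside the posts, a caller may want to inspect that list before trusting the result. The sketch below shows one way to do so. It assumes `source`, `sub_source`, and `error` are strings, which the schema above does not state, and the helper names are hypothetical.

```typescript
// Assumed field types; the README lists the field names but not their types.
type ScrapeError = { source: string; sub_source?: string; error: string };
type ScrapeMeta = { sources_attempted: string[]; errors: ScrapeError[] };

// One line per error, e.g. "reddit/ClaudeAI: timeout".
function formatErrors(errors: ScrapeError[]): string[] {
  return errors.map((e) => {
    const where = e.sub_source ? `${e.source}/${e.sub_source}` : e.source;
    return `${where}: ${e.error}`;
  });
}

// Attempted sources that reported at least one error.
function failedSources(meta: ScrapeMeta): string[] {
  const failed = new Set(meta.errors.map((e) => e.source));
  return meta.sources_attempted.filter((s) => failed.has(s));
}
```

A source listed by `failedSources` may still have contributed posts, for instance when only one of several subreddits failed.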

Pricing

| Source | Cost |
| --- | --- |
| Reddit / HN / blogs | Free. Public APIs, no creds needed. |
| X (Bright Data) | Your Bright Data invoice (~$0.001/post). BYOK. |
| Hosted X fallback | Free for the first HOSTED_X_DAILY_LIMIT calls per IP/day. |

This gateway charges nothing. The source code is here — read it to verify. The real cost is whatever Bright Data and ZeroEntropy charge you on your own accounts.
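
To budget an X scrape, multiply the expected number of posts by the per-post rate. The sketch below uses the approximate $0.001/post figure from the table as a default; the function name is hypothetical, and the actual rate is whatever Bright Data bills on your account.

```typescript
// Rough estimate only; 0.001 USD/post is the approximate figure quoted above.
function estimateXCostUsd(postCount: number, usdPerPost: number = 0.001): number {
  if (!Number.isFinite(postCount) || postCount < 0) {
    throw new Error("postCount must be a non-negative number");
  }
  return postCount * usdPerPost;
}
```

At that rate, a request capped at the maximum limit of 200 posts costs about $0.20, and Reddit, HN, and blog sources add nothing.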

Self-host (5 minutes)

# 1. Clone + install
git clone https://github.com/adrienckr/notslop-api.git
cd notslop-api
npm install

# 2. Run locally
cp .env.example .env
npm run dev
# → http://localhost:3000/v1/health

# 3. Deploy to Fly.io (optional)
fly launch --no-deploy
fly volumes create notslop_data --size 1
fly deploy

Full self-host walkthrough: docs/SELF_HOST.md.

Architecture

client request → POST /v1/scrape { sources, creds }
                       │
                       ▼
              [stateless handler]
                       │
        ┌──────────────┼──────────────┬─────────────┐
        ▼              ▼              ▼             ▼
     reddit/          hn/          blogs/         x/
     JSON           Algolia       RSS+cheerio   Bright Data
        │              │              │             │
        └──────────────┴──────────────┴─────────────┘
                       │
                       ▼
              [filter + sort + cap]
                       │
                       ▼
                JSON response
                (no DB write, no audit row)

Single-process Hono server. SQLite is only used for the public /v1/demo cache — it has no rows for individual users.
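
The fan-out and the filter + sort + cap stage in the diagram can be sketched as follows. This is an illustration of the pattern, not the project's actual handler: the `Post` shape (including an epoch-millisecond `created_at`), the newest-first ordering, and the function names are all assumptions.

```typescript
// Hypothetical post shape; the real Post type is not shown in this README.
type Post = { source: string; title: string; created_at: number }; // epoch ms (assumed)
type Fetcher = () => Promise<Post[]>;

const HOUR = 60 * 60 * 1000;
const WINDOW_MS: Record<string, number> = {
  "1h": HOUR, "6h": 6 * HOUR, "24h": 24 * HOUR,
  "7d": 7 * 24 * HOUR, "30d": 30 * 24 * HOUR, all: Infinity,
};

// Run every source concurrently; a failing source becomes an error entry
// instead of rejecting the whole request.
async function fanOut(fetchers: Record<string, Fetcher>) {
  const results = await Promise.all(
    Object.entries(fetchers).map(([source, run]) =>
      run().then(
        (posts) => ({ source, posts, error: undefined as string | undefined }),
        (err) => ({ source, posts: [] as Post[], error: String(err) }),
      ),
    ),
  );
  return {
    posts: results.flatMap((r) => r.posts),
    errors: results
      .filter((r) => r.error !== undefined)
      .map((r) => ({ source: r.source, error: r.error as string })),
  };
}

// Keep posts inside the time window, newest first (assumed order), capped at `limit`.
function filterSortCap(posts: Post[], since: string, limit: number, now: number): Post[] {
  const cutoff = now - (WINDOW_MS[since] ?? Infinity);
  return posts
    .filter((p) => p.created_at >= cutoff)
    .sort((a, b) => b.created_at - a.created_at)
    .slice(0, limit);
}
```

Nothing in this flow needs persistent storage, which matches the stateless design described above: results are merged in memory and returned.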

Notes

Built by @adrienckr. ZeroEntropy is one of several upstream providers you might use with notslop-api; the meta.next_step hint in /v1/scrape responses points to their rerank API as a recommended next step. You can ignore that hint and use whatever rerank/embed provider you like — this gateway only does the scraping half.
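
Since the hint is optional, client code can treat it as a default that a caller's own choice overrides. A minimal sketch follows, assuming the `next_step` fields are strings; `chooseRerankEndpoint` is a hypothetical helper, and the URLs in the usage note are placeholders.

```typescript
// Field names come from the response schema; string types are assumed.
type NextStep = { action: "rerank"; endpoint: string; model: string; get_key: string };

// Prefer the caller's own rerank endpoint; otherwise fall back to the hint, if any.
function chooseRerankEndpoint(hint: NextStep | undefined, ownEndpoint?: string): string | null {
  if (ownEndpoint) return ownEndpoint;
  return hint && hint.action === "rerank" ? hint.endpoint : null;
}
```

For example, `chooseRerankEndpoint(meta.next_step, "https://example.com/my-rerank")` always returns the caller's URL, while omitting the second argument returns the hinted endpoint.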

License

MIT. See LICENSE.
