api-evangelist/baseten


Baseten (baseten)

Baseten is a production inference platform for deploying and serving custom and pre-trained ML models. It offers a Model APIs catalog with OpenAI- and Anthropic-compatible endpoints (DeepSeek, Qwen, GLM, Nemotron), dedicated deployments via Truss, autoscaling GPU compute, async/queue inference, training, Chains (multi-model workflows), and management APIs.

URL: Visit APIs.json URL

Run: Capabilities Using Naftiko

Type

  • x-type: company

Tags

  • AI, ML, Inference, Deployment, MLOps, OpenAI Compatible, Anthropic Compatible, Truss

APIs

  • Baseten LLM Inference API — OpenAI-compatible chat completions for the Model APIs catalog. Base URL https://inference.baseten.co/v1. Docs · OpenAPI
  • Baseten Anthropic-Compatible Messages API — Anthropic Messages-compatible inference. OpenAPI
  • Baseten Management & Async API — Deployment management, async inference, chains, training. Base URL https://api.baseten.co.
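Because the LLM Inference API is OpenAI-compatible, a request can be built with nothing but the standard library. The sketch below is an assumption-laden illustration: the base URL comes from the APIs list above, but the model slug (`deepseek-v4`) and the API-key placeholder are hypothetical — check the Model APIs catalog for exact identifiers.

```python
import json
import urllib.request

BASE_URL = "https://inference.baseten.co/v1"  # from the APIs list above

def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request for Baseten's
    Model APIs endpoint. Does not send anything over the network."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # key placeholder is hypothetical
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "YOUR_API_KEY",                 # hypothetical placeholder
    "deepseek-v4",                  # assumed slug; see the catalog
    [{"role": "user", "content": "Say hello."}],
)
# Sending is left to the caller, e.g.:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the shape matches the OpenAI chat-completions schema, the official OpenAI SDK should also work by pointing its `base_url` at the same endpoint.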

Plans

  • Basic ($0/mo, pay-as-you-go): dedicated deployments, model APIs, training, fast cold starts, SOC 2 + HIPAA, email/in-app chat support.
  • Pro (volume discounts): everything in Basic + priority GPU, dedicated compute, higher rate limits, hands-on support.
  • Enterprise (custom): self-hosted options, custom SLAs, data residency, advanced RBAC.

Sample Pricing

  • Model APIs (per million tokens): DeepSeek V4 $1.74/M input · $3.48/M output; NVIDIA Nemotron 3 Super $0.30/M input · $0.75/M output.
  • Compute (per minute): T4 from $0.01052/min up to B200 at $0.16633/min; CPU from $0.00058/min. No charge for idle time.
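The per-token math above can be sketched as a small cost estimator. The rates are the sample figures listed in this section; the model-slug keys are illustrative labels, not confirmed catalog identifiers.

```python
# Per-token cost math using the sample rates above (USD per million tokens).
# Only the two listed models are included; other models have their own rates.
RATES_PER_MILLION = {
    "deepseek-v4": {"input": 1.74, "output": 3.48},
    "nemotron-3-super": {"input": 0.30, "output": 0.75},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed per-token rates."""
    r = RATES_PER_MILLION[model]
    cost = (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000
    return round(cost, 6)

# e.g. a 2,000-token prompt with a 500-token reply on DeepSeek V4
# costs (2000 * 1.74 + 500 * 3.48) / 1e6 dollars:
print(request_cost("deepseek-v4", 2000, 500))
```

At these rates a typical short chat turn costs a fraction of a cent; dedicated per-minute GPU compute is billed separately and only while instances are active.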

Plans, Rate Limits, FinOps

Timestamps

  • Created: 2026-05-08
  • Modified: 2026-05-08

Common Properties

Maintainers

FN: Kin Lane

Email: kin@apievangelist.com

About

Production inference platform for ML models.
