Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough: Added comprehensive documentation for Predator v1.0.0 covering architecture, key functionalities, and release notes. Created versioned index pages for multiple BharatMLStack components (Quick Start, Inferflow, Numerix, Online Feature Store, SDKs, Skye, Trufflebox UI). Reorganized category ordering and rebuilt documentation webpack assets.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@docs/predator/v1.0.0/index.html`:
- Line 6: The canonical and alternate link hrefs incorrectly point to
/inferflow/v1.0.0; update the rel="canonical" and both rel="alternate" (hreflang
and x-default) link elements in index.html so their href attributes use the
Predator path (https://meesho.github.io/BharatMLStack/predator/v1.0.0), ensuring
rel="canonical" and the two rel="alternate" entries are changed to the predator
URL to match the page context.
In `@docs/predator/v1.0.0/release-notes/index.html`:
- Line 6: The canonical and alternate link tags currently point to the Inferflow
release-notes URLs; update the <link rel="canonical" href="..."> and both <link
rel="alternate" href="..." hreflang="..."> entries in the HTML (the tags shown
with rel="canonical" and rel="alternate") to use the Predator path
(/predator/v1.0.0/release-notes) so they correctly reference this Predator
release notes page.
docs/predator/v1.0.0/index.html
Outdated
<head>
<meta charset="UTF-8">
<meta name="generator" content="Docusaurus v3.8.1">
| <title data-rh="true">v1.0.0 | BharatMLStack</title><meta data-rh="true" name="viewport" content="width=device-width,initial-scale=1"><meta data-rh="true" name="twitter:card" content="summary_large_image"><meta data-rh="true" property="og:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" name="twitter:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" property="og:url" content="https://meesho.github.io/BharatMLStack/predator/v1.0.0"><meta data-rh="true" property="og:locale" content="en"><meta data-rh="true" name="docusaurus_locale" content="en"><meta data-rh="true" name="docsearch:language" content="en"><meta data-rh="true" name="docusaurus_version" content="current"><meta data-rh="true" name="docusaurus_tag" content="docs-default-current"><meta data-rh="true" name="docsearch:version" content="current"><meta data-rh="true" name="docsearch:docusaurus_tag" content="docs-default-current"><meta data-rh="true" property="og:title" content="v1.0.0 | BharatMLStack"><meta data-rh="true" name="description" content="Predator v1.0.0 — Triton inference, multi-backend, K8s deployment."><meta data-rh="true" property="og:description" content="Predator v1.0.0 — Triton inference, multi-backend, K8s deployment."><link data-rh="true" rel="icon" href="/BharatMLStack/img/favicon.ico"><link data-rh="true" rel="canonical" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0" hreflang="en"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0" hreflang="x-default"><script data-rh="true" 
type="application/ld+json">{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Predator","item":"https://meesho.github.io/BharatMLStack/category/predator"},{"@type":"ListItem","position":2,"name":"v1.0.0","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0"}]}</script><link rel="alternate" type="application/rss+xml" href="/BharatMLStack/blog/rss.xml" title="BharatMLStack RSS Feed"> |
Fix canonical/alternate URLs to the Predator path.
Line 6 uses /inferflow/v1.0.0 for canonical and hreflang links, but this is the Predator docs page. That mismatch can cause SEO and indexing issues.
🛠️ Suggested fix
-<link data-rh="true" rel="canonical" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0">
-<link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0" hreflang="en">
-<link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0" hreflang="x-default">
+<link data-rh="true" rel="canonical" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0">
+<link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0" hreflang="en">
+<link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0" hreflang="x-default">

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| <title data-rh="true">v1.0.0 | BharatMLStack</title><meta data-rh="true" name="viewport" content="width=device-width,initial-scale=1"><meta data-rh="true" name="twitter:card" content="summary_large_image"><meta data-rh="true" property="og:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" name="twitter:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" property="og:url" content="https://meesho.github.io/BharatMLStack/predator/v1.0.0"><meta data-rh="true" property="og:locale" content="en"><meta data-rh="true" name="docusaurus_locale" content="en"><meta data-rh="true" name="docsearch:language" content="en"><meta data-rh="true" name="docusaurus_version" content="current"><meta data-rh="true" name="docusaurus_tag" content="docs-default-current"><meta data-rh="true" name="docsearch:version" content="current"><meta data-rh="true" name="docsearch:docusaurus_tag" content="docs-default-current"><meta data-rh="true" property="og:title" content="v1.0.0 | BharatMLStack"><meta data-rh="true" name="description" content="Predator v1.0.0 — Triton inference, multi-backend, K8s deployment."><meta data-rh="true" property="og:description" content="Predator v1.0.0 — Triton inference, multi-backend, K8s deployment."><link data-rh="true" rel="icon" href="/BharatMLStack/img/favicon.ico"><link data-rh="true" rel="canonical" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0" hreflang="en"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0" hreflang="x-default"><script data-rh="true" 
type="application/ld+json">{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Predator","item":"https://meesho.github.io/BharatMLStack/category/predator"},{"@type":"ListItem","position":2,"name":"v1.0.0","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0"}]}</script><link rel="alternate" type="application/rss+xml" href="/BharatMLStack/blog/rss.xml" title="BharatMLStack RSS Feed"> |
🤖 Prompt for AI Agents
In `@docs/predator/v1.0.0/index.html` at line 6, The canonical and alternate link
hrefs incorrectly point to /inferflow/v1.0.0; update the rel="canonical" and
both rel="alternate" (hreflang and x-default) link elements in index.html so
their href attributes use the Predator path
(https://meesho.github.io/BharatMLStack/predator/v1.0.0), ensuring
rel="canonical" and the two rel="alternate" entries are changed to the predator
URL to match the page context.
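The same canonical/path mismatch could recur on any generated page, so a build-time check is worth sketching. Below is a minimal, hypothetical Python script (not part of this PR) that walks a Docusaurus build directory, extracts each page's `rel="canonical"` href, and flags pages whose canonical URL does not match their own path. The `BASE` constant mirrors the site root seen in the diffs; the directory layout (one `index.html` per page) is an assumption.

```python
"""Sketch: flag built pages whose canonical link points at a different path.

Assumes a Docusaurus-style build where each page is an index.html whose
canonical URL should be BASE plus the page's directory path.
"""
import re
from pathlib import Path

BASE = "https://meesho.github.io/BharatMLStack"  # site root, from the diffs above

# Matches <link ... rel="canonical" ... href="..."> within a single tag.
CANONICAL_RE = re.compile(r'<link[^>]*rel="canonical"[^>]*href="([^"]+)"')


def check_canonicals(build_dir: str) -> list[tuple[str, str, str]]:
    """Return (file, expected_url, actual_url) for every mismatched canonical."""
    mismatches = []
    root = Path(build_dir)
    for page in root.rglob("index.html"):
        rel = page.parent.relative_to(root).as_posix()
        expected = BASE if rel == "." else f"{BASE}/{rel}"
        m = CANONICAL_RE.search(page.read_text(encoding="utf-8"))
        if m and m.group(1).rstrip("/") != expected.rstrip("/"):
            mismatches.append((str(page), expected, m.group(1)))
    return mismatches
```

Run against the build output, this would have caught both pages in this review (canonical pointing at `/inferflow/v1.0.0` from under `predator/v1.0.0/`).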
<head>
<meta charset="UTF-8">
<meta name="generator" content="Docusaurus v3.8.1">
| <title data-rh="true">Release Notes | BharatMLStack</title><meta data-rh="true" name="viewport" content="width=device-width,initial-scale=1"><meta data-rh="true" name="twitter:card" content="summary_large_image"><meta data-rh="true" property="og:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" name="twitter:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" property="og:url" content="https://meesho.github.io/BharatMLStack/predator/v1.0.0/release-notes"><meta data-rh="true" property="og:locale" content="en"><meta data-rh="true" name="docusaurus_locale" content="en"><meta data-rh="true" name="docsearch:language" content="en"><meta data-rh="true" name="docusaurus_version" content="current"><meta data-rh="true" name="docusaurus_tag" content="docs-default-current"><meta data-rh="true" name="docsearch:version" content="current"><meta data-rh="true" name="docsearch:docusaurus_tag" content="docs-default-current"><meta data-rh="true" property="og:title" content="Release Notes | BharatMLStack"><meta data-rh="true" name="description" content="Predator v1.0.0 release notes: Triton inference, multi-backend, K8s deployment, observability."><meta data-rh="true" property="og:description" content="Predator v1.0.0 release notes: Triton inference, multi-backend, K8s deployment, observability."><link data-rh="true" rel="icon" href="/BharatMLStack/img/favicon.ico"><link data-rh="true" rel="canonical" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0/release-notes"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0/release-notes" hreflang="en"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0/release-notes" hreflang="x-default"><script data-rh="true" 
type="application/ld+json">{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Predator","item":"https://meesho.github.io/BharatMLStack/category/predator"},{"@type":"ListItem","position":2,"name":"v1.0.0","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0"},{"@type":"ListItem","position":3,"name":"Release Notes","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0/release-notes"}]}</script><link rel="alternate" type="application/rss+xml" href="/BharatMLStack/blog/rss.xml" title="BharatMLStack RSS Feed"> |
Correct canonical/alternate URLs for Release Notes.
Line 6 points canonical/hreflang to /inferflow/v1.0.0/release-notes, but this is the Predator release notes page.
🛠️ Suggested fix
-<link data-rh="true" rel="canonical" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0/release-notes">
-<link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0/release-notes" hreflang="en">
-<link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0/release-notes" hreflang="x-default">
+<link data-rh="true" rel="canonical" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0/release-notes">
+<link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0/release-notes" hreflang="en">
+<link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0/release-notes" hreflang="x-default">

📝 Committable suggestion
| <title data-rh="true">Release Notes | BharatMLStack</title><meta data-rh="true" name="viewport" content="width=device-width,initial-scale=1"><meta data-rh="true" name="twitter:card" content="summary_large_image"><meta data-rh="true" property="og:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" name="twitter:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" property="og:url" content="https://meesho.github.io/BharatMLStack/predator/v1.0.0/release-notes"><meta data-rh="true" property="og:locale" content="en"><meta data-rh="true" name="docusaurus_locale" content="en"><meta data-rh="true" name="docsearch:language" content="en"><meta data-rh="true" name="docusaurus_version" content="current"><meta data-rh="true" name="docusaurus_tag" content="docs-default-current"><meta data-rh="true" name="docsearch:version" content="current"><meta data-rh="true" name="docsearch:docusaurus_tag" content="docs-default-current"><meta data-rh="true" property="og:title" content="Release Notes | BharatMLStack"><meta data-rh="true" name="description" content="Predator v1.0.0 release notes: Triton inference, multi-backend, K8s deployment, observability."><meta data-rh="true" property="og:description" content="Predator v1.0.0 release notes: Triton inference, multi-backend, K8s deployment, observability."><link data-rh="true" rel="icon" href="/BharatMLStack/img/favicon.ico"><link data-rh="true" rel="canonical" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0/release-notes"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0/release-notes" hreflang="en"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0/release-notes" hreflang="x-default"><script data-rh="true" 
type="application/ld+json">{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Predator","item":"https://meesho.github.io/BharatMLStack/category/predator"},{"@type":"ListItem","position":2,"name":"v1.0.0","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0"},{"@type":"ListItem","position":3,"name":"Release Notes","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0/release-notes"}]}</script><link rel="alternate" type="application/rss+xml" href="/BharatMLStack/blog/rss.xml" title="BharatMLStack RSS Feed"> |
🤖 Prompt for AI Agents
In `@docs/predator/v1.0.0/release-notes/index.html` at line 6, The canonical and
alternate link tags currently point to the Inferflow release-notes URLs; update
the <link rel="canonical" href="..."> and both <link rel="alternate" href="..."
hreflang="..."> entries in the HTML (the tags shown with rel="canonical" and
rel="alternate") to use the Predator path (/predator/v1.0.0/release-notes) so
they correctly reference this Predator release notes page.
<head>
<meta charset="UTF-8">
<meta name="generator" content="Docusaurus v3.8.1">
| <title data-rh="true">Release Notes | BharatMLStack</title><meta data-rh="true" name="viewport" content="width=device-width,initial-scale=1"><meta data-rh="true" name="twitter:card" content="summary_large_image"><meta data-rh="true" property="og:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" name="twitter:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" property="og:url" content="https://meesho.github.io/BharatMLStack/predator/v1.0.0/release-notes"><meta data-rh="true" property="og:locale" content="en"><meta data-rh="true" name="docusaurus_locale" content="en"><meta data-rh="true" name="docsearch:language" content="en"><meta data-rh="true" name="docusaurus_version" content="current"><meta data-rh="true" name="docusaurus_tag" content="docs-default-current"><meta data-rh="true" name="docsearch:version" content="current"><meta data-rh="true" name="docsearch:docusaurus_tag" content="docs-default-current"><meta data-rh="true" property="og:title" content="Release Notes | BharatMLStack"><meta data-rh="true" name="description" content="Predator v1.0.0 release notes: Triton inference, multi-backend, K8s deployment, observability."><meta data-rh="true" property="og:description" content="Predator v1.0.0 release notes: Triton inference, multi-backend, K8s deployment, observability."><link data-rh="true" rel="icon" href="/BharatMLStack/img/favicon.ico"><link data-rh="true" rel="canonical" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0/release-notes"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0/release-notes" hreflang="en"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0/release-notes" hreflang="x-default"><script data-rh="true" 
type="application/ld+json">{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Predator","item":"https://meesho.github.io/BharatMLStack/category/predator"},{"@type":"ListItem","position":2,"name":"v1.0.0","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0"},{"@type":"ListItem","position":3,"name":"Release Notes","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0/release-notes"}]}</script><link rel="alternate" type="application/rss+xml" href="/BharatMLStack/blog/rss.xml" title="BharatMLStack RSS Feed"> |
Hardcoded sensitive data leaked
Semgrep has detected a leak of sensitive data in this code. This secret data could be used by internal or external malicious actors. We highly recommend you change, reset, or rotate the sensitive data.
A secret is hard-coded in the application. Secrets stored in source code, such as credentials, identifiers, and other types of sensitive data, can be leaked and used by internal or external malicious actors. It is recommended to rotate the secret and retrieve them from a secure secret vault or Hardware Security Module (HSM), alternatively environment variables can be used if allowed by your company policy.
💬 To ignore this, reply with:
• /fp <comment> for false positive
• /ar <comment> for acceptable risk
• /other <comment> for all other reasons
Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by generic_uri_string.
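These findings flag public documentation URLs in generated HTML, matched by the `generic_uri_string` rule, rather than actual credentials. If the team triages them as false positives, Semgrep supports repo-level exclusions via a `.semgrepignore` file with gitignore-style patterns. A sketch, assuming the built Docusaurus output lives under `docs/`:

```
# .semgrepignore — sketch; assumes built site assets live under docs/
# Generated HTML contains public site URLs that trip generic_uri_string.
docs/
```

Alternatively, the individual findings can be dismissed with the `/fp` reply shown above, which keeps the rule active for the rest of the repository.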
docs/predator/v1.0.0/index.html
Outdated
<head>
<meta charset="UTF-8">
<meta name="generator" content="Docusaurus v3.8.1">
| <title data-rh="true">v1.0.0 | BharatMLStack</title><meta data-rh="true" name="viewport" content="width=device-width,initial-scale=1"><meta data-rh="true" name="twitter:card" content="summary_large_image"><meta data-rh="true" property="og:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" name="twitter:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" property="og:url" content="https://meesho.github.io/BharatMLStack/predator/v1.0.0"><meta data-rh="true" property="og:locale" content="en"><meta data-rh="true" name="docusaurus_locale" content="en"><meta data-rh="true" name="docsearch:language" content="en"><meta data-rh="true" name="docusaurus_version" content="current"><meta data-rh="true" name="docusaurus_tag" content="docs-default-current"><meta data-rh="true" name="docsearch:version" content="current"><meta data-rh="true" name="docsearch:docusaurus_tag" content="docs-default-current"><meta data-rh="true" property="og:title" content="v1.0.0 | BharatMLStack"><meta data-rh="true" name="description" content="Predator v1.0.0 — Triton inference, multi-backend, K8s deployment."><meta data-rh="true" property="og:description" content="Predator v1.0.0 — Triton inference, multi-backend, K8s deployment."><link data-rh="true" rel="icon" href="/BharatMLStack/img/favicon.ico"><link data-rh="true" rel="canonical" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0" hreflang="en"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/inferflow/v1.0.0" hreflang="x-default"><script data-rh="true" 
type="application/ld+json">{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Predator","item":"https://meesho.github.io/BharatMLStack/category/predator"},{"@type":"ListItem","position":2,"name":"v1.0.0","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0"}]}</script><link rel="alternate" type="application/rss+xml" href="/BharatMLStack/blog/rss.xml" title="BharatMLStack RSS Feed"> |
Hardcoded sensitive data leaked — same Semgrep `generic_uri_string` finding as above.
<head>
<meta charset="UTF-8">
<meta name="generator" content="Docusaurus v3.8.1">
| <title data-rh="true">Key Functionalities | BharatMLStack</title><meta data-rh="true" name="viewport" content="width=device-width,initial-scale=1"><meta data-rh="true" name="twitter:card" content="summary_large_image"><meta data-rh="true" property="og:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" name="twitter:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" property="og:url" content="https://meesho.github.io/BharatMLStack/predator/v1.0.0/functionalities"><meta data-rh="true" property="og:locale" content="en"><meta data-rh="true" name="docusaurus_locale" content="en"><meta data-rh="true" name="docsearch:language" content="en"><meta data-rh="true" name="docusaurus_version" content="current"><meta data-rh="true" name="docusaurus_tag" content="docs-default-current"><meta data-rh="true" name="docsearch:version" content="current"><meta data-rh="true" name="docsearch:docusaurus_tag" content="docs-default-current"><meta data-rh="true" property="og:title" content="Key Functionalities | BharatMLStack"><meta data-rh="true" name="description" content="Predator key functionalities: Triton inference, multi-backend, dynamic batching, K8s deployment, observability."><meta data-rh="true" property="og:description" content="Predator key functionalities: Triton inference, multi-backend, dynamic batching, K8s deployment, observability."><link data-rh="true" rel="icon" href="/BharatMLStack/img/favicon.ico"><link data-rh="true" rel="canonical" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0/functionalities"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0/functionalities" hreflang="en"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0/functionalities" hreflang="x-default"><script data-rh="true" 
type="application/ld+json">{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Predator","item":"https://meesho.github.io/BharatMLStack/category/predator"},{"@type":"ListItem","position":2,"name":"v1.0.0","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0"},{"@type":"ListItem","position":3,"name":"Key Functionalities","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0/functionalities"}]}</script><link rel="alternate" type="application/rss+xml" href="/BharatMLStack/blog/rss.xml" title="BharatMLStack RSS Feed"> |
Hardcoded sensitive data leaked — same Semgrep `generic_uri_string` finding as above.
<head>
<meta charset="UTF-8">
<meta name="generator" content="Docusaurus v3.8.1">
| <title data-rh="true">Architecture | BharatMLStack</title><meta data-rh="true" name="viewport" content="width=device-width,initial-scale=1"><meta data-rh="true" name="twitter:card" content="summary_large_image"><meta data-rh="true" property="og:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" name="twitter:image" content="https://meesho.github.io/BharatMLStack/img/docusaurus-social-card.jpg"><meta data-rh="true" property="og:url" content="https://meesho.github.io/BharatMLStack/predator/v1.0.0/architecture"><meta data-rh="true" property="og:locale" content="en"><meta data-rh="true" name="docusaurus_locale" content="en"><meta data-rh="true" name="docsearch:language" content="en"><meta data-rh="true" name="docusaurus_version" content="current"><meta data-rh="true" name="docusaurus_tag" content="docs-default-current"><meta data-rh="true" name="docsearch:version" content="current"><meta data-rh="true" name="docsearch:docusaurus_tag" content="docs-default-current"><meta data-rh="true" property="og:title" content="Architecture | BharatMLStack"><meta data-rh="true" name="description" content="Predator is a scalable, high-performance model inference service built as a wrapper around NVIDIA Triton Inference Server, designed to serve ML models with low latency in Kubernetes, with OnFS and Interflow integration."><meta data-rh="true" property="og:description" content="Predator is a scalable, high-performance model inference service built as a wrapper around NVIDIA Triton Inference Server, designed to serve ML models with low latency in Kubernetes, with OnFS and Interflow integration."><link data-rh="true" rel="icon" href="/BharatMLStack/img/favicon.ico"><link data-rh="true" rel="canonical" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0/architecture"><link data-rh="true" rel="alternate" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0/architecture" hreflang="en"><link data-rh="true" 
rel="alternate" href="https://meesho.github.io/BharatMLStack/predator/v1.0.0/architecture" hreflang="x-default"><script data-rh="true" type="application/ld+json">{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Predator","item":"https://meesho.github.io/BharatMLStack/category/predator"},{"@type":"ListItem","position":2,"name":"v1.0.0","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0"},{"@type":"ListItem","position":3,"name":"Architecture","item":"https://meesho.github.io/BharatMLStack/predator/v1.0.0/architecture"}]}</script><link rel="alternate" type="application/rss+xml" href="/BharatMLStack/blog/rss.xml" title="BharatMLStack RSS Feed"> |
There was a problem hiding this comment.
Hardcoded sensitive data leaked
Semgrep has detected a leak of sensitive data in this code. This secret data could be used by internal or external malicious actors. We highly recommend you change, reset, or rotate the sensitive data.
A secret is hard-coded in the application. Secrets stored in source code, such as credentials, identifiers, and other types of sensitive data, can be leaked and used by internal or external malicious actors. It is recommended to rotate the secret and retrieve them from a secure secret vault or Hardware Security Module (HSM), alternatively environment variables can be used if allowed by your company policy.
💬 To ignore this, reply with:
• /fp <comment> for false positive
• /ar <comment> for acceptable risk
• /other <comment> for all other reasons
Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by generic_uri_string.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@docs-src/docs/predator/v1.0.0/architecture.md`:
- Line 10: Replace the typo "Interflow" with the correct canonical name
"Inferflow" in the architecture text; locate the occurrence of the term
"Interflow" in the sentence referencing orchestration between client
applications, the feature store and inference engine and update it to
"Inferflow" so it matches the canonical name used across the codebase and docs.
In `@docs/assets/js/72dc5b25.27f7362a.js`:
- Line 1: The generated JS asset docs/assets/js/72dc5b25.27f7362a.js contains
incorrect category metadata under the JSON key "categoryGeneratedIndex"
(description "Predator v1.0.0" vs expected "Online Feature Store v1.0.0");
regenerate the documentation assets by running the documentation build (e.g.,
the repo's docs build command such as npm/yarn run build or the documented
generator step), clear any build/cache (e.g., .cache/public) before rebuilding,
and verify the output asset now reflects the source _category_.json values (slug
and description) in the "categoryGeneratedIndex" entry. Ensure the docs
generator reads the source _category_.json files and re-run the CI/build job if
necessary to propagate the fixed artifact.
🧹 Nitpick comments (3)
docs/assets/js/runtime~main.94940372.js (1)
1-1: Consider excluding generated bundles from linting. This file looks like a minified build artifact; the lint errors reported for it are likely false positives. Excluding `docs/assets/js/*.js` (or build artifacts) from linting would keep signal high without changing runtime behavior.

docs-src/docs/predator/v1.0.0/architecture.md (2)

61-73: Add language identifier to fenced code block. The file tree structure code block lacks a language identifier. For consistency and to satisfy markdown linting, specify a language (e.g., `text` or `plaintext`).

📝 Proposed fix: change the opening fence of the file tree block (model_repository/ ├── model_A/ │ ├── config.pbtxt ...) from ``` to ```text.
97-101: Add language identifier to fenced code block. The pod architecture diagram code block lacks a language identifier.

📝 Proposed fix: change the opening fence of the pod diagram block (Predator Pod ├── Init Container (Model Sync) ├── Triton Inference Server Container ...) from ``` to ```text.
Predator is a scalable, high-performance model inference service built as a wrapper around the **NVIDIA Triton Inference Server**. It is designed to serve a variety of machine learning models (Deep Learning, Tree-based, etc.) with low latency in a **Kubernetes (K8s)** environment.

The system integrates seamlessly with the **Online Feature Store (OnFS)** for real-time feature retrieval and uses **Interflow** as an orchestration layer to manage traffic between client applications (e.g. IOP), feature store and inference engine.
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check for "Inferflow" vs "Interflow" usage across the repository
echo "=== Searching for 'Inferflow' ==="
rg -i "inferflow" --type md
echo ""
echo "=== Searching for 'Interflow' ==="
rg -i "interflow" --type md

Repository: Meesho/BharatMLStack
Length of output: 20504
Fix typo: "Interflow" should be "Inferflow".
Line 10 uses "Interflow" but the canonical name throughout the codebase is "Inferflow" — a graph-driven feature retrieval and model inference orchestration system. This appears to be a typo; update to match the correct system name used across all documentation and code.
🤖 Prompt for AI Agents
In `@docs-src/docs/predator/v1.0.0/architecture.md` at line 10, Replace the typo
"Interflow" with the correct canonical name "Inferflow" in the architecture
text; locate the occurrence of the term "Interflow" in the sentence referencing
orchestration between client applications, the feature store and inference
engine and update it to "Inferflow" so it matches the canonical name used across
the codebase and docs.
docs/assets/js/72dc5b25.27f7362a.js
Outdated
@@ -0,0 +1 @@
"use strict";(self.webpackChunkdocs=self.webpackChunkdocs||[]).push([[8261],{3613:e=>{e.exports=JSON.parse('{"categoryGeneratedIndex":{"title":"v1.0.0","description":"Predator v1.0.0","slug":"/online-feature-store/v1.0.0","permalink":"/BharatMLStack/online-feature-store/v1.0.0","sidebar":"tutorialSidebar","navigation":{"previous":{"title":"Online Feature Store","permalink":"/BharatMLStack/category/online-feature-store"},"next":{"title":"Architecture","permalink":"/BharatMLStack/online-feature-store/v1.0.0/architecture"}}}}')}}]); (no newline at end of file)
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for _category_.json files and examine their contents
echo "=== Finding all _category_.json files ==="
fd "_category_\.json" -type f
echo ""
echo "=== Searching for files containing 'online-feature-store' and 'v1.0.0' ==="
fd "_category_\.json" -type f -exec grep -l "online-feature-store\|v1\.0\.0" {} \;
echo ""
echo "=== Checking content of matching _category_.json files ==="
fd "_category_\.json" -type f -exec sh -c 'echo "--- File: {} ---" && cat {}' \;

Repository: Meesho/BharatMLStack
Length of output: 771
🏁 Script executed:
#!/bin/bash
# Search for _category_.json files with corrected syntax
echo "=== Finding _category_.json files ==="
find . -name "_category_.json" 2>/dev/null
echo ""
echo "=== Checking for docs directory structure ==="
find docs -type d -name "*online-feature-store*" 2>/dev/null | head -20
echo ""
echo "=== Search for category files in docs ==="
find docs -name "*category*" 2>/dev/null | head -20
echo ""
echo "=== Check docs structure ==="
ls -la docs/docs/ 2>/dev/null | head -20

Repository: Meesho/BharatMLStack
Length of output: 1055
🏁 Script executed:
#!/bin/bash
# Read the online-feature-store v1.0.0 _category_.json file
echo "=== Content of online-feature-store/v1.0.0/_category_.json ==="
cat ./docs-src/docs/online-feature-store/v1.0.0/_category_.json | jq . 2>/dev/null || cat ./docs-src/docs/online-feature-store/v1.0.0/_category_.json
echo ""
echo "=== Content of predator/v1.0.0/_category_.json ==="
cat ./docs-src/docs/predator/v1.0.0/_category_.json | jq . 2>/dev/null || cat ./docs-src/docs/predator/v1.0.0/_category_.json
echo ""
echo "=== Content of online-feature-store/_category_.json ==="
cat ./docs-src/docs/online-feature-store/_category_.json | jq . 2>/dev/null || cat ./docs-src/docs/online-feature-store/_category_.json

Repository: Meesho/BharatMLStack
Length of output: 984
🏁 Script executed:
#!/bin/bash
# Check the actual generated asset file
echo "=== Checking generated asset file ==="
if [ -f "docs/assets/js/72dc5b25.27f7362a.js" ]; then
echo "File exists. Checking content:"
head -c 500 docs/assets/js/72dc5b25.27f7362a.js
echo ""
echo ""
echo "Searching for 'Predator' or 'online-feature-store' in the file:"
grep -o '"description":"[^"]*"' docs/assets/js/72dc5b25.27f7362a.js | head -5
grep -o '"slug":"[^"]*"' docs/assets/js/72dc5b25.27f7362a.js | head -5
else
echo "File not found at docs/assets/js/72dc5b25.27f7362a.js"
fi
echo ""
echo "=== Searching for similar generated asset files ==="
find docs/assets/js -name "*.js" | head -10

Repository: Meesho/BharatMLStack
Length of output: 1181
Regenerate documentation assets—generated code has incorrect category metadata.
The generated asset has a metadata mismatch: description "Predator v1.0.0" combined with slug "/online-feature-store/v1.0.0". However, the source _category_.json files are correct—docs-src/docs/online-feature-store/v1.0.0/_category_.json properly has description "Online Feature Store v1.0.0" with matching slug.
This is a build/generation issue, not a source configuration problem. Rebuild the documentation to regenerate the assets from the correct source metadata.
🤖 Prompt for AI Agents
In `@docs/assets/js/72dc5b25.27f7362a.js` at line 1, The generated JS asset
docs/assets/js/72dc5b25.27f7362a.js contains incorrect category metadata under
the JSON key "categoryGeneratedIndex" (description "Predator v1.0.0" vs expected
"Online Feature Store v1.0.0"); regenerate the documentation assets by running
the documentation build (e.g., the repo's docs build command such as npm/yarn
run build or the documented generator step), clear any build/cache (e.g.,
.cache/public) before rebuilding, and verify the output asset now reflects the
source _category_.json values (slug and description) in the
"categoryGeneratedIndex" entry. Ensure the docs generator reads the source
_category_.json files and re-run the CI/build job if necessary to propagate the
fixed artifact.
Semgrep found 6
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@docs-src/package.json`:
- Line 28: Remove the "yarn": "1.22.22" devDependency entry from package.json
(the "yarn" key) and instead add a top-level packageManager field with the
desired manager and version (for example "packageManager": "yarn@1.22.22"); then
update CI/dev docs to enable corepack (corepack enable) so the enforced version
is used rather than listing yarn as a dependency.
  "@docusaurus/module-type-aliases": "3.8.1",
- "@docusaurus/types": "3.8.1"
+ "@docusaurus/types": "3.8.1",
+ "yarn": "1.22.22"
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Avoid listing yarn as a devDependency.
Adding a package manager as a project dependency is an anti-pattern. It creates circular dependency logic (using yarn to install yarn) and can cause version conflicts with globally installed yarn.
If the goal is to enforce a specific yarn version, use the packageManager field instead, which is supported by Node.js corepack:
♻️ Recommended approach using packageManager field
Remove the yarn devDependency and add a packageManager field at the root level of package.json:
"devDependencies": {
"@docusaurus/module-type-aliases": "3.8.1",
- "@docusaurus/types": "3.8.1",
- "yarn": "1.22.22"
+ "@docusaurus/types": "3.8.1"
},
+ "packageManager": "yarn@1.22.22",
  "browserslist": {

Then enable corepack in your CI/development environment with:

corepack enable

🤖 Prompt for AI Agents
In `@docs-src/package.json` at line 28, Remove the "yarn": "1.22.22" devDependency
entry from package.json (the "yarn" key) and instead add a top-level
packageManager field with the desired manager and version (for example
"packageManager": "yarn@1.22.22"); then update CI/dev docs to enable corepack
(corepack enable) so the enforced version is used rather than listing yarn as a
dependency.
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@docs-src/docs/trufflebox-ui/v1.0.0/index.md`:
- Line 12: Update the sentence that currently reads "feature catalog,
management, user management, and other admin operations" to remove redundancy by
changing it to "feature catalog management, user management, and other admin
operations" so the phrase is clear and parallel; locate and edit the sentence in
the Trufflebox UI description (the line containing "Trufflebox UI is a modern,
feature-rich UI framework for supporting MLOps.") and replace the listed items
accordingly.
In `@docs/assets/js/0fff8dc8.70193857.js`:
- Line 1: The Quick Start page currently displays default admin credentials
("admin@admin.com" / "admin") without a security warning; update the source
markdown (quick-start.md) so that near the "Default admin credentials" section
(rendered by function d and the code nodes showing "admin@admin.com" and
"admin") you add a prominent security note advising users to rotate these
credentials immediately, restrict them to local development only, and never
expose default logins in production or public environments.
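For context, a hedged sketch of what such a note could look like in the source quick-start.md. Docusaurus supports admonition blocks like `:::danger`; the wording below is illustrative, not taken from the repository:

```markdown
:::danger Default credentials
The `admin@admin.com` / `admin` login is intended for local development only.
Change or disable it before exposing any service beyond localhost, and never
ship these defaults to a production or publicly reachable environment.
:::
```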
In `@docs/assets/js/14eb3368.60af715e.js`:
- Line 1: The Biome linter is flagging generated Docusaurus bundle files (e.g.,
docs/assets/js/14eb3368.60af715e.js) with the noInvalidUseBeforeDeclaration
rule; add an ignore entry to your Biome config (biome.json) to exclude generated
assets under docs/assets/js/** (or the appropriate glob) so these minified
webpack bundles are not linted. Update biome.json's "ignore" array to include
that pattern and ensure the project picks up the config (restart or re-run lint)
so files like the bundle above are skipped.
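As a sketch of that suggestion, a Biome v1.x config can list ignore globs under `files.ignore` (the exact globs below are assumptions about this repo's layout; adjust as needed, and note newer Biome versions use a different key):

```json
{
  "files": {
    "ignore": ["docs/assets/js/**", "docs/assets/css/**"]
  }
}
```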
In `@docs/assets/js/4caa95bf.4a20c460.js`:
- Line 1: The webpack chunk is a generated artifact; fix the underlying source:
rename the image referenced as "v1.0.0-psdb-fixed-length-encodding-..." to
"v1.0.0-psdb-fixed-length-encoding-..." (update the asset file and any markdown
references), and correct typos in the markdown source referenced by the parsed
metadata (source:"@site/docs/online-feature-store/v1.0.0/data-formats.md") —
change "at the hear of" → "at the heart of", "inspired form other" → "inspired
from other", and "used wile storing" → "used while storing" so regenerated
bundles (the code that builds the asset strings like the one containing
i(5153).A) will include the corrected filename and text.
In `@docs/assets/js/8cdb4121.8360b1d4.js`:
- Line 1: The rendered bundle shows a leftover "link" placeholder in the "Image
Streaming" list item inside the rendered tree produced by function d (JSX n.li
under the "Image Streaming & Deployment" section); update the original MDX
source for the post (the content that builds d) to replace the literal "link"
placeholder with the intended hyperlink (either a Markdown inline link or an
HTML <a> tag) so the "Image Streaming: We utilize container image streaming..."
bullet reads with the proper URL instead of the word "link".
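To illustrate the two equivalent replacements for the placeholder (the URL below is a stand-in, not the actual target of the bullet):

```markdown
Image Streaming: We utilize [container image streaming](https://example.com/image-streaming) ...
<!-- or equivalently: <a href="https://example.com/image-streaming">container image streaming</a> -->
```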
In `@docs/assets/js/fcf4f6ca.8b12d88e.js`:
- Line 1: Update the exported JSON in the module where e.exports is set (the
object under categoryGeneratedIndex) to correct the typo in the "description"
value: replace "user managemnet" with "user management" so the description
string reads "...supports Feature catalog, management, user management and other
adminops"; locate the JSON assigned to e.exports (or the categoryGeneratedIndex
object) and update that description field accordingly.
🧹 Nitpick comments (5)
docs/assets/js/f2c141e4.353b4adf.js (1)
1-1: This is an auto-generated webpack bundle. This file is a minified build artifact produced by Docusaurus/webpack. Reviewing generated code for logic or style issues is not meaningful—the actual source of truth is the markdown/MDX blog content.

Consider whether committing built documentation assets is intentional. If the docs site is deployed via GitHub Pages from a `docs/` folder, this pattern is acceptable. However, using CI/CD to build and deploy (e.g., to a `gh-pages` branch or external hosting) would reduce repository noise and avoid merge conflicts in these generated files.

docs/assets/js/3650a837.fd1a89f8.js (1)

1-1: Generated build artifact committed to version control. This is a minified webpack chunk generated by Docusaurus. Committing build artifacts is acceptable if deploying GitHub Pages directly from the repository branch. However, if you have a CI/CD pipeline that builds documentation during deployment, consider adding `docs/assets/js/*.js` to `.gitignore` to avoid:
- Merge conflicts from hash changes on rebuilds
- Repository bloat from binary/minified assets
- Potential inconsistencies between source and built output
If this is intentional for your deployment strategy, this is fine as-is.
docs/assets/js/bf2864cf.6fc085c5.js (1)
1-1: Consider excluding generated webpack bundles from the repo. Minified build artifacts are hard to review and can drift from their source. Prefer committing the source MDX/JS and rebuilding in the docs pipeline, or at least ensure the build is deterministic to keep diffs minimal.

docs/assets/js/d01bc907.3a0113c2.js (1)

1-1: Document the rationale for committing generated build artifacts and establish regeneration practices. Build artifacts (94 JS files) in `docs/assets/` are intentionally tracked in git without `.gitignore` rules or CI/CD automation. While this approach is valid for direct documentation hosting, it carries trade-offs: repository bloat, merge conflict potential, and staleness if source files drift from builds. To maintain this approach robustly, consider:
docs/assets/are intentionally tracked in git without.gitignorerules or CI/CD automation. While this approach is valid for direct documentation hosting, it carries trade-offs: repository bloat, merge conflict potential, and staleness if source files drift from builds.To maintain this approach robustly, consider:
- Document the decision — Add a note in CONTRIBUTING.md or README explaining why artifacts are committed and when they must be regenerated
- Enforce regeneration — Establish a process (pre-commit hook, release checklist) ensuring builds are regenerated consistently before commits
- Automate if possible — Alternatively, evaluate setting up CI/CD to auto-generate and host docs, reducing manual maintenance
docs-src/docs/skye/v1.0.0/index.md (1)
8-14: Use.mdxextension for files with JSX/imports.This file uses JSX components and
importstatements, which are MDX features. While current Docusaurus versions (v2/v3) support these in.mdfiles, Docusaurus recommends and future versions may require.mdxfor files using JSX, imports, or exports. Rename toindex.mdxand update any path references for better compatibility and editor tooling support.
|
|
||
| # Trufflebox UI v1.0.0 | ||
|
|
||
| Trufflebox UI is a modern, feature-rich UI framework for supporting MLOps. It supports feature catalog, management, user management, and other admin operations. |
There was a problem hiding this comment.
Tighten wording to avoid redundancy.
“feature catalog, management, user management” reads repetitive/unclear. Consider “feature catalog management, user management, and other admin operations.”
🤖 Prompt for AI Agents
In `@docs-src/docs/trufflebox-ui/v1.0.0/index.md` at line 12, Update the sentence
that currently reads "feature catalog, management, user management, and other
admin operations" to remove redundancy by changing it to "feature catalog
management, user management, and other admin operations" so the phrase is clear
and parallel; locate and edit the sentence in the Trufflebox UI description (the
line containing "Trufflebox UI is a modern, feature-rich UI framework for
supporting MLOps.") and replace the listed items accordingly.
| @@ -0,0 +1 @@ | |||
| "use strict";(self.webpackChunkdocs=self.webpackChunkdocs||[]).push([[9596],{5958:(e,n,s)=>{s.r(n),s.d(n,{assets:()=>a,contentTitle:()=>c,default:()=>h,frontMatter:()=>t,metadata:()=>r,toc:()=>o});const r=JSON.parse('{"id":"quick-start/v1.0.0/quick-start","title":"Quick Start","description":"Discord","source":"@site/docs/quick-start/v1.0.0/quick-start.md","sourceDirName":"quick-start/v1.0.0","slug":"/quick-start/v1.0.0/quick-start","permalink":"/BharatMLStack/quick-start/v1.0.0/quick-start","draft":false,"unlisted":false,"editUrl":"https://github.com/Meesho/BharatMLStack/tree/main/docs/docs/quick-start/v1.0.0/quick-start.md","tags":[],"version":"current","sidebarPosition":1,"frontMatter":{"title":"Quick Start","sidebar_position":1},"sidebar":"tutorialSidebar","previous":{"title":"v1.0.0","permalink":"/BharatMLStack/quick-start/v1.0.0"},"next":{"title":"Trufflebox UI","permalink":"/BharatMLStack/category/trufflebox-ui"}}');var i=s(4848),l=s(8453);const t={title:"Quick Start",sidebar_position:1},c="BharatML Stack Quick Start Guide",a={},o=[{value:"Prerequisites",id:"prerequisites",level:2},{value:"System Components",id:"system-components",level:2},{value:"Quick Start",id:"quick-start",level:2},{value:"Starting the System",id:"starting-the-system",level:3},{value:"Testing Different Versions",id:"testing-different-versions",level:3},{value:"Stopping the System",id:"stopping-the-system",level:3},{value:"Accessing Services",id:"accessing-services",level:2},{value:"Frontend UI",id:"frontend-ui",level:3},{value:"API Endpoints",id:"api-endpoints",level:3},{value:"Database Access",id:"database-access",level:3},{value:"Feature Store API Examples",id:"feature-store-api-examples",level:2},{value:"gRPC API Commands",id:"grpc-api-commands",level:3},{value:"Sample Request Bodies",id:"sample-request-bodies",level:3},{value:"Key Points",id:"key-points",level:3},{value:"Response Format Differences",id:"response-format-differences",level:3},{value:"Managing 
Services",id:"managing-services",level:2},{value:"Viewing Logs",id:"viewing-logs",level:3},{value:"Service Management",id:"service-management",level:3},{value:"Troubleshooting",id:"troubleshooting",level:2},{value:"Common Issues",id:"common-issues",level:3},{value:"Service Dependencies",id:"service-dependencies",level:3},{value:"Development",id:"development",level:2},{value:"Contributing",id:"contributing",level:2},{value:"Community & Support",id:"community--support",level:2},{value:"License",id:"license",level:2}];function d(e){const n={a:"a",code:"code",h1:"h1",h2:"h2",h3:"h3",header:"header",hr:"hr",img:"img",li:"li",ol:"ol",p:"p",pre:"pre",strong:"strong",ul:"ul",...(0,l.R)(),...e.components};return(0,i.jsxs)(i.Fragment,{children:[(0,i.jsx)(n.header,{children:(0,i.jsx)(n.h1,{id:"bharatml-stack-quick-start-guide",children:"BharatML Stack Quick Start Guide"})}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.a,{href:"https://discord.gg/XkT7XsV2AU",children:(0,i.jsx)(n.img,{src:"https://img.shields.io/badge/Discord-Join%20Chat-7289da?style=flat&logo=discord&logoColor=white",alt:"Discord"})})}),"\n",(0,i.jsx)(n.p,{children:"A quick way to get the BharatML Stack Online Feature Store platform up and running locally for development and testing."}),"\n",(0,i.jsx)(n.h2,{id:"prerequisites",children:"Prerequisites"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsx)(n.li,{children:"Docker and Docker Compose"}),"\n",(0,i.jsx)(n.li,{children:"Go 1.22 or later"}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.code,{children:"nc"})," (netcat) command for connectivity checks"]}),"\n",(0,i.jsx)(n.li,{children:"Bash shell"}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.code,{children:"grpcurl"})," for testing gRPC API endpoints (install from ",(0,i.jsx)(n.a,{href:"https://github.com/fullstorydev/grpcurl",children:"https://github.com/fullstorydev/grpcurl"}),")"]}),"\n"]}),"\n",(0,i.jsx)(n.h2,{id:"system-components",children:"System Components"}),"\n",(0,i.jsx)(n.p,{children:"BharatMLStack's 
Online Feature Store consists of several interconnected services:"}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.strong,{children:"Infrastructure Services:"})}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"ScyllaDB"}),": NoSQL database for high-performance feature storage"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"MySQL"}),": Relational database for metadata and configuration"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Redis"}),": In-memory data store for caching"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"etcd"}),": Distributed key-value store for service coordination"]}),"\n"]}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.strong,{children:"Application Services:"})}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Horizon"}),": Backend API service (runs on port 8082)"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Trufflebox UI"}),": Frontend web interface (runs on port 3000)"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Online Feature Store gRPC API Server"}),": High-performance gRPC interface (runs on port 8089)"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"etcd Workbench"}),": etcd management interface (runs on port 8081)"]}),"\n"]}),"\n",(0,i.jsx)(n.p,{children:"All services are orchestrated using Docker Compose with pre-built images from GitHub Container Registry (GHCR)."}),"\n",(0,i.jsx)(n.h2,{id:"quick-start",children:"Quick Start"}),"\n",(0,i.jsx)(n.h3,{id:"starting-the-system",children:"Starting the System"}),"\n",(0,i.jsx)(n.p,{children:"Run the start script to set up your workspace and launch all services:"}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-bash",children:"./start.sh\n"})}),"\n",(0,i.jsx)(n.h3,{id:"testing-different-versions",children:"Testing Different Versions"}),"\n",(0,i.jsx)(n.p,{children:"You can 
easily test different versions of the application services by setting environment variables:"}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-bash",children:"# Test specific versions [Replace with actual versions]\nONFS_VERSION=v1.2.3 HORIZON_VERSION=v2.1.0 TRUFFLEBOX_VERSION=v1.0.5 ./start.sh\n\n# Or set them in your workspace and run docker-compose directly\ncd workspace\nONFS_VERSION=main docker-compose up -d onfs-api-server\n"})}),"\n",(0,i.jsx)(n.p,{children:"Available version formats:"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.code,{children:"latest"})," (default) - Latest stable release"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.code,{children:"main"})," - Latest development build"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.code,{children:"v1.2.3"})," - Specific version tag"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.code,{children:"sha-abcd1234"})," - Specific commit SHA"]}),"\n"]}),"\n",(0,i.jsx)(n.p,{children:"This will:"}),"\n",(0,i.jsxs)(n.ol,{children:["\n",(0,i.jsx)(n.li,{children:"Check for Go installation (1.22+ required)"}),"\n",(0,i.jsx)(n.li,{children:"Create a workspace directory with configuration files"}),"\n",(0,i.jsxs)(n.li,{children:["Pull and start all services using ",(0,i.jsx)(n.code,{children:"docker-compose up -d"})]}),"\n",(0,i.jsx)(n.li,{children:"Wait for services to become healthy"}),"\n",(0,i.jsx)(n.li,{children:"Initialize databases with required schemas"}),"\n",(0,i.jsx)(n.li,{children:"Display access information and helpful commands"}),"\n"]}),"\n",(0,i.jsx)(n.p,{children:"Once complete, you can access:"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Trufflebox UI"}),": ",(0,i.jsx)(n.a,{href:"http://localhost:3000",children:"http://localhost:3000"})]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Horizon API"}),": 
",(0,i.jsx)(n.a,{href:"http://localhost:8082",children:"http://localhost:8082"})]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Online Feature Store gRPC API"}),": ",(0,i.jsx)(n.a,{href:"http://localhost:8089",children:"http://localhost:8089"})]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"etcd Workbench"}),": ",(0,i.jsx)(n.a,{href:"http://localhost:8081",children:"http://localhost:8081"})]}),"\n"]}),"\n",(0,i.jsx)(n.h3,{id:"stopping-the-system",children:"Stopping the System"}),"\n",(0,i.jsx)(n.p,{children:"To stop all services:"}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-bash",children:"./stop.sh\n"})}),"\n",(0,i.jsx)(n.p,{children:"To stop and completely purge all containers, volumes, and workspace:"}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-bash",children:"./stop.sh --purge\n"})}),"\n",(0,i.jsx)(n.h2,{id:"accessing-services",children:"Accessing Services"}),"\n",(0,i.jsx)(n.h3,{id:"frontend-ui",children:"Frontend UI"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"URL"}),": ",(0,i.jsx)(n.a,{href:"http://localhost:3000",children:"http://localhost:3000"})]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Default admin credentials"}),":","\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:["Email: ",(0,i.jsx)(n.code,{children:"admin@admin.com"})]}),"\n",(0,i.jsxs)(n.li,{children:["Password: ",(0,i.jsx)(n.code,{children:"admin"})]}),"\n"]}),"\n"]}),"\n"]}),"\n",(0,i.jsx)(n.h3,{id:"api-endpoints",children:"API Endpoints"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Horizon API"}),": ",(0,i.jsx)(n.a,{href:"http://localhost:8082",children:"http://localhost:8082"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:["Health check: 
",(0,i.jsx)(n.a,{href:"http://localhost:8082/health",children:"http://localhost:8082/health"})]}),"\n"]}),"\n"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"ONFS gRPC API"}),": ",(0,i.jsx)(n.a,{href:"http://localhost:8089",children:"http://localhost:8089"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:["Health check: ",(0,i.jsx)(n.a,{href:"http://localhost:8089/health/self",children:"http://localhost:8089/health/self"})]}),"\n"]}),"\n"]}),"\n"]}),"\n",(0,i.jsx)(n.h3,{id:"database-access",children:"Database Access"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:["\n",(0,i.jsxs)(n.p,{children:[(0,i.jsx)(n.strong,{children:"MySQL"}),":"]}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsx)(n.li,{children:"Host: localhost"}),"\n",(0,i.jsx)(n.li,{children:"Port: 3306"}),"\n",(0,i.jsx)(n.li,{children:"Username: root"}),"\n",(0,i.jsx)(n.li,{children:"Password: root"}),"\n",(0,i.jsx)(n.li,{children:"Database: testdb"}),"\n"]}),"\n"]}),"\n",(0,i.jsxs)(n.li,{children:["\n",(0,i.jsxs)(n.p,{children:[(0,i.jsx)(n.strong,{children:"ScyllaDB"}),":"]}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsx)(n.li,{children:"Host: localhost"}),"\n",(0,i.jsx)(n.li,{children:"Port: 9042"}),"\n",(0,i.jsx)(n.li,{children:"Keyspace: onfs"}),"\n"]}),"\n"]}),"\n",(0,i.jsxs)(n.li,{children:["\n",(0,i.jsxs)(n.p,{children:[(0,i.jsx)(n.strong,{children:"Redis"}),":"]}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsx)(n.li,{children:"Host: localhost"}),"\n",(0,i.jsx)(n.li,{children:"Port: 6379"}),"\n"]}),"\n"]}),"\n",(0,i.jsxs)(n.li,{children:["\n",(0,i.jsxs)(n.p,{children:[(0,i.jsx)(n.strong,{children:"etcd"}),":"]}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:["Endpoint: ",(0,i.jsx)(n.a,{href:"http://localhost:2379",children:"http://localhost:2379"})]}),"\n",(0,i.jsxs)(n.li,{children:["Workbench: 
",(0,i.jsx)(n.a,{href:"http://localhost:8081",children:"http://localhost:8081"})]}),"\n"]}),"\n"]}),"\n"]}),"\n",(0,i.jsx)(n.h2,{id:"feature-store-api-examples",children:"Feature Store API Examples"}),"\n",(0,i.jsx)(n.h3,{id:"grpc-api-commands",children:"gRPC API Commands"}),"\n",(0,i.jsxs)(n.p,{children:["Use the following ",(0,i.jsx)(n.code,{children:"grpcurl"})," commands to interact with the Online Feature Store gRPC API:"]}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.strong,{children:"Persist Features:"})}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-bash",children:'grpcurl -plaintext -H "online-feature-store-caller-id: <caller-id>" -H "online-feature-store-auth-token: <auth-token>" -d \'<request-body>\' localhost:8089 persist.FeatureService/PersistFeatures\n'})}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.strong,{children:"Retrieve Features (Decoded):"})}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-bash",children:'grpcurl -plaintext -H "online-feature-store-caller-id: <caller-id>" -H "online-feature-store-auth-token: <auth-token>" -d \'<request-body>\' localhost:8089 retrieve.FeatureService/RetrieveDecodedResult\n'})}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.strong,{children:"Retrieve Features (Binary):"})}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-bash",children:'grpcurl -plaintext -H "online-feature-store-caller-id: <caller-id>" -H "online-feature-store-auth-token: <auth-token>" -d \'<request-body>\' localhost:8089 retrieve.FeatureService/RetrieveFeatures\n'})}),"\n",(0,i.jsx)(n.h3,{id:"sample-request-bodies",children:"Sample Request Bodies"}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.strong,{children:"Single Feature Group Persist:"})}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-json",children:'{\n "data": [{\n "key_values": ["10"],\n "feature_values": [{\n "values": {"fp32_values": [123.45]}\n }]\n }],\n "entity_label": "catalog",\n "feature_group_schema": 
[{\n "label": "int_fg",\n "feature_labels": ["id"]\n }],\n "keys_schema": ["catalog_id"]\n}\n'})}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.strong,{children:"Single Feature Group Retrieve:"})}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-json",children:'{\n "entity_label": "catalog",\n "feature_groups": [{\n "label": "int_fg",\n "feature_labels": ["id"]\n }],\n "keys_schema": ["catalog_id"],\n "keys": [{"cols": ["10"]}]\n}\n'})}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.strong,{children:"Multiple Feature Groups Persist:"})}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-json",children:'{\n "data": [\n {\n "key_values": ["1"],\n "feature_values": [\n {"values": {"fp32_values": [28.5]}},\n {"values": {"string_values": ["Bharat"]}}\n ]\n },\n {\n "key_values": ["2"],\n "feature_values": [\n {"values": {"fp32_values": [32.0]}},\n {"values": {"string_values": ["India"]}}\n ]\n }\n ],\n "entity_label": "catalog",\n "feature_group_schema": [\n {"label": "int_fg", "feature_labels": ["id"]},\n {"label": "string_fg", "feature_labels": ["name"]}\n ],\n "keys_schema": ["catalog_id"]\n}\n'})}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.strong,{children:"Multiple Feature Groups Retrieve:"})}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-json",children:'{\n "entity_label": "catalog",\n "feature_groups": [\n {"label": "int_fg", "feature_labels": ["id"]},\n {"label": "string_fg", "feature_labels": ["name"]}\n ],\n "keys_schema": ["catalog_id"],\n "keys": [\n {"cols": ["1"]},\n {"cols": ["2"]}\n ]\n}\n'})}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.strong,{children:"Vector Feature Group Persist:"})}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-json",children:'{\n "data": [{\n "key_values": ["123"],\n "feature_values": [{\n "values": {\n "vector": [{\n "values": {"fp32_values": [1.0, 2.0, 3.0, 4.0]}\n }]\n }\n }]\n }],\n "entity_label": "catalog",\n "feature_group_schema": [{\n "label": 
"vector_fg",\n "feature_labels": ["embedding"]\n }],\n "keys_schema": ["catalog_id"]\n}\n'})}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.strong,{children:"Vector Feature Group Retrieve:"})}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-json",children:'{\n "entity_label": "catalog",\n "feature_groups": [{\n "label": "vector_fg",\n "feature_labels": ["embedding"]\n }],\n "keys_schema": ["catalog_id"],\n "keys": [{"cols": ["123"]}]\n}\n'})}),"\n",(0,i.jsx)(n.h3,{id:"key-points",children:"Key Points"}),"\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.strong,{children:"Only one type per feature value block:"})}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.code,{children:"feature_values"})," is a list, and each item in the list has only one value type populated"]}),"\n",(0,i.jsxs)(n.li,{children:["For example: one item has only ",(0,i.jsx)(n.code,{children:"fp32_values"}),", another has only ",(0,i.jsx)(n.code,{children:"int64_values"})]}),"\n"]}),"\n",(0,i.jsxs)(n.p,{children:[(0,i.jsx)(n.strong,{children:"Field Types:"}),"\nThe following value types are supported:"]}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"fp32_values"}),": ",(0,i.jsx)(n.code,{children:"float32[]"})]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"fp64_values"}),": ",(0,i.jsx)(n.code,{children:"float64[]"})]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"int32_values"}),": ",(0,i.jsx)(n.code,{children:"int32[]"})]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"int64_values"}),": ",(0,i.jsx)(n.code,{children:"string[]"})," (because JSON doesn't support 64-bit ints directly)"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"uint32_values"}),": ",(0,i.jsx)(n.code,{children:"uint32[]"})]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"uint64_values"}),": 
",(0,i.jsx)(n.code,{children:"string[]"})]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"string_values"}),": ",(0,i.jsx)(n.code,{children:"string[]"})]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"bool_values"}),": ",(0,i.jsx)(n.code,{children:"bool[]"})]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"vector"}),": list of objects with nested values (used for embedded features)"]}),"\n"]}),"\n",(0,i.jsx)(n.h3,{id:"response-format-differences",children:"Response Format Differences"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Retrieve Features (Binary)"}),": Returns data in binary format for optimal performance and reduced network overhead"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.strong,{children:"Retrieve Features (Decoded)"}),": Returns data in human-readable string format for easier debugging and development purposes"]}),"\n"]}),"\n",(0,i.jsx)(n.h2,{id:"managing-services",children:"Managing Services"}),"\n",(0,i.jsx)(n.h3,{id:"viewing-logs",children:"Viewing Logs"}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-bash",children:"# View logs for all services\ncd workspace && docker-compose logs -f\n\n# View logs for specific services\ncd workspace && docker-compose logs -f horizon\ncd workspace && docker-compose logs -f trufflebox-ui\ncd workspace && docker-compose logs -f onfs-api-server\n"})}),"\n",(0,i.jsx)(n.h3,{id:"service-management",children:"Service Management"}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-bash",children:"# Restart a specific service\ncd workspace && docker-compose restart horizon\n\n# Stop all services\ncd workspace && docker-compose down\n\n# Start services again\ncd workspace && docker-compose up -d\n\n# Check service status\ncd workspace && docker-compose 
ps\n"})}),"\n",(0,i.jsx)(n.h2,{id:"troubleshooting",children:"Troubleshooting"}),"\n",(0,i.jsx)(n.h3,{id:"common-issues",children:"Common Issues"}),"\n",(0,i.jsxs)(n.ol,{children:["\n",(0,i.jsxs)(n.li,{children:["\n",(0,i.jsxs)(n.p,{children:[(0,i.jsx)(n.strong,{children:"Port conflicts"}),": Ensure ports 3000, 8081, 8082, 8089, 9042, 3306, 6379, and 2379 are not in use by other applications."]}),"\n"]}),"\n",(0,i.jsxs)(n.li,{children:["\n",(0,i.jsxs)(n.p,{children:[(0,i.jsx)(n.strong,{children:"Docker network issues"}),": If containers can't communicate, try recreating:"]}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-bash",children:"docker network rm onfs-network\ndocker network create onfs-network\n"})}),"\n"]}),"\n",(0,i.jsxs)(n.li,{children:["\n",(0,i.jsxs)(n.p,{children:[(0,i.jsx)(n.strong,{children:"Service health checks failing"}),": Check if all infrastructure services (databases) are running:"]}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-bash",children:"cd workspace && docker-compose ps\n"})}),"\n"]}),"\n",(0,i.jsxs)(n.li,{children:["\n",(0,i.jsxs)(n.p,{children:[(0,i.jsx)(n.strong,{children:"Image pull issues"}),": Ensure you have access to GitHub Container Registry:"]}),"\n",(0,i.jsx)(n.pre,{children:(0,i.jsx)(n.code,{className:"language-bash",children:"docker login ghcr.io\n"})}),"\n"]}),"\n",(0,i.jsxs)(n.li,{children:["\n",(0,i.jsx)(n.p,{children:(0,i.jsx)(n.a,{href:"https://github.com/tzfun/etcd-workbench/blob/master/README.md",children:"How to use Etcd Workbench ?"})}),"\n"]}),"\n"]}),"\n",(0,i.jsx)(n.h3,{id:"service-dependencies",children:"Service Dependencies"}),"\n",(0,i.jsx)(n.p,{children:"Services start in the following order:"}),"\n",(0,i.jsxs)(n.ol,{children:["\n",(0,i.jsx)(n.li,{children:"Infrastructure services (ScyllaDB, MySQL, Redis, etcd)"}),"\n",(0,i.jsx)(n.li,{children:"Online Feature Store gRPC API Server"}),"\n",(0,i.jsx)(n.li,{children:"Horizon (depends on databases + ONFS 
API)"}),"\n",(0,i.jsx)(n.li,{children:"Trufflebox UI (depends on Horizon)"}),"\n"]}),"\n",(0,i.jsx)(n.p,{children:"If a service fails to start, check its dependencies are healthy first."}),"\n",(0,i.jsx)(n.h2,{id:"development",children:"Development"}),"\n",(0,i.jsx)(n.p,{children:"The workspace directory contains all runtime configuration:"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.code,{children:"workspace/docker-compose.yml"})," - Complete service orchestration"]}),"\n",(0,i.jsxs)(n.li,{children:[(0,i.jsx)(n.code,{children:"workspace/check_db_and_init.sh"})," - Database initialization script"]}),"\n"]}),"\n",(0,i.jsx)(n.p,{children:"You can modify environment variables in the docker-compose.yml file and restart services."}),"\n",(0,i.jsx)(n.h2,{id:"contributing",children:"Contributing"}),"\n",(0,i.jsxs)(n.p,{children:["We welcome contributions from the community! Please see our ",(0,i.jsx)(n.a,{href:"https://github.com/Meesho/BharatMLStack/blob/main/CONTRIBUTING.md",children:"Contributing Guide"})," for details on how to get started."]}),"\n",(0,i.jsx)(n.h2,{id:"community--support",children:"Community & Support"}),"\n",(0,i.jsxs)(n.ul,{children:["\n",(0,i.jsxs)(n.li,{children:["\ud83d\udcac ",(0,i.jsx)(n.strong,{children:"Discord"}),": Join our ",(0,i.jsx)(n.a,{href:"https://discord.gg/XkT7XsV2AU",children:"community chat"})]}),"\n",(0,i.jsxs)(n.li,{children:["\ud83d\udc1b ",(0,i.jsx)(n.strong,{children:"Issues"}),": Report bugs and request features on ",(0,i.jsx)(n.a,{href:"https://github.com/Meesho/BharatMLStack/issues",children:"GitHub Issues"})]}),"\n",(0,i.jsxs)(n.li,{children:["\ud83d\udce7 ",(0,i.jsx)(n.strong,{children:"Email"}),": Contact us at ",(0,i.jsx)(n.a,{href:"mailto:ml-oss@meesho.com",children:"ml-oss@meesho.com"})]}),"\n"]}),"\n",(0,i.jsx)(n.h2,{id:"license",children:"License"}),"\n",(0,i.jsxs)(n.p,{children:["BharatMLStack is open-source software licensed under the 
",(0,i.jsx)(n.a,{href:"https://github.com/Meesho/BharatMLStack/blob/main/LICENSE.md",children:"BharatMLStack Business Source License 1.1"}),"."]}),"\n",(0,i.jsx)(n.hr,{}),"\n",(0,i.jsx)("div",{align:"center",children:(0,i.jsx)("strong",{children:"Built with \u2764\ufe0f for the ML community from Meesho"})}),"\n",(0,i.jsx)("div",{align:"center",children:(0,i.jsx)("strong",{children:"If you find this useful, \u2b50\ufe0f the repo \u2014 your support means the world to us!"})})]})}function h(e={}){const{wrapper:n}={...(0,l.R)(),...e.components};return n?(0,i.jsx)(n,{...e,children:(0,i.jsx)(d,{...e})}):d(e)}},8453:(e,n,s)=>{s.d(n,{R:()=>t,x:()=>c});var r=s(6540);const i={},l=r.createContext(i);function t(e){const n=r.useContext(l);return r.useMemo(function(){return"function"==typeof e?e(n):{...n,...e}},[n,e])}function c(e){let n;return n=e.disableParentContext?"function"==typeof e.components?e.components(i):e.components||i:t(e.components),r.createElement(l.Provider,{value:n},e.children)}}}]); No newline at end of file | |||
Add a warning to rotate default admin credentials in the Quick Start docs.
The doc text includes default admin credentials (admin@admin.com / admin) without a clear warning to change them. Please add a security note in the source markdown (docs/quick-start/v1.0.0/quick-start.md) reminding users to rotate credentials and avoid exposing default logins outside local dev.
🤖 Prompt for AI Agents
In `@docs/assets/js/0fff8dc8.70193857.js` at line 1, The Quick Start page
currently displays default admin credentials ("admin@admin.com" / "admin")
without a security warning; update the source markdown (quick-start.md) so that
near the "Default admin credentials" section (rendered by function d and the
code nodes showing "admin@admin.com" and "admin") you add a prominent security
note advising users to rotate these credentials immediately, restrict them to
local development only, and never expose default logins in production or public
environments.
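If that note is added, a minimal sketch of what it could look like in quick-start.md, using a Docusaurus admonition (the wording here is illustrative, not the exact source text):

```md
:::warning Default credentials are for local development only
The seeded admin login (`admin@admin.com` / `admin`) exists solely for the local
Docker Compose workspace. Rotate these credentials immediately if the stack is
reachable from anywhere other than localhost, and never ship them to production.
:::
```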
| @@ -0,0 +1 @@ | |||
| "use strict";(self.webpackChunkdocs=self.webpackChunkdocs||[]).push([[6969],{594:(e,s,n)=>{n.d(s,{A:()=>j});n(6540);var t=n(4164),r=n(7559),a=n(6972),i=n(9169),c=n(8774),l=n(1312),o=n(6025),d=n(4848);function u(e){return(0,d.jsx)("svg",{viewBox:"0 0 24 24",...e,children:(0,d.jsx)("path",{d:"M10 19v-5h4v5c0 .55.45 1 1 1h3c.55 0 1-.45 1-1v-7h1.7c.46 0 .68-.57.33-.87L12.67 3.6c-.38-.34-.96-.34-1.34 0l-8.36 7.53c-.34.3-.13.87.33.87H5v7c0 .55.45 1 1 1h3c.55 0 1-.45 1-1z",fill:"currentColor"})})}const m={breadcrumbHomeIcon:"breadcrumbHomeIcon_YNFT"};function h(){const e=(0,o.Ay)("/");return(0,d.jsx)("li",{className:"breadcrumbs__item",children:(0,d.jsx)(c.A,{"aria-label":(0,l.T)({id:"theme.docs.breadcrumbs.home",message:"Home page",description:"The ARIA label for the home page in the breadcrumbs"}),className:"breadcrumbs__link",href:e,children:(0,d.jsx)(u,{className:m.breadcrumbHomeIcon})})})}var b=n(5260),x=n(4586);function p(e){const s=function({breadcrumbs:e}){const{siteConfig:s}=(0,x.A)();return{"@context":"https://schema.org","@type":"BreadcrumbList",itemListElement:e.filter(e=>e.href).map((e,n)=>({"@type":"ListItem",position:n+1,name:e.label,item:`${s.url}${e.href}`}))}}({breadcrumbs:e.breadcrumbs});return(0,d.jsx)(b.A,{children:(0,d.jsx)("script",{type:"application/ld+json",children:JSON.stringify(s)})})}const v={breadcrumbsContainer:"breadcrumbsContainer_Z_bl"};function g({children:e,href:s,isLast:n}){const t="breadcrumbs__link";return n?(0,d.jsx)("span",{className:t,children:e}):s?(0,d.jsx)(c.A,{className:t,href:s,children:(0,d.jsx)("span",{children:e})}):(0,d.jsx)("span",{className:t,children:e})}function f({children:e,active:s}){return(0,d.jsx)("li",{className:(0,t.A)("breadcrumbs__item",{"breadcrumbs__item--active":s}),children:e})}function j(){const e=(0,a.OF)(),s=(0,i.Dt)();return 
e?(0,d.jsxs)(d.Fragment,{children:[(0,d.jsx)(p,{breadcrumbs:e}),(0,d.jsx)("nav",{className:(0,t.A)(r.G.docs.docBreadcrumbs,v.breadcrumbsContainer),"aria-label":(0,l.T)({id:"theme.docs.breadcrumbs.navAriaLabel",message:"Breadcrumbs",description:"The ARIA label for the breadcrumbs"}),children:(0,d.jsxs)("ul",{className:"breadcrumbs",children:[s&&(0,d.jsx)(h,{}),e.map((s,n)=>{const t=n===e.length-1,r="category"===s.type&&s.linkUnlisted?void 0:s.href;return(0,d.jsx)(f,{active:t,children:(0,d.jsx)(g,{href:r,isLast:t,children:s.label})},n)})]})})]}):null}},1878:(e,s,n)=>{n.d(s,{A:()=>p});n(6540);var t=n(4164),r=n(4586),a=n(8774),i=n(1312),c=n(4070),l=n(7559),o=n(3886),d=n(3025),u=n(4848);const m={unreleased:function({siteTitle:e,versionMetadata:s}){return(0,u.jsx)(i.A,{id:"theme.docs.versions.unreleasedVersionLabel",description:"The label used to tell the user that he's browsing an unreleased doc version",values:{siteTitle:e,versionLabel:(0,u.jsx)("b",{children:s.label})},children:"This is unreleased documentation for {siteTitle} {versionLabel} version."})},unmaintained:function({siteTitle:e,versionMetadata:s}){return(0,u.jsx)(i.A,{id:"theme.docs.versions.unmaintainedVersionLabel",description:"The label used to tell the user that he's browsing an unmaintained doc version",values:{siteTitle:e,versionLabel:(0,u.jsx)("b",{children:s.label})},children:"This is documentation for {siteTitle} {versionLabel}, which is no longer actively maintained."})}};function h(e){const s=m[e.versionMetadata.banner];return(0,u.jsx)(s,{...e})}function b({versionLabel:e,to:s,onClick:n}){return(0,u.jsx)(i.A,{id:"theme.docs.versions.latestVersionSuggestionLabel",description:"The label used to tell the user to check the latest version",values:{versionLabel:e,latestVersionLink:(0,u.jsx)("b",{children:(0,u.jsx)(a.A,{to:s,onClick:n,children:(0,u.jsx)(i.A,{id:"theme.docs.versions.latestVersionLinkLabel",description:"The label used for the latest version suggestion link label",children:"latest 
version"})})})},children:"For up-to-date documentation, see the {latestVersionLink} ({versionLabel})."})}function x({className:e,versionMetadata:s}){const{siteConfig:{title:n}}=(0,r.A)(),{pluginId:a}=(0,c.vT)({failfast:!0}),{savePreferredVersionName:i}=(0,o.g1)(a),{latestDocSuggestion:d,latestVersionSuggestion:m}=(0,c.HW)(a),x=d??(p=m).docs.find(e=>e.id===p.mainDocId);var p;return(0,u.jsxs)("div",{className:(0,t.A)(e,l.G.docs.docVersionBanner,"alert alert--warning margin-bottom--md"),role:"alert",children:[(0,u.jsx)("div",{children:(0,u.jsx)(h,{siteTitle:n,versionMetadata:s})}),(0,u.jsx)("div",{className:"margin-top--md",children:(0,u.jsx)(b,{versionLabel:m.label,to:x.path,onClick:()=>i(m.name)})})]})}function p({className:e}){const s=(0,d.r)();return s.banner?(0,u.jsx)(x,{className:e,versionMetadata:s}):null}},4267:(e,s,n)=>{n.d(s,{A:()=>l});n(6540);var t=n(4164),r=n(1312),a=n(7559),i=n(3025),c=n(4848);function l({className:e}){const s=(0,i.r)();return s.badge?(0,c.jsx)("span",{className:(0,t.A)(e,a.G.docs.docVersionBadge,"badge badge--secondary"),children:(0,c.jsx)(r.A,{id:"theme.docs.versionBadge.label",values:{versionLabel:s.label},children:"Version: {versionLabel}"})}):null}},4795:(e,s,n)=>{n.d(s,{A:()=>j});n(6540);var t=n(4164),r=n(6972),a=n(8774),i=n(5846),c=n(6654),l=n(1312),o=n(1107);const d={cardContainer:"cardContainer_fWXF",cardTitle:"cardTitle_rnsV",cardDescription:"cardDescription_PWke"};var u=n(4848);function m({className:e,href:s,children:n}){return(0,u.jsx)(a.A,{href:s,className:(0,t.A)("card padding--lg",d.cardContainer,e),children:n})}function h({className:e,href:s,icon:n,title:r,description:a}){return(0,u.jsxs)(m,{href:s,className:e,children:[(0,u.jsxs)(o.A,{as:"h2",className:(0,t.A)("text--truncate",d.cardTitle),title:r,children:[n," ",r]}),a&&(0,u.jsx)("p",{className:(0,t.A)("text--truncate",d.cardDescription),title:a,children:a})]})}function b({item:e}){const s=(0,r.Nr)(e),n=function(){const{selectMessage:e}=(0,i.W)();return 
s=>e(s,(0,l.T)({message:"1 item|{count} items",id:"theme.docs.DocCard.categoryDescription.plurals",description:"The default description for a category card in the generated index about how many items this category includes"},{count:s}))}();return s?(0,u.jsx)(h,{className:e.className,href:s,icon:"\ud83d\uddc3\ufe0f",title:e.label,description:e.description??n(e.items.length)}):null}function x({item:e}){const s=(0,c.A)(e.href)?"\ud83d\udcc4\ufe0f":"\ud83d\udd17",n=(0,r.cC)(e.docId??void 0);return(0,u.jsx)(h,{className:e.className,href:e.href,icon:s,title:e.label,description:e.description??n?.description})}function p({item:e}){switch(e.type){case"link":return(0,u.jsx)(x,{item:e});case"category":return(0,u.jsx)(b,{item:e});default:throw new Error(`unknown item type ${JSON.stringify(e)}`)}}const v={docCardListItem:"docCardListItem_W1sv"};function g({className:e}){const s=(0,r.a4)();return(0,u.jsx)(j,{items:s,className:e})}function f({item:e}){return(0,u.jsx)("article",{className:(0,t.A)(v.docCardListItem,"col col--6"),children:(0,u.jsx)(p,{item:e})})}function j(e){const{items:s,className:n}=e;if(!s)return(0,u.jsx)(g,{...e});const a=(0,r.d1)(s);return(0,u.jsx)("section",{className:(0,t.A)("row",n),children:a.map((e,s)=>(0,u.jsx)(f,{item:e},s))})}},5846:(e,s,n)=>{n.d(s,{W:()=>o});var t=n(6540),r=n(4586);const a=["zero","one","two","few","many","other"];function i(e){return a.filter(s=>e.includes(s))}const c={locale:"en",pluralForms:i(["one","other"]),select:e=>1===e?"one":"other"};function l(){const{i18n:{currentLocale:e}}=(0,r.A)();return(0,t.useMemo)(()=>{try{return function(e){const s=new Intl.PluralRules(e);return{locale:e,pluralForms:i(s.resolvedOptions().pluralCategories),select:e=>s.select(e)}}(e)}catch(s){return console.error(`Failed to use Intl.PluralRules for locale "${e}".\nDocusaurus will fallback to the default (English) implementation.\nError: ${s.message}\n`),c}},[e])}function o(){const e=l();return{selectMessage:(s,n)=>function(e,s,n){const 
t=e.split("|");if(1===t.length)return t[0];t.length>n.pluralForms.length&&console.error(`For locale=${n.locale}, a maximum of ${n.pluralForms.length} plural forms are expected (${n.pluralForms.join(",")}), but the message contains ${t.length}: ${e}`);const r=n.select(s),a=n.pluralForms.indexOf(r);return t[Math.min(a,t.length-1)]}(n,s,e)}}},5847:(e,s,n)=>{n.r(s),n.d(s,{default:()=>p});n(6540);var t=n(5500),r=n(6972),a=n(6025),i=n(4795),c=n(7719),l=n(1878),o=n(4267),d=n(594),u=n(1107);const m={generatedIndexPage:"generatedIndexPage_vN6x",title:"title_kItE"};var h=n(4848);function b({categoryGeneratedIndex:e}){return(0,h.jsx)(t.be,{title:e.title,description:e.description,keywords:e.keywords,image:(0,a.Ay)(e.image)})}function x({categoryGeneratedIndex:e}){const s=(0,r.$S)();return(0,h.jsxs)("div",{className:m.generatedIndexPage,children:[(0,h.jsx)(l.A,{}),(0,h.jsx)(d.A,{}),(0,h.jsx)(o.A,{}),(0,h.jsxs)("header",{children:[(0,h.jsx)(u.A,{as:"h1",className:m.title,children:e.title}),e.description&&(0,h.jsx)("p",{children:e.description})]}),(0,h.jsx)("article",{className:"margin-top--lg",children:(0,h.jsx)(i.A,{items:s.items,className:m.list})}),(0,h.jsx)("footer",{className:"margin-top--md",children:(0,h.jsx)(c.A,{previous:e.navigation.previous,next:e.navigation.next})})]})}function p(e){return(0,h.jsxs)(h.Fragment,{children:[(0,h.jsx)(b,{...e}),(0,h.jsx)(x,{...e})]})}},7719:(e,s,n)=>{n.d(s,{A:()=>c});n(6540);var t=n(4164),r=n(1312),a=n(9022),i=n(4848);function c(e){const{className:s,previous:n,next:c}=e;return(0,i.jsxs)("nav",{className:(0,t.A)(s,"pagination-nav"),"aria-label":(0,r.T)({id:"theme.docs.paginator.navAriaLabel",message:"Docs pages",description:"The ARIA label for the docs pagination"}),children:[n&&(0,i.jsx)(a.A,{...n,subLabel:(0,i.jsx)(r.A,{id:"theme.docs.paginator.previous",description:"The label used to navigate to the previous doc",children:"Previous"})}),c&&(0,i.jsx)(a.A,{...c,subLabel:(0,i.jsx)(r.A,{id:"theme.docs.paginator.next",description:"The label 
used to navigate to the next doc",children:"Next"}),isNext:!0})]})}},9022:(e,s,n)=>{n.d(s,{A:()=>i});n(6540);var t=n(4164),r=n(8774),a=n(4848);function i(e){const{permalink:s,title:n,subLabel:i,isNext:c}=e;return(0,a.jsxs)(r.A,{className:(0,t.A)("pagination-nav__link",c?"pagination-nav__link--next":"pagination-nav__link--prev"),to:s,children:[i&&(0,a.jsx)("div",{className:"pagination-nav__sublabel",children:i}),(0,a.jsx)("div",{className:"pagination-nav__label",children:n})]})}}}]); No newline at end of file | |||
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find and inspect Biome configuration files
echo "=== Looking for Biome config files ==="
fd -t f '(biome\.json|biome\.jsonc|\.biomeignore)' .
echo -e "\n=== Content of Biome config files ==="
# Show content of any Biome config files found
fd -t f 'biome\.json' . -x cat -n {}
fd -t f 'biome\.jsonc' . -x cat -n {}
fd -t f '\.biomeignore' . -x cat -n {}
Repository: Meesho/BharatMLStack
Length of output: 2441
🏁 Script executed:
#!/bin/bash
# Verify if docs/assets/js is a generated directory
echo "=== Checking for generated asset indicators ==="
ls -la docs/assets/js/ 2>/dev/null | head -20
echo ""
echo "=== Looking for docusaurus or build config ==="
fd -t f '(docusaurus|package\.json)' docs/ --max-depth 2
echo ""
echo "=== Package.json in docs (if exists) ==="
if [ -f docs/package.json ]; then
cat -n docs/package.json
fi
Repository: Meesho/BharatMLStack
Length of output: 1539
🏁 Script executed:
#!/bin/bash
# Look for docusaurus config and build setup at root
echo "=== Docusaurus config files ==="
fd -t f 'docusaurus\.' . --max-depth 2
echo ""
echo "=== Root package.json (check if docusaurus is configured) ==="
if [ -f package.json ]; then
head -50 package.json
fi
echo ""
echo "=== Check .gitignore for docs/assets ==="
if [ -f .gitignore ]; then
grep -n 'docs/assets\|build\|dist' .gitignore || echo "No matches found"
fi
Repository: Meesho/BharatMLStack
Length of output: 240
🏁 Script executed:
#!/bin/bash
# Check docusaurus config to confirm docs/assets is generated
echo "=== Docusaurus config ==="
head -100 docs-src/docusaurus.config.js
echo ""
echo "=== Check if docs/assets is in .gitignore ==="
cat .gitignore | grep -i 'docs'
echo ""
echo "=== Check file headers for 'generated' markers ==="
head -c 500 docs/assets/js/14eb3368.60af715e.js
Repository: Meesho/BharatMLStack
Length of output: 4146
🏁 Script executed:
#!/bin/bash
# Check Biome documentation for ignore patterns in biome.json
echo "=== Check if Biome ignores can be set in biome.json ==="
# Look for any existing ignore patterns in biome.json
cat biome.json | grep -A 5 -B 5 'ignore\|Ignore' || echo "No ignore patterns found"
echo ""
echo "=== Biome supports 'ignore' via separate config ==="
echo "In Biome 2.3.14, ignores can be set using 'overrides' or via biome.jsonc"
echo "Let's check if there are override patterns:"
cat biome.json | head -80
Repository: Meesho/BharatMLStack
Length of output: 2090
🏁 Script executed:
#!/bin/bash
# Verify the exact Biome rule triggered
echo "=== Check Biome rules for 'noInvalidUseBeforeDeclaration' ==="
# This rule is part of recommended:true in correctness category
# Let's confirm it's not disabled
cat biome.json | grep -i 'noInvalidUseBeforeDeclaration'
echo ""
echo "=== Verify docs/assets should NOT be linted ==="
echo "Generated assets (Webpack bundles) at docs/assets/js/ are:"
echo "- Auto-generated by Docusaurus build"
echo "- Minified and unreadable"
echo "- Should be excluded from linting"
echo ""
echo "Current status: NOT excluded from Biome"
Repository: Meesho/BharatMLStack
Length of output: 371
Exclude generated Docusaurus bundle from Biome lint checks.
The noInvalidUseBeforeDeclaration rule in Biome is triggering on minified Webpack bundle output in docs/assets/js/. Since these are auto-generated assets from the Docusaurus build, they should not be linted. Note that the 2.x schema has no standalone ignore key; exclusions are expressed as negated files.includes globs. Update biome.json to exclude the bundles:
{
"$schema": "https://biomejs.dev/schemas/2.3.14/schema.json",
+ "files": {
+   "includes": ["**", "!docs/assets/js/**"]
+ },
"formatter": {
"enabled": false
},
🪛 Biome (2.3.14)
[error] 1-1: This variable is used before its declaration.
The variable is declared here:
(lint/correctness/noInvalidUseBeforeDeclaration)
🤖 Prompt for AI Agents
In `@docs/assets/js/14eb3368.60af715e.js` at line 1, The Biome linter is flagging
generated Docusaurus bundle files (e.g., docs/assets/js/14eb3368.60af715e.js)
with the noInvalidUseBeforeDeclaration rule; exclude generated assets under
docs/assets/js/** (or the appropriate glob) in the Biome config (biome.json) so
these minified webpack bundles are not linted. In Biome 2.x this is done with a
negated glob in files.includes rather than an ignore array. Ensure the project
picks up the config (restart or re-run lint) so files like the bundle above are
skipped.
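Under that assumption (Biome 2.x, where exclusions are negated `files.includes` globs rather than a separate ignore array), the full config could look like this sketch; the formatter and linter sections mirror the settings already shown for this repo, and the glob targets its generated-assets path:

```json
{
  "$schema": "https://biomejs.dev/schemas/2.3.14/schema.json",
  "files": {
    "includes": ["**", "!docs/assets/js/**"]
  },
  "formatter": {
    "enabled": false
  },
  "linter": {
    "enabled": true,
    "rules": {
      "recommended": true
    }
  }
}
```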
docs/assets/js/4caa95bf.4a20c460.js
Outdated
| @@ -1 +1 @@ | |||
| "use strict";(self.webpackChunkdocs=self.webpackChunkdocs||[]).push([[2344],{3560:(e,n,i)=>{i.d(n,{A:()=>t});const t=i.p+"assets/images/v1.0.0-psdb-anatomy-c1735559f93dce6d0bb3894d16047059.png"},6230:(e,n,i)=>{i.d(n,{A:()=>t});const t=i.p+"assets/images/v1.0.0-psdb-fixed-length-encodding-dd252110b084e01cf38f21de16b3a1a5.png"},7676:(e,n,i)=>{i.d(n,{A:()=>t});const t=i.p+"assets/images/v1.0.0-psdb-string-encoding-b1d69e9452269124d1b545020fa27d63.png"},7780:(e,n,i)=>{i.d(n,{A:()=>t});const t=i.p+"assets/images/v1.0.0-csdb-skip-read-e3926080f7341aa7d3c6ec6d8274ea14.png"},8453:(e,n,i)=>{i.d(n,{R:()=>d,x:()=>l});var t=i(6540);const s={},r=t.createContext(s);function d(e){const n=t.useContext(r);return t.useMemo(function(){return"function"==typeof e?e(n):{...n,...e}},[n,e])}function l(e){let n;return n=e.disableParentContext?"function"==typeof e.components?e.components(s):e.components||s:d(e.components),t.createElement(r.Provider,{value:n},e.children)}},8645:(e,n,i)=>{i.d(n,{A:()=>t});const t=i.p+"assets/images/v1.0.0-psdb-bool-encoding-4b154fdf5e6d79a67c91b6fb21c7209e.png"},9584:(e,n,i)=>{i.r(n),i.d(n,{assets:()=>c,contentTitle:()=>l,default:()=>h,frontMatter:()=>d,metadata:()=>t,toc:()=>a});const t=JSON.parse('{"id":"online-feature-store/v1.0.0/data-formats","title":"Data Formats","description":"In this section we will go through the data-formats which is at the hear of online-feature-store, it\'s inspired form other storage efficient formats like parquet & arrow, but custom made to deliver in constraint environment. 
The two key data-formats are:","source":"@site/docs/online-feature-store/v1.0.0/data-formats.md","sourceDirName":"online-feature-store/v1.0.0","slug":"/online-feature-store/v1.0.0/data-formats","permalink":"/BharatMLStack/online-feature-store/v1.0.0/data-formats","draft":false,"unlisted":false,"editUrl":"https://github.com/Meesho/BharatMLStack/tree/main/docs/docs/online-feature-store/v1.0.0/data-formats.md","tags":[],"version":"current","sidebarPosition":2,"frontMatter":{"title":"Data Formats","sidebar_position":2},"sidebar":"tutorialSidebar","previous":{"title":"Architecture","permalink":"/BharatMLStack/online-feature-store/v1.0.0/architecture"},"next":{"title":"Benchmarks","permalink":"/BharatMLStack/online-feature-store/v1.0.0/benchmarks"}}');var s=i(4848),r=i(8453);const d={title:"Data Formats",sidebar_position:2},l="Data Format for Permanent & Cache Storage",c={},a=[{value:"PSDB (Permanent Storage Data Block) Format",id:"psdb-permanent-storage-data-block-format",level:2},{value:"\ud83e\uddf1 Structure Overview",id:"-structure-overview",level:3},{value:"Supported Data Types",id:"supported-data-types",level:3},{value:"Scalar Types",id:"scalar-types",level:4},{value:"Vector Types",id:"vector-types",level:4},{value:"\ud83d\udce6 Encoding for Scalar Feature Type",id:"-encoding-for-scalar-feature-type",level:3},{value:"1. \ud83d\udd21 String Feature Group (Variable Length Encoding using Pascal)",id:"1--string-feature-group-variable-length-encoding-using-pascal",level:4},{value:"2. \ud83d\udfe9 Boolean Feature Group (Bit-Packed)",id:"2--boolean-feature-group-bit-packed",level:4},{value:"3. \ud83d\udccf Fixed-Length Feature Group",id:"3--fixed-length-feature-group",level:4},{value:"4. 
Compression",id:"4-compression",level:4},{value:"\ud83e\uddec Encoding for Vector Types",id:"-encoding-for-vector-types",level:3},{value:"Conceptual Overview",id:"conceptual-overview",level:4},{value:"Vector Length Metadata",id:"vector-length-metadata",level:4},{value:"Encoding Process",id:"encoding-process",level:4},{value:"<strong>Input Structure</strong>",id:"input-structure",level:5},{value:"<strong>Length Validation</strong>",id:"length-validation",level:5},{value:"<strong>Flattening Strategy</strong>",id:"flattening-strategy",level:5},{value:"<strong>Contiguous Layout</strong>",id:"contiguous-layout",level:5},{value:"\ud83d\udd04 Deserialization/Decoding Flow",id:"-deserializationdecoding-flow",level:3},{value:"Memory Efficiency Benefits",id:"memory-efficiency-benefits",level:3},{value:"Cache Storage Data Block (CSDB) Design",id:"cache-storage-data-block-csdb-design",level:2},{value:"Overview",id:"overview",level:3},{value:"Structure and Purpose",id:"structure-and-purpose",level:3},{value:"Core Fields and Memory Layout",id:"core-fields-and-memory-layout",level:4},{value:"Cache Types",id:"cache-types",level:4},{value:"Format & Encoding",id:"format--encoding",level:3},{value:"Differences Between In-Memory and Distributed Caching",id:"differences-between-in-memory-and-distributed-caching",level:3},{value:"Optimizations & Features",id:"optimizations--features",level:3}];function o(e){const n={code:"code",h1:"h1",h2:"h2",h3:"h3",h4:"h4",h5:"h5",header:"header",hr:"hr",img:"img",li:"li",ol:"ol",p:"p",pre:"pre",strong:"strong",table:"table",tbody:"tbody",td:"td",th:"th",thead:"thead",tr:"tr",ul:"ul",...(0,r.R)(),...e.components};return(0,s.jsxs)(s.Fragment,{children:[(0,s.jsx)(n.header,{children:(0,s.jsx)(n.h1,{id:"data-format-for-permanent--cache-storage",children:"Data Format for Permanent & Cache Storage"})}),"\n",(0,s.jsx)(n.p,{children:"In this section we will go through the data-formats which is at the hear of online-feature-store, it's inspired form other 
storage-efficient formats like Parquet & Arrow, but custom-made to deliver in constrained environments. The two key data-formats are:"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"PSDB"})," - Permanent Storage Data Block used while storing data in ScyllaDB"]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"CSDB"})," - Cache Storage Data Block used while storing data in DragonflyDB or Redis, optimal for KV"]}),"\n"]}),"\n",(0,s.jsx)(n.h2,{id:"psdb-permanent-storage-data-block-format",children:"PSDB (Permanent Storage Data Block) Format"}),"\n",(0,s.jsxs)(n.p,{children:["The ",(0,s.jsx)(n.strong,{children:"PSDB"})," format is a compact, versioned, and schema-aware binary layout used to store feature groups efficiently for ML inference. It supports multiple datatypes (strings, booleans, fixed-size vectors), versioning, TTL, and metadata encoding in a compact header."]}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.h3,{id:"-structure-overview",children:"\ud83e\uddf1 Structure Overview"}),"\n",(0,s.jsx)(n.p,{children:"Each PSDB block is composed of multiple byte sections:"}),"\n",(0,s.jsx)(n.p,{children:(0,s.jsx)(n.img,{alt:"Permanent Storage Data Block Anatomy",src:i(3560).A+"",width:"1854",height:"1102"})}),"\n",(0,s.jsxs)(n.table,{children:[(0,s.jsx)(n.thead,{children:(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.th,{children:"Byte"}),(0,s.jsx)(n.th,{children:"Bits"}),(0,s.jsx)(n.th,{children:"Field"}),(0,s.jsx)(n.th,{children:"Description"})]})}),(0,s.jsxs)(n.tbody,{children:[(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"0-1"}),(0,s.jsx)(n.td,{children:"0-15"}),(0,s.jsx)(n.td,{children:"Feature Schema Version"}),(0,s.jsx)(n.td,{children:"Version for tracking schema changes (additions/deletions) in feature group"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"2-6"}),(0,s.jsx)(n.td,{children:"16-55"}),(0,s.jsx)(n.td,{children:"Expiry 
Timestamp"}),(0,s.jsx)(n.td,{children:"Encoded as a compact representation, ~513 days max"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"7"}),(0,s.jsx)(n.td,{children:"56-59"}),(0,s.jsx)(n.td,{children:"Layout Version"}),(0,s.jsx)(n.td,{children:"Used to ensure backward compatibility with layout format changes"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"7"}),(0,s.jsx)(n.td,{children:"60-62"}),(0,s.jsx)(n.td,{children:"Compression Type"}),(0,s.jsx)(n.td,{children:"3-bit field specifying compression algorithm"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"7-8"}),(0,s.jsx)(n.td,{children:"63-67"}),(0,s.jsx)(n.td,{children:"Data Type"}),(0,s.jsx)(n.td,{children:"5-bit field split across bytes 7 and 8"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"8"}),(0,s.jsx)(n.td,{children:"68-71"}),(0,s.jsx)(n.td,{children:"Bool Last Valid Bit"}),(0,s.jsx)(n.td,{children:"4-bit field for last valid boolean bit"})]})]})]}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.h3,{id:"supported-data-types",children:"Supported Data Types"}),"\n",(0,s.jsx)(n.h4,{id:"scalar-types",children:"Scalar Types"}),"\n",(0,s.jsxs)(n.table,{children:[(0,s.jsx)(n.thead,{children:(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.th,{children:"Type"}),(0,s.jsx)(n.th,{children:"Container"}),(0,s.jsx)(n.th,{children:"Size"}),(0,s.jsx)(n.th,{children:"Description"})]})}),(0,s.jsxs)(n.tbody,{children:[(0,s.jsxs)(n.tr,{children:[(0,s.jsxs)(n.td,{children:[(0,s.jsx)(n.code,{children:"FP32"}),", ",(0,s.jsx)(n.code,{children:"FP16"}),", ",(0,s.jsx)(n.code,{children:"FP8E4M3"}),", ",(0,s.jsx)(n.code,{children:"FP8E5M2"})]}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]float32"})}),(0,s.jsx)(n.td,{children:"4/2/1/1 bytes"}),(0,s.jsx)(n.td,{children:"Floating point numbers"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsxs)(n.td,{children:[(0,s.jsx)(n.code,{children:"Int32"}),", ",(0,s.jsx)(n.code,{children:"Int16"}),", 
",(0,s.jsx)(n.code,{children:"Int8"})]}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]int32"})}),(0,s.jsx)(n.td,{children:"4/2/1 bytes"}),(0,s.jsx)(n.td,{children:"Signed integers"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsxs)(n.td,{children:[(0,s.jsx)(n.code,{children:"Uint32"}),", ",(0,s.jsx)(n.code,{children:"Uint16"}),", ",(0,s.jsx)(n.code,{children:"Uint8"})]}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]uint32"})}),(0,s.jsx)(n.td,{children:"4/2/1 bytes"}),(0,s.jsx)(n.td,{children:"Unsigned integers"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"FP64"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]float64"})}),(0,s.jsx)(n.td,{children:"8 bytes"}),(0,s.jsx)(n.td,{children:"Double precision float"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"Int64"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]int64"})}),(0,s.jsx)(n.td,{children:"8 bytes"}),(0,s.jsx)(n.td,{children:"64-bit signed integer"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"Uint64"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]uint64"})}),(0,s.jsx)(n.td,{children:"8 bytes"}),(0,s.jsx)(n.td,{children:"64-bit unsigned integer"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"String"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]string"})}),(0,s.jsx)(n.td,{children:"Variable"}),(0,s.jsx)(n.td,{children:"Pascal-style strings"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"Bool"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]uint8"})}),(0,s.jsx)(n.td,{children:"Bit-packed"}),(0,s.jsx)(n.td,{children:"Boolean values"})]})]})]}),"\n",(0,s.jsx)(n.h4,{id:"vector-types",children:"Vector 
Types"}),"\n",(0,s.jsxs)(n.table,{children:[(0,s.jsx)(n.thead,{children:(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.th,{children:"Type"}),(0,s.jsx)(n.th,{children:"Container"}),(0,s.jsx)(n.th,{children:"Description"})]})}),(0,s.jsxs)(n.tbody,{children:[(0,s.jsxs)(n.tr,{children:[(0,s.jsxs)(n.td,{children:[(0,s.jsx)(n.code,{children:"FP32Vector"}),", ",(0,s.jsx)(n.code,{children:"FP16Vector"}),", etc."]}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]float32"})}),(0,s.jsx)(n.td,{children:"2D slices of floating point"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsxs)(n.td,{children:[(0,s.jsx)(n.code,{children:"Int32Vector"}),", ",(0,s.jsx)(n.code,{children:"Int16Vector"}),", etc."]}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]int32"})}),(0,s.jsx)(n.td,{children:"2D slices of signed integers"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsxs)(n.td,{children:[(0,s.jsx)(n.code,{children:"Uint32Vector"}),", ",(0,s.jsx)(n.code,{children:"Uint16Vector"}),", etc."]}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]uint32"})}),(0,s.jsx)(n.td,{children:"2D slices of unsigned integers"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"FP64Vector"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]float64"})}),(0,s.jsx)(n.td,{children:"2D slices of doubles"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"Int64Vector"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]int64"})}),(0,s.jsx)(n.td,{children:"2D slices of 64-bit signed"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"Uint64Vector"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]uint64"})}),(0,s.jsx)(n.td,{children:"2D slices of 64-bit unsigned"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"StringVector"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]string"})}),(0,s.jsx)(n.td,{children:"2D slices of 
strings"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"BoolVector"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]uint8"})}),(0,s.jsx)(n.td,{children:"2D slices of bit-packed bools"})]})]})]}),"\n",(0,s.jsx)(n.h3,{id:"-encoding-for-scalar-feature-type",children:"\ud83d\udce6 Encoding for Scalar Feature Type"}),"\n",(0,s.jsx)(n.h4,{id:"1--string-feature-group-variable-length-encoding-using-pascal",children:"1. \ud83d\udd21 String Feature Group (Variable Length Encoding using Pascal)"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:["Max string length: ",(0,s.jsx)(n.strong,{children:"65536"})]}),"\n",(0,s.jsxs)(n.li,{children:["Format:\n",(0,s.jsx)(n.img,{alt:"PSDB String encoding",src:i(7676).A+"",width:"1488",height:"204"})]}),"\n",(0,s.jsxs)(n.li,{children:["Deserialization:","\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"Read length prefixes"}),"\n",(0,s.jsxs)(n.li,{children:["Extract string bytes using ",(0,s.jsx)(n.code,{children:"StrLenX"})]}),"\n"]}),"\n"]}),"\n"]}),"\n",(0,s.jsx)(n.h4,{id:"2--boolean-feature-group-bit-packed",children:"2. \ud83d\udfe9 Boolean Feature Group (Bit-Packed)"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"Saves space using bit-level packing."}),"\n",(0,s.jsxs)(n.li,{children:["Encoding:","\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"Raw: 1 byte per feature"}),"\n",(0,s.jsx)(n.li,{children:"Bit-packed: 1 bit per boolean"}),"\n",(0,s.jsxs)(n.li,{children:["Additional index (",(0,s.jsx)(n.code,{children:"bool last idx"}),") stores where the last bit resides"]}),"\n"]}),"\n"]}),"\n",(0,s.jsxs)(n.li,{children:["Format:\n",(0,s.jsx)(n.img,{alt:"PSDB Bool encoding",src:i(8645).A+"",width:"1120",height:"712"})]}),"\n"]}),"\n",(0,s.jsx)(n.h4,{id:"3--fixed-length-feature-group",children:"3. 
\ud83d\udccf Fixed-Length Feature Group"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:["For fixed-size vectors (",(0,s.jsx)(n.code,{children:"n"})," bytes each)"]}),"\n",(0,s.jsxs)(n.li,{children:["Format:\n",(0,s.jsx)(n.img,{alt:"PSDB Fixed Length Datatype encoding",src:i(6230).A+"",width:"1122",height:"202"})]}),"\n",(0,s.jsx)(n.li,{children:"Efficient for dense numeric features like float32, int64, etc."}),"\n"]}),"\n",(0,s.jsx)(n.h4,{id:"4-compression",children:"4. Compression"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.code,{children:"TypeNone (0)"}),": Raw storage"]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.code,{children:"TypeZSTD (1)"}),": Compressed using Zstandard"]}),"\n"]}),"\n",(0,s.jsx)(n.p,{children:"Compression is opportunistic. During serialization, if the compressed size is not smaller, PSDB falls back to the uncompressed format. This lets the high-throughput read path spend fewer CPU cycles. Also, only the data section of a PSDB is compressed, so decompression is performed only when the block has a valid TTL."}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.h3,{id:"-encoding-for-vector-types",children:"\ud83e\uddec Encoding for Vector Types"}),"\n",(0,s.jsx)(n.h4,{id:"conceptual-overview",children:"Conceptual Overview"}),"\n",(0,s.jsx)(n.p,{children:"PSDB encodes vector data by flattening multi-dimensional arrays into a single contiguous byte buffer while preserving the ability to reconstruct the original vector boundaries."}),"\n",(0,s.jsx)(n.h4,{id:"vector-length-metadata",children:"Vector Length Metadata"}),"\n",(0,s.jsx)(n.p,{children:"Each feature group maintains metadata about vector dimensions in the Feature Registry. 
For example, if a feature group has:"}),"\n",(0,s.jsx)(n.pre,{children:(0,s.jsx)(n.code,{className:"language-yaml",children:"fg1:\n  version-2:\n    features:\n      f1: { vector_len: 6, default: [bytes] }\n      f2: { vector_len: 3, default: [bytes] }\n  version-1:\n    features:\n      f1: { vector_len: 6, default: [bytes] }\n"})}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"Feature f1 with vector_len: 6"}),"\n",(0,s.jsx)(n.li,{children:"Feature f2 with vector_len: 3"}),"\n"]}),"\n",(0,s.jsx)(n.p,{children:"This means:"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.code,{children:"f1"})," contains vectors of exactly 6 elements each"]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.code,{children:"f2"})," contains vectors of exactly 3 elements each"]}),"\n"]}),"\n",(0,s.jsx)(n.h4,{id:"encoding-process",children:"Encoding Process"}),"\n",(0,s.jsx)(n.h5,{id:"input-structure",children:(0,s.jsx)(n.strong,{children:"Input Structure"})}),"\n",(0,s.jsx)(n.p,{children:"The serializer receives vector data as 2D slices where:"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"Outer dimension represents different feature instances/entities"}),"\n",(0,s.jsx)(n.li,{children:"Inner dimension represents the vector elements for each instance"}),"\n"]}),"\n",(0,s.jsx)(n.h5,{id:"length-validation",children:(0,s.jsx)(n.strong,{children:"Length Validation"})}),"\n",(0,s.jsx)(n.p,{children:"Before encoding, PSDB validates that each vector's actual length matches the declared vector_len from the feature metadata. 
This ensures data integrity and enables efficient decoding."}),"\n",(0,s.jsx)(n.h5,{id:"flattening-strategy",children:(0,s.jsx)(n.strong,{children:"Flattening Strategy"})}),"\n",(0,s.jsx)(n.p,{children:"Vectors are serialized in row-major order (also called C-style order):"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"All elements of the first vector are written consecutively"}),"\n",(0,s.jsx)(n.li,{children:"Followed by all elements of the second vector"}),"\n",(0,s.jsx)(n.li,{children:"And so on..."}),"\n"]}),"\n",(0,s.jsx)(n.h5,{id:"contiguous-layout",children:(0,s.jsx)(n.strong,{children:"Contiguous Layout"})}),"\n",(0,s.jsx)(n.p,{children:"The resulting byte buffer contains all vector elements placed end-to-end without gaps or separators. The decoder can reconstruct vector boundaries because it knows:"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"The data type size (e.g., 4 bytes for float32), from the feature registry"}),"\n",(0,s.jsx)(n.li,{children:"The vector length for each position, from the feature registry"}),"\n",(0,s.jsx)(n.li,{children:"The total number of vectors, from the feature registry"}),"\n",(0,s.jsxs)(n.li,{children:["In case of ",(0,s.jsx)(n.code,{children:"variable length"})," types, the length is encoded into the data, as for the ",(0,s.jsx)(n.code,{children:"String"})," data-type"]}),"\n"]}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.h3,{id:"-deserializationdecoding-flow",children:"\ud83d\udd04 Deserialization/Decoding Flow"}),"\n",(0,s.jsxs)(n.ol,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Extract version"})," from first 2 bytes."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Look up schema"})," from etcd using the version."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Determine feature shapes"})," (e.g., vector lengths)."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Slice and decode"})," data from byte buffer 
accordingly."]}),"\n"]}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.h3,{id:"memory-efficiency-benefits",children:"Memory Efficiency Benefits"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"No Padding"}),": Elements are packed tightly without alignment padding"]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"No Delimiters"}),": Vector boundaries are implicit, not stored explicitly"]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Cache Friendly"}),": Sequential memory access patterns during encoding/decoding"]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Minimal Metadata"}),": Only vector lengths are stored separately, not per-element"]}),"\n"]}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.h2,{id:"cache-storage-data-block-csdb-design",children:"Cache Storage Data Block (CSDB) Design"}),"\n",(0,s.jsx)(n.h3,{id:"overview",children:"Overview"}),"\n",(0,s.jsx)(n.p,{children:"The Cache Storage Data Block (CSDB) is a compact binary data format that encapsulates serialized data blocks for multiple feature groups. It is designed to support both in-memory and distributed caching of deserialized PSDB (Permanent Storage Data Block) content, optimizing for speed, deduplication, and minimal memory overhead."}),"\n",(0,s.jsx)(n.h3,{id:"structure-and-purpose",children:"Structure and Purpose"}),"\n",(0,s.jsx)(n.p,{children:"Each CSDB contains a mapping of feature group IDs (FG IDs) to deserialized PSDBs. For distributed systems, this structure is flattened into a serialized byte slice. 
The CSDB supports layout versioning for backward compatibility and negative caching for feature groups with no associated data."}),"\n",(0,s.jsx)(n.h4,{id:"core-fields-and-memory-layout",children:"Core Fields and Memory Layout"}),"\n",(0,s.jsx)(n.pre,{children:(0,s.jsx)(n.code,{className:"language-go",children:"type CacheStorageDataBlock struct {\n\t// 8-byte aligned map pointer\n\tFGIdToDDB map[int]*DeserializedPSDB // offset: 0\n\n\t// 24-byte slice (ptr, len, cap)\n\tserializedCSDB []byte // offset: 8\n\n\t// 4-byte fields\n\tTTL uint32 // offset: 32\n\n\t// 1-byte fields\n\tlayoutVersion uint8 // offset: 36\n\tcacheType CacheType // offset: 37\n\t// 2 bytes padding to maintain 4-byte alignment\n}\n"})}),"\n",(0,s.jsx)(n.p,{children:"The structure is memory-aligned for optimal performance:"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"Pointers and slices are 8-byte aligned"}),"\n",(0,s.jsxs)(n.li,{children:["Smaller fields (like ",(0,s.jsx)(n.code,{children:"uint8"}),") are grouped and padded to avoid false sharing"]}),"\n",(0,s.jsx)(n.li,{children:"This layout ensures efficient use of CPU caches during access"}),"\n"]}),"\n",(0,s.jsx)(n.h4,{id:"cache-types",children:"Cache Types"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"In-Memory Cache"}),": Uses the ",(0,s.jsx)(n.code,{children:"FGIdToDDB"})," map directly and avoids serialization unless explicitly requested."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Distributed Cache"}),": Stores a serialized binary format in ",(0,s.jsx)(n.code,{children:"serializedCSDB"}),", which is deserialized lazily when required."]}),"\n"]}),"\n",(0,s.jsx)(n.h3,{id:"format--encoding",children:"Format & Encoding"}),"\n",(0,s.jsxs)(n.p,{children:[(0,s.jsx)(n.strong,{children:"CSDB Binary Layout"}),": Serialized CSDBs follow this compact format:"]}),"\n",(0,s.jsx)(n.pre,{children:(0,s.jsx)(n.code,{children:"[LayoutVersion (1 byte)][FGID (2 
bytes)][DataLen (2 bytes)][Data ...] \u2192 repeated per feature group\n"})}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:["FGID and DataLen are encoded as ",(0,s.jsx)(n.code,{children:"uint16"})]}),"\n",(0,s.jsxs)(n.li,{children:["If ",(0,s.jsx)(n.code,{children:"DataLen == 0"}),", it denotes a negative cache (no data available for that FG)"]}),"\n",(0,s.jsx)(n.li,{children:"The data section contains the PSDB header and either compressed or uncompressed data"}),"\n"]}),"\n",(0,s.jsx)(n.p,{children:"This layout allows fast scanning and partial deserialization for selected FG IDs, making it optimal for large-scale caching systems."}),"\n",(0,s.jsx)(n.h3,{id:"differences-between-in-memory-and-distributed-caching",children:"Differences Between In-Memory and Distributed Caching"}),"\n",(0,s.jsxs)(n.table,{children:[(0,s.jsx)(n.thead,{children:(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.th,{children:"Aspect"}),(0,s.jsx)(n.th,{children:"In-Memory CSDB"}),(0,s.jsx)(n.th,{children:"Distributed CSDB"})]})}),(0,s.jsxs)(n.tbody,{children:[(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"Storage Format"}),(0,s.jsx)(n.td,{children:"Live Go objects (map[int]*DeserializedPSDB)"}),(0,s.jsxs)(n.td,{children:["Serialized byte buffer (",(0,s.jsx)(n.code,{children:"[]byte"}),")"]})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"Deserialization"}),(0,s.jsx)(n.td,{children:"Performed on-demand using offset map"}),(0,s.jsx)(n.td,{children:"Performed on-demand using offset map"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"Compression"}),(0,s.jsx)(n.td,{children:"Optional during serialization"}),(0,s.jsx)(n.td,{children:"Typically enabled to reduce payload size"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"Usage Pattern"}),(0,s.jsx)(n.td,{children:"Fast lookup in active process memory"}),(0,s.jsx)(n.td,{children:"Cross-node cache sharing and persistence"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"Memory 
Overhead"}),(0,s.jsx)(n.td,{children:"Higher (due to live objects)"}),(0,s.jsx)(n.td,{children:"Lower (compact representation)"})]})]})]}),"\n",(0,s.jsx)(n.h3,{id:"optimizations--features",children:"Optimizations & Features"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Partial FG ID Fetch"}),": When only a subset of FG IDs is needed, CSDB avoids unnecessary deserialization of other IDs."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Negative Caching"}),": FG IDs with no data are encoded with ",(0,s.jsx)(n.code,{children:"DataLen=0"}),", saving space and avoiding repeated lookups."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Offset-Length Map"}),": During deserialization, FGID to offset+length pairs are cached internally for efficient random access."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Versioning Support"}),": Layout version is stored as the first byte to enable format upgrades while maintaining backward compatibility."]}),"\n"]}),"\n",(0,s.jsx)(n.p,{children:"The diagram below explains how compute cycles are saved by partial decompression."}),"\n",(0,s.jsx)(n.p,{children:(0,s.jsx)(n.img,{alt:"CSDB Partial Decompression",src:i(7780).A+"",width:"2292",height:"828"})})]})}function h(e={}){const{wrapper:n}={...(0,r.R)(),...e.components};return n?(0,s.jsx)(n,{...e,children:(0,s.jsx)(o,{...e})}):o(e)}}}]);
"use strict";(self.webpackChunkdocs=self.webpackChunkdocs||[]).push([[2344],{1228:(e,n,i)=>{i.d(n,{A:()=>t});const t=i.p+"assets/images/v1.0.0-psdb-bool-encoding-4b154fdf5e6d79a67c91b6fb21c7209e.png"},2041:(e,n,i)=>{i.d(n,{A:()=>t});const t=i.p+"assets/images/v1.0.0-psdb-string-encoding-b1d69e9452269124d1b545020fa27d63.png"},5153:(e,n,i)=>{i.d(n,{A:()=>t});const t=i.p+"assets/images/v1.0.0-psdb-fixed-length-encodding-dd252110b084e01cf38f21de16b3a1a5.png"},6305:(e,n,i)=>{i.d(n,{A:()=>t});const t=i.p+"assets/images/v1.0.0-psdb-anatomy-c1735559f93dce6d0bb3894d16047059.png"},8453:(e,n,i)=>{i.d(n,{R:()=>d,x:()=>l});var t=i(6540);const s={},r=t.createContext(s);function d(e){const n=t.useContext(r);return t.useMemo(function(){return"function"==typeof e?e(n):{...n,...e}},[n,e])}function l(e){let n;return n=e.disableParentContext?"function"==typeof e.components?e.components(s):e.components||s:d(e.components),t.createElement(r.Provider,{value:n},e.children)}},9173:(e,n,i)=>{i.d(n,{A:()=>t});const t=i.p+"assets/images/v1.0.0-csdb-skip-read-e3926080f7341aa7d3c6ec6d8274ea14.png"},9584:(e,n,i)=>{i.r(n),i.d(n,{assets:()=>c,contentTitle:()=>l,default:()=>h,frontMatter:()=>d,metadata:()=>t,toc:()=>a});const t=JSON.parse('{"id":"online-feature-store/v1.0.0/data-formats","title":"Data Formats","description":"In this section we will go through the data-formats that are at the heart of the online-feature-store. They are inspired by other storage-efficient formats like Parquet & Arrow, but custom-made to deliver in constrained environments. 
The two key data-formats are:","source":"@site/docs/online-feature-store/v1.0.0/data-formats.md","sourceDirName":"online-feature-store/v1.0.0","slug":"/online-feature-store/v1.0.0/data-formats","permalink":"/BharatMLStack/online-feature-store/v1.0.0/data-formats","draft":false,"unlisted":false,"editUrl":"https://github.com/Meesho/BharatMLStack/tree/main/docs/docs/online-feature-store/v1.0.0/data-formats.md","tags":[],"version":"current","sidebarPosition":2,"frontMatter":{"title":"Data Formats","sidebar_position":2},"sidebar":"tutorialSidebar","previous":{"title":"Architecture","permalink":"/BharatMLStack/online-feature-store/v1.0.0/architecture"},"next":{"title":"Benchmarks","permalink":"/BharatMLStack/online-feature-store/v1.0.0/benchmarks"}}');var s=i(4848),r=i(8453);const d={title:"Data Formats",sidebar_position:2},l="Data Format for Permanent & Cache Storage",c={},a=[{value:"PSDB (Permanent Storage Data Block) Format",id:"psdb-permanent-storage-data-block-format",level:2},{value:"\ud83e\uddf1 Structure Overview",id:"-structure-overview",level:3},{value:"Supported Data Types",id:"supported-data-types",level:3},{value:"Scalar Types",id:"scalar-types",level:4},{value:"Vector Types",id:"vector-types",level:4},{value:"\ud83d\udce6 Encoding for Scalar Feature Type",id:"-encoding-for-scalar-feature-type",level:3},{value:"1. \ud83d\udd21 String Feature Group (Variable Length Encoding using Pascal)",id:"1--string-feature-group-variable-length-encoding-using-pascal",level:4},{value:"2. \ud83d\udfe9 Boolean Feature Group (Bit-Packed)",id:"2--boolean-feature-group-bit-packed",level:4},{value:"3. \ud83d\udccf Fixed-Length Feature Group",id:"3--fixed-length-feature-group",level:4},{value:"4. 
Compression",id:"4-compression",level:4},{value:"\ud83e\uddec Encoding for Vector Types",id:"-encoding-for-vector-types",level:3},{value:"Conceptual Overview",id:"conceptual-overview",level:4},{value:"Vector Length Metadata",id:"vector-length-metadata",level:4},{value:"Encoding Process",id:"encoding-process",level:4},{value:"<strong>Input Structure</strong>",id:"input-structure",level:5},{value:"<strong>Length Validation</strong>",id:"length-validation",level:5},{value:"<strong>Flattening Strategy</strong>",id:"flattening-strategy",level:5},{value:"<strong>Contiguous Layout</strong>",id:"contiguous-layout",level:5},{value:"\ud83d\udd04 Deserialization/Decoding Flow",id:"-deserializationdecoding-flow",level:3},{value:"Memory Efficiency Benefits",id:"memory-efficiency-benefits",level:3},{value:"Cache Storage Data Block (CSDB) Design",id:"cache-storage-data-block-csdb-design",level:2},{value:"Overview",id:"overview",level:3},{value:"Structure and Purpose",id:"structure-and-purpose",level:3},{value:"Core Fields and Memory Layout",id:"core-fields-and-memory-layout",level:4},{value:"Cache Types",id:"cache-types",level:4},{value:"Format & Encoding",id:"format--encoding",level:3},{value:"Differences Between In-Memory and Distributed Caching",id:"differences-between-in-memory-and-distributed-caching",level:3},{value:"Optimizations & Features",id:"optimizations--features",level:3}];function o(e){const n={code:"code",h1:"h1",h2:"h2",h3:"h3",h4:"h4",h5:"h5",header:"header",hr:"hr",img:"img",li:"li",ol:"ol",p:"p",pre:"pre",strong:"strong",table:"table",tbody:"tbody",td:"td",th:"th",thead:"thead",tr:"tr",ul:"ul",...(0,r.R)(),...e.components};return(0,s.jsxs)(s.Fragment,{children:[(0,s.jsx)(n.header,{children:(0,s.jsx)(n.h1,{id:"data-format-for-permanent--cache-storage",children:"Data Format for Permanent & Cache Storage"})}),"\n",(0,s.jsx)(n.p,{children:"In this section we will go through the data-formats that are at the heart of the online-feature-store. They are inspired by other 
storage-efficient formats like Parquet & Arrow, but custom-made to deliver in constrained environments. The two key data-formats are:"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"PSDB"})," - Permanent Storage Data Block used while storing data in ScyllaDB"]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"CSDB"})," - Cache Storage Data Block used while storing data in DragonflyDB or Redis, optimal for KV"]}),"\n"]}),"\n",(0,s.jsx)(n.h2,{id:"psdb-permanent-storage-data-block-format",children:"PSDB (Permanent Storage Data Block) Format"}),"\n",(0,s.jsxs)(n.p,{children:["The ",(0,s.jsx)(n.strong,{children:"PSDB"})," format is a compact, versioned, and schema-aware binary layout used to store feature groups efficiently for ML inference. It supports multiple datatypes (strings, booleans, fixed-size vectors), versioning, TTL, and metadata encoding in a compact header."]}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.h3,{id:"-structure-overview",children:"\ud83e\uddf1 Structure Overview"}),"\n",(0,s.jsx)(n.p,{children:"Each PSDB block is composed of multiple byte sections:"}),"\n",(0,s.jsx)(n.p,{children:(0,s.jsx)(n.img,{alt:"Permanent Storage Data Block Anatomy",src:i(6305).A+"",width:"1854",height:"1102"})}),"\n",(0,s.jsxs)(n.table,{children:[(0,s.jsx)(n.thead,{children:(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.th,{children:"Byte"}),(0,s.jsx)(n.th,{children:"Bits"}),(0,s.jsx)(n.th,{children:"Field"}),(0,s.jsx)(n.th,{children:"Description"})]})}),(0,s.jsxs)(n.tbody,{children:[(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"0-1"}),(0,s.jsx)(n.td,{children:"0-15"}),(0,s.jsx)(n.td,{children:"Feature Schema Version"}),(0,s.jsx)(n.td,{children:"Version for tracking schema changes (additions/deletions) in feature group"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"2-6"}),(0,s.jsx)(n.td,{children:"16-55"}),(0,s.jsx)(n.td,{children:"Expiry 
Timestamp"}),(0,s.jsx)(n.td,{children:"Encoded as a compact representation, ~513 days max"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"7"}),(0,s.jsx)(n.td,{children:"56-59"}),(0,s.jsx)(n.td,{children:"Layout Version"}),(0,s.jsx)(n.td,{children:"Used to ensure backward compatibility with layout format changes"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"7"}),(0,s.jsx)(n.td,{children:"60-62"}),(0,s.jsx)(n.td,{children:"Compression Type"}),(0,s.jsx)(n.td,{children:"3-bit field specifying compression algorithm"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"7-8"}),(0,s.jsx)(n.td,{children:"63-67"}),(0,s.jsx)(n.td,{children:"Data Type"}),(0,s.jsx)(n.td,{children:"5-bit field split across bytes 7 and 8"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"8"}),(0,s.jsx)(n.td,{children:"68-71"}),(0,s.jsx)(n.td,{children:"Bool Last Valid Bit"}),(0,s.jsx)(n.td,{children:"4-bit field for last valid boolean bit"})]})]})]}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.h3,{id:"supported-data-types",children:"Supported Data Types"}),"\n",(0,s.jsx)(n.h4,{id:"scalar-types",children:"Scalar Types"}),"\n",(0,s.jsxs)(n.table,{children:[(0,s.jsx)(n.thead,{children:(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.th,{children:"Type"}),(0,s.jsx)(n.th,{children:"Container"}),(0,s.jsx)(n.th,{children:"Size"}),(0,s.jsx)(n.th,{children:"Description"})]})}),(0,s.jsxs)(n.tbody,{children:[(0,s.jsxs)(n.tr,{children:[(0,s.jsxs)(n.td,{children:[(0,s.jsx)(n.code,{children:"FP32"}),", ",(0,s.jsx)(n.code,{children:"FP16"}),", ",(0,s.jsx)(n.code,{children:"FP8E4M3"}),", ",(0,s.jsx)(n.code,{children:"FP8E5M2"})]}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]float32"})}),(0,s.jsx)(n.td,{children:"4/2/1/1 bytes"}),(0,s.jsx)(n.td,{children:"Floating point numbers"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsxs)(n.td,{children:[(0,s.jsx)(n.code,{children:"Int32"}),", ",(0,s.jsx)(n.code,{children:"Int16"}),", 
",(0,s.jsx)(n.code,{children:"Int8"})]}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]int32"})}),(0,s.jsx)(n.td,{children:"4/2/1 bytes"}),(0,s.jsx)(n.td,{children:"Signed integers"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsxs)(n.td,{children:[(0,s.jsx)(n.code,{children:"Uint32"}),", ",(0,s.jsx)(n.code,{children:"Uint16"}),", ",(0,s.jsx)(n.code,{children:"Uint8"})]}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]uint32"})}),(0,s.jsx)(n.td,{children:"4/2/1 bytes"}),(0,s.jsx)(n.td,{children:"Unsigned integers"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"FP64"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]float64"})}),(0,s.jsx)(n.td,{children:"8 bytes"}),(0,s.jsx)(n.td,{children:"Double precision float"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"Int64"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]int64"})}),(0,s.jsx)(n.td,{children:"8 bytes"}),(0,s.jsx)(n.td,{children:"64-bit signed integer"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"Uint64"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]uint64"})}),(0,s.jsx)(n.td,{children:"8 bytes"}),(0,s.jsx)(n.td,{children:"64-bit unsigned integer"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"String"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]string"})}),(0,s.jsx)(n.td,{children:"Variable"}),(0,s.jsx)(n.td,{children:"Pascal-style strings"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"Bool"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[]uint8"})}),(0,s.jsx)(n.td,{children:"Bit-packed"}),(0,s.jsx)(n.td,{children:"Boolean values"})]})]})]}),"\n",(0,s.jsx)(n.h4,{id:"vector-types",children:"Vector 
Types"}),"\n",(0,s.jsxs)(n.table,{children:[(0,s.jsx)(n.thead,{children:(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.th,{children:"Type"}),(0,s.jsx)(n.th,{children:"Container"}),(0,s.jsx)(n.th,{children:"Description"})]})}),(0,s.jsxs)(n.tbody,{children:[(0,s.jsxs)(n.tr,{children:[(0,s.jsxs)(n.td,{children:[(0,s.jsx)(n.code,{children:"FP32Vector"}),", ",(0,s.jsx)(n.code,{children:"FP16Vector"}),", etc."]}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]float32"})}),(0,s.jsx)(n.td,{children:"2D slices of floating point"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsxs)(n.td,{children:[(0,s.jsx)(n.code,{children:"Int32Vector"}),", ",(0,s.jsx)(n.code,{children:"Int16Vector"}),", etc."]}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]int32"})}),(0,s.jsx)(n.td,{children:"2D slices of signed integers"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsxs)(n.td,{children:[(0,s.jsx)(n.code,{children:"Uint32Vector"}),", ",(0,s.jsx)(n.code,{children:"Uint16Vector"}),", etc."]}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]uint32"})}),(0,s.jsx)(n.td,{children:"2D slices of unsigned integers"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"FP64Vector"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]float64"})}),(0,s.jsx)(n.td,{children:"2D slices of doubles"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"Int64Vector"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]int64"})}),(0,s.jsx)(n.td,{children:"2D slices of 64-bit signed"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"Uint64Vector"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]uint64"})}),(0,s.jsx)(n.td,{children:"2D slices of 64-bit unsigned"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"StringVector"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]string"})}),(0,s.jsx)(n.td,{children:"2D slices of 
strings"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"BoolVector"})}),(0,s.jsx)(n.td,{children:(0,s.jsx)(n.code,{children:"[][]uint8"})}),(0,s.jsx)(n.td,{children:"2D slices of bit-packed bools"})]})]})]}),"\n",(0,s.jsx)(n.h3,{id:"-encoding-for-scalar-feature-type",children:"\ud83d\udce6 Encoding for Scalar Feature Type"}),"\n",(0,s.jsx)(n.h4,{id:"1--string-feature-group-variable-length-encoding-using-pascal",children:"1. \ud83d\udd21 String Feature Group (Variable Length Encoding using Pascal)"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:["Max string length: ",(0,s.jsx)(n.strong,{children:"65536"})]}),"\n",(0,s.jsxs)(n.li,{children:["Format:\n",(0,s.jsx)(n.img,{alt:"PSDB String encoding",src:i(2041).A+"",width:"1488",height:"204"})]}),"\n",(0,s.jsxs)(n.li,{children:["Deserialization:","\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"Read length prefixes"}),"\n",(0,s.jsxs)(n.li,{children:["Extract string bytes using ",(0,s.jsx)(n.code,{children:"StrLenX"})]}),"\n"]}),"\n"]}),"\n"]}),"\n",(0,s.jsx)(n.h4,{id:"2--boolean-feature-group-bit-packed",children:"2. \ud83d\udfe9 Boolean Feature Group (Bit-Packed)"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"Saves space using bit-level packing."}),"\n",(0,s.jsxs)(n.li,{children:["Encoding:","\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"Raw: 1 byte per feature"}),"\n",(0,s.jsx)(n.li,{children:"Bit-packed: 1 bit per boolean"}),"\n",(0,s.jsxs)(n.li,{children:["Additional index (",(0,s.jsx)(n.code,{children:"bool last idx"}),") stores where the last bit resides"]}),"\n"]}),"\n"]}),"\n",(0,s.jsxs)(n.li,{children:["Format:\n",(0,s.jsx)(n.img,{alt:"PSDB Bool encoding",src:i(1228).A+"",width:"1120",height:"712"})]}),"\n"]}),"\n",(0,s.jsx)(n.h4,{id:"3--fixed-length-feature-group",children:"3. 
\ud83d\udccf Fixed-Length Feature Group"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:["For fixed-size vectors (",(0,s.jsx)(n.code,{children:"n"})," bytes each)"]}),"\n",(0,s.jsxs)(n.li,{children:["Format:\n",(0,s.jsx)(n.img,{alt:"PSDB Fixed Length Datatype encoding",src:i(5153).A+"",width:"1122",height:"202"})]}),"\n",(0,s.jsx)(n.li,{children:"Efficient for dense numeric features like float32, int64, etc."}),"\n"]}),"\n",(0,s.jsx)(n.h4,{id:"4-compression",children:"4. Compression"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.code,{children:"TypeNone (0)"}),": Raw storage"]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.code,{children:"TypeZSTD (1)"}),": Compressed using Zstandard"]}),"\n"]}),"\n",(0,s.jsx)(n.p,{children:"Compression is opportunistic. During serialization, if the compressed size is not smaller, PSDB falls back to the uncompressed format. This keeps the high-throughput read path from spending extra CPU cycles. Also, only the data part of a PSDB is compressed, so decompression is needed only when the block has a valid TTL"}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.h3,{id:"-encoding-for-vector-types",children:"\ud83e\uddec Encoding for Vector Types"}),"\n",(0,s.jsx)(n.h4,{id:"conceptual-overview",children:"Conceptual Overview"}),"\n",(0,s.jsx)(n.p,{children:"PSDB encodes vector data by flattening multi-dimensional arrays into a single contiguous byte buffer while preserving the ability to reconstruct the original vector boundaries."}),"\n",(0,s.jsx)(n.h4,{id:"vector-length-metadata",children:"Vector Length Metadata"}),"\n",(0,s.jsx)(n.p,{children:"Each feature group maintains metadata about vector dimensions in the Feature Registry. 
For example, if a feature group has:"}),"\n",(0,s.jsx)(n.pre,{children:(0,s.jsx)(n.code,{className:"language-yaml",children:"fg1:\n version-2:\n features:\n f1: { vector_len: 6, default: [bytes] }\n f2: { vector_len: 3, default: [bytes] }\n version-1:\n features:\n f1: { vector_len: 6, default: [bytes] }\n"})}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"Feature f1 with vector_len: 6"}),"\n",(0,s.jsx)(n.li,{children:"Feature f2 with vector_len: 3"}),"\n"]}),"\n",(0,s.jsx)(n.p,{children:"This means:"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.code,{children:"f1"})," contains vectors of exactly 6 elements each"]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.code,{children:"f2"})," contains vectors of exactly 3 elements each"]}),"\n"]}),"\n",(0,s.jsx)(n.h4,{id:"encoding-process",children:"Encoding Process"}),"\n",(0,s.jsx)(n.h5,{id:"input-structure",children:(0,s.jsx)(n.strong,{children:"Input Structure"})}),"\n",(0,s.jsx)(n.p,{children:"The serializer receives vector data as 2D slices where:"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"Outer dimension represents different feature instances/entities"}),"\n",(0,s.jsx)(n.li,{children:"Inner dimension represents the vector elements for each instance"}),"\n"]}),"\n",(0,s.jsx)(n.h5,{id:"length-validation",children:(0,s.jsx)(n.strong,{children:"Length Validation"})}),"\n",(0,s.jsx)(n.p,{children:"Before encoding, PSDB validates that each vector's actual length matches the declared vector_len from the feature metadata. 
This ensures data integrity and enables efficient decoding."}),"\n",(0,s.jsx)(n.h5,{id:"flattening-strategy",children:(0,s.jsx)(n.strong,{children:"Flattening Strategy"})}),"\n",(0,s.jsx)(n.p,{children:"Vectors are serialized in row-major order (also called C-style order):"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"All elements of the first vector are written consecutively"}),"\n",(0,s.jsx)(n.li,{children:"Followed by all elements of the second vector"}),"\n",(0,s.jsx)(n.li,{children:"And so on..."}),"\n"]}),"\n",(0,s.jsx)(n.h5,{id:"contiguous-layout",children:(0,s.jsx)(n.strong,{children:"Contiguous Layout"})}),"\n",(0,s.jsx)(n.p,{children:"The resulting byte buffer contains all vector elements placed end-to-end without gaps or separators. The decoder can reconstruct vector boundaries because it knows:"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"The data type size (e.g., 4 bytes for float32), from feature registry"}),"\n",(0,s.jsx)(n.li,{children:"The vector length for each position, from feature registry"}),"\n",(0,s.jsx)(n.li,{children:"The total number of vectors, from feature registry"}),"\n",(0,s.jsxs)(n.li,{children:["In case of ",(0,s.jsx)(n.code,{children:"variable length"})," length is encoded into the data, like for ",(0,s.jsx)(n.code,{children:"String"})," data-type"]}),"\n"]}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.h3,{id:"-deserializationdecoding-flow",children:"\ud83d\udd04 Deserialization/Decoding Flow"}),"\n",(0,s.jsxs)(n.ol,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Extract version"})," from first 2 bytes."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Look up schema"})," from etcd using the version."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Determine feature shapes"})," (e.g., vector lengths)."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Slice and decode"})," data from byte buffer 
accordingly."]}),"\n"]}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.h3,{id:"memory-efficiency-benefits",children:"Memory Efficiency Benefits"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"No Padding"}),": Elements are packed tightly without alignment padding"]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"No Delimiters"}),": Vector boundaries are implicit, not stored explicitly"]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Cache Friendly"}),": Sequential memory access patterns during encoding/decoding"]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Minimal Metadata"}),": Only vector lengths are stored separately, not per-element"]}),"\n"]}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.hr,{}),"\n",(0,s.jsx)(n.h2,{id:"cache-storage-data-block-csdb-design",children:"Cache Storage Data Block (CSDB) Design"}),"\n",(0,s.jsx)(n.h3,{id:"overview",children:"Overview"}),"\n",(0,s.jsx)(n.p,{children:"The Cache Storage Data Block (CSDB) is a compact binary data format that encapsulates serialized data blocks for multiple feature groups. It is designed to support both in-memory and distributed caching of deserialized PSDB (Permanent Storage Data Block) content, optimizing for speed, deduplication, and minimal memory overhead."}),"\n",(0,s.jsx)(n.h3,{id:"structure-and-purpose",children:"Structure and Purpose"}),"\n",(0,s.jsx)(n.p,{children:"Each CSDB contains a mapping of feature group IDs (FG IDs) to deserialized PSDBs. For distributed systems, this structure is flattened into a serialized byte slice. 
The CSDB supports layout versioning for backward compatibility and negative caching for feature groups with no associated data."}),"\n",(0,s.jsx)(n.h4,{id:"core-fields-and-memory-layout",children:"Core Fields and Memory Layout"}),"\n",(0,s.jsx)(n.pre,{children:(0,s.jsx)(n.code,{className:"language-go",children:"type CacheStorageDataBlock struct {\n // 8-byte aligned map pointer\n FGIdToDDB map[int]*DeserializedPSDB // offset: 0\n\n // 24-byte slice (ptr, len, cap)\n serializedCSDB []byte // offset: 8\n\n // 4-byte fields\n TTL uint32 // offset: 32\n\n // 1-byte fields\n layoutVersion uint8 // offset: 36\n cacheType CacheType // offset: 37\n // 2 bytes padding to maintain 4-byte alignment\n}\n"})}),"\n",(0,s.jsx)(n.p,{children:"The structure is memory-aligned for optimal performance:"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsx)(n.li,{children:"Pointers and slices are 8-byte aligned"}),"\n",(0,s.jsxs)(n.li,{children:["Smaller fields (like ",(0,s.jsx)(n.code,{children:"uint8"}),") are grouped and padded to avoid false sharing"]}),"\n",(0,s.jsx)(n.li,{children:"This layout ensures efficient use of CPU caches during access"}),"\n"]}),"\n",(0,s.jsx)(n.h4,{id:"cache-types",children:"Cache Types"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"In-Memory Cache"}),": Uses the ",(0,s.jsx)(n.code,{children:"FGIdToDDB"})," map directly and avoids serialization unless explicitly requested."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Distributed Cache"}),": Stores a serialized binary format in ",(0,s.jsx)(n.code,{children:"serializedCSDB"}),", which is deserialized lazily when required."]}),"\n"]}),"\n",(0,s.jsx)(n.h3,{id:"format--encoding",children:"Format & Encoding"}),"\n",(0,s.jsxs)(n.p,{children:[(0,s.jsx)(n.strong,{children:"CSDB Binary Layout"}),": Serialized CSDBs follow this compact format:"]}),"\n",(0,s.jsx)(n.pre,{children:(0,s.jsx)(n.code,{children:"[LayoutVersion (1 byte)][FGID (2 
bytes)][DataLen (2 bytes)][Data ...] \u2192 repeated per feature group\n"})}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:["FGID and DataLen are encoded as ",(0,s.jsx)(n.code,{children:"uint16"})]}),"\n",(0,s.jsxs)(n.li,{children:["If ",(0,s.jsx)(n.code,{children:"DataLen == 0"}),", it denotes a negative cache (no data available for that FG)"]}),"\n",(0,s.jsx)(n.li,{children:"The data section contains the PSDB header and either compressed or uncompressed data"}),"\n"]}),"\n",(0,s.jsx)(n.p,{children:"This layout allows fast scanning and partial deserialization for selected FG IDs, making it optimal for large-scale caching systems."}),"\n",(0,s.jsx)(n.h3,{id:"differences-between-in-memory-and-distributed-caching",children:"Differences Between In-Memory and Distributed Caching"}),"\n",(0,s.jsxs)(n.table,{children:[(0,s.jsx)(n.thead,{children:(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.th,{children:"Aspect"}),(0,s.jsx)(n.th,{children:"In-Memory CSDB"}),(0,s.jsx)(n.th,{children:"Distributed CSDB"})]})}),(0,s.jsxs)(n.tbody,{children:[(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"Storage Format"}),(0,s.jsx)(n.td,{children:"Live Go objects (map[int]*DeserializedPSDB)"}),(0,s.jsxs)(n.td,{children:["Serialized byte buffer (",(0,s.jsx)(n.code,{children:"[]byte"}),")"]})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"Deserialization"}),(0,s.jsx)(n.td,{children:"Performed on-demand using offset map"}),(0,s.jsx)(n.td,{children:"Performed on-demand using offset map"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"Compression"}),(0,s.jsx)(n.td,{children:"Optional during serialization"}),(0,s.jsx)(n.td,{children:"Typically enabled to reduce payload size"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"Usage Pattern"}),(0,s.jsx)(n.td,{children:"Fast lookup in active process memory"}),(0,s.jsx)(n.td,{children:"Cross-node cache sharing and persistence"})]}),(0,s.jsxs)(n.tr,{children:[(0,s.jsx)(n.td,{children:"Memory 
Overhead"}),(0,s.jsx)(n.td,{children:"Higher (due to live objects)"}),(0,s.jsx)(n.td,{children:"Lower (compact representation)"})]})]})]}),"\n",(0,s.jsx)(n.h3,{id:"optimizations--features",children:"Optimizations & Features"}),"\n",(0,s.jsxs)(n.ul,{children:["\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Partial FG ID Fetch"}),": When only a subset of FG IDs is needed, CSDB avoids unnecessary deserialization of other IDs."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Negative Caching"}),": FG IDs with no data are encoded with ",(0,s.jsx)(n.code,{children:"DataLen=0"}),", saving space and avoiding repeated lookups."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Offset-Length Map"}),": During deserialization, FGID to offset+length pairs are cached internally for efficient random access."]}),"\n",(0,s.jsxs)(n.li,{children:[(0,s.jsx)(n.strong,{children:"Versioning Support"}),": Layout version is stored as the first byte to enable format upgrades while maintaining backward compatibility."]}),"\n"]}),"\n",(0,s.jsx)(n.p,{children:"Diagram below explains how compute cycles are saved by partial de-compression."}),"\n",(0,s.jsx)(n.p,{children:(0,s.jsx)(n.img,{alt:"CSDB Partial Decompression",src:i(9173).A+"",width:"2292",height:"828"})})]})}function h(e={}){const{wrapper:n}={...(0,r.R)(),...e.components};return n?(0,s.jsx)(n,{...e,children:(0,s.jsx)(o,{...e})}):o(e)}}}]); No newline at end of file | |||
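The CSDB binary layout embedded in the bundle above (`[LayoutVersion (1 byte)][FGID (2 bytes)][DataLen (2 bytes)][Data ...]`, repeated per feature group, with `DataLen == 0` marking a negative cache) is compact enough to sketch. The snippet below is an illustrative Python model of the framing only, not the project's Go implementation; the `uint16` byte order is an assumption.

```python
import struct

LAYOUT_VERSION = 1  # stored as the first byte of the serialized CSDB

def encode_csdb(fg_blocks):
    """Serialize {fg_id: psdb_bytes}; an empty value marks a negative cache."""
    out = bytearray([LAYOUT_VERSION])
    for fg_id, data in fg_blocks.items():
        # FGID and DataLen are uint16; little-endian is assumed here
        out += struct.pack("<HH", fg_id, len(data))
        out += data
    return bytes(out)

def decode_csdb(buf, wanted=None):
    """Return (layout_version, {fg_id: bytes or None}), skipping unwanted FG IDs."""
    version, pos = buf[0], 1
    blocks = {}
    while pos < len(buf):
        fg_id, data_len = struct.unpack_from("<HH", buf, pos)
        pos += 4
        if wanted is None or fg_id in wanted:
            # DataLen == 0 denotes a negative cache: FG is known, no data stored
            blocks[fg_id] = buf[pos:pos + data_len] if data_len else None
        pos += data_len  # unselected payloads are skipped, not parsed
    return version, blocks
```

The `wanted` parameter mirrors the partial FG ID fetch described in the docs: headers are scanned, but payloads for unselected feature groups are skipped rather than copied or decompressed.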
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
echo "=== Checking for source markdown file ==="
find docs -name "data-formats.md" -type f
echo ""
echo "=== Checking for typos in markdown (hear/form) ==="
rg -n "at the hear of|inspired form" --type md docs/ 2>/dev/null || echo "Pattern not found in .md files"
echo ""
echo "=== Checking for image filename with 'encodding' ==="
fd "encodding" docs/ 2>/dev/null || echo "No files with 'encodding' found"
echo ""
echo "=== Checking for 'encoding' variant in filenames ==="
fd "fixed.*length.*encod" docs/ 2>/dev/null || echo "No matching encoding files found"
echo ""
echo "=== Listing image files in docs/assets/images ==="
find docs/assets/images -name "*encoding*" -o -name "*psdb*" 2>/dev/null | head -20

Repository: Meesho/BharatMLStack
Length of output: 958
🏁 Script executed:
#!/bin/bash
echo "=== Search for markdown files with 'hear of' or 'from other' ==="
rg -n "hear of|form other" docs/ 2>/dev/null || echo "Not found"
echo ""
echo "=== Check if markdown exists at different location ==="
find docs -name "*data*format*.md" -type f
echo ""
echo "=== Check git status of the image files ==="
git ls-files docs/assets/images/v1.0.0-psdb-fixed-length-encodding*
echo ""
echo "=== Check the actual bundled JS file for the typos ==="
grep -o "hear\|form other" docs/assets/js/4caa95bf.4a20c460.js | head -20

Repository: Meesho/BharatMLStack
Length of output: 50376
🏁 Script executed:
#!/bin/bash
echo "=== Looking for markdown source files ==="
find docs -name "*data-format*" -type f
echo ""
echo "=== Checking entire docs directory structure ==="
find docs -maxdepth 3 -type d | grep -E "(online-feature|data)" | head -20
echo ""
echo "=== Checking if source markdown exists anywhere ==="
fd "\.md$" docs/ | head -30

Repository: Meesho/BharatMLStack
Length of output: 570
🏁 Script executed:
#!/bin/bash
echo "=== Checking git status ==="
git ls-files | grep "\.md$" | head -20
echo ""
echo "=== Looking for any source files at all ==="
git ls-files | grep -E "\.(ts|tsx|jsx|js|go|py)$" | head -20
echo ""
echo "=== Checking docusaurus config ==="
find . -maxdepth 2 -name "docusaurus.config.*" -o -name "sidebars.*"
echo ""
echo "=== Looking at directory structure ==="
ls -la docs/

Repository: Meesho/BharatMLStack
Length of output: 2898
🏁 Script executed:
#!/bin/bash
echo "=== Checking for data-formats source in docs-src ==="
find docs-src -name "*data-format*"
echo ""
echo "=== Listing all md files in online-feature-store ==="
find docs-src/docs/online-feature-store -type f -name "*.md"
echo ""
echo "=== Checking content of data-formats.md source ==="
cat docs-src/docs/online-feature-store/v1.0.0/data-formats.md | head -50

Repository: Meesho/BharatMLStack
Length of output: 3296
This is an auto-generated build artifact — fix the source files instead.
This minified webpack chunk is generated by Docusaurus and will be overwritten on the next build. However, the underlying source files contain typos that should be corrected:
- Image filename typo: Rename `v1.0.0-psdb-fixed-length-encodding.png` → `v1.0.0-psdb-fixed-length-encoding.png` (remove the double 'd').
- Typos in source markdown (`docs-src/docs/online-feature-store/v1.0.0/data-formats.md`):
  - "at the hear of" → "at the heart of"
  - "inspired form other" → "inspired from other"
  - "used wile storing" → "used while storing"
🤖 Prompt for AI Agents
In `@docs/assets/js/4caa95bf.4a20c460.js` at line 1, The webpack chunk is a
generated artifact; fix the underlying source: rename the image referenced as
"v1.0.0-psdb-fixed-length-encodding-..." to
"v1.0.0-psdb-fixed-length-encoding-..." (update the asset file and any markdown
references), and correct typos in the markdown source referenced by the parsed
metadata (source:"@site/docs/online-feature-store/v1.0.0/data-formats.md") —
change "at the hear of" → "at the heart of", "inspired form other" → "inspired
from other", and "used wile storing" → "used while storing" so regenerated
bundles (the code that builds the asset strings like the one containing
i(5153).A) will include the corrected filename and text.
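Since the review's typo corrections are literal string substitutions, they can be expressed as a tiny script; `apply_fixes` is a hypothetical helper, and the mapping simply mirrors the findings listed in this review.

```python
# Literal typo corrections for
# docs-src/docs/online-feature-store/v1.0.0/data-formats.md (per this review)
FIXES = {
    "at the hear of": "at the heart of",
    "inspired form other": "inspired from other",
    "used wile storing": "used while storing",
}

def apply_fixes(text):
    """Apply each literal replacement in turn (hypothetical helper)."""
    for wrong, right in FIXES.items():
        text = text.replace(wrong, right)
    return text
```

The image rename (`encodding` → `encoding`) still has to happen on disk, e.g. via `git mv`, so that regenerated bundles pick up the corrected asset name.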
docs/assets/js/8cdb4121.8360b1d4.js
Outdated
@@ -0,0 +1 @@
| "use strict";(self.webpackChunkdocs=self.webpackChunkdocs||[]).push([[252],{294:(e,n,i)=>{i.d(n,{A:()=>t});const t=i.p+"assets/images/llm-plat-9ac69c0ffd8c387d177e582611b8c775.png"},1569:(e,n,i)=>{i.d(n,{A:()=>t});const t=i.p+"assets/images/bms-7399e8796d2cd24617c432518ce3f312.png"},2233:e=>{e.exports=JSON.parse('{"permalink":"/BharatMLStack/blog/post-four","editUrl":"https://github.com/Meesho/BharatMLStack/tree/main/docs/blog/bharatmlstack-history/post-four/index.md","source":"@site/blog/bharatmlstack-history/post-four/index.md","title":"Designing a Production-Grade LLM Inference Platform: From Model Weights to Scalable GPU Serving","description":"BharatMLStack","date":"2025-03-29T00:00:00.000Z","tags":[{"inline":true,"label":"llm","permalink":"/BharatMLStack/blog/tags/llm"},{"inline":true,"label":"vllm","permalink":"/BharatMLStack/blog/tags/vllm"},{"inline":true,"label":"tensorrt-llm","permalink":"/BharatMLStack/blog/tags/tensorrt-llm"},{"inline":true,"label":"mlplatform","permalink":"/BharatMLStack/blog/tags/mlplatform"},{"inline":true,"label":"meesho","permalink":"/BharatMLStack/blog/tags/meesho"},{"inline":true,"label":"bharatmlstack","permalink":"/BharatMLStack/blog/tags/bharatmlstack"}],"readingTime":13.38,"hasTruncateMarker":false,"authors":[{"name":"Jaya Kumar","title":"Lead ML Engineer @ Meesho","url":"https://github.com/jayakommuru","imageURL":"https://github.com/jayakommuru.png","key":"jaya","page":null}],"frontMatter":{"slug":"post-four","title":"Designing a Production-Grade LLM Inference Platform: From Model Weights to Scalable GPU Serving","authors":["jaya"],"date":"2025-3-29","tags":["llm","vllm","tensorrt-llm","mlplatform","meesho","bharatmlstack"]},"unlisted":false,"prevItem":{"title":"LLM Inference Optimization Techniques: Engineering Sub-Second Latency at Scale","permalink":"/BharatMLStack/blog/post-five"},"nextItem":{"title":"Cracking the Code: Scaling Model Inference & Real-Time Embedding 
Search","permalink":"/BharatMLStack/blog/post-three"}}')},2531:(e,n,i)=>{i.r(n),i.d(n,{assets:()=>l,contentTitle:()=>o,default:()=>h,frontMatter:()=>a,metadata:()=>t,toc:()=>c});var t=i(2233),r=i(4848),s=i(8453);const a={slug:"post-four",title:"Designing a Production-Grade LLM Inference Platform: From Model Weights to Scalable GPU Serving",authors:["jaya"],date:"2025-3-29",tags:["llm","vllm","tensorrt-llm","mlplatform","meesho","bharatmlstack"]},o=void 0,l={authorsImageUrls:[void 0]},c=[{value:"Designing a Production-Grade LLM Inference Platform: From Model Weights to Scalable GPU Serving",id:"designing-a-production-grade-llm-inference-platform-from-model-weights-to-scalable-gpu-serving",level:2},{value:"Why LLM Inference Is not just bigger ML model serving",id:"why-llm-inference-is-not-just-bigger-ml-model-serving",level:2},{value:"Autoregressive Generation and Sequential Computation:",id:"autoregressive-generation-and-sequential-computation",level:3},{value:"Prefill and Decode Phases:",id:"prefill-and-decode-phases",level:3},{value:"Context Management and KV Caching:",id:"context-management-and-kv-caching",level:3},{value:"Dynamic and Irregular Workloads:",id:"dynamic-and-irregular-workloads",level:3},{value:"Streaming and User Experience Constraints:",id:"streaming-and-user-experience-constraints",level:3},{value:"LLMOps: High-Level Architecture",id:"llmops-high-level-architecture",level:2},{value:"Supported Inference backends (TensorRT LLM, Dynamo & vLLM)",id:"supported-inference-backends-tensorrt-llm--dynamo--vllm",level:2},{value:"Conclusion",id:"conclusion",level:2},{value:"Future Explorations",id:"future-explorations",level:2}];function d(e){const 
n={h2:"h2",h3:"h3",img:"img",li:"li",ol:"ol",p:"p",ul:"ul",...(0,s.R)(),...e.components};return(0,r.jsxs)(r.Fragment,{children:[(0,r.jsx)(n.p,{children:(0,r.jsx)(n.img,{alt:"BharatMLStack",src:i(1569).A+"",width:"1396",height:"460"})}),"\n",(0,r.jsx)(n.h2,{id:"designing-a-production-grade-llm-inference-platform-from-model-weights-to-scalable-gpu-serving",children:"Designing a Production-Grade LLM Inference Platform: From Model Weights to Scalable GPU Serving"}),"\n",(0,r.jsx)(n.p,{children:"Serving large language models in production introduces new challenges across infrastructure, performance optimization, and operational lifecycle management. The LLM Inference Platform addresses these challenges by providing a unified system for deploying and managing open-source and fine-tuned LLMs at scale."}),"\n",(0,r.jsx)(n.p,{children:"The platform implements a complete LLMOps lifecycle \u2014 from model registration and automated compilation to deployment, runtime optimization, and monitoring. Designed as a self-service environment, users can onboard models directly from open repositories such as Hugging Face or upload custom fine-tuned models, and deploy them using a single-click workflow with no manual infrastructure or configuration steps required."}),"\n",(0,r.jsx)(n.p,{children:"In addition to fully automated deployment, the platform allows users to select and apply custom inference optimization techniques \u2014 such as quantization strategies, batching configurations, and runtime-specific performance enhancements \u2014 enabling teams to balance latency, throughput, and cost based on their use case. 
The goal is to reduce operational friction while enabling high-performance, production-grade LLM inference."}),"\n",(0,r.jsx)(n.h2,{id:"why-llm-inference-is-not-just-bigger-ml-model-serving",children:"Why LLM Inference Is not just bigger ML model serving"}),"\n",(0,r.jsx)(n.p,{children:"Large language model (LLM) inference introduces a fundamentally different set of challenges compared to traditional machine learning inference. While classical ML models typically perform a single forward pass to produce a fixed prediction, LLMs operate as autoregressive systems, generating outputs token by token based on previously generated context. This difference dramatically changes how inference systems must be designed, optimized, and scaled."}),"\n",(0,r.jsx)(n.h3,{id:"autoregressive-generation-and-sequential-computation",children:"Autoregressive Generation and Sequential Computation:"}),"\n",(0,r.jsx)(n.p,{children:"Unlike traditional models such as classifiers or recommenders \u2014 where inference cost is relatively constant \u2014 LLMs generate responses incrementally. Each new token depends on all previously generated tokens, making inference inherently sequential and dynamic. This means latency and compute requirements vary significantly depending on prompt length and output size, introducing complexity in scheduling and resource allocation.\nBecause tokens cannot be generated fully in parallel during decoding, GPUs may become underutilized without specialized batching and scheduling strategies. This has led to the development of dedicated LLM inference engines optimized for token-level execution."}),"\n",(0,r.jsx)(n.h3,{id:"prefill-and-decode-phases",children:"Prefill and Decode Phases:"}),"\n",(0,r.jsx)(n.p,{children:"LLM inference typically consists of two distinct stages:"}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"Prefill phase \u2014 the model processes the input prompt and builds internal representations. 
This stage is compute-heavy and highly parallelizable."}),"\n",(0,r.jsx)(n.li,{children:"Decode phase \u2014 the model generates tokens sequentially, predicting one token at a time using previously generated context."}),"\n"]}),"\n",(0,r.jsx)(n.p,{children:"The decode stage often becomes memory-bound rather than compute-bound, which creates new performance bottlenecks compared to traditional ML workloads."}),"\n",(0,r.jsx)(n.h3,{id:"context-management-and-kv-caching",children:"Context Management and KV Caching:"}),"\n",(0,r.jsx)(n.p,{children:"Another fundamental difference lies in how LLMs maintain context. Transformer-based models rely on attention mechanisms that require access to past token representations. To avoid recomputing these representations repeatedly, inference engines use key-value (KV) caching, which stores intermediate activations from previous tokens.\nKV caching significantly improves performance by eliminating redundant computation, but it introduces new challenges:"}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"Memory consumption grows with sequence length and batch size"}),"\n",(0,r.jsx)(n.li,{children:"GPU memory becomes a critical bottleneck"}),"\n",(0,r.jsx)(n.li,{children:"Efficient memory management becomes essential for scaling concurrent requests"}),"\n"]}),"\n",(0,r.jsx)(n.p,{children:"This tradeoff between compute efficiency and memory usage is unique to LLM inference workloads."}),"\n",(0,r.jsx)(n.h3,{id:"dynamic-and-irregular-workloads",children:"Dynamic and Irregular Workloads:"}),"\n",(0,r.jsx)(n.p,{children:"Traditional ML inference typically operates on fixed-size inputs with predictable latency. In contrast, LLM requests vary widely in prompt length, output length, and runtime behavior. 
As a result:"}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"Batch sizes must be dynamic rather than static"}),"\n",(0,r.jsx)(n.li,{children:"Requests may enter and leave batches asynchronously"}),"\n",(0,r.jsx)(n.li,{children:"Scheduling systems must continuously rebalance workloads to maximize GPU utilization"}),"\n"]}),"\n",(0,r.jsx)(n.p,{children:"These characteristics require specialized serving architectures that differ significantly from standard ML serving pipelines."}),"\n",(0,r.jsx)(n.h3,{id:"streaming-and-user-experience-constraints",children:"Streaming and User Experience Constraints:"}),"\n",(0,r.jsx)(n.p,{children:"Another distinguishing factor is the expectation of real-time streaming responses. Instead of returning a single output, LLM systems often stream tokens to users as they are generated.\nBecause of these differences \u2014 sequential generation, growing memory requirements, dynamic workloads, and streaming constraints \u2014 LLM inference cannot be treated as a simple extension of existing ML serving systems. Production platforms must incorporate specialized runtime engines, advanced optimization techniques, and observability tailored specifically to LLM workloads."}),"\n",(0,r.jsx)(n.h2,{id:"llmops-high-level-architecture",children:"LLMOps: High-Level Architecture"}),"\n",(0,r.jsx)(n.p,{children:(0,r.jsx)(n.img,{alt:"LLM Architecture",src:i(294).A+"",width:"1302",height:"830"})}),"\n",(0,r.jsx)(n.p,{children:"The LLM Inference Framework is designed as a fully automated, end-to-end system for deploying and operating open-source and fine-tuned large language models at scale. 
The architecture abstracts the complexity of model optimization, hardware selection, deployment, and runtime management into a unified workflow that enables users to move from raw model weights to production-ready inference endpoints with minimal manual intervention."}),"\n",(0,r.jsx)(n.p,{children:"Our LLM Inference Framework is architected not just as a serving engine, but as a complete lifecycle management system. As illustrated in the high-level design below, the platform automates the journey of a model through seven distinct stages, ensuring reproducibility, performance, and scalability."}),"\n",(0,r.jsxs)(n.ol,{children:["\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"Onboarding & Registration (The Source of Truth)"}),"\n",(0,r.jsx)(n.p,{children:"The lifecycle begins with the Data Scientist or engineer."}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"Model Ingestion: Users onboard models\u2014whether open-source (Hugging Face, NeMo) or internally fine-tuned\u2014via the Truffle Box SDK/UI."}),"\n",(0,r.jsx)(n.li,{children:'LLM + Prompt Registry: Unlike traditional systems that only track model weights, our registry is a unified control plane. It stores both the Model Artifacts and the Prompt Templates. 
This allows Data Scientists to register and version-control prompts (e.g., "customer_support_v2") independently of the application code.'}),"\n"]}),"\n"]}),"\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:'The "Black Box" Build Engine'}),"\n",(0,r.jsx)(n.p,{children:"Once a model is registered, the Automated LLM Compiler + Quantizer Module kicks off a background job on ephemeral GPU resources."}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"Transformation: The raw model is converted into a TRT-LLM Checkpoint."}),"\n",(0,r.jsx)(n.li,{children:"Quantization: The system automatically applies quantization algorithms (like INT4 AWQ or FP8) to reduce memory footprint."}),"\n",(0,r.jsx)(n.li,{children:"Engine Building: Finally, it compiles a highly optimized TRT Engine specifically tuned for the target hardware."}),"\n"]}),"\n"]}),"\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"Intelligent Profiling & Validation"}),"\n",(0,r.jsx)(n.p,{children:"Before deployment, the new engine passes through the Hardware & Inference Runtime Profiler."}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"Benchmarking: This module empirically tests the engine against various hardware configurations (L4 vs. A100) and runtimes (TRT-LLM vs. 
vLLM)."}),"\n",(0,r.jsx)(n.li,{children:"Optimization: It recommends the optimal configuration that meets latency SLAs (Time-To-First-Token) while minimizing cost."}),"\n"]}),"\n"]}),"\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"Smart Artifact Generation & Distribution"}),"\n",(0,r.jsx)(n.p,{children:'To solve the Kubernetes "Cold Start" problem, the LLM Serving Artifacts Generation module packages the model using a bifurcated strategy:'}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"Standard Models: Artifacts are uploaded to Cloud Storage (GCS) and downloaded by pods at startup."}),"\n",(0,r.jsx)(n.li,{children:"Very Large Models: For massive models (>8GB) where network downloads are too slow, the system pre-caches the model onto Secondary Boot Disks. These disks are attached directly to new GPU nodes during autoscaling, eliminating download wait times."}),"\n"]}),"\n"]}),"\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"Image Streaming & Deployment"}),"\n",(0,r.jsx)(n.p,{children:"Simultaneously, the inference runtime container images are pulled from the Artifact Registry."}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"Image Streaming: We utilize container image streaming to allow pods to start initializing while the massive Triton/Dynamo container layers are still downloading, further shaving seconds off the startup time. 
link"}),"\n"]}),"\n"]}),"\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"The Inference Runtime (Kubernetes)"}),"\n",(0,r.jsx)(n.p,{children:"The workload lands on Kubernetes with Autoscaling."}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"Dynamic Backends: Depending on the profile generated in Stage 3, the pod initializes either TensorRT-LLM (for throughput) or vLLM (for flexibility), or spins up a Dynamo worker for distributed inference."}),"\n",(0,r.jsx)(n.li,{children:'Data Loading: The pod either downloads the model from Cloud Storage or mounts the pre-warmed Secondary Boot Disk ("Pull from Disk").'}),"\n"]}),"\n"]}),"\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"Client Interaction & Observability"}),"\n",(0,r.jsx)(n.p,{children:"Finally, the LLM Inference Client executes the request."}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"Prompt Injection: The client pulls the specific prompt template ID from the Registry, ensuring the exact versioned instructions are used."}),"\n",(0,r.jsx)(n.li,{children:"Streaming Response: The request is sent via gRPC, and tokens are streamed back to the user in real-time."}),"\n"]}),"\n"]}),"\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"Observability: Monitoring the Pulse of GenAI"}),"\n",(0,r.jsx)(n.p,{children:"In traditional microservices, success is measured by CPU utilization and request latency (p99). For Large Language Models, these metrics are insufficient. 
A user doesn't care if the GPU is at 80% utilization; they care about how fast the first word appears and how smoothly the rest of the sentence follows."}),"\n",(0,r.jsx)(n.p,{children:"To capture the true user experience, our platform instrumentation focuses on three critical LLM-specific metrics:"}),"\n",(0,r.jsxs)(n.ol,{children:["\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"Time to First Token (TTFT)"}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"Definition: TTFT measures the time elapsed from the moment a request is received until the very first token is generated and streamed back to the user."}),"\n",(0,r.jsx)(n.li,{children:'Why it matters: This represents the "Prefill Phase" latency\u2014the time the model takes to process the input prompt and load weights. A high TTFT makes the application feel unresponsive or "hung."'}),"\n",(0,r.jsx)(n.li,{children:"Optimization: We closely monitor TTFT to ensure our Prefix Caching is effective (aiming for high cache hitrates), which drastically lowers this metric by skipping redundant prompt processing."}),"\n"]}),"\n"]}),"\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"Inter-Token Latency (ITL)"}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:'Definition: ITL measures the average time interval between the generation of consecutive tokens during the "Decode Phase".'}),"\n",(0,r.jsx)(n.li,{children:'Why it matters: This defines the "perceived speed" of reading. Even if the first token is fast (low TTFT), high ITL makes the text generation look "jerky" or slow to the user.'}),"\n",(0,r.jsx)(n.li,{children:"Benchmarks: In our testing with Llama 3.1, we track p99 ITL to ensure it stays below human reading speeds to maintain a natural conversational flow."}),"\n"]}),"\n"]}),"\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"Token Throughput vs. 
Request Throughput"}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"We distinguish between two types of throughput to balance system efficiency with user load:"}),"\n",(0,r.jsx)(n.li,{children:"Token Throughput (tokens/sec): The total number of tokens generated across all concurrent requests. This measures the raw compute efficiency of the GPU and the effectiveness of batching."}),"\n",(0,r.jsx)(n.li,{children:"Request Throughput (req/sec): The number of distinct user queries served per second. We use this to determine autoscaling thresholds, ensuring we scale out before the queue depth impacts ITL."}),"\n"]}),"\n"]}),"\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"The Monitoring Stack"}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:'Real-time Dashboards: We utilize Grafana to visualize these streaming metrics in real-time, allowing on-call engineers to spot "slow generation" incidents that generic "500 error" alerts would miss.'}),"\n",(0,r.jsx)(n.li,{children:'Request Tracing: Since Triton Inference Server does not log request payloads by default, we integrate a Helix Client to asynchronously publish request logs to Log Tables. This allows us to trace a specific "slow" request back to its prompt to understand if a complex input caused the latency spike.'}),"\n"]}),"\n"]}),"\n"]}),"\n"]}),"\n"]}),"\n",(0,r.jsx)(n.h2,{id:"supported-inference-backends-tensorrt-llm--dynamo--vllm",children:"Supported Inference backends (TensorRT LLM, Dynamo & vLLM)"}),"\n",(0,r.jsx)(n.p,{children:'Tailored for the Use Case: We do not believe in a "one-size-fits-all" approach to inference. Different use cases\u2014whether a real-time voice bot requiring ultra-lowsub-second latency or a massive reasoning task requiring huge context windows\u2014demand different runtime characteristics. 
Our platform is designed to be runtime-agnostic, allowing us to automatically select and tailor the best engine based on the specific requirements of the application:'}),"\n",(0,r.jsxs)(n.ol,{children:["\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"TensorRT-LLM: The High-Performance Standard"}),"\n",(0,r.jsx)(n.p,{children:"Suitable for: High-throughput production workloads where latency is critical (e.g., customer support chat, real-time voice bots)."}),"\n",(0,r.jsx)(n.p,{children:"TensorRT-LLM serves as our default backend for these scenarios. Our internal benchmarks on Llama 3.1 and 3.2 models demonstrated that a tuned TensorRT-LLM engine significantly outperforms standard runtimes, especially when utilizing INT4 AWQ and FP8 quantization ."}),"\n",(0,r.jsx)(n.p,{children:"Key optimizations we tailor for these high-load cases include:"}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"Optimized execution via TensorRT engine compilation"}),"\n",(0,r.jsx)(n.li,{children:"Quantization-aware execution for reduced memory usage and improved throughput"}),"\n",(0,r.jsx)(n.li,{children:"Inflight Batching: Allowing requests to be processed continuously without waiting for the entire batch to finish, drastically improving GPU utilization ."}),"\n",(0,r.jsx)(n.li,{children:"Custom Plugins: Enabling specific NVIDIA plugins like the GEMM plugin and GPT Attention plugin to accelerate matrix multiplications and attention mechanisms ."}),"\n"]}),"\n"]}),"\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"Dynamo: Distributed Inference for Reasoning Models"}),"\n",(0,r.jsx)(n.p,{children:'Suitable for: Very large "reasoning" models (70B+) or scenarios requiring massive context windows where a single GPU\'s memory is insufficient.'}),"\n",(0,r.jsx)(n.p,{children:"For these memory-bound tasks, we utilize Dynamo, a low-latency distributed inference framework . 
Unlike monolithic servers, Dynamo disaggregates the inference process to scale resources horizontally:"}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"KV Aware Routing: A specialized router directs requests to workers that already hold the relevant Key-Value (KV) cache, minimizing redundant computation ."}),"\n",(0,r.jsx)(n.li,{children:'Prefill vs. Decode Split: The workload is divided into Prefill Workers (processing the prompt) and Decode Workers (generating tokens), allowing us to scale the compute-heavy "reading" phase independently from the memory-heavy "writing" phase .'}),"\n",(0,r.jsx)(n.li,{children:"Distributed execution across multiple GPU resources"}),"\n"]}),"\n"]}),"\n",(0,r.jsxs)(n.li,{children:["\n",(0,r.jsx)(n.p,{children:"vLLM: The Flexible Baseline"}),"\n",(0,r.jsx)(n.p,{children:"Suitable for: Rapid prototyping, testing new model architectures, or low-traffic internal tools where ease of deployment outweighs raw throughput."}),"\n",(0,r.jsx)(n.p,{children:"While TensorRT-LLM is optimized for maximum speed, vLLM provides a robust and flexible baseline ."}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"High throughput through dynamic batching and efficient memory utilization"}),"\n",(0,r.jsx)(n.li,{children:"Paged KV cache management for handling long contexts and concurrent requests"}),"\n",(0,r.jsx)(n.li,{children:"Strong support for open-source model ecosystems"}),"\n",(0,r.jsx)(n.li,{children:"Rapid Adoption: It allows us to onboard new model architectures immediately without waiting for a custom TensorRT build."}),"\n",(0,r.jsx)(n.li,{children:"Benchmarking Insight: In our internal tests, vLLM provided a strong baseline but often lacked the specific max-token optimizations present in our custom TRT engines . 
We use it strategically for initial testing before committing to a full TensorRT optimization pipeline."}),"\n"]}),"\n"]}),"\n"]}),"\n",(0,r.jsx)(n.h2,{id:"conclusion",children:"Conclusion"}),"\n",(0,r.jsx)(n.p,{children:"Large language model inference introduces a fundamentally new class of infrastructure challenges\u2014where performance is governed not just by raw compute, but by memory efficiency, intelligent scheduling, runtime specialization, and lifecycle automation. Unlike traditional ML serving, LLM inference requires systems that understand token-level execution, manage rapidly growing context state, and continuously balance latency, throughput, and cost under highly dynamic workloads."}),"\n",(0,r.jsx)(n.p,{children:"The LLM Inference Framework addresses these challenges by transforming inference into a fully automated, reproducible lifecycle\u2014from model onboarding and compilation to deployment, optimization, and observability. By integrating automated quantization and engine compilation, intelligent runtime selection, cold-start mitigation strategies, and LLM-specific observability metrics such as Time-to-First-Token and Inter-Token Latency, the platform ensures both high performance and operational simplicity."}),"\n",(0,r.jsx)(n.p,{children:"Equally important, the framework is designed with flexibility and future evolution in mind. Its runtime-agnostic architecture enables seamless adoption of emerging inference engines, hardware accelerators, and optimization techniques without requiring platform redesign. This ensures that teams can continuously leverage advancements in the rapidly evolving LLM ecosystem while maintaining consistent operational workflows."}),"\n",(0,r.jsx)(n.p,{children:"Ultimately, the goal of the platform is to make production-scale LLM deployment as seamless and reliable as traditional software deployment\u2014allowing teams to focus on building intelligent applications rather than managing infrastructure complexity. 
By combining lifecycle automation, runtime optimization, and deep observability, the LLM Inference Framework provides a scalable foundation for delivering fast, cost-efficient, and production-ready LLM experiences."}),"\n",(0,r.jsx)(n.h2,{id:"future-explorations",children:"Future Explorations"}),"\n",(0,r.jsx)(n.p,{children:"While we have achieved significant milestones in latency and throughput, the landscape of GenAI is evolving rapidly. Our roadmap focuses on increasing flexibility, reducing costs, and enhancing reliability for enterprise-grade workloads. Here is what we are building next:"}),"\n",(0,r.jsxs)(n.ul,{children:["\n",(0,r.jsx)(n.li,{children:"TPU Support: To diversify our hardware supply chain and further optimize cost-per-token, we are evaluating Google Cloud TPUs to bake it into our platform. By leveraging the JAX and PyTorch/XLA ecosystems, we aim to unlock the massive throughput potential of TPU v5e chips, particularly for our open-source Llama models. This will allow the hardware profiler to dynamically choose between NVIDIA GPUs and Google TPUs based on real-time availability and price-performance metrics."}),"\n",(0,r.jsx)(n.li,{children:'Multi-LoRA Serving (Serverless Experience): Currently, deploying a fine-tuned model requires a dedicated GPU. We are building support for Multi-LoRA serving, which will allow us to serve hundreds of unique, fine-tuned adapters on top of a single frozen base model. This will drastically reduce costs for multi-tenant applications, enabling a "serverless" experience where specific fine-tunes are hot-swapped instantly per request.'}),"\n",(0,r.jsx)(n.li,{children:"Spot Instance Orchestration: To further optimize cloud costs, we are developing fault-tolerant mechanisms to run inference workloads on Spot Instances. 
By implementing aggressive checkpointing and seamless request draining, we aim to leverage cheaper, preemptible compute capacity without interrupting the user's streaming experience."}),"\n",(0,r.jsx)(n.li,{children:'Semantic Caching Layer: We plan to move beyond standard Prefix Caching to implement Semantic Caching. By using a vector database to fetch responses for semantically similar queries (e.g., "How do I reset my password?" vs. "Password reset steps"), we can bypass the GPU entirely for repetitive queries, reducing latency to near-zero.'}),"\n",(0,r.jsx)(n.li,{children:"Context-Aware Autoscaling: Standard CPU/GPU utilization metrics are often insufficient signals for scaling LLMs. We are working on KV-cache pressure metrics for autoscaling. This ensures that we scale out before the memory fills up, preventing eviction-based slowdowns during traffic spikes."}),"\n",(0,r.jsx)(n.li,{children:'Online Evaluation & Guardrails: We are integrating a lightweight "Trust Layer" into the proxy. This will allow for low-latency input/output filtering (Guardrails) and asynchronous "LLM-as-a-Judge" evaluation pipelines to monitor response quality in production, not just system health.'}),"\n"]})]})}function h(e={}){const{wrapper:n}={...(0,s.R)(),...e.components};return n?(0,r.jsx)(n,{...e,children:(0,r.jsx)(d,{...e})}):d(e)}},8453:(e,n,i)=>{i.d(n,{R:()=>a,x:()=>o});var t=i(6540);const r={},s=t.createContext(r);function a(e){const n=t.useContext(s);return t.useMemo(function(){return"function"==typeof e?e(n):{...n,...e}},[n,e])}function o(e){let n;return n=e.disableParentContext?"function"==typeof e.components?e.components(r):e.components||r:a(e.components),t.createElement(s.Provider,{value:n},e.children)}}}]); No newline at end of file | |||
Replace the “link” placeholder with an actual hyperlink.
The rendered content includes the literal word “link” in the “Image Streaming” bullet, which reads like a placeholder. Please update the source MDX to include the intended hyperlink so the docs read cleanly.
🤖 Prompt for AI Agents
In `@docs/assets/js/8cdb4121.8360b1d4.js` at line 1, The rendered bundle shows a
leftover "link" placeholder in the "Image Streaming" list item inside the
rendered tree produced by function d (JSX n.li under the "Image Streaming &
Deployment" section); update the original MDX source for the post (the content
that builds d) to replace the literal "link" placeholder with the intended
hyperlink (either a Markdown inline link or an HTML <a> tag) so the "Image
Streaming: We utilize container image streaming..." bullet reads with the proper
URL instead of the word "link".
| @@ -1 +1 @@ | |||
| "use strict";(self.webpackChunkdocs=self.webpackChunkdocs||[]).push([[7720],{4041:e=>{e.exports=JSON.parse('{"categoryGeneratedIndex":{"title":"Trufflebox UI","description":"Trufflebox UI is a modern, feature rich UI framework for supporting MLOps. It supports Feature catalog, management, user managemnet and other adminops","slug":"/category/trufflebox-ui","permalink":"/BharatMLStack/category/trufflebox-ui","sidebar":"tutorialSidebar","navigation":{"previous":{"title":"Quick Start","permalink":"/BharatMLStack/quick-start/v1.0.0/quick-start"},"next":{"title":"User Manual","permalink":"/BharatMLStack/trufflebox-ui/v1.0.0/userguide"}}}}')}}]); No newline at end of file | |||
| "use strict";(self.webpackChunkdocs=self.webpackChunkdocs||[]).push([[7720],{4041:e=>{e.exports=JSON.parse('{"categoryGeneratedIndex":{"title":"Trufflebox UI","description":"Trufflebox UI is a modern, feature rich UI framework for supporting MLOps. It supports Feature catalog, management, user managemnet and other adminops","slug":"/category/trufflebox-ui","permalink":"/BharatMLStack/category/trufflebox-ui","sidebar":"tutorialSidebar","navigation":{"previous":{"title":"Quick Start","permalink":"/BharatMLStack/quick-start/v1.0.0/quick-start"},"next":{"title":"v1.0.0","permalink":"/BharatMLStack/trufflebox-ui/v1.0.0"}}}}')}}]); No newline at end of file | |||
Fix typo in category description ("managemnet").
User-facing docs copy should read “management.”
✏️ Suggested fix
-..."user managemnet and other adminops"...
+..."user management and other adminops"...

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "use strict";(self.webpackChunkdocs=self.webpackChunkdocs||[]).push([[7720],{4041:e=>{e.exports=JSON.parse('{"categoryGeneratedIndex":{"title":"Trufflebox UI","description":"Trufflebox UI is a modern, feature rich UI framework for supporting MLOps. It supports Feature catalog, management, user managemnet and other adminops","slug":"/category/trufflebox-ui","permalink":"/BharatMLStack/category/trufflebox-ui","sidebar":"tutorialSidebar","navigation":{"previous":{"title":"Quick Start","permalink":"/BharatMLStack/quick-start/v1.0.0/quick-start"},"next":{"title":"v1.0.0","permalink":"/BharatMLStack/trufflebox-ui/v1.0.0"}}}}')}}]); | |
| "use strict";(self.webpackChunkdocs=self.webpackChunkdocs||[]).push([[7720],{4041:e=>{e.exports=JSON.parse('{"categoryGeneratedIndex":{"title":"Trufflebox UI","description":"Trufflebox UI is a modern, feature rich UI framework for supporting MLOps. It supports Feature catalog, management, user management and other adminops","slug":"/category/trufflebox-ui","permalink":"/BharatMLStack/category/trufflebox-ui","sidebar":"tutorialSidebar","navigation":{"previous":{"title":"Quick Start","permalink":"/BharatMLStack/quick-start/v1.0.0/quick-start"},"next":{"title":"v1.0.0","permalink":"/BharatMLStack/trufflebox-ui/v1.0.0"}}}}')}}]); |
🤖 Prompt for AI Agents
In `@docs/assets/js/fcf4f6ca.8b12d88e.js` at line 1, Update the exported JSON in
the module where e.exports is set (the object under categoryGeneratedIndex) to
correct the typo in the "description" value: replace "user managemnet" with
"user management" so the description string reads "...supports Feature catalog,
management, user management and other adminops"; locate the JSON assigned to
e.exports (or the categoryGeneratedIndex object) and update that description
field accordingly.
Semgrep found 9 issues

Hardcoded sensitive data leaked

Semgrep has detected a leak of sensitive data in this code. This secret data could be used by internal or external malicious actors. We highly recommend you change, reset, or rotate the sensitive data.

A secret is hard-coded in the application. Secrets stored in source code, such as credentials, identifiers, and other types of sensitive data, can be leaked and used by internal or external malicious actors. It is recommended to rotate the secret and retrieve it from a secure secret vault or Hardware Security Module (HSM); alternatively, environment variables can be used if allowed by your company policy.
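The Semgrep finding above recommends moving secrets out of source code. As a minimal sketch of the environment-variable approach it suggests (the `get_secret` helper and the `DEMO_API_KEY` name are illustrative, not part of this repo):

```python
import os

def get_secret(name: str) -> str:
    """Read a secret from the environment instead of hardcoding it.

    Fails loudly when the variable is missing, so a misconfigured
    deployment is caught at startup rather than at first use.
    """
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"Missing required secret: {name}")
    return value
```

In production, the same interface can be backed by a secret manager or HSM lookup instead of `os.environ`, keeping call sites unchanged.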
🔁 Pull Request Template – BharatMLStack
Context:
Give a brief overview of the motivation behind this change. Include any relevant discussion links (Slack, documents, tickets, etc.) that help reviewers understand the background and the issue being addressed.
Describe your changes:
Mention the changes made in the codebase.
Testing:
Please describe how you tested the code. If manual tests were performed, please explain how. If automatic tests were added or existing ones cover the change, please explain how you ran them.
Monitoring:
Explain how this change will be tracked after deployment. Indicate whether current dashboards, alerts, and logs are enough, or if additional instrumentation is required.
Rollback plan:
Explain the rollback plan in case of issues.
Checklist before requesting a review
📂 Modules Affected
- horizon (Real-time systems / networking)
- online-feature-store (Feature serving infra)
- trufflebox-ui (Admin panel / UI)
- infra (Docker, CI/CD, GCP/AWS setup)
- docs (Documentation updates)

✅ Type of Change
📊 Benchmark / Metrics (if applicable)
Summary by CodeRabbit