Merged
103 changes: 86 additions & 17 deletions README.md
@@ -1,16 +1,14 @@
# About CrateDB

[![Bluesky][badge-bluesky]][project-bluesky]

[![Status][badge-status]][project-pypi]
[![CI][badge-ci]][project-ci]
[![Coverage][badge-coverage]][project-coverage]
[![Downloads per month][badge-downloads-per-month]][project-downloads]

[![License][badge-license]][project-license]
[![Release Notes][badge-release-notes]][project-release-notes]

[![Status][badge-status]][project-pypi]
[![PyPI Version][badge-package-version]][project-pypi]
[![Python Versions][badge-python-versions]][project-pypi]
[![Downloads per month][badge-downloads-per-month]][project-downloads]

» [Documentation]
| [Releases]
@@ -19,6 +17,7 @@
| [License]
| [CrateDB]
| [Community Forum]
| [Bluesky]
Member Author

@amotl amotl May 11, 2025

I've tuned down the reference to Bluesky now. It currently points to https://bsky.app/search?q=cratedb, but the outcome isn't very pleasant.

image

The link should actually go to the CrateDB profile page https://bsky.app/profile/did:plc:kj3ndittnoqihvftw6rnn643, but that outcome would be even more unpleasant.

image

Request: Could the company finally ramp up its extrovert nature and start maintaining a proper presence on social networks where actual humans communicate and collaborate, optimally on both Bluesky and Mastodon, because Twitter is officially a thing of the past?

/cc @michaelkremmel

Member Author

If there is interest in improving relevant details, don't hesitate to ask how this can be made both efficient and fun. In this case, let's discuss it in an out-of-band thread or conversation.


A high-level description about [CrateDB], with cross-references
to relevant resources in the spirit of a curated knowledge backbone.
@@ -30,17 +29,52 @@ to relevant resources in the spirit of a curated knowledge backbone.

## What's inside

- A few tidbits of _structured docs_.
A workbench rig for information and knowledge management,
aiming to compress content authoring and curation processes,
nothing big.

### Abstract

- **Structured documentation** based on a basic and generic [hierarchical outline].

- **Utility programs** to parse [YAML] outline files and generate outputs
(e.g., [Markdown], [llms-txt]), supporting the authoring and
production process.

- **Python API** that offers selective access to documentation
and knowledge resources by providing basic primitives to
query elements of the outline tree.
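
As a rough illustration of the outline-to-Markdown step described above, here is a minimal sketch. It does not use the package's own classes: `OutlineItem`, `OutlineSection`, and `to_markdown` are hypothetical names, the field names are assumptions, and the data is built in memory rather than parsed from a YAML file.

```python
from dataclasses import dataclass, field


@dataclass
class OutlineItem:
    """One entry of the outline (hypothetical shape, assumed for this sketch)."""
    title: str
    link: str
    description: str = ""


@dataclass
class OutlineSection:
    """A named group of outline items (hypothetical shape)."""
    name: str
    items: list[OutlineItem] = field(default_factory=list)


def to_markdown(title: str, sections: list[OutlineSection]) -> str:
    """Render sections as Markdown headings with one link bullet per item."""
    lines = [f"# {title}", ""]
    for section in sections:
        lines.append(f"## {section.name}")
        lines.append("")
        for item in section.items:
            # Append the description only when the item has one.
            suffix = f": {item.description}" if item.description else ""
            lines.append(f"- [{item.title}]({item.link}){suffix}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content for illustration only.
sections = [
    OutlineSection(
        name="Docs",
        items=[
            OutlineItem(
                title="Example page",
                link="https://example.org/docs/page",
                description="Placeholder entry for illustration.",
            )
        ],
    ),
    OutlineSection(
        name="Optional",
        items=[OutlineItem("Bare link", "https://example.org/other")],
    ),
]
markdown = to_markdown("Example outline", sections)
print(markdown)
```

The section names `Docs` and `Optional` are two of the standard section names; everything else in the example data is made up.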

### Applied

- The `ask` subcommand uses [llms-txt] context files to answer questions
about a topic domain that would otherwise yield incomprehensible,
incomplete, or weak responses.

- The [cratedb-outline.yaml] file indexes documents about what CrateDB is
and what you can do with it.
- The compact Python API can be used by a [Model Context Protocol (MCP)]
documentation server to acquire information about the relevant topic
domain on demand.

- The [about/v1] folder includes [llms.txt] files generated from
[cratedb-outline.yaml] by expanding all links. They can be used
to provide better context for conversations about CrateDB.
### Concrete

- The [cratedb-outline.yaml] outline file indexes documents about
what CrateDB is, what you can do with it, and how.

- Context bundle files are published to the [about/v1] folder.
They can be used to provide better context for conversations about
CrateDB, for example, by using the `cratedb-about ask` subcommand.

- The documentation subsystem of the [cratedb-mcp] package uses the
Python API to serve and consider relevant documentation resources
within its data flow procedures. It selects relevant resources mostly
based on the value of the `description` attribute of the outline
data model.
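
To make the idea of selecting resources by their `description` attribute more tangible, here is a minimal sketch. It is not the selection logic of [cratedb-mcp], which is not shown here: `select_by_description` is a hypothetical helper, items are assumed to be plain dictionaries, and plain keyword matching stands in for whatever the real implementation does.

```python
def select_by_description(items, query):
    """Return items whose description mentions every word of the query."""
    words = query.lower().split()
    return [
        item
        for item in items
        # Items without a description never match a non-empty query.
        if all(word in item.get("description", "").lower() for word in words)
    ]


# Placeholder items for illustration; not taken from cratedb-outline.yaml.
items = [
    {"title": "Page A", "link": "https://example.org/a", "description": "How to import CSV data"},
    {"title": "Page B", "link": "https://example.org/b", "description": "Running SQL queries"},
    {"title": "Page C", "link": "https://example.org/c"},
]

matches = select_by_description(items, "csv import")
print([item["title"] for item in matches])
```

The sketch suggests why well-written descriptions matter: an item without a description, like "Page C" here, cannot be found this way.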

## Install

The authors recommend using the [uv] package manager. Alternative
options are to use `pipx` or `pip install --user`.

### From PyPI
```shell
uv tool install --upgrade 'cratedb-about[all]'
@@ -52,6 +86,12 @@ uv tool install --upgrade 'cratedb-about[all] @ git+https://github.com/crate/abo

## Usage

The `cratedb-about` package provides three subsystems.

- Outline: Read and query outline files.
- Build: Produce context output bundles from outline files.
- Query: Use context information for conversations with LLMs.

### Outline

#### CLI
@@ -72,8 +112,6 @@ environment variable.
#### API
Use the Python API to retrieve individual sets of outline items, for example,
by section name. The standard section names are: Docs, API, Examples, Optional.
The API can be used to feed information to a [Model Context Protocol (MCP)]
documentation server, for example, a subsystem of [cratedb-mcp].
```python
from cratedb_about import CrateDbKnowledgeOutline

Expand Down Expand Up @@ -132,6 +170,26 @@ cratedb-about list-questions
To configure a different context file, use the `CRATEDB_CONTEXT_URL` environment
variable. The default value is https://cdn.crate.io/about/v1/llms-full.txt.
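
The override-with-default behaviour described above can be sketched as follows. How the package resolves the variable internally is not shown here, so `resolve_context_url` and `DEFAULT_CONTEXT_URL` are hypothetical names; only the variable name and the default URL come from the text.

```python
import os

# Default taken from the paragraph above.
DEFAULT_CONTEXT_URL = "https://cdn.crate.io/about/v1/llms-full.txt"


def resolve_context_url(environ=None):
    """Return the configured context URL, falling back to the default.

    `environ` can be any mapping, which keeps the function easy to test;
    by default the process environment is used. An empty value is treated
    like an unset variable (an assumption of this sketch).
    """
    env = os.environ if environ is None else environ
    return env.get("CRATEDB_CONTEXT_URL") or DEFAULT_CONTEXT_URL


print(resolve_context_url({}))
print(resolve_context_url({"CRATEDB_CONTEXT_URL": "https://example.org/llms.txt"}))
```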

## FAQ

- Q: Seriously, how do I use this?

A: As mentioned above, this repository includes content and a few utilities
to manage the corresponding information. Users will directly use the produced
[llms.txt] and [llms-full.txt] files. Developers will install the [cratedb-about]
package to access outline information from their own programs, or to run parts
of the production machinery on their own premises, either ad hoc or as part of
automated pipelines.
Comment on lines +173 to +182
Member Author

Users will directly use the produced llms.txt and llms-full.txt files.

I would like to elaborate a bit more about the "how".

While the context files are currently used by the built-in cratedb-about ask subcommand, that's actually just a workbench tool, so the files should also find their way into actual end-user applications.

Can someone support us by providing relevant information on how that would work with contemporary (desktop) applications, for example?

I don't know much about the state of the onion, but a quick search reveals the advent of those early movers in 2023/2024 already: Quora Poe, ChatGPT for Mac, Perplexity App, Claude Desktop.

Member Author

In addition, let's also "use" them within a lightweight chatbot example application shipped with this package, for example using Streamlit, or anything else that makes it viable to ramp up such an interface without much custom programming.


- Q: It looks like the knowledge base machinery is missing important information
about CrateDB. I've asked it about matters of polymer sharding, and the answer
wasn't very insightful.

A: Well, we can understand your disappointment. To improve the situation,
we are constantly curating content, and you can support the process by giving
us hints about which fragments of information to include in the set of
curated information. To learn about what this means, see also [ABOUT-24].
Comment on lines +184 to +191
Member Author

That paragraph has been added to evangelize the need for support with further content curation.

/cc @kneth, @michaelkremmel


## Project Information

### Acknowledgements
@@ -140,17 +198,30 @@ this project is building upon.

### Contributing
The `cratedb-about` package is an open source project, and is [managed on
GitHub]. Contributions of any kind are very much appreciated.
GitHub]. Contributions of any kind are welcome and appreciated.

### Status
The software is in the pre-alpha (planning) stage. Version pinning is strongly
recommended, especially if you use it as a library.


[ABOUT-24]: https://github.com/crate/about/issues/24
[about/v1]: https://cdn.crate.io/about/v1/
[CrateDB]: https://cratedb.com/database
[cratedb-about]: https://pypi.org/project/cratedb-about/
[cratedb-mcp]: https://github.com/crate/cratedb-mcp
[cratedb-outline.yaml]: https://github.com/crate/about/blob/main/src/cratedb_about/outline/cratedb-outline.yaml
[filesystem-spec]: https://filesystem-spec.readthedocs.io/
[llms.txt]: https://llmstxt.org/
[hierarchical outline]: https://en.wikipedia.org/wiki/Outline_(list)
[llms-txt]: https://llmstxt.org/
[llms.txt]: https://cdn.crate.io/about/v1/llms.txt
[llms-full.txt]: https://cdn.crate.io/about/v1/llms-full.txt
[Markdown]: https://daringfireball.net/projects/markdown/
[Model Context Protocol (MCP)]: https://modelcontextprotocol.io/introduction
[uv]: https://docs.astral.sh/uv/
[YAML]: https://en.wikipedia.org/wiki/Yaml

[Bluesky]: https://bsky.app/search?q=cratedb
[Community Forum]: https://community.cratedb.com/
[Documentation]: https://github.com/crate/about
[Issues]: https://github.com/crate/about/issues
@@ -159,7 +230,6 @@ GitHub]. Contributions of any kind are very much appreciated.
[Source code]: https://github.com/crate/about
[Releases]: https://github.com/crate/about/releases

[badge-bluesky]: https://img.shields.io/badge/Bluesky-0285FF?logo=bluesky&logoColor=fff&label=Follow%20%40CrateDB
[badge-ci]: https://github.com/crate/about/actions/workflows/tests.yml/badge.svg
[badge-coverage]: https://codecov.io/gh/crate/about/branch/main/graph/badge.svg
[badge-downloads-per-month]: https://pepy.tech/badge/cratedb-about/month
@@ -168,7 +238,6 @@ GitHub]. Contributions of any kind are very much appreciated.
[badge-python-versions]: https://img.shields.io/pypi/pyversions/cratedb-about.svg
[badge-release-notes]: https://img.shields.io/github/release/crate/about?label=Release+Notes
[badge-status]: https://img.shields.io/pypi/status/cratedb-about.svg
[project-bluesky]: https://bsky.app/search?q=cratedb
[project-ci]: https://github.com/crate/about/actions/workflows/tests.yml
[project-coverage]: https://app.codecov.io/gh/crate/about
[project-downloads]: https://pepy.tech/project/cratedb-about/