diff --git a/pgml-cms/docs/.gitbook/assets/pgcat_1.svg b/pgml-cms/docs/.gitbook/assets/pgcat_1.svg new file mode 100644 index 000000000..213b7528f --- /dev/null +++ b/pgml-cms/docs/.gitbook/assets/pgcat_1.svg @@ -0,0 +1,57 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/pgml-cms/docs/.gitbook/assets/pgcat_2.png b/pgml-cms/docs/.gitbook/assets/pgcat_2.png new file mode 100644 index 000000000..1d415069a Binary files /dev/null and b/pgml-cms/docs/.gitbook/assets/pgcat_2.png differ diff --git a/pgml-cms/docs/.gitbook/assets/pgcat_3.png b/pgml-cms/docs/.gitbook/assets/pgcat_3.png new file mode 100644 index 000000000..5b3e36bb8 Binary files /dev/null and b/pgml-cms/docs/.gitbook/assets/pgcat_3.png differ diff --git a/pgml-cms/docs/.gitbook/assets/pgcat_4.png b/pgml-cms/docs/.gitbook/assets/pgcat_4.png new file mode 100644 index 000000000..54fef38a3 Binary files /dev/null and b/pgml-cms/docs/.gitbook/assets/pgcat_4.png differ diff --git a/pgml-cms/docs/.gitbook/assets/pgcat_5.png b/pgml-cms/docs/.gitbook/assets/pgcat_5.png new file mode 100644 index 000000000..c8f17eb2b Binary files /dev/null and b/pgml-cms/docs/.gitbook/assets/pgcat_5.png differ diff --git a/pgml-cms/docs/.gitbook/assets/pgcat_6.png b/pgml-cms/docs/.gitbook/assets/pgcat_6.png new file mode 100644 index 000000000..201184d9d Binary files /dev/null and b/pgml-cms/docs/.gitbook/assets/pgcat_6.png differ diff --git a/pgml-cms/docs/.gitbook/assets/pgcat_7.png b/pgml-cms/docs/.gitbook/assets/pgcat_7.png new file mode 100644 index 000000000..58ad2a818 Binary files /dev/null and b/pgml-cms/docs/.gitbook/assets/pgcat_7.png differ diff --git a/pgml-cms/docs/README.md b/pgml-cms/docs/README.md index 6e14c3258..bc5ff5462 100644 --- a/pgml-cms/docs/README.md +++ b/pgml-cms/docs/README.md @@ -6,17 +6,17 @@ description: The key concepts that make up PostgresML. PostgresML is a complete MLOps platform built on PostgreSQL. Our operating principle is: -> _Move the models to the database, rather than constantly moving the data to the models._ +> _Move models to the database, rather than constantly moving data to the models._ -The data for ML & AI systems is inherently larger and more dynamic than the models. It's more efficient, manageable and reliable to move the models to the database, rather than continuously moving data to the models. +Data for ML & AI systems is inherently larger and more dynamic than the models. It's more efficient, manageable and reliable to move models to the database, rather than continuously moving data to the models. 
## AI engine PostgresML allows you to take advantage of the fundamental relationship between data and models, by extending the database with the following capabilities: * **Model Serving** - GPU accelerated inference engine for interactive applications, with no additional networking latency or reliability costs -* **Model Store** - Access to open-source models including state of the art LLMs from HuggingFace, and track changes in performance between versions -* **Model Training** - Train models with your application data using more than 50 algorithms for regression, classification or clustering tasks; fine tune pre-trained models like LLaMA and BERT to improve performance +* **Model Store** - Access to open-source models including state of the art LLMs from Hugging Face, and track changes in performance between versions +* **Model Training** - Train models with your application data using more than 50 algorithms for regression, classification or clustering tasks; fine tune pre-trained models like Llama and BERT to improve performance * **Feature Store** - Scalable access to model inputs, including vector, text, categorical, and numeric data: vector database, text search, knowledge graph and application data all in one low-latency system
Machine Learning Infrastructure (2.0) by a16z

PostgresML handles all of the functions described by a16z

@@ -34,14 +34,14 @@ The PostgresML team also provides [native language SDKs](https://github.com/post While using the SDK is completely optional, SDK clients can perform advanced machine learning tasks in a single SQL request, without having to transfer additional data, models, hardware or dependencies to the client application. -Use cases include: +Some of the use cases include: * Chat with streaming responses from state-of-the-art open source LLMs * Semantic search with keywords and embeddings * RAG in a single request without using any third-party services * Text translation between hundreds of languages * Text summarization to distill complex documents -* Forecasting timeseries data for key metrics with and metadata +* Forecasting time series data for key metrics and metadata * Anomaly detection using application data ## Our mission diff --git a/pgml-cms/docs/SUMMARY.md b/pgml-cms/docs/SUMMARY.md index c278fadb1..3588b3f1d 100644 --- a/pgml-cms/docs/SUMMARY.md +++ b/pgml-cms/docs/SUMMARY.md @@ -3,7 +3,7 @@ ## Introduction * [Overview](README.md) -* [Getting Started](introduction/getting-started/README.md) +* [Getting started](introduction/getting-started/README.md) * [Create your database](introduction/getting-started/create-your-database.md) * [Connect your app](introduction/getting-started/connect-your-app.md) * [Import your data](introduction/getting-started/import-your-data/README.md) @@ -52,12 +52,12 @@ ## Product -* [Cloud Database](product/cloud-database/README.md) +* [Cloud database](product/cloud-database/README.md) * [Serverless](product/cloud-database/serverless.md) * [Dedicated](product/cloud-database/dedicated.md) * [Enterprise](product/cloud-database/plans.md) -* [Vector Database](product/vector-database.md) -* [PgCat Proxy](product/pgcat/README.md) +* [Vector database](product/vector-database.md) +* [PgCat pooler](product/pgcat/README.md) * [Features](product/pgcat/features.md) * [Installation](product/pgcat/installation.md) * [Configuration](product/pgcat/configuration.md) diff --git a/pgml-cms/docs/introduction/getting-started/README.md b/pgml-cms/docs/introduction/getting-started/README.md index ec0997468..cde0c6d3a 100644 --- a/pgml-cms/docs/introduction/getting-started/README.md +++ b/pgml-cms/docs/introduction/getting-started/README.md @@ -2,16 +2,18 @@ description: Setup a database and connect your application to PostgresML --- -# Getting Started +# Getting started -A PostgresML deployment consists of multiple components working in concert to provide a complete Machine Learning platform. We provide a fully managed solution in [our cloud](create-your-database), and document a self-hosted installation in [Developer Docs](/docs/resources/developer-docs/quick-start-with-docker).
+A PostgresML deployment consists of multiple components working in concert to provide a complete Machine Learning platform: -* PostgreSQL database, with `pgml`, `pgvector` and many other extensions installed, including backups, metrics, logs, replicas and high availability -* PgCat pooler to provide secure access and model load balancing across thousands of clients -* A web application to manage deployed models and share experiments and analysis in SQL notebooks +* PostgreSQL database, with `pgml`, `pgvector` and many other extensions that add features useful in day-to-day and machine learning use cases +* [PgCat pooler](/docs/product/pgcat/) to load balance thousands of concurrent client requests across several database instances +* A web application to manage deployed models and share experiments and analysis with SQL notebooks -
PostgresML architecture
+We provide a fully managed solution in [our cloud](create-your-database), and document a self-hosted installation in the [Developer Docs](/docs/resources/developer-docs/quick-start-with-docker). + +
PostgresML architecture
By building PostgresML on top of a mature database, we get reliable backups for model inputs and proven scalability without reinventing the wheel, so that we can focus on providing access to the latest developments in open source machine learning and artificial intelligence. -This guide will help you get started with a generous free account, that includes access to GPU accelerated models and 5 GB of storage, or you can skip to our [Developer Docs](/docs/resources/developer-docs/quick-start-with-docker) to see how to run PostgresML locally with our Docker image. +This guide will help you get started with a generous [free account](create-your-database), that includes access to GPU accelerated models and 5 GB of storage, or you can skip to our [Developer Docs](/docs/resources/developer-docs/quick-start-with-docker) to see how to run PostgresML locally with our Docker image. diff --git a/pgml-cms/docs/product/pgcat/README.md b/pgml-cms/docs/product/pgcat/README.md index 04fdd76a2..f92de63bd 100644 --- a/pgml-cms/docs/product/pgcat/README.md +++ b/pgml-cms/docs/product/pgcat/README.md @@ -2,10 +2,48 @@ description: Nextgen PostgreSQL Pooler --- -# PgCat +# PgCat pooler -PgCat is PostgreSQL connection pooler and proxy which scales PostgresML deployments. It supports read/write query separation, multiple replicas, automatic traffic distribution and load balancing, sharding, and many more features expected out of high availability enterprise grade Postgres databases. +
+
+
+ PgCat logo +
+
+
+
+

PgCat is a PostgreSQL connection pooler and proxy which scales PostgreSQL (and PostgresML) databases beyond a single instance.

+

+ It supports replicas, load balancing, sharding, failover, and many more features expected of a high-availability, enterprise-grade PostgreSQL deployment.

+

+ Written in Rust using Tokio, it takes advantage of multiple CPUs and the safety and performance guarantees of the Rust language. +

+
+
-Written in Rust and powered by Tokio, it takes advantage of multiple CPUs, and the safety and performance guarantees of the Rust language. -PgCat, like PostgresML, is free and open source, distributed under the MIT license. It's currently running in our Cloud, powering both Serverless and Dedicated databases. +PgCat, like PostgresML, is free and open source, distributed under the MIT license. It's currently running in our [cloud](https://postgresml.org/signup), powering both Serverless and Dedicated databases. + +## [Features](features) + +PgCat implements the PostgreSQL wire protocol and can understand and optimally route queries & transactions based on their characteristics. For example, if your database deployment consists of a primary and replica, PgCat can send all `SELECT` queries to the replica, and all other queries to the primary, creating a read/write traffic separation. + +
+ PgCat architecture +
PgCat deployment at scale
+
+ +
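To make the primary-and-replica example above concrete, the following is a minimal sketch of how such a deployment might be described to PgCat in its TOML configuration. The key names mirror the `pgcat.toml.example` file shipped in the PgCat repository; the hosts, ports and database name are placeholders, and the settings available can differ between PgCat versions, so treat this as an illustration rather than a reference.

```toml
# Illustrative pool definition: one primary and one replica behind a single PgCat endpoint.
[pools.mydb]
# Let PgCat parse incoming queries so SELECTs can be routed to the replica.
query_parser_enabled = true

[pools.mydb.shards.0]
# Each entry is [host, port, role]; roles are "primary" or "replica".
servers = [
    ["10.0.0.1", 5432, "primary"],
    ["10.0.0.2", 5432, "replica"],
]
database = "mydb"
```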
+ +If you have more than one primary, sharded with either the Postgres hashing algorithm or a custom sharding function, PgCat can parse queries, extract the sharding key, and route the query to the correct shard without requiring any modifications on the client side. + +PgCat has many more features which are more thoroughly described in the [PgCat features](features) section. + +## [Installation](installation) + +PgCat is open source and available from our [GitHub repository](https://github.com/postgresml/pgcat) and, if you're running Ubuntu 22.04, from our Aptitude repository. You can read more about how to install PgCat in the [installation](installation) section. + +## [Configuration](configuration) + +PgCat, like many other PostgreSQL poolers, has its own configuration file format (it's written in Rust, so of course we use TOML). The settings and their meaning are documented in the [configuration](configuration) section. diff --git a/pgml-cms/docs/product/pgcat/features.md b/pgml-cms/docs/product/pgcat/features.md index 6cedd3e05..df09649cb 100644 --- a/pgml-cms/docs/product/pgcat/features.md +++ b/pgml-cms/docs/product/pgcat/features.md @@ -1,44 +1,96 @@ -# Features +# PgCat features PgCat has many features currently in various stages of readiness and development. Most of its features are used in production and at scale. -### Query load balancing +### Query load balancing -PgCat is able to load balance Postgres queries against multiple replicas automatically. Clients connect to a single PgCat instance, which pretends to be a single Postgres database, and can issue as many queries as they need. The queries are then evenly distributed to all available replicas using configurable load balancing strategies. +
+
+
+ PgCat load balancing +
+
+
+

PgCat can automatically load balance Postgres queries between multiple replicas. Clients connect to a single PgCat instance, which pretends to be a Postgres database, while the pooler manages its own connections to the replicas.

+

The queries are evenly distributed to all available servers using one of the three supported load balancing strategies: random, round robin, or least active connections.

+

Random load balancing picks a replica using a random number generator. Round robin counts queries and sends them to replicas in order. Least active connections picks the replica with the least number of actively running queries.

+
+
-### High availability +Which load balancing strategy to choose depends on the workload and the number of replicas. Random, on average, is the most fair strategy, and we recommended it for most workloads. -Just like any other modern load balancer, PgCat supports healthchecks and failover. PgCat maintains an internal map of healthy and unhealthy replicas, and routes traffic only to the healthy ones. +Round robin assumes all queries have equal cost and all replicas have equal capacity to service requests. If that's the case, round robin can improve workload distribution over random query distribution. -All replicas are periodically checked, and if they are responding, placed into the healthy pool. If the healthcheck fails, they are removed from that pool for a configurable amount of time, until they are checked again. This allows PgCat to run independently of any other Postgres management system and make decisions based on its own internal knowledge or configuration. +Least active connections assumes queries have different costs and replicas have different capacity, and could improve performance over round robin, by evenly spreading the load across replicas of different sizes. -### Read/write query separation +### High availability -Postgres is typically deployed in a one primary and many replicas architecture, where write queries go to a single primary, and read queries are distributed to either all machines or just the read replicas. PgCat can inspect incoming queries, parse the SQL to determine if the query intends to read or write, and route the query to either the primary or the replicas, as needed. +
+
+
+ PgCat high availability +
+
+
+

Just like any other modern load balancer, PgCat supports health checks and failover. It maintains an internal map of healthy and unavailable replicas, and makes sure queries are only routed to healthy instances.

+

If a replica fails a health check, it is banned from serving additional traffic for a configurable amount of time. This significantly reduces errors in production when instance hardware inevitably fails.

+

Broken replicas are checked again after the traffic ban expires, and if they continue to fail, are prevented from serving queries. If a replica is permanently down, it's best to remove it from the configuration to avoid any intermittent errors.

+
+
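The ban-and-recheck cycle described above is controlled by a handful of settings in `pgcat.toml`. A minimal sketch follows, assuming the setting names used by the example configuration in the PgCat repository; the values and even the units may differ between releases, so verify them against the `pgcat.toml.example` that ships with your version.

```toml
[general]
# How long to wait for a health check before marking a server as failed (assumed milliseconds).
healthcheck_timeout = 1000
# How often idle servers are health checked (assumed milliseconds).
healthcheck_delay = 30000
# How long a failed server is banned from receiving traffic (assumed seconds).
ban_time = 60
```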
+ +High availability is important for production deployments because database errors are typically not recoverable. The only way to have a working application is to have a running database; placing PgCat in front of multiple machines increases the overall availability of the system. -This allows for much simpler application configuration and opens up at scale deployments to all application frameworks, which currently require developers to manually route queries (e.g. Rails, Django, and others). +### Read/write query separation -### Multithreading +
+
+
+ PgCat read/write separation +
+
+
+

A typical application reads data much more frequently than it writes it. To help scale read workloads, PostgreSQL deployments add read replicas which can serve SELECT queries.

+

PgCat can inspect queries and determine whether the query is a SELECT, which most of the time reads data, or a write query like an INSERT or UPDATE.

+

If PgCat is configured with both the primary and replicas, it will route all read queries to the replicas, while making sure write queries are sent to the primary.

+
+
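In configuration terms, this behavior is usually controlled per pool rather than per client. The sketch below uses setting names taken from the PgCat example configuration and is only an illustration under that assumption; check which settings your PgCat version actually supports.

```toml
[pools.mydb]
# Inspect each query to decide whether it reads or writes.
query_parser_enabled = true
# Send reads to replicas instead of the primary whenever replicas are available.
primary_reads_enabled = false
# Pick the server role automatically instead of pinning the pool to the primary or a replica.
default_role = "any"
```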
+ +Removing read traffic from the primary can help scale it beyond its normal capacity, and can also help with high availability, as the primary is typically the most loaded instance in a deployment. No application modifications are required to take advantage of this functionality, so ORMs like Rails, Django and others don't need any special configuration or query annotations. -PgCat is written in Rust using Tokio, which gives it the ability to use as many CPUs as are available. This simplifies deployments in environments with large transactional workloads, by requiring only one instance of PgCat per hardware instance. +### Sharding -This architecture allows to offload more work to the pooler which would otherwise would have to be implemented in the clients, without blocking them from accessing the database. For example, if we wanted to perform some CPU-intensive workload per query, we would be able to do so for multiple connections at a time. +
+
+
+ PgCat read/write separation +
+
+
+

Sharding makes it possible to horizontally scale database workloads of all kinds, including writes. The data is evenly split into pieces and each piece is placed onto a different server. The query traffic is then split equally between the shards as application usage increases over time.

+

Since PgCat inspects every query, it's able to extract the sharding key (typically a table column) from the query and route the query to the right shard.

+

Both read and write queries are supported, as long as the sharding key is specified. If that's not the case, PgCat will execute queries against all shards in parallel, combine the results, and return all of them as part of the same request.

+
+
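As a rough illustration, a two-shard pool might be laid out in `pgcat.toml` as sketched below. The key names and the `pg_bigint_hash` function name come from the PgCat example configuration and should be treated as assumptions; the hosts and database names are placeholders.

```toml
[pools.mydb]
# Hash sharding keys the same way PostgreSQL hash partitioning does.
sharding_function = "pg_bigint_hash"

[pools.mydb.shards.0]
servers = [["10.0.1.1", 5432, "primary"]]
database = "mydb_shard_0"

[pools.mydb.shards.1]
servers = [["10.0.1.2", 5432, "primary"]]
database = "mydb_shard_1"
```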
-### Sharding +While multi-shard queries are generally not recommended for scaling typical workloads, they can be very useful in scatter-gather algorithms, like vector similarity search and ranking. Having the ability to talk to multiple servers simultaneously can scale database performance linearly with the size of the data. -Sharding allows to horizontally scale write queries, something that wasn't possible with typical Postgres deployments. PgCat is able to inspect incoming queries, extract the sharding key, hash it, and route the query to the correct primary, without requiring clients to modify their code. +If the sharding key is not readily available, query metadata can be added to instruct PgCat to route the query to a specific shard. This requires the client to add annotations manually, which isn't scalable but can be a good workaround when no other option is available. -PgCat also accepts a custom SQL syntax to override its sharding decisions, e.g. when the clients want to talk to a specific shard and, when clients want full control over sharding, a query comment indicating the desired shard for that query. +### Multithreading -Since PgCat is a proxy, it makes decisions only based on configuration and its internal knowledge of the architecture. Therefore, it doesn't move data around and reshard Postgres clusters. It works in tandem with other tools that shard Postgres, and supports multiple hashing and routing functions, depending on the sharding tool. +PgCat is written in Rust using Tokio, which allows it to use all the CPU cores if more than one is available. This simplifies deployments in environments with large transactional workloads, by requiring only one instance of PgCat per machine. -### Standard features +This architecture makes it possible to offload more work to the pooler that would otherwise have to be implemented in the clients, without blocking access to the database. For example, if we wanted to perform some CPU-intensive work for certain queries, we could do so for multiple client queries concurrently. + +### Additional standard features In addition to novel features that PgCat introduces to Postgres deployments, it supports all the standard features expected from a pooler: -* authentication, multiple users and databases +* Authentication, multiple users and databases * TLS encryption -* live configuration reloading -* statistics and an admin database for pooler management -* transaction and session mode +* Zero downtime configuration changes +* Statistics and an admin database for monitoring and management +* Transaction and session query mode + +and many more. For a full list, take a look at our [GitHub repository](https://github.com/postgresml/pgcat). diff --git a/pgml-cms/docs/product/pgcat/installation.md b/pgml-cms/docs/product/pgcat/installation.md index e7458402b..07248ba4d 100644 --- a/pgml-cms/docs/product/pgcat/installation.md +++ b/pgml-cms/docs/product/pgcat/installation.md @@ -1,39 +1,47 @@ -# Installation +# PgCat installation -If you're using our Cloud, Dedicated databases come with the latest stable version of PgCat, managed deployments, and automatic configuration. +If you're using our [cloud](https://postgresml.org/signup), you're already using PgCat. All databases are using the latest and greatest PgCat version, with automatic updates and monitoring. You can connect directly with your PostgreSQL client libraries and applications, and PgCat will take care of the rest. -PgCat is free and open source, distributed under the MIT license.
You can obtain its source code from our [repository](https://github.com/postgresml/pgcat) in GitHub. It can be installed by building it from source, by installing it from our APT repository, or by running it using our Docker image. +## Open source + +PgCat is free and open source, distributed under the MIT license. You can obtain its source code from our [repository in GitHub](https://github.com/postgresml/pgcat). PgCat can be installed by building it from source, by downloading it from our Aptitude repository, or by using our Docker image. ### Installing from source -To install PgCat from source, you'll need a recent version of the Rust compiler. Once setup, compiling PgCat is as simple as: +To install PgCat from source, you'll need a recent version of the Rust compiler and the C/C++ build toolchain to compile dependencies, like `pg_query`. If you have those installed already, compiling PgCat is as simple as: ``` cargo build --release ``` -which will produce the executable in `target/release/pgcat`. That executable can be placed into a system directory like `/usr/local/bin` and ran as a service or directly via a shell. +This will produce the executable at `target/release/pgcat`, which can be placed into a system directory like `/usr/local/bin` and run as a Systemd service, or launched directly from a shell. -### Installing from APT +### Installing from Aptitude -We are currently building and distributing a Debian package for Ubuntu 22.04 LTS as part of our release process. If you're using that version of Ubuntu, you can add our APT repository into your sources and install PgCat with `apt`: +As part of our regular release process, we are building and distributing a Debian package for Ubuntu 22.04 LTS. If you're using that version of Ubuntu, you can add our Aptitude repository into your sources and install PgCat with `apt`: ``` +echo "deb [trusted=yes] https://apt.postgresml.org $(lsb_release -cs) main" | \ +sudo tee -a /etc/apt/sources.list && \ +sudo apt update && \ sudo apt install pgcat ``` -This will install the executable, a Systemd service called `pgcat`, and a configuration file template `/etc/pgcat.toml.example` which can be modified to your needs. +The Debian package will install the following items: + +- The PgCat executable, placed into `/usr/bin/pgcat` +- A Systemd service definition, placed into `/usr/systemd/system/pgcat.service` +- A configuration file template, placed into `/etc/pgcat.example.toml` -By default, the `pgcat` service will expect a `/etc/pgcat.toml` configuration file, which should be placed there by the user before the service can successfully start. +By default, the `pgcat` service will expect the configuration file to be located in `/etc/pgcat.toml`, so make sure to either write your own, or modify and rename the template before starting the service. ### Running with Docker -We automatically build and release a Docker image with each commit in the `main` branch of our GitHub repository. This image can be used as-is, but does require the user to provide a `pgcat.toml` configuration file. +With each commit to the `main` branch of our [GitHub repository](https://github.com/postgresml/pgcat), we build and release a Docker image. This image can be used as-is, but does require the user to provide a `pgcat.toml` configuration file.
-Assuming you have a `pgcat.toml` file in your current working directory, you can run the latest version of PgCat with just one command: +Assuming you have `pgcat.toml` in your current working directory, you can run the latest version of PgCat with just one command: ```bash docker run \ - -v $(pwd)/pgcat.toml:/etc/pgcat/pgcat.toml \ - ghcr.io/postgresml/pgcat:latest + -v $(pwd)/pgcat.toml:/etc/pgcat/pgcat.toml \ +ghcr.io/postgresml/pgcat:latest ```
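For reference, a minimal `pgcat.toml` to mount into the container could look like the sketch below. It is deliberately hedged: the key names mirror the `pgcat.toml.example` file in the PgCat repository at the time of writing, the credentials, hosts and database names are placeholders, and different releases may require additional settings.

```toml
[general]
host = "0.0.0.0"
# Clients connect to PgCat on this port instead of connecting to Postgres directly.
port = 6432
admin_username = "admin"
admin_password = "admin-password"

[pools.mydb]
# "transaction" or "session", as with other poolers.
pool_mode = "transaction"

[pools.mydb.users.0]
username = "app_user"
password = "app-password"
pool_size = 10

[pools.mydb.shards.0]
# [host, port, role] triplets; point these at your actual database servers.
servers = [["127.0.0.1", 5432, "primary"]]
database = "mydb"
```

With a configuration like this and the container port published (for example by adding `-p 6432:6432` to the `docker run` command above), applications connect to PgCat the same way they would connect to Postgres, e.g. `psql postgres://app_user:app-password@127.0.0.1:6432/mydb`, and PgCat forwards the traffic to the configured servers.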