Skip to content

Commit

Permalink
Highlight create_distributed_function (#892)
Browse files Browse the repository at this point in the history
  • Loading branch information
jonels-msft committed Jan 10, 2020
1 parent 2bef61b commit cdb1272
Show file tree
Hide file tree
Showing 2 changed files with 48 additions and 5 deletions.
2 changes: 2 additions & 0 deletions get_started/concepts.rst
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,8 @@ When you use Citus, the coordinator node you connect to and interact with is a r

Creating standard PostgreSQL tables is easy because it's the default. It’s what you get when you run CREATE TABLE. In almost every Citus deployment we see standard PostgreSQL tables co-existing with distributed and reference tables. Indeed, Citus itself uses local tables to hold cluster metadata, as mentioned earlier.

.. _shards:

Shards
------

Expand Down
51 changes: 46 additions & 5 deletions get_started/tutorial_multi_tenant.rst
Original file line number Diff line number Diff line change
Expand Up @@ -107,6 +107,9 @@ In this case, we will shard all the tables on the :code:`company_id`.

.. code-block:: sql
-- before distributing tables, enable some extra features
SET citus.replication_model = 'streaming';
SELECT create_distributed_table('companies', 'id');
SELECT create_distributed_table('campaigns', 'company_id');
SELECT create_distributed_table('ads', 'company_id');
Expand Down Expand Up @@ -147,17 +150,55 @@ If you want to double the budget for all the campaigns of a company, you can run
WHERE company_id = 5;
Another example of such an operation would be to run transactions which span multiple tables. Let's
say you want to delete a campaign and all its associated ads, you could do it atomically by running.
say you want to delete a campaign and all its associated ads, you could do it atomically by running:

.. code-block:: sql
BEGIN;
DELETE from campaigns where id = 46 AND company_id = 5;
DELETE from ads where campaign_id = 46 AND company_id = 5;
DELETE FROM campaigns WHERE id = 46 AND company_id = 5;
DELETE FROM ads WHERE campaign_id = 46 AND company_id = 5;
COMMIT;
Other than transactional operations, you can also run analytics queries on this data using standard SQL.
One interesting query for a company to run would be to see details about its campaigns with maximum budget.
Each statement in a transactions causes roundtrips between the coordinator and
workers in Citus. For multi-tenant workloads, it's more efficient to run
transactions in distributed functions. The efficiency gains become more
apparent for larger transactions, but we can use the small transaction above as
an example.

First create a function that does the deletions:

.. code-block:: postgres
CREATE OR REPLACE FUNCTION
delete_campaign(company_id int, campaign_id int)
RETURNS void LANGUAGE plpgsql AS $fn$
BEGIN
DELETE FROM campaigns
WHERE id = $2 AND campaigns.company_id = $1;
DELETE FROM ads
WHERE ads.campaign_id = $2 AND ads.company_id = $1;
END;
$fn$;
Next use :ref:`create_distributed_function` to instruct Citus to run the
function directly on workers rather than on the coordinator. It will run the
function on whatever worker holds the :ref:`shards` for tables ``ads`` and
``campaigns`` corresponding to the value ``company_id``. (This feature
requires the ``citus.replication_model`` change we made earlier.)

.. code-block:: sql
SELECT create_distributed_function(
'delete_campaign(int, int)', 'company_id',
colocate_with := 'campaigns'
);
-- you can run the function as usual
SELECT delete_campaign(5, 46);
Besides transactional operations, you can also run analytics queries using
standard SQL. One interesting query for a company to run would be to see
details about its campaigns with maximum budget.

.. code-block:: sql
Expand Down

0 comments on commit cdb1272

Please sign in to comment.