rstudio · machow · Mar 27, 2023 · Mar 15, 2023
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -169,7 +169,7 @@ jobs:
       - uses: actions/setup-python@v2
 
         with:
-          python-version: 3.8
+          python-version: "3.10"
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip

diff --git a/docs/_toc.yml b/docs/_toc.yml
@@ -5,4 +5,5 @@ format: jb-book
 root: intro
 chapters:
 - file: getting_started
+- file: articles/index.rst
 - file: api/index.rst
diff --git a/docs/articles/customize-pins-metadata.Rmd b/docs/articles/customize-pins-metadata.Rmd
@@ -0,0 +1,97 @@
+---
+jupyter:
+  jupytext:
+    text_representation:
+      extension: .Rmd
+      format_name: rmarkdown
+      format_version: '1.2'
+      jupytext_version: 1.13.6
+  kernelspec:
+    display_name: venv-pins-python
+    language: python
+    name: venv-pins-python
+---
+
+# Using custom metadata
+
+
+
+The `metadata` argument in pins is flexible and can hold any kind of metadata that you can formulate as a `dict` (convertible to JSON).
+In some situations, you may want to read and write with _consistent_ customized metadata;
+you can create functions to wrap `pin_write()` and `pin_read()` for your particular use case.
+
+We'll begin by creating a temporary board for demonstration:
+
+```{python setup}
+import pins
+import pandas as pd
+
+from pprint import pprint
+
+board = pins.board_temp()
+```
+
+
+## A function to store pandas Categoricals
+
+Say you want to store a pandas Categorical object as JSON together with the _categories_ of the categorical in the metadata.
+
+For example, here is a simple categorical and its categories:
+
+```{python}
+some_cat = pd.Categorical(["a", "a", "b"])
+
+some_cat.categories
+```
+
+Notice that the categories attribute is just the unique values in the categorical.
+
+We can write a function wrapping `pin_write()` that holds the categories in metadata, so we can easily re-create the categorical with them.
+
+```{python}
+def pin_write_cat_json(
+    board,
+    x: pd.Categorical,
+    name,
+    **kwargs
+):
+    metadata = {"categories": x.categories.to_list()}  # categories go in metadata
+    json_data = x.tolist()  # the values themselves, as a plain list for JSON
+    board.pin_write(json_data, name=name, type="json", metadata=metadata, **kwargs)
+```
+
+We can use this new function to write a pin as JSON with our specific metadata:
+
+```{python}
+some_cat = pd.Categorical(["a", "a", "b", "c"])
+pin_write_cat_json(board, some_cat, name = "some-cat")
+```
+
+## A function to read categoricals
+
+It's possible to read this pin using the regular `pin_read()` function, but the object we get is no longer a categorical!
+
+```{python}
+board.pin_read("some-cat")
+```
+
+However, notice that if we use `board.pin_meta()`, the information we stored on categories is in the `.user` field.
+
+```{python}
+pprint(
+    board.pin_meta("some-cat")
+)
+```
+
+This enables us to write a special function for reading, to reconstruct the categorical, using the categories stashed in metadata:
+
+```{python}
+def pin_read_cat_json(board, name, version=None, hash=None, **kwargs):
+    data = board.pin_read(name=name, version=version, hash=hash, **kwargs)
+    meta = board.pin_meta(name=name, version=version, **kwargs)
+    return pd.Categorical(data, categories=meta.user["categories"])
+
+pin_read_cat_json(board, "some-cat")
+```
+
+For an example of how this approach is used in a real project, look at how the vetiver package wraps these functions to [write](https://github.com/rstudio/vetiver-python/blob/main/vetiver/pin_read_write.py) and [read](https://github.com/rstudio/vetiver-python/blob/main/vetiver/vetiver_model.py) model binaries as pins.
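
Note on the article above: it states that metadata must be a `dict` that is convertible to JSON. The following is a minimal sketch, using only the standard library, of how a wrapper could check that condition before handing metadata to `pin_write()`. The helper name `check_json_metadata` is hypothetical and is not part of pins; it only illustrates the constraint.

```python
import json


def check_json_metadata(metadata):
    """Return metadata unchanged if it is a dict that serializes to JSON.

    Hypothetical helper for illustration; not part of the pins API.
    """
    if not isinstance(metadata, dict):
        raise TypeError("metadata must be a dict")
    try:
        # json.dumps raises TypeError for unsupported types such as set.
        json.dumps(metadata)
    except (TypeError, ValueError) as err:
        raise ValueError(f"metadata is not JSON-serializable: {err}") from err
    return metadata


# A plain list of strings serializes without trouble.
ok = check_json_metadata({"categories": ["a", "b", "c"]})

# A set does not, so the helper raises before anything would be written.
try:
    check_json_metadata({"categories": {"a", "b"}})
    failed = False
except ValueError:
    failed = True
```

In the article's `pin_write_cat_json`, the `x.categories.to_list()` call plays a similar role: it turns the pandas `Index` of categories into a plain list, which is what makes the metadata serializable.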
diff --git a/docs/articles/index.rst b/docs/articles/index.rst
@@ -0,0 +1,5 @@
+Articles
+========
+
+.. toctree::
+   customize-pins-metadata.Rmd