From ab14ff299579e80221dab7ff4626b516c599e7f1 Mon Sep 17 00:00:00 2001 From: Bharath Swamy Date: Sun, 18 May 2025 17:13:52 -0700 Subject: [PATCH 1/5] add(notebooks): python udf template --- notebooks/python-udf-template/meta.toml | 13 + notebooks/python-udf-template/notebook.ipynb | 258 +++++++++++++++++++ 2 files changed, 271 insertions(+) create mode 100644 notebooks/python-udf-template/meta.toml create mode 100644 notebooks/python-udf-template/notebook.ipynb diff --git a/notebooks/python-udf-template/meta.toml b/notebooks/python-udf-template/meta.toml new file mode 100644 index 0000000..65660ec --- /dev/null +++ b/notebooks/python-udf-template/meta.toml @@ -0,0 +1,13 @@ +[meta] +authors=["singlestore"] +title="Run your first Python UDF" +description="""\ + Learn how to connect to create and\ + publish a python UDF and call it in SQL. + """ +icon="browser" +difficulty="beginner" +tags=["starter", "notebooks", "python"] +lesson_areas=[] +destinations=["spaces"] +minimum_tier="free-shared" diff --git a/notebooks/python-udf-template/notebook.ipynb b/notebooks/python-udf-template/notebook.ipynb new file mode 100644 index 0000000..08f8583 --- /dev/null +++ b/notebooks/python-udf-template/notebook.ipynb @@ -0,0 +1,258 @@ +{ + "cells": [ + { + "id": "1ae3a481", + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + "
\n", + " \n", + "
\n", + "
\n", + "
SingleStore Notebooks
\n", + "

Run your first Python UDF

\n", + "
\n", + "
" + ] + }, + { + "id": "bd0ae268", + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " \n", + "
\n", + "

Note

\n", + "

This notebook can be run on a Free Starter Workspace. To create a Free Starter Workspace, navigate to Start using the left nav. You can also use your existing Standard or Premium workspace with this Notebook.

\n", + "
\n", + "
" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "bcb6e6a7", + "metadata": {}, + "source": [ + "This Jupyter notebook will help you build your first Python UDF using Notebooks, registering it with your database and calling it as part of SQL query." + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "5776ded1", + "metadata": {}, + "source": [ + "## Create some simple tables\n", + "\n", + "This setup establishes a basic relational structure to store some reviews for restaurants. Ensure you have selected a database." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "2bbf6a44", + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "DROP TABLE IF EXISTS reviews;\n", + "\n", + "CREATE TABLE IF NOT EXISTS\n", + "reviews (\n", + " review_id INT PRIMARY KEY,\n", + " store_name VARCHAR(255) NOT NULL,\n", + " review TEXT NOT NULL\n", + ");" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Insert sample data" + ], + "id": "3aace2e9" + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "%%sql INSERT into reviews (review_id, store_name, review) values\n", + "(\"1\", \"Single Pizza\", \"The staff were very respectful and made thoughtful suggestions. I will definitely go again. 10/10!\"),\n", + "(\"2\", \"Single Pizza\", \"The food was absolutely amazing and the service was fantastic!\"),\n", + "(\"3\", \"Single Pizza\", \"The experience was terrible. The food was cold and the waiter was rude.\"),\n", + "(\"4\", \"Single Pizza\", \"I loved the ambiance and the desserts were out of this world!\"),\n", + "(\"5\", \"Single Pizza\", \"Not worth the price. 
I expected more based on the reviews\");" + ], + "id": "0a123cd7" + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "d58c8382", + "metadata": {}, + "source": [ + "## Define PythonUDF functions\n", + "\n", + "Next, we will be Python UDF function using the `@udf` annotation. We will be using the `VADER` model of `nltk` library to perform sentiment analysis on the review text." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "!pip install nltk" + ], + "id": "1556ad3c" + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "f3f3b047", + "metadata": {}, + "outputs": [], + "source": [ + "from singlestoredb.functions import udf\n", + "import nltk\n", + "from nltk.sentiment import SentimentIntensityAnalyzer\n", + "\n", + "nltk.download('vader_lexicon')\n", + "sia = SentimentIntensityAnalyzer()\n", + "\n", + "@udf\n", + "def review_sentiment(review: str) -> str:\n", + " print(\"review:\" + review)\n", + " scores = sia.polarity_scores(review)\n", + " sentiment = (\n", + " \"Positive\" if scores['compound'] > 0.05 else\n", + " \"Negative\" if scores['compound'] < -0.05 else\n", + " \"Neutral\"\n", + " )\n", + " print(\"sentiment:\" + sentiment)\n", + " return sentiment" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "40e2ad59", + "metadata": {}, + "source": [ + "## Start the Python UDF server\n", + "\n", + "This will start the server as well as register all the functions annotated with `@udf` as external user defined functions on your selected database." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "ed4b22cd", + "metadata": {}, + "outputs": [], + "source": [ + "import singlestoredb.apps as apps\n", + "connection_info = await apps.run_udf_app(replace_existing=True)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## List all registered UDFs\n", + "\n", + "In interactive notebooks, the udf function will be suffixed with `_test` to differentiate it from the published version" + ], + "id": "b53cd3d1" + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "%%sql\n", + "SHOW functions" + ], + "id": "6008982d" + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Call the UDF from SQL\n", + "\n", + "You will now be able to run queries like\n", + "\n", + "```\n", + "SELECT review_id, store_name, review, review_sentiment_test(review) from reviews order by review_id;\n", + "```\n", + "from the SQL editor or any other SQL client." + ], + "id": "58560b03" + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "4a825f0d", + "metadata": {}, + "source": [ + "## Publish Python UDF\n", + "\n", + "After validating the Python UDF interactively, you can publish it and access it like\n", + "\n", + "```\n", + "%%sql\n", + "SELECT review_id, store_name, review, review_sentiment(review) from reviews order by review_id\n", + "```\n", + "\n", + "enriching your data exploration experience seamlessly!" + ] + }, + { + "id": "b6c75678", + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + "
" + ] + } + ], + "metadata": { + "jupyterlab": { + "notebooks": { + "version_major": 6, + "version_minor": 4 + } + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From 0b3818ca494d4f9f7b8649dbced852f5064b4f1f Mon Sep 17 00:00:00 2001 From: Bharath Swamy Date: Sun, 18 May 2025 17:32:52 -0700 Subject: [PATCH 2/5] add(author): entry for bharath --- authors/bharath-swamy.toml | 4 ++++ 1 file changed, 4 insertions(+) create mode 100644 authors/bharath-swamy.toml diff --git a/authors/bharath-swamy.toml b/authors/bharath-swamy.toml new file mode 100644 index 0000000..1889ad9 --- /dev/null +++ b/authors/bharath-swamy.toml @@ -0,0 +1,4 @@ +name="Bharath Swamy" +title="Product Team" +image="singlestore" +external=false From 1343c11d9ec4e368a923bd5f13b59fd235799b5c Mon Sep 17 00:00:00 2001 From: Bharath Swamy Date: Sun, 18 May 2025 17:36:24 -0700 Subject: [PATCH 3/5] update(python-udf-template): author --- notebooks/python-udf-template/meta.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/notebooks/python-udf-template/meta.toml b/notebooks/python-udf-template/meta.toml index 65660ec..b0d2528 100644 --- a/notebooks/python-udf-template/meta.toml +++ b/notebooks/python-udf-template/meta.toml @@ -1,5 +1,5 @@ [meta] -authors=["singlestore"] +authors=["bharath-swamy"] title="Run your first Python UDF" description="""\ Learn how to connect to create and\ From c0231812c7f7180ca9b76879a087f1f22973199a Mon Sep 17 00:00:00 2001 From: Kevin D Smith Date: Mon, 19 May 2025 09:16:46 -0500 Subject: [PATCH 4/5] Update notebook.ipynb --- notebooks/python-udf-template/notebook.ipynb | 2 +- 1 file changed, 1 
insertion(+), 1 deletion(-) diff --git a/notebooks/python-udf-template/notebook.ipynb b/notebooks/python-udf-template/notebook.ipynb index 08f8583..a01962b 100644 --- a/notebooks/python-udf-template/notebook.ipynb +++ b/notebooks/python-udf-template/notebook.ipynb @@ -98,7 +98,7 @@ "id": "d58c8382", "metadata": {}, "source": [ - "## Define PythonUDF functions\n", + "## Define Python UDF functions\n", "\n", "Next, we will be Python UDF function using the `@udf` annotation. We will be using the `VADER` model of `nltk` library to perform sentiment analysis on the review text." ] From 87ac873f100eb808e158f143ea10e18fedd63277 Mon Sep 17 00:00:00 2001 From: Bharath Swamy Date: Mon, 19 May 2025 12:35:02 -0700 Subject: [PATCH 5/5] update(python-udf-template): add directions on how to run the udf --- notebooks/python-udf-template/notebook.ipynb | 26 ++++++++++---------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/notebooks/python-udf-template/notebook.ipynb b/notebooks/python-udf-template/notebook.ipynb index a01962b..7bd7c73 100644 --- a/notebooks/python-udf-template/notebook.ipynb +++ b/notebooks/python-udf-template/notebook.ipynb @@ -71,15 +71,16 @@ { "attachments": {}, "cell_type": "markdown", + "id": "3aace2e9", "metadata": {}, "source": [ "## Insert sample data" - ], - "id": "3aace2e9" + ] }, { "cell_type": "code", "execution_count": 2, + "id": "0a123cd7", "metadata": {}, "outputs": [], "source": [ @@ -89,8 +90,7 @@ "(\"3\", \"Single Pizza\", \"The experience was terrible. The food was cold and the waiter was rude.\"),\n", "(\"4\", \"Single Pizza\", \"I loved the ambiance and the desserts were out of this world!\"),\n", "(\"5\", \"Single Pizza\", \"Not worth the price. 
I expected more based on the reviews\");" - ], - "id": "0a123cd7" + ] }, { "attachments": {}, @@ -106,12 +106,12 @@ { "cell_type": "code", "execution_count": 3, + "id": "1556ad3c", "metadata": {}, "outputs": [], "source": [ "!pip install nltk" - ], - "id": "1556ad3c" + ] }, { "cell_type": "code", @@ -165,28 +165,29 @@ { "attachments": {}, "cell_type": "markdown", + "id": "b53cd3d1", "metadata": {}, "source": [ "## List all registered UDFs\n", "\n", "In interactive notebooks, the udf function will be suffixed with `_test` to differentiate it from the published version" - ], - "id": "b53cd3d1" + ] }, { "cell_type": "code", "execution_count": 6, + "id": "6008982d", "metadata": {}, "outputs": [], "source": [ "%%sql\n", "SHOW functions" - ], - "id": "6008982d" + ] }, { "attachments": {}, "cell_type": "markdown", + "id": "58560b03", "metadata": {}, "source": [ "## Call the UDF from SQL\n", @@ -196,9 +197,8 @@ "```\n", "SELECT review_id, store_name, review, review_sentiment_test(review) from reviews order by review_id;\n", "```\n", - "from the SQL editor or any other SQL client." - ], - "id": "58560b03" + "from the SQL editor or any other SQL client. Try it out by opening another notebook, selecting the current database, and running this query in a new cell." + ] }, { "attachments": {},