Merge pull request #16 from hablapps/feature/prefix_suffix_std_skew_count

Addition of skew, add_prefix, add_suffix, count, std functions
KxSystems · Jan 19, 2024 · 0877f15
2 parents 720bb41 + cbff0f2
commit 0877f15

Showing 4 changed files with 531 additions and 0 deletions.
diff --git a/docs/user-guide/advanced/Pandas_API.ipynb b/docs/user-guide/advanced/Pandas_API.ipynb
@@ -646,6 +646,110 @@
     "tab.mode(dropna=False)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "f5c66579",
+   "metadata": {},
+   "source": [
+    "### Table.std()\n",
+    "\n",
+    "```\n",
+    "Table.std(axis=0, skipna=True, numeric_only=False, ddof=1)\n",
+    "```\n",
+    "\n",
+    "Return the sample standard deviation over the requested axis. The result is normalized by N-1 by default; this can be changed using the `ddof` argument.\n",
+    "\n",
+    "\n",
+    "**Parameters:**\n",
+    "\n",
+    "| Name         | Type | Description                                                                      | Default |\n",
+    "| :----------: | :--: | :------------------------------------------------------------------------------- | :-----: |\n",
+    "| axis         | int  | The axis to calculate the standard deviation across: 0 is columns, 1 is rows.    | 0       |\n",
+    "| skipna       | bool | Not yet implemented.                                                             | True    |\n",
+    "| numeric_only | bool | Only use columns of the table that are of a numeric data type.                   | False   |\n",
+    "| ddof         | int  | Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N is the number of elements. | 1       |\n",
+    "\n",
+    "**Returns:**\n",
+    "\n",
+    "| Type               | Description                                                          |\n",
+    "| :----------------: | :------------------------------------------------------------------- |\n",
+    "| Table         | The std across each row / column with the key corresponding to the row number or column name. |"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c2767afd",
+   "metadata": {},
+   "source": [
+    "**Examples:**\n",
+    "\n",
+    "Calculate the std across the columns of a table"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "87b94fd0",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tab = kx.Table(data=\n",
+    "    {\n",
+    "        'a': [1, 2, 2, 4],\n",
+    "        'b': [1, 2, 6, 7],\n",
+    "        'c': [7, 8, 9, 10],\n",
+    "        'd': [7, 11, 14, 14]\n",
+    "    }\n",
+    ")\n",
+    "tab"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3e54d557",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tab.std()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "14950833",
+   "metadata": {},
+   "source": [
+    "Calculate the std across the rows of a table"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f19161ed",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tab.std(axis=1)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a8ea5a38",
+   "metadata": {},
+   "source": [
+    "Calculate the std across columns with ddof=0:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6361dcb7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tab.std(ddof=0)"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "7e2813b4",
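For readers of the `Table.std()` documentation added in the hunk above, the role of `ddof` can be sketched in plain Python. This is an illustrative sketch only and not PyKX's implementation: `std_with_ddof` is a hypothetical helper, and the input is column `a` of the example table.

```python
import math


def std_with_ddof(values, ddof=1):
    """Standard deviation using the divisor N - ddof.

    Hypothetical helper for illustration; it is not part of PyKX.
    """
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more than ddof values")
    mean = sum(values) / n
    squared_dev = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_dev / (n - ddof))


col_a = [1, 2, 2, 4]                      # column 'a' from the example table
sample_std = std_with_ddof(col_a)         # divisor N - 1 = 3 (the default)
population_std = std_with_ddof(col_a, 0)  # divisor N = 4
print(sample_std, population_std)
```

With `ddof=0` the divisor is larger, so the result is smaller than the default sample estimate for the same data.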
@@ -1813,6 +1917,136 @@
     "df.astype({'c4':kx.SymbolVector, 'c5':kx.SymbolVector})"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "0f8813a0",
+   "metadata": {},
+   "source": [
+    "### Table.add_prefix()\n",
+    "\n",
+    "```\n",
+    "Table.add_prefix(prefix, axis=0)\n",
+    "```\n",
+    "\n",
+    "Rename the columns of a table by adding a prefix and return the resulting Table object.\n",
+    "\n",
+    "**Parameters:**\n",
+    "\n",
+    "| Name    | Type            | Description                                                         | Default    |\n",
+    "| :-----: | :-------------: | :------------------------------------------------------------------ | :--------: |\n",
+    "| prefix  | str             | The string prepended to each column name.                           | _required_ |\n",
+    "| axis    | int             | Axis to add the prefix on. Only 1 (columns) is currently implemented; 0 raises a `nyi` error. | 0          |\n",
+    "\n",
+    "**Returns:**\n",
+    "\n",
+    "| Type  | Description                                                        |\n",
+    "| :---: | :----------------------------------------------------------------- |\n",
+    "| Table | A table with its columns renamed by adding the prefix.             |"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9186ed86",
+   "metadata": {},
+   "source": [
+    "**Examples:**\n",
+    "\n",
+    "The initial table whose columns will be given a prefix:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5f20131b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tab.head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "73c2b08f",
+   "metadata": {},
+   "source": [
+    "Add \"col_\" to table columns:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "926c8295",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tab.add_prefix(prefix=\"col_\", axis=1).head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0a4abc8c",
+   "metadata": {},
+   "source": [
+    "### Table.add_suffix()\n",
+    "\n",
+    "```\n",
+    "Table.add_suffix(suffix, axis=0)\n",
+    "```\n",
+    "\n",
+    "Rename the columns of a table by adding a suffix and return the resulting Table object.\n",
+    "\n",
+    "**Parameters:**\n",
+    "\n",
+    "| Name    | Type            | Description                                                         | Default    |\n",
+    "| :-----: | :-------------: | :------------------------------------------------------------------ | :--------: |\n",
+    "| suffix  | str             | The string appended to each column name.                            | _required_ |\n",
+    "| axis    | int             | Axis to add the suffix on. Only 1 (columns) is currently implemented; 0 raises a `nyi` error. | 0          |\n",
+    "\n",
+    "**Returns:**\n",
+    "\n",
+    "| Type  | Description                                                        |\n",
+    "| :---: | :----------------------------------------------------------------- |\n",
+    "| Table | A table with its columns renamed by adding the suffix.             |"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c22262b8",
+   "metadata": {},
+   "source": [
+    "**Examples:**\n",
+    "\n",
+    "The initial table whose columns will be given a suffix:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "55c1f504",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tab.head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b4687851",
+   "metadata": {},
+   "source": [
+    "Add \"_col\" to table columns:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e00d0f5c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tab.add_suffix(suffix=\"_col\", axis=1).head()"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "718584f8",
@@ -2507,6 +2741,82 @@
     "tab.prod(numeric_only=True)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "c87d4f95",
+   "metadata": {},
+   "source": [
+    "### Table.count()\n",
+    "\n",
+    "```\n",
+    "Table.count(axis=0, numeric_only=False)\n",
+    "```\n",
+    "\n",
+    "Returns the count of non-null values across the given axis.\n",
+    "\n",
+    "**Parameters:**\n",
+    "\n",
+    "| Name         | Type | Description                                                                      | Default |\n",
+    "| :----------: | :--: | :------------------------------------------------------------------------------- | :-----: |\n",
+    "| axis         | int  | The axis to count elements across: 0 is columns, 1 is rows.                      | 0       |\n",
+    "| numeric_only | bool | Only use columns of the table that are of a numeric data type.                   | False   |\n",
+    "\n",
+    "**Returns:**\n",
+    "\n",
+    "| Type               | Description                                                          |\n",
+    "| :----------------: | :------------------------------------------------------------------- |\n",
+    "| Dictionary         | A dictionary where the keys represent the column names / row numbers and the values are the result of calling `count` on that column / row. |"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6520c195",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tab.count()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ce85797d",
+   "metadata": {},
+   "source": [
+    "### Table.skew()\n",
+    "\n",
+    "```\n",
+    "Table.skew(axis=0, skipna=True, numeric_only=False)\n",
+    "```\n",
+    "\n",
+    "Returns the skewness of all values across the given axis.\n",
+    "\n",
+    "**Parameters:**\n",
+    "\n",
+    "| Name         | Type | Description                                                                      | Default |\n",
+    "| :----------: | :--: | :------------------------------------------------------------------------------- | :-----: |\n",
+    "| axis         | int  | The axis to calculate the skewness across: 0 is columns, 1 is rows.              | 0       |\n",
+    "| skipna       | bool | Ignore any null values along the axis.                                           | True    |\n",
+    "| numeric_only | bool | Only use columns of the table that are of a numeric data type.                   | False   |\n",
+    "\n",
+    "\n",
+    "**Returns:**\n",
+    "\n",
+    "| Type               | Description                                                          |\n",
+    "| :----------------: | :------------------------------------------------------------------- |\n",
+    "| Dictionary         | A dictionary where the keys represent the column names / row numbers and the values are the result of calling `skew` on that column / row. |"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3fb5dce1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tab.skew(numeric_only=True)"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "499025cb",

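The column renaming and axis handling performed by the `add_prefix` and `add_suffix` methods added to `pandas_indexing.py` in this PR can be sketched in plain Python. This is a rough sketch under stated assumptions, not the PyKX code: `add_affix` is a hypothetical helper that works on a list of column names and returns the old-name to new-name mapping, which is roughly what the q lambdas build before passing it to `xcol`.

```python
def add_affix(columns, affix, axis=0, prefix=True):
    """Return an old-name -> new-name mapping for the given column names.

    Hypothetical helper for illustration; it is not part of PyKX.
    """
    if axis == 1:
        # axis=1 targets the column names, as in the q lambdas in the diff
        return {c: (affix + c if prefix else c + affix) for c in columns}
    elif axis == 0:
        # the diff raises 'nyi' (not yet implemented) for the row axis
        raise ValueError('nyi')
    raise ValueError(f'No axis named {axis}')


cols = ['a', 'b', 'c', 'd']
prefixed = add_affix(cols, 'col_', axis=1)
suffixed = add_affix(cols, '_col', axis=1, prefix=False)
print(prefixed)
print(suffixed)
```

Note that, as in the diff, the default `axis=0` takes the `nyi` branch, so callers need to pass `axis=1` to rename columns.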
diff --git a/src/pykx/pandas_api/pandas_indexing.py b/src/pykx/pandas_api/pandas_indexing.py
@@ -454,6 +454,32 @@ def rename(self, labels=None, index=None, columns=None, axis=0,
 
         return t
 
+    def add_suffix(self, suffix, axis=0):
+        t = self
+        if axis == 1:
+            t = q('''{[s;t]
+                  c:$[99h~type t;cols value@;cols] t;
+                  (c!`$string[c],\\:string s) xcol t
+                  }''', suffix, t)
+        elif axis == 0:
+            raise ValueError('nyi')
+        else:
+            raise ValueError(f'No axis named {axis}')
+        return t
+
+    def add_prefix(self, prefix, axis=0):
+        t = self
+        if axis == 1:
+            t = q('''{[s;t]
+                  c:$[99h~type t;cols value@;cols] t;
+                  (c!`$string[s],/:string[c]) xcol t
+                  }''', prefix, t)
+        elif axis == 0:
+            raise ValueError('nyi')
+        else:
+            raise ValueError(f'No axis named {axis}')
+        return t
+
     def sample(self, n=None, frac=None, replace=False, weights=None,
                random_state=None, axis=None, ignore_index=False):
         if n is None and frac is None: