
Commit

Merge pull request #16 from hablapps/feature/prefix_suffix_std_skew_count

Addition of skew, add_prefix, add_suffix, count, std functions
cmccarthy1 committed Jan 19, 2024
2 parents 720bb41 + cbff0f2 commit 0877f15
Showing 4 changed files with 531 additions and 0 deletions.
310 changes: 310 additions & 0 deletions docs/user-guide/advanced/Pandas_API.ipynb
@@ -646,6 +646,110 @@
"tab.mode(dropna=False)"
]
},
{
"cell_type": "markdown",
"id": "f5c66579",
"metadata": {},
"source": [
"### Table.std()\n",
"\n",
"```\n",
"Table.std(axis=0, skipna=True, numeric_only=False, ddof=1)\n",
"```\n",
"\n",
"Return the sample standard deviation over the requested axis. The result is normalized by N-1 by default; this can be changed using the `ddof` argument.\n",
"\n",
"\n",
"**Parameters:**\n",
"\n",
"| Name | Type | Description | Default |\n",
"| :----------: | :--: | :------------------------------------------------------------------------------- | :-----: |\n",
"| axis | int | The axis to calculate the standard deviation across: 0 is columns, 1 is rows. | 0 |\n",
"| skipna | bool | Not yet implemented. | True |\n",
"| numeric_only | bool | Only use columns of the table that are of a numeric data type. | False |\n",
"| ddof | int | Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. | 1 |\n",
"\n",
"**Returns:**\n",
"\n",
"| Type | Description |\n",
"| :----------------: | :------------------------------------------------------------------- |\n",
"| Table | The standard deviation of each row / column, keyed by the row number or column name. |"
]
},
{
"cell_type": "markdown",
"id": "c2767afd",
"metadata": {},
"source": [
"**Examples:**\n",
"\n",
"Calculate the std across the columns of a table"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "87b94fd0",
"metadata": {},
"outputs": [],
"source": [
"tab = kx.Table(data=\n",
" {\n",
" 'a': [1, 2, 2, 4],\n",
" 'b': [1, 2, 6, 7],\n",
" 'c': [7, 8, 9, 10],\n",
" 'd': [7, 11, 14, 14]\n",
" }\n",
")\n",
"tab"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3e54d557",
"metadata": {},
"outputs": [],
"source": [
"tab.std()"
]
},
{
"cell_type": "markdown",
"id": "14950833",
"metadata": {},
"source": [
"Calculate the std across the rows of a table"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f19161ed",
"metadata": {},
"outputs": [],
"source": [
"tab.std(axis=1)"
]
},
{
"cell_type": "markdown",
"id": "a8ea5a38",
"metadata": {},
"source": [
"Calculate the std across columns with ddof=0:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6361dcb7",
"metadata": {},
"outputs": [],
"source": [
"tab.std(ddof=0)"
]
},
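The `ddof` behaviour documented for `Table.std()` above can be checked by hand. The following is a minimal plain-Python sketch rather than PyKX code: the helper `std_with_ddof` and the `columns` dict are hypothetical names used only for illustration. It reuses the example table's data and divides by N - ddof, as the parameter table describes.

```python
import math
import statistics


def std_with_ddof(values, ddof=1):
    """Standard deviation of `values` using the divisor N - ddof."""
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more than `ddof` values")
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - ddof))


# Same data as the example table in the notebook.
columns = {
    'a': [1, 2, 2, 4],
    'b': [1, 2, 6, 7],
    'c': [7, 8, 9, 10],
    'd': [7, 11, 14, 14],
}

# ddof=1 divides by N-1 (sample), ddof=0 divides by N (population).
sample_std = {name: std_with_ddof(vals) for name, vals in columns.items()}
population_std = {name: std_with_ddof(vals, ddof=0) for name, vals in columns.items()}
```

With `ddof=1` each value agrees with `statistics.stdev`, and with `ddof=0` it agrees with `statistics.pstdev`, which is a quick way to sanity-check output from `tab.std()` and `tab.std(ddof=0)`.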
{
"cell_type": "markdown",
"id": "7e2813b4",
@@ -1813,6 +1917,136 @@
"df.astype({'c4':kx.SymbolVector, 'c5':kx.SymbolVector})"
]
},
{
"cell_type": "markdown",
"id": "0f8813a0",
"metadata": {},
"source": [
"### Table.add_prefix()\n",
"\n",
"```\n",
"Table.add_prefix(prefix, axis=0)\n",
"```\n",
"\n",
"Rename the columns of a table by adding a prefix and return the resulting Table object.\n",
"\n",
"**Parameters:**\n",
"\n",
"| Name | Type | Description | Default |\n",
"| :-----: | :-------------: | :------------------------------------------------------------------ | :--------: |\n",
"| prefix | str | The string prepended to each column name. | _required_ |\n",
"| axis | int | Axis to add the prefix on. 1 renames the columns; 0 is not yet implemented. | 0 |\n",
"\n",
"**Returns:**\n",
"\n",
"| Type | Description |\n",
"| :---: | :----------------------------------------------------------------- |\n",
"| Table | A table with its columns renamed by adding the prefix. |"
]
},
{
"cell_type": "markdown",
"id": "9186ed86",
"metadata": {},
"source": [
"**Examples:**\n",
"\n",
"The initial table whose columns will be given a prefix:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f20131b",
"metadata": {},
"outputs": [],
"source": [
"tab.head()"
]
},
{
"cell_type": "markdown",
"id": "73c2b08f",
"metadata": {},
"source": [
"Add \"col_\" to table columns:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "926c8295",
"metadata": {},
"outputs": [],
"source": [
"tab.add_prefix(prefix=\"col_\", axis=1).head()"
]
},
{
"cell_type": "markdown",
"id": "0a4abc8c",
"metadata": {},
"source": [
"### Table.add_suffix()\n",
"\n",
"```\n",
"Table.add_suffix(suffix, axis=0)\n",
"```\n",
"\n",
"Rename the columns of a table by adding a suffix and return the resulting Table object.\n",
"\n",
"**Parameters:**\n",
"\n",
"| Name | Type | Description | Default |\n",
"| :-----: | :-------------: | :------------------------------------------------------------------ | :--------: |\n",
"| suffix | str | The string appended to each column name. | _required_ |\n",
"| axis | int | Axis to add the suffix on. 1 renames the columns; 0 is not yet implemented. | 0 |\n",
"\n",
"**Returns:**\n",
"\n",
"| Type | Description |\n",
"| :---: | :----------------------------------------------------------------- |\n",
"| Table | A table with its columns renamed by adding the suffix. |"
]
},
{
"cell_type": "markdown",
"id": "c22262b8",
"metadata": {},
"source": [
"**Examples:**\n",
"\n",
"The initial table whose columns will be given a suffix:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "55c1f504",
"metadata": {},
"outputs": [],
"source": [
"tab.head()"
]
},
{
"cell_type": "markdown",
"id": "b4687851",
"metadata": {},
"source": [
"Add \"_col\" to table columns:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e00d0f5c",
"metadata": {},
"outputs": [],
"source": [
"tab.add_suffix(suffix=\"_col\", axis=1).head()"
]
},
{
"cell_type": "markdown",
"id": "718584f8",
@@ -2507,6 +2741,82 @@
"tab.prod(numeric_only=True)"
]
},
{
"cell_type": "markdown",
"id": "c87d4f95",
"metadata": {},
"source": [
"### Table.count()\n",
"\n",
"```\n",
"Table.count(axis=0, numeric_only=False)\n",
"```\n",
"\n",
"Returns the count of non-null values across the given axis.\n",
"\n",
"**Parameters:**\n",
"\n",
"| Name | Type | Description | Default |\n",
"| :----------: | :--: | :------------------------------------------------------------------------------- | :-----: |\n",
"| axis | int | The axis to count elements across: 0 is columns, 1 is rows. | 0 |\n",
"| numeric_only | bool | Only use columns of the table that are of a numeric data type. | False |\n",
"\n",
"**Returns:**\n",
"\n",
"| Type | Description |\n",
"| :----------------: | :------------------------------------------------------------------- |\n",
"| Dictionary | A dictionary where the keys represent the column name / row number and the values are the result of calling `count` on that column / row. |"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6520c195",
"metadata": {},
"outputs": [],
"source": [
"tab.count()"
]
},
{
"cell_type": "markdown",
"id": "ce85797d",
"metadata": {},
"source": [
"### Table.skew()\n",
"\n",
"```\n",
"Table.skew(axis=0, skipna=True, numeric_only=False)\n",
"```\n",
"\n",
"Returns the skewness of all values across the given axis.\n",
"\n",
"**Parameters:**\n",
"\n",
"| Name | Type | Description | Default |\n",
"| :----------: | :--: | :------------------------------------------------------------------------------- | :-----: |\n",
"| axis | int | The axis to calculate the skewness across: 0 is columns, 1 is rows. | 0 |\n",
"| skipna | bool | Ignore any null values along the axis. | True |\n",
"| numeric_only | bool | Only use columns of the table that are of a numeric data type. | False |\n",
"\n",
"\n",
"**Returns:**\n",
"\n",
"| Type | Description |\n",
"| :----------------: | :------------------------------------------------------------------- |\n",
"| Dictionary | A dictionary where the keys represent the column name / row number and the values are the result of calling `skew` on that column / row. |"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3fb5dce1",
"metadata": {},
"outputs": [],
"source": [
"tab.skew(numeric_only=True)"
]
},
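The `count()` and `skew()` descriptions above can be illustrated without PyKX. This diff does not show how PyKX computes skewness, so the sketch below assumes the pandas convention (the bias-corrected Fisher-Pearson coefficient); `count_non_null`, `sample_skew` and the `columns` dict are hypothetical names, and `None` stands in for a null value.

```python
import math


def count_non_null(values):
    """Number of entries that are not null (None in this sketch)."""
    return sum(v is not None for v in values)


def sample_skew(values, skipna=True):
    """Bias-corrected sample skewness (assumed pandas-style convention)."""
    if skipna:
        values = [v for v in values if v is not None]
    elif any(v is None for v in values):
        return None  # a null poisons the result when skipna is False
    n = len(values)
    if n < 3:
        return None  # undefined for fewer than three observations
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no asymmetry
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


columns = {'a': [1, 2, 2, 4], 'b': [1, None, 6, 7]}
counts = {name: count_non_null(vals) for name, vals in columns.items()}
skews = {name: sample_skew(vals) for name, vals in columns.items()}
```

Both results are dictionaries keyed by column name, which mirrors the return shape documented for `Table.count()` and `Table.skew()`.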
{
"cell_type": "markdown",
"id": "499025cb",
26 changes: 26 additions & 0 deletions src/pykx/pandas_api/pandas_indexing.py
@@ -454,6 +454,32 @@ def rename(self, labels=None, index=None, columns=None, axis=0,

        return t

    def add_suffix(self, suffix, axis=0):
        t = self
        if axis == 1:
            t = q('''{[s;t]
                     c:$[99h~type t;cols value@;cols] t;
                     (c!`$string[c],\\:string s) xcol t
                     }''', suffix, t)
        elif axis == 0:
            raise ValueError('nyi')
        else:
            raise ValueError(f'No axis named {axis}')
        return t

    def add_prefix(self, prefix, axis=0):
        t = self
        if axis == 1:
            t = q('''{[s;t]
                     c:$[99h~type t;cols value@;cols] t;
                     (c!`$string[s],/:string[c]) xcol t
                     }''', prefix, t)
        elif axis == 0:
            raise ValueError('nyi')
        else:
            raise ValueError(f'No axis named {axis}')
        return t

    def sample(self, n=None, frac=None, replace=False, weights=None,
               random_state=None, axis=None, ignore_index=False):
        if n is None and frac is None:
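The axis handling in `add_suffix` and `add_prefix` above can be mirrored in plain Python to make the control flow easier to follow. This is an illustrative sketch only: `add_affix` is a hypothetical helper, not part of PyKX, and a dict of lists stands in for a table. As in the diff, only `axis == 1` renames columns, `axis == 0` raises `ValueError('nyi')`, and any other value is rejected.

```python
def add_affix(table, affix, axis=0, prefix=True):
    """Rename the keys of a dict-of-lists 'table', mirroring the axis checks in the diff."""
    if axis == 1:
        # Build the new name for every column, keeping the values untouched.
        return {
            (affix + name if prefix else name + affix): values
            for name, values in table.items()
        }
    if axis == 0:
        raise ValueError('nyi')
    raise ValueError(f'No axis named {axis}')


tab = {'a': [1, 2], 'b': [3, 4]}
prefixed = add_affix(tab, 'col_', axis=1)
suffixed = add_affix(tab, '_col', axis=1, prefix=False)
```

Note that calling the helper with the default `axis=0` raises, which matches the behaviour of the methods as written in this hunk.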