Addition of skew, add_prefix, add_suffix, count, std functions #16

Merged
310 changes: 310 additions & 0 deletions docs/user-guide/advanced/Pandas_API.ipynb
@@ -646,6 +646,110 @@
"tab.mode(dropna=False)"
]
},
{
"cell_type": "markdown",
"id": "f5c66579",
"metadata": {},
"source": [
"### Table.std()\n",
"\n",
"```\n",
"Table.std(axis=0, skipna=True, numeric_only=False, ddof=1)\n",
"```\n",
"\n",
"Return the sample standard deviation over the requested axis. Normalized by N-1 by default; this can be changed using the `ddof` argument.\n",
"\n",
"\n",
"**Parameters:**\n",
"\n",
"| Name | Type | Description | Default |\n",
"| :----------: | :--: | :------------------------------------------------------------------------------- | :-----: |\n",
"| axis | int | The axis to calculate the standard deviation across: 0 is columns, 1 is rows. | 0 |\n",
"| skipna | bool | Not yet implemented. | True |\n",
"| numeric_only | bool | Only use columns of the table that are of a numeric data type. | False |\n",
"| ddof | int | Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. | 1 |\n",
"\n",
"**Returns:**\n",
"\n",
"| Type | Description |\n",
"| :----------------: | :------------------------------------------------------------------- |\n",
"| Table | The standard deviation of each row / column, keyed by the row number or column name. |"
]
},
{
"cell_type": "markdown",
"id": "c2767afd",
"metadata": {},
"source": [
"**Examples:**\n",
"\n",
"Calculate the std across the columns of a table"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "87b94fd0",
"metadata": {},
"outputs": [],
"source": [
"tab = kx.Table(data=\n",
" {\n",
" 'a': [1, 2, 2, 4],\n",
" 'b': [1, 2, 6, 7],\n",
" 'c': [7, 8, 9, 10],\n",
" 'd': [7, 11, 14, 14]\n",
" }\n",
")\n",
"tab"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3e54d557",
"metadata": {},
"outputs": [],
"source": [
"tab.std()"
]
},
{
"cell_type": "markdown",
"id": "14950833",
"metadata": {},
"source": [
"Calculate the std across the rows of a table"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f19161ed",
"metadata": {},
"outputs": [],
"source": [
"tab.std(axis=1)"
]
},
{
"cell_type": "markdown",
"id": "a8ea5a38",
"metadata": {},
"source": [
"Calculate the std across the columns with ddof=0:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6361dcb7",
"metadata": {},
"outputs": [],
"source": [
"tab.std(ddof=0)"
]
},
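To make the `ddof` parameter documented in this hunk concrete, here is a minimal plain-Python sketch. It is not PyKX code: the helper name `std` and its list input are assumptions for illustration only. It divides the summed squared deviations by `N - ddof` and uses column `a` from the example table.

```python
import math
import statistics


def std(values, ddof=1):
    """Standard deviation with divisor N - ddof (hypothetical helper, not the PyKX API)."""
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more than ddof values")
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - ddof))


column_a = [1, 2, 2, 4]  # column 'a' from the example table

sample_std = std(column_a)              # ddof=1: divide by N - 1
population_std = std(column_a, ddof=0)  # ddof=0: divide by N

# Cross-check against the standard library equivalents.
assert math.isclose(sample_std, statistics.stdev(column_a))
assert math.isclose(population_std, statistics.pstdev(column_a))

print(sample_std, population_std)
```

The sample estimate is always at least as large as the population one because it divides the same sum by a smaller number.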
{
"cell_type": "markdown",
"id": "7e2813b4",
@@ -1813,6 +1917,136 @@
"df.astype({'c4':kx.SymbolVector, 'c5':kx.SymbolVector})"
]
},
{
"cell_type": "markdown",
"id": "0f8813a0",
"metadata": {},
"source": [
"### Table.add_prefix()\n",
"\n",
"```\n",
"Table.add_prefix(prefix, axis=0)\n",
"```\n",
"\n",
"Add a prefix to the column names of a table and return the resulting Table object.\n",
"\n",
"**Parameters:**\n",
"\n",
"| Name | Type | Description | Default |\n",
"| :-----: | :-------------: | :------------------------------------------------------------------ | :--------: |\n",
"| prefix | str | The string prepended to each column name. | _required_ |\n",
"| axis | int | Axis to add the prefix on. Only `axis=1` (columns) is implemented; `axis=0` raises a `ValueError`. | 0 |\n",
"\n",
"**Returns:**\n",
"\n",
"| Type | Description |\n",
"| :---: | :----------------------------------------------------------------- |\n",
"| Table | A table with its columns renamed to include the prefix. |"
]
},
{
"cell_type": "markdown",
"id": "9186ed86",
"metadata": {},
"source": [
"**Examples:**\n",
"\n",
"The initial table whose columns will be given a prefix:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f20131b",
"metadata": {},
"outputs": [],
"source": [
"tab.head()"
]
},
{
"cell_type": "markdown",
"id": "73c2b08f",
"metadata": {},
"source": [
"Add \"col_\" to table columns:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "926c8295",
"metadata": {},
"outputs": [],
"source": [
"tab.add_prefix(prefix=\"col_\", axis=1).head()"
]
},
{
"cell_type": "markdown",
"id": "0a4abc8c",
"metadata": {},
"source": [
"### Table.add_suffix()\n",
"\n",
"```\n",
"Table.add_suffix(suffix, axis=0)\n",
"```\n",
"\n",
"Add a suffix to the column names of a table and return the resulting Table object.\n",
"\n",
"**Parameters:**\n",
"\n",
"| Name | Type | Description | Default |\n",
"| :-----: | :-------------: | :------------------------------------------------------------------ | :--------: |\n",
"| suffix | str | The string appended to each column name. | _required_ |\n",
"| axis | int | Axis to add the suffix on. Only `axis=1` (columns) is implemented; `axis=0` raises a `ValueError`. | 0 |\n",
"\n",
"**Returns:**\n",
"\n",
"| Type | Description |\n",
"| :---: | :----------------------------------------------------------------- |\n",
"| Table | A table with its columns renamed to include the suffix. |"
]
},
{
"cell_type": "markdown",
"id": "c22262b8",
"metadata": {},
"source": [
"**Examples:**\n",
"\n",
"The initial table whose columns will be given a suffix:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "55c1f504",
"metadata": {},
"outputs": [],
"source": [
"tab.head()"
]
},
{
"cell_type": "markdown",
"id": "b4687851",
"metadata": {},
"source": [
"Add \"_col\" to table columns:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e00d0f5c",
"metadata": {},
"outputs": [],
"source": [
"tab.add_suffix(suffix=\"_col\", axis=1).head()"
]
},
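A rough plain-Python picture of the renaming that `add_prefix` and `add_suffix` describe, with the axis handling modelled on the `pandas_indexing.py` change in this pull request (rename for `axis=1`, `ValueError('nyi')` for `axis=0`, `ValueError` for anything else). The dict-of-lists table and the helper `_rename_columns` are assumptions for illustration; the real methods run q code against a PyKX table.

```python
def _rename_columns(table, make_name, axis):
    """Return a copy of a dict-of-lists table with each column name mapped through make_name.

    Hypothetical stand-in for the q rename; not part of PyKX.
    """
    if axis == 1:
        return {make_name(name): values for name, values in table.items()}
    if axis == 0:
        # The diff raises 'nyi' (not yet implemented) for this axis.
        raise ValueError('nyi')
    raise ValueError(f'No axis named {axis}')


def add_prefix(table, prefix, axis=0):
    return _rename_columns(table, lambda name: f'{prefix}{name}', axis)


def add_suffix(table, suffix, axis=0):
    return _rename_columns(table, lambda name: f'{name}{suffix}', axis)


tab = {'a': [1, 2, 2, 4], 'b': [1, 2, 6, 7]}

prefixed = add_prefix(tab, 'col_', axis=1)
suffixed = add_suffix(tab, '_col', axis=1)

print(list(prefixed))  # ['col_a', 'col_b']
print(list(suffixed))  # ['a_col', 'b_col']
```

Note that in this sketch, as in the diff, calling either function with the default `axis=0` raises `ValueError('nyi')`, so callers need to pass `axis=1` explicitly.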
{
"cell_type": "markdown",
"id": "718584f8",
@@ -2507,6 +2741,82 @@
"tab.prod(numeric_only=True)"
]
},
{
"cell_type": "markdown",
"id": "c87d4f95",
"metadata": {},
"source": [
"### Table.count()\n",
"\n",
"```\n",
"Table.count(axis=0, numeric_only=False)\n",
"```\n",
"\n",
"Returns the count of non-null values across the given axis.\n",
"\n",
"**Parameters:**\n",
"\n",
"| Name | Type | Description | Default |\n",
"| :----------: | :--: | :------------------------------------------------------------------------------- | :-----: |\n",
"| axis | int | The axis to count elements across: 1 is columns, 0 is rows. | 0 |\n",
"| numeric_only | bool | Only use columns of the table that are of a numeric data type. | False |\n",
"\n",
"**Returns:**\n",
"\n",
"| Type | Description |\n",
"| :----------------: | :------------------------------------------------------------------- |\n",
"| Dictionary | A dictionary where the keys are the column names / row numbers and the values are the results of calling `count` on each column / row. |"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6520c195",
"metadata": {},
"outputs": [],
"source": [
"tab.count()"
]
},
{
"cell_type": "markdown",
"id": "ce85797d",
"metadata": {},
"source": [
"### Table.skew()\n",
"\n",
"```\n",
"Table.skew(axis=0, skipna=True, numeric_only=False)\n",
"```\n",
"\n",
"Returns the skewness of all values across the given axis.\n",
"\n",
"**Parameters:**\n",
"\n",
"| Name | Type | Description | Default |\n",
"| :----------: | :--: | :------------------------------------------------------------------------------- | :-----: |\n",
"| axis | int | The axis to calculate the skewness across: 0 is columns, 1 is rows. | 0 |\n",
"| skipna | bool | Ignore any null values along the axis. | True |\n",
"| numeric_only | bool | Only use columns of the table that are of a numeric data type. | False |\n",
"\n",
"\n",
"**Returns:**\n",
"\n",
"| Type | Description |\n",
"| :----------------: | :------------------------------------------------------------------- |\n",
"| Dictionary | A dictionary where the keys are the column names / row numbers and the values are the results of calling `skew` on each column / row. |"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3fb5dce1",
"metadata": {},
"outputs": [],
"source": [
"tab.skew(numeric_only=True)"
]
},
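For intuition about what `count` and `skew` report per column, the following plain-Python sketch counts non-null entries and computes a moment-based skewness. It rests on several assumptions: `None` stands in for a q null, the helper names and the extra column `e` are made up, and the skewness shown is the simple population form m3 / m2^1.5. This pull request does not state which estimator `Table.skew` uses, so PyKX values may differ by a sample-size correction.

```python
def count_non_null(values):
    """Count entries that are not None (None stands in for a null here)."""
    return sum(1 for v in values if v is not None)


def skew(values):
    """Population skewness m3 / m2**1.5 over the non-null entries (illustrative only)."""
    data = [v for v in values if v is not None]
    n = len(data)
    if n == 0:
        return float('nan')
    mean = sum(data) / n
    m2 = sum((v - mean) ** 2 for v in data) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in data) / n  # third central moment
    if m2 == 0:
        return float('nan')  # skewness is undefined for constant data
    return m3 / m2 ** 1.5


# Columns 'a' and 'c' come from the example table; 'e' is a made-up column with nulls.
table = {'a': [1, 2, 2, 4], 'c': [7, 8, 9, 10], 'e': [1, None, 3, None]}

counts = {name: count_non_null(col) for name, col in table.items()}
skews = {name: skew(col) for name, col in table.items()}

print(counts)
print(skews)
```

Column `c` is symmetric about its mean, so its skewness is zero, while column `a` has a longer right tail and a positive skewness.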
{
"cell_type": "markdown",
"id": "499025cb",
26 changes: 26 additions & 0 deletions src/pykx/pandas_api/pandas_indexing.py
@@ -454,6 +454,32 @@ def rename(self, labels=None, index=None, columns=None, axis=0,

        return t

    def add_suffix(self, suffix, axis=0):
        t = self
        if axis == 1:
            t = q('''{[s;t]
                     c:$[99h~type t;cols value@;cols] t;
                     (c!`$string[c],\\:string s) xcol t
                     }''', suffix, t)
        elif axis == 0:
            raise ValueError('nyi')
        else:
            raise ValueError(f'No axis named {axis}')
        return t

    def add_prefix(self, prefix, axis=0):
        t = self
        if axis == 1:
            t = q('''{[s;t]
                     c:$[99h~type t;cols value@;cols] t;
                     (c!`$string[s],/:string[c]) xcol t
                     }''', prefix, t)
        elif axis == 0:
            raise ValueError('nyi')
        else:
            raise ValueError(f'No axis named {axis}')
        return t

    def sample(self, n=None, frac=None, replace=False, weights=None,
               random_state=None, axis=None, ignore_index=False):
        if n is None and frac is None: