-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closes #1165 drop on axis #1177
Closes #1165 drop on axis #1177
Conversation
…lity for dropping rows in new function called by drop. Moved column drop to new function called by drop. Updated the drop row algorithm to set the new column values in place using the superclass's __setitem__. Updated docstrings to match new functionality. Set default axis to 0 to conform to pandas.
e6fa179
to
72228f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks ok to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things I'd like you to take a look at but I think the overall functionality looks good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
* Add UsedModules.cfg to .gitignore (#1179) The UsedModules.cfg file, which is generated from --saveUsedModules was not being ignored by git, and I don't think we would ever want a UsedModules.cfg pushed into the repo, so moving it into the gitignore will resolve any annoyance from having to selectively not include that file. * Add line to pass Arrow float to Chapel side This PR adds a check for the Arrow float type on the Chapel side, which was left out previously, allowing float data types to be read with `ak.read_parquet()`. With this addition, the file `alltypes_plain.parquet` from the Apache website now runs through `ak.read_parquet()` without fail. * Closes #1165 drop on axis (#1177) * Modified drop() to allow for row and column dropping. Added functionality for dropping rows in new function called by drop. Moved column drop to new function called by drop. Updated the drop row algorithm to set the new column values in place using the superclass's __setitem__. Updated docstrings to match new functionality. Set default axis to 0 to conform to pandas. * Added ak.DataFrame.drop() documentation. * Resolving Merge Conflicts * Added test to validate axis can only be 1 or 0. * Updated code based on PR comments. Removed unneeded if blocks and a few other style changes. * Updating unknown axis error. * Correcting typo * Update Arkouda to work with CTypes (#1185) In Chapel 1.26, various C types from 'SysCTypes', 'SysBasic' and 'CPtr' are being brought together into a single (new) module, 'CTypes'. In this PR, I'm adding a new user-level 'CTypes' compatibility module that brings the pieces that Arkouda relies upon together so that it can use the new names and organization yet still be compiled with older compilers. --- Signed-off-by: Brad Chamberlain <bradcray@users.noreply.github.com> * Catch all Parquet errors and report them to the client (#1188) * Catch all Parquet errors and report them to the client In the initial Parquet error handling code, the only errors that were being reported were errors that came from status codes of calls to the Parquet API, but all errors in the C++ code went unhandled and would result in a server crash. This PR wraps all the C++ functions in a try/catch to catch all errors to be reported back to the client rather than crashing the server. This approach seems to align with the philosophy of "The server should never crash, errors should just be reported to the client". Additionally, a bug was identified with throwing errors in `forall` loops within a function, which would cause a try/catch wrapping the throwing function to not catch the resulting `TaskErrors` error throw, but a workaround for that is to wrap `forall` loops in a try/catch which just throws the error, which then allows the overarching try/catch to catch and report the error, where previously it would crash the server. * Add unsupported type error message to getType function * Remove the ARROWUNDEFINED variable and add some throws * add parquet test files with license and readme * added and skipped test against parquet standard files * add option for running parquet file tests * Update marker name to match decorator * updated error message string in ParquetMsg with wrong format specifier and updated columns1 to pass assertion against ans Co-authored-by: Ben McDonald <46734217+bmcdonald3@users.noreply.github.com> Co-authored-by: Ben McDonald <mcdobe100@gmail.com> Co-authored-by: reuster986 <reuster986@gmail.com> Co-authored-by: Ethan-DeBandi99 <16845933+Ethan-DeBandi99@users.noreply.github.com> Co-authored-by: Brad Chamberlain <bradcray@users.noreply.github.com> Co-authored-by: Pierce Hayes <pierce314159@users.noreply.github.com>
* add parquet test files with license and readme * added and skipped test against parquet standard files * add option for running parquet file tests * Update marker name to match decorator (#1193) Co-authored-by: Pierce Hayes <pierce314159@users.noreply.github.com> * Changes to PR#1183: use apache/parquet-testing files (#1194) * Add UsedModules.cfg to .gitignore (#1179) The UsedModules.cfg file, which is generated from --saveUsedModules was not being ignored by git, and I don't think we would ever want a UsedModules.cfg pushed into the repo, so moving it into the gitignore will resolve any annoyance from having to selectively not include that file. * Add line to pass Arrow float to Chapel side This PR adds a check for the Arrow float type on the Chapel side, which was left out previously, allowing float data types to be read with `ak.read_parquet()`. With this addition, the file `alltypes_plain.parquet` from the Apache website now runs through `ak.read_parquet()` without fail. * Closes #1165 drop on axis (#1177) * Modified drop() to allow for row and column dropping. Added functionality for dropping rows in new function called by drop. Moved column drop to new function called by drop. Updated the drop row algorithm to set the new column values in place using the superclass's __setitem__. Updated docstrings to match new functionality. Set default axis to 0 to conform to pandas. * Added ak.DataFrame.drop() documentation. * Resolving Merge Conflicts * Added test to validate axis can only be 1 or 0. * Updated code based on PR comments. Removed unneeded if blocks and a few other style changes. * Updating unknown axis error. * Correcting typo * Update Arkouda to work with CTypes (#1185) In Chapel 1.26, various C types from 'SysCTypes', 'SysBasic' and 'CPtr' are being brought together into a single (new) module, 'CTypes'. In this PR, I'm adding a new user-level 'CTypes' compatibility module that brings the pieces that Arkouda relies upon together so that it can use the new names and organization yet still be compiled with older compilers. --- Signed-off-by: Brad Chamberlain <bradcray@users.noreply.github.com> * Catch all Parquet errors and report them to the client (#1188) * Catch all Parquet errors and report them to the client In the initial Parquet error handling code, the only errors that were being reported were errors that came from status codes of calls to the Parquet API, but all errors in the C++ code went unhandled and would result in a server crash. This PR wraps all the C++ functions in a try/catch to catch all errors to be reported back to the client rather than crashing the server. This approach seems to align with the philosophy of "The server should never crash, errors should just be reported to the client". Additionally, a bug was identified with throwing errors in `forall` loops within a function, which would cause a try/catch wrapping the throwing function to not catch the resulting `TaskErrors` error throw, but a workaround for that is to wrap `forall` loops in a try/catch which just throws the error, which then allows the overarching try/catch to catch and report the error, where previously it would crash the server. * Add unsupported type error message to getType function * Remove the ARROWUNDEFINED variable and add some throws * add parquet test files with license and readme * added and skipped test against parquet standard files * add option for running parquet file tests * Update marker name to match decorator * updated error message string in ParquetMsg with wrong format specifier and updated columns1 to pass assertion against ans Co-authored-by: Ben McDonald <46734217+bmcdonald3@users.noreply.github.com> Co-authored-by: Ben McDonald <mcdobe100@gmail.com> Co-authored-by: reuster986 <reuster986@gmail.com> Co-authored-by: Ethan-DeBandi99 <16845933+Ethan-DeBandi99@users.noreply.github.com> Co-authored-by: Brad Chamberlain <bradcray@users.noreply.github.com> Co-authored-by: Pierce Hayes <pierce314159@users.noreply.github.com> Co-authored-by: pierce314159 <48131946+pierce314159@users.noreply.github.com> Co-authored-by: Pierce Hayes <pierce314159@users.noreply.github.com> Co-authored-by: Ben McDonald <46734217+bmcdonald3@users.noreply.github.com> Co-authored-by: Ben McDonald <mcdobe100@gmail.com> Co-authored-by: Ethan-DeBandi99 <16845933+Ethan-DeBandi99@users.noreply.github.com> Co-authored-by: Brad Chamberlain <bradcray@users.noreply.github.com>
This PR (closes #1165):
Pandas
DataFrame.drop()
utilizes anaxis
parameter to trigger drops on rows or columns. This functionality has been added to Arkouda here.axis
should be 0 or 1. 0 = rows, 1 = columns..drop()
forDataFrame
.For testing I added functions to build an
ak.DataFrame
and apd.DataFrame
. I only implemented this for drop because there is an existing issue in to handle this for all tests.This functionality is currently only available in-place at this time.