[ENH] Column renaming #3313
@oleksiyskononenko This PR is at an early stage; the idea is formalised, but your review would be helpful and guide me in the right direction. I have also noticed some issues regarding the methods, which I'll bring up in this PR after your review.
Do you see any benefits renaming |
base_frame.add_column(wf.retrieve_column(i),
                      std::string(),
                      gmode);
base_frame.rename(names_[i]);
My feeling is that .rename() should be able to rename a set of columns to different names. Right now this method only accepts one name for all columns, and in some cases it actually adds a prefix...
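To make the suggestion concrete, here is a minimal sketch in plain Python of what "renaming a set of columns to different names" could mean. The helper `rename_columns` and its `mapping` argument are hypothetical illustrations only; they are not part of datatable, and they do not reflect how the C++ `.rename()` method discussed in this review is implemented.

```python
from typing import Dict, List


def rename_columns(names: List[str], mapping: Dict[str, str]) -> List[str]:
    """Return a new list of column names with `mapping` applied.

    Hypothetical helper for illustration; not a datatable API.
    """
    # Refuse to silently ignore names that do not exist in the frame.
    unknown = [old for old in mapping if old not in names]
    if unknown:
        raise KeyError(f"Columns not found: {unknown}")

    # Columns absent from the mapping keep their current name.
    renamed = [mapping.get(name, name) for name in names]

    # Assumption: duplicate column names are treated as an error here.
    if len(set(renamed)) != len(renamed):
        raise ValueError(f"Renaming would produce duplicate names: {renamed}")
    return renamed


columns = ["A", "B", "C"]
new_columns = rename_columns(columns, {"A": "x", "C": "z"})
print(new_columns)
```

Each column gets its own target name, and untouched columns are left alone, which is the behaviour a single-name `.rename()` cannot express.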
@oleksiyskononenko As always, I am open to learning how to do things, so please feel free to guide me.
Co-authored-by: oleksiyskononenko <35204136+oleksiyskononenko@users.noreply.github.com>
Yeah, you're right. If we introduce a function At the same time we already have a couple of functions with the same names as the python's built-ins: Unfortunately, I don't think it is possible to have such a wrapper for a python's keyword. |
If we can implement it as a method only on |
By the way, @oleksiyskononenko, is there a difference between FExpr and Expr? I always thought they were the same, but it seems there is an Expr object and FExpr
Also, the |
@samukweku yes, you can think of |
@samukweku No worries, we will figure out how to move forward with this PR. Let's first finalize #3310 and #3311, then we will come back and implement a proper column renaming.
Adjust our custom theme in a way similar to `sphinx_rtd_theme`, see readthedocs/sphinx_rtd_theme#1021. This fixes the search functionality for sphinx `4.*`. We can take care of the recently released sphinx `5.*` later, if needed. Closes #3299
Remove unused `gby` in the case when `dt.unique()` is called in the group by context.
Cosmetic improvements of docs for `cumcount()` and `ngroups()`. WIP for #3279
Improve the "Using datatable" section by making the code more consistent and fixing the text. In the future, we may also want to add a sample "in.csv" file, so that all the code examples can really be copy-pasted into Python for execution.
…3324) Label ids for both FTRL and LinearModel are stored as an `int32` column, so it makes no sense to use an `ARR64` rowindex to identify the new labels. In this PR we safely change `RowIndex::ARR64` to `RowIndex::ARR32` when creating new labels for classification problems.
…ils` (#3321) Currently our Jenkins is using macOS Big Sur, and in order to make OS coverage as large as possible we switch AppVeyor to use macOS Monterey. Also, in this PR we replace the deprecated `distutils` module with `sysconfig` in order to get the proper platform tag. Closes #3322 Closes #3177
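As a rough illustration of the `distutils` to `sysconfig` switch described in this commit message, the sketch below derives a wheel-style platform tag using only the standard library. The function name `platform_tag` and the character normalization are assumptions made for this example; the actual change in the build script is not shown in this thread.

```python
import sysconfig


def platform_tag() -> str:
    """Return a wheel-style platform tag (hypothetical helper).

    sysconfig.get_platform() is the standard-library replacement for the
    deprecated distutils.util.get_platform().
    """
    # Assumption: wheel tags use underscores in place of "-" and ".".
    return sysconfig.get_platform().replace("-", "_").replace(".", "_")


tag = platform_tag()
print(tag)
```

The printed value depends on the machine the code runs on, so no expected output is given here.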
…els (#3327) Support for `manylinux2010` image that we're using to build datatable on `x86_64` is about to be dropped (pypa/manylinux#1281), so we switch to `manylinux2014` that we're already using on `ppc64le`. Also, Python 3.7 will reach its end of life soon, hence we switch to Python 3.8 when generating debug wheels. In principle, we can generate debug wheels for all the supported Python versions, however, this will significantly slow down our building pipeline.
In this PR we adjust AppVeyor builds to
- enable `pyarrow` tests;
- on Windows, enable C++ tests by testing debug wheels for Python 3.9;
- on Windows, fix builds to properly report failures;
- for consistency, rename `DTTEST` to `DT_TEST` and namespace `dttest` to `dt::tests`.

Note that, when enabled, the C++ tests [redefine](https://github.com/h2oai/datatable/blob/main/src/core/utils/tests.h#L91-L100) the `protected` and `private` keywords to `public`. This is a pretty dangerous approach that we might need to reconsider, because the redefinition only happens in the files that include `utils/tests.h`. On Windows, for instance, this caused a pile of linking errors, because some methods were expected to be `public` but were declared as `protected` or `private`.
…3332) - make signatures of the functions referenced in the `FExpr` API section to be consistent with the actual signatures of the `dt.*()` functions; - couple of other minor fixes.
11b352c to d8507e0
Bungled this; closing it and creating a new one: #3333
by
, especially for boolean expressions #2504
Implementation for column renaming