Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transform a Column #255

Closed
patrikguempel opened this issue Apr 28, 2023 · 2 comments · Fixed by #270
Closed

Transform a Column #255

patrikguempel opened this issue Apr 28, 2023 · 2 comments · Fixed by #270
Assignees
Labels
enhancement 💡 New feature or request released Included in a release

Comments

@patrikguempel
Copy link
Contributor

patrikguempel commented Apr 28, 2023

Is your feature request related to a problem?

A Table can already be transformed by calling transform_table or transform_column. It would be nice to have this capability on a Column too.

Desired solution

  • Add a method transform to Column:
class Column(Sequence[_T]):
    def transform(self, transformer: Callable[[_T], _O]) -> Column[_O]:
        ...
  • Map transformer over the values of the column and collect the results
  • As usual, it should not change the receiving instance but return a new Column.

Possible alternatives (optional)

No response

Screenshots (optional)

No response

Additional Context (optional)

The apply function of pandas could be useful here.

@patrikguempel patrikguempel added the enhancement 💡 New feature or request label Apr 28, 2023
@lars-reimann
Copy link
Member

lars-reimann commented Apr 28, 2023

Table already has transform_table and transform_column. A Row has heterogeneous values, so I don't see much use for it. That leaves only a transform function for column, which would be a nice addition. I've adjusted the OP accordingly.

@lars-reimann lars-reimann changed the title Forward panda's 'apply'-function to modify data in Row, Column and Table Transform a Column Apr 28, 2023
@patrikguempel patrikguempel linked a pull request May 5, 2023 that will close this issue
@patrikguempel patrikguempel self-assigned this May 5, 2023
lars-reimann added a commit that referenced this issue May 6, 2023
Closes #255 .

### Summary of Changes

Users are now able to call Column#transform to apply a function to every
data point within the column

Co-authored-by: 92090487-PhilipGutberlet@useres.noreply.github.com

---------

Co-authored-by: megalinter-bot <129584137+megalinter-bot@users.noreply.github.com>
Co-authored-by: Lars Reimann <mail@larsreimann.com>
lars-reimann pushed a commit that referenced this issue May 11, 2023
## [0.12.0](v0.11.0...v0.12.0) (2023-05-11)

### Features

* add `learning_rate` to AdaBoost classifier and regressor. ([#251](#251)) ([7f74440](7f74440)), closes [#167](#167)
* add alpha parameter to `lasso_regression` ([#232](#232)) ([b5050b9](b5050b9)), closes [#163](#163)
* add parameter `lasso_ratio` to `ElasticNetRegression` ([#237](#237)) ([4a1a736](4a1a736)), closes [#166](#166)
* Add parameter `number_of_tree` to `RandomForest` classifier and regressor ([#230](#230)) ([414336a](414336a)), closes [#161](#161)
* Added `Table.plot_boxplots` to plot a boxplot for each numerical column in the table ([#254](#254)) ([0203a0c](0203a0c)), closes [#156](#156) [#239](#239)
* Added `Table.plot_histograms` to plot a histogram for each column in the table ([#252](#252)) ([e27d410](e27d410)), closes [#157](#157)
* Added `Table.transform_table` method which returns the transformed Table ([#229](#229)) ([0a9ce72](0a9ce72)), closes [#110](#110)
* Added alpha parameter to `RidgeRegression` ([#231](#231)) ([1ddc948](1ddc948)), closes [#164](#164)
* Added Column#transform ([#270](#270)) ([40fb756](40fb756)), closes [#255](#255)
* Added method `Table.inverse_transform_table` which returns the original table ([#227](#227)) ([846bf23](846bf23)), closes [#111](#111)
* Added parameter `c` to `SupportVectorMachines` ([#267](#267)) ([a88eb8b](a88eb8b)), closes [#169](#169)
* Added parameter `maximum_number_of_learner` and `learner` to `AdaBoost` ([#269](#269)) ([bb5a07e](bb5a07e)), closes [#171](#171) [#173](#173)
* Added parameter `number_of_trees` to `GradientBoosting` ([#268](#268)) ([766f2ff](766f2ff)), closes [#170](#170)
* Allow arguments of type pathlib.Path for file I/O methods ([#228](#228)) ([2b58c82](2b58c82)), closes [#146](#146)
* convert `Schema` to `dict` and format it nicely in a notebook ([#244](#244)) ([ad1cac5](ad1cac5)), closes [#151](#151)
* Convert between Excel file and `Table` ([#233](#233)) ([0d7a998](0d7a998)), closes [#138](#138) [#139](#139)
* convert containers for tabular data to HTML ([#243](#243)) ([683c279](683c279)), closes [#140](#140)
* make `Column` a subclass of `Sequence` ([#245](#245)) ([a35b943](a35b943))
* mark optional hyperparameters as keyword only ([#296](#296)) ([44a41eb](44a41eb)), closes [#278](#278)
* move exceptions back to common package ([#295](#295)) ([a91172c](a91172c)), closes [#177](#177) [#262](#262)
* precision metric for classification ([#272](#272)) ([5adadad](5adadad)), closes [#185](#185)
* Raise error if an untagged table is used instead of a `TaggedTable` ([#234](#234)) ([8eea3dd](8eea3dd)), closes [#192](#192)
* recall and F1-score metrics for classification ([#277](#277)) ([2cf93cc](2cf93cc)), closes [#187](#187) [#186](#186)
* replace prefix `n` with `number_of` ([#250](#250)) ([f4f44a6](f4f44a6)), closes [#171](#171)
* set `alpha` parameter for regularization of `ElasticNetRegression` ([#238](#238)) ([e642d1d](e642d1d)), closes [#165](#165)
* Set `column_names` in `fit` methods of table transformers to be required ([#225](#225)) ([2856296](2856296)), closes [#179](#179)
* set learning rate of Gradient Boosting models ([#253](#253)) ([9ffaf55](9ffaf55)), closes [#168](#168)
* Support vector machine for regression and for classification ([#236](#236)) ([7f6c3bd](7f6c3bd)), closes [#154](#154)
* usable constructor for `Table` ([#294](#294)) ([56a1fc4](56a1fc4)), closes [#266](#266)
* usable constructor for `TaggedTable` ([#299](#299)) ([01c3ad9](01c3ad9)), closes [#293](#293)

### Bug Fixes

* OneHotEncoder no longer creates duplicate column names ([#271](#271)) ([f604666](f604666)), closes [#201](#201)
* selectively ignore one warning instead of all warnings ([#235](#235)) ([3aad07d](3aad07d))
@lars-reimann
Copy link
Member

🎉 This issue has been resolved in version 0.12.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@lars-reimann lars-reimann added the released Included in a release label May 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement 💡 New feature or request released Included in a release
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants