docs(website): add custom transformation to website #116

Merged: 1 commit, May 30, 2024

Conversation

jitingxu1
Collaborator

add custom transformation to website

It is part of #69 and #92

@jitingxu1 jitingxu1 requested a review from deepyaman May 30, 2024 21:54
@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.54%. Comparing base (26b704a) to head (618e2c9).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #116   +/-   ##
=======================================
  Coverage   84.54%   84.54%           
=======================================
  Files          24       24           
  Lines        1863     1863           
=======================================
  Hits         1575     1575           
  Misses        288      288           


@jitingxu1 jitingxu1 mentioned this pull request May 30, 2024
1 task
@jitingxu1 jitingxu1 added the documentation Improvements or additions to documentation label May 30, 2024
@jitingxu1 jitingxu1 merged commit 43fff05 into ibis-project:main May 30, 2024
4 checks passed
contents:
- id: How_to_create_transformer
title: "How to create your own transformer"
href: tutorial/How_to_create_your_own_transformer.qmd
Collaborator

Use hyphens instead; also, don't capitalize "How"




ibisML comes with a variety of built-in transformation steps like `OneHotEncode`, `ImputeMean`, `DiscretizeKBins`, and many [others](https://ibis-project.github.io/ibis-ml/reference/steps-outlier.html). However, there are times when you might need to create your own custom preprocessing transformations. This guide will walk you through how to define a custom transformation in ibisML.
Collaborator

ibisML -> IbisML

Collaborator

@deepyaman Jun 4, 2024

Why does "others" link to steps-outlier; should it link to https://ibis-project.github.io/ibis-ml/reference/#steps instead?

from typing import Iterable, Any
```

## Implementation Outlines
Collaborator

Use sentence case for headings

Here's how to begin defining the `__init__` method with these considerations:

```{python}
def __init__(self, inputs: SelectionType):
Collaborator

Do all these blocks need to get executed? You can use # exec: false annotation (will need to look it up) to make sure these don't get run.

...

Also, when I quickly read through this, I read all of the individual steps, and then I read it again in the final implementation. Is there a better way to present this?
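
For reference, a minimal sketch of the annotation this comment is reaching for, assuming the tutorial is rendered with Quarto (the file is a `.qmd`). Quarto's cell option for skipping execution is `#| eval: false`, written as the first line of the cell, rather than `# exec: false`. The class name `CustomStep` is a hypothetical stand-in, not the tutorial's actual code.

```python
#| eval: false
# Quarto reads the "#|" line above as a cell option and does not run this cell,
# while still showing its source. To Python, that line is an ordinary comment.
class CustomStep:  # hypothetical placeholder for the tutorial's step class
    def __init__(self, inputs):
        # Store the column selection for later use.
        self.inputs = inputs
```

With `eval: false` only execution is skipped; the cell's source still appears in the rendered page.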


```{python}
# Instantiate CustomRobustScale transformer with the specified columns to scale
# # Select only one column: "int_col"
Collaborator

Double comments


### Additional Considerations

Certainly! Here are additional checks and considerations to ensure the transformer handles unexpected data types or conditions gracefully:
Collaborator

This reads like ChatGPT


- Check for numeric olumns: Ensure that selected columns are numeric before calculating statistics. This prevents errors when trying to calculate statistics on non-numeric data.
Collaborator

olumns -> columns


- Check for zero Interquartile Range (IQR): Verify that the IQR (the difference between the 75th and 25th percentiles) is not zero. A zero IQR indicates that all values in the column are the same, making standardization impossible.
- Backend compatibility: Validate if [operators](https://ibis-project.org/backends/support/matrix) used by ibisML are supported by your chosen backend. This ensures seamless integration and execution of transformations across different environments.
Collaborator

Is this a consideration for people creating a step?
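
The two data checks in the list under review (numeric columns, zero IQR) can be sketched in plain Python. This illustrates the validation logic only and is not IbisML's API: `robust_scale` is a hypothetical helper, it works on a list rather than an Ibis column, and it uses the standard library's `statistics.quantiles` with the inclusive method, which is one of several possible quartile definitions.

```python
from numbers import Real
from statistics import quantiles


def robust_scale(values):
    """Scale values as (x - median) / IQR, validating the input first."""
    # Check 1: every value must be numeric (bool is excluded on purpose).
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        raise TypeError("robust scaling needs numeric values")
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Check 2: a zero IQR would mean dividing by zero below.
    if iqr == 0:
        raise ValueError("IQR is zero; the column cannot be scaled")
    return [(v - median) / iqr for v in values]


print(robust_scale([1, 2, 3, 4, 5]))  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

Raising early keeps the failure close to its cause; a step author could instead skip the column or fall back to another scale, depending on what the pipeline needs.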

Comment on lines +260 to +262
## Conclusion

Custom transformers offer a high degree of flexibility and control over data preprocessing tasks. They excel at encapsulating specific steps within the data processing pipeline, which greatly enhances code manageability. If you haven't already, I highly recommend exploring their capabilities and integrating them into your workflow. They can be a valuable asset in streamlining and optimizing your data preprocessing processes.
Collaborator

Most FAQs don't have conclusions


## <span style="color:red">🚀 Contribution Welcome!</span>
Collaborator

Why is this randomly red with an emoji


Feel free to contribute to our transformations by implementing your own custom transformers or suggesting ones that you find essential. You can do so by checking our transformation [priorities](https://github.com/ibis-project/ibis-ml/issues/32), discussing ideas through creating [issues](https://github.com/ibis-project/ibis-ml/issues), or submitting pull requests (PRs) with your implementations. We welcome collaboration and value input from all contributors. Your ideas and implementations can enrich our library of transformations, making it more comprehensive and useful for everyone involved in data preprocessing tasks. Let's collaborate to enhance the efficiency and effectiveness of our data processing workflows together.
Collaborator

Also, docs should usually have a separate page (or pages) on contribution guidelines, not as part of a different FAQ.

It should also include all the information like how to set up your development environment, etc. at some point. This may not be an immediate priority (can check).

@deepyaman
Collaborator

Can you re-raise this PR, taking into account the feedback, when you get a chance? Temporarily reverted, as getting some more people to view the docs.

Labels
documentation Improvements or additions to documentation
Projects
Status: done
3 participants