Update the Dataset Object Documentation #500

Merged
merged 19 commits into from
Jan 7, 2022

Conversation

benisraeldan
Contributor

Reference Issues/PRs

Resolves #484

@benisraeldan benisraeldan added the documentation modification of the documentation / readme's label Jan 5, 2022
@shir22 shir22 marked this pull request as ready for review January 5, 2022 11:29
Member

@ItayGabbay ItayGabbay left a comment

I really like the writing style!

Some comments:

  • I didn't comment on wording.
  • I think we need to show more of the dataset's functionality, for example functions like copy, from_numpy, datetime parsing, passing labels as a Series, and more.
  • We can take inspiration from a not-so-related library which in my opinion has a great quickstart:
    https://numpy.org/doc/stable/user/quickstart.html

@ItayGabbay ItayGabbay added kind/feature and removed documentation modification of the documentation / readme's labels Jan 5, 2022
@benisraeldan benisraeldan force-pushed the dc-484-dataset-object-documentation branch from 74609a6 to 0b4f0df Compare January 5, 2022 15:49
- features
List of column names. This is the features that are passed to the model. If not defined, columns not defined as something else is considered a feature.
- cat_features
List of column names. A subset of the features. Categorical features normally require some preprocessing before being passed to the model.
Member

Suggested change
List of column names. A subset of the features. Categorical features normally require some preprocessing before being passed to the model.
List of column names. A subset of the features. Categorical features normally require some preprocessing before being passed to the model. If not specified, the categorical features are inferred automatically from the data itself.

Member

We should also link here to the categorical inference heuristic detailed below.

Contributor

See my general note above about phrasing
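
For readers following this thread, the defaults described in the quoted `features` and `cat_features` text can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's code: `resolve_features` and its `other_roles` argument are hypothetical names, and the automatic categorical inference mentioned in the suggested change is deliberately left out because its heuristic is not described here.

```python
from typing import Dict, Iterable, List, Optional


def resolve_features(
    columns: Iterable[str],
    features: Optional[List[str]] = None,
    cat_features: Optional[List[str]] = None,
    label: Optional[str] = None,
    other_roles: Iterable[str] = (),
) -> Dict[str, Optional[List[str]]]:
    """Hypothetical helper illustrating the documented defaults.

    `other_roles` stands in for columns "defined as something else"
    (an assumption made for this sketch).
    """
    columns = list(columns)

    # Columns that already have a role are never treated as features by default.
    reserved = set(other_roles)
    if label is not None:
        reserved.add(label)

    # If `features` is not defined, every remaining column is a feature.
    if features is None:
        features = [c for c in columns if c not in reserved]

    unknown = [c for c in features if c not in columns]
    if unknown:
        raise ValueError(f"features not found in columns: {unknown}")

    # `cat_features` must be a subset of `features`.
    if cat_features is not None:
        outside = [c for c in cat_features if c not in features]
        if outside:
            raise ValueError(f"cat_features not in features: {outside}")

    # None means "not specified": the real library would infer them from the data.
    return {"features": features, "cat_features": cat_features}


roles = resolve_features(
    ["id", "age", "city", "bought"], label="bought", other_roles=["id"]
)
```

Here `roles["features"]` contains only `age` and `city`, because `id` and `bought` already have other roles, and `roles["cat_features"]` stays `None` to mark that inference would apply.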

List of column names. This is the features that are passed to the model. If not defined, columns not defined as something else is considered a feature.
- cat_features
List of column names. A subset of the features. Categorical features normally require some preprocessing before being passed to the model.
- label
Member

Actually, when reading the doc I think we should rename the label parameter to target.
What do you guys think?
@shir22 @noamzbr @benisraeldan @matanper @nirhutnik @JKL98ISR

Contributor

I tend to agree that target feels a bit smoother

@ItayGabbay ItayGabbay added feature Feature update or code change to the package and removed kind/feature labels Jan 5, 2022
shir22
shir22 previously requested changes Jan 5, 2022
@shir22 shir22 self-requested a review January 7, 2022 00:05
@shir22 shir22 enabled auto-merge (squash) January 7, 2022 01:27
@shir22 shir22 merged commit 03d0408 into main Jan 7, 2022
@delete-merged-branch delete-merged-branch bot deleted the dc-484-dataset-object-documentation branch January 7, 2022 15:37
ItayGabbay pushed a commit that referenced this pull request Jan 7, 2022
Update to user guide
Small updates to readme and index
ItayGabbay added a commit that referenced this pull request Jan 9, 2022
* 0.2.0 version bump

* docs fixes

* Fix PerfectModel when data have nulls in label (#526)

* - Adding iris to the datasets section (#530)

- changing quickstart to use iris from the datasets section

* Update the Dataset Object Documentation (#500)

Update to user guide
Small updates to readme and index

* version bump

Co-authored-by: matanper <matan@deepchecks.com>
Co-authored-by: DBI <42312361+benisraeldan@users.noreply.github.com>
Labels
feature Feature update or code change to the package
Development

Successfully merging this pull request may close these issues.

[DOCS] Update Dataset Object in docs
4 participants