03 Dec 23:03

github-actions

v0.1.0b3

00467dc

v0.1.0b3 Pre-release

This release contains an overhaul of the data_summary feature and minor bug fixes.

Changes

Updated the contributing guide @haishiro (#377) (#368)

Features

Reworked data summary (see below) @haishiro (#383)
Added progress bar when fitting topic model @truongc2 (#393)
Added support for Python 3.6 @haishiro (#369)

Bug Fixes

Fixed backend recursion bug @haishiro (#396)
Removed extra cell in the User Guide @zack-soenen (#394)
Added kwargs to text preprocessing functions: filter_dictionary, create_doc_term_matrix, and create_tfidf_matrix @truongc2 (#386)

Maintenance

Disabled checks on draft PRs @truongc2 (#399)
Updated actions/setup-python requirement to v2.1.4 @dependabot (#421)
Added workflow dispatch events for manual workflow triggers @haishiro (#423)
Bumped peaceiris/actions-gh-pages from v3.7.0-8 to v3.7.3 @dependabot (#422)
Added dependabot @haishiro (#416)
Added script to rerun notebooks in CI prior to unit tests @truongc2 (#224)

Data Summary

Added an additional display (a DataFrame) showing the row count, column count, and size in memory.
Transposed the summary table so that the data columns appear as rows. This orientation is intended to scale better on datasets with a large number of columns.
Improved the performance of data_summary when using the pandas backend. The prior implementation, which used pandas .agg(), resulted in very long computation times even for small datasets.
Added a Unique metric: the number of unique values.
Changed the ordering of metrics to present them in a more logical order of inspection.
Added additional display options:
- as_percentage: Formats count metrics (zeros, nulls, top frequency) as a percentage of the total row count.
- auto_float: Attempts to apply sensible defaults when displaying floats by avoiding scientific notation and excessive precision. Set this option to False to disable the new formatting.
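
To make the new layout concrete, here is a minimal pandas-only sketch of a transposed summary with one row per data column, plus the small overview table. It is not the data-describe implementation: the function names `sketch_overview` and `sketch_summary`, the metric labels, and the way `as_percentage` is passed as a keyword argument are all assumptions made for illustration, and only the null and zero counts are converted to percentages here.

```python
import pandas as pd


def sketch_overview(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical overview table: row count, column count, size in memory."""
    return pd.DataFrame(
        {
            "Info": [
                df.shape[0],
                df.shape[1],
                int(df.memory_usage(deep=True).sum()),  # size in bytes
            ]
        },
        index=["Rows", "Columns", "Size in Memory (bytes)"],
    )


def sketch_summary(df: pd.DataFrame, as_percentage: bool = False) -> pd.DataFrame:
    """Hypothetical transposed summary: each data column becomes one row."""
    n_rows = len(df)
    rows = {}
    for name in df.columns:
        col = df[name]
        metrics = {
            "Nulls": int(col.isna().sum()),
            "Zeros": int((col == 0).sum()),
            "Unique": int(col.nunique()),  # distinct non-null values
        }
        if as_percentage and n_rows > 0:
            # Express count metrics as a share of the total row count.
            for key in ("Nulls", "Zeros"):
                metrics[key] = f"{metrics[key] / n_rows:.1%}"
        rows[name] = metrics
    return pd.DataFrame.from_dict(rows, orient="index")


df = pd.DataFrame({"a": [0, 1, None, 1], "b": ["x", "y", "y", None]})
overview = sketch_overview(df)
summary = sketch_summary(df)
summary_pct = sketch_summary(df, as_percentage=True)
print(overview)
print(summary)
print(summary_pct)
```

Because each data column occupies a row, adding more columns to the dataset lengthens the table rather than widening it, which is the scaling behavior the release notes describe.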


14 Oct 22:12

github-actions

v0.1.0b2

794e889

v0.1.0b2 Pre-release

This patch focuses on addressing errors related to installation of data-describe.

Bug Fixes

Fixed backend logic when unsupported data types are given @haishiro (#347)
Updated setup() metadata for PyPI @haishiro (#348)
Resolved errors when missing IPython and importlib.metadata semi-optional dependencies @haishiro (#346)
Data Heatmap: Added legend label and moved to object-oriented mpl API @haishiro (#343)

Maintenance

Updated CI GitHub Action @haishiro (#355)
Added codecov.io for coverage checks @haishiro (#350)


11 Oct 22:45

github-actions

v0.1.0b1

8798cfc

v0.1.0b1 Pre-release

Changes

Standardized or updated documentation and naming conventions @haishiro (#328)
Moved backend implementations back into core @haishiro (#306)
Improved dependency management @haishiro (#302)

Features

Cleaned up Docker (Resolves #176) @haishiro (#205)

Bug Fixes

Fixed statsmodels being required when it should be optional @haishiro (#340)
Fixed pyscagnostics being required when it should be optional @haishiro (#339)
Prevented modin import on data-describe import @haishiro (#336)
Fixed presidio import on data-describe import @haishiro (#334)
Added random_state default to topic model @haishiro (#313)
Updated seaborn usage for upcoming 0.12 API @haishiro (#305)

Maintenance

Added exclude label for Release Drafter @haishiro (#337)
Disabled creation of alpha docs @haishiro (#326)
Added local API docs build directory to .gitignore @haishiro (#335)
Enabled PyPI release @haishiro (#327)
Added black to pre-commit checks @haishiro (#318)
Limited publish of latest docs on relevant paths @haishiro (#316)
Updated GitHub cache action to v2 @haishiro (#315)


28 Sep 03:27

github-actions

v0.1.0a2

a10e12d

v0.1.0a2 Pre-release

This release includes multiple changes and bugfixes for the alpha testing period.

Changes

sklearn requirement bumped to 0.23 @haishiro (#279)
seaborn requirement bumped to 0.11 to use new displot function @haishiro (#287)
Documentation and build workflows now trigger on release published event instead of created @haishiro (#304)

Features

Added more details to example notebooks @haishiro (#282)

Bug Fixes

Fixed data_summary when a column is entirely null @haishiro (#301)
Fixed data heatmap ordering @haishiro (#283)
Fixed correlation matrix style to be more consistent (Resolves #236) @haishiro (#277)
Fixed link to contributing guide (Fixes #163) @haishiro (#280)
General improvements to stability @haishiro (#274) (#275) (#273)
Renamed references to data describe in documentation to be more consistent with branding @haishiro (#259)

Maintenance

Fixed and improved auto-generated documentation @haishiro (#252)
Fixed PyPI release pipeline @haishiro (#253)
Added Release Drafter for automated release notes @haishiro (#286)
Simplified and updated issue templates @haishiro (#261)


26 Aug 22:04

dvdjlaw

v0.1.0a1

b590f41

v0.1.0a1 Pre-release

First release for private beta testing

New Features

Clustering
Correlation
Data Heatmap
Data Summary
Distributions
Scatter plots
Feature importance
Time series analysis
Text preprocessing
Topic Modeling
Sensitive data (privacy)
Dimensionality Reduction


Releases: data-describe/data-describe