Update FAQ with Woodwork #1649
tamargrey
left a comment
Noted some entity/entities usage
Codecov Report
@@                  Coverage Diff                  @@
##   woodwork-integration   #1649   +/-   ##
=======================================================
  Coverage     ?   98.26%
=======================================================
  Files        ?      138
  Lines        ?    15287
  Branches     ?        0
=======================================================
  Hits         ?    15022
  Misses       ?      265
  Partials     ?        0
@@ -1345,8 +1360,8 @@
" index=\"id\",\n",
Woodwork isn't able to infer the group column's logical type. Maybe we should add additional repeated values so it gets inferred as categorical.
In this example, all group values must be the same in order for Woodwork to infer it as categorical.
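To make the point about repeated values concrete, here is a minimal sketch of a uniqueness-ratio heuristic. This is an assumption for illustration only, not Woodwork's actual inference code: the function name `looks_categorical` and the `max_unique_ratio` default are hypothetical. It shows why a short column with all-distinct values would not look categorical, while the same groups repeated across more rows would.

```python
def looks_categorical(values, max_unique_ratio=0.2):
    """Hypothetical heuristic: few distinct values relative to the row count.

    Not Woodwork's implementation; the threshold is an illustrative guess.
    """
    non_null = [v for v in values if v is not None]
    if not non_null:
        return False  # nothing to infer from
    unique_ratio = len(set(non_null)) / len(non_null)
    return unique_ratio <= max_unique_ratio


# Three rows, three distinct groups: every value is unique.
print(looks_categorical(["a", "b", "c"]))  # prints False

# Twelve rows drawn from two groups: values repeat often.
print(looks_categorical(["a", "b"] * 6))  # prints True
```

Under a heuristic like this, a small example table only gets a categorical column when its values repeat enough, which is consistent with the suggestion to add repeated values. Setting the logical type explicitly when adding the dataframe would avoid depending on inference at all.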
thehomebrewnerd
left a comment
Don't have much more to add to the comments that have already been made, but noticed one more place we could clean up wording.
thehomebrewnerd
left a comment
Looks ok to me outside of cleaning up a link, which we will need to confirm works later anyway.
thehomebrewnerd
left a comment
Oh, we should also add a release notes entry for this update.
* Remove add interesting values from Entity (#1269) * move add interesting values to EntitySet * update release notes * add test for verbose output * update test for better coverage * coverage update * remove outdated comments * rename entity to datatable * fix release notes * update logger in test * fix merge conflicts * rename datatable_id to entity_id * update release notes * Move set_secondary_time_index to EntitySet (#1280) * move set_secondary_time_index to entityset * update release notes * break long line * merge fixes * update docs * fix release notes formatting * Refactor Relationship Creation (#1370) * update Relationship init * refactor add_relationship * update dostring * update release notes * revert files * test coverage fix * restrict smart-open * update code terminology * allow relationship object for adding relationships * test error * add breaking changes to release notes * update relationship construction * pr clean up * update schema version to 6.0.0 * add code examples to release notes * lots of renaming * update docs * Move update_data method from Entity to EntitySet (#1398) * Move default variable description logic to generate_description (#1403) * Move time type check (#1400) * Replace Entity with Woodwork DataFrame (#1405) * Create separate files for ww changes * comement out unecessary methods for now * Allow initalizing an entityset with woodwork dataframes * Allow adding a dataframe with params * Get getitem working * test add dataframe directly * update repr * get relationship init working no real checks * update relationship path methods * start working on normalize dataframe * Get secondary_time_index working * Get normalize dataframe working * cleanup df usage * clean up time index usage * cleanup comments * Get update dataframe working * Add comment * Move changes to regular entityset file and comment out es tests * start converting tests to use woodwork * continue moving over tests * more tests * use logical types instead of 
vtypes * Use string dtypes for default dtype values * more test changes * remove uneccessary files * clean up comments * fix rest of non behavior change tests * get make_ecommerce_entityset and fixture working * start using es tests - broken koalas * have child and parent columns have woodwork info * relationship tests * Use woodwork typing for time type * performe inference on column if necessary * convert remaining possible tests * fix koalas fixture to handle nans * start working on last time indexes * use ww syntax in last_time_indexes * update names * small fixes * fix datetime conversion error * use ww for test_last_time_index * get some lti tests running * Get last time index tests working apart from koalas make index * cleanup comments * Cleanup imports * fix matching index and time index tests * Only use first column as index if woodwork not initialized * stop allowing non string column names * warn if performing type inference on dask and koalas * xfail koalas make index tests * Update index reordering test to not care about reordering * Change logical type of foreign key if it doesn't match the index's * Continue replacing Entity (#1416) * warn for extra parameters * update es_metadata tests - raising a lot of warnings?? 
* update timedelta tests * Update dask es tests * Update koalas es tests * test update dataframe better * sort at update_dataframe if necessary * update column dtype properly * allow woodwork initialized dataframe at update dataframe * update sizeof * update demo functions * update docstrings * Fix warnings in tests * update error messages * use latlong test with dask and koalas * clean up comments * use relationship attrs instead of woodwork name * use get_df_tags better in tests * fix reordering of columns in update dataframe * remove unecessary latlong index setting * start responding to PR comments * use relationship attrs in entityset instead of woodwork attrs * update foreign key usage in koalas and dask test * More pr comments * Keep original schema in update dataframe even if ww initialized * create public and private set secondary time index methods * fix update_dataframe docstring * add test for external dataframe set secondary time index * remove unecessary tests * Clean up replace Entity Woodwork integration (#1427) * remove woodwork index tags on relationship cols comment * remove unecessary metadata setting * cleanup conftest * Add time type tests * lint fix for testing * clean up normalize dataframe * clean up variable usage in tests * Add time type test with double and integer * reverse order of time type checks * Add check that primary time index is set on a dataframe before adding secondary time index * include column metadata and descriptions in normalization * Store interesting values on column metadata (#1421) * interesting values work * update tests * lint fix * add test and lint fix * fix test * update docstring variable -> column * update comment * refactor finding where-able cols * update comment * lint fix again * update docstring * lint fix * expand flight ordinal order * expand docstring of set_Secondary_time_index * change _parent_dataframe_id to _parent_dataframe_name * change _child_dataframe_id to _child_dataframe_name * change 
_child_column_id to _child_column_name * change _parent_column_id to _parent_column_name * change dataframe_id to dataframe_name * more id to name changes * change remaining id mentions * update docstrings in entityset * consolidate copy and additional columns validation * look at copy columns for make time index * lint fix * confirm column doesnt get removed in copy columns * Add breaking changes and update release notes * Revert "lint fix" because of incorrect linting This reverts commit a585c1f. * lint fix * Change woodwork requirements * remove duplicate error message in dask and koalas tests * make dataframe_name an optional parameter and require it if woodwork not initialized * Update demo and mock entitysets to have optional dataframe name * update remaining tests to use optional dataframe name * Add parameter check to ltype comparison warning * raise error for conflicting df names but allow same df name * Change conflicting name error msg Co-authored-by: Nate Parsons <4307001+thehomebrewnerd@users.noreply.github.com> * Use woodwork 0.4.0 (#1451) * use latest woodwork version * update woodwork requirement * lint fix * fix ltype parameter test * Add release note * fix reelease notes * remove release note * Update EntitySet.plot to use Woodwork (#1468) * implement plot on entityset * messy column schema for columns string * simpler type string * format column types better for plot * Add release note * Add note to docstring about woodwork typing * Store last time index as column on DataFrame (#1456) * set lti as a column at end * convert last_time_index tests * update cdata lti test - broken * store last time indexes in dictionary * small updates * clean up comments * cleanup tests * Make sure to remove existing ltis for any dfs added to queue * finish fixing tests * lint fix * add broken test * Add release note * Add tests and init with correct time type * add merge to handle dask and koalas * keep fixing dask and koalas merge * fix index issues * add time 
type consistency back in * fix duplicate last time index popping * clean up comments * lint fix * update sizeof * clean up test * fix dask and koalas tests and expand int_es to use dask and koalas * explain why apply is needed for time type conversion * remove lti column in update dataframe if not on new dataframe * Ochange lti column name and error if user placed * Add comments * make lti name a global variable * expand recalculate lti tests * pr comment * Implement deep copy on EntitySet to retain Woodwork typing info (#1465) * set lti as a column at end * convert last_time_index tests * update cdata lti test - broken * store last time indexes in dictionary * small updates * clean up comments * cleanup tests * Make sure to remove existing ltis for any dfs added to queue * finish fixing tests * lint fix * add broken test * Add release note * Add tests and init with correct time type * add merge to handle dask and koalas * keep fixing dask and koalas merge * fix index issues * add time type consistency back in * fix duplicate last time index popping * clean up comments * lint fix * update sizeof * clean up test * fix dask and koalas tests and expand int_es to use dask and koalas * explain why apply is needed for time type conversion * remove lti column in update dataframe if not on new dataframe * Ochange lti column name and error if user placed * Add comments * make lti name a global variable * expand recalculate lti tests * pr comment * TEMP * initial deepcopy implementation missing attrs * fix entityset equality check * implement deepcopy on entityset * use deepcopy on fixture * expand fixture for copy test * move tests to test_es * add release notes * remove comments * fix spelling error * bump woodwork version (#1478) * Replace list_variable_types with list_logical_types (#1477) * Allow deep equality check on EntitySet (#1480) * papply deep keyword to entityset equality check * stick with woodwork equality * Add release note * pr comments * Update 
query_by_values for Woodwork Integration (#1467) * initial query_by_values update for ww * update release notes * revert accidental concat changes * pr comments - update wording * update warning and add test for warning * dt -> schema in _handle_time * update variable names * lint-fix * qbv test update * update lti tests * update test * lint fix * WW/FT Serialization Updates (#1452) * initialize serialization updates * serialization test updates * remove commented code * remove entity wording * update tests * koalas update * remove comments * pr comment updates * recreate category dtypes * update comment * Merge latest changes from main (#1493) * Add function to list semantic tags (#1486) * Update EntitySet.concat to use Woodwork (#1490) * update concat, broken * get simple concat working * uncomment long concat test * start fixing test * finish updating test * start expanding tests * test sorting entityset * finish test coverage * fix warning * cleanup comments * add release note * lint fix * clean up test * fix sort index test * use dataframe type * lint fix * implement deepcopy that works with koalas to use for all concat tests * add checks to sort index tests * clean up xfails for concat entityset test * split up large test * Replace entity_from_dataframe with add_dataframe (#1504) * update conftest * update test_feature_set_calculator * use add_dataframe * use add_dataframe * use add_dataframe in docs * use logical_types * use dataframe_name * lint fix * use logical types param in docs * replace variable_types with logical_types * use logical types * replace variable types with logical types * replace variable types with logical types * change to integer * remove index semantic tag * replace vtypes with ltypes * replace vtypes with ltypes * declare logical type for id * Rename target entity to target dataframe (#1506) * replace target_entity with target_dataframe * fix glossary * fix handle time parameters * use references from woodwork * use index from 
woodwork * get index and time index from ww accessor * fix docstring * get index from ww * revert compose changes * update get_valid_primitives * update glossary * use dataframe_dict * Primitives use Woodwork ColumnSchema for input types and return type (#1411) * update input and return types for agg primitives * update binary transform primitive variable types to use woodwork * replace input and return types with ColumnSchema in cum_transform_feature.py * update transform primitives to use column schema for input and return type * replace input type and return types in test files * replace _get_names_valid_inputs with _get_unique_input_types * remove entity references in primitive tests * lint * add BooleanNullable to some primitives * update MulitplyBoolean, And, Or * update more input/return types * fix add_dataframe argument order * add ordinal order for datetime transformations * lint * update docstrings of make_x_primitive functions * fix Not input_types * remove unused Numeric import * specify order for Weekday primitive return type * Woodwork Integration - Features (#1501) * update input and return types for agg primitives * update binary transform primitive variable types to use woodwork * replace input and return types with ColumnSchema in cum_transform_feature.py * update transform primitives to use column schema for input and return type * replace input type and return types in test files * replace _get_names_valid_inputs with _get_unique_input_types * remove entity references in primitive tests * lint * add BooleanNullable to some primitives * update MulitplyBoolean, And, Or * update more input/return types * fix add_dataframe argument order * add ordinal order for datetime transformations * lint * update docstrings of make_x_primitive functions * remove entity.py and variables.py * update FeatureBase to use dataframes * update feature descriptions to use dataframes * update aggregation primitive base to use dataframe terminology * update generate_name 
in Count primitive * update tests to use new feature parameters * add feature_base/utils.py * a couple more additions to FeatureBase * ensure cohort_name is categorical * add category tag to ordinal return types * update feature visualizer to use dataframe terms * fix Not input_types * update primitive tests * use set operations to simplify check for index columns * simplify getting index name in get_aggregation_groupby * fix category semantic tag check in variable_filter * move replace_latlong_nan out of entity_utils * fix comparison in test_copy * check logical type in test_return_type_inference_index * update var names in test_multi_output_features * use ColumnSchemas in test_return_variable_types * more specific TODO for _check_cutoff_time_type * update __mul__ logic for boolean * boolean * rename utils.entity_utils to utils.latlong_utils * update _check_againt_time_column * update _check_time_against_column * correct schema access in _check_time_against_column * Add make_index functionality to Featuretools (#1507) * initial make index updates * add make_index logic back to Featuretools * lint fix * undo accidental file deletion * fix file * update check for warning * PR feedback updates * use es.dataframe_type * fix outdated info in release notes (#1522) * Remove entity tests (#1521) * remove check time type * remove commented out tests * remove variable ordering test * remove variable tests * remove test * remove commented out imports * remove file * Revert "remove check time type" This reverts commit 6b3c5d3. 
* update _check_time_type * update docstring * Standardize imports for Woodwork in codebase (#1526) * direct imports of woodworks in codebase * sort imports * use ww_type_system * Update DFS primitive matching to use ColumnSchema (#1523) * update dfs primitive matching * work on test_deep_feature_synthesis tests * work on dfs tests * fix more tests * exclude foriegn key cols from transform feats * fix Trend * more test updates * more test work * remove files * fix dfs to match old features * lint fix * remove old print statement * more naming updates * update handling of foreign key columns * lots of naming updates * fix test names * even more naming updates * more cleanup and test fixes * rename return_variable_types to return_types * fix broken entityset tests * pr naming updates * remove unnecssary primitive * add new _schemas_equal conditions * lint fix * Update doc page on using entity sets with Woodwork (#1532) * refactor to jupyter notebook * use dataframes * use dataframe in comments * remove rst file * use EntitySet * add link to Woodwork * use target_dataframe_name * update comment on adding relationships * Updates from featuretools v0.26.0 (#1539) * Change to use GitHub Token rather than GitHub PAT (#1402) * Update dependency_check.yml * Update release_notes.rst * Update dependency_check.yml * Use builtin secret token with create pull request (#1407) * Use builtin secret token with create pull request * Update release_notes.rst * Use repo scoped token again (#1409) * Use repo scoped token again * Update release_notes.rst * Update latest_dependencies.txt (#1410) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Lower max depth to 1 if single entity (#1412) * lower DFS max depth to 1 if single entity * add test for including seed features with depth greater than max depth * test max depth=0 doesn't create depth 1 features on single table * update release notes * change log entry to user warning and test for 
warning * lint * fix max depth warning in docs * remove outdated comment * rework single table assertions to be more readable * use feature_with_name helper in seed_features test * lint * add max_depth=None and max_depth=-1 cases to single table test * move helper function def out of loop; remove invalid max_depth=None case * lint * Drop Python 3.6 support (#1413) * remove py36 from CI test matrix * remove warning when importing featuretools about dropping 3.6 support * remove python 3.6 from setup.py * remove py36 from list of supported version in installation docs * remove py36 constraint on dependency * update release notes * v0.24.0 (#1414) * bump version number * update release notes * Update latest_dependencies.txt (#1415) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Separate workflows and unit tests (#1422) * separate workflows * update release notes * fix incorrect word * update numpy req * separate link check * remove dask separation * copy from main * release notes * Add minimum dependency generator GitHub Action (#1428) * add min deps checker * update release notes * fix filename * generate auto PR * update latest dep check * file rename * better release notes * move to 1 folder * fix fastparquet? * Update minimum dependencies (#1431) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Bump pyyaml from 3.12 to 5.4 in /featuretools/tests/requirement_files (#1433) * Bump pyyaml from 3.12 to 5.4 in /featuretools/tests/requirement_files Bumps [pyyaml](https://github.com/yaml/pyyaml) from 3.12 to 5.4. 
- [Release notes](https://github.com/yaml/pyyaml/releases) - [Changelog](https://github.com/yaml/pyyaml/blob/master/CHANGES) - [Commits](yaml/pyyaml@3.12...5.4) Signed-off-by: dependabot[bot] <support@github.com> * Update requirements.txt * Update release_notes.rst * Update release_notes.rst Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Gaurav Sheni <gvsheni@gmail.com> * Update nbsphinx version to resolve docs build issue (#1436) * update release note for test * update release notes * pin markupsafe version * update nbsphinx version and remove markupsafe * update release notes * Update latest dependencies (#1437) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update latest dependencies (#1439) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Bump psutil requirement (#1438) * Bump psutil requirement * Update release_notes.rst * Update minimum dependencies (#1443) * Add unit tests against minimum dependencies (#1432) * Fix numpy installation for minimum unit tests (#1445) * Update latest dependencies (#1446) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update latest dependencies (#1448) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * v0.24.1 (#1450) * Update latest dependencies (#1454) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update latest dependencies (#1455) * Bump urllib3 from 1.26.4 to 1.26.5 in /featuretools/tests/requirement_files (#1457) * Bump urllib3 in /featuretools/tests/requirement_files Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.4 to 1.26.5. 
- [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](urllib3/urllib3@1.26.4...1.26.5) --- updated-dependencies: - dependency-name: urllib3 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> * Update test-requirements.txt * Update release_notes.rst Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Gaurav Sheni <gvsheni@gmail.com> * Update alteryx_open_src_update_checker to 2.0.0 (#1460) * Update setup.py * Update __init__.py * Update release_notes.rst * Update setup.py * Update install_test.yml * double for loop * Update latest dependencies (#1464) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Add get_valid_primitives function (#1462) * add function skeleton * add tests * add get_valid_primitives and update tests * add test and fix typo * update release notes * add test for non-str invalid primitive * remove unused code from custom primitives * lint * remove unused var names and avoid erroring due to compatibility * rework compatibility check * make ft.get_valid_primitives callable, add to API reference, add note to docstring * make get_entityset_type private * Bump minimum pip from 19.0.2 to 21.1.2 (#1475) * Bump pip from 19.0.2 to 19.2 in /featuretools/tests/requirement_files Bumps [pip](https://github.com/pypa/pip) from 19.0.2 to 19.2. - [Release notes](https://github.com/pypa/pip/releases) - [Changelog](https://github.com/pypa/pip/blob/main/NEWS.rst) - [Commits](pypa/pip@19.0.2...19.2) --- updated-dependencies: - dependency-name: pip dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] <support@github.com> * Update test-requirements.txt * Update test-requirements.txt * Update minimum_test_requirements.txt * Update release_notes.rst Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Gaurav Sheni <gvsheni@gmail.com> Co-authored-by: Roy Wedge <roy.wedge@alteryx.com> * Add dataframe_type property to EntitySet (#1473) * add dataframe_type property * remove _get_entityset_type * update if not pandas entityset checks in tests * add docstring to dataframe_type * update release notes * rework dataframe_type logic * add test cases * use dataframe_type in more tests * remove some unused ks imports * more test updates * fix faulty comparison in tests * v0.25.0 (#1485) * bump version number * update release notes * Update latest dependencies (#1487) * Update latest dependencies (#1499) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Fix docs to avoid logging demos (#1498) * set testing header to prevent logging * add library to url * release notes * release notes Co-authored-by: Gaurav Sheni <gvsheni@gmail.com> * Update latest dependencies (#1500) * Update latest dependencies (#1502) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update latest dependencies (#1503) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Add replace_inf_values util function (#1505) * add replace_inf_values util function * update release notes * fix release notes * add optional columns parameter to function * lint fix * Test compatibility with upcoming pandas release 1.3.0 (#1492) * update requirements * comment at local error * fix test_transform error * fix boolean conversion error * remove requirements change * fix timezone warning * fix astype warning and use view * Add release note * Update latest dependencies (#1520) Co-authored-by: github-actions[bot] 
<41898282+github-actions[bot]@users.noreply.github.com> * Add URL and Email Address primitives (#1508) * Update latest dependencies (#1524) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Primitive options include entities overrides exclude entities (#1518) * update ignore_entity for primitive * update variable_filter to return True if entity in include_entities * update release notes * Update TLD list and add license for email file (#1531) * add license to primitive data * update TLD list * update release notes * typo * update TLD list * v0.26.0 (#1525) * bump version * update release notes * make underline longer * alphabetize contributors * Update docs/source/release_notes.rst * Update docs/source/release_notes.rst Co-authored-by: Gaurav Sheni <gvsheni@gmail.com> Co-authored-by: Gaurav Sheni <gvsheni@gmail.com> * Update latest dependencies (#1534) * uncomment future release * replace target_entity in a few tests * delete test_entity.py again * fix include_over_exclude test * put Fixes section back in the changelog Co-authored-by: Gaurav Sheni <gvsheni@gmail.com> Co-authored-by: machineFL <49695056+machineFL@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Nate Parsons <4307001+thehomebrewnerd@users.noreply.github.com> Co-authored-by: Jeff Hernandez <12969559+jeff-hernandez@users.noreply.github.com> Co-authored-by: Frances Hartwell <frances.hartwell@alteryx.com> Co-authored-by: Tamar Grey <64278226+tamargrey@users.noreply.github.com> Co-authored-by: Ethan Tu <34871276+tuethan1999@users.noreply.github.com> * fix entityset tests (#1548) * Update DFS page to use Woodwork (#1557) * update dfs doc page * remove old rst file * clear outputs and remove outdated entity reference * Update Feature Primitives page to use woodwork (#1556) * update primitives docs 
* revert graphs * update references * add link to woodwork type and tag guide * Refactor add_interesting_values to leverage Woodwork (#1550) * refactor add_interesting_values * update getting semantic tags * update verbose msg * move 0 check * use append instead of concat * Update calculate feature matrix to work with Woodwork (#1533) * update dfs primitive matching * work on test_deep_feature_synthesis tests * work on dfs tests * fix more tests * exclude foriegn key cols from transform feats * fix Trend * more test updates * more test work * remove files * fix dfs to match old features * lint fix * remove old print statement * more naming updates * update handling of foreign key columns * lots of naming updates * fix test names * even more naming updates * more cleanup and test fixes * rename return_variable_types to return_types * initial exploration * work on tests in test_feature_set_calculator * fix test_feature_set.py * several more test fixes * fix more tests * fix topn test and lint fix * fix latlong tests * fix int_es tests * Revert "fix int_es tests" This reverts commit ae4d5a6. 
* update synthesis tests
* lint fix
* fix primitive tests
* lint fix
* replace usage of entity
* lint fix
* remove more references to entities
* update variable naming
* update nmostcommon related tests
* update cfm docstring
* pr updates
* sub_dataframe -> sub_dataframe_name
* update dask test to use null val
* add new test for n_most_common
* update verbose msg
* fix typo
* wording updates
* dataframe_df_trie -> dataframe_trie
* Update handling time guide to use Woodwork (#1552)
* add handling time guide notebook with woodwork
* clean up notebook to use woodwork
* fix cfm lines and images
* suppress cells
* rename notebook and remove old rst file
* use reST format for raw cells
* update func link
* PR comments
* hide cells with nbsphinx
* PR comments
* Synthesis test fixes (#1580)
* update entityset time_type value
* fix two edge cases with encode features
* Fix remaining broken primitive tests (#1568)
* fix test_direct_features
* initial work on test_agg_feats
* fix test_groupby_transform_primitives
* fix test_features_serializer.py
* some work to fix dask tests
* bump min reqs
* Preserve EntitySet Woodwork schemas on pickling (#1581)
* fix entityset pickle to preserve woodwork schema
* add tests
* lint
* add cluster fixture and use constant for schema key
* use update instead of resetting _dict_
* update fixture name
* Update advanced custom primitives guide to use Woodwork (#1587)
* update custom primitives guide
* use woodwork specific typing info
* add link to ww typing guide and refence ColumnSchema
* Update Deployment docs page (#1588)
* update deployement.rst to ipynb
* add function links
* update link to rst instead of html
* Update Improving Performance Guide (#1591)
* update performance guide to use jupyter notebook and woodwork
* remove entity references and replace entity set with entityset
* Update Using Dask EntitySets Guide (#1590)
* update using Dask EntitySets guide
* PR feedback updates
* Update Specifying Primitive Options guide for Woodwork (#1593)
* update specifying primitive options doc
* clear notebook outputs
* update indentation
* Add Woodwork Typing Guide (#1589)
* create ww typing guide and start writing
* flesh out semantic tags section
* clean up ww types guide
* move to getting started
* revamp guide
* shorten guide
* add links
* pare down and proofread
* replace table usage with DataFrame
* PR comments
* Add to getting started rst
* rework semantic tags section
* Add release note
* clean up wording
* pr comments
* fix typo
* Update api reference to match new api (#1600)
* Updates for WW 0.6.0 and fix other failing tests (#1597)
* update requirements
* fix selection tests
* fix test_ww_es.py
* fix logical type comparisons
* fix test_encode_features.py
* fix test_deep_feature_synthesis.py
* fix test_feature_set_calculator.py
* fix test_dask_es.py
* fix test in test_es.py
* fix selection test
* add test skips
* update requirements
* update requirements
* update dask reqs
* more requirements updates
* bump pandas min version
* bump koalas version
* eliminate _schemas_equal func
* fix post merge issues
* fix release notes
* Update index doc to use Woodwork (#1602)
* Create notebook and add contents from index.rst
* walk through cells and update to use markdown and woodwork
* hide and format raw cells
* remove index rst file
* Pr comments
* Fix DFSTransformer Documentation (#1605)
* update featuretools-sklearn-transformer to install from branch
* update version to 1.0.0.dev0
* Update feature description guide to use Woodwork (#1603)
* create feature description notebook and move contents from rst file
* superficial updating of code and language
* make sure outputs are as expected and that language makes more sense for woodwork
* format links and headers
* remove comments and hide cell
* remove rst file
* PR comments
* reword warning about getitem usage
* Update Koalas Guide to use Woodwork (#1604)
* Add koalas guide notebook and add contents from rst file
* hide cell and show rst dropdown in metadata
* update to use woodwork language
* clean up
* remove rst file
* change varable_name to column_name
* PR Comments
* fix link to woodwork guide
* remove extra spaces
* Update Glossary with Woodwork terms (#1608)
* update glossary page
* add logical type and semantic tags to glossary
* small updates and add ColumnSchema
* remove old todos
* woodwork's column -> woodwork column's
* Update tuning dfs guide to use Woodwork (#1610)
* Move contents of rst file into jupyter notebook
* get code running
* update wording to use woodwork
* clean up
* Proof read
* remove rst file
* use lower case feature
* update nlp-primitives requirement (#1609)
* Remove more references to entity, entities, variable, var (#1612)
* update flight.py
* update retail.py
* update entity and entityset wording
* more entity updates
* rename variable to colum
* lint fix
* rename var to col
* remove variable usage
* more wording updates
* remove graph variable types related code
* add comment back
* Fix small formatting issues around Woodwork docs (#1607)
* fix code in docstrings
* Fix dataframes dict formatting in docstrings
* fix links in handling time
* fix link in primitives doc
* fix link to ww guide in advanced custom primitives
* fix linkss from rreferencing rst files to notebooks
* fix add lti docstring
* use ref anchor instead of doc
* fix formatting
* fix typo
* lint fix
* remove variables doc and reference to variables (#1629)
* Remove categorical encoding library and CI test (#1632)
* remove categorical_encoding
* update release notes
* Remove autonormalize add-on library and CI test (#1636)
* Update install_test.yml
* Update install.rst
* Update setup.py
* Update release_notes.rst
* Update dev-requirements.txt
* remove faq autonormalize q
* Update release_notes.rst
* Remove tsfresh, nlp_primitives, sklearn_transformer add-on library and CI test (#1638)
* remove add ons
* Update dev-requirements.txt
* Update release_notes.rst
* remove docs DFStransformer
* Update api_reference.rst
* Use make index to re-create index on new DataFrame in EntitySet.replace_dataframe (#1630)
* Add ability to create index at updat_dataframe
* change update_dataframe to replace_dataframe
* update docstrings
* expand docstring
* fix release notes
* dont raise warning if index is present and split test
* Revert changes to Equal and NotEqual primitives (#1640)
* update flight.py
* update retail.py
* revert changes to Equal and NotEqual primitives
* Update Feature Selection Page with Woodwork Dataframe (#1618)
* Changes to the feature selection doc
* cleared outputs
* Fixed woodwork initialization issues
* update docs
* clear notebook
* stop skipping correlated check
* test woodwork init in highly correlated
* update release notes
* PR comments
* add note about ww init to docstring
* change to rst cell
Co-authored-by: Tamar Grey <tamar.grey@alteryx.com>
* Merge in latest from main branch (#1643)
* Specify conda channel and Windows exe in graphviz installation instructions (#1611)
* Update latest dependencies (#1615)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update latest dependencies (#1616)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Remove GA from documentation layout html (#1622)
* Update layout.html
* Update release_notes.rst
* v0.26.2 (#1628)
* v0.26.2
* Update release_notes.rst
Co-authored-by: Gaurav Sheni <gvsheni@gmail.com>
* update release notes
Co-authored-by: Gaurav Sheni <gvsheni@gmail.com>
Co-authored-by: machineFL <49695056+machineFL@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Pranav Simha <pranav0521@gmail.com>
* Update to use Woodwork 0.7.1 (#1648)
* update requirements to 0.7.0
* Add comments and use full schema init
* switch back to partial schema init with unique index bug and test mismatched index bug fix
* remove redundant dataframe operations in replace_dataframe
* init in last time index with partial schema
* use AgeFractional primitive
* update requirements.txt
* update min requirements
* PR comments
* Update CumCount primitive (#1651)
* update CumCount primitive
* use IntegerNullable return type
* update release notes
* Create features from Woodwork columns (#1582)
* add reference
* track dataframe name
* refactor feature base
* fix refactor
* remove extra line
* update api for identity feature
* use api for identity feature
* refactor api for feature
* refactor api for feature class
* refactor api for feature class
* fix syntax
* add es ref to df
* fix feature base init arg
* fix syntax
* fix transform super init df
* refactor to private method
* refactor api for identity feature
* fix api for direct feature
* add references when updating df
* refactor api for feature
* store entityset reference keys in metadata
* refactor feature check
* check feature in groupby transform
* add references after last time index
* refactor api for feature
* update entityset ref
* refactor api for feature
* lint fix
* use global es ref
* use _validate_base_features
* update feature base docstring
* reference column and foreign key
* remove double call of feature
* use column reference
* update notebook cells
* update notebook cell
* add release notes entry (include missing)
* Update FAQ with Woodwork (#1649)
* update cells to new api
* use new api
* add question to faq
* use woodwork references
* update comments
* remove usage of entity and entities
* fix typo
* update faq question
* fix bullet points
* fix link
* fix grammar
* shorten sentence
* update answer for dask df
* update to numeric and boolean values
* remove then
* include count in agg and where primitives
* update sentence
* reorder cells
* fix grammar
* add comment about semantic tags
* add placeholder link
* update error message
* clarify sentence
* update comments
* fix grammar
* update comments
* add link
* fix link
* add release note entry
* Add missing release notes (#1663)
* Add pr number for breaking changes
* start adding missing release notes
* finish adding missing release notes
* PR Comments
* Merge updates from main (#1666)
* Specify conda channel and Windows exe in graphviz installation instructions (#1611)
* Update latest dependencies (#1615)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update latest dependencies (#1616)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Remove GA from documentation layout html (#1622)
* Update layout.html
* Update release_notes.rst
* v0.26.2 (#1628)
* v0.26.2
* Update release_notes.rst
Co-authored-by: Gaurav Sheni <gvsheni@gmail.com>
* Remove add on libraries from install docs, setup.py and CI tests (#1644)
* remove add ons
* update release notes
* fix dev req
* fix api ref
* add scikit-learn to dev reqs
* Update latest dependency checker with proper install comment (#1652)
* Update latest_dependency_checker.yml
* Update release_notes.rst
* Update release_notes.rst
* Update latest_dependency_checker.yml
* Isort 5 (#1654)
* Update isort requirement
* no more isort --recursive (deprecated)
* Update release_notes.rst
* Update latest dependencies (#1653)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gaurav Sheni <gvsheni@gmail.com>
* Update primitives loading to be more robust (#1662)
* Update primitives loading to be more robust
* Emit warning when primitives entry point throws error
* Prevent overwriting of names in primitive namespace
* Update release notes
* Add no cover to entry_point loop
* v0.27.0 (#1665)
* bump version
* update release notes
* add contributor
* fix release notes
* fix release notes
Co-authored-by: Gaurav Sheni <gvsheni@gmail.com>
Co-authored-by: machineFL <49695056+machineFL@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Pranav Simha <pranav0521@gmail.com>
Co-authored-by: Roy Wedge <roy.wedge@alteryx.com>
Co-authored-by: David Sanders <david.sanders@alteryx.com>
Co-authored-by: Jeff Hernandez <12969559+jeff-hernandez@users.noreply.github.com>
* Add Version 1.0 Transition Guide (#1627)
* update flight.py
* update retail.py
* start working on guide
* move guide to resources
* more transition guide work on entitysets
* update primitives and other info sections
* update feature section
* update dfs and cfm section
* final draft updates
* fix various spelling and capitalization issues
* improve wording and hide cell
* more pr clean up
* additional context
* relationship wording
* various PR fixes and additions
* Update docs/source/resources/transition_to_ft_v1.0.ipynb
Co-authored-by: Gaurav Sheni <gvsheni@gmail.com>
* Update docs/source/resources/transition_to_ft_v1.0.ipynb
Co-authored-by: Jeff Hernandez <12969559+jeff-hernandez@users.noreply.github.com>
* update link
* move mapping and add link
* update release notes
* update why make these changes
* update what has changed section
* remove old comment
* add table of significant changes
* remove blank line
* remove links
* remove code formatting
* fix table
* uncomment code
* remove code formatting from tables
Co-authored-by: Gaurav Sheni <gvsheni@gmail.com>
Co-authored-by: Jeff Hernandez <12969559+jeff-hernandez@users.noreply.github.com>
* update release notes
* update release notes
* Calculate feature matrix returns woodwork dataframe (#1664)
* initialize ww on feature matrix
* lint
* add origin attribute
* init ww on each partial feature matrix; update test answers
* update primitive return logical types
* fix some approximate tests
* fix for Koalas append issue
* bin cutoff times expects ww dataframe
* make helper and fix concat in parallel_calcluate_chunks
* make a copy of semantic tags before modifying
* various test fixes + lint fix + revert to_pandas changes
* copy semantic tags
* fix test_boolean_multiply
* fix test_koalas_dfs.py
* make CumCount intergernullable
* add more to_pandas casts to test
* fix test_no_data_for_cutoff_time
* lint fix
* init ww on encoded feature matrix
* lint fix
* use None return type for NMostCommon
* fix test_init_and_name
* swap possible input order of PerentTrue to fix test
* make answer data a dataframe for easier comparison
* fix dask single table test
* fix test_transform_consistency
* fix test_dask_entityset_secondary_time_index
* fix test_approximate_features test
* fix test_features_only
* fix category dtype check
* new s3 urls for serialized objects
* update to Week ordinal to account for 53 week years
* add docstring and default args to get_ww_types
* update woodwork syntax in get_ww_types_from_features
* ordinal prims: fix order, specify order param
* calculate_chunk: single concat and then init ww
* update release notes
* fix range in hour primitive
* encode_features: use defaults in get_ww_types_from_features
* update label leakage example in docs faq
* make IsWeekend return type BooleanNullable
* enable dask test for test_concat_with_lti
* remove skip from koalas test
* include labels in test feature matrix
Co-authored-by: Nate Parsons <nate.parsons@alteryx.com>
* Fix typos in transition guide (#1672)
* fix typos in transition guide
* Add release note
* Fix foreign_key tag bug (#1675)
* Add banner to all docs pages about upcoming 1.0 release (#1669)
* add banner about FT1.0
* update release notes
* Update docs/source/templates/layout.html
Co-authored-by: Gaurav Sheni <gvsheni@gmail.com>
* update banner message
* update transition guide link
* update wording
Co-authored-by: Gaurav Sheni <gvsheni@gmail.com>
* v0.27.1 (#1671)
* v0.27.1
* update setup.py
* fix foreign key tag bug
* update release notes
Co-authored-by: Gaurav Sheni <gvsheni@gmail.com>
* Merge code from main removing old code (#1679)
* Automated Latest Dependency Updates (#1673)
* Remove old categorical code (#1677)
* remove old code
* update release notes
* spelling error
* fix merge issue
* reorg release notes
* lint fix?
* move future release
Co-authored-by: machineFL <49695056+machineFL@users.noreply.github.com>
* Remove unused utility functions (#1683)
* remove unused _dataframes_equal function
* remove unused camel_to_snake func
* update release notes
* lint fix
* Update WW version to 0.8.0 (#1689)
* bump woodwork requirement
* update release notes
* update other requirements files
* Encode features - remove typecasting loop now handled by ww (#1694)
* remove redundant coercion; woodwork init will cover this
* update release notes
* Update DFS to not build features on last time index columns (#1695)
* bump woodwork requirement
* don't build features on lti columns
* update release notes
* skip lti col in _add_identity_features instead
* Review comments and commented code and clean up (#1701)
* initial comment clean up
* more clean up
* update release notes
* spelling fix
* fix vlaues
* Encode Features - prefer runtime over space if not inplace (#1699)
* encode feats - concat once and skip drop if not inplace
* use existing ww schema to skip infer on unchanged columns
* update release notes
* request the columns dictionary once
* Bump Woodwork min version to 0.8.1 (#1702)
* bump ww min version to 0.8.1
* update release notes
* fix koalas file
* combine release notes sections (#1703)
* fix README dfs param
Co-authored-by: Jeff Hernandez <12969559+jeff-hernandez@users.noreply.github.com>
Co-authored-by: Roy Wedge <roy.wedge@alteryx.com>
Co-authored-by: Tamar Grey <64278226+tamargrey@users.noreply.github.com>
Co-authored-by: Gaurav Sheni <gvsheni@gmail.com>
Co-authored-by: machineFL <49695056+machineFL@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Frances Hartwell <frances.hartwell@alteryx.com>
Co-authored-by: Ethan Tu <34871276+tuethan1999@users.noreply.github.com>
Co-authored-by: Pranav Simha <pranav0521@gmail.com>
Co-authored-by: Tamar Grey <tamar.grey@alteryx.com>
Co-authored-by: David Sanders <david.sanders@alteryx.com>
Closes #1578