Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Corrected Issue with normalized time index and updated testing #768

Merged
merged 26 commits into from
Oct 17, 2019

Conversation

ablacke-ayx
Copy link
Contributor

Pull Request Description

Fixes #750 - Stop user from removing time index from base_entity using the additional_variables


After creating the pull request: in order to pass the changelog_updated check you will need to update the "Future Release" section of docs/source/changelog.rst to include this pull request.

@CLAassistant
Copy link

CLAassistant commented Oct 9, 2019

CLA assistant check
All committers have signed the CLA.

@codecov
Copy link

codecov bot commented Oct 9, 2019

Codecov Report

Merging #768 into master will increase coverage by <.01%.
The diff coverage is 91.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #768      +/-   ##
==========================================
+ Coverage   97.97%   97.97%   +<.01%     
==========================================
  Files         119      119              
  Lines       10772    10779       +7     
==========================================
+ Hits        10554    10561       +7     
  Misses        218      218
Impacted Files Coverage Δ
featuretools/entityset/entityset.py 95.96% <100%> (+0.29%) ⬆️
featuretools/tests/entityset_tests/test_es.py 99.8% <85.71%> (-0.2%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 235f065...0cdea56. Read the comment docs.

make_time_index='signup_date',
additional_variables=['signup_date'],
copy_variables=[])
assert "signup_date" in es["customers"].df.columns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line will not get hit due to the previous one raising an error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

??

Copy link
Contributor

@rwedge rwedge Oct 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The normalize_entity call will raise an error, which stops code execution within the with statement. If we wanted the final assert statement to be run after the error to confirm we didn't remove the signup date column, we would need to move the assert out of the with pytest.raises block


for v in additional_variables:
if v == base_entity.time_index:
raise ValueError("Not moving {} as it is the base time index variable.".format(v))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's suggest ways the user can resolve the error, either by setting make_time_index to True so it copies the base entity's time index by default, or by including the variable in copy_variables

@@ -13,7 +14,7 @@ Changelog
* Testing Changes

Thanks to the following people for contributing to this release:
:user:`kmax12`, :user:`rwedge`
:user:`kmax12`, :user:`rwedge`,:user:`ablacke-ayx`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a space between the comma and your user tag

Copy link
Contributor

@rwedge rwedge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you commit a version of the frequently asked questions notebook with the cell outputs cleared?

It's easier to track changes to the contents of the cells when the output is removed.

@ablacke-ayx
Copy link
Contributor Author

ablacke-ayx commented Oct 11, 2019 via email

@@ -238,7 +234,7 @@
" new_entity_id=\"sessions\",\n",
" index=\"session_id\",\n",
" make_time_index=\"session_start\",\n",
" additional_variables=[\"session_start\"])\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is ok to use additional_variables here since in this example "session_start" is not the time index of the base entity

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right.

if isinstance(make_time_index, str):
if make_time_index not in base_entity.df.columns:
raise ValueError("'make_time_index' must be a variable in the base entity")
elif make_time_index not in additional_variables + copy_variables:
raise ValueError("'make_time_index' must specified in 'additional_variables' or 'copy_variables'")
elif make_time_index not in copy_variables:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change may prevent moving a column that is not the base entity's time index to the new entity and making it the new time index, which should be allowed

@@ -142,7 +142,7 @@ To finish preparing this dataset, create a "customers" entity using the same met
new_entity_id="customers",
index="customer_id",
make_time_index="join_date",
additional_variables=["zip_code", "join_date"])
additional_variables=["zip_code","join_date"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there should be a space after the comma on this line

additional_variables=[],
copy_variables=['cancel_date'])

es = copy.deepcopy(es)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes a copy of the newly modified entityset that has a cancellations entity

Copy link
Contributor

@rwedge rwedge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@ablacke-ayx ablacke-ayx merged commit c4c7f94 into master Oct 17, 2019
@rwedge rwedge mentioned this pull request Oct 31, 2019
@rwedge rwedge deleted the Issue#750 branch February 19, 2021 22:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Normalize entity bug: putting base entity time index in additional_variables
3 participants