New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure relationship variables are the same underlying dtype #155

Merged
merged 5 commits into from May 24, 2018

Conversation

Projects
None yet
3 participants
@bschreck
Contributor

bschreck commented May 24, 2018

If a parent entity has an index with integer data, and the linking variable in the child has object/string data, this will convert the child variable to integers. This is necessary for the new version of pandas (0.23), because merges on mixed dtypes are not allowed.

bschreck added some commits May 24, 2018

@bschreck bschreck requested a review from rwedge May 24, 2018

@codecov-io

This comment has been minimized.

codecov-io commented May 24, 2018

Codecov Report

Merging #155 into master will increase coverage by 0.03%.
The diff coverage is 93.75%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #155      +/-   ##
==========================================
+ Coverage   92.59%   92.62%   +0.03%     
==========================================
  Files          72       72              
  Lines        7722     7729       +7     
==========================================
+ Hits         7150     7159       +9     
+ Misses        572      570       -2
Impacted Files Coverage Δ
featuretools/tests/testing_utils/mock_ds.py 87.21% <100%> (+0.19%) ⬆️
featuretools/entityset/entity.py 90.8% <100%> (ø) ⬆️
featuretools/tests/entityset_tests/test_es.py 98.9% <100%> (+3.24%) ⬆️
featuretools/entityset/base_entityset.py 90.76% <87.5%> (-0.26%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ee28119...2ab6766. Read the comment docs.

if ((is_object_dtype(parent_e.df[parent_v]) or
is_string_dtype(parent_e.df[parent_v])) and
is_numeric_dtype(child_e.df[child_v])):
parent_e.df[parent_v] = pd.to_numeric(self[parent_e.id].df[parent_v])

This comment has been minimized.

@rwedge

rwedge May 24, 2018

Contributor

The self[parent_e.id] here can be changed to parent_e.df

target_entity="sessions",
agg_primitives=["mean", "sum", "mode"],
trans_primitives=["month", "hour"],
max_depth=2)

This comment has been minimized.

@rwedge

rwedge May 24, 2018

Contributor

A little weird that this test has no assertions. Maybe assert that the normalize successfully converted both parent and child values to matching types? Or maybe the test should go in test_es:test_add_relationship_converts_types

bschreck and others added some commits May 24, 2018

@bschreck bschreck merged commit b75bc92 into master May 24, 2018

2 checks passed

ci/circleci Your tests passed on CircleCI!
Details
license/cla Contributor License Agreement is signed.
Details

@rwedge rwedge referenced this pull request May 30, 2018

Merged

v0.1.21 #160

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment