New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle non-string column names #255

Merged
merged 6 commits into from Sep 20, 2018

Conversation

Projects
None yet
3 participants
@Seth-Rothschild
Contributor

Seth-Rothschild commented Sep 13, 2018

close #254

This is an implementation of option 2 from #254 where we raise an error whenever we see a non str column name. There is also unit test with an integer column name to see that the error is raised.

Since this is one of a few possible solutions, it's marked as WIP until we determine which path we want to follow.

@Seth-Rothschild

This comment has been minimized.

Contributor

Seth-Rothschild commented Sep 13, 2018

Removing WIP. Note that the main check

if not isinstance(c, (''.__class__, u''.__class__)):

is one of many ways we might want to check for strings in both Python 2 and 3. @rwedge has suggested that we have used

if type(ret) != str:
    ret = ret.encode("utf-8")

elsewhere in Featuretools to ensure that we're consistently encoding things. This solution used fewer lines of code than a try/except block, but I don't have a strong preference for the way we should be doing this.

@Seth-Rothschild Seth-Rothschild changed the title from (WIP) Handle non-string column names to Handle non-string column names Sep 13, 2018

Seth-Rothschild added some commits Sep 13, 2018

@Seth-Rothschild Seth-Rothschild force-pushed the str-column-names branch from 4d69008 to cc846de Sep 20, 2018

@@ -1211,6 +1211,9 @@ def _import_from_dataframe(self,
r.child_entity.id == entity_id]
for c in dataframe.columns:
if not is_string(c):
raise ValueError("All column names must be strings. (Column has name {})".format(c))

This comment has been minimized.

@kmax12

kmax12 Sep 20, 2018

Member

Change language to "All column names must be strings (Column {} is not a string)"

@codecov-io

This comment has been minimized.

codecov-io commented Sep 20, 2018

Codecov Report

Merging #255 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #255      +/-   ##
==========================================
+ Coverage   94.15%   94.15%   +<.01%     
==========================================
  Files          71       71              
  Lines        7644     7652       +8     
==========================================
+ Hits         7197     7205       +8     
  Misses        447      447
Impacted Files Coverage Δ
featuretools/entityset/entityset.py 93.63% <100%> (+0.02%) ⬆️
featuretools/tests/entityset_tests/test_es.py 99.35% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6306975...b7c0062. Read the comment docs.

@kmax12

This comment has been minimized.

Member

kmax12 commented Sep 20, 2018

Looks good. Merging

@kmax12 kmax12 merged commit bb53faf into master Sep 20, 2018

2 checks passed

ci/circleci Your tests passed on CircleCI!
Details
license/cla Contributor License Agreement is signed.
Details

@Seth-Rothschild Seth-Rothschild deleted the str-column-names branch Sep 20, 2018

@kmax12 kmax12 referenced this pull request Sep 28, 2018

Merged

v0.3.1 #270

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment