Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle non-string column names #255

Merged
merged 6 commits into from Sep 20, 2018
Merged

Handle non-string column names #255

merged 6 commits into from Sep 20, 2018

Conversation

Seth-Rothschild
Copy link
Contributor

close #254

This is an implementation of option 2 from #254 where we raise an error whenever we see a non str column name. There is also unit test with an integer column name to see that the error is raised.

Since this is one of a few possible solutions, it's marked as WIP until we determine which path we want to follow.

@Seth-Rothschild
Copy link
Contributor Author

Removing WIP. Note that the main check

if not isinstance(c, (''.__class__, u''.__class__)):

is one of many ways we might want to check for strings in both Python 2 and 3. @rwedge has suggested that we have used

if type(ret) != str:
    ret = ret.encode("utf-8")

elsewhere in Featuretools to ensure that we're consistently encoding things. This solution used fewer lines of code than a try/except block, but I don't have a strong preference for the way we should be doing this.

@Seth-Rothschild Seth-Rothschild changed the title (WIP) Handle non-string column names Handle non-string column names Sep 13, 2018
@@ -1211,6 +1211,9 @@ def _import_from_dataframe(self,
r.child_entity.id == entity_id]

for c in dataframe.columns:
if not is_string(c):
raise ValueError("All column names must be strings. (Column has name {})".format(c))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change language to "All column names must be strings (Column {} is not a string)"

@codecov-io
Copy link

codecov-io commented Sep 20, 2018

Codecov Report

Merging #255 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #255      +/-   ##
==========================================
+ Coverage   94.15%   94.15%   +<.01%     
==========================================
  Files          71       71              
  Lines        7644     7652       +8     
==========================================
+ Hits         7197     7205       +8     
  Misses        447      447
Impacted Files Coverage Δ
featuretools/entityset/entityset.py 93.63% <100%> (+0.02%) ⬆️
featuretools/tests/entityset_tests/test_es.py 99.35% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6306975...b7c0062. Read the comment docs.

@kmax12
Copy link
Contributor

kmax12 commented Sep 20, 2018

Looks good. Merging

@kmax12 kmax12 merged commit bb53faf into master Sep 20, 2018
@Seth-Rothschild Seth-Rothschild deleted the str-column-names branch September 20, 2018 19:33
@kmax12 kmax12 mentioned this pull request Sep 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Integer column names in dataframe
3 participants