Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added methods Table.with_relabeling and Table.with_column #126

Merged
merged 5 commits into from
Oct 14, 2015
Merged

Conversation

ericz82
Copy link
Contributor

@ericz82 ericz82 commented Oct 12, 2015

@SamLau95
Copy link
Member

Cool! Couple things:

  • Let's add documentation and tests for these methods
  • Let's make these methods also be able to take in a sequence as well as an individual value. (Eg. with_relabeling(['A', 'B', 'C'], ['a', 'b', 'c']) should relabel all the columns specified.)

@ericz82
Copy link
Contributor Author

ericz82 commented Oct 12, 2015

How do I add tests?

@mDibyo
Copy link
Contributor

mDibyo commented Oct 12, 2015

you can add tests at tests/test_tables.py

@@ -395,6 +395,13 @@ def append_column(self, label, values):
self._columns[label] = values
return self

def with_column(self, label, values):
"""Returns a new table with new column included."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring is rather vague.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what does it add above append_column? If this is just to make a copy, we should rather implement table.copy().

@ericz82
Copy link
Contributor Author

ericz82 commented Oct 13, 2015

Should I change Table.relabel to also take in a sequence of arguments?

@stefanv
Copy link
Contributor

stefanv commented Oct 13, 2015

@ericz82 I think it already does (according to the example you gave). It should simply be documented that way.

@ericz82
Copy link
Contributor Author

ericz82 commented Oct 13, 2015

@stefanv Hmm, I'm not sure what example you are referring to. I wrote a test for Table.relabel taking in a list for column_label and a list for new_label which does not seem to be passing. I believe Table.relabel should be changed accordingly.

@stefanv
Copy link
Contributor

stefanv commented Oct 13, 2015

Ah! But there you go. You should just rework relabel to have the functionality you want, instead of adding a whole new method.

@ericz82
Copy link
Contributor Author

ericz82 commented Oct 13, 2015

Yeah, that's what I decided on doing. Everything should be good now.

Table.relabel
Table.with_relabeling
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These belong in the Transformation section because they return a new table instead of mutating the original!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SamLau95 Should I also change the locations of these methods in tables.py?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Please do that.

@stefanv
Copy link
Contributor

stefanv commented Oct 13, 2015

So, API question: is the new method added here then equivalent to:

t1= t0.copy()
t1.relabel(['a','b','c'], ['A','B','C'])

? If so, is it worth expanding the API? (I have no idea how common such an operation is.)

if isinstance(column_label, str) and isinstance(new_label, str):
column_label, new_label = [column_label], [new_label]
d = dict(zip(column_label, new_label)) # dictioanry to map old labels to new ones
for col in d.keys():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call this dictionary old_to_new instead of d since it'll help someone who's reading this for the first time. Also, dictionaries iterate over their keys by default so you can more succinctly write for col in old_to_new:.

nit: You misspelled dictioanry!

@SamLau95
Copy link
Member

@stefanv I think it is worth expanding the API in this case because method chaining is much easier with these two methods. For example, this is now possible:

table.select(['x', 'y'])\
    .with_column('z', 0)\
    .with_relabeling(['x', 'y', 'z'], ['a', 'b', 'c'])\
    .group(...) # etc.

In addition, I believe we're going to deprecate the mutation methods (or at least not teach them) in favor of these new ones because of the issues we face with students running cells multiple times. See #114

@stefanv
Copy link
Contributor

stefanv commented Oct 13, 2015

@SamLau95 Thanks, that makes sense. I think there are a couple of sneaky places in which data mutation can still accidentally happen. If this is a decided policy (never to mutate), then we can start hunting for those.

@SamLau95
Copy link
Member

@stefanv I'm not sure if it's a decided policy but it definitely helps a lot with reasoning about tables, especially for students that are just starting to learn programming. Originally we were fine with mutation but I think the de facto policy now is just to replace mutation with a pure function alternative as much as we can.

@SamLau95
Copy link
Member

Thanks for the hard work on this @ericz82 !

SamLau95 added a commit that referenced this pull request Oct 14, 2015
added methods Table.with_relabeling and Table.with_column
@SamLau95 SamLau95 merged commit 7369d60 into master Oct 14, 2015
@SamLau95 SamLau95 deleted the eric/docs branch October 14, 2015 02:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants