Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP/ENH: adding support for categorial factors #527

Closed
wants to merge 2 commits into from

Conversation

dengemann
Copy link
Contributor

This relates to our recent discussion on the mailing list

  • added privat _recode function that internally recodes the x factor
  • added additional positional argument, a dict that allows the user to specify the remapping done by _recode
  • i would have prefered a kwarg but this however messes up the ax.plot below. The options i see whithin this approach are a) allowing users to optinoally pass a tuple like (x, x_levels). so no additional positional argument is required and b) explicitly passing a dict with plotting parameters instead of **kwargs

Wdyt?

- added privat _recode function that internally recodes the x factor
- added additional positional argument, a dict that allows the user to specify the remapping done by _recode
- i would have prefered a kwarg but this however messes up the ax.plot below. The options i see whithin this approach are a) allowing users to optinoally pass a tuple like (x, x_levels). so no additional positional argument is required and b) explicitly passing a dict with plotting parameters instead of **kwargs

Wdyt?
@@ -4,7 +4,7 @@
import utils


def interaction_plot(x, trace, response, func=np.mean, ax=None, plottype='b',
def interaction_plot(x, trace, response, x_levels, func=np.mean, ax=None, plottype='b',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we okay with adding args like this? I don't much mind, but it breaks backwards compatibility.

@jseabold
Copy link
Member

Do you think we really need the x_levels argument? Couldn't we just check the dtype in the plot and call _recode with some default levels e.g., range(n_unique)? Thoughts?

@dengemann
Copy link
Contributor Author

On 24.10.2012, at 23:21, Skipper Seabold notifications@github.com wrote:

Hi,

Do you think we really need the x_levels argument?

would be happy to drop it -- feels unnatural to me. although on the other hand side it's really explicit --- less magic

Couldn't we just check the dtype in the plot and call _recode with some default levels e.g., range(n_unique)? Thoughts?

yes, makes sense. the thing that made me hesitate with something like this was that users might associate certain levels with 'hierarchical' meaning, which might not always be obvious from looking at the data. Fo instance you might want to put your control condition on zero (left) and your treatment condition on one (right).
makes sense?

D


Reply to this email directly or view it on GitHub.

@jseabold
Copy link
Member

Sure. We can update the ticklabels with the categories though, so this may alleviate some of this - they'll never see the levels. Now if you really want to control treatment on left, etc. you might be better off rolling your own plot?

@dengemann
Copy link
Contributor Author

On 25.10.2012, at 00:02, Skipper Seabold notifications@github.com wrote:

Sure. We can update the ticklabels with the categories though, so this may alleviate some of this - they'll never see the levels.

yes, sure -- setting ticklabels from categorials would rock
Now if you really want to control treatment on left, etc. you might be better off rolling your own plot?

point taken.

Reply to this email directly or view it on GitHub.

@josef-pkt
Copy link
Member

just a generic comment:

It takes me 5 minutes to understand what the argument names mean, even with reading the doc string.

@dengemann
Copy link
Contributor Author

indeed, something got messed up in the doc string.

i'll update the commit in the course of the next days to reflect the current state of the discussion.
thanks!

On 25.10.2012, at 18:55, Josef Perktold notifications@github.com wrote:

just a generic comment:

It takes me 5 minutes to understand what the argument names mean, even with reading the doc string.


Reply to this email directly or view it on GitHub.

@dengemann
Copy link
Contributor Author

... or did you refer to the arg names in general (pre-commit)?

On 25.10.2012, at 18:55, Josef Perktold notifications@github.com wrote:

just a generic comment:

It takes me 5 minutes to understand what the argument names mean, even with reading the doc string.


Reply to this email directly or view it on GitHub.

@josef-pkt
Copy link
Member

in general, I think already before your changes.
Mainly I didn't understand what "trace" means, why we have a letter x, but y is "response"

(factor1, factor2, response)
(x1, x2_levels, response)
(endog, exog, groups)

in general: x1 could be continuous if we have continuous-categorical interaction.

I'm reading the function completely out of context and never tried it, so it's not obvious to me what this means, except for the basic doc string example.

I don't have a comment about the pull request directly, since I haven't figured out the levels and labels yet. (busy with other things.)

@dengemann
Copy link
Contributor Author

Closing this one, continued on clean PR.

@dengemann dengemann closed this Nov 14, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants