data transformation from aes strings #188
Comments
#136 should also be fixed when working on this... |
I'm not familiar with what you're trying to do here, so hard to make recommendations :-) What's the intended semantics of a string like "a - np.max(a)" or "a + 1"? Is it supposed to be an arithmetic transformation, or a way to list several variables (like in patsy formulas) or...? If you want something that has the abstract structure of a formula -- i.e., a set of individual "terms", each of which is an "interaction" of "factors" -- then patsy is probably the way to go, and if you want to avoid categorical coding and such you can work with the lower level interfaces like ModelDesc directly, instead of using dmatrix. If these strings are supposed to be arithmetic operations, then the easiest approach is probably not to go through patsy's formula parser, and not to try transforming the code either, but instead just use Python's
Anyway, when we want to apply a transformation, we do
and we get the value of that expression, where variables are first looked for in ... Does any of this help? |
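The code snippets from the comment above did not survive on this page, so here is a minimal sketch of the general idea it describes: evaluating an expression string so that names resolve in the data first and in a captured environment second. It uses only Python's built-in `eval`. The helper `eval_in_data`, the dict-based `data`, and the `env` namespace are hypothetical names for illustration; this is not the code from the original comment, and it is not patsy's `EvalEnvironment`.

```python
import math


def eval_in_data(expr, data, env):
    """Evaluate expr with names resolved in `data` first, then in `env`.

    Hypothetical helper for illustration only. eval() looks names up in the
    locals mapping (data) before the globals mapping (env), which gives the
    "data shadows environment" behaviour.
    """
    # Copy env because eval() inserts "__builtins__" into the globals dict.
    return eval(expr, dict(env), data)


# Scalars keep the sketch dependency-free; the values could just as well be
# NumPy arrays or pandas columns, with `np` supplied through env.
data = {"a": 4.0}
env = {"math": math, "a": 100.0, "offset": 1.0}

# `a` resolves to the data value (4.0), `math` and `offset` come from env.
result = eval_in_data("math.sqrt(a) + offset", data, env)
print(result)
```

Since `eval` runs arbitrary code, this is only reasonable for strings written by the person making the plot, which is the case for aes strings.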
@njsmith the last para sounds like exactly what I want to do :-) Thanks a lot! |
reminder: look at #171 for examples which should work |
Data transformation in aes (`aes(x="np.log(column)")`) now uses patsy.eval.EvalEnvironment. This should enable things like `np.log(column)`. Closes: yhat#188
The changes also exposed a bug in the current unit tests: setting an aes mapping (`aes(fill=True)`) was considered equivalent to setting this value in the geom (`geom_density(fill=True)`). Now this will give the same weird result as in ggplot (if we had already implemented fill... -> yhat#191). The affected unit tests (test_basic.py, test_readme_examples.py) were changed.
Also implement `__deepcopy__()` for `aes` and `ggplot` so that the needed eval environment is not deep-copied, as deepcopy failed with the above change. ggplot deepcopy now does *not* copy the dataframe, so this should result in some speedups. Also adjusted the unit test in test_geom.py to fit this new model. Closes: yhat#145
Added unit tests (test_ggplot_internals.py) to make sure that the original data is not changed and also that no data is changed after a geom addition.
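For readers unfamiliar with the `__deepcopy__` hook mentioned in the commit message, the following is a minimal sketch of the pattern: deep-copy most attributes but share selected ones by reference. The class `Plot` and its attributes `data`, `env`, and `layers` are hypothetical stand-ins, not ggplot's actual implementation.

```python
import copy


class Plot:
    """Hypothetical stand-in for a plot object (not the real ggplot class)."""

    # Attributes shared by reference instead of being deep-copied.
    _shared = ("data", "env")

    def __init__(self, data, env, layers=None):
        self.data = data      # e.g. a dataframe, potentially large
        self.env = env        # e.g. an eval environment that cannot be deep-copied
        self.layers = layers if layers is not None else []

    def __deepcopy__(self, memo):
        cls = self.__class__
        result = cls.__new__(cls)
        # Register the copy early so reference cycles resolve to it.
        memo[id(self)] = result
        for key, value in self.__dict__.items():
            if key in self._shared:
                setattr(result, key, value)
            else:
                setattr(result, key, copy.deepcopy(value, memo))
        return result


p = Plot(data={"a": [1, 2, 3]}, env={"offset": 1}, layers=[["point"]])
q = copy.deepcopy(p)
```

In this sketch `q.data` and `q.env` are the same objects as on `p`, while `q.layers` is an independent copy, so adding a layer to the copy leaves the original untouched.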
This was merged... |
Data transformation in aes (`aes(x="np.log(column)")`) now uses patsy.eval.EvalEnvironment. This should enable things like `np.log(column)`. Closes: yhat/ggpy#188
The changes also exposed a bug in the current unit tests: setting an aes mapping (`aes(fill=True)`) was considered equivalent to setting this value in the geom (`geom_density(fill=True)`). Now this will give the same weird result as in ggplot (if we had already implemented fill... -> #191). The affected unit tests (test_basic.py, test_readme_examples.py) were changed.
Also implement `__deepcopy__()` for `aes` and `ggplot` so that the needed eval environment is not deep-copied, as deepcopy failed with the above change. ggplot deepcopy now does *not* copy the dataframe, so this should result in some speedups. Also adjusted the unit test in test_geom.py to fit this new model. Closes: yhat/ggpy#145
Added unit tests (test_ggplot_internals.py) to make sure that the original data is not changed and also that no data is changed after a geom addition.
Right now this doesn't work:
I would very much like to use https://github.com/pydata/patsy/ (docs: http://patsy.readthedocs.org/en/latest/formulas.html#the-formula-language).
Unfortunately I haven't found a way to support strings/factors without getting dummy coding, but numeric columns work:
CC: @njsmith: do you have any ideas?