Accessing Data Frame From Geom #73

Closed
stefaneng opened this issue Oct 24, 2013 · 5 comments

@stefaneng

I don't know if this is possible, but I ran into a problem while trying to implement geom_boxplot (see pull request #39). I have commented the code to showcase the problem I am running into. Hopefully I am not just overlooking some simple way to access the original data frame passed into ggplot(...).

My goal is to be able to pass the whole data frame into:

http://pandas.pydata.org/pandas-docs/stable/visualization.html#box-plotting

as this is much simpler than trying to implement multiple box plots with

http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.boxplot
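
For illustration, here is a minimal sketch of the pandas route mentioned above, assuming the original data frame were available. The column names `x` and `y` and the sample values are made up for the example, and none of this is code from the ggplot repository. It passes an existing axes via `ax=`, because `DataFrame.boxplot` with `by=` may otherwise create a figure of its own.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: one grouping column and one numeric column.
df = pd.DataFrame({
    "x": ["a", "a", "a", "b", "b", "b"],
    "y": [1.0, 2.0, 3.0, 2.0, 4.0, 9.0],
})

# Create the figure up front, the way a plotting library normally would.
fig, ax = plt.subplots()

# One call draws a box per distinct value of "x" on the axes we supply.
df.boxplot(column="y", by="x", ax=ax)
```

Supplying `ax` keeps everything on a single figure; whether that matters for a given geom depends on how the surrounding library sets up its figure.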

@jhaynes

jhaynes commented Oct 24, 2013

Doing something like this works... but produces a blank plot first... haven't sorted out why.

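from pandas import DataFrame


class geom_boxplot(geom):
    # `geom` is ggplot's base geom class, defined elsewhere in the package
    VALID_AES = ['y', 'lower', 'middle', 'upper', 'x', 'ymax', 'ymin',
                 'alpha', 'colour', 'color', 'fill', 'linetype', 'shape',
                 'size', 'weight']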


    def plot_layer(self, layer):
        layer = {k: v for k, v in layer.items() if k in self.VALID_AES}
        layer.update(self.manual_aes)

        # Option 1 (No Pandas)

        # Boxplot takes in an array or sequence of vectors
        # The goal is to group the 'y' values by grouped 'x'
        # Then take the transpose and give it to plt.boxplot(**layer)
        # If we do not have a y value, no need to change anything
        #        if "y" in layer:
        # list(set([...])) just gets unique elements
        #           unique = list(set(layer["x"]))            
        # Group y values in len(x) arrays, having values y.x
        # Store it back in x as a transpose.            
        #          pass

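        # plt.boxplot(**layer)

        # Option 2: with pandas and DataFrame.boxplot()
        # Ideally this would use the DataFrame passed to ggplot(...);
        # here it is rebuilt from the layer's values instead.
        df = DataFrame()
        if 'y' in layer:
            # Move the y values into the frame and plot that column
            df['y'] = layer.pop('y')
            layer['column'] = 'y'
        if 'x' in layer:
            # Move the x values into the frame and group the boxes by them
            df['x'] = layer.pop('x')
            layer['by'] = 'x'
        df.boxplot(**layer)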
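
The commented-out "Option 1" above describes grouping the `y` values by `x` and handing the result to matplotlib's boxplot. A minimal sketch of that idea follows. The helper names `group_y_by_x` and `draw_boxplots` are hypothetical and not part of ggplot. It builds a list of sequences rather than a transposed array, so the groups do not need to be the same length.

```python
def group_y_by_x(x, y):
    """Group y values by their x value, keeping first-seen order of x."""
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi)
    labels = list(groups)
    return labels, [groups[label] for label in labels]


def draw_boxplots(ax, x, y):
    """Draw one box per distinct x on a matplotlib Axes (hypothetical helper)."""
    labels, grouped = group_y_by_x(x, y)
    ax.boxplot(grouped)          # one box per inner list
    ax.set_xticklabels(labels)   # label each box with its x value
    return labels, grouped


labels, grouped = group_y_by_x(["a", "b", "a", "b", "a"], [1, 2, 3, 4, 5])
```

With matplotlib available, `fig, ax = plt.subplots()` followed by `draw_boxplots(ax, layer['x'], layer['y'])` would draw the grouped boxes on an existing axes without involving pandas.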

@glamp
Contributor

glamp commented Oct 24, 2013

take a look at how geom_bar works. i think you can piggy-back off of that.

@jhaynes

jhaynes commented Oct 25, 2013

Thanks, geom_bar looks like a good place to start and avoiding deep coupling with pandas is probably not a terrible idea.

This is sort of off topic, but is there a reason why you opted to pass the values from aes around rather than the data frame and its column names?

@glamp

glamp commented Oct 25, 2013

Largely b/c the geoms have aes defined on an individual basis. That said, I don't think there's any reason data frames wouldn't work.
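
To make the two shapes in this exchange concrete, here is a rough sketch of turning "data plus column names" into the per-geom dict of values that a method like `plot_layer` receives. The function `resolve_aes`, the sample columns, and the aesthetic list are all hypothetical; this is an illustration of the idea, not ggplot's actual internals. The `data` argument only needs to support `data[column]`, so a pandas DataFrame or a plain dict of lists would both work.

```python
def resolve_aes(data, aes, valid_aes):
    """Turn {aesthetic: column name} into {aesthetic: column values}."""
    layer = {}
    for aesthetic, column in aes.items():
        if aesthetic not in valid_aes:
            continue  # each geom keeps only the aesthetics it understands
        layer[aesthetic] = data[column]
    return layer


# A plain dict of columns stands in for the data frame here.
data = {"group": ["a", "a", "b"], "value": [1.0, 2.0, 5.0], "note": ["p", "q", "r"]}
aes = {"x": "group", "y": "value", "label": "note"}
VALID_AES = ["x", "y", "fill"]  # assumed list for one hypothetical geom

layer = resolve_aes(data, aes, VALID_AES)
```

Passing `data` and `aes` through to each geom instead of the resolved `layer` would let a geom reach the full frame, at the cost of every geom having to do its own column lookup.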

@glamp

glamp commented Dec 22, 2013

thanks to @JanSchulz for this one!

@glamp glamp closed this as completed Dec 22, 2013