Using the index of a data frame as a (default) x coordinate when providing a data source #4580

Closed
Krastanov opened this issue Jun 19, 2016 · 8 comments

Comments

@Krastanov

For some data frames the index carries useful semantic meaning, particularly datetime indices. It would be useful if the API had an option to use the index as a coordinate.

Example given here: https://stackoverflow.com/questions/37904231/plot-a-pandas-dataframe-using-the-dataframe-index-for-x-coordinate-in-bokeh/37905026#37905026

@birdsarah
Member

I believe this is solved. Can you try the latest dev build?

@Krastanov
Author

I tried to pip install from GitHub (pip install --upgrade git+...), and I also tried cloning the repository and running pip install -e . from the bokeh folder, but both hang. Is this an unrelated bug, or is this kind of pip dev installation unsupported? In short: sorry, I was unable to set up the dev environment, and it looks like it will take me some time (I will try again later in the week).

Could you point me to the reference for the dev branch (if it is uploaded somewhere)? You mentioned that this feature is already implemented, but you did not say what the proper arguments for the function call are.

@bryevdv
Member

bryevdv commented Jun 19, 2016

Pip install from GitHub will not work because pip is not capable of building the CoffeeScript and TypeScript components. There are simple instructions for installing pre-built dev builds in the docs:

http://bokeh.pydata.org/en/latest/docs/installation.html#developer-builds

@Krastanov
Author

Thanks for the pointer.

It is still not working in the dev release:

ds = ColumnDataSource(df) # contains a datetime index and the column `avg`
p = figure(x_axis_type='datetime')
p.square(source=ds, y='avg')
output_notebook()
show(p)

The code above produces a plot where all data source entries are in the correct y position, but all of them are at x=0. How can I specify that the x coordinate should be the data source index?

@bryevdv
Member

bryevdv commented Jun 26, 2016

Does x="index" work? I can't remember. Ping @fpliger

@Krastanov
Author

x='index' gives this error message in Jupyter:

Javascript error adding output!
Error: attempted to retrieve property array for nonexistent field 'index'
See your browser Javascript console for more details.

@canavandl
Contributor

canavandl commented Jun 30, 2016

Hi @Krastanov

The issue is that you have to specify which column should be the "x" column. If you don't specify the "x" value, the default behavior in bokeh.plotting is to try to find a column called "x" in your ColumnDataSource (which doesn't exist).

One tricky thing here is that you're using a named index ('timestamp') in pandas. That name is carried over when you create a ColumnDataSource, so that your source probably looks like:

ds = ColumnDataSource(df)
print(ds.data) # this lets you inspect the data dict
# the ts_n values would be the actual timestamps from the df
> {'timestamp': [ts_1, ts_2, ts_3, ts_4, ts_5], 'avg': [0.9, 0.8, 0.7, 0.8, 0.9]}

It would work if you use:

p.line(source=ds, x='timestamp', y='avg')
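
For reference, here is a minimal self-contained sketch of the same idea. The sample data is made up for illustration (only the column name `avg` and the index name `timestamp` come from this thread), and the fallback to a column called 'index' for an unnamed index is an assumption about how recent Bokeh releases build a ColumnDataSource from a DataFrame, so check it against the version you have installed.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Hypothetical sample data: a datetime index named 'timestamp' and one column 'avg'.
idx = pd.date_range("2016-06-01", periods=5, freq="D", name="timestamp")
df = pd.DataFrame({"avg": [0.9, 0.8, 0.7, 0.8, 0.9]}, index=idx)

ds = ColumnDataSource(df)

# The index is stored in the source under its own name.
# Assumption: an unnamed index ends up in a column called 'index'.
index_column = df.index.name or "index"

# Inspect which columns the source actually contains before plotting.
print(sorted(ds.data.keys()))

p = figure(x_axis_type="datetime")
p.line(source=ds, x=index_column, y="avg")
# Call show(p) (after output_notebook() in a notebook) to render the plot.
```

Looking up the name from `df.index.name` instead of hard-coding it means the same plotting call keeps working if the index is renamed.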

@Krastanov
Author

I see, I did not know that indices (as opposed to normal columns) can be named. Thanks.

@damianavila damianavila added this to the 0.12.1 milestone Jul 2, 2016
5 participants