This repository has been archived by the owner on Apr 22, 2024. It is now read-only.

DataFrameClient Aggregated query not able to see time column label and tag columns #785

Closed
ashishkaransingh opened this issue Feb 7, 2020 · 11 comments · Fixed by #827

@ashishkaransingh
ashishkaransingh commented Feb 7, 2020

  • InfluxDB version: 1.7.9
  • InfluxDB-python version: 5.2.3
  • Python version: 3.7.4
  • Operating system version: Windows 10

Can someone please help?
I am using Jupyter Notebook (Anaconda3) and pulling data with the influxdb DataFrameClient.
The result contains neither the "time" label nor the tag columns "host" and "instance".
df.columns returns only the field "Percent_Processor_Time".

image

@russorat
Contributor

@ashishkaransingh thanks for opening this! I've pinged some coworkers who have used Influx and Jupyter notebooks to see if they can help out.

@rhajek
Contributor

rhajek commented Feb 19, 2020

Hi, I did a little investigation of this issue, and it looks like the query result is not converted into DataFrames properly. In your case the query returns multiple time series (one DataFrame per host/instance combination), and the tag values are missing from those DataFrames. The tag columns should be set as indexes and their values should be present in each DataFrame. It would then be possible to join the DataFrames using pd.concat(datasets, axis=0, sort=False).

The problematic implementation is in the _to_dataframe method, at this line:

key = (name, tuple(sorted(tags.items())))
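Until the client handles this itself, the join described above can be sketched in plain pandas. This is only an illustration under assumptions: it takes for granted that a grouped query returns a dict whose keys have the (name, sorted tag pairs) shape shown in the line above, and that untagged series are keyed by the bare measurement name. The helper name combine_series_frames and the sample data are hypothetical.

```python
import pandas as pd


def combine_series_frames(result):
    """Concatenate per-series DataFrames, restoring tag values as columns.

    Assumes keys look like (measurement, ((tag, value), ...)).
    """
    frames = []
    for key, frame in result.items():
        frame = frame.copy()
        # Keys without tags are assumed to be plain measurement names.
        if isinstance(key, tuple):
            _measurement, tags = key
            for tag_key, tag_value in tags:
                frame[tag_key] = tag_value
        frames.append(frame)
    return pd.concat(frames, axis=0, sort=False)


# Synthetic stand-in for a GROUP BY result; no InfluxDB server is involved.
index = pd.to_datetime(["2020-02-07T00:00:00Z", "2020-02-07T00:01:00Z"])
result = {
    ("Processor", (("host", "a"), ("instance", "_Total"))): pd.DataFrame(
        {"Percent_Processor_Time": [1.0, 2.0]}, index=index
    ),
    ("Processor", (("host", "b"), ("instance", "_Total"))): pd.DataFrame(
        {"Percent_Processor_Time": [3.0, 4.0]}, index=index
    ),
}

combined = combine_series_frames(result)
print(combined)
```

Each row of the combined frame then carries its own host and instance values, so the series stay distinguishable after concatenation.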

The new InfluxDB 2.0 Python client (which queries using the Flux language), https://github.com/influxdata/influxdb-client-python, works correctly.

A possible workaround is to use a simple query such as "SELECT usage_user, host, cpu FROM "telegraf"."autogen"."cpu" WHERE time > now() - 1m" and group the data in pandas, or to try the new InfluxDB 2.0 beta and its new client library.
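The "group the data in pandas" half of that workaround might look roughly like the sketch below. It assumes the ungrouped query comes back as a single DataFrame with usage_user, host and cpu as ordinary columns and the timestamps in the index; the frame here is built by hand with made-up values so that no server is needed.

```python
import pandas as pd

# Hand-built stand-in for the result of the simple SELECT above.
index = pd.to_datetime(
    [
        "2020-02-19T10:00:10Z",
        "2020-02-19T10:00:40Z",
        "2020-02-19T10:00:10Z",
        "2020-02-19T10:00:40Z",
    ]
)
df = pd.DataFrame(
    {
        "usage_user": [1.0, 3.0, 10.0, 20.0],
        "host": ["a", "a", "b", "b"],
        "cpu": ["cpu-total"] * 4,
    },
    index=index,
)

# Aggregate per tag combination, which is what GROUP BY host, cpu would do
# on the server side.
mean_by_series = df.groupby(["host", "cpu"])["usage_user"].mean()
print(mean_by_series)
```

Because the tags are real columns here, the grouping keys survive in the result as a MultiIndex instead of being dropped.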

@ashishkaransingh
Author

ashishkaransingh commented Feb 20, 2020

> @ashishkaransingh thanks for opening this! I've pinged some coworkers who have used Influx and Jupyter notebooks to see if they can help out.

@russorat Sorry for the late reply, and thank you for getting traction on this!

@ashishkaransingh
Author

@rhajek Thank you so much for pointing out the problematic implementation in the _to_dataframe method.

Even simple queries like the ones below fail to return "time" as a column, although they do return "host"!

SELECT Percent_Processor_Time, host FROM "Processor" WHERE time > now() - 15d AND instance = '_Total'
Or
SELECT Percent_Processor_Time, host, time FROM "Processor" WHERE time > now() - 15d AND instance = '_Total'

image
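One possible explanation for the missing "time" label, offered as an assumption rather than something confirmed in this thread, is that the timestamps are returned as the DataFrame's index and therefore never appear in df.columns. If so, a minimal pandas sketch for turning them back into a column is shown here; the sample frame is synthetic.

```python
import pandas as pd

# Synthetic frame shaped like the reported result: timestamps live in an
# unnamed DatetimeIndex, so "time" is not listed in df.columns.
index = pd.to_datetime(["2020-02-07T00:00:00Z", "2020-02-07T00:01:00Z"])
df = pd.DataFrame(
    {"Percent_Processor_Time": [12.5, 14.0], "host": ["a", "a"]},
    index=index,
)
print("time" in df.columns)  # False

# Name the index and move it into a regular column.
flat = df.rename_axis("time").reset_index()
print(list(flat.columns))  # ['time', 'Percent_Processor_Time', 'host']
```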

@ashishkaransingh
Author

@rhajek
Yeah, I cannot use Flux in production anytime soon.

@sebito91
Contributor

Working on a fix for this now, thanks for the report!

@TimothyDalbey

Same issue. Any ETA?

@ashishkaransingh
Author

> Working on a fix for this now, thanks for the report!

Thank you!

@timhallinflux

timhallinflux commented May 14, 2020

For the time being, upgrade to 1.8.x and use the new Python client. Since 1.8 supports the v2 API, you can use the v2 Python client starting with the 1.8 release.
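A rough sketch of that route with the influxdb-client package follows. It relies on that client's documented 1.8 compatibility convention (token as "username:password", org as "-", bucket as "database/retention_policy"). The database name, credentials and the helper names below are hypothetical, Flux is assumed to be enabled on the 1.8 server, and query_dataframe is only defined here, not called.

```python
def v1_compat_settings(username, password, database, retention_policy="autogen"):
    """Map InfluxDB 1.8 credentials onto the v2 client's parameters."""
    return {
        "token": f"{username}:{password}",
        "org": "-",
        "bucket": f"{database}/{retention_policy}",
    }


def build_flux(bucket, measurement, field, start="-15d"):
    """Build a Flux query selecting one field of one measurement."""
    return "\n".join(
        [
            f'from(bucket: "{bucket}")',
            f"  |> range(start: {start})",
            f'  |> filter(fn: (r) => r._measurement == "{measurement}"'
            f' and r._field == "{field}")',
        ]
    )


def query_dataframe(url, settings, flux):
    """Run the query; needs `pip install influxdb-client` and a live server."""
    from influxdb_client import InfluxDBClient  # imported lazily on purpose

    client = InfluxDBClient(url=url, token=settings["token"], org=settings["org"])
    try:
        return client.query_api().query_data_frame(flux)
    finally:
        client.close()


# Hypothetical database and credentials, for illustration only.
settings = v1_compat_settings("my-user", "my-password", "telegraf")
flux = build_flux(settings["bucket"], "Processor", "Percent_Processor_Time")
print(flux)
```

With a reachable server, query_dataframe("http://localhost:8086", settings, flux) would be the call to try; the tag values come back as columns of the returned frame.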

rolincova added a commit to rolincova/influxdb-python that referenced this issue May 28, 2020
rolincova added a commit to rolincova/influxdb-python that referenced this issue Jun 1, 2020
rolincova added a commit to rolincova/influxdb-python that referenced this issue Jun 2, 2020
russorat added a commit that referenced this issue Jun 3, 2020
Fix: add support for custom indexes for query in the DataFrameClient (#785)
@reactcker

@russorat thanks for the update. I have one question:
how can I store the output in a new measurement, let's say "Total_Processor"?

curl -XPOST localhost:8086/api/v2/query -sS \
  -H 'Accept:application/csv' \
  -H 'Content-type:application/vnd.flux' \
  -d 'from(bucket:"test")
    |> range(start:-15m)
    |> filter(fn: (r) => r._measurement == "Processor" and
      r._field == "Percent_Processor_Time" and
      r.instance == "_Total")'

I am not sure whether this can be used, or how to implement it:
https://v2.docs.influxdata.com/v2.0/reference/flux/stdlib/built-in/outputs/to/#output-data-requirements

Thanks

@russorat
Contributor

@reactcker Thanks. You can use the set function to change the name of the measurement in the query: https://v2.docs.influxdata.com/v2.0/reference/flux/stdlib/built-in/transformations/set/

Then you can use the to function to write it back to the database: https://v2.docs.influxdata.com/v2.0/reference/flux/stdlib/built-in/outputs/to/
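Put together, the set and to steps could be appended to the query from the question roughly as sketched here. The Python wrapper only assembles the Flux text; the function name and the choice to write back into the same "test" bucket are assumptions, and whether to() is accepted depends on the server version and its write permissions.

```python
def build_copy_query(bucket, source, field, instance, target, start="-15m"):
    """Build a Flux query that rewrites the measurement name and writes back."""
    return "\n".join(
        [
            f'from(bucket: "{bucket}")',
            f"  |> range(start: {start})",
            f'  |> filter(fn: (r) => r._measurement == "{source}" and',
            f'      r._field == "{field}" and',
            f'      r.instance == "{instance}")',
            # set() overwrites the _measurement column on every row.
            f'  |> set(key: "_measurement", value: "{target}")',
            # to() writes the rows out; the same bucket is assumed here.
            f'  |> to(bucket: "{bucket}")',
        ]
    )


query = build_copy_query(
    bucket="test",
    source="Processor",
    field="Percent_Processor_Time",
    instance="_Total",
    target="Total_Processor",
)
print(query)
```

The printed text can be sent as the request body in place of the Flux script in the curl command from the question.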
