# The Pokemon Cookbook
This cookbook teaches you the concepts of the InfluxDB 3.0 Python Client library using a novel example of Pokemon data. The scenario is to keep track of each trainer and the number of different pokemon they have caught.

<p align="center">
<img height="300" src="https://www.nicepng.com/png/full/62-622961_no-one-knows-if-people-eat-pokmon-png.png">
</p>

In [45]:

# Here we include all the imports required from influxdb_client_3 
from influxdb_client_3 import InfluxDBClient3, InfluxDBError, WriteOptions, write_client_options
import pandas as pd
import random
from IPython.display import display, HTML

## Client Setup
Now that we have done the initial configuration of our write parameters, it's time to include these within our client initialization. The InfluxDB 3.0 Client can both write and query data. For now we will use it to write data based upon our configuration.

In [46]:


client = InfluxDBClient3(
    token="",
    host="eu-central-1-1.aws.cloud2.influxdata.com",
    org="6a841c0c08328fb1", database="pokemon-codex")
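
The empty `token` above has to be filled in before the client can authenticate. One way to keep the secret out of the notebook is to read the settings from environment variables. This is only a sketch: the variable names `INFLUXDB_TOKEN`, `INFLUXDB_HOST`, `INFLUXDB_ORG` and `INFLUXDB_DATABASE` and the helper `client_settings_from_env` are assumptions made for illustration, not part of the client library.

```python
import os

# Hypothetical environment variable names; adjust them to your own setup.
ENV_KEYS = {
    "token": "INFLUXDB_TOKEN",
    "host": "INFLUXDB_HOST",
    "org": "INFLUXDB_ORG",
    "database": "INFLUXDB_DATABASE",
}

def client_settings_from_env(environ=None):
    """Collect the keyword arguments used in the cell above from the environment."""
    environ = os.environ if environ is None else environ
    # Fail early with a clear message if any setting is unset or empty.
    missing = [name for name in ENV_KEYS.values() if not environ.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {arg: environ[name] for arg, name in ENV_KEYS.items()}
```

The resulting dictionary could then be unpacked into the constructor, for example `client = InfluxDBClient3(**client_settings_from_env())`.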



## Querying Data
We have now stored 1000 registered pokemon catches within InfluxDB. We can now query this data using the InfluxDB 3.0 Python Client to gain some insights into our data. We are going to use Plotly to visualise our data.

In [47]:
# These are just some library imports for Plotly so we can make use of the interactive graphs.
import plotly.express as px
import plotly.io as pio
pio.renderers.default = "vscode"

# Let's start with a simple query to understand our schema.
query = '''SHOW COLUMNS FROM caught'''

# We can use the query method to run a query against the database.
# Under the hood this creates a flight ticket and uses the FlightClient to run the query.
# For this example we are using the pandas mode, which will return a pandas DataFrame.
# The language parameter lets us specify the query language, in this case SQL.

table = client.query(query=query, language='sql', mode='pandas')

display(table)

| | table_catalog | table_schema | table_name | column_name | data_type | is_nullable |
|---|---|---|---|---|---|---|
| 0 | public | iox | caught | attack | Int64 | YES |
| 1 | public | iox | caught | defense | Int64 | YES |
| 2 | public | iox | caught | hp | Int64 | YES |
| 3 | public | iox | caught | id | Dictionary(Int32, Utf8) | YES |
| 4 | public | iox | caught | level | Int64 | YES |
| 5 | public | iox | caught | name | Utf8 | YES |
| 6 | public | iox | caught | num | Dictionary(Int32, Utf8) | YES |
| 7 | public | iox | caught | speed | Int64 | YES |
| 8 | public | iox | caught | time | Timestamp(Nanosecond, None) | NO |
| 9 | public | iox | caught | trainer | Dictionary(Int32, Utf8) | YES |
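
In this schema the `Dictionary(Int32, Utf8)` columns (`id`, `num`, `trainer`) look like tags, while the `Int64` and `Utf8` columns look like fields. As a rough sketch of what one catch could look like in line protocol under that assumption, the helper below builds a single line by hand. The function name `catch_to_line_protocol` and the sample values are made up for illustration, and the sketch skips the escaping that real tag and field values may need.

```python
def catch_to_line_protocol(tags, fields, timestamp_ns, measurement="caught"):
    """Build one line of line protocol: measurement,tags fields timestamp."""
    tag_part = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    field_items = []
    for key, value in sorted(fields.items()):
        if isinstance(value, bool):
            field_items.append(f"{key}={'true' if value else 'false'}")
        elif isinstance(value, int):
            field_items.append(f"{key}={value}i")   # integer fields carry an "i" suffix
        elif isinstance(value, float):
            field_items.append(f"{key}={value}")
        else:
            field_items.append(f'{key}="{value}"')  # string fields are double-quoted
    return f"{measurement},{tag_part} {','.join(field_items)} {timestamp_ns}"

# Illustrative values only, not taken from the stored data set.
line = catch_to_line_protocol(
    tags={"trainer": "ash", "id": "0001", "num": "1"},
    fields={"name": "Bulbasaur", "level": 12, "attack": 49, "defense": 49, "hp": 45, "speed": 45},
    timestamp_ns=1700000000000000000,
)
print(line)
```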


### Simple InfluxQL Query
The first query we will run is a simple InfluxQL query to get the number of pokemon caught by each trainer. We will then use Plotly to visualise this data.

In [48]:
import polars as pl
# Count the number of pokemon each trainer has caught over the last 6 hours.
query = '''SELECT count("name") FROM caught WHERE time > now() - 6h GROUP BY trainer'''

# We can use the query method to run a query against the database.
# Under the hood this creates a flight ticket and uses the FlightClient to run the query.
# For this example we are using the 'all' mode, which returns a PyArrow Table that Polars can read directly.
# The language parameter lets us specify the query language, in this case InfluxQL.
table = client.query(query=query, language='influxql', mode='all')

pl_df = pl.from_arrow(table)
display(pl_df)

fig1 = px.bar(pl_df, x=pl_df["trainer"], y=pl_df["count"], color=pl_df["trainer"], title='Number of Pokémon caught in the last 6 hours')
fig1.show()

| iox::measurement | time | trainer | count |
|---|---|---|---|
| str | datetime[ns] | str | i64 |
| "caught" | 1970-01-01 00:00:00 | "ash" | 38465 |
| "caught" | 1970-01-01 00:00:00 | "brock" | 38175 |
| "caught" | 1970-01-01 00:00:00 | "gary" | 38657 |
| "caught" | 1970-01-01 00:00:00 | "james" | 38269 |
| "caught" | 1970-01-01 00:00:00 | "jessie" | 38473 |
| "caught" | 1970-01-01 00:00:00 | "misty" | 38212 |

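The same count can also be expressed in SQL, the other query language the client accepts. The sketch below only builds the query string. The helper name `count_by_trainer_sql` and the `catches` alias are hypothetical, and the `INTERVAL` syntax is assumed to match the SQL dialect of your InfluxDB version.

```python
def count_by_trainer_sql(hours=6, measurement="caught"):
    """Return a SQL query that counts catches per trainer over the last `hours` hours."""
    if not isinstance(hours, int) or hours <= 0:
        raise ValueError("hours must be a positive integer")
    return (
        f"SELECT trainer, COUNT(name) AS catches "
        f"FROM {measurement} "
        f"WHERE time > now() - INTERVAL '{hours} hours' "
        f"GROUP BY trainer "
        f"ORDER BY trainer"
    )

query = count_by_trainer_sql(6)
print(query)
```

The string could then be passed to `client.query(query=query, language='sql', mode='pandas')`, mirroring the schema query earlier.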

In [49]:
# Count the pokemon caught in the last hour, grouped by trainer and type.
# Note: type1 is not listed in the SHOW COLUMNS output above, which is consistent with the empty result below.
query = '''SELECT count("name") FROM caught WHERE time > now() - 1h GROUP BY trainer,type1'''

# We can use the query method to run a query against the database.
# Under the hood this creates a flight ticket and uses the FlightClient to run the query.
# For this example we are using the 'all' mode, which returns a PyArrow Table that Polars can read directly.
# The language parameter lets us specify the query language, in this case InfluxQL.
table = client.query(query=query, language='influxql' , mode='all')

pl_df = pl.from_arrow(table)
display(pl_df)

fig2 = px.bar(pl_df, x=pl_df["trainer"], y=pl_df["count"], color=pl_df["type1"], barmode= 'group', title='Number of Pokémon caught in the last hour grouped by type')
fig2.show()

| iox::measurement | time | trainer | type1 | count |
|---|---|---|---|---|
| str | datetime[ns] | str | str | i64 |


ValueError: Cannot accept list of column references or list of columns for both `x` and `y`.
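
The empty result and the Plotly error above are consistent with `type1` not appearing in the `SHOW COLUMNS` output for `caught`. A small pre-flight check along the following lines could catch that before the query runs. It is only a sketch: `missing_columns` is a hypothetical helper, and the column list is copied by hand from the schema output rather than fetched from the database.

```python
# Column names copied from the SHOW COLUMNS output earlier in this cookbook.
CAUGHT_COLUMNS = ["attack", "defense", "hp", "id", "level", "name", "num", "speed", "time", "trainer"]

def missing_columns(required, available=CAUGHT_COLUMNS):
    """Return the required column names that are not in the available schema."""
    known = set(available)
    return [column for column in required if column not in known]

missing = missing_columns(["trainer", "type1"])
if missing:
    print(f"Columns not in schema: {missing}")
```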

In [None]:
query='''SELECT count("name") FROM caught WHERE time > now() - 24h GROUP BY time(5m),trainer'''

table = client.query(query=query, language='influxql' , mode='all')

pl_df = pl.from_arrow(table)
display(pl_df)

fig4 = px.line(pl_df, x=pl_df["time"], y=pl_df["count"], color=pl_df["trainer"], title='Number of Pokémon caught in the last 24 hours grouped by trainer and time')
fig4.show()
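
`GROUP BY time(5m)` splits the time range into five-minute windows and aggregates within each one. As a plain-Python illustration of that idea, not of how InfluxDB implements it, the sketch below floors timestamps to the start of their five-minute window and counts them per trainer. The sample events and the names `window_start` and `count_per_window` are invented for the example.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def window_start(ts, width=timedelta(minutes=5)):
    """Floor a timestamp to the start of its window, aligned to the Unix epoch."""
    return ts - ((ts - EPOCH) % width)

def count_per_window(events, width=timedelta(minutes=5)):
    """Count (timestamp, trainer) events per (window start, trainer) pair."""
    return Counter((window_start(ts, width), trainer) for ts, trainer in events)

# Invented sample events for illustration.
events = [
    (datetime(2023, 1, 1, 12, 1, tzinfo=timezone.utc), "ash"),
    (datetime(2023, 1, 1, 12, 4, tzinfo=timezone.utc), "ash"),
    (datetime(2023, 1, 1, 12, 6, tzinfo=timezone.utc), "ash"),
    (datetime(2023, 1, 1, 12, 2, tzinfo=timezone.utc), "misty"),
]
counts = count_per_window(events)
for (start, trainer), count in sorted(counts.items()):
    print(start.isoformat(), trainer, count)
```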

In [50]:
import time
import pyarrow as pa
class CodeTimer:
    """Minimal stopwatch that prints the time elapsed between start() and stop()."""

    def __init__(self):
        self.start_time = None
        self.end_time = None

    def start(self):
        # perf_counter() is a monotonic, high-resolution clock suited to short timings.
        self.start_time = time.perf_counter()

    def stop(self):
        if self.start_time is None:
            raise RuntimeError("Timer has not been started. Use .start() method to start it.")

        self.end_time = time.perf_counter()
        elapsed_time = self.end_time - self.start_time
        print(f"Elapsed time: {elapsed_time} seconds")

query='''SELECT time as timestamp, attack FROM caught'''

table = client.query(query=query, language='sql' , mode='all')


print("Pandas == Arrow Table Conversion")
timer = CodeTimer()
timer.start()
df = table.to_pandas()
timer.stop()
df = df.set_index("timestamp")


print("Polars == Arrow table Conversion ")
timer = CodeTimer()
timer.start()
pdf = pl.from_arrow(table)
timer.stop()

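# Resample to 1-minute means. Note that the Polars step below uses 1-hour windows,
# so the two resampling timings are not directly comparable.
print("Pandas == Resampling")
timer = CodeTimer()
timer.start()
result_pandas = df.resample('1T').mean()
timer.stop()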

pdf = pdf.sort("timestamp")
print("Polars == Resampling")
timer = CodeTimer()
timer.start()
result_polars = pdf.group_by_dynamic(
    index_column='timestamp', 
    every='1h', 
    closed='left', 
    include_boundaries=True
).agg([
    pl.col('attack').mean().alias('mean_value')
])
timer.stop()

print("Pandas == to Arrow ")
timer = CodeTimer()
timer.start()
table = pa.Table.from_pandas(result_pandas)
timer.stop()

print("Polars == to Arrow")
timer = CodeTimer()
timer.start()
table=result_polars.to_arrow()
timer.stop()

Pandas == Arrow Table Conversion
Elapsed time: 0.00740504264831543 seconds
Polars == Arrow table Conversion 
Elapsed time: 0.009402990341186523 seconds
Pandas == Resampling
Elapsed time: 0.3618597984313965 seconds
Polars == Resampling
Elapsed time: 0.003142118453979492 seconds
Pandas == to Arrow 
Elapsed time: 0.006847858428955078 seconds
Polars == to Arrow
Elapsed time: 0.00023412704467773438 seconds
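
The start/stop pattern used for the timings above can also be written as a context manager, which makes it harder to forget a `stop()` call. The sketch below uses `time.perf_counter()`, a monotonic clock intended for measuring short durations. The class name `Timed` is hypothetical, and the timed workload is a stand-in rather than one of the conversions measured above.

```python
import time

class Timed:
    """Context manager that records and prints the elapsed time of a block."""

    def __init__(self, label):
        self.label = label
        self.elapsed = None

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self._start
        print(f"{self.label}: {self.elapsed:.6f} seconds")
        return False  # do not swallow exceptions raised inside the block

# Stand-in workload; in the notebook this could wrap table.to_pandas() or pl.from_arrow(table).
with Timed("Sum of squares") as timer:
    total = sum(n * n for n in range(10_000))
```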


# Conclusion
We have now covered the basics of the InfluxDB 3.0 Python Client. I hope you found this novel cookbook informative and fun. If you have any questions, bugs or feature requests, please raise an issue on the [GitHub repo](https://github.com/InfluxCommunity/influxdb3-python/issues).


<p align="center">
<img height="100" width="100" src="https://i.pinimg.com/originals/18/15/44/181544facabe62d30c52e94b369f0f3a.png">
</p>