Minor Issue: Incorrect subplot syntax in timePlot function #40

Open
Tanvi-Jain01 opened this issue Jun 17, 2023 · 2 comments


Tanvi-Jain01 commented Jun 17, 2023

timePlot: ValueError in subplot function

While exploring this library, I tried timePlot. My dataframe, pollutants, and data types were correct, but the call still failed with an error while creating the subplots for my pollutants.

Code:

import numpy as np
import pandas as pd
np.random.seed(42)  

start_date = pd.to_datetime('2022-01-01')
end_date = pd.to_datetime('2022-12-31')

dates = pd.date_range(start_date, end_date)

pm25_values = np.random.rand(365)
ws_values = np.random.rand(365)
wd_values = np.random.rand(365)

df = pd.DataFrame({
    'date': dates,
    'pm25': pm25_values,
    'ws': ws_values,
    'wd': wd_values
})

df['date'] = df['date'].dt.strftime('%Y-%m-%d')  # Convert date format to 'YYYY-MM-DD'

print(df)

from vayu.timePlot import timePlot
axes=timePlot(df,'2022', 5)

Error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[8], line 3
      1 from vayu.timePlot import timePlot
----> 3 axes=timePlot(df,'2022', 5)

File ~\anaconda3\lib\site-packages\vayu\timePlot.py:46, in timePlot(df, year, month, pollutants)
     43 color = color_list[ix % len(color_list)]
     45 # plotting
---> 46 plt.subplot(f"{len(pollutants)}1{ix}")
     47 a = values.plot.line(color=color)
     48 a.axes.get_xaxis().set_visible(False)

File ~\anaconda3\lib\site-packages\matplotlib\pyplot.py:1323, in subplot(*args, **kwargs)
   1320 fig = gcf()
   1322 # First, search for an existing subplot with a matching spec.
-> 1323 key = SubplotSpec._from_subplot_args(fig, args)
   1325 for ax in fig.axes:
   1326     # if we found an Axes at the position sort out if we can re-use it
   1327     if ax.get_subplotspec() == key:
   1328         # if the user passed no kwargs, re-use

File ~\anaconda3\lib\site-packages\matplotlib\gridspec.py:573, in SubplotSpec._from_subplot_args(figure, args)
    571     return arg
    572 elif not isinstance(arg, Integral):
--> 573     raise ValueError(
    574         f"Single argument to subplot must be a three-digit "
    575         f"integer, not {arg!r}")
    576 try:
    577     rows, cols, num = map(int, str(arg))

ValueError: Single argument to subplot must be a three-digit integer, not '510'
<Figure size 640x480 with 0 Axes>

Issue-1:

Source Code:

plt.subplot(f"{len(pollutants)}1{ix}")

Explanation:

This line prevents the function from running. plt.subplot() accepts either three separate integers (the number of rows, the number of columns, and the 1-based index of the current subplot) or a single three-digit integer. It does not accept a string such as '510'. The subplot index must also start at 1, whereas ix starts at 0 here.

Solution:

plt.subplot(len(pollutants), 1, ix+1)

With this change, the subplot call runs without the ValueError.
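
As a rough sketch of the difference (the pollutant names are placeholders, and the Agg backend is only an assumption so that the snippet runs without a display), the three-integer form creates one subplot per pollutant, while the string form with index 0 is rejected:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

pollutants = ["pm25", "ws", "wd"]  # placeholder column names
fig = plt.figure()

# Three separate integers: rows, columns, 1-based index.
for ix, pollutant in enumerate(pollutants):
    ax = plt.subplot(len(pollutants), 1, ix + 1)
    ax.set_ylabel(pollutant)

# The original call built a string with a 0-based index, which raises ValueError.
string_error = None
try:
    plt.subplot(f"{len(pollutants)}1{0}")
except ValueError as err:
    string_error = str(err)

print(len(fig.axes), "subplots created")
print("string form failed:", string_error is not None)
```

The ix + 1 matters as much as dropping the string: subplot indices start at 1, so an index of 0 is invalid even when it is passed as an integer.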

Issue-2:

I then copied the timePlot source code from vayu and rewrote it with plt.subplots() as a possible solution, but I ran into a second error, explained below.

Code:

import pandas as pd
import matplotlib.pyplot as plt

def timePlot(df, year, month,  pollutants=["ws", "nox", "o3", "pm25", "pm10"]):
    # Cuts the df down to the month specified
    df.index = pd.to_datetime(df.date)
    df_n_1 = df[(df.index.month == int(month)) & (df.index.year == int(year))]
    #print(df_n_1)

    fig, axs = plt.subplots(len(pollutants), 1, figsize=(10, 8), sharex=True)
   

    for ax, pollutant in zip(axs, pollutants):
        values = df_n_1[pollutant]
        

        # plotting
        ax.plot(values.index, values.values)
        ax.yaxis.set_label_position("right") 
        ax.set_ylabel(pollutant)
        plt.xticks(rotation=45)
        plt.yticks(rotation=45)
        
timePlot(df,'2022', 5, pollutants=['pm25'])

Error:

TypeError                                 Traceback (most recent call last)
Cell In[14], line 24
     21         plt.xticks(rotation=45)
     22         plt.yticks(rotation=45)
---> 24 timePlot(df,'2022', 5, pollutants=['pm25'])

Cell In[14], line 13, in timePlot(df, year, month, pollutants)
      8 #print(df_n_1)
     10 fig, axs = plt.subplots(len(pollutants), 1, figsize=(10, 8), sharex=True)
---> 13 for ax, pollutant in zip(axs, pollutants):
     14     values = df_n_1[pollutant]
     17     # plotting

TypeError: 'Axes' object is not iterable

Explanation:

zip() expects each of its arguments to be iterable. When only one subplot is requested, plt.subplots() returns a single Axes object rather than an array of Axes objects. A single Axes object is not iterable, so zip(axs, pollutants) raises the TypeError shown above.

Solution:

Passing axs through np.atleast_1d() guarantees that axs is a one-dimensional NumPy array, whether plt.subplots() returned a single Axes object or an array of them. The for loop can then iterate over axs in both cases, which resolves the TypeError.
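
A small sketch of that behaviour, assuming a headless Agg backend and a placeholder pollutant name. It also shows squeeze=False, a plt.subplots() option that always returns a 2-D array and could be used instead of np.atleast_1d():

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.axes import Axes

# With a single subplot, plt.subplots() returns a bare Axes object.
fig1, axs = plt.subplots(1, 1)
is_single_axes = isinstance(axs, Axes)

# np.atleast_1d wraps it in a 1-D array, so zip() can iterate over it.
axs_1d = np.atleast_1d(axs)
labelled = []
for ax, pollutant in zip(axs_1d, ["pm25"]):  # "pm25" is a placeholder
    ax.set_ylabel(pollutant)
    labelled.append(ax.get_ylabel())

# Alternative: squeeze=False always yields an array of shape (nrows, ncols).
fig2, axs_2d = plt.subplots(1, 1, squeeze=False)
first_column = axs_2d[:, 0]  # 1-D view over the single column of Axes

print(is_single_axes, axs_1d.shape, axs_2d.shape)
```

Either approach keeps the loop body unchanged; np.atleast_1d() is the smaller change to the existing code, while squeeze=False avoids the extra NumPy call.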

The example below fixes both issues.

Example:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def timePlot(df, year, month,  pollutants=["ws", "nox", "o3", "pm25", "pm10"]):
    # Cuts the df down to the month specified
    df.index = pd.to_datetime(df.date)
    df_n_1 = df[(df.index.month == int(month)) & (df.index.year == int(year))]
    #print(df_n_1)

    fig, axs = plt.subplots(len(pollutants), 1, figsize=(10, 8), sharex=True)
    axs = np.atleast_1d(axs)  # Wrap a single Axes in a 1-D array so it can be iterated

    for ax, pollutant in zip(axs, pollutants):
        values = df_n_1[pollutant]
        

        # plotting
        ax.plot(values.index, values.values)
        ax.yaxis.set_label_position("right") 
        ax.set_ylabel(pollutant)
        plt.xticks(rotation=45)
        plt.yticks(rotation=45)
        
timePlot(df,'2022', 5, pollutants=['pm25'])

Output:

timeplot1
@patel-zeel
Member

Suggestions:

  • Plotly plot with slider might be better. Benefits:
    • Slider to move along time-axis
    • Change size of window through GUI
    • Can dynamically enable/disable multiple pollutants

@Tanvi-Jain01
Author

Tanvi-Jain01 commented Jun 30, 2023

@patel-zeel I have implemented visualization using plotly.

Changes:

I have changed the method signature by removing the month parameter. With Plotly's range slider and range selector we can view any month interactively, so there is no need to specify a particular month.

From:

def timePlot(df, year, month, pollutants=["ws", "nox", "o3", "pm25", "pm10"]):

To:

def timePlot(df, year, pollutants=["ws", "nox", "o3", "pm25", "pm10"]):

Improved Code:

import pandas as pd
import plotly.graph_objects as go

def timePlot(df, year, pollutants=["ws", "nox", "o3", "pm25", "pm10"]):
    # Cuts the df down to the year specified
    df.index = pd.to_datetime(df.date)
    df_n_1 = df[df.index.year == int(year)]
    
    fig = go.Figure()
    
    for pollutant in pollutants:
        values = df_n_1[pollutant]
        
        # Add trace for each pollutant
        fig.add_trace(go.Scatter(
            x=values.index,
            y=values.values,
            name=pollutant
        ))
        
    # Configure layout: range selector buttons plus a range slider on the time axis
    fig.update_layout(
        xaxis=dict(
            rangeselector=dict(
                buttons=list([
                    dict(count=1, label="1d", step="day", stepmode="backward"),
                    dict(count=7, label="1w", step="day", stepmode="backward"),
                    dict(count=1, label="1m", step="month", stepmode="backward"),
                    dict(count=6, label="6m", step="month", stepmode="backward"),
                    dict(count=1, label="YTD", step="year", stepmode="todate"),
                    dict(count=1, label="1y", step="year", stepmode="backward"),
                    dict(step="all")
                ])
            ),
            rangeslider=dict(
                visible=True
            ),
            type="date"
        )
    )
    
    fig.show()
    
timePlot(df, 2022, pollutants=['pm25', 'pm10', 'no', 'no2', 'nox', 'nh3', 'so2', 'co', 'o3', 'benzene', 'toluene'])

Output:

timeplot plotly
