The plot() method in pandas is a convenient way to create visualizations directly from DataFrames and Series.

It is built on top of Matplotlib and provides a high-level interface for common plots such as line plots, bar plots, histograms, and scatter plots.
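
Because the drawing is delegated to Matplotlib, plot() hands back a Matplotlib object that can be inspected or customized afterwards. The following is a minimal sketch, assuming the default Matplotlib backend; the DataFrame `demo` and its values are made up for illustration and are not part of the dataset used in the rest of this notebook.

```python
import matplotlib.axes
import pandas as pd

# Illustrative data only (hypothetical values).
demo = pd.DataFrame({'x': [1, 2, 3], 'y': [2, 4, 8]})

# With the default backend, DataFrame.plot() returns a Matplotlib Axes.
ax = demo.plot(x='x', y='y')

print(isinstance(ax, matplotlib.axes.Axes))  # True
print(len(ax.get_lines()))                   # 1 (one line was drawn for column 'y')
```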

In [None]:
import pandas as pd
import numpy as np

data = {
    'Year': [2010, 2011, 2012, 2013, 2014],
    'Sales': [200, 250, 300, 350, 400],
    'Profit': [50, 60, 70, 80, 90]
}

# Build a DataFrame from the dictionary and inspect it
df = pd.DataFrame(data)
print(df)

# Line plot of Sales against Year
df.plot(x='Year', y='Sales', kind='line', title='Sales Over Years')

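The plot() method accepts several parameters to customize the visualization:

x: Column name for the x-axis (used only when plotting a DataFrame).

y: Column name, or list of column names, for the y-axis (used only when plotting a DataFrame).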

kind: Type of plot to create. Common options include:

    'line': Line plot (default).

    'bar': Vertical bar plot.

    'barh': Horizontal bar plot.

    'hist': Histogram.

    'box': Box plot.

    'scatter': Scatter plot (DataFrame only; requires both x and y).

    'pie': Pie chart.

title: Title of the plot.

xlabel: Label for the x-axis.

ylabel: Label for the y-axis.

figsize: Tuple specifying the size of the figure (e.g., (10, 6)).

legend: Boolean to show or hide the legend.

color: Color of the plot elements.

style: Matplotlib line style for line plots (e.g., '--' for dashed lines).
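
As a rough illustration of how these parameters combine, the sketch below passes most of them in a single call. It rebuilds the same sample DataFrame so that it runs on its own. The y-axis label 'Amount' and the specific colors are assumptions chosen for the example, and the xlabel and ylabel keywords assume pandas 1.1 or later.

```python
import pandas as pd

# Same sample data as in the first cell.
df = pd.DataFrame({
    'Year': [2010, 2011, 2012, 2013, 2014],
    'Sales': [200, 250, 300, 350, 400],
    'Profit': [50, 60, 70, 80, 90],
})

ax = df.plot(
    x='Year',
    y=['Sales', 'Profit'],               # two columns, so two lines
    kind='line',
    title='Sales and Profit Over Years',
    xlabel='Year',
    ylabel='Amount',                     # illustrative label
    figsize=(10, 6),                     # width and height in inches
    legend=True,
    color=['tab:blue', 'tab:orange'],    # one color per column
    style=['-', '--'],                   # solid for Sales, dashed for Profit
)
```

The style strings here contain only a line style. If a style string also carried a color code (for example 'r--'), it would conflict with the color argument, so it is safer to specify colors in one place only.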

In [None]:
# Plot several columns at once by passing a list to y
df.plot(x='Year', y=['Sales', 'Profit'], kind='line', title='Sales and Profit Over Years')

In [None]:
# Vertical bar plot with one green bar per year
df.plot(x='Year', y='Sales', kind='bar', title='Sales Over Years', color='green')

In [None]:
# Histogram of a single column (a Series); bins sets the number of bins
df['Sales'].plot(kind='hist', bins=10, title='Sales Distribution')

The right number of bins depends on the size of the dataset and the level of detail you want to see.

Too few bins can hide the shape of the distribution, while too many make the histogram noisy and hard to read.

Common rules of thumb for choosing the number of bins, where n is the number of data points:

    Square root rule: number of bins = sqrt(n).

    Sturges' formula: number of bins = 1 + 3.322 * log10(n), which is equivalent to 1 + log2(n).
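
The two rules can be sketched as small helper functions. The names `sqrt_bins` and `sturges_bins` are hypothetical, and rounding up to the next whole number is an assumption made here, since the rules themselves rarely produce an integer.

```python
import math

def sqrt_bins(n):
    """Square root rule, rounded up to a whole number of bins."""
    return math.ceil(math.sqrt(n))

def sturges_bins(n):
    """Sturges' formula, rounded up to a whole number of bins."""
    return math.ceil(1 + 3.322 * math.log10(n))

# Print n, the square root rule result, and the Sturges result
for n in (5, 100, 1000):
    print(n, sqrt_bins(n), sturges_bins(n))
# 5 3 4
# 100 10 8
# 1000 32 11
```

Either result can be passed as the bins argument, for example df['Sales'].plot(kind='hist', bins=sturges_bins(len(df))). For the five-row dataset in this notebook, both rules suggest noticeably fewer bins than the 10 used in the histogram cell.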

In [None]:
# Box plot with one box per selected column
df[['Sales', 'Profit']].plot(kind='box', title='Sales and Profit Distribution')

In [None]:
# Scatter plot; both x and y are required
df.plot(x='Sales', y='Profit', kind='scatter', title='Sales vs Profit')

In [None]:
# Use Year as the index so that each wedge is labeled with its year
df.set_index('Year')['Sales'].plot(kind='pie', autopct='%1.1f%%', title='Sales Distribution by Year')

In [None]:
# plot() returns a Matplotlib Axes, which can be customized further
ax = df.plot(x='Year', y='Sales', kind='line', title='Sales Over Years')
ax.set_xlabel('Year')
ax.set_ylabel('Sales (in millions)')
ax.grid(True)

In [None]:
# Save the plot to a PNG file through the Figure that owns the Axes
ax = df.plot(x='Year', y='Sales', kind='line', title='Sales Over Years')
ax.figure.savefig('sales_over_years.png')