# 7. Data Visualization
- Data visualization is essential for presenting data insights in an intuitive and understandable manner. Effective visualizations can reveal trends, patterns, and outliers in the data. This section covers basic plotting with pandas, as well as advanced visualizations using Matplotlib and Seaborn.

## Basic Plotting with Pandas


- Pandas offers built-in plotting capabilities through its integration with Matplotlib. This allows for quick and simple visualizations directly from DataFrames and Series. Basic plot types include line plots, histograms, and scatter plots. These plots are useful for exploring data distributions, relationships, and trends.

- Plotting with `plot()`: The `plot()` method in pandas creates different plot types depending on the `kind` parameter (e.g., 'line', 'hist', 'scatter').
- Histograms: Useful for showing the distribution of a single numeric variable.
- Scatter Plots: Used to visualize the relationship between two numeric variables.
- Line Plots: Ideal for visualizing trends over time or ordered data.



Examples:

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Sample data
data = {
    'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi', 'Bhagath', 'Monika'],
    'Age': [25, 30, 35, 28, 40, 50],
    'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Chickkaballapur', 'Bangalore', 'Hyderabad']
}
df = pd.DataFrame(data)

# Line plot of Age
df.plot(x='Name', y='Age', kind='line', marker='o')
plt.title('Line Plot of Age')
plt.xlabel('Name')
plt.ylabel('Age')
plt.grid(True)
plt.show()

# Histogram of Age
df['Age'].plot(kind='hist', bins=5, color='skyblue', edgecolor='black')
plt.title('Histogram of Age')
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.show()

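# Scatter plot of Age by Name (scatter must be called on the DataFrame so that
# x and y can name columns; string x values assume a recent pandas version)
df.plot(kind='scatter', x='Name', y='Age', color='red')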
plt.title('Scatter Plot of Age by Name')
plt.xlabel('Name')
plt.ylabel('Age')
plt.show()


#### Output:

- `Line Plot:` Displays the trend of Age across different Names.
- `Histogram:` Shows the distribution of Age values.
- `Scatter Plot:` Visualizes the relationship between Name and Age, though less meaningful in this case as 'Name' is categorical.
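
Because 'Name' is categorical, a bar chart of an aggregated value is often easier to read than a scatter plot. The following is a minimal sketch, assuming the mean Age per Name is the quantity of interest (that choice is an assumption, not part of the original example):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same names and ages as in the sample data above
df = pd.DataFrame({
    'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi', 'Bhagath', 'Monika'],
    'Age': [25, 30, 35, 28, 40, 50],
})

# Aggregate first: one mean Age per Name
mean_age = df.groupby('Name')['Age'].mean()

# Series.plot returns the Matplotlib Axes, which can be customized further
ax = mean_age.plot(kind='bar', color='steelblue', edgecolor='black')
ax.set_title('Average Age by Name')
ax.set_xlabel('Name')
ax.set_ylabel('Average Age')
plt.tight_layout()
plt.show()
```

Aggregating before plotting gives one bar per category, so repeated names no longer appear as separate points.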

## Integration with Matplotlib and Seaborn


- While pandas provides basic plotting capabilities, Matplotlib and Seaborn offer more advanced and customizable visualization options. Matplotlib is a comprehensive plotting library, while Seaborn is built on top of Matplotlib and provides additional functionalities and aesthetic improvements for statistical graphics.

- Customizing Plots with Matplotlib: Matplotlib allows extensive customization of plots, including setting titles, labels, legends, and more. You can control colors, markers, line styles, and plot sizes.
- Advanced Visualizations with Seaborn: Seaborn simplifies the creation of complex plots like heatmaps, violin plots, and pair plots. It also integrates well with pandas DataFrames and enhances the aesthetics of visualizations.


Examples:

In [None]:
import seaborn as sns

# Sample data
data = {
    'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi', 'Bhagath', 'Monika'],
    'Age': [25, 30, 35, 28, 40, 50],
    'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Chickkaballapur', 'Bangalore', 'Hyderabad']
}
df = pd.DataFrame(data)

# Advanced scatter plot with Seaborn
sns.scatterplot(data=df, x='Name', y='Age', hue='City', palette='viridis', s=100)
plt.title('Scatter Plot of Age by Name and City')
plt.xlabel('Name')
plt.ylabel('Age')
plt.show()

# Heatmap of Age by City
age_city = df.pivot_table(index='City', values='Age', aggfunc='mean')
sns.heatmap(age_city, annot=True, cmap='coolwarm', cbar=True)
plt.title('Heatmap of Average Age by City')
plt.xlabel('Age')
plt.ylabel('City')
plt.show()

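# Pair plot (Age is the only numeric column here, so the grid has a single panel)
g = sns.pairplot(df, hue='City', palette='pastel')
g.fig.suptitle('Pair Plot of Age by City', y=1.02)  # title for the whole figure
plt.show()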


### Output:

- `Advanced Scatter Plot:` Shows Age distribution across different Names with color-coding based on City.
- `Heatmap:` Displays the average Age for each City, using color intensity to represent values.
- `Pair Plot:` Illustrates relationships between variables and their distributions across different Cities.
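
One way to combine the two libraries is to let pandas draw onto an Axes created by Matplotlib and then keep customizing that same Axes. The sketch below assumes a four-row version of the sample data; the reference line and the colors are illustrative choices only:

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi'],
    'Age': [25, 30, 35, 28],
})

# Create the Figure and Axes with Matplotlib, then hand the Axes to pandas
fig, ax = plt.subplots(figsize=(6, 4))
df.plot(x='Name', y='Age', kind='bar', ax=ax, legend=False,
        color='lightgray', edgecolor='black')

# Continue with plain Matplotlib calls on the same Axes
mean_age = df['Age'].mean()
ax.axhline(mean_age, color='red', linestyle='--', label=f'Mean age ({mean_age:.1f})')
ax.set_title('Age by Name with Mean Reference Line')
ax.set_ylabel('Age')
ax.legend()
plt.tight_layout()
plt.show()
```

Passing `ax=` keeps everything in one figure, which helps when a quick pandas plot needs a few Matplotlib refinements.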

## Matplotlib

### 1. Introduction to Matplotlib

- Matplotlib is a plotting library that can produce high-quality graphs and plots. It is highly customizable and supports various backends and output formats. Its most commonly used module is `pyplot`, which provides a MATLAB-like interface for creating plots.

 - `Installation:` To use Matplotlib, you need to install it via pip.
 ```bash
 pip install matplotlib
 ```


 - Basic Concepts:

    - Figure: The entire window or image where plotting takes place.
    - Axes: The area within the figure where the data is plotted (i.e., the actual graph).
    - Artist: Everything you see on the figure (e.g., lines, labels, markers).
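
The three concepts above can be inspected directly in code. This is a small sketch using the object-oriented interface; the variable names are arbitrary:

```python
import matplotlib.pyplot as plt
from matplotlib.artist import Artist

# One Figure containing one Axes
fig, ax = plt.subplots(figsize=(5, 3))

# ax.plot returns a list of Line2D artists; set_title returns a Text artist
(line,) = ax.plot([1, 2, 3], [1, 4, 9], marker='o')
title = ax.set_title('Figure, Axes, and Artists')

print(fig.axes == [ax])           # the Figure keeps track of its Axes
print(line.axes is ax)            # the line belongs to this Axes
print(isinstance(line, Artist))   # lines are Artists
print(isinstance(title, Artist))  # so are text elements such as the title
plt.show()
```

All four checks print `True`, which reflects the containment hierarchy: a Figure holds Axes, and each Axes holds the Artists drawn on it.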

Examples:


In [None]:
import matplotlib.pyplot as plt

# Creating a simple line plot
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

plt.plot(x, y, marker='o')  # Plot with circles at data points
plt.title('Simple Line Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.grid(True)
plt.show()


### 2. Basic Plot Types


- Matplotlib supports various plot types for different data visualization needs. Some of the most common plots are line plots, bar plots, scatter plots, and histograms.

  - `Line Plot:` Useful for showing trends over time or continuous data.
  - `Bar Plot:` Ideal for comparing quantities across different categories.
  - `Scatter Plot:` Great for visualizing relationships between two numeric variables.
  - `Histogram:` Useful for showing the distribution of a numeric variable.

  
Examples:

In [None]:
import matplotlib.pyplot as plt

# Line Plot
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
plt.plot(x, y, label='Line Plot', color='blue')
plt.legend()
plt.show()

# Bar Plot
categories = ['A', 'B', 'C', 'D']
values = [10, 20, 15, 25]
plt.bar(categories, values, color='green')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Bar Plot')
plt.show()

# Scatter Plot
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
plt.scatter(x, y, color='red')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot')
plt.show()

# Histogram
data = [1, 2, 2, 3, 4, 4, 4, 5, 6, 7]
plt.hist(data, bins=5, color='purple', edgecolor='black')
plt.xlabel('Bins')
plt.ylabel('Frequency')
plt.title('Histogram')
plt.show()
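
To see how a histogram groups values, it can help to look at what `plt.hist` returns. The sketch below reuses the histogram data from the example above and prints the per-bin counts and the bin edges:

```python
import matplotlib.pyplot as plt

data = [1, 2, 2, 3, 4, 4, 4, 5, 6, 7]

# plt.hist returns the counts per bin, the bin edges, and the drawn bars
counts, edges, bars = plt.hist(data, bins=5, color='purple', edgecolor='black')

print(counts)  # number of values that fall into each of the 5 bins
print(edges)   # 6 edges: equal-width bins spanning min(data) to max(data)
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram with Inspected Bins')
plt.show()
```

With `bins=5` the range from 1 to 7 is split into five bins of width 1.2, so the counts come out as 3, 1, 3, 1, and 2.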


### 3. Customizing Plots


- Matplotlib allows extensive customization of plots to improve readability and aesthetics. You can adjust colors, line styles, markers, and more. Customizing plots helps in tailoring the visualizations to specific needs and enhancing their interpretability.

Examples:

In [None]:
import matplotlib.pyplot as plt

# Customized Line Plot
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
plt.plot(x, y, marker='o', linestyle='--', color='magenta', linewidth=2, markersize=10)
plt.title('Customized Line Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.grid(True)
plt.show()

# Customized Bar Plot
categories = ['A', 'B', 'C', 'D']
values = [10, 20, 15, 25]
plt.bar(categories, values, color=['cyan', 'yellow', 'orange', 'green'], edgecolor='black', hatch='//')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Customized Bar Plot')
plt.show()
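
Customization also covers legends, axis limits, and annotations. The following sketch plots two series on one Axes; the second series (squares) and the annotation text are made up for illustration:

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
primes = [2, 3, 5, 7, 11]
squares = [1, 4, 9, 16, 25]  # illustrative second series

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, primes, marker='o', linestyle='--', color='magenta', label='Primes')
ax.plot(x, squares, marker='s', linestyle='-', color='teal', label='Squares')

# Point an arrow at the largest value of the squares series
ax.annotate('Largest value', xy=(5, 25), xytext=(2.5, 22),
            arrowprops=dict(arrowstyle='->'))

ax.set_xlim(0, 6)
ax.set_title('Two Series with Legend and Annotation')
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.legend(loc='upper left')
plt.show()
```

The `label` arguments feed the legend, so each series only needs to be named once.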


### 4. Creating Subplots


- Subplots allow you to display multiple plots in a single figure. This is useful for comparing different datasets or visualizing multiple aspects of a single dataset. `plt.subplot()` adds one Axes at a time, while `plt.subplots()` creates the figure and a whole grid of Axes in a single call, which is more convenient for larger layouts.

Examples:

In [None]:
import matplotlib.pyplot as plt

# Creating Subplots
fig, axs = plt.subplots(2, 2, figsize=(10, 8))

# Line Plot
axs[0, 0].plot([1, 2, 3], [4, 5, 6], 'tab:blue')
axs[0, 0].set_title('Line Plot')

# Bar Plot
axs[0, 1].bar(['A', 'B', 'C'], [10, 20, 15], color='orange')
axs[0, 1].set_title('Bar Plot')

# Scatter Plot
axs[1, 0].scatter([1, 2, 3], [4, 5, 6], color='red')
axs[1, 0].set_title('Scatter Plot')

# Histogram
axs[1, 1].hist([1, 2, 2, 3, 4, 4, 4, 5], bins=5, color='purple')
axs[1, 1].set_title('Histogram')

plt.tight_layout()
plt.show()
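
For comparison, here is a minimal sketch of the same idea with `plt.subplot()`, which adds one Axes at a time. The shared y-axis is an optional extra chosen for this illustration:

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 3))

# plt.subplot(nrows, ncols, index): index counts from 1, left to right
ax1 = plt.subplot(1, 2, 1)
ax1.plot([1, 2, 3], [4, 5, 6], color='tab:blue')
ax1.set_title('Line Plot')

# The second Axes shares its y-axis with the first for easier comparison
ax2 = plt.subplot(1, 2, 2, sharey=ax1)
ax2.scatter([1, 2, 3], [4, 5, 6], color='red')
ax2.set_title('Scatter Plot')

plt.tight_layout()
plt.show()
```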


### 5. Saving Figures


- Matplotlib allows you to save figures in various formats (e.g., PNG, PDF, SVG) using the savefig() function. This is useful for exporting plots for reports, publications, or presentations.

Examples:

In [None]:
import matplotlib.pyplot as plt

# Creating a simple plot
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
plt.plot(x, y, marker='o', color='blue')
plt.title('Line Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')

# Saving the plot
plt.savefig('line_plot.png', dpi=300)
plt.savefig('line_plot.pdf')
plt.show()
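
The output format is inferred from the file extension, and a few options are commonly useful when exporting. The sketch below writes into a temporary directory (an assumption made so the example does not clutter the working directory) and uses `bbox_inches='tight'` to trim surrounding whitespace:

```python
import os
import tempfile
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4, 5], [2, 3, 5, 7, 11], marker='o', color='blue')
ax.set_title('Line Plot')

# Temporary output directory, used only for this illustration
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, 'line_plot.png')
svg_path = os.path.join(out_dir, 'line_plot.svg')

# Save before closing the figure; dpi only affects raster formats such as PNG
fig.savefig(png_path, dpi=150, bbox_inches='tight')
fig.savefig(svg_path, bbox_inches='tight', transparent=True)
plt.close(fig)

print(os.path.getsize(png_path) > 0)
```

Calling `savefig` on the Figure object avoids depending on which figure is currently active.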


# Seaborn
- Seaborn is a Python visualization library based on Matplotlib that provides a high-level interface for drawing attractive and informative statistical graphics. It is particularly well-suited for data exploration and statistical analysis. Seaborn integrates closely with Pandas data structures and is known for its attractive default styles and color palettes.

### 1. Introduction to Seaborn


- Seaborn builds on Matplotlib and provides a higher-level interface for drawing statistical plots. It simplifies the creation of complex visualizations and supports data-driven visualizations with ease. Seaborn is designed to work seamlessly with Pandas DataFrames and Series, making it a powerful tool for data analysis and visualization.

  - `Installation:` To use Seaborn, you need to install it via pip.
  ```bash
  pip install seaborn
  ```



#### Basic Concepts:

- `Axes:` Similar to Matplotlib, Seaborn plots are drawn on Axes.
- `Data:` Seaborn functions work directly with Pandas DataFrames, allowing for easy data manipulation and visualization.
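
Both points can be seen in a few lines: an axes-level Seaborn function takes a DataFrame plus column names, draws on a Matplotlib Axes, and returns that Axes. The small DataFrame below is made up for illustration so the sketch does not need to download a sample dataset:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.axes import Axes

# Made-up data, for illustration only
df = pd.DataFrame({
    'total_bill': [10.5, 20.0, 15.2, 30.1, 25.4],
    'tip': [1.5, 3.0, 2.0, 5.0, 4.0],
    'time': ['Lunch', 'Dinner', 'Lunch', 'Dinner', 'Dinner'],
})

fig, ax = plt.subplots(figsize=(5, 4))

# Columns are referenced by name; ax= selects the Axes to draw on
returned = sns.scatterplot(data=df, x='total_bill', y='tip', hue='time', ax=ax)

print(returned is ax)  # the function returns the Axes it drew on
ax.set_title('Tip vs Total Bill')
plt.show()
```

Since the result is an ordinary Matplotlib Axes, any Matplotlib customization can follow the Seaborn call.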



Examples:

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

# Load a sample dataset
data = sns.load_dataset('iris')

# Creating a basic scatter plot
sns.scatterplot(data=data, x='sepal_length', y='sepal_width', hue='species')
plt.title('Scatter Plot of Sepal Length vs Sepal Width')
plt.show()


### 2. Basic Plot Types


- Seaborn provides several types of plots that are commonly used in data analysis and exploration. Some of the basic plot types include:

  - `Scatter Plot:` Useful for visualizing relationships between two numeric variables.
  - `Line Plot:` Useful for showing trends over time or continuous data.
  - `Bar Plot:` Ideal for comparing quantities across different categories.
  - `Histogram:` Useful for showing the distribution of a numeric variable.
  - `Box Plot:` Useful for visualizing the distribution and spread of data, including outliers.


Examples:

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

# Load a sample dataset
data = sns.load_dataset('titanic')

# Line Plot
sns.lineplot(data=data, x='age', y='fare', hue='class')
plt.title('Line Plot of Age vs Fare by Class')
plt.show()

# Bar Plot
sns.barplot(data=data, x='class', y='fare')
plt.title('Bar Plot of Average Fare by Class')
plt.show()

# Histogram
sns.histplot(data=data, x='age', bins=20, kde=True)
plt.title('Histogram of Age Distribution')
plt.show()

# Box Plot
sns.boxplot(data=data, x='class', y='age')
plt.title('Box Plot of Age by Class')
plt.show()


### 3. Customizing Plots


-  Seaborn allows extensive customization of plots to enhance their appearance and make them more informative. You can customize aspects such as colors, markers, and styles. Seaborn also provides several themes and color palettes to improve the visual appeal of your plots.

Examples:


In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

# Load a sample dataset
data = sns.load_dataset('tips')

# Setting the style
sns.set(style='whitegrid')

# Customizing the scatter plot
sns.scatterplot(data=data, x='total_bill', y='tip', hue='day', style='time', palette='deep')
plt.title('Customized Scatter Plot of Total Bill vs Tip')
plt.show()

# Customizing the bar plot
sns.barplot(data=data, x='day', y='total_bill', palette='coolwarm', ci=None)
plt.title('Customized Bar Plot of Total Bill by Day')
plt.show()
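
Themes and palettes can also be applied locally instead of globally. The sketch below assumes a small made-up DataFrame; it builds a palette with `sns.color_palette` and applies the 'whitegrid' style only inside a `with` block:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up data, for illustration only
df = pd.DataFrame({
    'total_bill': [12.0, 18.5, 22.3, 30.0],
    'tip': [2.0, 3.0, 3.5, 5.0],
    'day': ['Thur', 'Fri', 'Sat', 'Sun'],
})

# A palette is a list of RGB tuples; here one color per day
palette = sns.color_palette('viridis', n_colors=4)
print(len(palette))

# The style applies only to figures created inside the with block
with sns.axes_style('whitegrid'):
    fig, ax = plt.subplots(figsize=(5, 4))
    sns.scatterplot(data=df, x='total_bill', y='tip', hue='day',
                    palette=palette, s=100, ax=ax)
    ax.set_title('Scatter Plot with a Local Style and Palette')
plt.show()
```

Using a context manager keeps the styling change from leaking into later plots in the same notebook.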


### 4. Advanced Visualizations

- Seaborn provides advanced visualization techniques that are especially useful for statistical analysis and data exploration. These include:

  - `Pair Plots:` Useful for visualizing pairwise relationships in a dataset.
  - `Heatmaps:` Useful for displaying matrix data and correlations.
  - `Facet Grids:` Useful for creating multi-plot grids based on categorical variables.


Examples:

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

# Load a sample dataset
data = sns.load_dataset('flights')

# Pair Plot
sns.pairplot(data=sns.load_dataset('iris'), hue='species')
plt.show()

# Heatmap (correlations are computed on the numeric columns only)
corr = data.select_dtypes(include='number').corr()
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.title('Heatmap of Correlations')
plt.show()

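# Facet Grid (uses the tips dataset, which has the 'sex', 'time', 'total_bill', and 'tip' columns)
tips = sns.load_dataset('tips')
g = sns.FacetGrid(data=tips, col='sex', row='time')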
g.map_dataframe(sns.scatterplot, x='total_bill', y='tip')
g.add_legend()
plt.show()
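
Heatmaps are not limited to correlation matrices. Any table whose rows and columns are categories can be shown this way. The sketch below pivots long-format data into a matrix first; the `sales` DataFrame and its values are hypothetical:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical long-format data: one row per (year, quarter) pair
sales = pd.DataFrame({
    'year': [2022, 2022, 2022, 2023, 2023, 2023],
    'quarter': ['Q1', 'Q2', 'Q3', 'Q1', 'Q2', 'Q3'],
    'units': [120, 135, 150, 140, 160, 175],
})

# Reshape to a matrix: quarters as rows, years as columns
matrix = sales.pivot(index='quarter', columns='year', values='units')

# annot=True writes each value in its cell; fmt='d' formats them as integers
ax = sns.heatmap(matrix, annot=True, fmt='d', cmap='coolwarm')
ax.set_title('Units Sold by Quarter and Year')
plt.show()
```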


### 5. Integration with Pandas

- Seaborn integrates smoothly with Pandas DataFrames and Series, making it easy to create visualizations directly from DataFrame objects. This integration allows you to leverage Pandas' data manipulation capabilities in conjunction with Seaborn's powerful visualization features.

Examples:

In [None]:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

# Creating a DataFrame
data = pd.DataFrame({
    'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi'],
    'Age': [25, 30, 35, 28],
    'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Chickkaballapur']
})

# Plotting with Seaborn
sns.barplot(data=data, x='City', y='Age', palette='Set1')
plt.title('Bar Plot of Age by City')
plt.show()



# Plotly

- Plotly is an interactive graphing library that allows for the creation of rich, interactive visualizations. It provides a high-level interface for creating complex plots with interactivity and customization. Plotly integrates with Pandas and can be used to generate interactive plots directly from data frames.

## 1. Introduction to Plotly


- Plotly is designed for creating interactive plots that can be embedded in web applications, notebooks, and dashboards. Unlike static plots, Plotly graphs support zooming, panning, and hovering to reveal additional information. Plotly supports a wide range of plot types, including line charts, scatter plots, bar charts, and more complex visualizations like heatmaps and 3D plots.

  - `Installation:` To use Plotly, you need to install it via pip.
  ```bash
  pip install plotly
  ```


#### Basic Concepts:

  - `Figures:` Plotly plots are represented by Figure objects.
  - `Traces:` These are the components of a plot, representing data series.
  - `Layouts:` Define the appearance and layout of the plot.


  
Examples:

In [None]:
import plotly.express as px
import pandas as pd

# Create a sample DataFrame
data = pd.DataFrame({
    'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi'],
    'Age': [25, 30, 35, 28],
    'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Chickkaballapur']
})

# Creating a scatter plot
fig = px.scatter(data, x='Age', y='City', color='Name', title='Scatter Plot of Age by City')
fig.show()


## 2. Basic Plot Types


- Plotly supports a wide variety of plot types. Some of the basic plot types include:

  - `Line Plot:` Useful for showing trends over time or continuous data.
  - `Scatter Plot:` Useful for visualizing the relationship between two numeric variables.
  - `Bar Plot:` Ideal for comparing quantities across different categories.
  - `Histogram:` Useful for showing the distribution of a numeric variable.
  - `Box Plot:` Useful for visualizing the distribution and spread of data, including outliers.


Examples:

In [None]:
import plotly.express as px
import pandas as pd

# Load a sample dataset
data = px.data.tips()

# Line Plot (rows are sorted by x so the line does not jump back and forth)
fig = px.line(data.sort_values('total_bill'), x='total_bill', y='tip', color='day', title='Line Plot of Total Bill vs Tip')
fig.show()

# Bar Plot
fig = px.bar(data, x='day', y='total_bill', title='Bar Plot of Total Bill by Day')
fig.show()

# Histogram
fig = px.histogram(data, x='total_bill', nbins=30, title='Histogram of Total Bill Distribution')
fig.show()

# Box Plot
fig = px.box(data, x='day', y='total_bill', title='Box Plot of Total Bill by Day')
fig.show()


## 3. Customizing Plots


- Plotly provides extensive customization options to enhance the appearance and functionality of plots. You can customize aspects such as colors, markers, axes, and titles. Plotly's interactive features can also be fine-tuned to improve the user experience.

Examples:

In [None]:
import plotly.express as px
import pandas as pd

# Load a sample dataset
data = px.data.iris()

# Customizing the scatter plot
fig = px.scatter(data, x='sepal_width', y='sepal_length', color='species', symbol='species', size='petal_length',
                 title='Customized Scatter Plot of Sepal Width vs Sepal Length')
fig.update_layout(
    xaxis_title='Sepal Width (cm)',
    yaxis_title='Sepal Length (cm)',
    legend_title='Species'
)
fig.show()

# Customizing the bar plot
fig = px.bar(data, x='species', y='petal_length', color='species', text='petal_length',
             title='Customized Bar Plot of Petal Length by Species')
fig.update_traces(texttemplate='%{text:.2s}', textposition='outside')
fig.update_layout(yaxis_title='Petal Length (cm)')
fig.show()


## 4. Advanced Visualizations


- Plotly offers advanced visualization capabilities that go beyond basic plots. These include:

  - `3D Plots:` Useful for visualizing three-dimensional data.
  - `Heatmaps:` Useful for displaying matrix data and correlations.
  - `Subplots:` Useful for creating multiple plots in a single figure.
  - `Geospatial Maps:` Useful for visualizing data on geographical maps.


Examples:

In [None]:
import plotly.express as px
import pandas as pd

# Load a sample dataset
data = px.data.gapminder()

# 3D Scatter Plot
fig = px.scatter_3d(data, x='gdpPercap', y='lifeExp', z='pop', color='continent', size='pop',
                    title='3D Scatter Plot of GDP per Capita, Life Expectancy, and Population')
fig.show()

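# Heatmap (non-feature columns are dropped before computing correlations)
data_corr = px.data.iris().drop(columns=['species', 'species_id']).corr()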
fig = px.imshow(data_corr, text_auto=True, color_continuous_scale='Viridis', title='Heatmap of Feature Correlations')
fig.show()

# Subplots
from plotly.subplots import make_subplots
import plotly.graph_objects as go

tips = px.data.tips()  # the subplots use the tips dataset ('total_bill', 'tip', 'day')
fig = make_subplots(rows=1, cols=2, subplot_titles=('Scatter Plot', 'Bar Plot'))
fig.add_trace(go.Scatter(x=tips['total_bill'], y=tips['tip'], mode='markers', name='Scatter Plot'), row=1, col=1)
fig.add_trace(go.Bar(x=tips['day'], y=tips['total_bill'], name='Bar Plot'), row=1, col=2)
fig.update_layout(title='Subplots Example')
fig.show()

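# Geospatial Map (the carshare dataset names its columns 'centroid_lat', 'centroid_lon', and 'peak_hour')
carshare = px.data.carshare()
fig = px.scatter_mapbox(carshare, lat='centroid_lat', lon='centroid_lon', color='peak_hour', size='car_hours',
                        mapbox_style='carto-positron', title='Geospatial Map of Carshare Data')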
fig.show()


## 5. Integration with Pandas


- Plotly integrates seamlessly with Pandas DataFrames and Series, allowing you to create interactive plots directly from Pandas data structures. This integration simplifies the process of visualizing data and enhances the capabilities of your data analysis workflow.

Examples:

In [None]:
import plotly.express as px
import pandas as pd

# Creating a DataFrame
data = pd.DataFrame({
    'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi'],
    'Age': [25, 30, 35, 28],
    'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Chickkaballapur']
})

# Creating a bar plot with Plotly
fig = px.bar(data, x='City', y='Age', color='Name', title='Bar Plot of Age by City')
fig.show()
