
# A Short Introduction to the Seaborn Library

📊  Are you ready to level up your data visualization game? 

🌟 Seaborn is your ticket to creating stunning visualizations that bring your data to life! 💫 Whether you're a seasoned data scientist or just getting started on your data journey, Seaborn's user-friendly interface and powerful capabilities make it a must-have tool in your toolkit. 

🛠️ Let's dive in and explore how you can harness the full potential of Seaborn in just a few simple steps! 

# Example

### Step 1: Import Seaborn and Your Data
📥 First things first, fire up your Jupyter Notebook or Python environment and import Seaborn along with your dataset. 🐍 Seaborn plots pandas DataFrames, so whether your data lives in CSV files, Excel spreadsheets, or Kaggle datasets, load it with pandas first and then hand the DataFrame to Seaborn.



In [72]:
# Import libraries
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt


# Suppress warnings to keep the notebook output tidy
import warnings
warnings.filterwarnings('ignore')

In [73]:
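# Load your dataset ('your dataset' is a placeholder; replace it with the path to your CSV file)
df = pd.read_csv('your dataset')
df.head()  # head() is a DataFrame method, so call it on df, not on pd

Note: run as-is, this cell raises `FileNotFoundError: [Errno 2] No such file or directory: 'your dataset'`, because the placeholder path does not exist.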
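
Seaborn itself does not read files; loading is pandas' job. Here is a minimal sketch of a loader that picks the pandas reader from the file extension and fails with a clearer message when the path is wrong. The helper name `load_table` and the demo file are hypothetical, and reading Excel files assumes an Excel engine such as `openpyxl` is installed.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_table(path):
    """Load a CSV or Excel file into a pandas DataFrame (hypothetical helper)."""
    path = Path(path)
    if not path.exists():
        # Fail early with a clearer message than the raw pandas traceback
        raise FileNotFoundError(f"No file at {path}; check the path and working directory")
    suffix = path.suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in {".xlsx", ".xls"}:
        return pd.read_excel(path)
    raise ValueError(f"Unsupported file type: {suffix}")


# Demo: write a tiny CSV to a temporary folder, then load it back
demo_dir = Path(tempfile.mkdtemp())
demo_path = demo_dir / "demo.csv"
demo_path.write_text("x,y\n1,2\n3,4\n")
demo_df = load_table(demo_path)
print(demo_df.head())
```

The returned DataFrame can then be passed to any Seaborn function through its `data=` argument.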

### Step 2: Choose Your Visualization 
Now comes the fun part – selecting the perfect visualization to showcase your insights! 🎨 Seaborn offers a wide range of plots, from simple scatter plots to intricate heatmaps. Let's say you want to explore the relationship between two variables – Seaborn's scatterplot() has got you covered!


In [None]:
# Create a scatter plot
sns.scatterplot(x='your_x_variable', y='your_y_variable', data=df)

### Step 3: Customize and Enhance 
With Seaborn, customization is key! Tweak colors, adjust marker styles, and add labels to make your visualizations pop!


In [None]:
# Customize your scatter plot
sns.scatterplot(x='your_x_variable', y='your_y_variable', data=df, color='skyblue', marker='o')
plt.title('Your Title Here')
plt.xlabel('X Axis Label')
plt.ylabel('Y Axis Label')
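
Axes-level Seaborn functions such as `scatterplot()` return the Matplotlib `Axes` they draw on, so the same labels can also be set on that object instead of through `plt`. A small sketch, assuming a made-up DataFrame with hypothetical columns `height` and `weight`:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data, only for illustration
demo = pd.DataFrame({
    "height": [150, 160, 170, 180, 190],
    "weight": [50, 58, 69, 80, 92],
})

fig, ax = plt.subplots(figsize=(6, 4))
# scatterplot() draws on the Axes we pass in and returns that same Axes
ax = sns.scatterplot(data=demo, x="height", y="weight",
                     color="skyblue", marker="o", ax=ax)
# Set the title and both axis labels in one call on the Axes object
ax.set(title="Weight vs. Height", xlabel="Height (cm)", ylabel="Weight (kg)")
```

Working with the `Axes` object tends to scale better once a figure holds more than one plot, because each subplot keeps its own labels.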

### Step 4: Show it Off!
Once you've polished your masterpiece, it's time to share it! Whether you're presenting your findings to stakeholders, sharing insights with colleagues, or showcasing your skills on LinkedIn, Seaborn's visually stunning plots are sure to impress.


In [None]:
# Display your plot
plt.show()
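
`plt.show()` only displays the figure; to share it in a report or a post, it usually also needs to be written to a file. A minimal sketch using Matplotlib's `savefig`, with made-up data and a hypothetical output file name:

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data, only for illustration
demo = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 4, 5, 8]})

fig, ax = plt.subplots(figsize=(6, 4))
sns.scatterplot(data=demo, x="x", y="y", ax=ax)
ax.set_title("Demo scatter plot")

# Hypothetical output location; point this at your own project folder
out_path = Path(tempfile.gettempdir()) / "seaborn_demo_scatter.png"
# dpi=150 gives a reasonably sharp image; bbox_inches="tight" trims empty margins
fig.savefig(out_path, dpi=150, bbox_inches="tight")
print(f"Saved figure to {out_path}")
```

In a script, save the figure before calling `plt.show()`, since the figure may be cleared once its window is closed.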

# About Author

Hello everyone! Welcome to my data science notebook.
👋I'm **Motsim Aslam**, and I'm excited to have you join me on my journey of exploring and innovating in the world of data science. 📊
I'm passionate about uncovering the hidden secrets within datasets and using machine learning to make a meaningful impact. Let's dive in together and extract valuable insights!


# Connect with me

[![GitHub](https://img.shields.io/badge/GitHub-Profile-<COLOR>?style=flat-square&logo=github)](https://www.kaggle.com/motsimaslam)

[![Kaggle](https://img.shields.io/badge/Kaggle-Profile-<COLOR>?style=flat-square&logo=kaggle)](https://www.kaggle.com/MotsimAslam)

[![LinkedIn](https://img.shields.io/badge/LinkedIn-Profile-<COLOR>?style=flat-square&logo=linkedin)](https://www.linkedin.com/in/motsimaslam/)

Feel free to connect with me on GitHub, Kaggle, and LinkedIn!




## A Real Example: the Iris Dataset

Let's use a real dataset and create visualizations using Seaborn. We'll use the famous Iris dataset, which contains information about iris flowers.

# Overview of Dataset

### Iris Dataset Summary:

The Iris dataset is a classic dataset in the field of machine learning and data science. It consists of measurements of various features of three species of iris flowers: Setosa, Versicolor, and Virginica. The dataset contains 150 samples, with 50 samples for each species. Each sample consists of four features: sepal length, sepal width, petal length, and petal width, all measured in centimeters.

The goal of the dataset is to classify iris flowers based on these features. It serves as a fundamental dataset for practicing classification algorithms and exploring data visualization techniques. The Iris dataset is widely used in introductory machine learning courses, benchmarking algorithms, and exploring techniques for data analysis and visualization.

#### Key Points:

* Contains measurements of iris flowers from three species: Setosa, Versicolor, and Virginica.
* Consists of 150 samples, with 50 samples for each species.
* Each sample has four features: sepal length, sepal width, petal length, and petal width.
* The dataset is commonly used for classification tasks and exploring data visualization techniques.

# Exploratory Data Analysis (EDA) 📊🔍
We'll start by loading the dataset into a pandas DataFrame with Seaborn's `load_dataset()` to examine its structure and contents.


# Load the Necessary Libraries 

In [None]:
# Import necessary libraries
import seaborn as sns
import matplotlib.pyplot as plt
import warnings

# Ignore all warnings
warnings.filterwarnings("ignore")



# Load the Dataset

In [None]:
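# Load the Iris dataset as a pandas DataFrame
# (load_dataset fetches the file from Seaborn's online data repository on first use)
df = sns.load_dataset('iris')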

In [None]:
# Display the first few rows of the dataset
print("First few rows of the dataset:")
df.head()

In [None]:
df.tail()

# Summary Statistics 📈
We'll calculate summary statistics for numerical columns to understand the distribution of the data.

In [None]:
# Summary statistics for the numerical columns
print("Summary statistics for the numerical columns:")
df.describe()

In [None]:
# Finding information about the DataFrame's structure
df.info()

# Finding the Missing Values
Let's check the dataset for missing values.

In [None]:
# Count the missing values in each column
df.isnull().sum()

***As per the above output, there are no missing values in any column, so we can go ahead with the exploratory data analysis (EDA).***
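
Because Iris has no gaps, the check returns zero for every column. For comparison, here is a sketch of what the same check reports on a small made-up DataFrame that does contain missing values (the column names are hypothetical):

```python
import numpy as np
import pandas as pd

# Made-up data with deliberate gaps
demo = pd.DataFrame({
    "length": [5.1, np.nan, 4.7, 5.0],
    "width": [3.5, 3.0, np.nan, np.nan],
    "label": ["a", "b", None, "a"],
})

# Count of missing values per column
missing_counts = demo.isnull().sum()
# Share of missing values per column, as a percentage
missing_pct = demo.isnull().mean() * 100

print(missing_counts)
print(missing_pct.round(1))
```

Reporting the percentage alongside the raw count makes it easier to judge whether a column can be repaired or should be dropped.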

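# Dealing with Outliers

In [None]:
# Draw one box per numeric column; points beyond the whiskers are potential outliers
plt.figure(figsize=(8,6))
sns.boxplot(data=df)
plt.title("Box Plot of Iris Features")
plt.show()

***As per the above output, only `sepal_width` shows a few points beyond the whiskers; the other three features show none. These points look like plausible measurements rather than errors, so we keep them.***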
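
By default, a box plot flags points that lie more than 1.5 times the interquartile range (IQR) beyond the quartiles. The visual check can be backed by a count per column. This is a sketch of that rule; the helper name `iqr_outlier_counts` is hypothetical and the demo data is made up:

```python
import pandas as pd


def iqr_outlier_counts(frame, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] for each numeric column."""
    numeric = frame.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    # Boolean mask of values beyond either fence, compared column by column
    outside = (numeric < q1 - k * iqr) | (numeric > q3 + k * iqr)
    return outside.sum()


# Made-up data: column "b" contains one obvious outlier (100)
demo = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 6, 7, 8],
    "b": [10, 11, 12, 13, 14, 15, 16, 100],
})
counts = iqr_outlier_counts(demo)
print(counts)
```

On the Iris DataFrame, `iqr_outlier_counts(df)` gives per-feature counts to compare against the box plot.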

# Visualization
Let's create some attractive visualizations, such as a pair plot, a bar plot, and a pie plot.

In [None]:
# Make Pair plot
sns.pairplot(df, hue='species', palette='Dark2', diag_kind='kde')
plt.show()

In [None]:
df.head()

In [None]:
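# Make a line plot (repeated x values are averaged and drawn with a confidence band)
# Passing data= by keyword works across Seaborn versions
sns.lineplot(data=df, x='sepal_length', y='petal_width', hue='species')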
plt.xlabel('Sepal Length')
plt.ylabel('Petal Width')
plt.title('Line Plot: Sepal Length vs. Petal Width')
plt.show()

Dataset Summary

In [None]:
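# Make a bar plot of the mean sepal length per species
# errorbar=('ci', 68) draws a 68% confidence interval (Seaborn 0.12+; older releases used ci=68)
sns.barplot(data=df, x='species', y='sepal_length', palette='Dark2', errorbar=('ci', 68))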

# Add labels and title
plt.xlabel('Species')
plt.ylabel('Sepal Length')
plt.title('Bar Plot: Sepal Length by Species')

# Show the plot
plt.show()

In [None]:

species_count = df['species'].value_counts().reset_index() # Count the number of samples for each species
species_count.columns = ['species', 'count']  # Rename the columns

# Create a pie plot
plt.pie(species_count['count'], labels=species_count['species'], autopct='%1.1f%%', startangle=90)

# Add a title
plt.title('Pie Plot: Iris Dataset')

# Equal aspect ratio to ensure a circular pie
plt.axis('equal')

# Show the plot
plt.show()

In [None]:
# Select only the numerical columns
numerial_col=df.select_dtypes(include=['number'])

# Compute the correlation matrix
correlation_matrics=numerial_col.corr()

correlation_matrics


In [None]:
# Create a heatmap
sns.heatmap(correlation_matrics, annot=True)

# Add a title
plt.title('Correlation Matrix Heatmap')

# Show the plot
plt.show()

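Note that the Seaborn copy of the Iris dataset has no "Id" column, so the heatmap covers only the four measurements. If you load a version that does include one, drop it before computing correlations: it is only a unique identifier for each entry and doesn't provide meaningful correlations with other features.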

When looking at the relationships between sepal length, sepal width, petal length, and petal width:

* Sepal length tends to increase alongside petal length and petal width, showing a strong positive correlation.
* Sepal width generally decreases as petal length and petal width increase, indicating a moderate negative correlation.
* Petal length and petal width have strong positive correlations with each other and with sepal length, suggesting that as one increases, the others tend to increase as well.

In summary, sepal length is positively associated with petal length and width, while sepal width tends to decrease as petal size increases. Petal length and width are strongly related, and both correlate positively with sepal length. These are correlations only; they do not show that one measurement causes another.
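
To read the same relationships as numbers rather than colors, the correlation matrix can be flattened into a ranked list of feature pairs. A sketch, with a hypothetical helper name `ranked_correlations` and made-up demo data:

```python
import numpy as np
import pandas as pd


def ranked_correlations(frame):
    """Return unique feature pairs sorted by absolute Pearson correlation."""
    corr = frame.select_dtypes(include="number").corr()
    # Keep only the upper triangle (k=1 excludes the diagonal) to avoid duplicate pairs
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(upper).stack().dropna()
    # Order by strength while keeping the sign of each correlation
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False)


# Made-up data: "b" rises with "a", while "c" falls as "a" rises
demo = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 11],
    "c": [9, 7, 6, 3, 1],
})
ranked = ranked_correlations(demo)
print(ranked.round(2))
```

Called on the Iris DataFrame as `ranked_correlations(df)`, this lists the six feature pairs behind the heatmap, strongest first.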

# Your Feedback

### **If you have any questions, feel free to reach out. Your feedback is always valuable to me.**