# Day-2: Data Science Libraries

# Topics Covered
1. Pandas
2. NumPy
3. Matplotlib
4. Seaborn

## Pandas

Pandas is a powerful library for data manipulation and analysis. It provides data structures such as Series and DataFrame, which are essential for handling and analyzing structured data.

- Key Features
    - DataFrame: A 2-dimensional labeled data structure with columns of potentially different types.
    - Series: A 1-dimensional labeled array capable of holding any data type.
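
To make the relationship between the two structures concrete, here is a minimal sketch. The labels and values are made up for illustration. It shows that selecting one column of a DataFrame gives back a Series that shares the DataFrame's index.

```python
import pandas as pd

# A Series: one-dimensional values plus an index of labels (illustrative data)
ages = pd.Series([25, 30, 35], index=['alice', 'bob', 'charlie'], name='age')

# A DataFrame: several columns, possibly of different types, sharing one index
people = pd.DataFrame(
    {'age': [25, 30, 35], 'city': ['New York', 'Los Angeles', 'Chicago']},
    index=['alice', 'bob', 'charlie'],
)

# Selecting a single column returns a Series
age_column = people['age']
print(type(age_column).__name__)

# Values can be looked up by label in both structures
print(ages['bob'])
print(people.loc['bob', 'city'])
```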

##### PyPI page: https://pypi.org/project/pandas/

### Installation

In [None]:
! pip install pandas

### Basic Usage

#### Creating a DataFrame

In [None]:
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

df = pd.DataFrame(data)
print(df)



#### Reading and Writing Data

In [None]:
# Writing to a CSV file (index=False leaves out the row index)
df.to_csv('output.csv', index=False)

# Reading from a CSV file (here, the file that was just written)
df2 = pd.read_csv('output.csv')
print(df2)
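
As a way to try the write and read steps without touching the disk, the sketch below round-trips a small DataFrame through an in-memory text buffer. The names `df_demo`, `buffer`, and `df_roundtrip` are illustrative and not part of the notebook above.

```python
import io

import pandas as pd

df_demo = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# Write CSV text to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
df_demo.to_csv(buffer, index=False)

# Inspect the raw CSV text that was produced
print(buffer.getvalue())

# Rewind the buffer and read the same text back into a new DataFrame
buffer.seek(0)
df_roundtrip = pd.read_csv(buffer)
print(df_roundtrip)
```

Without `index=False`, the row index would be written as an extra unnamed column and would come back as a regular column when the file is read.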


### Basic Operations

In [None]:
# Displaying the first few rows
print(df.head())

# Summary statistics of the numeric columns (here, only Age)
print(df.describe())

# Selecting a column
print(df['Name'])

# Filtering rows
print(df[df['Age'] > 30])
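
Filters can also be combined. The following sketch, which rebuilds the same small table under the illustrative name `people`, shows one common way to apply two conditions at once and keep only some of the columns.

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# Each comparison produces a boolean Series; combine them with & (and) or | (or).
# The parentheses are required because & binds more tightly than the comparisons.
mask = (people['Age'] >= 30) & (people['City'] != 'Chicago')

# .loc selects rows by the mask and columns by label in one step
selected = people.loc[mask, ['Name', 'Age']]
print(selected)
```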


## NumPy

NumPy is the foundational package for numerical computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays.

- Key Features
    - N-dimensional array object (ndarray)
    - Mathematical functions for operations on arrays

### Installation

In [None]:
! pip install numpy


### Basic Usage


#### Creating Arrays

In [None]:
import numpy as np

# Creating an array
arr = np.array([1, 2, 3, 4, 5])
print(arr)

# Creating a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr_2d)


### Basic Operations

#### Array Operations

In [None]:
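# Arithmetic with a scalar is applied to every element
print(arr + 2)
print(arr * 3)

# Functions such as np.sqrt also work element-wise
print(np.sqrt(arr))

# Array slicing: elements at indices 1, 2, and 3 (the stop index is excluded)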
print(arr[1:4])
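
Slicing extends to two-dimensional arrays, where a comma separates the row and column selections. The sketch below uses an illustrative array named `grid` and also shows selection with a boolean mask.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

first_row = grid[0]           # [1 2 3]
last_column = grid[:, -1]     # [3 6]
sub_block = grid[:, 1:]       # [[2 3], [5 6]]

# A boolean mask keeps only the elements where the condition is True
evens = grid[grid % 2 == 0]   # [2 4 6]

print(first_row, last_column, evens)
print(sub_block)
```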


#### Statistical Functions

In [None]:
print(np.mean(arr))  # arithmetic mean
print(np.std(arr))   # population standard deviation (ddof=0 by default)
print(np.sum(arr))   # sum of all elements
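
On multi-dimensional arrays these functions accept an `axis` argument, and `np.std` accepts `ddof` to switch between the population and sample formulas. A minimal sketch, with illustrative variable names:

```python
import numpy as np

table = np.array([[1, 2, 3], [4, 5, 6]])

total = np.sum(table)                # all elements: 21
column_sums = np.sum(table, axis=0)  # collapse the rows: [5 7 9]
row_means = np.mean(table, axis=1)   # collapse the columns: [2. 5.]

values = np.array([1, 2, 3, 4, 5])
population_std = np.std(values)      # divides by n (the NumPy default)
sample_std = np.std(values, ddof=1)  # divides by n - 1

print(total, column_sums, row_means)
print(population_std, sample_std)
```

The `ddof` distinction matters when comparing results across libraries: pandas' `Series.std()` uses `ddof=1` by default, so it will not match `np.std` on the same data unless one of the two is adjusted.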


## Matplotlib

Matplotlib is a plotting library for creating static, animated, and interactive visualizations in Python.

### Installation

In [None]:
! pip install matplotlib

### Basic Usage

#### Importing Matplotlib

In [None]:
import matplotlib.pyplot as plt

#### Creating a Simple Plot

In [None]:
x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 35]

plt.plot(x, y)
plt.xlabel('X-axis label')
plt.ylabel('Y-axis label')
plt.title('Simple Line Plot')
plt.show()
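
The same line plot can also be built with Matplotlib's object-oriented interface, which keeps explicit references to the figure and its axes. The sketch below is one way to do it. The marker, legend label, and in-memory buffer are illustrative choices, not part of the example above.

```python
import io

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 35]

# plt.subplots() returns the figure and an Axes object to draw on
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, y, marker='o', label='y values')
ax.set_xlabel('X-axis label')
ax.set_ylabel('Y-axis label')
ax.set_title('Simple Line Plot')
ax.legend()

# Render the figure to an in-memory PNG; pass a filename instead to save to disk
png_buffer = io.BytesIO()
fig.savefig(png_buffer, format='png', dpi=100)
```

In a notebook the figure is typically rendered inline at the end of the cell. In a script, call `plt.show()` to open a window.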


#### Creating Different Types of Plots

In [None]:

# Scatter plot
plt.scatter(x, y)
plt.title('Scatter Plot')
plt.show()

# Bar plot
plt.bar(x, y)
plt.title('Bar Plot')
plt.show()

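# Histogram of 1000 samples drawn from a standard normal distribution
# (uses NumPy, imported as np in the NumPy section)
data = np.random.randn(1000)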
plt.hist(data, bins=30)
plt.title('Histogram')
plt.show()
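
To compare plot types side by side rather than in separate figures, the three plots can share one figure. The sketch below assumes a 1x3 grid and a fixed random seed. Both are illustrative choices.

```python
import matplotlib.pyplot as plt
import numpy as np

x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 35]

# A seeded generator makes the histogram reproducible
rng = np.random.default_rng(seed=0)
samples = rng.standard_normal(1000)

# One row, three columns; axes is an array of three Axes objects
fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

axes[0].scatter(x, y)
axes[0].set_title('Scatter Plot')

axes[1].bar(x, y)
axes[1].set_title('Bar Plot')

axes[2].hist(samples, bins=30)
axes[2].set_title('Histogram')

# Adjust spacing so titles and tick labels do not overlap
fig.tight_layout()
```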



## Seaborn

Seaborn is a statistical data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

### Installation

In [None]:
! pip install seaborn


### Basic Usage

#### Importing Seaborn

In [None]:
import seaborn as sns

#### Creating a Simple Plot with Seaborn

In [None]:
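# Load the iris example dataset (fetched online on first use, so this needs an internet connection)
# Note: this reassigns df, replacing the DataFrame from the Pandas section
df = sns.load_dataset('iris')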

# Create a scatter plot
sns.scatterplot(data=df, x='sepal_length', y='sepal_width', hue='species')
plt.title('Scatter Plot with Seaborn')
plt.show()
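
Seaborn works the same way with any DataFrame, which is useful when no internet connection is available for `load_dataset`. The sketch below uses a small hand-made table. The column names and values are invented for illustration.

```python
import pandas as pd
import seaborn as sns

# Illustrative data: two numeric columns and one categorical column
measurements = pd.DataFrame({
    'height_cm': [150, 160, 170, 165, 175, 180],
    'weight_kg': [50, 58, 68, 62, 74, 80],
    'group': ['a', 'a', 'a', 'b', 'b', 'b'],
})

# Column names are passed as strings; hue colors the points by category
ax = sns.scatterplot(data=measurements, x='height_cm', y='weight_kg', hue='group')
ax.set_title('Scatter Plot from a Local DataFrame')
```

`sns.scatterplot` returns the Matplotlib `Axes` it drew on, so the usual Matplotlib methods such as `set_title` can be applied to the result.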


#### Creating a Heatmap

In [None]:
# Heatmap of a 10x12 array of random values in [0, 1);
# annot=True writes each value inside its cell
data = np.random.rand(10, 12)
sns.heatmap(data, annot=True)
plt.title('Heatmap')
plt.show()
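
A common practical use of a heatmap is to display a correlation matrix. The sketch below builds a synthetic DataFrame (the columns `a`, `b`, and `c` are invented for illustration) and plots the result of `DataFrame.corr()`.

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(seed=0)
base = rng.standard_normal(200)

frame = pd.DataFrame({
    'a': base,
    'b': base + 0.5 * rng.standard_normal(200),  # built to correlate with a
    'c': rng.standard_normal(200),               # independent noise
})

# Pairwise Pearson correlations between the columns
corr = frame.corr()

# Fix the color scale to [-1, 1] so colors are comparable across plots
ax = sns.heatmap(corr, annot=True, fmt='.2f', vmin=-1, vmax=1,
                 cmap='coolwarm', square=True)
ax.set_title('Correlation Heatmap')
```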
