<h1 align="center"><font color='green'><font size="6">Python - Library</font></font></h1>

<div style="background-color: #90EE90;">.</div>

 - A library is a collection of modules and packages. Libraries provide pre-built functions and tools to make programming easier.
 - Libraries can be installed using Python’s package manager, pip.

-  Some popular Python libraries include:
   - NumPy: For numerical operations and arrays.
   - Pandas: For data manipulation and analysis.
   - Matplotlib: For plotting graphs and data visualization.
   - Requests: For making HTTP requests.
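
Before running pip, it can help to check whether a library is already importable. The sketch below uses only the standard library; the helper name `is_installed` is made up for this illustration and is not part of any library.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the import system can find a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# 'json' ships with Python, so it is always found
print(is_installed("json"))
# A made-up name that should not exist on any system
print(is_installed("surely_not_a_real_library_xyz"))
```

If the check returns False for a library you need, install it with pip and run the check again.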

## NumPy

 - NumPy (Numerical Python) is a library for the Python programming language that provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays.
 - It is a fundamental package for scientific computing in Python.


 - A package is a collection of related modules. It allows code to be organized hierarchically into directories and subdirectories.
 - You can use and create packages to group similar functionalities in one place.
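
To make the module/package distinction concrete, here is a small sketch using the standard library only. It relies on the fact that packages carry a `__path__` attribute while plain modules do not; `xml` and `math` are chosen purely as examples.

```python
import math
import xml
import xml.etree.ElementTree as ET  # package -> subpackage -> module

# Packages expose __path__ (the directory they live in); plain modules do not
print(hasattr(xml, "__path__"))   # xml is a package
print(hasattr(math, "__path__"))  # math is a single module
print(ET.__name__)                # dotted name shows the hierarchy
```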

- To use NumPy, you need to install it first.
- As mentioned above, you can install it with pip if it is not already installed:

In [1]:
!pip install numpy
# The output below comes from a system where NumPy was already installed

Defaulting to user installation because normal site-packages is not writeable


In [3]:
#To use NumPy in your code, you need to import it. The conventional alias for NumPy is np.
import numpy as np

#### NumPy Arrays
 - NumPy arrays are the central feature of the library.
 - They can be created from Python lists or tuples.

#### Array Attributes
 - NumPy arrays have several important attributes:

    - shape: The dimensions of the array.
    - dtype: The data type of the array elements.
    - ndim: The number of dimensions (axes).

In [4]:
import numpy as np

# Creating a 1D array from a list
array_1d = np.array([1, 2, 3, 4, 5])
print("1D Array:", array_1d)

1D Array: [1 2 3 4 5]


In [5]:
# Creating a 2D array from a list of lists
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("2D Array:\n", array_2d)  # \n starts a new line

2D Array:
 [[1 2 3]
 [4 5 6]]


In [6]:
# Array attributes
print("Shape:", array_2d.shape)
print("Data Type:", array_2d.dtype)
print("Number of Dimensions:", array_2d.ndim)

Shape: (2, 3)
Data Type: int32
Number of Dimensions: 2
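
The integer dtype NumPy picks by default depends on the platform and NumPy version (the int32 shown above is typical of Windows with NumPy versions before 2.0). A minimal sketch of setting the dtype explicitly, assuming 64-bit integers are wanted:

```python
import numpy as np

# Pass dtype explicitly when the exact integer width matters
ints = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)
floats = ints.astype(np.float32)  # astype returns a converted copy

print(ints.dtype, floats.dtype)
print(ints.shape, ints.ndim, ints.size)  # size is the total element count
```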


#### Array Operations
 - NumPy provides a variety of operations to perform on arrays.
 - You can perform arithmetic operations element-wise on NumPy arrays.

In [7]:
# Element-wise addition
array_a = np.array([1, 2, 3])
array_b = np.array([4, 5, 6])
sum_array = array_a + array_b
print("Sum Array:", sum_array)

# Element-wise multiplication
product_array = array_a * array_b
print("Product Array:", product_array)


Sum Array: [5 7 9]
Product Array: [ 4 10 18]
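
For comparison, the same `+` operator behaves differently on plain Python lists. This short sketch puts the two side by side; the variable names are only illustrative.

```python
import numpy as np

list_a = [1, 2, 3]
list_b = [4, 5, 6]
array_a = np.array(list_a)
array_b = np.array(list_b)

concatenated = list_a + list_b  # lists: + joins the two sequences
added = array_a + array_b       # arrays: + adds matching elements
scaled = array_a * 2            # a scalar is applied to every element

print(concatenated)
print(added)
print(scaled)
```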



 - NumPy provides universal functions (ufuncs) to perform element-wise operations.


In [8]:
# Applying square root
sqrt_array = np.sqrt(array_a)
print("Square Root Array:", sqrt_array)

# Applying exponential
exp_array = np.exp(array_a)
print("Exponential Array:", exp_array)


Square Root Array: [1.         1.41421356 1.73205081]
Exponential Array: [ 2.71828183  7.3890561  20.08553692]




- You can change the shape of an array using the reshape method.

In [9]:
# Reshaping a 1D array to a 2D array
reshaped_array = array_a.reshape((3, 1))
print("Reshaped Array:\n", reshaped_array)


Reshaped Array:
 [[1]
 [2]
 [3]]
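
Two details of reshape are worth trying out: one dimension can be given as -1 so NumPy infers it, and the total number of elements must not change. A small sketch:

```python
import numpy as np

values = np.arange(6)             # [0 1 2 3 4 5]
two_rows = values.reshape(2, -1)  # -1 lets NumPy infer the column count
flat = two_rows.reshape(-1)       # back to one dimension

print(two_rows.shape)
print(flat.shape)

# 6 elements cannot fill a 4x2 array, so this raises ValueError
try:
    values.reshape(4, 2)
except ValueError as err:
    print("Cannot reshape:", err)
```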


#### Basic Indexing

In [10]:
# Accessing individual elements
element = array_2d[0, 1]
print("Element at (0, 1):", element)


Element at (0, 1): 2


#### Slicing Arrays

In [11]:
# Slicing a 2D array
sliced_array = array_2d[0:2, 1:3]
print("Sliced Array:\n", sliced_array)


Sliced Array:
 [[2 3]
 [5 6]]
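
One thing to keep in mind: basic slicing returns a view on the same data, not a copy. The sketch below shows the effect and how `.copy()` avoids it; the array here is a fresh one so the earlier variables stay untouched.

```python
import numpy as np

original = np.array([[1, 2, 3], [4, 5, 6]])

view = original[0:2, 1:3]  # basic slicing returns a view
view[0, 0] = 99            # this also changes original[0, 1]

independent = original[0:2, 1:3].copy()  # copy() detaches the data
independent[0, 0] = -1                   # original is not affected

print(original)
```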



 - You can use boolean arrays to index and filter data.

In [12]:
# Boolean indexing
boolean_index = array_2d > 3
filtered_array = array_2d[boolean_index]
print("Filtered Array:", filtered_array)


Filtered Array: [4 5 6]
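
Conditions can also be combined. A minimal sketch, using `&`, `|` and `np.where` on the same 2D array (recreated here so the block runs on its own):

```python
import numpy as np

array_2d = np.array([[1, 2, 3], [4, 5, 6]])

# Use & (and), | (or) with parentheses around each comparison
between = array_2d[(array_2d > 2) & (array_2d < 6)]
outside = array_2d[(array_2d < 2) | (array_2d > 5)]

# np.where keeps the original shape and puts 0 where the condition fails
masked = np.where(array_2d > 3, array_2d, 0)

print(between)
print(outside)
print(masked)
```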



 - NumPy includes a variety of functions for mathematical operations, statistics, and more.

In [13]:
# Calculating mean and standard deviation
mean_value = np.mean(array_2d)
std_dev_value = np.std(array_2d)
print("Mean:", mean_value)
print("Standard Deviation:", std_dev_value)


Mean: 3.5
Standard Deviation: 1.707825127659933


In [14]:
# Calculating sum and product
sum_value = np.sum(array_2d)
product_value = np.prod(array_2d)
print("Sum:", sum_value)
print("Product:", product_value)


Sum: 21
Product: 720
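
The functions above reduce the whole array to one number. With the `axis` argument they can work per column or per row instead, as in this short sketch:

```python
import numpy as np

array_2d = np.array([[1, 2, 3], [4, 5, 6]])

column_sums = np.sum(array_2d, axis=0)  # collapse rows: one value per column
row_sums = np.sum(array_2d, axis=1)     # collapse columns: one value per row
row_means = np.mean(array_2d, axis=1)

print(column_sums)
print(row_sums)
print(row_means)
```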


 - NumPy provides functions to generate random numbers.

In [15]:
# Generating a random array
random_array = np.random.rand(3, 3)
print("Random Array:\n", random_array)


Random Array:
 [[0.81547143 0.3001853  0.06041686]
 [0.21288336 0.26863038 0.57876091]
 [0.22448935 0.32077099 0.17291083]]
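
The values above change on every run. If repeatable results are needed, one option is a seeded generator created with `np.random.default_rng`. The seed 42 below is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # seeded generator for repeatable results
uniform = rng.random((3, 3))          # floats in [0, 1)
dice = rng.integers(1, 7, size=5)     # integers 1 to 6 (upper bound excluded)

# A second generator with the same seed produces the same numbers
rng_again = np.random.default_rng(seed=42)
same = np.array_equal(uniform, rng_again.random((3, 3)))
print(same)
```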


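 - NumPy has a submodule, numpy.linalg, for linear algebra operations such as determinants and inverses. Matrix multiplication itself is done with np.dot or the @ operator.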

In [16]:
# Matrix multiplication
matrix_a = np.array([[1, 2], [3, 4]])
matrix_b = np.array([[5, 6], [7, 8]])
product_matrix = np.dot(matrix_a, matrix_b)
print("Matrix Product:\n", product_matrix)


Matrix Product:
 [[19 22]
 [43 50]]
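
As a sketch of what numpy.linalg itself offers, the block below computes a determinant and an inverse for the same matrices, and uses the `@` operator as an alternative to `np.dot` for 2D arrays.

```python
import numpy as np

matrix_a = np.array([[1, 2], [3, 4]])
matrix_b = np.array([[5, 6], [7, 8]])

product = matrix_a @ matrix_b          # same result as np.dot for 2D arrays
determinant = np.linalg.det(matrix_a)  # 1*4 - 2*3, returned as a float
inverse = np.linalg.inv(matrix_a)

# A matrix times its inverse should be (numerically) the identity matrix
print(np.allclose(matrix_a @ inverse, np.eye(2)))
```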


NumPy is a powerful tool for numerical computing in Python, providing efficient array operations and mathematical functions. Experiment with these features to leverage NumPy's full capabilities:
 - Creating Arrays: Use np.array(), np.zeros(), np.ones(), np.arange(), etc.
 - Array Operations: Arithmetic, universal functions, and reshaping.
 - Indexing and Slicing: Access and manipulate array elements.
 - Common Functions: Statistical functions, mathematical operations.
 - Random Numbers: Generate random arrays with np.random.
 - Linear Algebra: Matrix multiplication with np.dot, plus further operations in numpy.linalg.
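
The array-creation functions named in the list were not shown earlier, so here is a brief sketch of them. `np.linspace` is added as a closely related function that the list does not mention.

```python
import numpy as np

zeros = np.zeros((2, 3))           # 2x3 array filled with 0.0
ones = np.ones(4, dtype=int)       # four integer ones
steps = np.arange(0, 10, 2)        # start, stop (excluded), step
points = np.linspace(0.0, 1.0, 5)  # 5 evenly spaced values, endpoints included

print(zeros)
print(ones)
print(steps)
print(points)
```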
 

## Pandas

 - Pandas is a powerful and widely used Python library for data manipulation and analysis.
 - Pandas is often used in conjunction with libraries like NumPy and Matplotlib.



Key Features of Pandas:
 - Data cleaning and preparation: Handle missing values, filter data, and apply transformations.
 - Data wrangling: Efficiently reshape, merge, and slice data.
 - Easy I/O: Read and write data from CSV, Excel, databases, and more.
 - Time-series functionality: Powerful tools for time-based data.

The two primary data structures in pandas are:

 - Series: A one-dimensional labeled array capable of holding any data type such as integers, floats, or strings.
 - DataFrame: A two-dimensional labeled data structure, similar to a table in relational databases or Excel spreadsheets.

In [None]:
# To use pandas, first install it and then import it:
!pip install pandas

In [17]:
import pandas as pd  # Standard alias for pandas

In [18]:
# Create a Series with default index
s = pd.Series([10, 20, 30, 40, 50])
print(s)

0    10
1    20
2    30
3    40
4    50
dtype: int64


In [19]:
# Create a Series with a custom index
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print(s)

a    10
b    20
c    30
dtype: int64
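
Once a Series has labels, values can be looked up by label or by position. A minimal sketch on the same Series:

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

by_label = s['b']        # lookup by index label
by_position = s.iloc[0]  # lookup by integer position
doubled = s * 2          # arithmetic is element-wise, labels are kept
large = s[s > 15]        # boolean filtering works as with NumPy arrays

print(by_label, by_position)
print(doubled)
print(large)
```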


#### DataFrame
 - A DataFrame is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure. 
 - It is similar to a table in a database or an Excel spreadsheet.

In [20]:
# Creating a DataFrame
data = {
    'Name': ['Risa', 'Babita', 'Chintu','Sima', 'Jhilmil', 'Ram','Shyam','Laxman','Nishi'],
    'Age': [25, 30, 35, 36, 23, 32, 56, 45, 24],
    'Salary': [50000, 60000, 70000, 75000, 45000, 55000, 67000, 70000, 32000]
}

df = pd.DataFrame(data)
print(df)


      Name  Age  Salary
0     Risa   25   50000
1   Babita   30   60000
2   Chintu   35   70000
3     Sima   36   75000
4  Jhilmil   23   45000
5      Ram   32   55000
6    Shyam   56   67000
7   Laxman   45   70000
8    Nishi   24   32000
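
After building a DataFrame, a few attributes and methods give a quick overview of it. The sketch below uses a shortened three-row version of the table so it runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Risa', 'Babita', 'Chintu'],
    'Age': [25, 30, 35],
    'Salary': [50000, 60000, 70000],
})

print(df.shape)          # (rows, columns)
print(list(df.columns))  # column names
print(df.dtypes)         # data type of each column
print(df.describe())     # summary statistics for the numeric columns
```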


In [21]:
# Writing to a CSV file
# Save the DataFrame created above as a CSV file
df.to_csv('output.csv', index=False)  # index=False excludes the row index

In [22]:
# Reading a CSV file
data = pd.read_csv('output.csv')

 - Pandas makes it easy to read and write data in various file formats such as CSV, Excel, and SQL databases.

In [23]:
print(data.head())  # Displays the first 5 rows

      Name  Age  Salary
0     Risa   25   50000
1   Babita   30   60000
2   Chintu   35   70000
3     Sima   36   75000
4  Jhilmil   23   45000
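
The write-then-read round trip can also be tried without touching the disk. This sketch uses an in-memory `io.StringIO` buffer in place of a file; that substitution is an assumption made for illustration, not what the cells above do.

```python
import io

import pandas as pd

df = pd.DataFrame({'Name': ['Risa', 'Babita'], 'Age': [25, 30]})

buffer = io.StringIO()
df.to_csv(buffer, index=False)  # without index=False the row index is written as an extra column
buffer.seek(0)                  # rewind so read_csv starts at the beginning

restored = pd.read_csv(buffer)
print(restored)
```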


In [24]:
# Accessing a single column
print(data['Name'])

# Accessing multiple columns
data[['Name', 'Age']]

0       Risa
1     Babita
2     Chintu
3       Sima
4    Jhilmil
5        Ram
6      Shyam
7     Laxman
8      Nishi
Name: Name, dtype: object


      Name  Age
0     Risa   25
1   Babita   30
2   Chintu   35
3     Sima   36
4  Jhilmil   23
5      Ram   32
6    Shyam   56
7   Laxman   45
8    Nishi   24


In [25]:
# Using iloc to access the first row
print(data.iloc[0])

# Using loc to access a row by index label
data.loc[1]


Name       Risa
Age          25
Salary    50000
Name: 0, dtype: object


Name      Babita
Age           30
Salary     60000
Name: 1, dtype: object
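
With the default 0, 1, 2, ... index, `iloc` and `loc` look almost the same. The difference becomes visible when the row labels are not integers, as in this sketch where names are used as labels (a made-up variation of the table above).

```python
import pandas as pd

people = pd.DataFrame(
    {'Age': [25, 30, 35]},
    index=['Risa', 'Babita', 'Chintu'],  # names used as row labels
)

first_by_position = people.iloc[0]          # row at position 0
row_by_label = people.loc['Babita']         # row whose label is 'Babita'
single_value = people.loc['Chintu', 'Age']  # row label and column label together

print(first_by_position)
print(row_by_label)
print(single_value)
```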

In [26]:
# Filter rows where Age is greater than 30
filtered_data = data[data['Age'] > 30]
print(filtered_data)

     Name  Age  Salary
2  Chintu   35   70000
3    Sima   36   75000
5     Ram   32   55000
6   Shyam   56   67000
7  Laxman   45   70000
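
Filters can combine several conditions. A small sketch on a four-row subset of the data, using `&` and `isin`:

```python
import pandas as pd

data = pd.DataFrame({
    'Name': ['Risa', 'Chintu', 'Shyam', 'Laxman'],
    'Age': [25, 35, 56, 45],
    'Salary': [50000, 70000, 67000, 70000],
})

# Each condition needs its own parentheses when combined with & or |
older_high_paid = data[(data['Age'] > 30) & (data['Salary'] >= 70000)]

# isin keeps rows whose value appears in the given list
selected = data[data['Name'].isin(['Risa', 'Shyam'])]

print(older_high_paid)
print(selected)
```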


In [27]:
# Check for missing data in the DataFrame
print(data.isnull())  # True where a value is missing, False otherwise
print(data.isnull().sum())  # number of missing values in each column

    Name    Age  Salary
0  False  False   False
1  False  False   False
2  False  False   False
3  False  False   False
4  False  False   False
5  False  False   False
6  False  False   False
7  False  False   False
8  False  False   False
Name      0
Age       0
Salary    0
dtype: int64


In [None]:
# The DataFrame did not have any null values,
# so no further processing was necessary.
# If null values are present, they can be handled as follows.

In [None]:
# We can handle null values either by filling them with 0
data_filled = data.fillna(0)
# or by dropping the rows that contain them
data_clean = data.dropna()
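
Since the data above contains no missing values, the effect of these calls is hard to see. The sketch below uses a small made-up table with gaps (the NaN entries are invented for illustration) and also fills one column with its mean instead of 0.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'Name': ['Risa', 'Babita', 'Chintu'],
    'Age': [25, np.nan, 35],
    'Salary': [50000, 60000, np.nan],
})

missing_per_column = data.isnull().sum()

# A dict fills only the listed columns; Salary keeps its gap here
filled_with_mean = data.fillna({'Age': data['Age'].mean()})

# dropna removes every row that has at least one missing value
complete_rows = data.dropna()

print(missing_per_column)
print(filled_with_mean)
print(complete_rows)
```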

In [30]:
# Group data by a column and calculate the mean of another column
grouped = data.groupby('Age')['Salary'].mean()
grouped

Age
23    45000.0
24    32000.0
25    50000.0
30    60000.0
32    55000.0
35    70000.0
36    75000.0
45    70000.0
56    67000.0
Name: Salary, dtype: float64
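
In the result above every age occurs only once, so each group holds a single row. Grouping is more informative when keys repeat. The sketch below uses a hypothetical `Department` column (not part of the original data) and computes several statistics at once with `agg`.

```python
import pandas as pd

staff = pd.DataFrame({
    'Department': ['Sales', 'Sales', 'IT', 'IT', 'IT'],  # hypothetical column
    'Salary': [50000, 60000, 70000, 75000, 65000],
})

# One row per department, one column per statistic
summary = staff.groupby('Department')['Salary'].agg(['mean', 'max', 'count'])
print(summary)
```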

In [31]:
# Pandas allows merging DataFrames based on a common column or index, similar to SQL joins.
# Create two DataFrames
df1 = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30]
})

df2 = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Salary': [50000, 60000]
})

# Merge DataFrames on the 'Name' column
merged_df = pd.merge(df1, df2, on='Name')
print(merged_df)


    Name  Age  Salary
0  Alice   25   50000
1    Bob   30   60000
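
The SQL-join comparison shows up in the `how` parameter of `pd.merge`. In this sketch the two tables do not match completely (Carol and Dave are added for illustration), so the join types give different results.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Alice', 'Bob', 'Carol'], 'Age': [25, 30, 41]})
df2 = pd.DataFrame({'Name': ['Alice', 'Bob', 'Dave'], 'Salary': [50000, 60000, 45000]})

inner = pd.merge(df1, df2, on='Name')               # default how='inner': only matching names
left = pd.merge(df1, df2, on='Name', how='left')    # keep every row of df1
outer = pd.merge(df1, df2, on='Name', how='outer')  # keep rows from both tables

print(inner)
print(left)   # Carol has no Salary, so it shows as NaN
print(outer)
```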


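In [None]:
# The df created earlier has no 'Date' column, so add illustrative dates first
df['Date'] = pd.date_range('2024-01-01', periods=len(df), freq='D').strftime('%Y-%m-%d')

# Convert the column from strings to datetime format
df['Date'] = pd.to_datetime(df['Date'])

# Perform operations on dates
df.set_index('Date', inplace=True)  # Set the Date column as the index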
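
To see what a datetime column and index make possible, here is a self-contained sketch. The `sales` table and its values are invented for illustration.

```python
import pandas as pd

sales = pd.DataFrame({
    'Date': ['2024-01-05', '2024-01-20', '2024-02-10'],  # made-up dates
    'Amount': [100, 150, 200],
})

sales['Date'] = pd.to_datetime(sales['Date'])  # strings -> datetime64
sales['Month'] = sales['Date'].dt.month        # .dt exposes parts of the date

sales = sales.set_index('Date')
january = sales.loc['2024-01']                 # select a whole month by partial date string
monthly_total = sales.groupby('Month')['Amount'].sum()

print(january)
print(monthly_total)
```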