
# Pandas - Introduction

Pandas is a popular Python library designed to simplify data manipulation and analysis. It provides various functions to help with tasks such as cleaning, exploring, and transforming data. The name "Pandas" is derived from "Panel Data" and "Python Data Analysis." Wes McKinney developed it in 2008 to address the need for flexible and powerful data analysis tools in Python.

Pandas is a versatile tool that allows you to extract insights from your data. With Pandas, you can determine relationships between different columns, calculate averages, and identify the maximum and minimum values. Additionally, it helps in data cleaning by enabling the removal of irrelevant rows or those with incorrect or missing values, ensuring your data is accurate and useful.
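As a minimal sketch of the tasks described above, the following cell cleans a small table, measures the relationship between two columns, and computes an average, a maximum, and a minimum. The column names and values are made up for illustration and are not data from this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row has a missing value.
df = pd.DataFrame({
    "hours_studied": [1.0, 2.0, 3.0, 4.0, np.nan],
    "exam_score": [52.0, 61.0, 70.0, 79.0, 88.0],
})

# Data cleaning: drop every row that contains a missing value.
clean = df.dropna()

# Relationship between two columns (Pearson correlation by default).
correlation = clean["hours_studied"].corr(clean["exam_score"])

# Summary statistics for a single column.
average_score = clean["exam_score"].mean()
highest_score = clean["exam_score"].max()
lowest_score = clean["exam_score"].min()

print("Rows before and after cleaning:", len(df), len(clean))
print("Correlation:", round(correlation, 2))
print("Average:", average_score, "Max:", highest_score, "Min:", lowest_score)
```

Dropping rows is only one way to treat missing values; whether it is appropriate depends on the dataset.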

**Installation**

To install Pandas, use pip, the Python package manager. Open your command prompt or terminal and enter the following command (inside a Jupyter or Colab notebook cell, use `%pip install pandas` instead):

In [None]:
pip install pandas


This command will download and install the Pandas library along with any necessary dependencies, making it ready for use in your Python projects.

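**Importing Pandas**

Pandas is conventionally imported under the alias pd. The example below imports the library and builds a DataFrame from a dictionary of lists: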

In [None]:
import pandas as pd

mydataset = {
  'names': ["Alice", "Bob", "Charlie"],
  'ages': [25, 30, 35],
  'cities': ["New York", "Los Angeles", "Chicago"]
}

myvar = pd.DataFrame(mydataset)

print(myvar)


     names  ages       cities
0    Alice    25     New York
1      Bob    30  Los Angeles
2  Charlie    35      Chicago


**Pandas: A Library for Managing Structured Data in Python**

Pandas provides tools for handling structured (tabular) data, so that intricate operations take only a few lines of code. Its key features include:

**DataFrames and Series**: Pandas represents structured data with DataFrames, which are two-dimensional labeled tables, and Series, which are one-dimensional labeled arrays. A DataFrame resembles a database table or a spreadsheet such as Excel, and each of its columns is a Series.

**Comprehensive Data Operations**: Pandas allows you to efficiently filter, sort, group, merge, concatenate, pivot, and reshape data, making complex data operations straightforward.

**Handling Missing Data**: The library provides various methods to detect and manage missing or null values in datasets.

**Integration with Other Libraries**: Pandas seamlessly integrates with other essential Python libraries commonly used in data science, such as NumPy, SciPy, and matplotlib, enhancing its functionality and utility.
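The features listed above can be sketched in one short cell. The two tables below (people and regions) and all of their values are hypothetical, chosen only to show one filter, one sort, one group-by, one merge, one missing-value fix, and one NumPy call.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only.
people = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "Dana"],
    "age": [25, 30, 35, np.nan],  # one missing age
    "city": ["New York", "Chicago", "Chicago", "New York"],
})
regions = pd.DataFrame({
    "city": ["New York", "Chicago"],
    "region": ["East", "Midwest"],
})

# DataFrames and Series: selecting one column returns a Series.
ages = people["age"]

# Handling missing data: count missing values, then fill them with the mean.
missing_count = int(ages.isna().sum())
people["age"] = ages.fillna(ages.mean())

# Filter rows, then sort them by age from oldest to youngest.
over_26 = people[people["age"] > 26].sort_values("age", ascending=False)

# Group rows by city and aggregate each group.
mean_age_by_city = people.groupby("city")["age"].mean()

# Merge two tables on a shared key column.
merged = people.merge(regions, on="city", how="left")

# Integration with NumPy: NumPy functions accept a Series directly.
log_ages = np.log(people["age"])

print(over_26)
print(mean_age_by_city)
print(merged)
```

Filling with the mean is just one possible strategy; `dropna()` or a domain-specific default may suit other datasets better.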

**Pandas Series: Overview**

A Pandas Series is a one-dimensional array in which each element is associated with a label; together, the labels form the index. A Series can hold numbers, strings, booleans, or arbitrary Python objects. Because elements are labeled, they can be selected by label as well as by position, and operations between two Series align on matching labels.
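A small sketch of what the labeled index makes possible is shown below. The Series names and values are hypothetical and serve only to illustrate label-based access and alignment.

```python
import pandas as pd

# A Series can mix types; pandas then stores the elements as generic objects.
mixed = pd.Series([42, "hello", True], index=["number", "text", "flag"])
print(mixed.dtype)

# Hypothetical temperatures labeled by weekday.
temps = pd.Series([21.5, 19.0, 23.2], index=["Mon", "Tue", "Wed"])

by_label = temps["Tue"]          # select by index label
by_position = temps.iloc[1]      # select by integer position
subset = temps[["Mon", "Wed"]]   # select several labels at once

# Arithmetic aligns on labels, not on position.
a = pd.Series([1, 2], index=["x", "y"])
b = pd.Series([10, 20], index=["y", "x"])
total = a + b  # "x" pairs 1 with 20, "y" pairs 2 with 10

print(by_label, by_position)
print(subset)
print(total)
```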

**Creating a Pandas Series**

You can create a Pandas Series from lists, NumPy arrays, and dictionaries. In each case you can let Pandas assign a default integer index or pass your own index labels.

From Lists

To create a Series from a list:

In [1]:
import pandas as pd

# Create a Series from a list
data = [10, 20, 30, 40]
s = pd.Series(data, index=["a", "b", "c", "d"])

print("Series from list:")
print(s)


Series from list:
a    10
b    20
c    30
d    40
dtype: int64


From Arrays

To create a Series from a NumPy array:

In [2]:
import numpy as np

# Create a Series from a NumPy array
array_data = np.array([1.1, 2.2, 3.3, 4.4])
s_from_array = pd.Series(array_data, index=["x", "y", "z", "w"])

print("Series from array:")
print(s_from_array)


Series from array:
x    1.1
y    2.2
z    3.3
w    4.4
dtype: float64


From Dictionaries

To create a Series from a dictionary:

In [3]:
# Create a Series from a dictionary
dict_data = {"Apple": 100, "Banana": 200, "Cherry": 300}
s_from_dict = pd.Series(dict_data)

print("Series from dictionary:")
print(s_from_dict)


Series from dictionary:
Apple     100
Banana    200
Cherry    300
dtype: int64
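When a Series is built from a dictionary, the keys become the index. As a further sketch, passing an explicit index alongside the dictionary selects and reorders keys, and any label missing from the dictionary is filled with NaN. The label "Durian" is a made-up key used only to show that case.

```python
import pandas as pd

dict_data = {"Apple": 100, "Banana": 200, "Cherry": 300}

# Only the requested labels are kept, in the requested order.
# "Durian" is not a key in dict_data, so its value becomes NaN.
selected = pd.Series(dict_data, index=["Cherry", "Apple", "Durian"])

print("Series from dictionary with explicit index:")
print(selected)
```

Because NaN is a floating-point value, the resulting Series has dtype float64 rather than int64.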


**Exploring Series Attributes and Methods**

A Series exposes attributes that describe its contents and methods that operate either on individual elements or on the Series as a whole.

Common Attributes

s.index: Returns the index labels of the Series.

s.values: Returns the Series values as a NumPy array.

s.dtype: Indicates the data type of the Series elements.

s.size: Provides the count of elements in the Series.

Example: Exploring Series Attributes

In [4]:
# Example: Exploring Series attributes
print("Index:", s.index)
print("Values:", s.values)
print("Data type:", s.dtype)
print("Size:", s.size)


Index: Index(['a', 'b', 'c', 'd'], dtype='object')
Values: [10 20 30 40]
Data type: int64
Size: 4


**Common Methods**

Pandas Series methods enable various operations such as arithmetic, data manipulation, and indexing.

s.head(n): Returns the first n elements of the Series.

s.tail(n): Returns the last n elements of the Series.

s.sort_values(): Sorts the Series by its values.

s.mean(), s.median(), s.std(): Calculate common statistical measures.

s.str: Offers string manipulation methods for Series containing string data.

s.apply(func): Applies a given function to each element in the Series.

In [5]:
# Example: Using Series methods
# sep="\n" prints each Series on the lines below its description
print("First two elements:", s.head(2), sep="\n")
print("Last two elements:", s.tail(2), sep="\n")
print("Sorted values:", s.sort_values(), sep="\n")
print("Mean of the Series:", s.mean())


First two elements:
a    10
b    20
dtype: int64
Last two elements:
c    30
d    40
dtype: int64
Sorted values:
a    10
b    20
c    30
d    40
dtype: int64
Mean of the Series: 25.0
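The remaining methods from the list (median, std, apply, and the str accessor) can be sketched the same way. The Series s is recreated here so the cell runs on its own, and the fruits Series is a hypothetical example of string data.

```python
import pandas as pd

# Same Series as in the earlier examples.
s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

median_value = s.median()
std_value = s.std()  # sample standard deviation (ddof=1 by default)

# apply calls the function once per element.
doubled = s.apply(lambda x: x * 2)

# sort_values can also sort from largest to smallest.
descending = s.sort_values(ascending=False)

# The str accessor works on Series that contain strings.
fruits = pd.Series(["apple", "banana", "cherry"])
upper = fruits.str.upper()
has_an = fruits.str.contains("an")

print("Median:", median_value)
print("Standard deviation:", round(std_value, 2))
print(doubled)
print(descending)
print(upper)
print(has_an)
```

For simple arithmetic such as doubling, the vectorized form `s * 2` gives the same result and is usually faster than apply.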
