<a href="https://www.kaggle.com/code/heemalichaudhari/learning-pandas?scriptVersionId=115736674" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

#### Pandas is a popular Python library for data analysis. It provides easy-to-use data structures and data analysis tools for handling and manipulating large amounts of data.

#### One of the key features of pandas is its built-in handling of missing data. Pandas represents missing numeric values with `NaN` ("Not a Number"), which is a special floating-point value rather than a separate data type. You can detect missing values with the `isnull()` method (also available as `isna()`) and remove them with `dropna()`.
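
As a minimal sketch of missing-data detection and removal, the snippet below builds a small Series with two gaps. The Series `values` and its contents are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with two missing entries (np.nan and None both become NaN)
values = pd.Series([1.0, np.nan, 3.0, None])

missing_mask = values.isna()   # True where a value is missing
cleaned = values.dropna()      # new Series without the missing entries

print(missing_mask.tolist())
print(cleaned.tolist())
```

Note that `dropna()` returns a new Series here and leaves `values` unchanged.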

#### Another important feature of pandas is its ability to manipulate and analyze data in a tabular form, similar to an Excel spreadsheet. You can easily select, slice, and index rows and columns of data using pandas' `DataFrame` object.

# Importing Library

In [1]:
import pandas as pd

# Create a pandas Series with 10 integers

In [2]:
s = pd.Series(data=[1, 3, 5, 7, 9, 11, 13, 15, 17, 19])

# Access an element of the Series by label (the default labels 0 to 9 match the positions)

In [3]:
print(s[3])

7


# Slice the Series by position (the stop position is excluded)

In [4]:
print(s[3:7])

3     7
4     9
5    11
6    13
dtype: int64
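
With the default index, labels and positions coincide, so the difference between them is easy to miss. The sketch below uses a hypothetical Series `t` with reversed integer labels to show how the explicit `.loc` (label) and `.iloc` (position) accessors differ.

```python
import pandas as pd

# Hypothetical Series whose labels are not 0, 1, 2, ...
t = pd.Series([10, 20, 30, 40], index=[3, 2, 1, 0])

by_label = t.loc[3]            # element with label 3
by_position = t.iloc[3]        # fourth element
label_slice = t.loc[3:1]       # label slices include both endpoints
position_slice = t.iloc[0:2]   # position slices exclude the stop

print(by_label, by_position)
print(label_slice.tolist(), position_slice.tolist())
```

Using `.loc` and `.iloc` explicitly avoids relying on how plain `[]` interprets an integer.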


# Modify the values of the Series in place

In [5]:
s[3:7] = 0
print(s)

0     1
1     3
2     5
3     0
4     0
5     0
6     0
7    15
8    17
9    19
dtype: int64


# Create a pandas DataFrame with 10 rows and 3 columns

In [6]:
df = pd.DataFrame(
    data=[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15],
          [16, 17, 18], [19, 20, 21], [22, 23, 24], [25, 26, 27], [28, 29, 30]],
    columns=['a', 'b', 'c'],
)

# Access the elements of the DataFrame by row and column position

In [7]:
print(df.iloc[3, 1])

11


# Access the elements of the DataFrame by row and column label

In [8]:
print(df.loc[3, 'b'])

11
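
`iloc` and `loc` return the same value above only because the row labels still match the row positions. A small sketch, using a made-up three-row `frame`, shows them diverging once the rows are reordered:

```python
import pandas as pd

# Hypothetical small frame for illustration
frame = pd.DataFrame({'a': [1, 4, 7], 'b': [2, 5, 8]})

# Sorting reorders the rows but keeps the original index labels
frame_sorted = frame.sort_values(by='b', ascending=False)

first_by_position = frame_sorted.iloc[0, 1]   # first row after sorting
first_by_label = frame_sorted.loc[0, 'b']     # row that carries label 0

print(first_by_position, first_by_label)
```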


# Select a subset of the DataFrame by column

In [9]:
subset = df[['a', 'b']]

# Apply a function to each row of the DataFrame

In [10]:
def func(row):
    # Sum columns 'a' and 'b' for a single row
    return row['a'] + row['b']

df['d'] = df.apply(func, axis=1)
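
For simple arithmetic like this, adding the columns directly usually gives the same result as a row-wise `apply` and tends to be faster on large frames. A sketch on a hypothetical `frame`:

```python
import pandas as pd

# Hypothetical small frame for illustration
frame = pd.DataFrame({'a': [1, 4, 7], 'b': [2, 5, 8]})

# Row-wise version, equivalent to the cell above
row_wise = frame.apply(lambda row: row['a'] + row['b'], axis=1)

# Vectorized version: adds the two columns element by element
vectorized = frame['a'] + frame['b']

print(row_wise.tolist())
print(vectorized.tolist())
```

Row-wise `apply` remains useful when the per-row logic cannot be expressed with column operations.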

# Group the data by a column and compute the mean of each group

In [11]:
mean_by_group = df.groupby('a').mean()
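
In `df`, every value in column `a` is unique, so each group holds a single row and the mean simply returns that row. Grouping is more informative when keys repeat, as in this sketch with a made-up `sales` table:

```python
import pandas as pd

# Hypothetical data with repeated group keys
sales = pd.DataFrame({
    'region': ['north', 'south', 'north', 'south'],
    'units': [10, 20, 30, 40],
})

# One mean per region, computed over the rows that share that key
mean_units = sales.groupby('region')['units'].mean()

print(mean_units)
```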

# Sort the DataFrame by a column

In [12]:
df.sort_values(by='b', ascending=False, inplace=True)

# Drop missing values from the DataFrame

In [13]:
df.dropna(inplace=True)

# Fill missing values in the DataFrame

In [14]:
df.fillna(value=0, inplace=True)
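
`df` contains no missing values, so the two cells above leave it unchanged. To see what `dropna` and `fillna` actually do, here is a sketch on a hypothetical frame `gaps` that does have missing entries:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in each column
gaps = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': [4.0, 5.0, np.nan]})

dropped = gaps.dropna()          # keeps only rows with no missing values
filled = gaps.fillna(value=0)    # replaces every missing value with 0

print(dropped)
print(filled)
```

The two approaches are alternatives: dropping shrinks the data, while filling keeps every row but introduces a placeholder value.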

# Convert the data type of a column

In [15]:
df['a'] = df['a'].astype(str)
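
After converting to strings, operators behave differently: `+` concatenates instead of adding. The sketch below illustrates this on a made-up column and converts back with `astype(int)`.

```python
import pandas as pd

# Hypothetical integer column for illustration
frame = pd.DataFrame({'a': [1, 4, 7]})

as_text = frame['a'].astype(str)
doubled_text = as_text + as_text       # string concatenation, not addition
back_to_int = as_text.astype(int)      # parse the strings back into integers

print(doubled_text.tolist())
print(back_to_int.tolist())
```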

# Rename the columns of the DataFrame

In [16]:
df.rename(columns={'a': 'A', 'b': 'B'}, inplace=True)

# Create a pivot table of the DataFrame

In [17]:
pivot_table = df.pivot_table(index='A', columns='B', values='c')
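
A pivot table aggregates values that share the same row and column keys, using the mean by default, and leaves `NaN` where a combination does not occur. A sketch on a hypothetical `orders` table makes both effects visible:

```python
import pandas as pd

# Hypothetical data: two Pune/tea rows, and no Pune/coffee row
orders = pd.DataFrame({
    'city': ['Pune', 'Pune', 'Delhi', 'Delhi'],
    'product': ['tea', 'tea', 'tea', 'coffee'],
    'amount': [10, 30, 5, 7],
})

# Rows are cities, columns are products, cells hold the mean amount
table = orders.pivot_table(index='city', columns='product', values='amount')

print(table)
```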

# Convert the DataFrame to a NumPy array

In [18]:
array = df.to_numpy()
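
The dtype of the resulting array depends on the columns. Because column `A` was converted to strings earlier, the array above has to fall back to a general `object` dtype. The sketch below contrasts an all-numeric frame with a mixed one (both are made up for illustration):

```python
import pandas as pd

# Hypothetical frames: one purely numeric, one mixing strings and integers
numeric = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
mixed = pd.DataFrame({'a': ['1', '2'], 'b': [3, 4]})

numeric_array = numeric.to_numpy()   # integer array
mixed_array = mixed.to_numpy()       # object array holding Python objects

print(numeric_array.dtype, mixed_array.dtype)
```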

# Create a copy of the DataFrame

In [19]:
df_copy = df.copy()
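
`copy()` makes a deep copy by default, so changes to the copy do not affect the original. A quick check on a hypothetical frame:

```python
import pandas as pd

# Hypothetical frame for illustration
original = pd.DataFrame({'a': [1, 2, 3]})

duplicate = original.copy()    # deep copy by default
duplicate.loc[0, 'a'] = 99     # modify only the copy

print(original.loc[0, 'a'], duplicate.loc[0, 'a'])
```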

# Reset the index of the DataFrame

In [20]:
df.reset_index(inplace=True, drop=True)

# Set the index of the DataFrame

In [21]:
df.set_index('A', inplace=True)
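
Once a column becomes the index, its values act as row labels for `loc`, and `reset_index()` turns it back into an ordinary column. A sketch on a made-up frame, using the non-in-place forms so each step returns a new object:

```python
import pandas as pd

# Hypothetical frame for illustration
frame = pd.DataFrame({'A': ['x', 'y', 'z'], 'B': [1, 2, 3]})

indexed = frame.set_index('A')      # 'A' becomes the row labels
value = indexed.loc['y', 'B']       # look up a row by its new label
restored = indexed.reset_index()    # 'A' is a regular column again

print(value)
print(list(restored.columns))
```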