### Pandas: DataFrame and Series
Pandas is a Python library for data manipulation, widely used for data analysis and data cleaning. It provides two primary data structures: Series and DataFrame. A Series is a one-dimensional labeled array, while a DataFrame is a two-dimensional, size-mutable tabular structure with labeled rows and columns, whose columns can hold different data types.

In [2]:
import pandas as pd

- A Pandas Series is a one-dimensional array-like object that can hold any data type. It is similar to a column in a table.

In [2]:
data = [10,20,30,40]
series = pd.Series(data)
print(series)
print(type(series))

0    10
1    20
2    30
3    40
dtype: int64
<class 'pandas.core.series.Series'>


In [3]:
# Series from dictionary

data1 = {'Name':'Alok','Age':23,'Subject':'BCA'}
dict_series = pd.Series(data1)
print(dict_series)

Name       Alok
Age          23
Subject     BCA
dtype: object


In [5]:
# Series with Index
num = [10,20,30,40]
num_index = ['A','B','C','D']
series1 = pd.Series(data=num,index=num_index)
print(series1)

A    10
B    20
C    30
D    40
dtype: int64
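
Once a Series has custom labels, values can be read either by label or by integer position. The following is a minimal sketch that rebuilds the same Series as above; the names `by_label`, `by_position`, and `subset` are only illustrative.

```python
import pandas as pd

# Same values and labels as the Series built above
series1 = pd.Series(data=[10, 20, 30, 40], index=['A', 'B', 'C', 'D'])

by_label = series1['B']        # look up by index label
by_position = series1.iloc[1]  # look up by integer position (second element)
subset = series1[['A', 'D']]   # select several labels at once, returns a Series

print(by_label, by_position)
print(subset)
```

Both lookups return the same element here because label 'B' sits at position 1.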


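- A Pandas DataFrame is a two-dimensional labeled table. It can be built from a dictionary of lists, where each key becomes a column name and each list holds that column's values.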

In [6]:
data = {
    'Name': ['Alok', 'Rohan', 'Tanya', 'Subham', 'Keshav'],
    'Age': [22, 25, 24, 20, 30],
    'Address': ['Mainatand', 'Patna', 'Delhi', 'Bettiah', 'Gujrat']
}

df = pd.DataFrame(data)
df

     Name  Age    Address
0    Alok   22  Mainatand
1   Rohan   25      Patna
2   Tanya   24      Delhi
3  Subham   20    Bettiah
4  Keshav   30     Gujrat


In [7]:
# Dataframe from list of dictionaries

data = [
    {'Name': 'Alok', 'Age': 22, 'City': 'Bettiah'},
    {'Name': 'Uday', 'Age': 24, 'City': 'Delhi'},
    {'Name': 'Kushi', 'Age': 22, 'City': 'Chandigarh'},
    {'Name': 'Sandhya', 'Age': 26, 'City': 'Bangalore'}
]
df1 = pd.DataFrame(data)
print(df1)
print(type(df1))

      Name  Age        City
0     Alok   22     Bettiah
1     Uday   24       Delhi
2    Kushi   22  Chandigarh
3  Sandhya   26   Bangalore
<class 'pandas.core.frame.DataFrame'>
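
When a DataFrame is built from a list of dictionaries, the dictionaries do not all need the same keys: any key missing from a record becomes a missing value (NaN) in that row. The sketch below illustrates this with two hypothetical records that are not part of the data above.

```python
import pandas as pd

# Hypothetical records: the second one has no 'City' key
records = [
    {'Name': 'Alok', 'Age': 22, 'City': 'Bettiah'},
    {'Name': 'Uday', 'Age': 24},
]
df_partial = pd.DataFrame(records)

print(df_partial)
# True marks the row where 'City' was missing from the input record
print(df_partial['City'].isna())
```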


In [9]:
# Reading data from csv file

data = pd.read_csv('data.csv')
data.head()

         Date  Category  Value   Product  Sales  Region
0  2023-01-01         A   28.0  Product1  754.0    East
1  2023-01-02         B   39.0  Product3  110.0   North
2  2023-01-03         C   32.0  Product2  398.0    East
3  2023-01-04         B    8.0  Product1  522.0    East
4  2023-01-05         B   26.0  Product3  869.0   North


In [10]:
data.tail()

          Date  Category  Value   Product  Sales  Region
45  2023-02-15         B   99.0  Product2  599.0    West
46  2023-02-16         B    6.0  Product1  938.0   South
47  2023-02-17         B   69.0  Product3  143.0    West
48  2023-02-18         C   65.0  Product3  182.0   North
49  2023-02-19         C   11.0  Product3  708.0   North


In [12]:
data.columns

Index(['Date', 'Category', 'Value', 'Product', 'Sales', 'Region'], dtype='object')

In [13]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50 entries, 0 to 49
Data columns (total 6 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   Date      50 non-null     object 
 1   Category  50 non-null     object 
 2   Value     47 non-null     float64
 3   Product   50 non-null     object 
 4   Sales     46 non-null     float64
 5   Region    50 non-null     object 
dtypes: float64(2), object(4)
memory usage: 2.5+ KB


In [14]:
data.describe()

           Value       Sales
count       47.0        46.0
mean   51.744681  557.130435
std    29.050532  274.598584
min          2.0       108.0
25%         27.5       339.0
50%         54.0       591.5
75%         70.0       767.5
max         99.0       992.0


In [15]:
data.isnull().sum()

Date        0
Category    0
Value       3
Product     0
Sales       4
Region      0
dtype: int64
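
After counting missing values, a common next step is to either drop the affected rows or fill the gaps. The sketch below shows both options on a small hypothetical frame (`sample`), since the contents of `data.csv` are not reproduced here. Filling with the column mean is just one possible strategy, not necessarily the right one for this dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the numeric columns of the CSV
sample = pd.DataFrame({
    'Value': [28.0, np.nan, 32.0],
    'Sales': [754.0, 110.0, np.nan],
})

missing_counts = sample.isnull().sum()  # missing values per column
dropped = sample.dropna()               # keep only rows with no missing values
filled = sample.fillna(sample.mean())   # replace NaN with each column's mean

print(missing_counts)
print(dropped)
print(filled)
```

Neither `dropna()` nor `fillna()` modifies `sample` in place. Each returns a new DataFrame.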

In [18]:
data['Region']

0      East
1     North
2      East
3      East
4     North
5      West
6      East
7      West
8      West
9      West
10    North
11     West
12    South
13     East
14     West
15    North
16    South
17     West
18     West
19     East
20    South
21    South
22    North
23     West
24     East
25     West
26    North
27    North
28    North
29    South
30     West
31     West
32    South
33     East
34     West
35     West
36     East
37    North
38    South
39     West
40     East
41     East
42     West
43     East
44     East
45     West
46    South
47     West
48    North
49    North
Name: Region, dtype: object

In [19]:
df

     Name  Age    Address
0    Alok   22  Mainatand
1   Rohan   25      Patna
2   Tanya   24      Delhi
3  Subham   20    Bettiah
4  Keshav   30     Gujrat


In [22]:
# Add a new column (the list length must match the number of rows)
df['Gender'] = ['Male','Male','Female','Male','Male']

In [21]:
df

     Name  Age    Address  Gender
0    Alok   22  Mainatand    Male
1   Rohan   25      Patna    Male
2   Tanya   24      Delhi  Female
3  Subham   20    Bettiah    Male
4  Keshav   30     Gujrat    Male


In [23]:
# Select a row by its index label
df.loc[0]

Name            Alok
Age               22
Address    Mainatand
Gender          Male
Name: 0, dtype: object

In [24]:
# Select a row by its integer position
df.iloc[0]

Name            Alok
Age               22
Address    Mainatand
Gender          Male
Name: 0, dtype: object
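
`loc` and `iloc` return the same row above only because the default index labels (0, 1, 2, ...) happen to match the row positions. The sketch below uses a hypothetical frame (`people`) with a custom integer index to show where the two differ.

```python
import pandas as pd

# Hypothetical frame whose index labels differ from the row positions
people = pd.DataFrame(
    {'Name': ['Alok', 'Rohan', 'Tanya'], 'Age': [22, 25, 24]},
    index=[10, 20, 30],
)

by_label = people.loc[10]     # row whose index label is 10
by_position = people.iloc[0]  # first row, whatever its label is

# Label 0 does not exist in this index, so people.loc[0] would raise a KeyError
label_zero_exists = 0 in people.index

print(by_label['Name'], by_position['Name'])
print(label_zero_exists)
```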

In [None]:
# Access a single value by row label and column name
df.at[2,'Name']

'Tanya'

In [33]:
# Access a single value by integer row and column position
df.iat[2,0]

'Tanya'

In [31]:
df

     Name  Age    Address  Gender
0    Alok   22  Mainatand    Male
1   Rohan   25      Patna    Male
2   Tanya   24      Delhi  Female
3  Subham   20    Bettiah    Male
4  Keshav   30     Gujrat    Male
