# Introduction to pandas

When working with tabular data, such as data stored in spreadsheets or databases, pandas is the right tool for you. pandas will help you to explore, clean, and process your data. In pandas, a data table is called a DataFrame.



![image.png](attachment:image.png)

A DataFrame is a 2-dimensional data structure that can store data of different types (including characters, integers, floating-point values, categorical data and more) in columns. It is similar to a spreadsheet, a SQL table or the data.frame in R.

# Load pandas

To start working with pandas, import the package. The community-agreed alias for pandas is pd, so the pandas documentation assumes that pandas is imported as pd.

In [7]:
import pandas as pd

# Creating a DataFrame

In [8]:
# df is the conventional short name for a DataFrame

df = pd.DataFrame({
    'Name': ['Rohit', 'Sachin', 'Pinki', 'Nikki', 'kavya'],
    'Age': list(range(30, 80, 10)),          # 30, 40, 50, 60, 70
    'Experience': list(range(5, 18, 3))})    # 5, 8, 11, 14, 17

df

Unnamed: 0,Name,Age,Experience
0,Rohit,30,5
1,Sachin,40,8
2,Pinki,50,11
3,Nikki,60,14
4,kavya,70,17


Notice that the inferred dtypes are int64 for the Age and Experience columns and object for the Name column.

In [9]:
df.dtypes

Name          object
Age            int64
Experience     int64
dtype: object

To enforce a single dtype:

In [10]:
import numpy as np

# dtype=str applies one dtype to every column
df = pd.DataFrame({
    'Name': ['simran', 'Rohit', 'sacchi', 'Nikku', 'Kabir'],
    'Age': list(range(30, 80, 10)),
    'Experience': list(range(5, 18, 3))}, dtype=str)

df.dtypes

Name          object
Age           object
Experience    object
dtype: object
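As a rough sketch of what a forced string dtype means in practice, the example below uses a small hypothetical frame named `people` (not part of the notebook above). Every value is stored as text, so `+` concatenates instead of adding, and `astype` is one way to convert a column back when numeric work is needed.

```python
import pandas as pd

# Hypothetical frame for illustration; dtype=str stores every value as text.
people = pd.DataFrame({'Name': ['Asha', 'Ravi'], 'Age': [30, 40]}, dtype=str)

first_age = people.at[0, 'Age']           # a string, not an integer
doubled = people['Age'] + people['Age']   # string concatenation, not addition

# Convert a single column back to integers when arithmetic is needed.
ages_as_int = people['Age'].astype('int64')
```
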

Constructing DataFrame from a dictionary including Series:

In [11]:
df=pd.DataFrame({
    'Name':['Rakhi','Prachi','kiram','Rohit','chikku'],
    'Age':[i for i in range(30,80,10)],
    'Experience':pd.Series([i for i in range(5,18,3)])})

df

Unnamed: 0,Name,Age,Experience
0,Rakhi,30,5
1,Prachi,40,8
2,kiram,50,11
3,Rohit,60,14
4,chikku,70,17
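When the dictionary values are Series, pandas aligns them on their index labels rather than on position. The sketch below assumes two hypothetical Series, `age` and `experience`, with deliberately different labels to make the alignment visible.

```python
import pandas as pd

# Hypothetical Series whose index labels only partly overlap.
age = pd.Series([30, 40, 50], index=['a', 'b', 'c'])
experience = pd.Series([5, 8], index=['b', 'c'])

aligned = pd.DataFrame({'Age': age, 'Experience': experience})

# Row 'a' has no Experience value, so pandas fills it with NaN,
# and the Experience column becomes float64 to hold the missing value.
missing = aligned['Experience'].isna()
```
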


Constructing DataFrame from numpy ndarray:

In [12]:
# Generate random column names
from random_word import RandomWords
r = RandomWords()

labels = [r.get_random_word() for _ in range(10)]
# 100 random integers in [1, 1000), arranged as 10 rows x 10 columns
arr = np.random.randint(1, 1000, 100).reshape(10, 10)
df = pd.DataFrame(arr, columns=labels)

df

Unnamed: 0,gingerleaf,spermophile,sesquialteral,pungent,inocarpus,paleoclimatological,sulphostannite,chorometry,unspurred,ghostology
0,345,940,295,292,620,885,348,712,102,331
1,530,884,247,314,854,999,761,673,676,249
2,244,143,650,913,391,587,377,418,786,324
3,211,908,658,728,916,220,191,971,127,87
4,657,552,184,559,496,608,275,480,703,981
5,116,579,757,85,924,289,924,131,264,144
6,728,346,191,804,609,623,783,745,786,977
7,33,639,765,98,924,260,983,532,689,142
8,405,289,157,795,18,252,388,363,581,761
9,58,473,542,962,332,373,443,470,406,117


In [13]:
pip install random-word

Note: you may need to restart the kernel to use updated packages.


# Attributes

## 1. DataFrame.at



Access a single value for a row/column label pair.

Similar to loc, in that both provide label-based lookups. Use at if you only need to get or set a single value in a DataFrame or Series.



In [32]:
# Column names are random on each run; 'gingerleaf' is the first column of the df shown above
df.at[0, 'gingerleaf']
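Because the column names above change on every run, here is a minimal, self-contained sketch of `at` next to `loc`. The frame `scores` and its labels are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with string row labels.
scores = pd.DataFrame({'math': [90, 75], 'art': [60, 88]},
                      index=['ann', 'bob'])

value = scores.at['ann', 'math']         # read one cell by label
scores.at['bob', 'art'] = 95             # write one cell by label

same_value = scores.loc['ann', 'math']   # loc returns the same scalar
row = scores.loc['ann']                  # loc can also return a whole row
```

`at` only accepts a single row label and a single column label, which is why it suits single-value access, while `loc` also handles slices and lists of labels.
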

## 2. DataFrame.axes

Return a list representing the axes of the DataFrame.

It has the row axis labels and column axis labels as the only members. They are returned in that order.

In [16]:
df.axes

[RangeIndex(start=0, stop=10, step=1),
 Index(['gingerleaf', 'spermophile', 'sesquialteral', 'pungent', 'inocarpus',
        'paleoclimatological', 'sulphostannite', 'chorometry', 'unspurred',
        'ghostology'],
       dtype='object')]
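Since the list always holds the row labels first and the column labels second, it can be unpacked directly. A small sketch with a hypothetical frame:

```python
import pandas as pd

# Hypothetical frame for illustration.
frame = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})

# axes is [row labels, column labels], in that order.
row_labels, column_labels = frame.axes
```
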

## 3. DataFrame.columns

In [17]:
df.columns

Index(['gingerleaf', 'spermophile', 'sesquialteral', 'pungent', 'inocarpus',
       'paleoclimatological', 'sulphostannite', 'chorometry', 'unspurred',
       'ghostology'],
      dtype='object')

## 4. DataFrame.dtypes

In [18]:
df.dtypes

gingerleaf             int32
spermophile            int32
sesquialteral          int32
pungent                int32
inocarpus              int32
paleoclimatological    int32
sulphostannite         int32
chorometry             int32
unspurred              int32
ghostology             int32
dtype: object

## 5. DataFrame.empty

In [19]:
df.empty

False

## 6. DataFrame.iat

In [20]:
df.iat[0,0]

345

## 7. DataFrame.iloc

In [21]:
df.iloc[0:,0:]

Unnamed: 0,gingerleaf,spermophile,sesquialteral,pungent,inocarpus,paleoclimatological,sulphostannite,chorometry,unspurred,ghostology
0,345,940,295,292,620,885,348,712,102,331
1,530,884,247,314,854,999,761,673,676,249
2,244,143,650,913,391,587,377,418,786,324
3,211,908,658,728,916,220,191,971,127,87
4,657,552,184,559,496,608,275,480,703,981
5,116,579,757,85,924,289,924,131,264,144
6,728,346,191,804,609,623,783,745,786,977
7,33,639,765,98,924,260,983,532,689,142
8,405,289,157,795,18,252,388,363,581,761
9,58,473,542,962,332,373,443,470,406,117


## 8. DataFrame.index

In [22]:
df.index

RangeIndex(start=0, stop=10, step=1)

## 9. DataFrame.loc

In [23]:
df.loc[5]

gingerleaf             116
spermophile            579
sesquialteral          757
pungent                 85
inocarpus              924
paleoclimatological    289
sulphostannite         924
chorometry             131
unspurred              264
ghostology             144
Name: 5, dtype: int32
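With the default RangeIndex used above, labels and positions happen to coincide, so `loc` and `iloc` look interchangeable. The sketch below assumes a hypothetical frame whose index labels are 10, 20 and 30 to show where they differ.

```python
import pandas as pd

# Hypothetical frame whose row labels differ from row positions.
frame = pd.DataFrame({'value': [1.5, 2.5, 3.5]}, index=[10, 20, 30])

by_label = frame.loc[20, 'value']     # row labelled 20
by_position = frame.iloc[0, 0]        # first row, first column

label_slice = frame.loc[10:20]        # label slices include both endpoints
position_slice = frame.iloc[0:1]      # position slices exclude the stop
```
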

## 10. DataFrame.ndim

In [24]:
df.ndim

2

## 11. DataFrame.shape

In [25]:
df.shape

(10, 10)

## 12. DataFrame.size

In [26]:
df.size

100
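The attributes `ndim`, `shape`, `size` and `empty` are related: `size` is the number of rows times the number of columns, and `ndim` is the length of `shape`. A brief sketch with a hypothetical 3 x 2 frame:

```python
import pandas as pd

# Hypothetical frame with 3 rows and 2 columns.
frame = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

n_rows, n_cols = frame.shape
total_cells = frame.size              # n_rows * n_cols
dimensions = frame.ndim               # always 2 for a DataFrame

# A frame with no rows or columns reports empty as True.
no_data = pd.DataFrame().empty
```
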

## 13. DataFrame.values

In [27]:
x=df.values
x

array([[345, 940, 295, 292, 620, 885, 348, 712, 102, 331],
       [530, 884, 247, 314, 854, 999, 761, 673, 676, 249],
       [244, 143, 650, 913, 391, 587, 377, 418, 786, 324],
       [211, 908, 658, 728, 916, 220, 191, 971, 127,  87],
       [657, 552, 184, 559, 496, 608, 275, 480, 703, 981],
       [116, 579, 757,  85, 924, 289, 924, 131, 264, 144],
       [728, 346, 191, 804, 609, 623, 783, 745, 786, 977],
       [ 33, 639, 765,  98, 924, 260, 983, 532, 689, 142],
       [405, 289, 157, 795,  18, 252, 388, 363, 581, 761],
       [ 58, 473, 542, 962, 332, 373, 443, 470, 406, 117]])
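The pandas documentation recommends `DataFrame.to_numpy()` over `values` for new code. The sketch below, using a hypothetical frame `mixed`, also shows that columns of different types are combined into a single object array.

```python
import pandas as pd

# Hypothetical frame mixing text and integers.
mixed = pd.DataFrame({'name': ['x', 'y'], 'count': [1, 2]})

numeric = mixed[['count']].to_numpy()   # integer array, shape (2, 1)
everything = mixed.to_numpy()           # object array, shape (2, 2)
```
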

# Read and write tabular data


![image.png](attachment:image.png)