![image.png](attachment:image.png)

### Install Libraries
Libraries are usually installed with **pip install library-name**.
To install the **pandas** and **NumPy** libraries, use the following commands:
- **pip install pandas**
- **pip install numpy**

#### Import Libraries
After installing the required libraries, we have to import them before we can use their methods and functions.\
We import libraries like this:\
**For example:**
#### **import pandas as pd**
- The **import** keyword loads the library.
- **pandas** is the library name.
- **pd** is a short alias for pandas, for easier use.
- In the same way, we can import NumPy with **import numpy as np**.

In [2]:
import pandas as pd
import numpy as np

### Object Creation
  As in NumPy, we can create an object here, such as a Series containing different values.
  - Objects can be created from **dates, random numbers**, and **dictionaries.**

In [3]:
s=pd.Series([1,2,3,np.nan,4,5])
s

0    1.0
1    2.0
2    3.0
3    NaN
4    4.0
5    5.0
dtype: float64
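
The output above shows `float64` even though most inputs were whole numbers. A minimal sketch of why, using the hypothetical names `s_int` and `s_nan`: a single `np.nan` makes pandas store the whole Series as floating point.

```python
import numpy as np
import pandas as pd

# Without a missing value, whole numbers are stored with an integer dtype.
s_int = pd.Series([1, 2, 3])

# A single np.nan makes pandas store the whole Series as float64.
s_nan = pd.Series([1, 2, 3, np.nan])

print(s_int.dtype)
print(s_nan.dtype)

# Count how many entries are missing.
n_missing = int(s_nan.isna().sum())
print(n_missing)
```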

In [12]:
dates=pd.date_range("20220719",periods=5)
dates

DatetimeIndex(['2022-07-19', '2022-07-20', '2022-07-21', '2022-07-22',
               '2022-07-23'],
              dtype='datetime64[ns]', freq='D')

#### Using random numbers and dates, we can create a DataFrame.

In [13]:
df = pd.DataFrame(np.random.randn(5,4),index=dates, columns=list("ABCD"))
df

Unnamed: 0,A,B,C,D
2022-07-19,0.335925,1.179993,0.74329,0.447257
2022-07-20,-0.833055,-0.518417,-1.011523,-1.237513
2022-07-21,0.126407,1.222909,0.563691,-1.919835
2022-07-22,0.984153,-0.058488,1.348612,0.173
2022-07-23,-1.467302,-0.278251,0.881815,-0.707297
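
Because the values are random, they change every time the cell is re-run, so outputs from different runs will not match each other. A minimal sketch of a reproducible variant, assuming NumPy's `default_rng` with an arbitrary seed of 42 (the seed and the name `df_seeded` are illustrative choices, not part of the original notebook):

```python
import numpy as np
import pandas as pd

# A seeded generator produces the same "random" numbers on every run.
rng = np.random.default_rng(42)

dates = pd.date_range("20220719", periods=5)

# 5 rows x 4 columns of standard-normal values, indexed by date.
df_seeded = pd.DataFrame(
    rng.standard_normal((5, 4)), index=dates, columns=list("ABCD")
)
print(df_seeded)
```

Re-running this cell should print the same table each time, which makes later outputs easier to compare.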


### Creating a DataFrame using a dictionary
A dictionary consists of two main things. Each key becomes a column name, and its value supplies that column's data.\
**1- Key**\
**2- Value**

In [14]:
df2 = pd.DataFrame(
    {
        "A":1.0,
        "B":pd.Timestamp("20220720"),
        "C":pd.Series(1,index=list(range(4)), dtype="float32"),
        "D":np.array([3]*4,dtype="int32"),
        "E":pd.Categorical(["girl", "women","girl", "women"]),
        "F":"females"
    }
)
df2

Unnamed: 0,A,B,C,D,E,F
0,1.0,2022-07-20,1.0,3,girl,females
1,1.0,2022-07-20,1.0,3,women,females
2,1.0,2022-07-20,1.0,3,girl,females
3,1.0,2022-07-20,1.0,3,women,females
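
To make the key/value mapping concrete, here is a smaller sketch with hypothetical data (`people`, `name`, `score`, and `group` are made-up names). List values fill a column row by row, while a scalar value is repeated for every row.

```python
import pandas as pd

data = {
    "name": ["Ali", "Sara", "Omar"],  # list: one entry per row
    "score": [82, 91, 77],            # list: one entry per row
    "group": "A",                     # scalar: repeated for every row
}

# Keys become column names; values become the column contents.
people = pd.DataFrame(data)
print(people)
```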


**Checking the data types of the columns**

In [15]:
df2.dtypes

A           float64
B    datetime64[ns]
C           float32
D             int32
E          category
F            object
dtype: object

**Using the** "*head*" **method we can view the first five rows of data by default.**\
**We can also pass a number to view more or fewer rows.**

In [16]:
df.head(2)

Unnamed: 0,A,B,C,D
2022-07-19,0.335925,1.179993,0.74329,0.447257
2022-07-20,-0.833055,-0.518417,-1.011523,-1.237513


**Using the** "*tail*" **method we can view the last five rows of data by default.**

In [9]:
df.tail(2)

Unnamed: 0,A,B,C,D
2022-07-22,0.533342,-1.337875,0.01112,1.206803
2022-07-23,0.13139,-0.615857,0.122526,0.180004
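
The frame used above has only five rows, so the default of five is hard to see. A small sketch with a hypothetical ten-row frame (`frame` is a made-up name) shows the default and the effect of passing a number:

```python
import numpy as np
import pandas as pd

# Ten rows holding the values 0..9.
frame = pd.DataFrame({"value": np.arange(10)})

first_default = frame.head()   # no argument: first five rows
first_three = frame.head(3)    # first three rows
last_default = frame.tail()    # no argument: last five rows
last_two = frame.tail(2)       # last two rows

print(len(first_default), len(first_three), len(last_default), len(last_two))
```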


**The index holds the row labels.**\
We can view them using the **index** attribute.

In [17]:
df.index

DatetimeIndex(['2022-07-19', '2022-07-20', '2022-07-21', '2022-07-22',
               '2022-07-23'],
              dtype='datetime64[ns]', freq='D')

To convert a DataFrame into a NumPy array, we can use the following command.\
**df.to_numpy()**

In [18]:
df.to_numpy()

array([[ 0.3359247 ,  1.17999306,  0.74329035,  0.44725735],
       [-0.83305456, -0.51841743, -1.01152343, -1.23751252],
       [ 0.12640725,  1.22290879,  0.56369066, -1.91983473],
       [ 0.98415309, -0.05848849,  1.34861162,  0.17300033],
       [-1.46730191, -0.27825061,  0.88181461, -0.70729687]])
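
The conversion is cleanest when every column has the same type. A minimal sketch, using two hypothetical frames, of how the array dtype depends on the columns: all-float columns give a float array, while mixing numbers and text falls back to a generic `object` array.

```python
import pandas as pd

# Every column is float, so the array keeps a numeric dtype.
homogeneous = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]})
arr_float = homogeneous.to_numpy()

# Mixing floats and strings forces a generic object array.
mixed = pd.DataFrame({"x": [1.0, 2.0], "label": ["a", "b"]})
arr_object = mixed.to_numpy()

print(arr_float.dtype)
print(arr_object.dtype)
```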

We use the **describe()** method to view summary statistics of the data.

In [19]:
df.describe()

Unnamed: 0,A,B,C,D
count,5.0,5.0,5.0,5.0
mean,-0.170774,0.309549,0.505177,-0.648877
std,0.974489,0.83042,0.896345,0.98008
min,-1.467302,-0.518417,-1.011523,-1.919835
25%,-0.833055,-0.278251,0.563691,-1.237513
50%,0.126407,-0.058488,0.74329,-0.707297
75%,0.335925,1.179993,0.881815,0.173
max,0.984153,1.222909,1.348612,0.447257
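
Each row of the summary corresponds to a statistic that can also be computed directly. A short sketch on a hypothetical frame with easy numbers, checking a few entries against the individual methods:

```python
import pandas as pd

frame = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0], "B": [10.0, 20.0, 30.0, 40.0]})

summary = frame.describe()

# Individual cells of the summary are addressed by row label and column name.
mean_a = summary.loc["mean", "A"]     # same as frame["A"].mean()
median_b = summary.loc["50%", "B"]    # same as frame["B"].median()
count_a = summary.loc["count", "A"]   # number of non-missing values

print(mean_a, median_b, count_a)
```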


**Converting rows into columns (transpose)**

In [20]:
df.T

Unnamed: 0,2022-07-19,2022-07-20,2022-07-21,2022-07-22,2022-07-23
A,0.335925,-0.833055,0.126407,0.984153,-1.467302
B,1.179993,-0.518417,1.222909,-0.058488,-0.278251
C,0.74329,-1.011523,0.563691,1.348612,0.881815
D,0.447257,-1.237513,-1.919835,0.173,-0.707297


**We can also sort the data in ascending or descending order along a specific axis.**

In [21]:
df.sort_index(axis=0, ascending=True)
# df.sort_index(axis=0, ascending=False)

Unnamed: 0,A,B,C,D
2022-07-19,0.335925,1.179993,0.74329,0.447257
2022-07-20,-0.833055,-0.518417,-1.011523,-1.237513
2022-07-21,0.126407,1.222909,0.563691,-1.919835
2022-07-22,0.984153,-0.058488,1.348612,0.173
2022-07-23,-1.467302,-0.278251,0.881815,-0.707297


In [22]:
df.sort_values(by="B", ascending=True)

Unnamed: 0,A,B,C,D
2022-07-20,-0.833055,-0.518417,-1.011523,-1.237513
2022-07-23,-1.467302,-0.278251,0.881815,-0.707297
2022-07-22,0.984153,-0.058488,1.348612,0.173
2022-07-19,0.335925,1.179993,0.74329,0.447257
2022-07-21,0.126407,1.222909,0.563691,-1.919835
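
The two sorting methods above can be sketched on a small hypothetical frame where the result is easy to check by eye: `sort_values` orders rows by a column's values, and `sort_index` with `axis=1` orders the column labels.

```python
import pandas as pd

frame = pd.DataFrame({"B": [3, 1, 2], "A": [9, 8, 7]}, index=["x", "y", "z"])

# Rows ordered by column B, largest first.
by_b_desc = frame.sort_values(by="B", ascending=False)

# Columns ordered alphabetically by their labels.
cols_sorted = frame.sort_index(axis=1)

print(by_b_desc)
print(cols_sorted)
```

Both calls return a new DataFrame and leave `frame` unchanged.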


**Column-wise selection**

In [23]:
df["A"]

2022-07-19    0.335925
2022-07-20   -0.833055
2022-07-21    0.126407
2022-07-22    0.984153
2022-07-23   -1.467302
Freq: D, Name: A, dtype: float64

In [24]:
df["B"]

2022-07-19    1.179993
2022-07-20   -0.518417
2022-07-21    1.222909
2022-07-22   -0.058488
2022-07-23   -0.278251
Freq: D, Name: B, dtype: float64
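
Selecting with a single name returns a Series, as in the outputs above. A brief sketch on a hypothetical frame showing that passing a list of names returns a DataFrame instead:

```python
import pandas as pd

frame = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

one_column = frame["A"]            # single name: a Series
two_columns = frame[["A", "C"]]    # list of names: a DataFrame

print(type(one_column).__name__)
print(type(two_columns).__name__)
```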

**Row-wise selection**

In [25]:
df[0:2]

Unnamed: 0,A,B,C,D
2022-07-19,0.335925,1.179993,0.74329,0.447257
2022-07-20,-0.833055,-0.518417,-1.011523,-1.237513


### Row-wise selection
**Also date-wise selection**\
**Showing the data of each column for the date at index 0.**

In [27]:
df.loc[dates[0]]

A    0.335925
B    1.179993
C    0.743290
D    0.447257
Name: 2022-07-19 00:00:00, dtype: float64

### Column-wise selection of data

In [20]:
# Column-wise selection of data
df.loc[:,["A","C"]]

Unnamed: 0,A,C
2022-07-19,-1.252699,1.456603
2022-07-20,0.088898,-2.313301
2022-07-21,-0.653696,0.09558
2022-07-22,0.533342,0.01112
2022-07-23,0.13139,0.122526


### Column-wise selection of data
**Returns the data from 2022-07-19 through 2022-07-22.**

In [28]:
df.loc["20220719":"20220722",["A","C"]]

Unnamed: 0,A,C
2022-07-19,0.335925,0.74329
2022-07-20,-0.833055,-1.011523
2022-07-21,0.126407,0.563691
2022-07-22,0.984153,1.348612


### Column-wise selection of data
**Returns only the data for 2022-07-19 and 2022-07-22.**
- Here the column order is also changed and we can..

In [22]:

df.loc[["20220719","20220722"],["A","C","B"]]  # here the column order is also changed and we can..

Unnamed: 0,A,C,B
2022-07-19,-1.252699,1.456603,-1.327766
2022-07-22,0.533342,0.01112,-1.337875


#### Data of a Specific Date

In [23]:
df.loc["20220719",["A","C","B"]]

A   -1.252699
C    1.456603
B   -1.327766
Name: 2022-07-19 00:00:00, dtype: float64

In [24]:
df.loc[["20220719"],["A","C","B"]]

Unnamed: 0,A,C,B
2022-07-19,-1.252699,1.456603,-1.327766


In [25]:
df.at[dates[4],"A"]

0.13139026591018935
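
`at` reads a single cell by row and column label. A minimal sketch on a hypothetical frame, which also shows the position-based counterpart `iat` and that `at` can be used to set a value:

```python
import pandas as pd

dates = pd.date_range("20220719", periods=3)
frame = pd.DataFrame(
    {"A": [1.5, 2.5, 3.5], "B": [10.0, 20.0, 30.0]}, index=dates
)

# Label-based access: row label, then column label.
by_label = frame.at[dates[2], "A"]

# Position-based access: row number, then column number.
by_position = frame.iat[2, 0]

# The same accessor can assign a new value to one cell.
frame.at[dates[0], "B"] = 99.0

print(by_label, by_position, frame.at[dates[0], "B"])
```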

In [26]:
df.iloc[3]

A    0.533342
B   -1.337875
C    0.011120
D    1.206803
Name: 2022-07-22 00:00:00, dtype: float64

In [27]:
df.iloc[2:4]

Unnamed: 0,A,B,C,D
2022-07-21,-0.653696,-0.797819,0.09558,1.510146
2022-07-22,0.533342,-1.337875,0.01112,1.206803
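
One difference between the two accessors is easy to miss: a `loc` slice uses labels and includes the end label, while an `iloc` slice uses positions and excludes the end position. A small sketch on a hypothetical frame:

```python
import pandas as pd

dates = pd.date_range("20220719", periods=5)
frame = pd.DataFrame({"A": list(range(5))}, index=dates)

# Label slice: both 2022-07-19 and 2022-07-21 are included (3 rows).
by_label = frame.loc["20220719":"20220721"]

# Position slice: position 2 is excluded (2 rows).
by_position = frame.iloc[0:2]

print(len(by_label), len(by_position))
```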


**Getting data from four columns and three rows**

In [28]:
df.iloc[0:3,0:4]

Unnamed: 0,A,B,C,D
2022-07-19,-1.252699,-1.327766,1.456603,-0.398064
2022-07-20,0.088898,0.584527,-2.313301,0.168674
2022-07-21,-0.653696,-0.797819,0.09558,1.510146


**Getting data from all columns and three rows**

In [29]:
df.iloc[0:3,:]

Unnamed: 0,A,B,C,D
2022-07-19,-1.252699,-1.327766,1.456603,-0.398064
2022-07-20,0.088898,0.584527,-2.313301,0.168674
2022-07-21,-0.653696,-0.797819,0.09558,1.510146


**Getting data from two columns and all rows**

In [30]:
df.iloc[:,0:2]

Unnamed: 0,A,B
2022-07-19,-1.252699,-1.327766
2022-07-20,0.088898,0.584527
2022-07-21,-0.653696,-0.797819
2022-07-22,0.533342,-1.337875
2022-07-23,0.13139,-0.615857


**Finding rows where the value in column A is greater than zero**

In [31]:
df[df["A"]>0]

Unnamed: 0,A,B,C,D
2022-07-20,0.088898,0.584527,-2.313301,0.168674
2022-07-22,0.533342,-1.337875,0.01112,1.206803
2022-07-23,0.13139,-0.615857,0.122526,0.180004


In [32]:
df[df["B"]>0]

Unnamed: 0,A,B,C,D
2022-07-20,0.088898,0.584527,-2.313301,0.168674


#### Find Values Greater than Zero in the Whole Dataset

In [33]:
df[df>0]

Unnamed: 0,A,B,C,D
2022-07-19,,,1.456603,
2022-07-20,0.088898,0.584527,,0.168674
2022-07-21,,,0.09558,1.510146
2022-07-22,0.533342,,0.01112,1.206803
2022-07-23,0.13139,,0.122526,0.180004
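
Filtering the whole frame keeps its shape and replaces every value that fails the condition with `NaN`, which is why some cells above are empty. A short sketch on a hypothetical frame, showing that this matches `where` and that `where` can also substitute a replacement value:

```python
import pandas as pd

frame = pd.DataFrame({"A": [-1.0, 2.0], "B": [3.0, -4.0]})

# Values that are not greater than zero become NaN.
masked = frame[frame > 0]

# where() with only a condition gives the same result.
same = frame.where(frame > 0)

# A second argument replaces the failing values instead of using NaN.
filled = frame.where(frame > 0, 0.0)

print(masked)
print(filled)
```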


### Add a new column to the data

In [34]:
df3=df.copy()
df3["babaG"]=["one","three","four","two","five"]
df3

Unnamed: 0,A,B,C,D,babaG
2022-07-19,-1.252699,-1.327766,1.456603,-0.398064,one
2022-07-20,0.088898,0.584527,-2.313301,0.168674,three
2022-07-21,-0.653696,-0.797819,0.09558,1.510146,four
2022-07-22,0.533342,-1.337875,0.01112,1.206803,two
2022-07-23,0.13139,-0.615857,0.122526,0.180004,five


In [35]:
df3["new"]=[1.2,2.2,2.3,2.4,2.5]
df3

Unnamed: 0,A,B,C,D,babaG,new
2022-07-19,-1.252699,-1.327766,1.456603,-0.398064,one,1.2
2022-07-20,0.088898,0.584527,-2.313301,0.168674,three,2.2
2022-07-21,-0.653696,-0.797819,0.09558,1.510146,four,2.3
2022-07-22,0.533342,-1.337875,0.01112,1.206803,two,2.4
2022-07-23,0.13139,-0.615857,0.122526,0.180004,five,2.5


**Finding the mean along axis 1 (across each row) and creating a new column for it**

In [36]:
df3["Mean"] = df3.mean(axis=1, numeric_only=True)  # skip the text column "babaG"
df3


Unnamed: 0,A,B,C,D,babaG,new,Mean
2022-07-19,-1.252699,-1.327766,1.456603,-0.398064,one,1.2,-0.064385
2022-07-20,0.088898,0.584527,-2.313301,0.168674,three,2.2,0.14576
2022-07-21,-0.653696,-0.797819,0.09558,1.510146,four,2.3,0.490842
2022-07-22,0.533342,-1.337875,0.01112,1.206803,two,2.4,0.562678
2022-07-23,0.13139,-0.615857,0.122526,0.180004,five,2.5,0.463613
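
A row-wise mean only makes sense for numeric columns. A minimal sketch on a hypothetical frame with one text column, assuming `numeric_only=True` is passed so that the text column is skipped explicitly (newer pandas versions may raise an error for mixed columns without it):

```python
import pandas as pd

frame = pd.DataFrame(
    {"A": [1.0, 2.0], "B": [3.0, 6.0], "label": ["one", "two"]}
)

# axis=1 averages across the columns of each row;
# numeric_only=True leaves the text column out of the calculation.
frame["Mean"] = frame.mean(axis=1, numeric_only=True)

print(frame)
```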


## Delete Columns

In [40]:
df3 = df3.iloc[:, 0:2]
df3

Unnamed: 0,A,B
2022-07-19,-1.252699,-1.327766
2022-07-20,0.088898,0.584527
2022-07-21,-0.653696,-0.797819
2022-07-22,0.533342,-1.337875
2022-07-23,0.13139,-0.615857
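
The cell above removes columns indirectly, by keeping only the first two. An alternative sketch on a hypothetical frame uses `drop`, which names the columns to remove and returns a new DataFrame:

```python
import pandas as pd

frame = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# Remove one column by name.
without_c = frame.drop(columns=["C"])

# Remove several columns at once.
only_a = frame.drop(columns=["B", "C"])

# The original frame is left unchanged.
print(list(without_c.columns), list(only_a.columns), list(frame.columns))
```

Naming the columns tends to be safer than slicing by position when the column order might change.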
