#### Pandas Overview

Pandas is a **data manipulation** library based on **NumPy**, used for **data analysis**, **cleaning**, and **preprocessing**.  

Pandas works on **in-memory data**: a dataset is loaded into your computer's RAM, which makes access and processing fast but limits the data size to what fits in memory.  

Pandas primarily provides **two data structures**:

1. **Series**
   - A **1-dimensional array-like** data structure.
   - Holds **homogeneous data**: all values share a single data type, and each value has an index label.
   
2. **DataFrame**
   - A **2-dimensional**, size-mutable, table-like data structure.
   - Can hold **heterogeneous data elements** (different data types in different columns).
   - Provides **labeled access** to rows and columns.
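
The difference between the two structures can be sketched with a small example. The names `prices` and `items` and their values are made up for illustration and are not part of the notebook.

```python
import pandas as pd

# Hypothetical data: a Series stores one dtype for all of its values.
prices = pd.Series([1.5, 2.0, 3.25], index=["x", "y", "z"])

# Hypothetical data: a DataFrame can store a different dtype in each column.
items = pd.DataFrame(
    {"name": ["pen", "book", "bag"], "qty": [3, 1, 2], "price": [1.5, 2.0, 3.25]},
    index=["x", "y", "z"],
)

print(prices.dtype)             # one dtype for the whole Series
print(items.dtypes)             # one dtype per column
print(items.loc["y", "price"])  # labeled access by row label and column label
```

Each column of `items` keeps its own type, while `prices` reports a single `dtype` for every element.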


#### Importing

In [44]:
import pandas as pd
import numpy as np

In [45]:
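labels = ['a', 'b', 'c']          # index labels
mydata = [10, 20, 30]             # plain Python list
arr = np.array(mydata)            # the same values as a NumPy array
d = {'a': 10, 'b': 20, 'c': 30}   # dict whose keys can serve as index labels
type(arr)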

numpy.ndarray

In [46]:
pd.Series(data=mydata)

0    10
1    20
2    30
dtype: int64

In [47]:
type(pd.Series(data=mydata))

pandas.core.series.Series

#### Parameters of the Series constructor

In [48]:
'''pandas.Series(
    data=None,
    index=None,
    dtype=None,
    name=None,
    copy=False,
    fastpath=False
)
'''

'pandas.Series(\n    data=None,\n    index=None,\n    dtype=None,\n    name=None,\n    copy=False,\n    fastpath=False\n)\n'

In [49]:
pd.Series(data=mydata, index=labels)


a    10
b    20
c    30
dtype: int64
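
As a rough sketch of how the `index`, `dtype`, and `name` parameters combine, the example below builds one Series with all three. The name `"score"` and the choice of `float64` are assumptions made for illustration.

```python
import pandas as pd

# index sets the labels, dtype forces the element type, name labels the Series.
s = pd.Series(data=[10, 20, 30], index=["a", "b", "c"], dtype="float64", name="score")

print(s)
print(s["b"])           # look up a value by its label
print(s.name, s.dtype)
```

Because `dtype="float64"` is given, the integer inputs are stored as floats.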

In [50]:
pd.Series(d)
d

{'a': 10, 'b': 20, 'c': 30}

In [51]:

pd_series = pd.Series(labels)
print(pd_series)

pd_series = pd.Series(d)
print(pd_series)


pd_series = pd.Series([sum,len,print])
print(pd_series)

0    a
1    b
2    c
dtype: object
a    10
b    20
c    30
dtype: int64
0      <built-in function sum>
1      <built-in function len>
2    <built-in function print>
dtype: object
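
When a dict is passed as `data`, its keys become the index. A minimal sketch, assuming the same dict `d` as above plus a hypothetical label `'z'` that is not in the dict: passing `index` together with a dict selects and orders the keys, and a label with no matching key gets a missing value.

```python
import pandas as pd

d = {"a": 10, "b": 20, "c": 30}

# Keys become the index labels.
from_dict = pd.Series(d)

# 'z' is a hypothetical label absent from d, so its value is NaN.
picked = pd.Series(d, index=["c", "a", "z"])

print(from_dict)
print(picked)
```

The missing value turns `picked` into a float Series, since NaN cannot be stored in an integer dtype.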


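#### DataFrames
1. Similar to an Excel sheet: data is arranged in labeled rows and columns.
2. Two-dimensional: each column is a Series, and all columns share the same row index.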


In [52]:
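from numpy.random import randn
np.random.seed(101)  # fix the seed so the random values are reproducible

# 5x4 array of standard-normal samples, with row labels A-E and column labels K-N
df = pd.DataFrame(randn(5, 4), index=['A', 'B', 'C', 'D', 'E'], columns=['K', 'L', 'M', 'N'])
df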


          K         L         M         N
A  2.706850  0.628133  0.907969  0.503826
B  0.651118 -0.319318 -0.848077  0.605965
C -2.018168  0.740122  0.528813 -0.589001
D  0.188695 -0.758872 -0.933237  0.955057
E  0.190794  1.978757  2.605967  0.683509


In [53]:
print(df['K'])
print(df[['K','M']])

print(type(df['K']))


A    2.706850
B    0.651118
C   -2.018168
D    0.188695
E    0.190794
Name: K, dtype: float64
          K         M
A  2.706850  0.907969
B  0.651118 -0.848077
C -2.018168  0.528813
D  0.188695 -0.933237
E  0.190794  2.605967
<class 'pandas.core.series.Series'>
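
The output above shows that a single column label returns a Series, while a list of labels returns a DataFrame. The sketch below repeats that on a small hand-written frame (the values are made up) and also selects a row with `loc` and `iloc`, which the notebook does not cover.

```python
import pandas as pd

# Hypothetical values, chosen so the results are easy to read.
df = pd.DataFrame(
    {"K": [1.0, 2.0, 3.0], "L": [4.0, 5.0, 6.0], "M": [7.0, 8.0, 9.0]},
    index=["A", "B", "C"],
)

col = df["K"]          # single label -> Series
sub = df[["K", "M"]]   # list of labels -> DataFrame
row = df.loc["B"]      # row by label -> Series indexed by the column names
same_row = df.iloc[1]  # the same row, selected by integer position

print(type(col).__name__, type(sub).__name__)
print(row)
```

Rows come back as Series too, with the column names as their index.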


In [54]:
df.drop('K', axis=1, inplace=True)  # axis=1: drop column K, modifying df in place

In [55]:
df

          L         M         N
A  0.628133  0.907969  0.503826
B -0.319318 -0.848077  0.605965
C  0.740122  0.528813 -0.589001
D -0.758872 -0.933237  0.955057
E  1.978757  2.605967  0.683509


In [56]:
df.drop('A', axis=0, inplace=True)  # axis=0: drop row A, modifying df in place

In [57]:
df

          L         M         N
B -0.319318 -0.848077  0.605965
C  0.740122  0.528813 -0.589001
D -0.758872 -0.933237  0.955057
E  1.978757  2.605967  0.683509
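
The `drop` calls above change `df` permanently because of `inplace=True`. As an alternative sketch, `drop` without `inplace` returns a new DataFrame and leaves the original untouched. The frame below uses made-up values, and the `columns=` and `index=` keywords are an equivalent spelling of the `axis` argument.

```python
import pandas as pd

# Hypothetical frame with the same labels as the notebook's df after the first drop.
df = pd.DataFrame({"L": [1, 2, 3], "M": [4, 5, 6], "N": [7, 8, 9]}, index=["B", "C", "D"])

no_m = df.drop(columns="M")  # equivalent to df.drop("M", axis=1)
no_b = df.drop(index="B")    # equivalent to df.drop("B", axis=0)

print(df.shape)               # the original still has every row and column
print(no_m.columns.tolist())
print(no_b.index.tolist())
```

Keeping the original intact makes it easier to rerun a cell without hitting a `KeyError` for a label that was already dropped.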
