## Understanding the DataFrame, the heterogeneous data structure provided by pandas

#### A DataFrame is a heterogeneous, two-dimensional object: each column can hold a different data type. It is built on top of NumPy, so most NumPy concepts carry over.

In [31]:
# Importing Pandas Module
import pandas as pd
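
Since a DataFrame is built on NumPy, NumPy functions and vectorized arithmetic can be applied to it directly. The following is a minimal sketch; the frame `demo` and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame used only for illustration
demo = pd.DataFrame({"x": [1.0, 4.0], "y": [9.0, 16.0]})

# NumPy ufuncs act element-wise and return a DataFrame with the same labels
roots = np.sqrt(demo)
print(roots)

# Arithmetic is vectorized, just as with NumPy arrays
doubled = demo * 2
print(doubled)
```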

## A simple DataFrame can be created in several ways, as illustrated below:

### 1. Using a list of dictionaries

In [32]:
lst = [{'C1' : 1, 'C2' : 2}, {'C3' : 3, 'C4' : 5}]
print(pd.DataFrame(lst, index = ["R1", "R2"]))

     C1   C2   C3   C4
R1  1.0  2.0  NaN  NaN
R2  NaN  NaN  3.0  5.0


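**Observe the `NaN` entries: `NaN` means 'Not a Number' and is the primary way pandas represents missing or undefined data in DataFrames and Series. Because `NaN` is a floating-point value, the integer values in these columns are stored as floats (`1.0`, `2.0`).**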
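
Missing values can be located and counted with `isna()`. This is a small sketch that rebuilds a frame like the one above under the assumed name `frame`.

```python
import pandas as pd

rows = [{"C1": 1, "C2": 2}, {"C3": 3, "C4": 5}]
frame = pd.DataFrame(rows, index=["R1", "R2"])

# Boolean mask: True wherever a value is missing
missing = frame.isna()
print(missing)

# Every column contains a NaN, so each one is stored as float64
print(frame.dtypes)

# Number of missing values per column
missing_counts = frame.isna().sum()
print(missing_counts)
```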

### 2. Using a dictionary of lists

In [33]:
dc = {"C1" : [1, 2],
      "C2" : [3, 4]}
print(pd.DataFrame(dc, index = ["R1", "R2"]))

    C1  C2
R1   1   3
R2   2   4


### 3. Using a list of lists

In [34]:
lst = [[10, 20], [30, 40]]
print(pd.DataFrame(lst, index = list('pq'), columns = list('ab')))

    a   b
p  10  20
q  30  40


## A DataFrame can also be heterogeneous, with different columns holding different data types:

In [35]:
df = pd.DataFrame({'A' : [10., 20.],
                   'B' : 'Text',
                   'C' : [1, 2],
                   'D' : 3 + 9j })
print(df)

      A     B  C         D
0  10.0  Text  1  3.0+9.0j
1  20.0  Text  2  3.0+9.0j
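
In the frame above, the scalar values (`'Text'` and `3 + 9j`) are repeated to match the length of the list-valued columns. The sketch below, using illustrative variable names, shows that this only works when something defines the number of rows: either at least one list-like column or an explicit `index`.

```python
import pandas as pd

# A scalar is repeated to match the length of the list-like column
ok = pd.DataFrame({"A": [10.0, 20.0], "B": "Text"})
print(ok)

# With only scalars there is no length to repeat to, so pandas raises ValueError
try:
    pd.DataFrame({"B": "Text", "D": 3 + 9j})
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)

# Passing an index supplies the missing length
fixed = pd.DataFrame({"B": "Text", "D": 3 + 9j}, index=[0, 1])
print(fixed)
```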


In [36]:
print(df.dtypes)

A       float64
B        object
C         int64
D    complex128
dtype: object


**To display a complete summary of a DataFrame (index, columns, non-null counts, dtypes, and memory usage), call its `info()` method:**

In [37]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype     
---  ------  --------------  -----     
 0   A       2 non-null      float64   
 1   B       2 non-null      object    
 2   C       2 non-null      int64     
 3   D       2 non-null      complex128
dtypes: complex128(1), float64(1), int64(1), object(1)
memory usage: 212.0+ bytes
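
Note that `info()` prints the summary itself and returns `None`, so wrapping the call in `print()` adds a trailing `None` to the output. If the summary is needed as a string, one option is to pass a text buffer through the `buf` parameter. A minimal sketch with a smaller stand-in frame:

```python
import io
import pandas as pd

df = pd.DataFrame({"A": [10.0, 20.0], "C": [1, 2]})

# info() writes the summary to stdout and returns None
result = df.info()

# To capture the summary as a string, write it into a text buffer instead
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()
print(summary)
```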


### The row labels, the column labels, and the underlying values (without labels) can each be retrieved separately:

In [38]:
print(df.index)

RangeIndex(start=0, stop=2, step=1)


In [39]:
print(df.columns)

Index(['A', 'B', 'C', 'D'], dtype='object')


In [40]:
print(df.values)

[[10.0 'Text' 1 (3+9j)]
 [20.0 'Text' 2 (3+9j)]]
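
The same array can be obtained with `DataFrame.to_numpy()`, which the pandas documentation recommends over `.values`. Because the columns have mixed types, the resulting array falls back to the `object` dtype; selecting only numeric columns yields a numeric array. A short sketch with a reduced version of the frame:

```python
import pandas as pd

df = pd.DataFrame({"A": [10.0, 20.0], "B": "Text", "C": [1, 2]})

# Mixed column types force a common dtype of object
arr = df.to_numpy()
print(arr.dtype)

# Selecting only the numeric columns gives a float64 array
nums = df[["A", "C"]].to_numpy()
print(nums.dtype)
```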
