# DataFrame
DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects. It is generally the most commonly used pandas object. Like Series, DataFrame accepts many different kinds of input:

- Dict of 1D ndarrays, lists, dicts, or Series
- 2-D numpy.ndarray
- Structured or record ndarray
- A Series
- Another DataFrame
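
As a rough illustration of three of the input kinds listed above that are not demonstrated elsewhere in this notebook, the sketch below builds a DataFrame from a 2-D ndarray, from a Series, and from another DataFrame. The variable names and values are made up for the example.

```python
import numpy as np
import pandas as pd

# 2-D ndarray: each row of the array becomes a row of the DataFrame.
arr = np.arange(6).reshape(3, 2)
df_arr = pd.DataFrame(arr, columns=["x", "y"])

# A Series becomes a single-column DataFrame; its name is used as the column label.
s = pd.Series([10, 20, 30], name="value")
df_ser = pd.DataFrame(s)

# Another DataFrame: the constructor returns a DataFrame with the same labels and values.
df_copy = pd.DataFrame(df_arr)

print(df_arr.shape)
print(list(df_ser.columns))
print(df_copy.equals(df_arr))
```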

Along with the data, you can optionally pass index (row labels) and columns (column labels) arguments. If you pass an index and / or columns, you are guaranteeing the index and / or columns of the resulting DataFrame. Thus, a dict of Series plus a specific index will discard all data not matching up to the passed index.

If axis labels are not passed, they will be constructed from the input data based on common sense rules.
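
The effect of passing an explicit index or columns can be sketched as follows. This is a minimal example with made-up labels (`'z'` and `'missing'` are hypothetical and deliberately absent from the data), not the only way to select rows or columns.

```python
import pandas as pd

d = {
    "one": pd.Series([1, 2, 3], index=["a", "b", "c"]),
    "two": pd.Series([4, 5, 6], index=["a", "b", "c"]),
}

# Only labels in the passed index survive: 'c' is discarded,
# and 'z' (not present in the data) becomes a row of NaN.
df = pd.DataFrame(d, index=["b", "a", "z"])

# Passing columns picks and orders the dict keys; a key that is
# not in the dict produces a column of NaN.
df2 = pd.DataFrame(d, columns=["two", "missing"])

print(df)
print(df2)
```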

In [2]:
import numpy as np
import pandas as pd

# Create DataFrame

## From dict of Series or dicts
The resulting index will be the union of the indexes of the various Series. If there are any nested dicts, these will first be converted to Series. If no columns are passed, the columns will be the ordered list of dict keys.

In [3]:
d = { 
    'one': pd.Series([1,2,3],index= ['a','b','c']),
    'two': pd.Series([4,5,6],index= ['a','b','c']),
    'three': pd.Series([7,8,9,10], index = ['a','d','c','f'])
}

In [4]:
df = pd.DataFrame(d)
df

|   | one | two | three |
|---|-----|-----|-------|
| a | 1.0 | 4.0 | 7.0 |
| b | 2.0 | 5.0 | NaN |
| c | 3.0 | 6.0 | 9.0 |
| d | NaN | NaN | 8.0 |
| f | NaN | NaN | 10.0 |


In [8]:
df = pd.DataFrame(d, columns=['one','two','three'])
df

|   | one | two | three |
|---|-----|-----|-------|
| a | 1.0 | 4.0 | 7.0 |
| b | 2.0 | 5.0 | NaN |
| c | 3.0 | 6.0 | 9.0 |
| d | NaN | NaN | 8.0 |
| f | NaN | NaN | 10.0 |


In [12]:
print(df.index)
print(df.columns)

Index(['a', 'b', 'c', 'd', 'f'], dtype='object')
Index(['one', 'two', 'three'], dtype='object')
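
The note at the start of this section about nested dicts can be illustrated with a small sketch. The keys and values here are invented for the example: the outer keys become column labels and the inner keys become row labels, with missing combinations filled with NaN.

```python
import pandas as pd

nested = {
    "one": {"a": 1, "b": 2},
    "two": {"b": 20, "c": 30},
}

# Each inner dict is treated like a Series keyed by its own labels,
# so the resulting index is the union of the inner keys.
df = pd.DataFrame(nested)
print(df)
```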


## From dict of ndarrays / lists
The ndarrays or lists must all be the same length. If an index is passed, it must also be the same length as the arrays. If no index is passed, the index will be range(n), where n is the array length.

In [14]:
data = {
    'one':[1,2,3,4,5],
    'two':[5,6,7,8,9]
}

In [15]:
df = pd.DataFrame(data) #no index is passed
df

|   | one | two |
|---|-----|-----|
| 0 | 1 | 5 |
| 1 | 2 | 6 |
| 2 | 3 | 7 |
| 3 | 4 | 8 |
| 4 | 5 | 9 |


In [16]:
df = pd.DataFrame(data, index=['a','b','c','e','f']) #index is passed
df

|   | one | two |
|---|-----|-----|
| a | 1 | 5 |
| b | 2 | 6 |
| c | 3 | 7 |
| e | 4 | 8 |
| f | 5 | 9 |
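
The length requirement described at the start of this section can be checked with a short sketch. The data is made up, and the exceptions are caught only so that the example runs to completion.

```python
import pandas as pd

# Lists of different lengths (3 and 2) cannot be combined into columns.
bad = {"one": [1, 2, 3], "two": [4, 5]}
try:
    pd.DataFrame(bad)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

# Equal-length lists are fine, but an index of the wrong length is rejected too.
ok = {"one": [1, 2, 3], "two": [4, 5, 6]}
try:
    pd.DataFrame(ok, index=["a", "b"])
    index_error = None
except ValueError as exc:
    index_error = exc

# With no index passed, the row labels are 0, 1, ..., n - 1.
default_index = pd.DataFrame(ok).index

print(mismatch_error)
print(index_error)
print(list(default_index))
```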


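## From list
A list can be one-dimensional (a single column of values) or two-dimensional (a list of rows). The DataFrame constructor takes its first three arguments in the order data, index, columns. If one list is passed, pandas treats it as the data. If a second list is passed, it is used as the index (row labels), and a third list is used as the column labels. These can be given either by keyword or by position.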

In [19]:
# index and columns passed by keyword
pd.DataFrame([['sharif',23,"5.7\"",76,"married"],
              ['adt',22,"5.4\"",54,"married"]],
             index=[1,2],
             columns=['name','age','height','weight','status']
            )

|   | name | age | height | weight | status |
|---|------|-----|--------|--------|--------|
| 1 | sharif | 23 | 5.7" | 76 | married |
| 2 | adt | 22 | 5.4" | 54 | married |


In [20]:
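# same call, with index and columns passed by position
pd.DataFrame([['sharif',23,"5.7\"",76,"married"],
              ['adt',22,"5.4\"",54,"married"]],
             [1,2],
             ['name','age','height','weight','status']
            )

|   | name | age | height | weight | status |
|---|------|-----|--------|--------|--------|
| 1 | sharif | 23 | 5.7" | 76 | married |
| 2 | adt | 22 | 5.4" | 54 | married |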
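
For comparison, here is a small sketch of what happens when labels are left out, assuming made-up numeric data. A flat list yields a single column, a list of rows yields one column per element, and unlabeled axes get default integer labels.

```python
import pandas as pd

# One-dimensional list: a single column labeled 0, rows labeled 0..2.
flat = pd.DataFrame([10, 20, 30])

# List of rows without labels: both axes get default integer labels.
rows = pd.DataFrame([[1, 2], [3, 4]])

# The same rows with index and columns passed by position.
labeled = pd.DataFrame([[1, 2], [3, 4]], ["r1", "r2"], ["c1", "c2"])

print(flat.shape)
print(list(rows.columns))
print(labeled)
```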


## From structured or record array
This case is handled identically to a dict of arrays.

In [21]:
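# Fields: A is a 32-bit int, B a 32-bit float, C a 10-byte string.
# 'S10' is the standard code for a byte string ('a10' is an older alias for it).
data = np.zeros((2, ), dtype=[('A', 'i4'), ('B', 'f4'), ('C', 'S10')])
data[:] = [(1, 2., 'Hello'), (2, 3., "World")]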
pd.DataFrame(data)

|   | A | B | C |
|---|---|---|---|
| 0 | 1 | 2.0 | b'Hello' |
| 1 | 2 | 3.0 | b'World' |


In [None]:
pd.DataFrame(data, index=['first', 'second'])
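
As a variation on the cells above, the following sketch uses a Unicode field (`'U10'`, an assumption made here so that column C holds ordinary strings instead of bytes) and passes both an index and a column order. The field names are reused from the example above.

```python
import numpy as np
import pandas as pd

# 'U10' stores text as Unicode, so the values come back as str, not bytes.
data = np.zeros((2,), dtype=[("A", "i4"), ("B", "f4"), ("C", "U10")])
data[:] = [(1, 2.0, "Hello"), (2, 3.0, "World")]

# index sets the row labels; columns reorders the fields of the array.
df = pd.DataFrame(data, index=["first", "second"], columns=["C", "A", "B"])
print(df)
```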