# Pandas DataFrames


####  What is a DataFrame?
A pandas DataFrame is a two-dimensional, labeled data structure, similar to a two-dimensional array or a table with rows and columns.

![title](structure_table.jpg)

`pandas.DataFrame`


A pandas DataFrame can be created using the following constructor:

`pandas.DataFrame(data, index, columns, dtype, copy)`
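
To make the constructor signature more concrete, here is a minimal sketch that passes all five parameters by keyword. The variable name `df_params` and the sample values are made up for illustration.

```python
import pandas as pd

# Illustrative values only; the labels and numbers are assumptions for this sketch.
df_params = pd.DataFrame(
    data=[[1, 2], [3, 4]],   # the values, one inner list per row
    index=["r1", "r2"],      # row labels
    columns=["x", "y"],      # column labels
    dtype=float,             # cast every column to float64
    copy=True,               # copy the input data instead of reusing it
)
print(df_params)
```

Only `data` is commonly required in practice; the other parameters fall back to defaults when omitted.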

# Create an Empty DataFrame
The most basic DataFrame that can be created is an empty DataFrame.

In [1]:
import pandas as pd
df = pd.DataFrame()

print(df)

Empty DataFrame
Columns: []
Index: []
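
As a small follow-up, emptiness can also be checked programmatically rather than by reading the printed output. This sketch uses the `empty` and `shape` attributes; the name `empty_df` is chosen only for this example.

```python
import pandas as pd

empty_df = pd.DataFrame()

# `empty` is True when the DataFrame holds no elements.
print(empty_df.empty)

# `shape` is a (rows, columns) tuple.
print(empty_df.shape)
```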


# Create a DataFrame from Lists
The DataFrame can be created using a single list or a list of lists.



# Example 1

In [5]:
import pandas as pd
data = [1,2,3,4,5]
df =pd.DataFrame(data)
print(df)

   0
0  1
1  2
2  3
3  4
4  5


# Example 2

In [6]:
data = [["yasir",'20'],['adnan','21'],['Ali','22']]
df = pd.DataFrame(data,columns=["Names","Age"])
print(df)

   Names Age
0  yasir  20
1  adnan  21
2    Ali  22
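
Note that the ages in Example 2 are written as strings (`'20'`, `'21'`, `'22'`), so the `Age` column holds text rather than numbers. One possible way to convert it, sketched here with the hypothetical variable name `people`, is `astype`:

```python
import pandas as pd

data = [["yasir", "20"], ["adnan", "21"], ["Ali", "22"]]
people = pd.DataFrame(data, columns=["Names", "Age"])

# Convert the text column to integers so that arithmetic works on it.
people["Age"] = people["Age"].astype(int)

print(people["Age"].sum())
```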


# Example 3

In [6]:
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
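# Cast only the numeric column: passing dtype=float for the whole frame
# fails on the string column 'Name' in recent pandas versions.
df1 = pd.DataFrame(data,columns=['Name','Age']).astype({'Age': float})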
print(df1)

     Name   Age
0    Alex  10.0
1     Bob  12.0
2  Clarke  13.0




# Create a DataFrame from Dict of ndarrays / Lists
All the ndarrays (or lists) must be of the same length. If an index is passed, its length must equal the length of the arrays.

If no index is passed, the index defaults to `range(n)`, where `n` is the array length.

In [8]:
import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
print(df)

    Name  Age
0    Tom   28
1   Jack   34
2  Steve   29
3  Ricky   42
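
The equal-length requirement stated above can be demonstrated with a short sketch: when the lists differ in length, the constructor raises a `ValueError`. The flag `raised` is a helper introduced only for this example.

```python
import pandas as pd

raised = False
try:
    # 'Name' has three entries but 'Age' has only two.
    pd.DataFrame({"Name": ["Tom", "Jack", "Steve"], "Age": [28, 34]})
except ValueError as err:
    raised = True
    print("Could not build the DataFrame:", err)
```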


# Example 2
Let us now create a DataFrame from lists with an explicit index.

In [11]:
import pandas as pd
data1 = {"Names":["Yasir","adnan","Jawad","Qadir"],"Age" :[12,13,14,15]}
df = pd.DataFrame(data1,index = ["20mte006",'20mte003','20mte007','20cs014'])
print(df)

          Names  Age
20mte006  Yasir   12
20mte003  adnan   13
20mte007  Jawad   14
20cs014   Qadir   15


# Create a DataFrame from List of Dicts
A list of dictionaries can be passed as input data to create a DataFrame. By default, the dictionary keys are used as column names, and a key that is missing from a dictionary is filled with NaN.

In [13]:
import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print(df)

   a   b     c
0  1   2   NaN
1  5  10  20.0
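
Building on the list-of-dicts input, the sketch below also passes `index` and `columns`. In this setup `columns` selects which dictionary keys become columns; the row labels `first` and `second` are made up for illustration.

```python
import pandas as pd

records = [{"a": 1, "b": 2}, {"a": 5, "b": 10, "c": 20}]

# Keep only the keys 'a' and 'c', and label the rows explicitly.
df_records = pd.DataFrame(records, index=["first", "second"], columns=["a", "c"])
print(df_records)

# The first dictionary has no 'c' key, so that cell is missing.
print(df_records["c"].isna())
```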


# Column Selection


In [14]:
import pandas as pd
data1 = {"Names":["Yasir","adnan","Jawad","Qadir"],"Age" :[12,13,14,15]}
df = pd.DataFrame(data1,index = ["20mte006",'20mte003','20mte007','20cs014'])
print(df)

          Names  Age
20mte006  Yasir   12
20mte003  adnan   13
20mte007  Jawad   14
20cs014   Qadir   15


In [15]:
df['Names']

20mte006    Yasir
20mte003    adnan
20mte007    Jawad
20cs014     Qadir
Name: Names, dtype: object

In [17]:
df.index

Index(['20mte006', '20mte003', '20mte007', '20cs014'], dtype='object')
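
As a complement to the single-column selection above, this sketch contrasts indexing with one label (which returns a Series) and indexing with a list of labels (which returns a DataFrame). The variable names are chosen for this example only.

```python
import pandas as pd

data1 = {"Names": ["Yasir", "adnan", "Jawad", "Qadir"], "Age": [12, 13, 14, 15]}
df = pd.DataFrame(data1, index=["20mte006", "20mte003", "20mte007", "20cs014"])

one_column = df["Names"]          # a single label gives a Series
two_columns = df[["Names", "Age"]]  # a list of labels gives a DataFrame
age_frame = df[["Age"]]           # a one-element list still gives a DataFrame

print(type(one_column).__name__)
print(type(two_columns).__name__)
print(age_frame.shape)
```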

# Column Addition
A new column can be added to an existing DataFrame by assigning a list, or an expression built from existing columns, to a new column name.

In [36]:
dat = {"one":[1,2,3,4,5],"two":[4,5,6,7,8],'three':[7,8,9,10,11]}
dff = pd.DataFrame(dat)

print(dff)

   one  two  three
0    1    4      7
1    2    5      8
2    3    6      9
3    4    7     10
4    5    8     11


In [37]:
dff['four'] = [1,121,3,14,15]
dff

   one  two  three  four
0    1    4      7     1
1    2    5      8   121
2    3    6      9     3
3    4    7     10    14
4    5    8     11    15


In [38]:
dff['Five'] = dff['four']+dff['three']
print(dff)


   one  two  three  four  Five
0    1    4      7     1     8
1    2    5      8   121   129
2    3    6      9     3    12
3    4    7     10    14    24
4    5    8     11    15    26
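
Assigning to `dff['four']` changes `dff` in place. If the original should stay untouched, one alternative is the `assign` method, which returns a new DataFrame. This is a sketch with a smaller made-up frame; the column name `total` is an assumption for illustration.

```python
import pandas as pd

base = pd.DataFrame({"one": [1, 2, 3], "two": [4, 5, 6]})

# assign returns a new DataFrame; `base` itself is left unchanged.
with_total = base.assign(total=base["one"] + base["two"])

print(with_total)
print(list(base.columns))
```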


# Column Deletion
Columns can be deleted or popped; let us take an example to understand how.

Example

In [39]:
my_df = dff.copy()  # work on a copy so that deletions do not also alter dff
print(my_df)

   one  two  three  four  Five
0    1    4      7     1     8
1    2    5      8   121   129
2    3    6      9     3    12
3    4    7     10    14    24
4    5    8     11    15    26


### `del` statement

In [40]:
del my_df['Five']
print(my_df)

   one  two  three  four
0    1    4      7     1
1    2    5      8   121
2    3    6      9     3
3    4    7     10    14
4    5    8     11    15


The column `Five` has been deleted using the `del` statement.

### `pop()` method

In [42]:
my_df.pop('four')
my_df

   one  two  three
0    1    4      7
1    2    5      8
2    3    6      9
3    4    7     10
4    5    8     11
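
Two details of column deletion are easy to miss: `pop` returns the removed column, and `drop(columns=...)` returns a new DataFrame instead of changing the original. The sketch below shows both on a small made-up frame; the names `frame`, `popped`, and `trimmed` are assumptions for this example.

```python
import pandas as pd

frame = pd.DataFrame({"one": [1, 2], "two": [3, 4], "three": [5, 6]})

# pop removes the column in place and hands it back as a Series.
popped = frame.pop("three")
print(popped.tolist())

# drop returns a new DataFrame; `frame` keeps its 'two' column.
trimmed = frame.drop(columns=["two"])
print(list(trimmed.columns))
print(list(frame.columns))
```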


# Row Selection, Addition, and Deletion
We will now look at row selection, addition, and deletion through examples, beginning with selection.

# Selection by Label
Rows can be selected by passing a row label to the `loc` indexer.

In [6]:
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']), 
   'two' : pd.Series([1, 2, 3, 4],index = ['a','b','c','d'])}
df = pd.DataFrame(d)
print(df)

   one  two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4


In [8]:
df.loc['b']

one    2.0
two    2.0
Name: b, dtype: float64

# Selection by integer location
Rows can be selected by passing an integer position to the `iloc` indexer.

In [9]:
df.iloc[2]

one    3.0
two    3.0
Name: c, dtype: float64

In [10]:
df.iloc[3]

one    NaN
two    4.0
Name: d, dtype: float64
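
Both indexers also accept slices and lists, which selects several rows at once. One difference worth sketching: a label slice with `loc` includes its end label, while a position slice with `iloc` excludes its end position. The frame below is a simplified stand-in for the one above.

```python
import pandas as pd

letters = pd.DataFrame({"one": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

by_label = letters.loc["a":"c"]      # rows a, b and c (end label included)
by_position = letters.iloc[0:2]      # rows at positions 0 and 1 (end excluded)
picked = letters.loc[["a", "d"]]     # an explicit list of labels

print(by_label)
print(by_position)
print(picked)
```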

# Addition of Rows
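Add new rows to a DataFrame by concatenating another DataFrame onto it with `pd.concat`. The older `DataFrame.append` method did the same job, but it was deprecated and then removed in pandas 2.0. The new rows are placed at the end.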

In [11]:
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])


In [12]:
df

   a  b
0  1  2
1  3  4


In [13]:
df2

   a  b
0  5  6
1  7  8


In [19]:
new_df = pd.concat([df, df2])  # DataFrame.append was removed in pandas 2.0
print(new_df)

   a  b
0  1  2
1  3  4
0  5  6
1  7  8
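
The result above keeps the original row labels, so `0` and `1` each appear twice. If a fresh, unique index is preferred, `pd.concat` accepts `ignore_index=True`. A minimal sketch, using the hypothetical names `top`, `bottom`, and `renumbered`:

```python
import pandas as pd

top = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
bottom = pd.DataFrame([[5, 6], [7, 8]], columns=["a", "b"])

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
renumbered = pd.concat([top, bottom], ignore_index=True)
print(renumbered)
```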




# Deletion of Rows
Use an index label to delete (drop) rows from a DataFrame. If the label is duplicated, every row carrying that label is dropped.

In the example above, the labels are duplicated. Let us drop one label and see how many rows are removed.

In [20]:
new_df

   a  b
0  1  2
1  3  4
0  5  6
1  7  8


In [22]:
new_df.drop(1)

   a  b
0  1  2
0  5  6
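
Dropping label `1` removed two rows because the label occurs twice. If only a single row should go, one possible approach is to make the index unique first with `reset_index(drop=True)`. This is a sketch, and the names `stacked`, `both_gone`, and `one_gone` are assumptions for this example.

```python
import pandas as pd

top = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
bottom = pd.DataFrame([[5, 6], [7, 8]], columns=["a", "b"])
stacked = pd.concat([top, bottom])   # index labels are 0, 1, 0, 1

# Label 1 is duplicated, so both matching rows are dropped.
both_gone = stacked.drop(1)

# After renumbering, label 1 refers to exactly one row.
one_gone = stacked.reset_index(drop=True).drop(1)

print(both_gone)
print(one_gone)
```

Note that `drop` returns a new DataFrame in both cases; `stacked` itself is unchanged.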
