In the previous section, you saw how to use the Pandas Series, one of the core objects in Pandas. In this section, you will learn about another object, the Data Frame. Hopefully, you've started to notice a pattern of 1-dimensional and 2-dimensional data structures, and of labelled versus sequential (0, 1, 2, 3, 4...) indexes. A Pandas Series is a 1-dimensional labelled structure; a Data Frame is a 2-dimensional labelled structure. Like a Series, a Data Frame is built on top of NumPy arrays, which makes numerical processing faster than with Python lists and dictionaries.

## Data Frames

Data Frames are 2-dimensional arrays with attached row and column labels. Unlike a plain NumPy array, a Data Frame generally holds different data types in different columns and may contain missing data. A Data Frame is very similar to a spreadsheet.

The table below shows the structure of a Data Frame.

![image.png](attachment:image.png)

## How to construct Data Frame objects

You can create a Data Frame from lists, dictionaries, arrays, Series or combinations of these. The most common way to create a new Data Frame is to call the DataFrame() constructor and pass it a dictionary, but you'll look at all of these options in this section.
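As a minimal sketch of the dictionary approach, the example below builds a small Data Frame in which each dictionary key becomes a column label. The column names and values (`product`, `price`, `in_stock`) are made up for illustration only. It also shows that each column can have its own data type and that a column can contain a missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: each key becomes a column label,
# and each list becomes the values of that column.
data = {
    "product": ["pen", "notebook", "stapler"],
    "price": [1.5, 4.0, np.nan],      # np.nan marks a missing value
    "in_stock": [True, False, True],
}

df = pd.DataFrame(data)

print(df)
print(df.dtypes)        # one data type per column
print(df.isna().sum())  # number of missing values in each column
```

Because no row labels were passed, the rows receive the default integer labels, while the columns take their labels from the dictionary keys.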

As a Data Frame is an object, it has attributes and methods that you will explore throughout this section.

The first step in working with Data Frames in Pandas is to import the Pandas and NumPy libraries.

In [3]:
import pandas as pd
import numpy as np

# Create a DataFrame from a two-dimensional array
df1 = pd.DataFrame(np.array([[6.5, 90.3], [3.6, 3.2]]))
df1

     0     1
0  6.5  90.3
1  3.6   3.2


In [4]:
# Create a DataFrame from a list of Series objects
df2 = pd.DataFrame([pd.Series(np.arange(1, 6)),     # First row: values 1 to 5
                    pd.Series(np.arange(6, 11)),    # Second row: values 6 to 10
                    pd.Series(np.arange(11, 16))])  # Third row: values 11 to 15

df2

    0   1   2   3   4
0   1   2   3   4   5
1   6   7   8   9  10
2  11  12  13  14  15


Note: The rows and columns are labelled with numbers. When you don't specify labels for the rows and/or columns, Pandas creates a range of integers starting from 0 and uses it as the labels.
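To contrast the default integer labels with explicit ones, here is a small sketch that passes the `index` and `columns` parameters of the DataFrame() constructor. The label names (`row_a`, `col_x`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

values = np.array([[6.5, 90.3], [3.6, 3.2]])

# No labels given: Pandas falls back to 0, 1, 2, ... for rows and columns.
df_default = pd.DataFrame(values)

# Explicit labels (hypothetical names) for rows and columns.
df_labelled = pd.DataFrame(values,
                           index=["row_a", "row_b"],
                           columns=["col_x", "col_y"])

print(list(df_default.index), list(df_default.columns))
print(list(df_labelled.index), list(df_labelled.columns))

# Labels can then be used to look up a single value.
print(df_labelled.loc["row_b", "col_x"])
```

The underlying values are the same in both Data Frames; only the labels attached to the rows and columns differ.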

As with NumPy arrays, you can use the shape attribute to get the dimensions of a Data Frame.

In [5]:
# getting the shape of df2
df2.shape
# output (3,5) -> 3 rows and 5 columns

(3, 5)
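
Since `shape` is a plain tuple, it can be unpacked into separate row and column counts. The sketch below rebuilds a Data Frame with the same values as df2 (using `reshape` as a shortcut, which is an assumption made for brevity, not the construction used above) and compares `shape` with two related ways of measuring a Data Frame.

```python
import numpy as np
import pandas as pd

# Same values as df2, built here from a reshaped array for brevity.
df = pd.DataFrame(np.arange(1, 16).reshape(3, 5))

n_rows, n_cols = df.shape   # unpack the (rows, columns) tuple
print(n_rows, n_cols)

print(len(df))              # number of rows only
print(df.size)              # total number of elements (rows * columns)

# The shape matches that of the underlying NumPy array.
print(df.to_numpy().shape == df.shape)
```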