# Create Pandas DataFrame from Python List

## Overview
In this class, you will learn how to convert a Python **List** into a **Pandas DataFrame**.  
We will cover:
- Creating a DataFrame from different types of lists.
- Handling a single list and multiple lists.
- Adding lists as rows or columns in a DataFrame.

---

## What is a List in Python?
- A **List** is a built-in Python data structure that stores an ordered sequence of values.
- It can contain **heterogeneous elements** (values of different types).
- Lists are **mutable**: elements can be added, removed, or replaced in place.
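
The points above can be illustrated with a short sketch. The variable names here are illustrative only.

```python
# A list can mix value types and can be changed in place.
mixed = ['Apple', 10, 55.5, True]

mixed.append('Orange')  # add an element at the end
mixed[1] = 12           # replace an existing element

# Collect the type name of every element to show the mix.
element_types = [type(item).__name__ for item in mixed]

print(mixed)
print(element_types)
```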

---

## Why Convert a List to a Pandas DataFrame?
- A **DataFrame** provides a **structured, tabular format** for efficient data analysis.
- Converting a List into a **2D structure** makes data easier to process.
- Pandas offers various functions to **manipulate, analyze, and visualize** data efficiently.

---

## How to Create a DataFrame from a List?
- Use the **`pd.DataFrame()`** constructor from the Pandas library.
- Different cases of DataFrame creation include:
  - **Single list** → Converts into a single-column DataFrame.
  - **Multiple lists** → Each list becomes a separate column or row.
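
As a minimal sketch of the multiple-list case, the snippet below passes the same two lists once as rows and once as columns. The list names and column labels are assumptions chosen for illustration.

```python
import pandas as pd

names = ['Apple', 'Banana', 'Orange']
prices = [120, 40, 80]

# Each list becomes a row: pass a list of lists.
as_rows = pd.DataFrame([names, prices])

# Each list becomes a column: pass a dict that maps labels to lists.
as_columns = pd.DataFrame({'Name': names, 'Price': prices})

print(as_rows.shape)     # (rows, columns) when lists are rows
print(as_columns.shape)  # (rows, columns) when lists are columns
```

Passing lists as rows mixes strings and numbers inside each column, so the column layout is usually the more convenient one for analysis.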

---


## Create DataFrame from list using constructor

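The **DataFrame constructor** can create a DataFrame from different Python data structures such as **`dict`**, list, tuple, and **`ndarray`**. A set is unordered, so convert it to a list first (for example with `sorted()`) before passing it to the constructor.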

**Example:**

Here we create a DataFrame from a list of heterogeneous data. By default, each list element becomes one row of a single column. The column label is 0, and the row index is a range of integers starting at 0.

In [None]:
import pandas as pd
import numpy as np

# Create list
fruits_list = ['Apple', 10, 'Orange', 55.50]
print(fruits_list)

# Create DataFrame from list
fruits_df = pd.DataFrame(fruits_list)
fruits_df

['Apple', 10, 'Orange', 55.5]


| | 0 |
|---|---|
| 0 | Apple |
| 1 | 10 |
| 2 | Orange |
| 3 | 55.5 |


In [None]:
type(fruits_list), type(fruits_df)

(list, pandas.core.frame.DataFrame)
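
The constructor accepts other sequence-like inputs in the same way. The following is a small sketch, assuming a tuple and a NumPy array built from the same values; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

values = ['Apple', 'Banana', 'Orange']

# A tuple is treated like a list: one element per row.
from_tuple = pd.DataFrame(tuple(values))

# A one-dimensional ndarray also becomes a single column.
from_array = pd.DataFrame(np.array(values))

print(from_tuple.shape)
print(from_array.shape)
```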

## Create DataFrame from list with a customized column name

While creating a DataFrame from a list, we can give the column a custom label. By default, columns are labeled with a range of integers: 0, 1, 2, …, n.

We pass the column labels through the **`columns=[col_labels]`** parameter of the DataFrame constructor.

**Example:**

In the example below, we create a DataFrame from a list of fruit names and provide the column label **`Fruits`**.

In [None]:
import pandas as pd

# Create list
fruits_list = ['Apple', 'Banana', 'Orange','Mango']
print(fruits_list)

# Create DataFrame from list
fruits_df = pd.DataFrame(fruits_list, columns=['Fruits'])
fruits_df

['Apple', 'Banana', 'Orange', 'Mango']


| | Fruits |
|---|---|
| 0 | Apple |
| 1 | Banana |
| 2 | Orange |
| 3 | Mango |


In [None]:
fruits_df['Fruits']

0     Apple
1    Banana
2    Orange
3     Mango
Name: Fruits, dtype: object

In [None]:
type(fruits_df['Fruits'])
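
Once a column has a label, it can be selected by name. The sketch below, which rebuilds the same one-column DataFrame, shows that single brackets return a Series while double brackets return a DataFrame.

```python
import pandas as pd

fruits_df = pd.DataFrame(['Apple', 'Banana', 'Orange', 'Mango'],
                         columns=['Fruits'])

# Single brackets select one column as a Series.
as_series = fruits_df['Fruits']

# Double brackets (a list of labels) keep the result as a DataFrame.
as_frame = fruits_df[['Fruits']]

print(type(as_series).__name__)
print(type(as_frame).__name__)
```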

## Create DataFrame from list with a customized index

Just as we changed the column label, we can also customize the row index. A meaningful row index identifies each row uniquely and makes rows easier to access by label.

We pass the row labels through the **`index=[row_index1, row_index2]`** parameter of the DataFrame constructor. By default, the row index is a range of integers: 0, 1, 2, …, n.

**Example:**

Let’s see how to provide a custom row index while creating a DataFrame from a list.

In [None]:
import pandas as pd

# Create list
fruits_list = ['Apple', 'Banana', 'Orange','Mango']
print(fruits_list)

# Create DataFrame from list
fruits_df = pd.DataFrame(fruits_list, index=['Fruit1', 'Fruit2', 'Fruit3', 'Fruit4'])
fruits_df

['Apple', 'Banana', 'Orange', 'Mango']


| | 0 |
|---|---|
| Fruit1 | Apple |
| Fruit2 | Banana |
| Fruit3 | Orange |
| Fruit4 | Mango |


In [None]:
fruits_df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 4 entries, Fruit1 to Fruit4
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   0       4 non-null      object
dtypes: object(1)
memory usage: 64.0+ bytes
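
To illustrate why a custom index makes rows easier to access, here is a minimal sketch that looks up the same cell by label with `.loc` and by position with `.iloc`. The `Fruits` column label is added here for readability and is not part of the example above.

```python
import pandas as pd

fruits_df = pd.DataFrame(['Apple', 'Banana', 'Orange', 'Mango'],
                         index=['Fruit1', 'Fruit2', 'Fruit3', 'Fruit4'],
                         columns=['Fruits'])

# Label-based lookup: row label first, then column label.
by_label = fruits_df.loc['Fruit2', 'Fruits']

# Position-based lookup still works regardless of the labels.
by_position = fruits_df.iloc[1, 0]

print(by_label, by_position)
```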


## Create DataFrame from list by changing data type

While converting a Python list to a DataFrame, we may need to change the data type of the values.

The **`dtype`** parameter of the DataFrame constructor requests a data type for the resulting DataFrame. An existing column can also be converted afterwards with **`astype()`**.

**Example:**

Suppose we have a list of fruit prices stored as strings, so the resulting column has type **object**, and we need a numeric type such as **int64** instead. The example below first tries the **`dtype`** parameter. In the run shown here that raises a `ValueError`, so the column is then converted with **`astype()`**.

In [None]:
import pandas as pd

# Create list
price_list = ['50', '100', '60', '20']
print(price_list)

# Create DataFrame from list
price_df = pd.DataFrame(price_list)
print("Data type before : ", price_df.dtypes)


['50', '100', '60', '20']
Data type before :  0    object
dtype: object


In [None]:
display(price_df)

| | 0 |
|---|---|
| 0 | 50 |
| 1 | 100 |
| 2 | 60 |
| 3 | 20 |


In [None]:
type(price_df)

In [None]:
price_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   0       4 non-null      object
dtypes: object(1)
memory usage: 160.0+ bytes


In [None]:

# Create DataFrame from list with type change
price_df = pd.DataFrame(price_list, dtype=np.int64)
print("Data type after : ", price_df.dtypes)


ValueError: Trying to coerce float values to integers

In [None]:
price_df[0] = price_df[0].astype("int64")

In [None]:
display(price_df)

| | 0 |
|---|---|
| 0 | 50 |
| 1 | 100 |
| 2 | 60 |
| 3 | 20 |


In [None]:
price_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   0       4 non-null      int64
dtypes: int64(1)
memory usage: 160.0 bytes
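
Another option for string-to-number conversion is `pd.to_numeric`, shown here as a hedged alternative rather than the approach used above. The `Price` column label and the `messy` series are made up for this sketch.

```python
import pandas as pd

price_list = ['50', '100', '60', '20']
price_df = pd.DataFrame(price_list, columns=['Price'])

# pd.to_numeric parses each string and picks a suitable numeric dtype.
price_df['Price'] = pd.to_numeric(price_df['Price'])
print(price_df['Price'].dtype)

# errors='coerce' turns unparseable entries into NaN instead of raising.
messy = pd.to_numeric(pd.Series(['50', 'n/a', '60']), errors='coerce')
print(messy.isna().sum())
```

`errors='coerce'` is useful when a list may contain placeholders such as `'n/a'`, at the cost of silently producing missing values that need to be checked afterwards.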


## Create DataFrame from multiple lists

A common use case is having several separate lists that need to become different columns of one DataFrame. There are two ways to do this:

1. **`zip(list1, list2...)`**
2. **`dict { 'col1' : list1, 'col2' : list2}`**

**Example:**

The example below uses the **`zip()`** function to combine multiple lists into one list of tuples and passes it to the DataFrame constructor.

In [None]:
import pandas as pd

# Create multiple lists
fruits_list = ['Apple', 'Banana', 'Orange', 'Mango']
price_list = [120, 40, 80, 500]

# Create DataFrame
fruits_df = pd.DataFrame(list(zip(fruits_list, price_list)), columns=['Name', 'Price'])
fruits_df

| | Name | Price |
|---|---|---|
| 0 | Apple | 120 |
| 1 | Banana | 40 |
| 2 | Orange | 80 |
| 3 | Mango | 500 |
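
One caveat with `zip()` is that it stops at the shortest input, so a length mismatch silently drops rows. The sketch below uses a deliberately shortened price list to show this; the length check is a suggested precaution, not part of the original example.

```python
import pandas as pd

fruits_list = ['Apple', 'Banana', 'Orange', 'Mango']
price_list = [120, 40, 80]  # one price is missing on purpose

# Checking lengths first makes the mismatch visible.
lengths_match = len(fruits_list) == len(price_list)
print("Lengths match:", lengths_match)

# zip() pairs items only until the shorter list runs out.
fruits_df = pd.DataFrame(list(zip(fruits_list, price_list)),
                         columns=['Name', 'Price'])
print(len(fruits_df))
```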


In [None]:
# A dict maps keys to values; looking up a key returns its stored value
dict1 = {'class1': 50, 'class2': 30, 'class3': 40}

In [None]:
type(dict1)

dict

In [None]:
dict1['class1']

50

The example below uses a Python dictionary for the same purpose. The dict keys become the column names, and the lists stored as dict values become the column data.

In [None]:
import pandas as pd

# Create multiple lists
fruits_list = ['Apple', 'Banana', 'Orange', 'Mango']
price_list = [120, 40, 80, 500]

# Create dict
fruits_dict = {'Name': fruits_list,
               'Price': price_list}
print(fruits_dict)

# Create DataFrame from dict
fruits_df = pd.DataFrame(fruits_dict)
fruits_df


{'Name': ['Apple', 'Banana', 'Orange', 'Mango'], 'Price': [120, 40, 80, 500]}


| | Name | Price |
|---|---|---|
| 0 | Apple | 120 |
| 1 | Banana | 40 |
| 2 | Orange | 80 |
| 3 | Mango | 500 |
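
Unlike `zip()`, the dict approach requires all lists to have the same length. As a small sketch with a deliberately shortened price list, the constructor raises a `ValueError` instead of dropping data.

```python
import pandas as pd

fruits_list = ['Apple', 'Banana', 'Orange', 'Mango']
price_list = [120, 40, 80]  # one price is missing on purpose

raised = False
try:
    pd.DataFrame({'Name': fruits_list, 'Price': price_list})
except ValueError as error:
    # pandas refuses to build columns of unequal length.
    raised = True
    print("Could not build DataFrame:", error)
```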


In [None]:
fruits_dict['Name']

['Apple', 'Banana', 'Orange', 'Mango']

In [None]:
# shape is an attribute, not a method, so calling it raises a TypeError
fruits_df.shape()

TypeError: 'tuple' object is not callable

In [None]:
fruits_df.shape

(4, 2)

In [None]:
fruits_df.info

In [None]:
fruits_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Name    4 non-null      object
 1   Price   4 non-null      int64 
dtypes: int64(1), object(1)
memory usage: 192.0+ bytes
