# Dataframes from other Python structures

Sometimes you only have a few lines of data rather than a whole Excel sheet or similar file.

Let's make up some data and insert it into a DataFrame.

In [1]:
import pandas as pd
from collections import OrderedDict
from datetime import date

The “default” way to create a DataFrame from Python is to use a list of dictionaries. In this case each dictionary key becomes a column heading, and a default index is created automatically:

In [4]:
sales = [{'account': 'Jones LLC', 'Jan': 150, 'Feb': 200, 'Mar': 140},
         {'account': 'Alpha Co',  'Jan': 200, 'Feb': 210, 'Mar': 215},
         {'account': 'Blue Inc',  'Jan': 50,  'Feb': 90,  'Mar': 95 }]
df = pd.DataFrame(sales)
df.head()

Unnamed: 0,Feb,Jan,Mar,account
0,200,150,140,Jones LLC
1,210,200,215,Alpha Co
2,90,50,95,Blue Inc
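
If the column order matters, or some records lack a key, the plain constructor can handle both. The following is a minimal sketch: the `columns=` argument is part of the `pd.DataFrame` constructor, while the name `df_ordered` and the record with no `'Mar'` entry are made up here purely for illustration.

```python
import pandas as pd

sales = [
    {'account': 'Jones LLC', 'Jan': 150, 'Feb': 200, 'Mar': 140},
    {'account': 'Alpha Co', 'Jan': 200, 'Feb': 210, 'Mar': 215},
    # Hypothetical record with no 'Mar' entry, added only for illustration
    {'account': 'Blue Inc', 'Jan': 50, 'Feb': 90},
]

# columns= fixes the column order regardless of the key order in the dicts
df_ordered = pd.DataFrame(sales, columns=['account', 'Jan', 'Feb', 'Mar'])

# The missing 'Mar' value for the last record shows up as NaN
print(df_ordered)
```

A key that is absent from a record is filled with NaN, which also turns that column into a floating-point column.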


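As you can see, this approach is very “row oriented”, and in the output above the columns came back sorted alphabetically rather than in the order they were written (recent pandas releases running on Python 3.6 or later preserve the key order instead). If you would like to create a DataFrame in a “column oriented” manner, use from_dict; passing an OrderedDict lets you control the column order: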

In [7]:
sales = OrderedDict([ ('account', ['Jones LLC', 'Alpha Co', 'Blue Inc']),
          ('Jan', [150, 200, 50]),
          ('Feb',  [200, 210, 90]),
          ('Mar', [140, 215, 95]) ] )
df = pd.DataFrame.from_dict(sales)
df.head()

Unnamed: 0,account,Jan,Feb,Mar
0,Jones LLC,150,200,140
1,Alpha Co,200,210,215
2,Blue Inc,50,90,95
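
`from_dict` also accepts `orient='index'`, which treats each dictionary key as a row label instead of a column name. A small sketch of that variant, assuming pandas 0.23 or later (where `columns=` is accepted together with `orient='index'`); the names `sales_by_account` and `df_by_account` are hypothetical:

```python
import pandas as pd

# Each key is an account; each value holds that account's Jan, Feb, Mar figures
sales_by_account = {
    'Jones LLC': [150, 200, 140],
    'Alpha Co': [200, 210, 215],
    'Blue Inc': [50, 90, 95],
}

# orient='index' makes the keys the row index; columns= names the values
df_by_account = pd.DataFrame.from_dict(
    sales_by_account, orient='index', columns=['Jan', 'Feb', 'Mar']
)

print(df_by_account)
```

This layout is convenient when the account name is a natural lookup key, for example `df_by_account.loc['Alpha Co', 'Mar']`.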


In [8]:
# you could also have reordered the columns manually
df = df[['account', 'Jan', 'Feb', 'Mar']]

# Lists

The other option for creating your DataFrames from Python is to include the data in a list structure.

The first approach is row oriented and uses pandas from_records. It is similar to the dictionary approach, but you need to explicitly call out the column labels.

In [9]:
# each tuple is one row: (account, Jan, Feb, Mar)
sales = [('Jones LLC', 150, 200, 140),
         ('Alpha Co', 200, 210, 215),
         ('Blue Inc', 50, 90, 95)]
labels = ['account', 'Jan', 'Feb', 'Mar']
df = pd.DataFrame.from_records(sales, columns=labels)
df.head()

Unnamed: 0,account,Jan,Feb,Mar
0,Jones LLC,150,200,140
1,Alpha Co,200,210,215
2,Blue Inc,50,90,95
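
`from_records` can also promote one of the labelled fields to the index through its `index=` parameter. A brief sketch, using the sales figures from the dictionary examples; the name `df_indexed` is hypothetical:

```python
import pandas as pd

# Each tuple is one row: (account, Jan, Feb, Mar)
sales = [('Jones LLC', 150, 200, 140),
         ('Alpha Co', 200, 210, 215),
         ('Blue Inc', 50, 90, 95)]
labels = ['account', 'Jan', 'Feb', 'Mar']

# index='account' uses that field as the row index and drops it from the columns
df_indexed = pd.DataFrame.from_records(sales, columns=labels, index='account')

print(df_indexed)
```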


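The second method is from_items, which is column oriented and looks similar to the OrderedDict example above. Note that from_items was deprecated in pandas 0.23 and removed in pandas 1.0; on current versions, convert the list of pairs to a dict and pass it to from_dict instead.

In [10]:
sales = [('account', ['Jones LLC', 'Alpha Co', 'Blue Inc']),
         ('Jan', [150, 200, 50]),
         ('Feb', [200, 210, 90]),
         ('Mar', [140, 215, 95]),
         ]
# pd.DataFrame.from_items(sales) only works on pandas < 1.0;
# from_dict(dict(...)) gives the same column-oriented result
df = pd.DataFrame.from_dict(dict(sales))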
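
The row-oriented and column-oriented list layouts carry the same information, and one can be turned into the other with `zip`. The sketch below is only an illustration of that relationship; the names `rows`, `columns`, `df_rows`, and `df_cols` are made up here.

```python
import pandas as pd

labels = ['account', 'Jan', 'Feb', 'Mar']

# Row-oriented layout: one tuple per account
rows = [('Jones LLC', 150, 200, 140),
        ('Alpha Co', 200, 210, 215),
        ('Blue Inc', 50, 90, 95)]

# zip(*rows) transposes the rows into columns; pair each column with its label
columns = dict(zip(labels, (list(col) for col in zip(*rows))))

df_rows = pd.DataFrame.from_records(rows, columns=labels)
df_cols = pd.DataFrame(columns)

# Both frames hold the same values under the same column names
same_values = df_rows.to_dict('list') == df_cols.to_dict('list')
print(same_values)
```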

<img src="listdict.png">