# From dict of Series or dicts
## From dict of Series

* The resulting **index** will be the union
  of the indexes of the various Series.
* If no **columns** are passed,
  the columns will be the ordered list of dict keys.
* When a particular set of **columns** is passed
  along with a dict of data,
  the passed columns override the keys in the dict.
* The row and column labels can be accessed respectively
  by accessing the **index** and **columns** attributes.
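
The rules above can be sketched with a small example. The Series, their labels, and the names `x`, `y`, `z` are made up for illustration and do not come from the cells below.

```python
import pandas as pd

# Two Series with partially overlapping indexes (illustrative data).
d = {'x': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'y': pd.Series([10, 20], index=['b', 'd'])}

# No columns passed: the columns follow the dict key order,
# and the index is the union of both Series indexes.
df = pd.DataFrame(d)
print(list(df.index))    # ['a', 'b', 'c', 'd']
print(list(df.columns))  # ['x', 'y']

# Columns passed: they override the dict keys. 'x' is left out,
# and 'z', which has no data in the dict, is filled with NaN.
df2 = pd.DataFrame(d, columns=['y', 'z'])
print(list(df2.columns))  # ['y', 'z']
```

Labels present in only one Series still get a row; the other columns hold NaN there.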

## Links

[pandas documentation][docs] >>  
[Getting started][getting_started] >>  
[Intro to data structures][dsintro] >>  
[DataFrame][dataframe] >>  
[From dict of Series or dicts][from_dict]


[docs]: https://pandas.pydata.org/pandas-docs/stable/index.html
[getting_started]: https://pandas.pydata.org/pandas-docs/stable/getting_started/index.html
[dsintro]: https://pandas.pydata.org/pandas-docs/stable/getting_started/dsintro.html#
[dataframe]: https://pandas.pydata.org/pandas-docs/stable/getting_started/dsintro.html#dataframe
[from_dict]: https://pandas.pydata.org/pandas-docs/stable/getting_started/dsintro.html#from-dict-of-series-or-dicts

In [1]:
import numpy as np
import pandas as pd


### 1.0 dict of Series; no index is set

In [2]:
# Creates a dict of Series, each built from a list,
# with no index passed.
d = {'one': pd.Series([1, 2, 3, 4]),
     'two': pd.Series([1, 2, 3]),
     'three': pd.Series([1, 2]),
     'four': pd.Series([1])}

# Prints the dict. Iterating over dict.items()
# yields (key, value) pairs:
s = '{}:\n{}\n'
# for _ in d.items():
#     print(s.format(_[0], _[1]))
# Iterating over the keys returned by dict.keys()
# and looking up each value reads more cleanly.
for _ in d.keys():
    print(s.format(_, d[_]))


one:
0    1
1    2
2    3
3    4
dtype: int64

two:
0    1
1    2
2    3
dtype: int64

three:
0    1
1    2
dtype: int64

four:
0    1
dtype: int64



### 1.1 DataFrame from dict of Series; no index is set

In [3]:
# Creates a DataFrame from the dict of Series,
# when no index is passed.
df = pd.DataFrame(d)

# Prints the DataFrame and some of its attributes.
s = '{}\n\nindex: {}\ncolumns: {}\nshape: {}'
print(s.format(df, list(df.index), list(df.columns), df.shape))

# But the plain output of a DataFrame looks nicer
# because of its table format.
df


   one  two  three  four
0    1  1.0    1.0   1.0
1    2  2.0    2.0   NaN
2    3  3.0    NaN   NaN
3    4  NaN    NaN   NaN

index: [0, 1, 2, 3]
columns: ['one', 'two', 'three', 'four']
shape: (4, 4)


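|   | one | two | three | four |
|---|-----|-----|-------|------|
| 0 | 1 | 1.0 | 1.0 | 1.0 |
| 1 | 2 | 2.0 | 2.0 | NaN |
| 2 | 3 | 3.0 | NaN | NaN |
| 3 | 4 | NaN | NaN | NaN |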
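
The float values in the output above come from the way missing labels are filled. A minimal sketch, reusing only the first two Series of the dict, shows that a column with a missing label holds NaN and is therefore stored as a float column, while a complete column keeps its integer dtype.

```python
import pandas as pd

d = {'one': pd.Series([1, 2, 3, 4]),
     'two': pd.Series([1, 2, 3])}
df = pd.DataFrame(d)

# 'one' covers every row label, so it keeps an integer dtype.
# 'two' has no value for label 3, so that cell becomes NaN
# and the whole column is stored as float64.
print(df.dtypes)
print(df['two'].isna().sum())  # 1
```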


### 1.2 DataFrame from dict of Series; index is set

In [4]:
# Sets a nested list,
# which is actually a list of index lists.
i = [[0, 1, 2, 3],
     [3, 2, 1, 0],
     [0, 1, 2],
     [1, 2, 3],
     np.random.randint(0, 3 + 1, 10)]

# Creates a list of DataFrames, iterating over the nested list,
# when data is the dict of Series, and index is passed by the list.
df = [pd.DataFrame(d, index=_)
      for _ in i]

# Prints the DataFrames together with some of their attributes,
# iterating over the list of DataFrames.
s = '{}\n\nindex: {}\nshape: {}\n\n'
for _ in df:
    print(s.format(_, list(_.index), _.shape))


   one  two  three  four
0    1  1.0    1.0   1.0
1    2  2.0    2.0   NaN
2    3  3.0    NaN   NaN
3    4  NaN    NaN   NaN

index: [0, 1, 2, 3]
shape: (4, 4)


   one  two  three  four
3    4  NaN    NaN   NaN
2    3  3.0    NaN   NaN
1    2  2.0    2.0   NaN
0    1  1.0    1.0   1.0

index: [3, 2, 1, 0]
shape: (4, 4)


   one  two  three  four
0    1    1    1.0   1.0
1    2    2    2.0   NaN
2    3    3    NaN   NaN

index: [0, 1, 2]
shape: (3, 4)


   one  two  three  four
1    2  2.0    2.0   NaN
2    3  3.0    NaN   NaN
3    4  NaN    NaN   NaN

index: [1, 2, 3]
shape: (3, 4)


   one  two  three  four
2    3  3.0    NaN   NaN
0    1  1.0    1.0   1.0
2    3  3.0    NaN   NaN
0    1  1.0    1.0   1.0
0    1  1.0    1.0   1.0
3    4  NaN    NaN   NaN
2    3  3.0    NaN   NaN
2    3  3.0    NaN   NaN
1    2  2.0    2.0   NaN
2    3  3.0    NaN   NaN

index: [2, 0, 2, 0, 0, 3, 2, 2, 1, 2]
shape: (10, 4)




### 1.3 DataFrame from dict of Series; index and columns are set

In [5]:
# Sets a list of tuples of index and columns lists.
i = [([0, 1, 2, 3], ['one', 'two', 'three', 'four']),
     ([3, 2, 1, 0], ['four', 'three', 'two', 'one']),
     ([0, 1, 2], ['one', 'two', 'three']),
     ([1, 2, 3], ['two', 'three', 'four']),
     (np.random.randint(0, 3 + 1, 10),
      ['one', 'two', 'three', 'four', 'five'])]

# Creates a list of DataFrames,
# iterating over the list of tuples,
# when data is a dict of Series,
# index and columns are passed by the list.
df = [pd.DataFrame(d, index=_[0], columns=_[1])
      for _ in i]

# Prints the DataFrames together with some of their attributes,
# iterating over the list of DataFrames.
s = '{}\n\nindex: {}\ncolumns: {}\nshape: {}\n\n'
for _ in df:
    print(s.format(_, list(_.index), list(_.columns), _.shape))


   one  two  three  four
0    1  1.0    1.0   1.0
1    2  2.0    2.0   NaN
2    3  3.0    NaN   NaN
3    4  NaN    NaN   NaN

index: [0, 1, 2, 3]
columns: ['one', 'two', 'three', 'four']
shape: (4, 4)


   four  three  two  one
3   NaN    NaN  NaN    4
2   NaN    NaN  3.0    3
1   NaN    2.0  2.0    2
0   1.0    1.0  1.0    1

index: [3, 2, 1, 0]
columns: ['four', 'three', 'two', 'one']
shape: (4, 4)


   one  two  three
0    1    1    1.0
1    2    2    2.0
2    3    3    NaN

index: [0, 1, 2]
columns: ['one', 'two', 'three']
shape: (3, 3)


   two  three  four
1  2.0    2.0   NaN
2  3.0    NaN   NaN
3  NaN    NaN   NaN

index: [1, 2, 3]
columns: ['two', 'three', 'four']
shape: (3, 3)


   one  two  three  four five
0    1  1.0    1.0   1.0  NaN
1    2  2.0    2.0   NaN  NaN
0    1  1.0    1.0   1.0  NaN
1    2  2.0    2.0   NaN  NaN
2    3  3.0    NaN   NaN  NaN
1    2  2.0    2.0   NaN  NaN
3    4  NaN    NaN   NaN  NaN
0    1  1.0    1.0   1.0  NaN
1    2  2.0    2.0   NaN  NaN
2    3  3.0    NaN   NaN  NaN

index: [0, 1, 0, 1, 2, 1, 3, 0, 1, 2]
columns: ['one', 'two', 'three', 'four', 'five']
shape: (10, 5)

### 2.0 dict of Series; index is set

In [6]:
# Creates a dict of Series, each built from a list,
# with an index passed.
d = {'one': pd.Series([1, 2, 3, 4],
                      index=list('abcd')),
     'two': pd.Series([1, 2, 3],
                      index=list('abc')),
     'three': pd.Series([1, 2, 3],
                        index=list('bcd')),
     'four': pd.Series([1],
                       index=list('c'))}

# Prints the dict.
s = '{}:\n{}\n'
for _ in d.keys():
    print(s.format(_, d[_]))


one:
a    1
b    2
c    3
d    4
dtype: int64

two:
a    1
b    2
c    3
dtype: int64

three:
b    1
c    2
d    3
dtype: int64

four:
c    1
dtype: int64



### 2.1 DataFrame from dict of Series; no index is set

In [7]:
# Creates a DataFrame from the dict of Series,
# when no index is passed.
df = pd.DataFrame(d)

# Prints the DataFrame and some of its attributes.
s = '{}\n\nindex: {}\ncolumns: {}\nshape: {}'
print(s.format(df, list(df.index), list(df.columns), df.shape))


   one  two  three  four
a    1  1.0    NaN   NaN
b    2  2.0    1.0   NaN
c    3  3.0    2.0   1.0
d    4  NaN    3.0   NaN

index: ['a', 'b', 'c', 'd']
columns: ['one', 'two', 'three', 'four']
shape: (4, 4)


### 2.2 DataFrame from dict of Series; index is set

In [8]:
# Sets a nested list,
# which is actually a list of index lists.
i = [list('abcd'),
     list('dcba'),
     list('abc'),
     list('bcd'),
     list('abracadabra')]

# Creates a list of DataFrames,
# iterating over the nested list,
# when data is the dict of Series,
# and index is passed by the list.
df = [pd.DataFrame(d, index=_)
      for _ in i]

# Prints the DataFrames together with some of their attributes,
# iterating over the list of DataFrames.
s = '{}\n\nindex: {}\nshape: {}\n\n'
for _ in df:
    print(s.format(_, list(_.index), _.shape))


   one  two  three  four
a    1  1.0    NaN   NaN
b    2  2.0    1.0   NaN
c    3  3.0    2.0   1.0
d    4  NaN    3.0   NaN

index: ['a', 'b', 'c', 'd']
shape: (4, 4)


   one  two  three  four
d    4  NaN    3.0   NaN
c    3  3.0    2.0   1.0
b    2  2.0    1.0   NaN
a    1  1.0    NaN   NaN

index: ['d', 'c', 'b', 'a']
shape: (4, 4)


   one  two  three  four
a    1    1    NaN   NaN
b    2    2    1.0   NaN
c    3    3    2.0   1.0

index: ['a', 'b', 'c']
shape: (3, 4)


   one  two  three  four
b    2  2.0      1   NaN
c    3  3.0      2   1.0
d    4  NaN      3   NaN

index: ['b', 'c', 'd']
shape: (3, 4)


   one  two  three  four
a  1.0  1.0    NaN   NaN
b  2.0  2.0    1.0   NaN
r  NaN  NaN    NaN   NaN
a  1.0  1.0    NaN   NaN
c  3.0  3.0    2.0   1.0
a  1.0  1.0    NaN   NaN
d  4.0  NaN    3.0   NaN
a  1.0  1.0    NaN   NaN
b  2.0  2.0    1.0   NaN
r  NaN  NaN    NaN   NaN
a  1.0  1.0    NaN   NaN

index: ['a', 'b', 'r', 'a', 'c', 'a', 'd', 'a', 'b', 'r', 'a']
shape: (11, 4)



### 2.3 DataFrame from dict of Series; index and columns are set

In [9]:
# Sets a list of tuples of index and columns lists.
i = [(list('abcd'), ['one', 'two', 'three', 'four']),
     (list('dcba'), ['four', 'three', 'two', 'one']),
     (list('abc'), ['one', 'two', 'three']),
     (list('bcd'), ['two', 'three', 'four']),
     (list('abracadabra'), ['one', 'two', 'three', 'four', 'five'])]

# Creates a list of DataFrames,
# iterating over the list of tuples,
# when data is a dict of Series,
# index and columns are passed by the list.
df = [pd.DataFrame(d, index=_[0], columns=_[1])
      for _ in i]

# Prints the DataFrames together with some of their attributes,
# iterating over the list of DataFrames.
s = '{}\n\nindex: {}\ncolumns: {}\nshape: {}\n\n'
for _ in df:
    print(s.format(_, list(_.index), list(_.columns), _.shape))


   one  two  three  four
a    1  1.0    NaN   NaN
b    2  2.0    1.0   NaN
c    3  3.0    2.0   1.0
d    4  NaN    3.0   NaN

index: ['a', 'b', 'c', 'd']
columns: ['one', 'two', 'three', 'four']
shape: (4, 4)


   four  three  two  one
d   NaN    3.0  NaN    4
c   1.0    2.0  3.0    3
b   NaN    1.0  2.0    2
a   NaN    NaN  1.0    1

index: ['d', 'c', 'b', 'a']
columns: ['four', 'three', 'two', 'one']
shape: (4, 4)


   one  two  three
a    1    1    NaN
b    2    2    1.0
c    3    3    2.0

index: ['a', 'b', 'c']
columns: ['one', 'two', 'three']
shape: (3, 3)


   two  three  four
b  2.0      1   NaN
c  3.0      2   1.0
d  NaN      3   NaN

index: ['b', 'c', 'd']
columns: ['two', 'three', 'four']
shape: (3, 3)


   one  two  three  four five
a  1.0  1.0    NaN   NaN  NaN
b  2.0  2.0    1.0   NaN  NaN
r  NaN  NaN    NaN   NaN  NaN
a  1.0  1.0    NaN   NaN  NaN
c  3.0  3.0    2.0   1.0  NaN
a  1.0  1.0    NaN   NaN  NaN
d  4.0  NaN    3.0   NaN  NaN
a  1.0  1.0    NaN   NaN  NaN
b  2.0  2.0    1.0   NaN  NaN
r  NaN  NaN   