- Title: Add Columns into a pandas DataFrame
- Slug: python-pandas-add-column
- Date: 2020-04-13
- Category: Computer Science
- Tags: programming, Python, pandas, DataFrame, add, insert, column
- Author: Ben Du
- Modified: 2020-04-13


## Comment

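When a Series is assigned as a new column of a DataFrame,
its values are aligned with the DataFrame's index by label, not by position.
Row labels that are missing from the Series get NaN,
and Series labels that are not in the DataFrame's index are dropped.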

In [1]:
import pandas as pd

In [3]:
df = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [5, 4, 3, 2, 1]})
df.head()

   x  y
0  1  5
1  2  4
2  3  3
3  4  2
4  5  1


In [6]:
s = pd.Series([10, 20, 30, 40, 50], index=[4, 3, 2, 1, 0])
s

4    10
3    20
2    30
1    40
0    50
dtype: int64

In [7]:
df['z'] = s

In [8]:
df

   x  y   z
0  1  5  50
1  2  4  40
2  3  3  30
3  4  2  20
4  5  1  10


In [9]:
s2 = pd.Series([10, 20, 30, 40, 50], index=[40, 30, 20, 10, 0])
s2

40    10
30    20
20    30
10    40
0     50
dtype: int64

In [13]:
df['s2'] = s2
df

   x  y   z    s2
0  1  5  50  50.0
1  2  4  40   NaN
2  3  3  30   NaN
3  4  2  20   NaN
4  5  1  10   NaN
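
Only label 0 is shared between the DataFrame and `s2`, which is why every other row is NaN. If the values should instead be placed by position, one option is to drop the Series index before assigning. The following is a minimal sketch; the column names `aligned` and `positional` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [5, 4, 3, 2, 1]})
s2 = pd.Series([10, 20, 30, 40, 50], index=[40, 30, 20, 10, 0])

# Assigning the Series directly aligns on index labels; only label 0 matches.
df['aligned'] = s2

# Converting to a NumPy array discards the index, so values go in row order.
df['positional'] = s2.to_numpy()

print(df)
```

Note that the positional form requires the Series to have exactly as many values as the DataFrame has rows.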


In [15]:
df['s3'] = [10, 20, 30, 40, 50]
df

   x  y   z    s2  s3
0  1  5  50  50.0  10
1  2  4  40   NaN  20
2  3  3  30   NaN  30
3  4  2  20   NaN  40
4  5  1  10   NaN  50
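
A plain list carries no index, so it is assigned strictly by position and must have one item per row. The sketch below uses a DataFrame with a reversed index to make the positional behavior visible; the column names `from_list` and `too_short` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3, 4, 5]}, index=[4, 3, 2, 1, 0])

# A list has no labels, so its items are assigned in row order:
# the first item goes to the first row (label 4), and so on.
df['from_list'] = [10, 20, 30, 40, 50]

# A list whose length differs from the number of rows is rejected.
try:
    df['too_short'] = [10, 20]
    length_error = None
except ValueError as err:
    length_error = err

print(df)
print(length_error)
```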


In [17]:
max(s, s2)

ValueError: Can only compare identically-labeled Series objects

In [18]:
s

4    10
3    20
2    30
1    40
0    50
dtype: int64

In [25]:
s1 = pd.Series([1, 2, 3, 4, 5], index=[4, 3, 2, 1, 0])

In [26]:
s1

4    1
3    2
2    3
1    4
0    5
dtype: int64

In [27]:
max(s, s1)

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
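
Both errors seem to come from the way the built-in `max` works: it compares its arguments with `>` and expects a single boolean back. With differently labeled Series the comparison itself fails, and with identically labeled Series the comparison returns a boolean Series that cannot be reduced to one truth value. A minimal sketch of some alternatives, depending on what "maximum" is meant to be (the variable names are made up for illustration):

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=[4, 3, 2, 1, 0])
s1 = pd.Series([1, 2, 3, 4, 5], index=[4, 3, 2, 1, 0])
s2 = pd.Series([10, 20, 30, 40, 50], index=[40, 30, 20, 10, 0])

# Same labels: np.maximum compares element by element and keeps the index.
same_labels_max = np.maximum(s, s1)

# Different labels: place the Series side by side (aligned on the union of
# labels), then take the row-wise maximum; missing values are skipped.
union_max = pd.concat([s, s2], axis=1).max(axis=1)

# A single scalar maximum over both Series.
overall_max = max(s.max(), s2.max())
```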

In [28]:
dir(s1)

['T',
 '_AXIS_ALIASES',
 '_AXIS_IALIASES',
 '_AXIS_LEN',
 '_AXIS_NAMES',
 '_AXIS_NUMBERS',
 '_AXIS_ORDERS',
 '_AXIS_REVERSED',
 '_AXIS_SLICEMAP',
 '__abs__',
 '__add__',
 '__and__',
 '__array__',
 '__array_prepare__',
 '__array_priority__',
 '__array_wrap__',
 '__bool__',
 '__bytes__',
 '__class__',
 '__contains__',
 '__copy__',
 '__deepcopy__',
 '__delattr__',
 '__delitem__',
 '__dict__',
 '__dir__',
 '__div__',
 '__divmod__',
 '__doc__',
 '__eq__',
 '__finalize__',
 '__float__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattr__',
 '__getattribute__',
 '__getitem__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__iand__',
 '__ifloordiv__',
 '__imod__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__int__',
 '__invert__',
 '__ior__',
 '__ipow__',
 '__isub__',
 '__iter__',
 '__itruediv__',
 '__ixor__',
 '__le__',
 '__len__',
 '__long__',
 '__lt__',
 '__matmul__',
 '__mod__',
 '__module__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__nonzero__',
 '__or__',
 

In [29]:
s

4    10
3    20
2    30
1    40
0    50
dtype: int64

In [30]:
s.clip(15, 35)

4    15
3    20
2    30
1    35
0    35
dtype: int64

In [31]:
s.clip(lower=15)  # clip_lower was removed in pandas 1.0

4    15
3    20
2    30
1    40
0    50
dtype: int64
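
`Series.clip_lower` and `Series.clip_upper` were deprecated in pandas 0.24 and removed in pandas 1.0, so on recent versions the same effect is obtained through the `lower` and `upper` arguments of `clip`. The threshold can also be a Series, in which case each element gets its own bound. A small sketch, where the `lower_bounds` values are invented for illustration:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=[4, 3, 2, 1, 0])

# Scalar lower bound: the replacement for the removed s.clip_lower(15).
scalar_clipped = s.clip(lower=15)

# Per-element lower bounds, given as a Series with the same labels.
lower_bounds = pd.Series([15, 15, 35, 35, 35], index=[4, 3, 2, 1, 0])
elementwise_clipped = s.clip(lower=lower_bounds)

print(scalar_clipped)
print(elementwise_clipped)
```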

In [32]:
s

4    10
3    20
2    30
1    40
0    50
dtype: int64

In [40]:
s2

40    10
30    20
20    30
10    40
0     50
dtype: int64

In [36]:
s.combine(s1, min)

4    1
3    2
2    3
1    4
0    5
dtype: int64

In [54]:
import numpy as np  # np.inf is used as the fill value

s.combine(s2, min, np.inf)

0     50
1     40
2     30
3     20
4     10
10    40
20    30
30    20
40    10
dtype: int64
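
The result covers the union of both indexes. As far as the pandas documentation describes it, `fill_value` stands in for a label that is missing from one of the two Series before the function is called, so with `min` and `np.inf` the value that does exist always wins. A short sketch that checks a few labels (the name `combined` is just for illustration):

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=[4, 3, 2, 1, 0])
s2 = pd.Series([10, 20, 30, 40, 50], index=[40, 30, 20, 10, 0])

# For a label found in only one Series, min(value, inf) returns the value.
combined = s.combine(s2, min, fill_value=np.inf)

print(combined.loc[4])   # label only in s
print(combined.loc[10])  # label only in s2
print(combined.loc[0])   # label in both
```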

In [52]:
import numpy as np

In [53]:
np.inf

inf

In [55]:
s

4    10
3    20
2    30
1    40
0    50
dtype: int64

In [57]:
s2

40    10
30    20
20    30
10    40
0     50
dtype: int64

In [61]:
np.minimum(s, s1)

4    1
3    2
2    3
1    4
0    5
dtype: int64

In [67]:
np.clip(pd.Series([1, 2, 3]), pd.Series([3, 4, 5]), None)

0    3
1    4
2    5
dtype: int64

In [71]:
?s.clip

Signature: s.clip(lower=None, upper=None, axis=None, inplace=False, *args, **kwargs)
Docstring:
Trim values at input threshold(s).

Assigns values outside boundary to boundary values. Thresholds
can be singular values or array like, and in the latter case
the clipping is performed element-wise in the specified axis.

Parameters
----------
lower : float or array_like, default None
    Minimum threshold value. All values below this
    threshold will be set to it.
upper : float or array_like, default None
    Maximum threshold value. All values above this
    threshold will be set to it.
axis : int or string axis name, optional
    Align object with lower and upper along the given axis.
inplace

In [70]:
s.clip_lower

4    10
3    20
2    30
1    40
0    50
dtype: int64

In [69]:
s1

4    1
3    2
2    3
1    4
0    5
dtype: int64