# Combining data

For combining datasets or data arrays along a dimension, see concatenate.
For combining datasets with different variables, see merge.
For combining datasets or data arrays with different indexes or missing values, see combine.

# Concatenate

To combine arrays along an existing or new dimension into a larger array, use concat(). It takes an iterable of DataArray or Dataset objects and a dimension name, and concatenates the objects along that dimension:

In [2]:
import xarray as xr
import pandas as pd
import numpy as np

In [3]:
arr = xr.DataArray(np.random.randn(2, 3),
                   [('x', ['a', 'b']), ('y', [10, 20, 30])])

In [4]:
arr[:, :1]

<xarray.DataArray (x: 2, y: 1)>
array([[ 0.036683],
       [-0.715525]])
Coordinates:
  * x        (x) <U1 'a' 'b'
  * y        (y) int64 10

In [5]:
xr.concat([arr[:, :1], arr[:, 1:]], dim='y')

<xarray.DataArray (x: 2, y: 3)>
array([[ 0.036683,  1.67034 , -0.230991],
       [-0.715525,  0.172852,  0.384833]])
Coordinates:
  * x        (x) <U1 'a' 'b'
  * y        (y) int64 10 20 30
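As a quick check of the behaviour described above, the following sketch splits an array along an existing dimension and joins the pieces back together. It rebuilds `arr` with fixed values (an assumption made so the example runs on its own), and the names `left`, `right` and `rejoined` are illustrative only.

```python
import numpy as np
import xarray as xr

# Illustrative array with the same layout as `arr` above; the values are arbitrary.
arr = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    coords=[("x", ["a", "b"]), ("y", [10, 20, 30])],
)

# Split along the existing dimension "y", then join the pieces back together.
left, right = arr[:, :1], arr[:, 1:]
rejoined = xr.concat([left, right], dim="y")

# The round trip is expected to reproduce the original array, coordinates included.
print(rejoined.identical(arr))
```

Comparing with `identical` rather than looking only at the values also confirms that the `y` labels were reassembled in order.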

In addition to combining along an existing dimension, concat can create a new dimension by stacking lower-dimensional arrays together:

In [6]:
arr[0]

<xarray.DataArray (y: 3)>
array([ 0.036683,  1.67034 , -0.230991])
Coordinates:
    x        <U1 'a'
  * y        (y) int64 10 20 30

In [7]:
# to combine these 1d arrays into a 2d array in numpy, you would use np.array
xr.concat([arr[0], arr[1]], 'x')

<xarray.DataArray (x: 2, y: 3)>
array([[ 0.036683,  1.67034 , -0.230991],
       [-0.715525,  0.172852,  0.384833]])
Coordinates:
  * y        (y) int64 10 20 30
  * x        (x) <U1 'a' 'b'

If the second argument to concat is a new dimension name, the arrays will be concatenated along that new dimension, which is always inserted as the first dimension:

In [8]:
xr.concat([arr[0], arr[1]], 'new_dim')

<xarray.DataArray (new_dim: 2, y: 3)>
array([[ 0.036683,  1.67034 , -0.230991],
       [-0.715525,  0.172852,  0.384833]])
Coordinates:
  * y        (y) int64 10 20 30
    x        (new_dim) <U1 'a' 'b'
Dimensions without coordinates: new_dim
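The sketch below illustrates the claim that the new dimension comes first, and compares the result with `np.stack` on the raw values. The array values and the names `rows` and `stacked` are assumptions chosen for illustration.

```python
import numpy as np
import xarray as xr

# Illustrative array with the same layout as `arr` above; the values are arbitrary.
arr = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    coords=[("x", ["a", "b"]), ("y", [10, 20, 30])],
)

# Two 1-d arrays, each with only the "y" dimension.
rows = [arr[0], arr[1]]

# "new_dim" does not exist on the inputs, so concat creates it.
stacked = xr.concat(rows, "new_dim")

# The new dimension is placed in front of the existing one.
print(stacked.dims)

# On the underlying values this matches stacking along a new leading axis.
same_as_numpy = np.array_equal(stacked.values, np.stack([r.values for r in rows]))
print(same_as_numpy)
```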

Instead of a string, the second argument to concat can also be an Index or DataArray object, in which case it is used to label the values along the new dimension:

In [9]:
xr.concat([arr[0], arr[1]], pd.Index([-90, -100], name='new_dim'))

<xarray.DataArray (new_dim: 2, y: 3)>
array([[ 0.036683,  1.67034 , -0.230991],
       [-0.715525,  0.172852,  0.384833]])
Coordinates:
  * y        (y) int64 10 20 30
    x        (new_dim) <U1 'a' 'b'
  * new_dim  (new_dim) int64 -90 -100
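One practical consequence of labelling the new dimension is that the stacked pieces can be selected by label afterwards. The following is a minimal sketch under the same assumptions as before (fixed values for `arr`; `labels`, `labelled` and `row` are illustrative names).

```python
import numpy as np
import pandas as pd
import xarray as xr

# Illustrative array with the same layout as `arr` above; the values are arbitrary.
arr = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    coords=[("x", ["a", "b"]), ("y", [10, 20, 30])],
)

# The index name supplies the dimension name, and its values label each piece.
labels = pd.Index([-90, -100], name="new_dim")
labelled = xr.concat([arr[0], arr[1]], labels)

# Because "new_dim" now has coordinate labels, label-based selection works.
row = labelled.sel(new_dim=-100)
print(row.values)
```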

Of course, concat also works on Dataset objects:

In [10]:
ds = arr.to_dataset(name='foo')
ds

<xarray.Dataset>
Dimensions:  (x: 2, y: 3)
Coordinates:
  * x        (x) <U1 'a' 'b'
  * y        (y) int64 10 20 30
Data variables:
    foo      (x, y) float64 0.03668 1.67 -0.231 -0.7155 0.1729 0.3848

In [11]:
xr.concat([ds.sel(x='a'), ds.sel(x='b')], 'x')

<xarray.Dataset>
Dimensions:  (x: 2, y: 3)
Coordinates:
  * y        (y) int64 10 20 30
  * x        (x) <U1 'a' 'b'
Data variables:
    foo      (x, y) float64 0.03668 1.67 -0.231 -0.7155 0.1729 0.3848
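The Dataset case can be checked the same way: select each `x` label separately, concatenate along `x`, and compare with the original. This is a sketch with assumed values for `arr`; the `transpose` call is a precaution against a different dimension order, not something the text above requires.

```python
import numpy as np
import xarray as xr

# Illustrative array with the same layout as `arr` above; the values are arbitrary.
arr = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    coords=[("x", ["a", "b"]), ("y", [10, 20, 30])],
)
ds = arr.to_dataset(name="foo")

# Each selection drops the "x" dimension and keeps "x" as a scalar coordinate.
parts = [ds.sel(x="a"), ds.sel(x="b")]

# Concatenating along "x" rebuilds the dimension from those scalar coordinates.
combined = xr.concat(parts, "x")

# Compare the variable with the original after fixing the dimension order.
round_trip_ok = combined["foo"].transpose("x", "y").equals(ds["foo"])
print(round_trip_ok)
```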

# Merge

To combine variables and coordinates between multiple DataArray and/or Dataset objects, use merge(). It can merge a list of Dataset objects, DataArray objects, or dictionaries of objects convertible to DataArray objects:

In [12]:
xr.merge([ds, ds.rename({'foo': 'bar'})])

<xarray.Dataset>
Dimensions:  (x: 2, y: 3)
Coordinates:
  * x        (x) <U1 'a' 'b'
  * y        (y) int64 10 20 30
Data variables:
    foo      (x, y) float64 0.03668 1.67 -0.231 -0.7155 0.1729 0.3848
    bar      (x, y) float64 0.03668 1.67 -0.231 -0.7155 0.1729 0.3848

In [14]:
xr.merge([xr.DataArray(n, name='var%d' % n) for n in range(5)])

<xarray.Dataset>
Dimensions:  ()
Data variables:
    var0     int64 0
    var1     int64 1
    var2     int64 2
    var3     int64 3
    var4     int64 4
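The three kinds of input named above can be mixed in a single call. The sketch below merges a Dataset, a named DataArray and a plain dictionary; the variable names `bar` and `baz` and all values are assumptions made for illustration.

```python
import numpy as np
import xarray as xr

# Illustrative array with the same layout as `arr` above; the values are arbitrary.
arr = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    coords=[("x", ["a", "b"]), ("y", [10, 20, 30])],
)
ds = arr.to_dataset(name="foo")

# A named DataArray contributes one variable; its name becomes the variable name.
bar = (arr * 2).rename("bar")

# A dictionary maps variable names to objects convertible to DataArray.
extra = {"baz": xr.DataArray([1.0, 2.0, 3.0], dims="y")}

merged = xr.merge([ds, bar, extra])
print(sorted(merged.data_vars))
```

A DataArray passed to merge needs a name, since the name is what identifies the variable in the resulting Dataset.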

If you merge another dataset (or a dictionary including data array objects), by default the resulting dataset will be aligned on the union of all index coordinates:

In [15]:
other = xr.Dataset({'bar': ('x', [1, 2, 3, 4]), 'x': list('abcd')})
other

<xarray.Dataset>
Dimensions:  (x: 4)
Coordinates:
  * x        (x) <U1 'a' 'b' 'c' 'd'
Data variables:
    bar      (x) int64 1 2 3 4

In [16]:
ds

<xarray.Dataset>
Dimensions:  (x: 2, y: 3)
Coordinates:
  * x        (x) <U1 'a' 'b'
  * y        (y) int64 10 20 30
Data variables:
    foo      (x, y) float64 0.03668 1.67 -0.231 -0.7155 0.1729 0.3848

In [17]:
xr.merge([ds, other])

<xarray.Dataset>
Dimensions:  (x: 4, y: 3)
Coordinates:
  * x        (x) object 'a' 'b' 'c' 'd'
  * y        (y) int64 10 20 30
Data variables:
    foo      (x, y) float64 0.03668 1.67 -0.231 -0.7155 ... nan nan nan nan
    bar      (x) int64 1 2 3 4
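To make the union alignment concrete, the sketch below counts the missing values that the merge introduces and contrasts it with an intersection. The `join` keyword is not discussed in the text above; it is an argument of `xr.merge`, and `join="outer"` is written out here only to make the union behaviour explicit. The values of `arr` are assumed.

```python
import numpy as np
import xarray as xr

# Illustrative array with the same layout as `arr` above; the values are arbitrary.
arr = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    coords=[("x", ["a", "b"]), ("y", [10, 20, 30])],
)
ds = arr.to_dataset(name="foo")
other = xr.Dataset({"bar": ("x", [1, 2, 3, 4]), "x": list("abcd")})

# Union of the "x" labels: "foo" has no data for "c" and "d", so it is padded with NaN.
outer = xr.merge([ds, other], join="outer")

# Intersection of the "x" labels: only "a" and "b" survive, so nothing is padded.
inner = xr.merge([ds, other], join="inner")

print(outer.sizes["x"], inner.sizes["x"])
print(int(outer["foo"].isnull().sum()))
```

With the union, `foo` gains two all-NaN rows (one per new `x` label), while `bar` is unchanged because `other` already covers every label.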