Indexing, grouping, and sorting datasets are all part of preparing data for analysis. Another step in this process is combining, or concatenating, datasets. This is useful when the data to be analyzed is spread across more than one dataset.

For example, multiple months of financial records or investment data from different markets can be consolidated into one dataset in order to streamline and centralize data analysis.

`Concatenation` is the process of appending data from one object to another.

`Concatenation` creates a new object that represents data from all concatenated objects.

There are multiple ways to concatenate objects, including by column and row.

DataFrames can be joined together, or concatenated, using the Pandas `concat` function. This function combines two or more DataFrames into a single DataFrame.

The `concat` function accepts the following arguments:

- a list of DataFrames to be joined
- the axis to join on (by column or row)
- the join operation (inner vs. outer)

[Pandas concat documentation](https://pandas.pydata.org/docs/user_guide/merging.html)
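As a minimal sketch of how the three arguments above fit together, the example below concatenates two small DataFrames by row. The `jan` and `feb` DataFrames and their values are hypothetical and are not taken from the CSV files used in this notebook.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration only
jan = pd.DataFrame(
    {"Product": ["Tea", "Coffee"], "Sales": [10, 20]},
    index=[101, 102],
)
feb = pd.DataFrame(
    {"Product": ["Tea", "Cocoa"], "Sales": [15, 5]},
    index=[101, 103],
)

# First argument: the list of DataFrames to join
# axis: stack the DataFrames by row
# join: keep only the columns found in every DataFrame
combined = pd.concat([jan, feb], axis="rows", join="inner")
print(combined)
```

Note that the index label `101` appears twice in the result, because joining by row keeps the original index labels of each DataFrame.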

### Import Libraries and Dependencies

In [None]:
import pandas as pd


### Read in files

In [None]:

# Read in data and index by CustomerID
france_data = pd.read_csv('france_products.csv', index_col='CustomerID')
uk_data = pd.read_csv('uk_products.csv', index_col='CustomerID')
netherlands_data = pd.read_csv('netherlands_products.csv', index_col='CustomerID')
customer_data = pd.read_csv('customer_info.csv', index_col='CustomerID')
products_data = pd.read_csv('products.csv', index_col='CustomerID')

### Output sample of data

In [None]:
# Show sample of France data
france_data.head()

In [None]:
# Show sample of UK data
uk_data.head()

In [None]:
# Show sample of Netherlands data
netherlands_data.head()

### Concatenate data by rows using `concat` function and `inner` join

A key consideration to keep in mind when concatenating DataFrames is that data is aligned by label. When joining by row, Pandas' `concat` function stacks the rows and aligns the data by column name. When joining by column, it places the columns side by side and aligns the rows by index. Before concatenating DataFrames by column, make sure the DataFrames are indexed by the same column.
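The sketch below illustrates alignment by index when joining by column. The `names` and `orders` DataFrames are hypothetical; both are indexed by a made-up `CustomerID`, but the index labels appear in a different order and only partly overlap.

```python
import pandas as pd

# Hypothetical data: both DataFrames share a CustomerID index
names = pd.DataFrame(
    {"Name": ["Ana", "Ben", "Cy"]},
    index=pd.Index([1, 2, 3], name="CustomerID"),
)
orders = pd.DataFrame(
    {"Orders": [5, 7]},
    index=pd.Index([3, 1], name="CustomerID"),
)

# Rows are matched by index label, not by position.
# With an inner join, only the labels found in both DataFrames are kept.
aligned = pd.concat([names, orders], axis="columns", join="inner")
print(aligned)
```

Customer `2` has no row in `orders`, so it is dropped by the inner join, and customer `3` is paired with `5` orders even though that value is listed first in `orders`.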

DataFrames can be joined by either column or row. The axis argument can be configured to specify which one to use.

If you need to create a dataset that reflects multiple columns from different DataFrames, the DataFrames should be joined on column. This will create a DataFrame that incorporates the columns from all DataFrames.

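If rows from one DataFrame simply need to be combined or added to another DataFrame, the DataFrames should be joined on row. With an `inner` join, only the columns that appear in every DataFrame are kept, so the DataFrames being joined should have the same columns to avoid losing data. With an `outer` join, all columns are kept and missing values are filled with `NaN`.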
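To make the effect of the join operation concrete, here is a small sketch that joins two DataFrames by row when their columns only partly match. The `fr` and `uk` DataFrames are hypothetical stand-ins and do not reflect the contents of the CSV files.

```python
import pandas as pd

# Hypothetical data: the two DataFrames share only the "Product" column
fr = pd.DataFrame({"Product": ["Tea"], "Price": [2.5]}, index=[1])
uk = pd.DataFrame({"Product": ["Coffee"], "Quantity": [3]}, index=[2])

# inner: keep only the columns present in both DataFrames
inner_rows = pd.concat([fr, uk], axis="rows", join="inner")

# outer: keep every column and fill the gaps with NaN
outer_rows = pd.concat([fr, uk], axis="rows", join="outer")

print(inner_rows)
print(outer_rows)
```

Comparing the two results is a quick way to check whether an inner join would silently drop columns from the data.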

In [None]:
# Join France, UK, and Netherlands full datasets by rows axis
joined_data_rows = pd.concat([france_data, uk_data, netherlands_data], axis="rows", join="inner")
joined_data_rows

### Concatenate data by column using `concat` function and `inner` join

In [None]:
# Show sample of customer data
customer_data.head()

In [None]:
# Show sample of product data
products_data.head()

In [None]:
# Join Customer and products by columns axis
joined_data_cols = pd.concat([customer_data, products_data], axis='columns', join='inner')
joined_data_cols.head()