# Concatenating, Merging, and Joining

- 1. Concatenate, Join, and Merge
    - 1.1 Concatenation (``pd.concat``)
    - 1.2 Merge (``pd.merge``)
    - 1.3 Join (``DataFrame.join``)

In [None]:
import pandas as pd

## 1 Concatenate, Join, and Merge

In [None]:
# Three frames with the same columns and non-overlapping row indexes.
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])

In [None]:
df2 = pd.DataFrame({'A': ['A4', 'A5', 'A6', 'A7'],
                    'B': ['B4', 'B5', 'B6', 'B7'],
                    'C': ['C4', 'C5', 'C6', 'C7'],
                    'D': ['D4', 'D5', 'D6', 'D7']},
                   index=[4, 5, 6, 7])

In [None]:
df3 = pd.DataFrame({'A': ['A8', 'A9', 'A10', 'A11'],
                    'B': ['B8', 'B9', 'B10', 'B11'],
                    'C': ['C8', 'C9', 'C10', 'C11'],
                    'D': ['D8', 'D9', 'D10', 'D11']},
                   index=[8, 9, 10, 11])

In [None]:
df1

In [None]:
df2

In [None]:
df3

### Concatenation

In [None]:
# Stack the three frames row-wise (axis=0 is the default).
pd.concat([df1, df2, df3])

### Merge

In [None]:
esquerda = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                         'A': ['A0', 'A1', 'A2', 'A3'],
                         'B': ['B0', 'B1', 'B2', 'B3']})

direita = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                        'C': ['C0', 'C1', 'C2', 'C3'],
                        'D': ['D0', 'D1', 'D2', 'D3']})

In [None]:
esquerda

In [None]:
direita

In [None]:
pd.merge(esquerda, direita, how='inner', on='key')

In [None]:
esquerda

In [None]:
direita

In [None]:
pd.merge(esquerda, direita, how='cross')
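
To make the contrast between the two merges above concrete, here is a minimal sketch that compares their shapes. The frames `left` and `right` are trimmed-down stand-ins for `esquerda` and `direita` (an assumption made to keep the example short), and `how='cross'` assumes pandas 1.2 or later.

```python
import pandas as pd

# Illustrative stand-ins for the notebook's frames (one value column each).
left = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                     'A': ['A0', 'A1', 'A2', 'A3']})
right = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                      'C': ['C0', 'C1', 'C2', 'C3']})

# Inner merge: one output row per matching value of 'key'.
inner = pd.merge(left, right, how='inner', on='key')

# Cross merge: every row of left paired with every row of right.
# No 'on' is given, so the shared column name 'key' is kept from both
# sides and disambiguated with the default suffixes '_x' and '_y'.
cross = pd.merge(left, right, how='cross')

print(inner.shape)          # (4, 3)
print(cross.shape)          # (16, 4)
print(list(cross.columns))  # ['key_x', 'A', 'key_y', 'C']
```

A cross merge grows as the product of the two row counts, so it is usually worth checking the sizes of both inputs before running it on real data.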

### Join Method

In [None]:
esquerda = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
                         'B': ['B0', 'B1', 'B2']},
                        index=['K0', 'K1', 'K2'])

direita = pd.DataFrame({'C': ['C0', 'C2', 'C3'],
                        'D': ['D0', 'D2', 'D3']},
                       index=['K0', 'K2', 'K3'])

In [None]:
esquerda.join(direita)

In [None]:
esquerda.join(direita, how='outer')

- **More information:** https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html

- **What is the difference between join and merge?**

pandas.merge() is the underlying function used for all merge/join behavior.

DataFrames provide the pandas.DataFrame.merge() and pandas.DataFrame.join() methods as a convenient way to access the capabilities of pandas.merge(). For example, df1.merge(right=df2, ...) is equivalent to pandas.merge(left=df1, right=df2, ...).
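
The equivalence between the function form and the method form can be checked directly. This is a small sketch on made-up frames (the data and the names `via_function` and `via_method` are illustrative only):

```python
import pandas as pd

# Small illustrative frames sharing a 'key' column.
df1 = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'A': ['A0', 'A1', 'A2']})
df2 = pd.DataFrame({'key': ['K0', 'K2', 'K3'], 'C': ['C0', 'C2', 'C3']})

# Top-level function and DataFrame method, called with the same arguments.
via_function = pd.merge(left=df1, right=df2, on='key')
via_method = df1.merge(right=df2, on='key')

# Both calls return the same rows (K0 and K2) in the same order.
print(via_function.equals(via_method))  # True
```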

These are the main differences between df.join() and df.merge():

- 1) lookup on right table: df1.join(df2) always joins via the index of df2, but df1.merge(df2) can join to one or more columns of df2 (default) or to the index of df2 (with right_index=True).
- 2) lookup on left table: by default, df1.join(df2) uses the index of df1 and df1.merge(df2) uses column(s) of df1. That can be overridden by specifying df1.join(df2, on=key_or_keys) or df1.merge(df2, left_index=True).
- 3) left vs inner join: df1.join(df2) does a left join by default (keeps all rows of df1), but df1.merge(df2) does an inner join by default (returns only the matching rows of df1 and df2).
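
The three differences listed above can be seen side by side in a short sketch. The frames are invented for illustration: `df1` holds its keys in a regular column, while `df2` holds them in its index.

```python
import pandas as pd

# df1 keeps the keys in a column; df2 keeps them in its index.
df1 = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'A': ['A0', 'A1', 'A2']})
df2 = pd.DataFrame({'C': ['C0', 'C2', 'C3']}, index=['K0', 'K2', 'K3'])

# join: always looks up in df2's index; on='key' switches the left side
# from df1's index to df1's 'key' column. Default is a left join.
joined = df1.join(df2, on='key')

# merge: the right-side index must be requested with right_index=True.
# Default is an inner join, so the unmatched K1 row is dropped.
merged = df1.merge(df2, left_on='key', right_index=True)

# Asking merge for a left join keeps K1, as join did.
merged_left = df1.merge(df2, left_on='key', right_index=True, how='left')

print(len(joined))       # 3 rows, with NaN in 'C' for K1
print(len(merged))       # 2 rows (K0 and K2)
print(len(merged_left))  # 3 rows
```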

So, the generic approach is to use pandas.merge(df1, df2) or df1.merge(df2). But for a number of common situations (keeping all rows of df1 and joining to an index in df2), you can save some typing by using df1.join(df2) instead.
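
As a sketch of the typing that `join` saves, the following compares the short form with the generic `merge` call it stands for, using the index-keyed `esquerda` and `direita` frames from the Join section (redefined here so the example runs on its own):

```python
import pandas as pd

# Same index-keyed frames as in the Join section of this notebook.
esquerda = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
                         'B': ['B0', 'B1', 'B2']},
                        index=['K0', 'K1', 'K2'])
direita = pd.DataFrame({'C': ['C0', 'C2', 'C3'],
                        'D': ['D0', 'D2', 'D3']},
                       index=['K0', 'K2', 'K3'])

# Short form: index-on-index left join.
short = esquerda.join(direita)

# Generic form: the same operation spelled out with merge.
generic = pd.merge(esquerda, direita,
                   left_index=True, right_index=True, how='left')

print(short.equals(generic))  # True
```

Both results keep every row of `esquerda` (K0, K1, K2) and fill `C` and `D` with NaN for K1, which has no match in `direita`.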