# Alternate Groupby Syntax

This chapter covers alternative syntaxes for performing aggregations with the `groupby` method. These syntaxes can easily confuse beginning pandas users because they give you no extra power for data analysis; they are only different ways of writing the same aggregations. However, many pandas users rely on them, so it's important to be aware that they exist. Let's begin by reading in the San Francisco employee compensation dataset.

In [None]:
import pandas as pd
import numpy as np
sf_emp = pd.read_csv('../data/sf_employee_compensation.csv')
sf_emp.head(3)

## Aggregating a single column

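Originally, we aggregated a single column by passing a keyword argument to the `agg` method: the parameter name is the new column name, and its value is a two-item tuple holding the aggregating column and the aggregating function.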

In [None]:
sf_emp.groupby('organization group').agg(mean_salary=('salaries', 'mean'))

### Alternative - use a dictionary

Instead of a tuple, you can use a dictionary to map the aggregating column to the aggregating function. Although this syntax uses less code, it does not let you rename the resulting column during the aggregation the way the original syntax does.

In [None]:
sf_emp.groupby('organization group').agg({'salaries': 'mean'})
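
If the shorter dictionary syntax is preferred, the column can still be renamed in a separate step. The following is a minimal sketch that uses a small made-up DataFrame (`toy`, with invented values) in place of `sf_emp`:

```python
import pandas as pd

# Hypothetical stand-in for sf_emp; the values are invented for illustration
toy = pd.DataFrame({
    'organization group': ['Culture', 'Culture', 'Public Protection'],
    'salaries': [50_000.0, 70_000.0, 90_000.0],
})

# The dictionary syntax keeps the original column name ...
by_dict = toy.groupby('organization group').agg({'salaries': 'mean'})

# ... so renaming requires a separate call
renamed = by_dict.rename(columns={'salaries': 'mean_salary'})

# The original tuple syntax aggregates and renames in one step
named = toy.groupby('organization group').agg(mean_salary=('salaries', 'mean'))

print(renamed.equals(named))
```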

### Alternative - select the column with the brackets

Instead of using a dictionary, you can place the aggregating column in brackets following the `groupby` method and then pass the aggregating function as a string to the `agg` method. With a single column name in the brackets, the result is a Series rather than a DataFrame.

In [None]:
sf_emp.groupby('organization group')['salaries'].agg('mean')

You can even bypass the `agg` method and call an aggregation method, such as `sum`, directly after the brackets.

In [None]:
sf_emp.groupby('organization group')['salaries'].sum()
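
The type of the result depends on how the column is selected. As a rough illustration on a made-up DataFrame (`toy` and its values are invented, not taken from the dataset), a single column name yields a Series, a list of column names yields a DataFrame, and calling the method directly matches passing its name to `agg`:

```python
import pandas as pd

# Hypothetical stand-in for sf_emp; the values are invented for illustration
toy = pd.DataFrame({
    'organization group': ['Culture', 'Culture', 'Public Protection'],
    'salaries': [50_000.0, 70_000.0, 90_000.0],
})
grouped = toy.groupby('organization group')

as_series = grouped['salaries'].sum()     # single column name -> Series
as_frame = grouped[['salaries']].sum()    # list of column names -> DataFrame
via_agg = grouped['salaries'].agg('sum')  # same values as calling .sum() directly

print(type(as_series).__name__, type(as_frame).__name__)
print(as_series.equals(via_agg))
```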

### Possible advantage - allows for multiple aggregating functions

Any of these alternative syntaxes lets you compute multiple aggregating functions with less code. Use a list to hold the aggregating functions.

In [None]:
sf_emp.groupby('organization group').agg({'salaries': ['min', 'max', 'mean']})

The original syntax needs more code, but I prefer it: it returns a DataFrame whose columns have a single-level index, and we can name each column exactly as we like. The list syntax instead returns columns with a two-level MultiIndex.

In [None]:
sf_emp.groupby('organization group').agg(mean_salary=('salaries', 'mean'),
                                         min_salary=('salaries', 'min'),
                                         max_salary=('salaries', 'max'))
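
The difference in the column index can be checked directly. The sketch below uses a made-up DataFrame (`toy`, with invented values) and also shows one possible way to flatten the two-level columns afterwards; the `{function}_{column}` naming scheme is only an illustrative choice:

```python
import pandas as pd

# Hypothetical stand-in for sf_emp; the values are invented for illustration
toy = pd.DataFrame({
    'organization group': ['Culture', 'Culture', 'Public Protection'],
    'salaries': [50_000.0, 70_000.0, 90_000.0],
})
grouped = toy.groupby('organization group')

# List-of-functions syntax: the columns form a two-level MultiIndex
stats = grouped.agg({'salaries': ['min', 'max', 'mean']})
print(stats.columns.nlevels)

# Tuple syntax: the columns form an ordinary single-level index
named = grouped.agg(min_salary=('salaries', 'min'),
                    max_salary=('salaries', 'max'),
                    mean_salary=('salaries', 'mean'))
print(named.columns.nlevels)

# One way to flatten the MultiIndex: join the function and column names
flat = stats.copy()
flat.columns = [f'{func}_{col}' for col, func in stats.columns]
print(list(flat.columns))
```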

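## No aggregating columns

You do not actually need to specify the aggregating columns when grouping; pandas then applies the aggregation to every non-grouping column. In pandas versions before 2.0, columns that don't work with a particular aggregation are silently dropped. For instance, only numeric columns have a mean, so all other columns are dropped. Starting with pandas 2.0, such columns typically raise a `TypeError` instead, so the data must be limited to suitable columns first. Here, we take the min, max, and mean of all the numeric columns for each combination of year and organization group.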

In [None]:
sf_emp.groupby(['year', 'organization group']).agg(['min', 'max', 'mean']).head().round(-3)
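
On pandas 2.0 or later, one way to keep a multi-function aggregation working when the DataFrame contains non-numeric columns is to select the numeric, non-grouping columns before aggregating. This is a sketch on invented toy data; `toy`, `keys`, and `num_cols` are illustrative names that do not appear in the dataset:

```python
import pandas as pd

# Hypothetical stand-in for sf_emp; the values are invented for illustration
toy = pd.DataFrame({
    'year': [2013, 2013, 2014, 2014],
    'organization group': ['Culture', 'Culture', 'Culture', 'Public Protection'],
    'job': ['Librarian', 'Curator', 'Librarian', 'Firefighter'],  # non-numeric
    'salaries': [50_000.0, 70_000.0, 52_000.0, 90_000.0],
})
keys = ['year', 'organization group']

# Keep only the numeric columns that are not used as grouping keys
num_cols = toy.select_dtypes('number').columns.drop(keys, errors='ignore').tolist()

# Aggregate just those columns with several functions at once
stats = toy.groupby(keys)[num_cols].agg(['min', 'max', 'mean'])
print(stats)
```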

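You can even call an aggregation method directly after grouping to apply it to all columns. Passing `numeric_only=True` limits the calculation to the numeric columns, which pandas 2.0 and later require when non-numeric columns are present.

In [None]:
sf_emp.groupby(['year', 'organization group']).mean(numeric_only=True).head().round(-3)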