https://towardsdatascience.com/pandas-transform-more-than-meets-the-eye-928542b40b56

https://stackoverflow.com/questions/27517425/apply-vs-transform-on-a-group-object/47143056#47143056  

There are two major differences between the `groupby` methods `transform` and `apply`.

- `apply` implicitly passes all the columns for each group as a DataFrame to the custom function, while `transform` passes each column for each group as a Series to the custom function.
- The custom function passed to `apply` can return a scalar, a Series or a DataFrame (or a NumPy array or even a list). The custom function passed to `transform` must return a sequence (a one-dimensional Series, array or list) of the same length as the group.

So `transform` works on a single Series at a time, while `apply` works on the entire group DataFrame at once. A function passed to `transform` therefore cannot, on its own, act on two columns at the same time.
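
The two differences above can be sketched on a small example. The frame `toy`, its column names and the helper `total` are hypothetical and chosen only for illustration. The sketch records what `apply` hands to the function and compares the shape of what each method returns.

```python
import pandas as pd

# Hypothetical toy data, used only for this illustration.
toy = pd.DataFrame({
    'team': ['x', 'x', 'y'],
    'score': [1, 3, 10],
    'bonus': [10, 20, 30],
})
grouped = toy.groupby('team')[['score', 'bonus']]

received_by_apply = []

def total(group):
    # apply hands over the whole group, so both columns are visible here.
    received_by_apply.append(type(group).__name__)
    return (group['score'] + group['bonus']).sum()  # a scalar per group

# apply: one value per group, indexed by the group key.
per_group = grouped.apply(total)

# transform: the result keeps the length and index of the original rows.
per_row = grouped.transform(lambda s: s - s.mean())

print(received_by_apply)
print(per_group)
print(per_row)
```

Because `per_row` shares the index of `toy`, it can be assigned straight back as new columns, whereas `per_group` would first have to be mapped or merged onto the rows.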

In [16]:
import pandas as pd

In [45]:
df = pd.DataFrame({ 
    'Company':['Stark Industries', 'Stark Industries', 'Stark Industries', 'Initech', 'Initech', 'Initech'],
     'Employee Name':['Tony Stark', 'Pepper Posts', 'Maria, Hill', 'Peter Gibbons', 'Bill Lumbergh', 'Milton Waddams'],
     'Yearly Salary':[250, 180, 160, 150, 103, 0] 
})

df_ = pd.DataFrame({ 
    'Company':['Token', 'Token', 'Token'],
     'Employee Name':['Steve', 'John Doe', 'Jane Doe'],
     'Yearly Salary':[8, 5, 10] 
})

# df = pd.concat([df, df_])

In [46]:
df

Unnamed: 0,Company,Employee Name,Yearly Salary
0,Stark Industries,Tony Stark,250
1,Stark Industries,Pepper Posts,180
2,Stark Industries,"Maria, Hill",160
3,Initech,Peter Gibbons,150
4,Initech,Bill Lumbergh,103
5,Initech,Milton Waddams,0


In [47]:
# df = df.sample(frac=1) # shuffle rows

In [48]:
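# Select the numeric column explicitly: recent pandas versions raise an error
# when asked to take the mean of the string column 'Employee Name'.
df['mean'] = df.groupby('Company')['Yearly Salary'].transform('mean')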

In [49]:
df

Unnamed: 0,Company,Employee Name,Yearly Salary,mean
0,Stark Industries,Tony Stark,250,196.666667
1,Stark Industries,Pepper Posts,180,196.666667
2,Stark Industries,"Maria, Hill",160,196.666667
3,Initech,Peter Gibbons,150,84.333333
4,Initech,Bill Lumbergh,103,84.333333
5,Initech,Milton Waddams,0,84.333333


In [52]:
def mean_(x):
    # Print the type of the object that transform passes to the function.
    # Nothing is returned, so the resulting frame holds only missing values.
    print(type(x))


df.groupby('Company').transform(mean_)

<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>

Note: the single `DataFrame` in the output above comes from pandas itself. In the pandas version that produced this output, `transform` appears to also try the function once on the whole first group (a fast path) before settling on column-by-column calls. The function is still called with one Series per column per group.


Unnamed: 0,Employee Name,Yearly Salary,mean
0,,,
1,,,
2,,,
3,,,
4,,,
5,,,


In [57]:
data = {
        'a':['a1','a2','a3','a4','a5'],
        'b':['b1','b1','b2','b2','b1'],
        'c':[55,44.2,33.3,-66.5,0],
        'd':[10,100,1000,10000,100000],
        }
df = pd.DataFrame.from_dict(data)

In [58]:
df

Unnamed: 0,a,b,c,d
0,a1,b1,55.0,10
1,a2,b1,44.2,100
2,a3,b2,33.3,1000
3,a4,b2,-66.5,10000
4,a5,b1,0.0,100000


In [64]:
df.groupby('b')['c'].transform('sum')

0    99.2
1    99.2
2   -33.2
3   -33.2
4    99.2
Name: c, dtype: float64
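
The output above has one entry per original row, not one per group. As a minimal sketch of why that matters, the block below rebuilds the `b` and `c` columns in a separate frame (the name `df_demo` is an assumption, used so that the notebook's `df` is left untouched) and contrasts the aggregated sum with the transformed one.

```python
import pandas as pd

# Same 'b' and 'c' values as the notebook's df; df_demo is a hypothetical name.
df_demo = pd.DataFrame({
    'b': ['b1', 'b1', 'b2', 'b2', 'b1'],
    'c': [55, 44.2, 33.3, -66.5, 0],
})

# Aggregation: one value per group, indexed by the group key.
group_sums = df_demo.groupby('b')['c'].sum()

# Transform: the group sum repeated for every row, indexed like df_demo.
row_sums = df_demo.groupby('b')['c'].transform('sum')

# Because the index lines up, row-wise arithmetic with the original column works
# directly, for example the deviation of each row from its group mean.
deviation = df_demo['c'] - df_demo.groupby('b')['c'].transform('mean')

print(group_sums)
print(row_sums)
print(deviation)
```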

In [67]:
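def f(x, col):
    # x is one group's 'c' Series; its index is used to look up the matching
    # rows of another column in the original (global) df.
    return df.loc[x.index, col] * x

# Extra keyword arguments given to transform are forwarded to f.
df.groupby('b')['c'].transform(f, col='d')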

0       550.0
1      4420.0
2     33300.0
3   -665000.0
4         0.0
Name: c, dtype: float64

In [73]:
for group_name, df_group in df.groupby('b'):
    print(group_name)
    display(df_group)

b1


Unnamed: 0,a,b,c,d
0,a1,b1,55.0,10
1,a2,b1,44.2,100
4,a5,b1,0.0,100000


b2


Unnamed: 0,a,b,c,d
2,a3,b2,33.3,1000
3,a4,b2,-66.5,10000


In [79]:
def f(x):
    # Show the type and content of every object that transform passes in.
    # Nothing is returned, so the resulting frame holds only missing values.
    print(type(x))
    display(x)


df.groupby('b').transform(f)

<class 'pandas.core.series.Series'>


0    a1
1    a2
4    a5
Name: a, dtype: object

<class 'pandas.core.series.Series'>


0      55
1    44.2
4       0
Name: c, dtype: object

<class 'pandas.core.series.Series'>


0        10
1       100
4    100000
Name: d, dtype: object

<class 'pandas.core.frame.DataFrame'>


Unnamed: 0,a,c,d
0,a1,55.0,10
1,a2,44.2,100
4,a5,0.0,100000


<class 'pandas.core.series.Series'>


2    a3
3    a4
Name: a, dtype: object

<class 'pandas.core.series.Series'>


2    33.3
3   -66.5
Name: c, dtype: object

<class 'pandas.core.series.Series'>


2     1000
3    10000
Name: d, dtype: object

Unnamed: 0,a,c,d
0,,,
1,,,
2,,,
3,,,
4,,,
