## SAS program to concatenate a variable across rows

```
***********************************************
*   Create a small data set to demonstrate    *
*   the concept                               *
***********************************************;
data data1;
  input category $ y $;
  datalines;
  A y1
  A y2
  B y3
  B y4
  B y5
  C y6
  C y7
  C y8
  C y9 
  ;

***********************************************
*   Combine the y variable values across      *
*   rows by category                          *
***********************************************;
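data data2;
  set data1;
  by category;                          * data1 must be sorted by category;
  length combined $ 250;
  retain combined;                      * keep combined from one row to the next;
  if first.category then combined=y;    * start a new list for each category;
  else combined=catx(',',combined,y);   * append y, separated by a comma;
  if last.category;                     * output only the last row per category;
  drop y;
run;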
```

## Python code to produce the same data sets

In [1]:
# Import packages/libraries
import pandas as pd

In [2]:
# Create the data frame that was used in the SAS program above
data1 = pd.DataFrame({'category':list('AABBBCCCC'),
                       'y':['y'+str(i) for i in range(1,10)]})

data1

| | category | y |
|---|---|---|
| 0 | A | y1 |
| 1 | A | y2 |
| 2 | B | y3 |
| 3 | B | y4 |
| 4 | B | y5 |
| 5 | C | y6 |
| 6 | C | y7 |
| 7 | C | y8 |
| 8 | C | y9 |


In [3]:
# Join the values of y within each category, separated by a comma and a space
data1.groupby('category')['y'].apply(lambda x: ', '.join(x)).reset_index()

| | category | y |
|---|---|---|
| 0 | A | y1, y2 |
| 1 | B | y3, y4, y5 |
| 2 | C | y6, y7, y8, y9 |
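
The result above differs from the SAS data set `data2` in two small ways: the SAS step joins with `','` (no space) and stores the result in a variable named `combined`. The following is a minimal sketch of one way to mirror the SAS output more closely, assuming a pandas version that supports named aggregation (0.25 or later). The name `data2` is borrowed from the SAS program for illustration only.

```python
import pandas as pd

# Same demonstration data as data1 above
data1 = pd.DataFrame({'category': list('AABBBCCCC'),
                      'y': ['y' + str(i) for i in range(1, 10)]})

# Join y within each category using ',' (no space), as catx(',', ...) does
# for these values, and name the result column `combined` like the SAS variable
data2 = (data1.groupby('category', as_index=False)
              .agg(combined=('y', ','.join)))

print(data2)
```

Like the SAS step, this yields one row per category. Unlike SAS BY-group processing, `groupby` does not require the input to be sorted by `category` beforehand.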
