# Pivot_Longer: One function to cover transformations from wide to long form.

In [1]:
import janitor
import pandas as pd
import numpy as np

Unpivoting (reshaping data from wide to long form) in Pandas is done with [pd.melt](https://pandas.pydata.org/docs/reference/api/pandas.melt.html), [pd.wide_to_long](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.wide_to_long.html), or [pd.DataFrame.stack](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.stack.html). However, there are scenarios where a few more steps are required to massage the data into the long form that we desire. Take the dataframe below, copied from [Stack Overflow](https://stackoverflow.com/questions/64061588/pandas-melt-multiple-columns-to-tabulate-a-dataset#64062002):

In [2]:
df = pd.DataFrame(
        {
            "id": [1, 2, 3],
            "M_start_date_1": [201709, 201709, 201709],
            "M_end_date_1": [201905, 201905, 201905],
            "M_start_date_2": [202004, 202004, 202004],
            "M_end_date_2": [202005, 202005, 202005],
            "F_start_date_1": [201803, 201803, 201803],
            "F_end_date_1": [201904, 201904, 201904],
            "F_start_date_2": [201912, 201912, 201912],
            "F_end_date_2": [202007, 202007, 202007],
        }
    )

df

Unnamed: 0,id,M_start_date_1,M_end_date_1,M_start_date_2,M_end_date_2,F_start_date_1,F_end_date_1,F_start_date_2,F_end_date_2
0,1,201709,201905,202004,202005,201803,201904,201912,202007
1,2,201709,201905,202004,202005,201803,201904,201912,202007
2,3,201709,201905,202004,202005,201803,201904,201912,202007


In [9]:
df = pd.DataFrame({
    'famid': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'birth': [1, 2, 3, 1, 2, 3, 1, 2, 3],
    'ht1': [2.8, 2.9, 2.2, 2, 1.8, 1.9, 2.2, 2.3, 2.1],
    'ht2': [3.4, 3.8, 2.9, 3.2, 2.8, 2.4, 3.3, 3.4, 2.9]
})

df

Unnamed: 0,famid,birth,ht1,ht2
0,1,1,2.8,3.4
1,1,2,2.9,3.8
2,1,3,2.2,2.9
3,2,1,2.0,3.2
4,2,2,1.8,2.8
5,2,3,1.9,2.4
6,3,1,2.2,3.3
7,3,2,2.3,3.4
8,3,3,2.1,2.9


In [18]:
%%timeit
df.pivot_longer(index=['famid','birth'],
                names_to=('.value', 'age'),
                names_pattern=r"(ht)(\d)")

68.9 ms ± 1.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [20]:
%timeit  pd.wide_to_long(df.reset_index(), stubnames='ht', i=['index','famid', 'birth'], j='age')

46.7 ms ± 2.65 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [16]:
df = pd.concat([df]*1000, ignore_index=True)
df

Unnamed: 0,famid,birth,ht1,ht2
0,1,1,2.8,3.4
1,1,2,2.9,3.8
2,1,3,2.2,2.9
3,2,1,2.0,3.2
4,2,2,1.8,2.8
...,...,...,...,...
8995,2,2,1.8,2.8
8996,2,3,1.9,2.4
8997,3,1,2.2,3.3
8998,3,2,2.3,3.4


Below is a [beautiful solution](https://stackoverflow.com/a/64062027/7175713), from Stack Overflow : 

In [8]:
df1 = df.set_index('id')
df1.columns = df1.columns.str.split('_', expand=True)
df1 = (df1.stack(level=[0,2,3])
          .sort_index(level=[0,1], ascending=[True, False])
          .reset_index(level=[2,3], drop=True)
          .sort_index(axis=1, ascending=False)
          .rename_axis(['id','cod'])
          .reset_index())

df1

Unnamed: 0,id,cod,start,end
0,1,M,201709,201905
1,1,M,202004,202005
2,1,M,201709,201905
3,1,M,202004,202005
4,1,M,201709,201905
...,...,...,...,...
11995,3,F,201912,202007
11996,3,F,201803,201904
11997,3,F,201912,202007
11998,3,F,201803,201904


We propose an alternative, based on [pandas melt](https://pandas.pydata.org/docs/reference/api/pandas.melt.html), that abstracts the reshaping mechanism, allows the user to focus on the task, can be applied to other scenarios,  and is chainable : 

In [7]:
df.pivot_longer(index=["id"], 
        names_to=("cod", ".value"), 
        names_pattern="(M|F)_(start|end)_.+"
    )

Unnamed: 0,id,cod,start,end
0,1,M,201709,201905
1,1,M,202004,202005
2,1,F,201803,201904
3,1,F,201912,202007
4,2,M,201709,201905
...,...,...,...,...
11995,2,F,201912,202007
11996,3,M,201709,201905
11997,3,M,202004,202005
11998,3,F,201803,201904


[pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) is not a new idea; it is a combination of ideas from R's [tidyr](https://tidyr.tidyverse.org/reference/pivot_longer.html) and R's [data.table](https://rdatatable.gitlab.io/data.table/) and is built on  pandas' [stack](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.stack.html) method. 

Do note that the [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) function is designed primarily to work with single indexed dataframes; for MultiIndex dataframes, pandas' `melt` is more than adequate.

Also, the unpivoted dataframe is returned in order of appearance.

[pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) can melt dataframes easily; it replicates the functionality of pandas' [melt](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.melt.html).

In [5]:
df = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},
                   'B': {0: 1, 1: 3, 2: 5},
                   'C': {0: 2, 1: 4, 2: 6}})

df

Unnamed: 0,A,B,C
0,a,1,2
1,b,3,4
2,c,5,6


In [6]:
df.pivot_longer(index='A', column_names='B')

Unnamed: 0,A,variable,value
0,a,B,1
1,b,B,3
2,c,B,5
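
For comparison, here is a plain-pandas call that should give the same frame for this small example. It is only a sketch using `pd.melt` directly, with `id_vars` and `value_vars` standing in for `index` and `column_names`:

```python
import pandas as pd

df = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},
                   'B': {0: 1, 1: 3, 2: 5},
                   'C': {0: 2, 1: 4, 2: 6}})

# id_vars plays the role of `index`, value_vars the role of `column_names`
melted = pd.melt(df, id_vars='A', value_vars='B')
print(melted)
```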


You can dynamically select columns using regular expressions with the `janitor.patterns` function (inspired by the [patterns](https://rdatatable.gitlab.io/data.table/reference/patterns.html) function in R's data.table; it is really just a wrapper around `re.compile`). This is especially handy when there are a lot of column names and you are *lazy* like me 😄

In [7]:
url = 'https://github.com/tidyverse/tidyr/raw/master/data-raw/billboard.csv'
df = pd.read_csv(url)

df

Unnamed: 0,year,artist,track,time,date.entered,wk1,wk2,wk3,wk4,wk5,...,wk67,wk68,wk69,wk70,wk71,wk72,wk73,wk74,wk75,wk76
0,2000,2 Pac,Baby Don't Cry (Keep...,4:22,2000-02-26,87,82.0,72.0,77.0,87.0,...,,,,,,,,,,
1,2000,2Ge+her,The Hardest Part Of ...,3:15,2000-09-02,91,87.0,92.0,,,...,,,,,,,,,,
2,2000,3 Doors Down,Kryptonite,3:53,2000-04-08,81,70.0,68.0,67.0,66.0,...,,,,,,,,,,
3,2000,3 Doors Down,Loser,4:24,2000-10-21,76,76.0,72.0,69.0,67.0,...,,,,,,,,,,
4,2000,504 Boyz,Wobble Wobble,3:35,2000-04-15,57,34.0,25.0,17.0,17.0,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
312,2000,Yankee Grey,Another Nine Minutes,3:10,2000-04-29,86,83.0,77.0,74.0,83.0,...,,,,,,,,,,
313,2000,"Yearwood, Trisha",Real Live Woman,3:55,2000-04-01,85,83.0,83.0,82.0,81.0,...,,,,,,,,,,
314,2000,Ying Yang Twins,Whistle While You Tw...,4:19,2000-03-18,95,94.0,91.0,85.0,84.0,...,,,,,,,,,,
315,2000,Zombie Nation,Kernkraft 400,3:30,2000-09-02,99,99.0,,,,...,,,,,,,,,,


In [8]:
# unpivot all columns that start with 'wk'
df.pivot_longer(column_names = janitor.patterns("^(wk)"), 
                names_to='week')

Unnamed: 0,year,artist,track,time,date.entered,week,value
0,2000,2 Pac,Baby Don't Cry (Keep...,4:22,2000-02-26,wk1,87.0
1,2000,2 Pac,Baby Don't Cry (Keep...,4:22,2000-02-26,wk2,82.0
2,2000,2 Pac,Baby Don't Cry (Keep...,4:22,2000-02-26,wk3,72.0
3,2000,2 Pac,Baby Don't Cry (Keep...,4:22,2000-02-26,wk4,77.0
4,2000,2 Pac,Baby Don't Cry (Keep...,4:22,2000-02-26,wk5,87.0
...,...,...,...,...,...,...,...
24087,2000,matchbox twenty,Bent,4:12,2000-04-29,wk72,
24088,2000,matchbox twenty,Bent,4:12,2000-04-29,wk73,
24089,2000,matchbox twenty,Bent,4:12,2000-04-29,wk74,
24090,2000,matchbox twenty,Bent,4:12,2000-04-29,wk75,
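
Since `janitor.patterns` is described above as a thin wrapper around `re.compile`, the column selection can be pictured with the standard library alone. A minimal sketch, using a hand-picked subset of the billboard column names (the `columns` list is illustrative, not the full header):

```python
import re

# Illustrative subset of the billboard column names
columns = ["year", "artist", "track", "time", "date.entered", "wk1", "wk2", "wk76"]

# Compile once, then keep every column name the pattern matches
pattern = re.compile(r"^(wk)")
week_columns = [name for name in columns if pattern.search(name)]
print(week_columns)
```

Every name that matches is unpivoted; the remaining columns are carried along as identifiers.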


[pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) can also unpivot paired columns. Let's look at an example from pandas' [wide_to_long](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.wide_to_long.html) docs : 

In [9]:
df = pd.DataFrame({
    'famid': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'birth': [1, 2, 3, 1, 2, 3, 1, 2, 3],
    'ht1': [2.8, 2.9, 2.2, 2, 1.8, 1.9, 2.2, 2.3, 2.1],
    'ht2': [3.4, 3.8, 2.9, 3.2, 2.8, 2.4, 3.3, 3.4, 2.9]
})

df

Unnamed: 0,famid,birth,ht1,ht2
0,1,1,2.8,3.4
1,1,2,2.9,3.8
2,1,3,2.2,2.9
3,2,1,2.0,3.2
4,2,2,1.8,2.8
5,2,3,1.9,2.4
6,3,1,2.2,3.3
7,3,2,2.3,3.4
8,3,3,2.1,2.9


In the data above, the `height` (ht) is paired with `age` (numbers). Let's see how [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) handles this when called with `index=['famid', 'birth']`, `names_to=('.value', 'age')` and `names_pattern=r"(ht)(\d)"`.

When `.value` is used in `names_to`, a pairing is created between ``names_to`` and ``names_pattern``. For the example above, we get this pairing :

                                          {".value": ("ht"), "age": (\d)} 

This tells the [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) function to keep values associated with `.value`(`ht`) as the column name, while values not associated with `.value`, in this case, the numbers, will be collated under a new column ``age``. Internally, pandas `str.extractall` is used to get the capturing groups before reshaping. This level of abstraction, we believe, allows the user to focus on the task, and get things done faster.
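
To make the pairing concrete, here is a rough pandas-only sketch of the same idea. It is not pyjanitor's implementation (it uses `str.extract` rather than `str.extractall`, and the helper column name `stub` is made up for illustration), and the rows come out sorted rather than in order of appearance:

```python
import pandas as pd

df = pd.DataFrame({
    'famid': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'birth': [1, 2, 3, 1, 2, 3, 1, 2, 3],
    'ht1': [2.8, 2.9, 2.2, 2, 1.8, 1.9, 2.2, 2.3, 2.1],
    'ht2': [3.4, 3.8, 2.9, 3.2, 2.8, 2.4, 3.3, 3.4, 2.9]
})

# Step 1: melt everything that is not an index column
long = df.melt(id_vars=['famid', 'birth'])

# Step 2: split each old column name into its capture groups
groups = long['variable'].str.extract(r"(ht)(\d)")
long['stub'] = groups[0]   # the part paired with `.value`
long['age'] = groups[1]    # the part paired with `age`

# Step 3: spread the `.value` part back out into column headers
out = (long.set_index(['famid', 'birth', 'age', 'stub'])['value']
           .unstack('stub')
           .rename_axis(columns=None)
           .reset_index())
print(out.head())
```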

[pd.wide_to_long](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.wide_to_long.html) handles this already, so why bother? Let's look at another scenario where [pd.wide_to_long](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.wide_to_long.html) would need a few more steps. [Source Data](https://community.rstudio.com/t/pivot-longer-on-multiple-column-sets-pairs/43958) :



In [10]:
df = pd.DataFrame(
    {
        "off_loc": ["A", "B", "C", "D", "E", "F"],
        "pt_loc": ["G", "H", "I", "J", "K", "L"],
        "pt_lat": [
            100.07548220000001,
            75.191326,
            122.65134479999999,
            124.13553329999999,
            124.13553329999999,
            124.01028909999998,
        ],
        "off_lat": [
            121.271083,
            75.93845266,
            135.043791,
            134.51128400000002,
            134.484374,
            137.962195,
        ],
        "pt_long": [
            4.472089953,
            -144.387785,
            -40.45611048,
            -46.07156181,
            -46.07156181,
            -46.01594293,
        ],
        "off_long": [
            -7.188632000000001,
            -143.2288569,
            21.242563,
            40.937416999999996,
            40.78472,
            22.905889000000002,
        ],
    }
)

df

Unnamed: 0,off_loc,pt_loc,pt_lat,off_lat,pt_long,off_long
0,A,G,100.075482,121.271083,4.47209,-7.188632
1,B,H,75.191326,75.938453,-144.387785,-143.228857
2,C,I,122.651345,135.043791,-40.45611,21.242563
3,D,J,124.135533,134.511284,-46.071562,40.937417
4,E,K,124.135533,134.484374,-46.071562,40.78472
5,F,L,124.010289,137.962195,-46.015943,22.905889


We can unpivot with [pd.wide_to_long](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.wide_to_long.html) by first reorganising the columns : 

In [11]:
df1 = df.copy()
df1.columns = ["_".join(col.split("_")[::-1])
               for col in df1.columns]
df1

Unnamed: 0,loc_off,loc_pt,lat_pt,lat_off,long_pt,long_off
0,A,G,100.075482,121.271083,4.47209,-7.188632
1,B,H,75.191326,75.938453,-144.387785,-143.228857
2,C,I,122.651345,135.043791,-40.45611,21.242563
3,D,J,124.135533,134.511284,-46.071562,40.937417
4,E,K,124.135533,134.484374,-46.071562,40.78472
5,F,L,124.010289,137.962195,-46.015943,22.905889


Now, we can unpivot : 

In [12]:
pd.wide_to_long(
    df1.reset_index(),
    stubnames=["loc", "lat", "long"],
    sep="_",
    i="index",
    j="set",
    suffix=".+",
)

Unnamed: 0_level_0,Unnamed: 1_level_0,loc,lat,long
index,set,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
0,off,A,121.271083,-7.188632
0,pt,G,100.075482,4.47209
1,off,B,75.938453,-143.228857
1,pt,H,75.191326,-144.387785
2,off,C,135.043791,21.242563
2,pt,I,122.651345,-40.45611
3,off,D,134.511284,40.937417
3,pt,J,124.135533,-46.071562
4,off,E,134.484374,40.78472
4,pt,K,124.135533,-46.071562


Notice that we had to reset the index of the dataframe to get a unique identifier for each row. We can abstract all that, using [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) :

In [13]:
df.pivot_longer(names_to=["set", ".value"], 
                names_pattern="(.+)_(.+)")


Unnamed: 0,set,loc,lat,long
0,off,A,121.271,-7.18863
1,pt,G,100.075,4.47209
2,off,B,75.9385,-143.229
3,pt,H,75.1913,-144.388
4,off,C,135.044,21.2426
5,pt,I,122.651,-40.4561
6,off,D,134.511,40.9374
7,pt,J,124.136,-46.0716
8,off,E,134.484,40.7847
9,pt,K,124.136,-46.0716


In [14]:
# Another way to see the pairings, 
# to see what is linked to `.value`, 

# names_to =     ["set", ".value"]
# names_pattern = "(.+)_(.+)"
# column_names =    off_loc
#                   off_lat
#                   off_long

Again, the key here is the `.value` symbol. Pairing `names_to` with `names_pattern` and its results from [pd.Series.str.extractall](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.extractall.html), we get : 

                            set--> (.+) --> [off, pt] and 
                            .value--> (.+) --> [loc, lat, long] 
                                           
All values associated with `.value`(loc, lat, long) remain as column names, while values not associated with `.value`(off, pt) are lumped into a new column ``set``. 
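
The pairing can be reproduced with the standard library, which may help when debugging a pattern. This sketch only shows what each capture group picks up from the column names; the variable names are illustrative:

```python
import re

names_to = ["set", ".value"]
names_pattern = re.compile(r"(.+)_(.+)")
columns = ["off_loc", "pt_loc", "pt_lat", "off_lat", "pt_long", "off_long"]

# Pair every capture group with the name in the same position of names_to
pairings = [dict(zip(names_to, names_pattern.fullmatch(col).groups()))
            for col in columns]

# Unique values per name, in order of first appearance
set_values = list(dict.fromkeys(p["set"] for p in pairings))
value_columns = list(dict.fromkeys(p[".value"] for p in pairings))
print(set_values)      # ['off', 'pt']
print(value_columns)   # ['loc', 'lat', 'long']
```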

Notice that we did not have to reset the index - [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) takes care of that internally;  [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) allows you to focus on what you want, so you can get it and move on.


Let's look at another example, from [Stack Overflow](https://stackoverflow.com/questions/45123924/convert-pandas-dataframe-from-wide-to-long/45124130) : 

In [15]:
df = pd.DataFrame([{'a_1': 2, 'ab_1': 3, 
                    'ac_1': 4, 'a_2': 5, 
                    'ab_2': 6, 'ac_2': 7}])
df

Unnamed: 0,a_1,ab_1,ac_1,a_2,ab_2,ac_2
0,2,3,4,5,6,7


The data above requires extracting `a`, `ab` and `ac` from `1` and `2`. This is another example of a paired column. We could solve this using [pd.wide_to_long](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.wide_to_long.html); in fact, there is a very good solution from [Stack Overflow](https://stackoverflow.com/a/45124775/7175713):

In [16]:
df1 = df.copy()
df1['id'] = df1.index
pd.wide_to_long(df1, ['a','ab','ac'],i='id',j='num',sep='_')

Unnamed: 0_level_0,Unnamed: 1_level_0,a,ab,ac
id,num,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
0,1,2,3,4
0,2,5,6,7


Or you could simply pass the buck to [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) : 

In [17]:
df.pivot_longer(names_to=('.value','num'), names_sep='_')

Unnamed: 0,num,a,ab,ac
0,1,2,3,4
1,2,5,6,7


In the solution above, we used the `names_sep` argument, as it is more convenient. A few more examples to get you familiar with the `.value` symbol.

[Source Data](https://stackoverflow.com/questions/55403008/pandas-partial-melt-or-group-melt)

In [18]:
# 
df = pd.DataFrame([[1,1,2,3,4,5,6],
                   [2,7,8,9,10,11,12]], 
                  columns=['id', 'ax','ay','az','bx','by','bz'])

df

Unnamed: 0,id,ax,ay,az,bx,by,bz
0,1,1,2,3,4,5,6
1,2,7,8,9,10,11,12


In [19]:
df.pivot_longer(index='id', 
                names_to=('name','.value'), 
                names_pattern='(.)(.)')


Unnamed: 0,id,name,x,y,z
0,1,a,1,2,3
1,1,b,4,5,6
2,2,a,7,8,9
3,2,b,10,11,12


For the code above `.value` is paired with `x`, `y`, `z`(which become the new column names), while `a`, `b` are unpivoted into the `name` column. 

In the dataframe below, we need to unpivot the data, keeping only the suffix `hi`, and pulling out the number between `A` and `g`. [Source Data](https://stackoverflow.com/questions/35929985/melt-a-data-table-with-a-column-pattern)

In [20]:
df = pd.DataFrame([{'id': 1, 'A1g_hi': 2, 
                    'A2g_hi': 3, 'A3g_hi': 4, 
                    'A4g_hi': 5}])
df

Unnamed: 0,id,A1g_hi,A2g_hi,A3g_hi,A4g_hi
0,1,2,3,4,5


In [21]:
df.pivot_longer('id', 
                names_to=['time','.value'], 
                names_pattern=r"A(\d)g_(hi)")


Unnamed: 0,id,time,hi
0,1,1,2
1,1,2,3
2,1,3,4
3,1,4,5


Let's see an example where we have multiple values in a paired column, and we want them in separate columns. [Source Data](https://stackoverflow.com/questions/64107566/how-to-pivot-longer-and-populate-with-fields-from-column-names-at-the-same-tim?noredirect=1#comment113369419_64107566) : 

In [22]:
df = pd.DataFrame(
    {
        "Sony | TV | Model | value": {0: "A222", 1: "A234", 2: "A4345"},
        "Sony | TV | Quantity | value": {0: 5, 1: 5, 2: 4},
        "Sony | TV | Max-quant | value": {0: 10, 1: 9, 2: 9},
        "Panasonic | TV | Model | value": {0: "T232", 1: "S3424", 2: "X3421"},
        "Panasonic | TV | Quantity | value": {0: 1, 1: 5, 2: 1},
        "Panasonic | TV | Max-quant | value": {0: 10, 1: 12, 2: 11},
        "Sanyo | Radio | Model | value": {0: "S111", 1: "S1s1", 2: "S1s2"},
        "Sanyo | Radio | Quantity | value": {0: 4, 1: 2, 2: 4},
        "Sanyo | Radio | Max-quant | value": {0: 9, 1: 9, 2: 10},
    }
)

df

Unnamed: 0,Sony | TV | Model | value,Sony | TV | Quantity | value,Sony | TV | Max-quant | value,Panasonic | TV | Model | value,Panasonic | TV | Quantity | value,Panasonic | TV | Max-quant | value,Sanyo | Radio | Model | value,Sanyo | Radio | Quantity | value,Sanyo | Radio | Max-quant | value
0,A222,5,10,T232,1,10,S111,4,9
1,A234,5,9,S3424,5,12,S1s1,2,9
2,A4345,4,9,X3421,1,11,S1s2,4,10


The goal is to reshape the data into long format, with separate columns for `Manufacturer`(Sony,...), `Device`(TV, Radio), `Model`(S3424, ...), ``maximum quantity`` and ``quantity``. 

Below is the [accepted solution](https://stackoverflow.com/a/64107688/7175713) on Stack Overflow :

In [23]:
df1 = df.copy()
# Create a multiIndex column header
df1.columns = pd.MultiIndex.from_arrays(
    zip(*df1.columns.str.split(r"\s?\|\s?"))
)

# Reshape the dataframe using 
# `set_index`, `droplevel`, and `stack`
(df1.stack([0, 1])
 .droplevel(1, axis=1)
 .set_index("Model", append=True)
 .rename_axis([None, "Manufacturer", "Device", "Model"])
 .sort_index(level=[1, 2, 3])
 .reset_index()
 .drop("level_0", axis=1)
 )


Unnamed: 0,Manufacturer,Device,Model,Max-quant,Quantity
0,Panasonic,TV,S3424,12.0,5.0
1,Panasonic,TV,T232,10.0,1.0
2,Panasonic,TV,X3421,11.0,1.0
3,Sanyo,Radio,S111,9.0,4.0
4,Sanyo,Radio,S1s1,9.0,2.0
5,Sanyo,Radio,S1s2,10.0,4.0
6,Sony,TV,A222,10.0,5.0
7,Sony,TV,A234,9.0,5.0
8,Sony,TV,A4345,9.0,4.0


Or, we could use [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer), along with `.value` in `names_to` and a regular expression in `names_pattern` : 

In [24]:
result = (df
         .pivot_longer(
             names_to=("Manufacturer", "Device", ".value"),
             names_pattern=r"(.+)\|(.+)\|(.+)\|.*")
        )

result


Unnamed: 0,Manufacturer,Device,Model,Quantity,Max-quant
0,Sony,TV,A222,5,10
1,Panasonic,TV,T232,1,10
2,Sanyo,Radio,S111,4,9
3,Sony,TV,A234,5,9
4,Panasonic,TV,S3424,5,12
5,Sanyo,Radio,S1s1,2,9
6,Sony,TV,A4345,4,9
7,Panasonic,TV,X3421,1,11
8,Sanyo,Radio,S1s2,4,10


The cleanup (removal of whitespace in the column names) is left as an exercise for the reader.
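
One possible way to approach that cleanup is to let the regular expression swallow the padding around each pipe. The sketch below uses only the standard library and three of the column names; it shows what the adjusted pattern captures, and whether passing it to `names_pattern` gives a fully clean result is an assumption left untested here:

```python
import re

columns = [
    "Sony | TV | Model | value",
    "Panasonic | TV | Max-quant | value",
    "Sanyo | Radio | Quantity | value",
]

# Lazy groups plus \s*\|\s* keep the padding around each pipe out of the captures
pattern = re.compile(r"(.+?)\s*\|\s*(.+?)\s*\|\s*(.+?)\s*\|.*")
parts = [pattern.match(col).groups() for col in columns]
print(parts)
```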

One more example on the `.value` symbol for paired columns [Source Data](https://stackoverflow.com/questions/59477686/python-pandas-melt-single-column-into-two-seperate) : 

In [25]:
df = pd.DataFrame({'id': [1, 2], 
                   'A_value': [50, 33], 
                   'D_value': [60, 45]})
df

Unnamed: 0,id,A_value,D_value
0,1,50,60
1,2,33,45


In [26]:
df.pivot_longer('id', 
                names_to=('value_type', '.value'), 
                names_sep='_')

Unnamed: 0,id,value_type,value
0,1,A,50
1,1,D,60
2,2,A,33
3,2,D,45


There are scenarios where we need to unpivot the data, but only keep some specific names in the unpivoted data. Let's see an example below: [Source Data](https://stackoverflow.com/questions/59550804/melt-column-by-substring-of-the-columns-name-in-pandas-python)

In [27]:
df = pd.DataFrame({'subject': [1, 2],
                   'A_target_word_gd': [1, 11],
                   'A_target_word_fd': [2, 12],
                   'B_target_word_gd': [3, 13],
                   'B_target_word_fd': [4, 14],
                   'subject_type': ['mild', 'moderate']})

df

Unnamed: 0,subject,A_target_word_gd,A_target_word_fd,B_target_word_gd,B_target_word_fd,subject_type
0,1,1,2,3,4,mild
1,2,11,12,13,14,moderate


In the dataframe above, `A` and `B` represent conditions, while the suffixes `gd` and `fd` represent value types. We are not interested in the words in the middle (`_target_word`). We could solve it this way (this is the chosen solution, copied from [Stack Overflow](https://stackoverflow.com/a/59550967/7175713)) : 

In [28]:
new_df =(pd.melt(df,
                id_vars=['subject_type','subject'], 
                var_name='abc')
           .sort_values(by=['subject', 'subject_type'])
         )
new_df['cond']=(new_df['abc']
                .apply(lambda x: (x.split('_'))[0])
                )
new_df['value_type']=(new_df
                      .pop('abc')
                      .apply(lambda x: (x.split('_'))[-1])
                      )
new_df


Unnamed: 0,subject_type,subject,value,cond,value_type
0,mild,1,1,A,gd
2,mild,1,2,A,fd
4,mild,1,3,B,gd
6,mild,1,4,B,fd
1,moderate,2,11,A,gd
3,moderate,2,12,A,fd
5,moderate,2,13,B,gd
7,moderate,2,14,B,fd


Or, we could just pass the buck to [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) : 

In [29]:
df.pivot_longer(
    index=["subject", "subject_type"],
    names_to=("cond", "value_type"),
    names_pattern="([A-Z]).*(gd|fd)",
)



Unnamed: 0,subject,subject_type,cond,value_type,value
0,1,mild,A,gd,1.0
1,1,mild,A,fd,2.0
2,1,mild,B,gd,3.0
3,1,mild,B,fd,4.0
4,2,moderate,A,gd,11.0
5,2,moderate,A,fd,12.0
6,2,moderate,B,gd,13.0
7,2,moderate,B,fd,14.0


In the above, we pass in the new names of the columns to `names_to`('cond', 'value_type'), and pass the groups to be extracted as a regular expression to `names_pattern`. 
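
As a point of reference, the same extraction can be sketched with `pd.melt` and `str.extract`. This is an illustration of the idea, not pyjanitor's internals; the temporary column name `column` is made up, and the row order differs from the output above because `melt` goes column by column:

```python
import pandas as pd

df = pd.DataFrame({'subject': [1, 2],
                   'A_target_word_gd': [1, 11],
                   'A_target_word_fd': [2, 12],
                   'B_target_word_gd': [3, 13],
                   'B_target_word_fd': [4, 14],
                   'subject_type': ['mild', 'moderate']})

long = df.melt(id_vars=['subject', 'subject_type'], var_name='column')

# Each capture group becomes one new column, named in the same order as names_to
parts = long['column'].str.extract(r"([A-Z]).*(gd|fd)")
parts.columns = ['cond', 'value_type']

out = pd.concat([long[['subject', 'subject_type']], parts, long[['value']]], axis=1)
print(out)
```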

Here's another example where [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) abstracts the process and makes reshaping easier.


In the dataframe below, we would like to unpivot the data and separate the column names into individual columns (`vault` should be an `event` column, `2012` should be a `year` column and `f` should be a `gender` column). [Source Data](https://dcl-wrangle.stanford.edu/pivot-advanced.html)

In [30]:
df = pd.DataFrame(
            {
                "country": ["United States", "Russia", "China"],
                "vault_2012_f": [
                    48.132,
                    46.36600000000001,
                    44.266000000000005,
                ],
                "vault_2012_m": [46.632, 46.86600000000001, 48.316],
                "vault_2016_f": [
                    46.86600000000001,
                    45.733000000000004,
                    44.332,
                ],
                "vault_2016_m": [45.865, 46.033, 45.0],
                "floor_2012_f": [45.36600000000001, 41.599, 40.833],
                "floor_2012_m": [45.266000000000005, 45.308, 45.133],
                "floor_2016_f": [45.998999999999995, 42.032, 42.066],
                "floor_2016_m": [43.757, 44.766000000000005, 43.799],
            }
        )
df


Unnamed: 0,country,vault_2012_f,vault_2012_m,vault_2016_f,vault_2016_m,floor_2012_f,floor_2012_m,floor_2016_f,floor_2016_m
0,United States,48.132,46.632,46.866,45.865,45.366,45.266,45.999,43.757
1,Russia,46.366,46.866,45.733,46.033,41.599,45.308,42.032,44.766
2,China,44.266,48.316,44.332,45.0,40.833,45.133,42.066,43.799


We could achieve this with a combination of [pd.melt](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.melt.html) and pandas string methods (or janitor's [deconcatenate_column](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.deconcatenate_column.html#janitor.deconcatenate_column) function); or we could, again, pass the buck to [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) : 

In [31]:
df.pivot_longer(
    index="country",
    names_to=["event", "year", "gender"],
    names_sep="_",
    values_to="score",
)

Unnamed: 0,country,event,year,gender,score
0,United States,vault,2012,f,48.132
1,United States,vault,2012,m,46.632
2,United States,vault,2016,f,46.866
3,United States,vault,2016,m,45.865
4,United States,floor,2012,f,45.366
5,United States,floor,2012,m,45.266
6,United States,floor,2016,f,45.999
7,United States,floor,2016,m,43.757
8,Russia,vault,2012,f,46.366
9,Russia,vault,2012,m,46.866


One more feature that [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) offers is to pass a list of regular expressions to `names_pattern`. This comes in handy when one single regex cannot encapsulate similar columns for reshaping to long form. This idea is inspired by the [melt](https://rdatatable.gitlab.io/data.table/reference/melt.data.table.html) function in R's [data.table](https://rdatatable.gitlab.io/data.table/). A couple of examples should make this clear.

[Source Data](https://stackoverflow.com/questions/61138600/tidy-dataset-with-pivot-longer-multiple-columns-into-two-columns)

In [32]:
df = pd.DataFrame(
    [{'title': 'Avatar',
  'actor_1': 'CCH_Pound…',
  'actor_2': 'Joel_Davi…',
  'actor_3': 'Wes_Studi',
  'actor_1_FB_likes': 1000,
  'actor_2_FB_likes': 936,
  'actor_3_FB_likes': 855},
 {'title': 'Pirates_of_the_Car…',
  'actor_1': 'Johnny_De…',
  'actor_2': 'Orlando_B…',
  'actor_3': 'Jack_Daven…',
  'actor_1_FB_likes': 40000,
  'actor_2_FB_likes': 5000,
  'actor_3_FB_likes': 1000},
 {'title': 'The_Dark_Knight_Ri…',
  'actor_1': 'Tom_Hardy',
  'actor_2': 'Christian…',
  'actor_3': 'Joseph_Gor…',
  'actor_1_FB_likes': 27000,
  'actor_2_FB_likes': 23000,
  'actor_3_FB_likes': 23000},
 {'title': 'John_Carter',
  'actor_1': 'Daryl_Sab…',
  'actor_2': 'Samantha_…',
  'actor_3': 'Polly_Walk…',
  'actor_1_FB_likes': 640,
  'actor_2_FB_likes': 632,
  'actor_3_FB_likes': 530},
 {'title': 'Spider-Man_3',
  'actor_1': 'J.K._Simm…',
  'actor_2': 'James_Fra…',
  'actor_3': 'Kirsten_Du…',
  'actor_1_FB_likes': 24000,
  'actor_2_FB_likes': 11000,
  'actor_3_FB_likes': 4000},
 {'title': 'Tangled',
  'actor_1': 'Brad_Garr…',
  'actor_2': 'Donna_Mur…',
  'actor_3': 'M.C._Gainey',
  'actor_1_FB_likes': 799,
  'actor_2_FB_likes': 553,
  'actor_3_FB_likes': 284}]
)

df

Unnamed: 0,title,actor_1,actor_2,actor_3,actor_1_FB_likes,actor_2_FB_likes,actor_3_FB_likes
0,Avatar,CCH_Pound…,Joel_Davi…,Wes_Studi,1000,936,855
1,Pirates_of_the_Car…,Johnny_De…,Orlando_B…,Jack_Daven…,40000,5000,1000
2,The_Dark_Knight_Ri…,Tom_Hardy,Christian…,Joseph_Gor…,27000,23000,23000
3,John_Carter,Daryl_Sab…,Samantha_…,Polly_Walk…,640,632,530
4,Spider-Man_3,J.K._Simm…,James_Fra…,Kirsten_Du…,24000,11000,4000
5,Tangled,Brad_Garr…,Donna_Mur…,M.C._Gainey,799,553,284


Above, we have a dataframe of movie titles, actors, and their Facebook likes. It would be great if we could transform this into a long form, with just the title, the actor names, and the number of likes. Let's look at a possible solution : 

First, we reshape the columns, so that the numbers appear at the end.

In [33]:
df1 = df.copy()
pat = r"(?P<actor>.+)_(?P<num>\d)_(?P<likes>.+)"
repl = lambda m: f"""{m.group('actor')}_{m.group('likes')}_{m.group('num')}"""
# regex=True is needed for a callable replacement in recent pandas versions
df1.columns = df1.columns.str.replace(pat, repl, regex=True)
df1

Unnamed: 0,title,actor_1,actor_2,actor_3,actor_FB_likes_1,actor_FB_likes_2,actor_FB_likes_3
0,Avatar,CCH_Pound…,Joel_Davi…,Wes_Studi,1000,936,855
1,Pirates_of_the_Car…,Johnny_De…,Orlando_B…,Jack_Daven…,40000,5000,1000
2,The_Dark_Knight_Ri…,Tom_Hardy,Christian…,Joseph_Gor…,27000,23000,23000
3,John_Carter,Daryl_Sab…,Samantha_…,Polly_Walk…,640,632,530
4,Spider-Man_3,J.K._Simm…,James_Fra…,Kirsten_Du…,24000,11000,4000
5,Tangled,Brad_Garr…,Donna_Mur…,M.C._Gainey,799,553,284


Now, we can reshape, using `pd.wide_to_long` :

In [34]:
pd.wide_to_long(df1, 
        stubnames=['actor', 'actor_FB_likes'], 
        i='title', j='group', 
        sep='_'
    )

Unnamed: 0_level_0,Unnamed: 1_level_0,actor,actor_FB_likes
title,group,Unnamed: 2_level_1,Unnamed: 3_level_1
Avatar,1,CCH_Pound…,1000
Pirates_of_the_Car…,1,Johnny_De…,40000
The_Dark_Knight_Ri…,1,Tom_Hardy,27000
John_Carter,1,Daryl_Sab…,640
Spider-Man_3,1,J.K._Simm…,24000
Tangled,1,Brad_Garr…,799
Avatar,2,Joel_Davi…,936
Pirates_of_the_Car…,2,Orlando_B…,5000
The_Dark_Knight_Ri…,2,Christian…,23000
John_Carter,2,Samantha_…,632


We could attempt to solve it with [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer), using the `.value` symbol : 

In [35]:
df1.pivot_longer(index='title', 
        names_to=(".value","group"), 
        names_pattern=r"(.+)_(\d)$"
    )


Unnamed: 0,title,group,actor,actor_FB_likes
0,Avatar,1,CCH_Pound…,1000
1,Avatar,2,Joel_Davi…,936
2,Avatar,3,Wes_Studi,855
3,Pirates_of_the_Car…,1,Johnny_De…,40000
4,Pirates_of_the_Car…,2,Orlando_B…,5000
5,Pirates_of_the_Car…,3,Jack_Daven…,1000
6,The_Dark_Knight_Ri…,1,Tom_Hardy,27000
7,The_Dark_Knight_Ri…,2,Christian…,23000
8,The_Dark_Knight_Ri…,3,Joseph_Gor…,23000
9,John_Carter,1,Daryl_Sab…,640


What if we could just get our data in long form without the massaging? We know our data has a pattern to it: it either ends in a number or in *likes*. Can't we take advantage of that? Yes, we can (I know, I know; it sounds like a campaign slogan 🤪)

In [36]:
df.pivot_longer(index='title',
        names_to=("actor", "num_likes"),
        names_pattern=(r'\d$', r'likes$'),
    )

Unnamed: 0,title,actor,num_likes
0,Avatar,CCH_Pound…,1000
1,Avatar,Joel_Davi…,936
2,Avatar,Wes_Studi,855
3,Pirates_of_the_Car…,Johnny_De…,40000
4,Pirates_of_the_Car…,Orlando_B…,5000
5,Pirates_of_the_Car…,Jack_Daven…,1000
6,The_Dark_Knight_Ri…,Tom_Hardy,27000
7,The_Dark_Knight_Ri…,Christian…,23000
8,The_Dark_Knight_Ri…,Joseph_Gor…,23000
9,John_Carter,Daryl_Sab…,640


A pairing of `names_to` and `names_pattern` results in :

                                   {"actor": '\d$', "num_likes": 'likes$'}
                                   
The first regex looks for columns that end with a number, while the other looks for columns that end with *likes*. [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) collects all the values from columns that end with a number under the `actor` column, and combines all the values from columns that end with *likes* into a new column, `num_likes`. Under the hood, [numpy select](https://numpy.org/doc/stable/reference/generated/numpy.select.html) and [pd.Series.str.contains](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.contains.html) are used to pull apart the columns into the new columns. 
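
The labelling step can be sketched with those two building blocks. This is an illustration of the idea rather than pyjanitor's actual code; the `"unmatched"` default is a made-up placeholder for columns that match none of the patterns:

```python
import numpy as np
import pandas as pd

columns = pd.Series(["actor_1", "actor_2", "actor_3",
                     "actor_1_FB_likes", "actor_2_FB_likes", "actor_3_FB_likes"])
names_to = ["actor", "num_likes"]
names_pattern = [r"\d$", r"likes$"]

# One boolean mask per regex, checked in order
masks = [columns.str.contains(pat).to_numpy() for pat in names_pattern]

# Label each column with the name paired to the first regex it matches
labels = np.select(masks, names_to, default="unmatched")
print(dict(zip(columns, labels)))
```

Columns that share a label are then stacked into a single output column.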

Again, it is about the goal; we are not interested in the numbers (1, 2, 3), we only need the names of the actors and their Facebook likes. [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) aims to give as much flexibility as possible, in addition to ease of use, to allow the end user to focus on the task. 

Let's take a look at another example. [Source Data](https://stackoverflow.com/questions/60439749/pair-wise-melt-in-pandas-dataframe) :

In [37]:
df = pd.DataFrame({'id': [0, 1],
 'Name': ['ABC', 'XYZ'],
 'code': [1, 2],
 'code1': [4, np.nan],
 'code2': ['8', 5],
 'type': ['S', 'R'],
 'type1': ['E', np.nan],
 'type2': ['T', 'U']})

df

Unnamed: 0,id,Name,code,code1,code2,type,type1,type2
0,0,ABC,1,4.0,8,S,E,T
1,1,XYZ,2,,5,R,,U


We cannot directly use [pd.wide_to_long](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.wide_to_long.html) here without some massaging, as there is no definite suffix (the first `code` does not have a suffix); neither can we use `.value` here, again because there is no suffix. However, we can see a pattern where some columns start with `code`, and others start with `type`. Let's see how [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) solves this, using a sequence of regular expressions in the `names_pattern` argument :

In [38]:
df.pivot_longer(index=["id", "Name"],
        names_to=("code_all", "type_all"), 
        names_pattern=("^code", "^type")
    )

Unnamed: 0,id,Name,code_all,type_all
0,0,ABC,1,S
1,0,ABC,4,E
2,0,ABC,8,T
3,1,XYZ,2,R
4,1,XYZ,5,U


The key here is passing the right regular expressions, and ensuring each name in `names_to` is paired with the right regex in `names_pattern`; as such, every column that starts with `code` will be included in the new `code_all` column, and every column that starts with `type` ends up in `type_all`. Easy and flexible, right?
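
Before pairing names with regexes, it can help to check which columns each regex actually captures. Here is a small sketch using `DataFrame.filter`, which is plain pandas and separate from pivot_longer; the variable names `code_cols` and `type_cols` are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [0, 1],
                   'Name': ['ABC', 'XYZ'],
                   'code': [1, 2],
                   'code1': [4, np.nan],
                   'code2': ['8', 5],
                   'type': ['S', 'R'],
                   'type1': ['E', np.nan],
                   'type2': ['T', 'U']})

# Columns whose names start with "code" and with "type", respectively.
code_cols = df.filter(regex="^code").columns.tolist()
type_cols = df.filter(regex="^type").columns.tolist()

print(code_cols)  # ['code', 'code1', 'code2']
print(type_cols)  # ['type', 'type1', 'type2']
```

If a regex picks up a column it should not, or misses one, it is easier to spot here than in the reshaped output.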

Let's explore another example, from [Stack Overflow](https://stackoverflow.com/questions/12466493/reshaping-multiple-sets-of-measurement-columns-wide-format-into-single-columns) :

In [39]:
df = pd.DataFrame(
            [
                {
                    "ID": 1,
                    "DateRange1Start": "1/1/90",
                    "DateRange1End": "3/1/90",
                    "Value1": 4.4,
                    "DateRange2Start": "4/5/91",
                    "DateRange2End": "6/7/91",
                    "Value2": 6.2,
                    "DateRange3Start": "5/5/95",
                    "DateRange3End": "6/6/96",
                    "Value3": 3.3,
                }
            ])

df

Unnamed: 0,ID,DateRange1Start,DateRange1End,Value1,DateRange2Start,DateRange2End,Value2,DateRange3Start,DateRange3End,Value3
0,1,1/1/90,3/1/90,4.4,4/5/91,6/7/91,6.2,5/5/95,6/6/96,3.3


In the dataframe above, we need to reshape the data to have a start date, an end date and a value. For the `DateRange` columns, the numbers are embedded within the column name, while for the `Value` columns the number is appended at the end. One possible solution is to rename the columns so that the numbers are at the end :

In [40]:
df1 = df.copy()
pat = r"(?P<head>.+)(?P<num>\d)(?P<tail>.+)"
repl = lambda m: f"""{m.group('head')}{m.group('tail')}{m.group('num')}"""
df1.columns = df1.columns.str.replace(pat, repl, regex=True)
df1

Unnamed: 0,ID,DateRangeStart1,DateRangeEnd1,Value1,DateRangeStart2,DateRangeEnd2,Value2,DateRangeStart3,DateRangeEnd3,Value3
0,1,1/1/90,3/1/90,4.4,4/5/91,6/7/91,6.2,5/5/95,6/6/96,3.3


Now, we can unpivot:

In [41]:
pd.wide_to_long(df1, 
        stubnames=['DateRangeStart', 'DateRangeEnd', 'Value'],
         i='ID', 
         j='num'
    )

Unnamed: 0_level_0,Unnamed: 1_level_0,DateRangeStart,DateRangeEnd,Value
ID,num,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
1,1,1/1/90,3/1/90,4.4
1,2,4/5/91,6/7/91,6.2
1,3,5/5/95,6/6/96,3.3


Using the `.value` symbol in pivot_longer:

In [42]:
df1.pivot_longer(index='ID', 
        names_to=[".value",'num'], 
        names_pattern=r"(.+)(\d)$"
    )

Unnamed: 0,ID,num,DateRangeStart,DateRangeEnd,Value
0,1,1,1/1/90,3/1/90,4.4
1,1,2,4/5/91,6/7/91,6.2
2,1,3,5/5/95,6/6/96,3.3


Since we are not interested in the numbers, we can rewrite our code above :

In [43]:
df1.pivot_longer(index='ID', 
        names_to=".value", 
        names_pattern=r"(.+)\d$"
    )

Unnamed: 0,ID,DateRangeStart,DateRangeEnd,Value
0,1,1/1/90,3/1/90,4.4
1,1,4/5/91,6/7/91,6.2
2,1,5/5/95,6/6/96,3.3


Or, we could let pivot_longer worry about the massaging; simply pass to `names_pattern` a list of regular expressions that match what we are after :

In [44]:
df.pivot_longer(index='ID', 
        names_to=("DateRangeStart", "DateRangeEnd", "Value"), 
        names_pattern=("Start$", "End$", "^Value")
    )

Unnamed: 0,ID,DateRangeStart,DateRangeEnd,Value
0,1,1/1/90,3/1/90,4.4
1,1,4/5/91,6/7/91,6.2
2,1,5/5/95,6/6/96,3.3


The code above looks for columns that end with *Start* (`Start$`) and collects their values into the `DateRangeStart` column; it then looks for columns that end with *End* (`End$`) and collects their values into the `DateRangeEnd` column; finally, it looks for columns that start with *Value* (`^Value`) and collects their values into the `Value` column. Just know the patterns, and pair them accordingly. Again, the goal is a focus on the task, to make it simple for the end user.
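
To make the pairing concrete, here is a rough plain-pandas equivalent of the steps just described. It is a sketch, not how pivot_longer is implemented, and it assumes every regex matches the same number of columns; the `groups` dictionary and the `out` name are made up for the illustration.

```python
import pandas as pd

df = pd.DataFrame([{
    "ID": 1,
    "DateRange1Start": "1/1/90", "DateRange1End": "3/1/90", "Value1": 4.4,
    "DateRange2Start": "4/5/91", "DateRange2End": "6/7/91", "Value2": 6.2,
    "DateRange3Start": "5/5/95", "DateRange3End": "6/6/96", "Value3": 3.3,
}])

# New column name -> regex that selects its source columns (illustrative).
groups = {"DateRangeStart": "Start$", "DateRangeEnd": "End$", "Value": "^Value"}

pieces = {}
for new_name, regex in groups.items():
    matched = df.filter(regex=regex)
    # Row-major flattening keeps the values of one original row together.
    pieces[new_name] = matched.to_numpy().ravel()
    # Assumed to be the same for every group (three columns each here).
    n_matched = matched.shape[1]

# Repeat each ID once per matched column so it lines up with the values.
out = pd.DataFrame({"ID": df["ID"].repeat(n_matched).to_numpy(), **pieces})
print(out)
```

Doing this by hand means tracking the column counts and the repetition of the index yourself, which is the bookkeeping that the `names_to` and `names_pattern` pairing spares you.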

Let's look at another example. [Source Data](https://stackoverflow.com/questions/64316129/how-to-efficiently-melt-multiple-columns-using-the-module-melt-in-pandas/64316306#64316306) :

In [45]:
df = pd.DataFrame({'Activity': ['P1', 'P2'],
 'General': ['AA', 'BB'],
 'm1': ['A1', 'B1'],
 't1': ['TA1', 'TB1'],
 'm2': ['A2', 'B2'],
 't2': ['TA2', 'TB2'],
 'm3': ['A3', 'B3'],
 't3': ['TA3', 'TB3']})

df

Unnamed: 0,Activity,General,m1,t1,m2,t2,m3,t3
0,P1,AA,A1,TA1,A2,TA2,A3,TA3
1,P2,BB,B1,TB1,B2,TB2,B3,TB3


This is a [solution](https://stackoverflow.com/a/64316306/7175713) provided by yours truly : 

In [46]:
(pd.wide_to_long(df,
                 i=["Activity", "General"],
                 stubnames=["t", "m"],
                 j="number")
   .set_axis(["Task", "M"], axis="columns")
   .droplevel(-1)
   .reset_index()
)

Unnamed: 0,Activity,General,Task,M
0,P1,AA,TA1,A1
1,P1,AA,TA2,A2
2,P1,AA,TA3,A3
3,P2,BB,TB1,B1
4,P2,BB,TB2,B2
5,P2,BB,TB3,B3


Or, we could use [pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer), abstract the details, and focus on the task : 

In [47]:
df.pivot_longer(index=['Activity','General'], 
                names_pattern=['^m','^t'],
                names_to=['M','Task'])

Unnamed: 0,Activity,General,M,Task
0,P1,AA,A1,TA1
1,P1,AA,A2,TA2
2,P1,AA,A3,TA3
3,P2,BB,B1,TB1
4,P2,BB,B2,TB2
5,P2,BB,B3,TB3


Alright, one last example : 

In [48]:
df = pd.DataFrame(
            {
                "id": [1, 2, 3],
                "x1": [4, 5, 6],
                "x2": [5, 6, 7],
                "y1": [7, 8, 9],
                "y2": [10, 11, 12],
            }
        )

df

Unnamed: 0,id,x1,x2,y1,y2
0,1,4,5,7,10
1,2,5,6,8,11
2,3,6,7,9,12


In the dataframe above, we are not really interested in the numbers affixed to `x` and `y`. We also notice a pattern, where some columns start with `x` and others start with `y`. [pd.wide_to_long](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.wide_to_long.html) can handle this easily :

In [49]:
pd.wide_to_long(df, 
        stubnames=['x','y'],
        i='id', 
        j='num'
    )

Unnamed: 0_level_0,Unnamed: 1_level_0,x,y
id,num,Unnamed: 2_level_1,Unnamed: 3_level_1
1,1,4,7
2,1,5,8
3,1,6,9
1,2,5,10
2,2,6,11
3,2,7,12


To get rid of the `num` variable, we can use [droplevel](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.droplevel.html), combined with [reset_index](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reset_index.html) (if you wish to integrate the `id` column back) : 

In [50]:
(pd.wide_to_long(df, 
        stubnames=['x','y'], 
        i='id', 
        j='num')
.droplevel(-1)
.reset_index())

Unnamed: 0,id,x,y
0,1,4,7
1,2,5,8
2,3,6,9
3,1,5,10
4,2,6,11
5,3,7,12


[pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) can handle this as well, by passing a list of regular expressions to `names_pattern` (again, we are not interested in the numbers affixed to `x` and `y`) :

In [51]:
df.pivot_longer(index = 'id', 
        names_to = ['x','y'], 
        names_pattern = ['^x', '^y']
    )

Unnamed: 0,id,x,y
0,1,4,7
1,1,5,10
2,2,5,8
3,2,6,11
4,3,6,9
5,3,7,12


We could also use `.value` to reshape the data :

In [52]:
df.pivot_longer(index = 'id', 
        names_to = '.value', 
        names_pattern = '(.).'
    )

Unnamed: 0,id,x,y
0,1,4,7
1,1,5,10
2,2,5,8
3,2,6,11
4,3,6,9
5,3,7,12


And, if you want the numbers, easy-peasy :

In [53]:
df.pivot_longer(index = 'id', 
        names_to = ['.value','num'], 
        names_pattern = '(.)(.)'
    )

Unnamed: 0,id,num,x,y
0,1,1,4,7
1,1,2,5,10
2,2,1,5,8
3,2,2,6,11
4,3,1,6,9
5,3,2,7,12
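
To see what the capture groups in the two patterns above pick out of each column name, the regexes can be tried directly with `pd.Series.str.extract`. This only illustrates the regular expressions themselves and is not meant to describe pivot_longer's internals; the names `stub_only` and `stub_and_num` are made up for the example.

```python
import pandas as pd

cols = pd.Series(["x1", "x2", "y1", "y2"])

# One group: only the leading character is captured, the digit is discarded.
stub_only = cols.str.extract(r"(.).", expand=False)

# Two groups: the first lines up with `.value`, the second with `num`.
stub_and_num = cols.str.extract(r"(.)(.)")
stub_and_num.columns = [".value", "num"]

print(stub_only.tolist())
print(stub_and_num)
```

Note that the extracted numbers are strings here, since they come straight out of the column names.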


[pivot_longer](https://pyjanitor.readthedocs.io/reference/janitor.functions/janitor.pivot_longer.html#janitor.pivot_longer) does not solve all problems; no function does. Its aim is to be a single point for unpivoting single-indexed dataframes from wide to long form, while being easy to use and offering a lot of flexibility.