ENH: add new method DataFrame.repeat #42943

@Zeroto521

Is your feature request related to a problem?

I sometimes need to generate fake data that still follows a regular pattern,
for example by repeating some rows or columns of a 2D dataset several times.

The steps I currently take:

  1. Start with a DataFrame called df.
  2. value = df.values # convert the DataFrame to an ndarray
  3. Use np.repeat or np.tile to repeat the data.
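
The workaround above can be sketched roughly as follows. This is only an illustration of the manual steps, not code from pandas itself; the variable names are made up, and rebuilding the labels with `Index.repeat` is one possible way to keep them, since converting to an ndarray drops the index and column labels.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Step 2: convert the DataFrame to an ndarray (labels are lost here).
values = df.to_numpy()

# Step 3a: np.repeat repeats each row in place -> rows 0, 0, 1, 1.
repeated_rows = np.repeat(values, 2, axis=0)

# Step 3b: np.tile repeats the whole block -> rows 0, 1, 0, 1.
tiled = np.tile(values, (2, 1))

# Rebuild a DataFrame by hand, restoring the labels that were dropped.
result = pd.DataFrame(
    repeated_rows,
    columns=df.columns,
    index=df.index.repeat(2),
)
```

Having to restore the labels manually is the main inconvenience that a dedicated method would remove.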

Describe the solution you'd like

Why not add a repeat method to DataFrame, since Series already has one?

About the DataFrame.repeat parameters

Similar to numpy.repeat(a, repeats, axis=None)

def repeat(
    self,
    repeats: int | list[int],
    axis: int | str = 0,
) -> pd.DataFrame | None:
    ...

A DataFrame is a 2D data structure, so repeating individual elements is not well defined.
The aim of DataFrame.repeat is to repeat whole rows or columns, either one of them or several.

So the axis argument must be either 0 or 1.

If repeats is a single int, every row/column is repeated that many times.
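
For illustration, the behaviour described above could be approximated today with a small helper built on `np.repeat` and `DataFrame.iloc`. The function name `repeat_frame` is hypothetical and this is only a sketch of the idea under the stated rules, not the proposed implementation.

```python
import numpy as np
import pandas as pd


def repeat_frame(df: pd.DataFrame, repeats, axis=0) -> pd.DataFrame:
    """Hypothetical helper: repeat whole rows (axis=0) or columns (axis=1)."""
    if axis in (0, "index"):
        # Repeat row positions, then select them; labels come along.
        positions = np.repeat(np.arange(len(df.index)), repeats)
        return df.iloc[positions]
    if axis in (1, "columns"):
        # Same idea for column positions.
        positions = np.repeat(np.arange(len(df.columns)), repeats)
        return df.iloc[:, positions]
    raise ValueError("axis must be 0/'index' or 1/'columns'")


df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
rows_twice = repeat_frame(df, 2)            # each row repeated twice
cols_mixed = repeat_frame(df, [1, 2], 1)    # 'a' once, 'b' twice
```

Working on positions rather than labels keeps the sketch valid even when the index or the columns already contain duplicates.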

Examples of this method

>>> df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
>>> df
   a  b
0  1  3
1  2  4

Repeat each row twice.

>>> df.repeat(2)
   a  b
0  1  3
0  1  3
1  2  4
1  2  4

Repeat each column twice.

>>> df.repeat(2, 1)
   a  a  b  b
0  1  1  3  3
1  2  2  4  4

Repeat the ``a`` column once and the ``b`` column twice.

>>> df.repeat([1, 2], 1)
   a  b  b
0  1  3  3
1  2  4  4
