## Operations

See the Basic section on Binary Ops.

### Stats

Operations in general _exclude_ missing data.
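
To make the missing-data behaviour concrete, here is a minimal sketch. The frame `demo` and its values are hypothetical and are not part of the data used elsewhere in this notebook. It compares the default `mean()` with `mean(skipna=False)`:

```python
import numpy as np
import pandas as pd

# Hypothetical frame (an assumption for illustration): column "a" has one missing value
demo = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

# Default behaviour: the NaN is skipped, so "a" averages (1 + 3) / 2
col_means = demo.mean()

# With skipna=False the NaN propagates into the result for "a"
strict_means = demo.mean(skipna=False)

print(col_means)
print(strict_means)
```

With the default settings, the mean of column `a` is computed from its two non-missing values only.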

Calculate the mean value for each column:

In [None]:
df.mean()

Calculate the mean value for each row:

In [None]:
df.mean(axis=1)

2024-11-01    1.003253
2024-11-02    1.130657
2024-11-03    1.232302
2024-11-04    1.260088
2024-11-05    0.965156
2024-11-06    0.624362
2024-11-07    1.594754
2024-11-08    0.486105
2024-11-09    1.262922
2024-11-10    0.956565
2024-11-11    1.786683
2024-11-12    0.789613
2024-11-13    1.590397
2024-11-14    1.470423
2024-11-15    1.534808
2024-11-16    1.900729
2024-11-17    0.494140
2024-11-18    1.064574
2024-11-19    1.132367
2024-11-20    1.342546
2024-11-21    0.537636
2024-11-22    1.393694
2024-11-23    0.557262
2024-11-24    0.669198
2024-11-25    1.410763
2024-11-26    1.054245
2024-11-27    1.081810
2024-11-28    1.165748
2024-11-29    0.664249
Freq: D, dtype: float64

Operating with another <u>`Series`</u> or <u>`DataFrame`</u> that has a different index or columns aligns the result on the union of the index or column labels. In addition, pandas automatically broadcasts along the specified dimension and fills unaligned labels with <u>`np.nan`</u>.

In [None]:
df.sub(s, axis="index")
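
The alignment rule can be seen on a small example. This is a sketch under stated assumptions: `demo` and `offsets` are hypothetical objects with only partially overlapping row labels, standing in for `df` and `s`:

```python
import numpy as np
import pandas as pd

# Hypothetical inputs (assumptions for illustration)
demo = pd.DataFrame(
    {"A": [10.0, 20.0, 30.0], "B": [1.0, 2.0, 3.0]},
    index=["x", "y", "z"],
)
# "z" is missing from the Series, and "w" is missing from the frame
offsets = pd.Series([1.0, 2.0, 3.0], index=["x", "y", "w"])

# Subtract the Series from every column, matching on row labels
result = demo.sub(offsets, axis="index")
print(result)
```

Rows `x` and `y` are present on both sides and are subtracted column by column. Rows `z` and `w` each exist on one side only, so they appear in the result filled with `NaN`.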

### User defined functions

<u>`DataFrame.agg()`</u> and <u>`DataFrame.transform()`</u> apply a user-defined function that reduces or broadcasts its result, respectively.

In [None]:
df.agg(lambda x: np.mean(x) * 5.6)

In [None]:
df.transform(lambda x: x * 101.2)
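
The difference between reducing and broadcasting shows up in the shape of the result. The following is a minimal sketch on a hypothetical three-row frame `demo` (an assumption, not the `df` used above):

```python
import numpy as np
import pandas as pd

# Hypothetical frame (assumption for illustration)
demo = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})

# agg: the function returns one scalar per column, so the result is a Series
reduced = demo.agg(lambda x: np.mean(x) * 5.6)

# transform: the function returns one value per element, so the shape is preserved
broadcast = demo.transform(lambda x: x * 101.2)

print(reduced.shape)    # one entry per column
print(broadcast.shape)  # same shape as demo
```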

### Value Counts

See more at <u>Histogramming and Discretization</u>.

In [None]:
s = pd.Series(np.random.randint(0, 7, size=29))

s

In [None]:
s.value_counts()
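
Because the random data above changes on every run, a fixed input makes the result easier to read. This sketch assumes a small hypothetical Series `rolls`:

```python
import pandas as pd

# Hypothetical fixed data (assumption for illustration)
rolls = pd.Series([3, 1, 3, 6, 3, 1])

# The index holds the distinct values and the values hold their frequencies,
# sorted from most to least frequent
counts = rolls.value_counts()
print(counts)
```

Here the value `3` occurs three times, `1` twice, and `6` once.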

### String Methods

<u>`Series`</u> is equipped with a set of string processing methods in the <u>`str`</u> attribute that make it easy to operate on each element of the array, as in the code snippet below. See more at <u>Vectorized String Methods</u>.

In [None]:
s = pd.Series(["A", "B", "C", "Aaba", "Baca", np.nan, "CABA", "dog", "cat"])

s.str.lower()
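
The `str` methods also pass missing values through rather than raising an error. This is a minimal sketch on a hypothetical Series `words`, using `str.lower()` and `str.len()`:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing entry (assumption for illustration)
words = pd.Series(["Aaba", np.nan, "CABA"])

lowered = words.str.lower()  # element-wise lowercase; the missing entry stays missing
lengths = words.str.len()    # element-wise length; the missing entry stays missing

print(lowered)
print(lengths)
```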

## Merge

### Concat

pandas provides various facilities for easily combining <u>`Series`</u> and <u>`DataFrame`</u> objects, with various kinds of set logic for the indexes and relational algebra functionality in the case of join / merge-type operations.

See the <u>Merging section</u>.

Concatenating pandas objects together row-wise with <u>`concat()`</u>:

In [None]:
df = pd.DataFrame(np.random.randn(10, 4))

df

In [None]:
# Break it into pieces
piece = [df[:3], df[3:7], df[7:]]

In [None]:
pd.concat(piece)
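
As a quick check that splitting and concatenating round-trips, the sketch below uses a hypothetical deterministic frame `frame` in place of the random `df`. It also shows `ignore_index=True`, which renumbers the rows when the original labels are not wanted:

```python
import numpy as np
import pandas as pd

# Hypothetical deterministic frame (assumption for illustration)
frame = pd.DataFrame(np.arange(40).reshape(10, 4))

# Split into three row-wise pieces and put them back together
parts = [frame[:3], frame[3:7], frame[7:]]
rebuilt = pd.concat(parts)
print(rebuilt.equals(frame))

# Concatenating out of order keeps the original labels unless ignore_index=True
renumbered = pd.concat([frame[7:], frame[:3]], ignore_index=True)
print(renumbered.index.tolist())
```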

<table>
    <tr>
        <th>Note</th>
    </tr>
    <tr>
        <td>
        Adding a column to a <code><u>DataFrame</u></code> is relatively fast. However, adding a row requires a copy, and may be expensive. We recommend passing a pre-built list of records to the <code><u>DataFrame</u></code> constructor instead of building a <code><u>DataFrame</u></code> by iteratively appending records to it.
        </td>
    </tr>
</table>
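
The recommendation in the note can be sketched as follows. The record fields `id` and `square` are hypothetical names chosen for illustration:

```python
import pandas as pd

# Collect rows in a plain Python list first (cheap appends)
records = []
for i in range(5):
    records.append({"id": i, "square": i * i})  # hypothetical fields

# Build the DataFrame once, from the complete list of records
table = pd.DataFrame(records)
print(table)
```

Appending to a list does not copy any existing data, so the frame is constructed only once, at the end.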

### Join

<u>`merge()`</u> enables SQL-style join types along specific columns. See the <u>Database style joining</u> section.

In [None]:
left = pd.DataFrame({"key": ["foo", "foo"], "lval": [1, 2]})
right = pd.DataFrame({"key": ["foo", "foo"], "rval": [4, 5]})

In [None]:
left

In [None]:
right

In [None]:
pd.merge(left, right, on="key")
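
When the key is duplicated on both sides, every matching left row is paired with every matching right row. The sketch below repeats the example with hypothetical names `left_dup` and `right_dup`, so it can run on its own, and counts the resulting rows:

```python
import pandas as pd

# Same data as above, under hypothetical names so the block is self-contained
left_dup = pd.DataFrame({"key": ["foo", "foo"], "lval": [1, 2]})
right_dup = pd.DataFrame({"key": ["foo", "foo"], "rval": [4, 5]})

merged_dup = pd.merge(left_dup, right_dup, on="key")

# 2 left rows x 2 right rows sharing the key "foo" gives 4 combinations
pairs = sorted(zip(merged_dup["lval"].tolist(), merged_dup["rval"].tolist()))
print(len(merged_dup))
print(pairs)
```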

<u>`merge()`</u> on unique keys:

In [None]:
left = pd.DataFrame({"key": ["foo", "bar"], "lval": [1, 2]})
right = pd.DataFrame({"key": ["foo", "bar"], "rval": [4, 5]})

In [None]:
left

In [None]:
right

In [None]:
pd.merge(left, right, on="key")
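
With unique keys, each left row matches at most one right row, so the result has one row per key. As a sketch, the `validate` argument of `merge()` can check that expectation. The names `left_u`, `right_u` and `right_dup` are hypothetical, and using `validate` is an addition that the original example does not include:

```python
import pandas as pd

# Hypothetical names (assumptions); same data as the unique-key example above
left_u = pd.DataFrame({"key": ["foo", "bar"], "lval": [1, 2]})
right_u = pd.DataFrame({"key": ["foo", "bar"], "rval": [4, 5]})

# validate="one_to_one" asserts that the keys are unique on both sides
merged_u = pd.merge(left_u, right_u, on="key", validate="one_to_one")
print(merged_u)

# With duplicated keys on the right, the same check raises MergeError
right_dup = pd.DataFrame({"key": ["foo", "foo"], "rval": [4, 5]})
try:
    pd.merge(left_u, right_dup, on="key", validate="one_to_one")
    raised = False
except pd.errors.MergeError:
    raised = True
print(raised)
```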