Some introduction here

In [None]:
# Import modules
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

In [None]:
# Dataset, using dictionary
dataset = {
    "Subject": ["Pt_1", "Pt_2", "Pt_3", "Pt_4", "Pt_5", "Pt_6"],
    "Ag_HA":np.linspace(31, 78, 6),
    "Ag_NA":np.linspace(12, 44, 6),
    "Ag_M":np.linspace(2, 9, 6),
    "Ag_PA":np.linspace(7, 21, 6)
}

# DataFrame, with .from_dict() method
df = pd.DataFrame.from_dict(dataset)

Pandas has 4 main accessors for selecting data:

1. `.loc[]` is label-based, e.g. `df.loc[:, ["col 1", "col 2"]]` gets all rows from `col 1` and `col 2`.
2. `.iloc[]` is like `.loc[]`, but selects by integer position.
3. `.at[]` gets a single value, label-based.
4. `.iat[]` is like `.at[]`, but selects by integer position.

The examples below use the `.loc[]` and `.at[]` accessors.

Note that with `[:, :]` slice notation on a Pandas dataframe, the first slice (before the comma) subsets rows and the second subsets columns. Stepping with `::` slice notation is also supported.

In [None]:
# using .loc[], specifying row index and column labels
df.loc[4:, ["Ag_HA", "Ag_PA"]]

In [None]:
# using .at[], specifying the target cell
df.at[4, "Ag_HA"]
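
For comparison, here is a minimal sketch of the position-based counterparts, `.iloc[]` and `.iat[]`, plus a stepped slice. It rebuilds a smaller version of the dataframe so that it runs on its own; the variable names are only for illustration. One thing worth noticing is that a `.loc[]` slice includes its end label, while an `.iloc[]` slice excludes its end position.

```python
import numpy as np
import pandas as pd

# Smaller stand-in for the dataframe above (default 0..5 integer index)
df = pd.DataFrame({
    "Subject": ["Pt_1", "Pt_2", "Pt_3", "Pt_4", "Pt_5", "Pt_6"],
    "Ag_HA": np.linspace(31, 78, 6),
    "Ag_PA": np.linspace(7, 21, 6),
})

# Same selection, by label and by position
by_label = df.loc[4:, ["Ag_HA", "Ag_PA"]]
by_position = df.iloc[4:, [1, 2]]

# Same cell, by label and by position
cell_label = df.at[4, "Ag_HA"]
cell_position = df.iat[4, 1]

# .loc[] slices include the end label, .iloc[] slices exclude the end position
loc_rows = df.loc[1:3]    # rows labelled 1, 2 and 3
iloc_rows = df.iloc[1:3]  # rows at positions 1 and 2

# Stepping: every second row of one column
every_other = df.loc[::2, "Ag_HA"]
```

With the default integer index, labels and positions happen to coincide, which is why both selections return the same rows here. That would no longer hold after filtering or re-indexing the dataframe.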

Now, can we change the value at a specific location of interest?

First, try the `df["column"][index]` approach and see what we get.

In [None]:
# Ag_HA is column, 0 is row-index
df["Ag_HA"][0] = 30

With this approach, Pandas raises a `SettingWithCopyWarning`, which warns about "returning a view versus a copy". So, what is happening here? Note that the assignment actually worked in this case.

First of all, the `df["column"][index]` approach is inappropriate for changing the value of a specific cell. The `SettingWithCopyWarning` was created to flag potentially confusing "chained" assignments: under the hood, `df["column"]` calls `__getitem__`, and the assignment then calls `__setitem__` on whatever object that returned (hence "chained"). Depending on whether `__getitem__` returns a view or a copy, the subsequent `__setitem__` **may not change the original dataframe**; see this thread on [Stack Overflow](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas).

The appropriate way to do this is with the `.loc[]` accessor, because the assignment becomes a single `__setitem__` call on the original dataframe, without going through `__getitem__` first.

In [None]:
# No more warning!
df.loc[0, "Ag_HA"] = 30
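
To make the "may not work" case concrete, here is a small sketch in which the chained form silently fails. It assumes a boolean mask as the first step, because boolean indexing returns a copy; the mask and variable names are made up for this illustration. The warning is suppressed only to keep the output clean.

```python
import warnings

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Subject": ["Pt_1", "Pt_2", "Pt_3", "Pt_4", "Pt_5", "Pt_6"],
    "Ag_HA": np.linspace(31, 78, 6),
    "Ag_M": np.linspace(2, 9, 6),
})
mask = df["Ag_M"] > 5  # True for the last three rows
before = df["Ag_HA"].copy()

# Chained: df[mask] builds a temporary copy, and the assignment lands on that copy
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df[mask]["Ag_HA"] = 0
chained_changed_df = not df["Ag_HA"].equals(before)

# Single .loc[] call: the assignment lands on df itself
df.loc[mask, "Ag_HA"] = 0
loc_changed_df = not df["Ag_HA"].equals(before)
```

Here `chained_changed_df` ends up `False` and `loc_changed_df` ends up `True`, so only the `.loc[]` version updated the dataframe.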

Next, I would like to replace a range of values with a single value, an operation I usually perform when specifying a detection limit based on the experimental assay that I carried out.

Let's treat anything below 10 as lower than the detection limit, so all values below 10 will be set to 8 instead. To do this, we use `np.where()` and iterate over a list of columns with a `for` loop. This method allows me to change a range of values without using `df.replace(to_replace=old, value=new)` (which matches specific values rather than a numeric range), and without melting the dataframe first (because my raw dataframe tends to be in wide format).

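The function `np.where(x, y, z)` evaluates the condition in `x` element by element and returns a new array that takes the value from `y` where the condition is met, and the value from `z` where it is not. Here `z` is the original column, so values that do not meet the condition are kept unchanged.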
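
A quick sketch on a plain array may help; the numbers are arbitrary. It also shows the single-argument form, `np.where(condition)`, which is the one that returns indices.

```python
import numpy as np

values = np.array([2.0, 9.5, 10.0, 44.0])

# Three-argument form: pick 8 where the condition is True, else the original value
floored = np.where(values < 10, 8, values)

# Single-argument form: positions where the condition is True
below_idx = np.where(values < 10)[0]
```

`floored` holds `8, 8, 10, 44` and `below_idx` holds `0, 1`. Note that 10 itself is left alone, because the condition is strictly "less than".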

In [None]:
# Get column names except for Subject
cols_ag = list(df.columns[1:])

for ag in cols_ag:
    df[ag] = np.where(df[ag] < 10, 8, df[ag])
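
As an alternative to the loop, the same replacement can probably be done in one step with `DataFrame.mask()`, which replaces values where a condition is True. This is only a sketch of an equivalent approach, not what I originally used; `looped` and `masked` are names made up for the comparison.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Subject": ["Pt_1", "Pt_2", "Pt_3", "Pt_4", "Pt_5", "Pt_6"],
    "Ag_HA": np.linspace(31, 78, 6),
    "Ag_NA": np.linspace(12, 44, 6),
    "Ag_M": np.linspace(2, 9, 6),
    "Ag_PA": np.linspace(7, 21, 6),
})
cols_ag = list(df.columns[1:])

# Loop version, as in the cell above
looped = df.copy()
for ag in cols_ag:
    looped[ag] = np.where(looped[ag] < 10, 8, looped[ag])

# One-step version: mask() replaces values where the condition is True
masked = df.copy()
masked[cols_ag] = masked[cols_ag].mask(masked[cols_ag] < 10, 8)
```

Both versions give the same dataframe, so the choice is mostly a matter of taste. I find the loop easier to read when each column needs a different limit.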

The loop above is adapted from a function in my A-039 `core.py` script.

In hindsight, I did not have to melt with `pd.melt()` and then run the `.pivot()` method, but I thought I should include it here because `.pivot()` returns a dataframe with a multi-index, which I then resolved with `.rename_axis(None, axis=1).reset_index()`. The original function is reproduced below:

```python
def spot_detect_limit(data, limit=5):
    """
    Because I was stupid, I needed this to be done.
    Takes in an ELISpot dataset in wide format, turns it into long format,
      then replaces anything below 7 with `limit`, then returns a wide dataframe.

    Change the resulting plot to start plotting at y=4 and LOD at 8.
    To be used primarily with elispot.pb_proportion()

    Args:
      data  : ELIspot data in wide format
      limit : the designated lowest value (default is 5)
    """
    id_cols = ["Run", "Subject", "Visit", "Day", "Sample"]
    _df = pd.melt(data, id_vars=id_cols, var_name="Antigen", value_name="MASC")
    _df = _df.query(" Subject != 'HD519' ").reset_index(drop=True)
    _df["MASC"] = np.where(_df["MASC"] < 7, limit, _df["MASC"])

    # pivot back to wide, then drop the "Antigen" axis name and flatten the index
    _df_wide = _df.pivot(index=id_cols, columns="Antigen", values="MASC")
    return _df_wide.rename_axis(None, axis=1).reset_index()
```
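
The melt, pivot, and clean-up steps can be tried out on a small frame. This is a minimal sketch with made-up data and a single id column, not the ELISpot dataset the function was written for.

```python
import pandas as pd

wide = pd.DataFrame({
    "Subject": ["Pt_1", "Pt_2", "Pt_3"],
    "Ag_HA": [31.0, 40.4, 49.8],
    "Ag_M": [2.0, 3.4, 4.8],
})

# Wide to long: one row per Subject/Antigen pair
long_df = pd.melt(wide, id_vars=["Subject"], var_name="Antigen", value_name="MASC")

# Long to wide: Subject becomes the index, and the column axis is named "Antigen"
pivoted = long_df.pivot(index="Subject", columns="Antigen", values="MASC")

# Drop the column-axis name, then move Subject back into a regular column
round_trip = pivoted.rename_axis(None, axis=1).reset_index()
```

After the last step, `round_trip` has the same columns as `wide`: `Subject`, `Ag_HA`, `Ag_M`. Without `.rename_axis(None, axis=1)`, the column axis would still carry the `Antigen` name.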