## A Simple Guide to Method Chaining

Method chaining in Python means calling several methods on an object in a single expression. Each method returns an object, so the next method in the chain is called on that returned object. Used well, this makes code more concise and easier to read, because the operations appear in the order they are applied and no intermediate variables are needed.
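Before turning to pandas, here is a minimal sketch of the mechanism in plain Python. The `Counter` class is hypothetical and exists only for illustration; the point is that a chain works whenever each call returns an object that has the next method.

```python
# Built-in strings: each method returns a new str, so calls can be chained.
cleaned = "  Iris-Setosa  ".strip().lower().replace("-", "_")
print(cleaned)


class Counter:
    """Hypothetical class whose methods return self to allow chaining."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self  # returning self lets the next call chain onto this one

    def double(self):
        self.value *= 2
        return self


# (0 + 3) * 2 + 1
total = Counter().add(3).double().add(1).value
print(total)
```

The string methods return a new object at each step, while `Counter` returns the same object each time. Both styles support chaining.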

To illustrate method chaining, we'll load the Iris dataset from the UCI Machine Learning Repository and manipulate it with the pandas library.

### Example 1

First, let's import the necessary libraries and load the dataset:

In [5]:
import pandas as pd

# Load the Iris dataset
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
column_names = ["sepal_length", "sepal_width", "petal_length", "petal_width", "class"]
iris = pd.read_csv(url, header=None, names=column_names)

Here's an example of performing method chaining to filter, sort, and select specific columns in the Iris dataset:

In [8]:
# Method chaining example
result = (iris
          .query("sepal_length < 5.0")  # Filter rows where sepal length is less than 5.0
          .sort_values(by="petal_length", ascending=False)  # Sort by petal length in descending order
          .loc[:, ["sepal_length", "sepal_width", "petal_length", "class"]]  # Select specific columns
          .reset_index(drop=True)  # Reset the index
         )

print(result.head())

   sepal_length  sepal_width  petal_length            class
0           4.9          2.5           4.5   Iris-virginica
1           4.9          2.4           3.3  Iris-versicolor
2           4.8          3.4           1.9      Iris-setosa
3           4.8          3.4           1.6      Iris-setosa
4           4.8          3.1           1.6      Iris-setosa


#### Explanation:

    1. query("sepal_length < 5.0"): Filters the DataFrame to include only rows where the sepal length is less than 5.0.
    2. sort_values(by="petal_length", ascending=False): Sorts the DataFrame by petal length in descending order.
    3. loc[:, ["sepal_length", "sepal_width", "petal_length", "class"]]: Selects the specified columns (sepal length, sepal width, petal length, and class).
    4. reset_index(drop=True): Resets the DataFrame index, dropping the old index.

Each method returns a DataFrame, allowing the next method to be chained to it.
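To make that concrete, the sketch below runs the same four operations twice: once with a named intermediate variable per step, and once as a chain. It uses a small made-up DataFrame called `sample` (an assumption for illustration, not the real Iris data) so that it runs without a download.

```python
import pandas as pd

# Small made-up sample standing in for the Iris data (values are illustrative).
sample = pd.DataFrame({
    "sepal_length": [4.9, 5.1, 4.8, 6.0],
    "sepal_width": [2.5, 3.5, 3.4, 2.2],
    "petal_length": [4.5, 1.4, 1.9, 5.0],
    "class": ["Iris-virginica", "Iris-setosa", "Iris-setosa", "Iris-virginica"],
})
columns = ["sepal_length", "sepal_width", "petal_length", "class"]

# Step-by-step version: every intermediate DataFrame gets its own name.
filtered = sample.query("sepal_length < 5.0")
ordered = filtered.sort_values(by="petal_length", ascending=False)
selected = ordered.loc[:, columns]
stepwise = selected.reset_index(drop=True)

# Chained version: the same operations with no intermediate variables.
chained = (sample
           .query("sepal_length < 5.0")
           .sort_values(by="petal_length", ascending=False)
           .loc[:, columns]
           .reset_index(drop=True)
          )

# Both approaches produce identical DataFrames.
print(stepwise.equals(chained))
```

The step-by-step form can be handy while debugging, since each intermediate result can be inspected. The chained form avoids leaving temporary names behind.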

### Example 2

Let's see another example, where we add a new column to the DataFrame and then filter based on this new column:

In [3]:
# Method chaining example with adding a new column
result = (iris
          .assign(sepal_ratio=lambda x: x.sepal_length / x.sepal_width)  # Add a new column for sepal length to width ratio
          .query("sepal_ratio > 2.0")  # Filter rows where the sepal ratio is greater than 2.0
          .sort_values(by="sepal_ratio", ascending=False)  # Sort by the new sepal ratio in descending order
          .loc[:, ["sepal_length", "sepal_width", "sepal_ratio", "class"]]  # Select specific columns
          .reset_index(drop=True)  # Reset the index
         )

print(result.head())

   sepal_length  sepal_width  sepal_ratio            class
0           7.7          2.6     2.961538   Iris-virginica
1           6.2          2.2     2.818182  Iris-versicolor
2           7.7          2.8     2.750000   Iris-virginica
3           6.3          2.3     2.739130  Iris-versicolor
4           6.0          2.2     2.727273   Iris-virginica


#### Explanation:

    1. assign(sepal_ratio=lambda x: x.sepal_length / x.sepal_width): Adds a new column sepal_ratio which is the ratio of sepal length to sepal width.
    2. query("sepal_ratio > 2.0"): Filters the DataFrame to include only rows where the sepal ratio is greater than 2.0.
    3. sort_values(by="sepal_ratio", ascending=False): Sorts the DataFrame by the new sepal ratio in descending order.
    4. loc[:, ["sepal_length", "sepal_width", "sepal_ratio", "class"]]: Selects the specified columns (sepal length, sepal width, sepal ratio, and class).
    5. reset_index(drop=True): Resets the DataFrame index, dropping the old index.
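One detail in step 1 is worth a closer look: the `lambda` passed to `assign` receives the DataFrame as it exists at that point in the chain, not the original variable. The sketch below shows this with a small made-up DataFrame named `sample` and a hypothetical helper `ratio` that records how many rows it was given; both are assumptions for illustration only.

```python
import pandas as pd

# Small made-up sample (values are illustrative, not the full Iris data).
sample = pd.DataFrame({
    "sepal_length": [7.7, 5.1, 6.2],
    "sepal_width": [2.6, 3.5, 2.2],
})

rows_seen = []  # records the size of the DataFrame handed to the callable


def ratio(x):
    """Hypothetical helper: same computation as the lambda in Example 2."""
    rows_seen.append(len(x))
    return x.sepal_length / x.sepal_width


result = (sample
          .query("sepal_length > 6.0")   # keeps 2 of the 3 rows
          .assign(sepal_ratio=ratio)     # callable sees the filtered rows only
          .query("sepal_ratio > 2.9")    # later steps can use the new column
          .reset_index(drop=True)
         )

print(rows_seen)
print(result)
```

Here `ratio` is called with the two rows that survive the first filter, and the second `query` can refer to `sepal_ratio` because `assign` has already added it.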

These examples demonstrate how method chaining can be used to perform complex data manipulations in a concise and readable manner.