# Python Workshop - Day 1

**Date**: 23 April 2021

**Instructor**: Andres Patrignani | Soil Water Processes | Department of Agronomy | Kansas State University

**Moderator**: Adrian Correndo

**Contact**: andrespatrignani@ksu.edu

Workshop organized by the Agronomy Graduate Students Association Stats & Programming Committee


## Table of Contents
* [Exercise description](#exercise-description)
* [Useful commands](#useful-commands)
* [Tips for writing Python code](#tips-writing-python-code)
* [Basic example](#basic-example)
* [Import modules](#import-modules)
* [Load tabular data](#load-tabular-data)
* [Convert strings to datetime](#string-to-datetime)
* [Find missing values](#find-missing-values)
* [Replace missing values](#replace-missing-values)
* [Create a plot of weather data](#plot-weather-data)
* [Compute thermal units](#compute-thermal-units)
* [Model potential yield](#model-potential-yield)
* [Practice](#practice)
* [References](#references)
* [Q&A](#q&a)


## Exercise description <a id="exercise-description">

To learn the basic data analysis workflow we will use a dataset of weather data obtained from the Kansas Mesonet. After loading and replacing missing data, our goal is to define the start and end of the winter wheat growing season and compute the cumulative thermal units. The last step consists of implementing a simple model for estimating the potential leaf area index and grain yield based on the computed thermal units. In this tutorial you will learn how to:

- load tabular data
- replace missing values
- select data between a specific date range
- plot time series
- define and use Python functions


## Useful commands <a id="useful-commands">

- `cmd + Enter` Evaluates the code in the current cell

- `shift + Enter` Evaluates the code in the current cell and moves to a new cell

- `cmd + s` Save current notebook

- Use the `Tab` key after typing a few letters of a variable or function to autocomplete

## Tips for writing clear Python code <a id="tips-writing-python-code">

Python was designed with code readability in mind.

- Indentation and white space matter

- Comment your code using `#`

- Use descriptive variable names. Usually variables should be named using all lower case. Constants are usually all upper case. No spaces allowed. Use underscores to join words. Feel free to deviate from these conventions if you are following a set of equations from a paper or book and you want to match the notation in the example.


Learn more about Python coding conventions at: 'https://www.python.org/dev/peps/pep-0008/'
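
The naming conventions above can be illustrated with a short sketch. The variable and constant names below are hypothetical and chosen only for this example:

```python
# Constants: all upper case, words joined by underscores
FREEZING_POINT_F = 32.0  # degrees Fahrenheit

# Variables: lower case and descriptive, with units noted in a comment
air_temperature_f = 68.0  # degrees Fahrenheit
air_temperature_c = (air_temperature_f - FREEZING_POINT_F) * 5 / 9  # Celsius

print(round(air_temperature_c, 1))  # 20.0
```

Names like these make the units and the role of each value clear without extra documentation.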

## Basic computation example <a id='basic-example'>

A simple example to highlight a few of the concepts mentioned above.

In [1]:
soil_temperature_F = 56.0 # degrees Fahrenheit
soil_temperature_C = (soil_temperature_F-32) * 5/9 # Celsius

print('Hello world, today the soil temperature in Manhattan, KS is', round(soil_temperature_C, 1), 'Celsius')


Hello world, today the soil temperature in Manhattan, KS is 13.3 Celsius


In [2]:
# Show all the variables defined so far
%whos

Variable             Type     Data/Info
---------------------------------------
soil_temperature_C   float    13.333333333333334
soil_temperature_F   float    56.0


## Importing modules <a id='import-modules'>

Python relies on modules, which contain additional Python code that provides new utilities. Think of the basic language as a toolbox with only a few tools (e.g. a screwdriver, a hammer, and a wrench) and think of modules as an additional kit of tools (e.g. a new set of wrenches of different sizes) that will allow you to do something very specific without having to write a large amount of code. In other words, modules were written by other programmers and that code can now be used to solve a specific task.

The Anaconda distribution comes with hundreds of pre-installed modules. You can also install other modules available on the web that are not included with Anaconda.

A common convention is to import each module using the same notation as in its official documentation (e.g. `import numpy as np`).

In [3]:
# Import modules
import numpy as np
import pandas as pd
from bokeh.plotting import figure, show, output_notebook
from bokeh.layouts import gridplot
output_notebook() # This ensures that figures display in the notebook

## Load tabular data <a id='load-tabular-data'>

In [4]:
# Load data
#df = pd.read_csv('ashland_bottoms_2019_2020.csv')
URL = 'https://raw.githubusercontent.com/soilwater/python-workshop/main/Day_1/ashland_bottoms_2019_2020.csv'
df = pd.read_csv(URL, comment='#')

# Read the file from local drive using df = pd.read_csv('ashland_bottoms_2019_2020.csv', comment='#')
# If the file is not in the same directory as the notebook, then you need to provide the path to the file.
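
The `comment='#'` argument tells Pandas to skip lines that start with `#`, which is presumably how metadata lines are marked in the workshop file. Here is a minimal sketch using made-up file content held in memory, so it runs without downloading anything:

```python
import io
import pandas as pd

# Hypothetical file content: two metadata lines followed by the table
csv_text = """# Station: Example
# Units: Celsius
TIMESTAMP,TEMP2MMIN,TEMP2MMAX
1/1/2019,-9.77,2.76
1/2/2019,-11.67,-7.85
"""

# Lines starting with '#' are ignored, so the first remaining line becomes the header
df_example = pd.read_csv(io.StringIO(csv_text), comment='#')
print(df_example.shape)  # (2, 3)
```

The same call works with a local path or a URL in place of the `io.StringIO` object.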


In [5]:
# Check loaded data
df.head(3)

Unnamed: 0,TIMESTAMP,PRESSUREAVG,TEMP2MMIN,TEMP2MMAX,RELHUM2MMAX,RELHUM2MMIN,PRECIP,SR,WSPD2MAVG,WDIR2M,SOILTMP5AVG,VWC5CM
0,1/1/2019,97.27,-9.77,2.76,90.74,67.15,0.0,2.14,3.62,273.2,-0.14,0.3034
1,1/2/2019,99.05,-11.67,-7.85,77.15,62.65,0.0,3.74,2.82,299.05,-1.27,0.1902
2,1/3/2019,98.33,-7.85,3.03,86.36,35.25,0.0,11.57,2.85,197.42,-1.03,0.1744


In [6]:
# Check object type
type(df)

pandas.core.frame.DataFrame

## Convert timestamps to datetime format <a id='string-to-datetime'>

`%d` = day represented as a one or two-digit number

`%m` = month represented as a one or two-digit number

`%Y` = year represented as a four-digit number (use `%y` for a two-digit representation)


In [7]:
# Convert dates from string to datetime format
df['TIMESTAMP'] = pd.to_datetime(df['TIMESTAMP'], format='%m/%d/%Y')

# Check our work
df.head(3)

Unnamed: 0,TIMESTAMP,PRESSUREAVG,TEMP2MMIN,TEMP2MMAX,RELHUM2MMAX,RELHUM2MMIN,PRECIP,SR,WSPD2MAVG,WDIR2M,SOILTMP5AVG,VWC5CM
0,2019-01-01,97.27,-9.77,2.76,90.74,67.15,0.0,2.14,3.62,273.2,-0.14,0.3034
1,2019-01-02,99.05,-11.67,-7.85,77.15,62.65,0.0,3.74,2.82,299.05,-1.27,0.1902
2,2019-01-03,98.33,-7.85,3.03,86.36,35.25,0.0,11.57,2.85,197.42,-1.03,0.1744


>In the previous step we overwrote the column `df['TIMESTAMP']`, which was in string format (this is how it was initially imported by Pandas), with a version of itself in `datetime` format. Because there are many different ways of representing dates, we need to tell Pandas what each part of the string represents.
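
To see why the format string matters, consider this small sketch, in which the same text is read as two different dates depending on the format. The variable names are only illustrative:

```python
import pandas as pd

date_string = '1/2/2019'

# Month first, as in the workshop dataset
month_first = pd.to_datetime(date_string, format='%m/%d/%Y')

# Day first, a convention used in many other countries
day_first = pd.to_datetime(date_string, format='%d/%m/%Y')

print(month_first.date())  # 2019-01-02
print(day_first.date())    # 2019-02-01
```

Stating the format explicitly avoids this kind of silent ambiguity.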

## Find missing values <a id='find-missing-values'>

In [8]:
# Check for entries denoted as missing values
df.isna()

Unnamed: 0,TIMESTAMP,PRESSUREAVG,TEMP2MMIN,TEMP2MMAX,RELHUM2MMAX,RELHUM2MMIN,PRECIP,SR,WSPD2MAVG,WDIR2M,SOILTMP5AVG,VWC5CM
0,False,False,False,False,False,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...
726,False,False,False,False,False,False,False,False,False,False,False,False
727,False,False,False,False,False,False,False,False,False,False,False,False
728,False,False,False,False,False,False,False,False,False,False,False,False
729,False,False,False,False,False,False,False,False,False,False,False,False


In [9]:
# Summary of columns with missing values
df.isna().sum()

TIMESTAMP       0
PRESSUREAVG     3
TEMP2MMIN       4
TEMP2MMAX       4
RELHUM2MMAX     4
RELHUM2MMIN     4
PRECIP          3
SR              3
WSPD2MAVG       3
WDIR2M          3
SOILTMP5AVG     3
VWC5CM         10
dtype: int64
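
Besides counting missing values per column, it is often useful to look at the rows where they occur. A minimal sketch on a small made-up DataFrame (the values are not from the workshop dataset):

```python
import numpy as np
import pandas as pd

# Small hypothetical table with two missing entries
df_example = pd.DataFrame({
    'TEMP2MMIN': [1.5, np.nan, 3.0, 4.2],
    'PRECIP': [0.0, 0.0, np.nan, 1.3],
})

# Number of missing values in each column
missing_per_column = df_example.isna().sum()

# Rows that contain at least one missing value
rows_with_missing = df_example[df_example.isna().any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```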

## Replace missing values <a id='replace-missing-values'>

In [10]:
# Replace missing records for each variable separately 
# to gain more control of the interpolation method
# Assigning the result back to the column also works in recent versions of Pandas
df['TEMP2MMIN'] = df['TEMP2MMIN'].interpolate(method='pchip', limit_area='inside')
df['TEMP2MMAX'] = df['TEMP2MMAX'].interpolate(method='pchip', limit_area='inside')
df['SR'] = df['SR'].interpolate(method='pchip', limit_area='inside') # We will need this later for the crop model

In [11]:
# Compute the sum of all missing values (represented as NaN) for each variable of the DataFrame
df.isna().sum()

TIMESTAMP       0
PRESSUREAVG     3
TEMP2MMIN       0
TEMP2MMAX       0
RELHUM2MMAX     4
RELHUM2MMIN     4
PRECIP          3
SR              0
WSPD2MAVG       3
WDIR2M          3
SOILTMP5AVG     3
VWC5CM         10
dtype: int64
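
The effect of `limit_area='inside'` is easier to see on a tiny series. This sketch swaps in `method='linear'` instead of `pchip` so that the result can be checked by hand. Only the gap surrounded by valid values is filled, while the leading and trailing gaps are left as NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical series with gaps at the start, in the middle, and at the end
series = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

# Fill only the gaps that are surrounded by valid values
filled = series.interpolate(method='linear', limit_area='inside')

print(filled.tolist())  # [nan, 1.0, 2.0, 3.0, nan]
```

This is a reasonable default for weather records, since it avoids extrapolating beyond the observed period.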

## Plot weather data <a id='plot-weather-data'>

It's always a good idea to visualize your dataset (or at least the variables in question) before you get too far. Figures can reveal anomalies, values out of range, and gaps in the weather record that we failed to account for with our previous code.

>Tip: Always plot your data, particularly during code development.

In [12]:
# Create a figure to visualize temperature data
f1 = figure(x_axis_type='datetime', plot_width=800, plot_height=300, title='Ashland Bottoms, KS')
f1.line(df['TIMESTAMP'], df['TEMP2MMAX'], legend_label='Tmax', color='tomato')
f1.line(df['TIMESTAMP'], df['TEMP2MMIN'], legend_label='Tmin', color='navy')
f1.yaxis.axis_label = 'Air Temperature (Celsius)'
f1.legend.location = 'top_right'
f1.legend.click_policy="hide"
show(f1)

# Try clicking on the legend to hide the selected timeseries.

In Bokeh the final size is dictated by the width and height of the figure, so rather than setting a specific DPI (dots per inch), we can directly set the dimensions of the figure.

At this point the easiest way to save a figure is to do it directly from the plot toolbar. To do it programmatically, follow [this link](https://docs.bokeh.org/en/latest/docs/user_guide/export.html?highlight=export_png) and the code snippet below:

`from bokeh.io import export_png
export_png(f1, filename="weather_ashland_2019_2020.png", width=800, height=300)`

>Unlike JPG, the PNG format uses lossless data compression. Also consider exporting figures in SVG format.

Learn more about Bokeh plotting configuration and options:

- [Basic charts](https://docs.bokeh.org/en/latest/docs/user_guide/plotting.html)
- [Figure options](https://docs.bokeh.org/en/latest/docs/user_guide/tools.html)
- [Layouts](https://docs.bokeh.org/en/latest/docs/user_guide/layout.html)
- [Annotations](https://docs.bokeh.org/en/latest/docs/user_guide/annotations.html)

## Define growing season period <a id='define-growing-season'>

In [14]:
# Define the start and end of the growing season for winter wheat in central Kansas
planting_date = pd.to_datetime('15-Oct-2019', format='%d-%b-%Y')
harvest_date = pd.to_datetime('1-Jun-2020', format='%d-%b-%Y')
growing_season_duration = harvest_date - planting_date

print('Planting date:', planting_date)
print('Harvest date:', harvest_date)
print('Growing season duration:', growing_season_duration.days + 1) # Add one day to include the last day


Planting date: 2019-10-15 00:00:00
Harvest date: 2020-06-01 00:00:00
Growing season duration: 231
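
As a cross-check of the "add one day" step, the sketch below counts the days of the season in two ways: by subtracting the dates and by generating every calendar day in the range. The variable names are illustrative:

```python
import pandas as pd

planting = pd.Timestamp('2019-10-15')
harvest = pd.Timestamp('2020-06-01')

# Subtraction gives the elapsed days, so add one to include the harvest day
days_by_subtraction = (harvest - planting).days + 1

# date_range includes both end points by default
days_by_counting = len(pd.date_range(planting, harvest, freq='D'))

print(days_by_subtraction, days_by_counting)  # 231 231
```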


## Select weather data for the growing season <a id='select-weather-growing-season'>

Our dataset spans two years (i.e. 2019 and 2020), but we are only interested in computing thermal units for the duration of the growing season.

In [15]:
# First, select rows that belong to the growing season
idx_season = (df['TIMESTAMP'] >= planting_date) & (df['TIMESTAMP'] <= harvest_date)
print(idx_season) # Use print(idx_season[280:295]) to see the transition

0      False
1      False
2      False
3      False
4      False
       ...  
726    False
727    False
728    False
729    False
730    False
Name: TIMESTAMP, Length: 731, dtype: bool


In [16]:
# Now that we have the boolean, we can use it to select the rows within the growing season
# We will create another dataframe (.copy() makes it independent of df, so later edits do not raise warnings)

df_season = df.loc[idx_season,:].copy() # Alternatively you can do df_season = df[idx_season].copy()
df_season.head()

Unnamed: 0,TIMESTAMP,PRESSUREAVG,TEMP2MMIN,TEMP2MMAX,RELHUM2MMAX,RELHUM2MMIN,PRECIP,SR,WSPD2MAVG,WDIR2M,SOILTMP5AVG,VWC5CM
287,2019-10-15,97.47,1.66,25.02,99.35,33.92,0.0,18.64,1.64,143.85,13.25,0.42474
288,2019-10-16,97.83,5.05,17.89,99.76,25.41,0.0,18.1,2.88,299.71,13.8,0.42305
289,2019-10-17,98.32,2.66,15.85,92.79,37.5,0.0,18.09,1.32,238.41,12.57,0.41737
290,2019-10-18,97.39,3.02,24.75,96.25,26.42,0.0,18.03,1.94,152.22,13.15,0.41356
291,2019-10-19,96.67,12.25,24.88,91.98,35.83,1.27,17.54,3.32,141.37,14.43,0.40949
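
The boolean selection above can also be written with the `between()` method of a Pandas Series, which includes both end points by default. A minimal sketch on a few made-up dates, showing that both approaches give the same mask:

```python
import pandas as pd

# Five consecutive days starting on 13 October 2019
dates = pd.Series(pd.date_range('2019-10-13', periods=5, freq='D'))
start = pd.Timestamp('2019-10-14')
end = pd.Timestamp('2019-10-16')

# Two equivalent ways of building the boolean mask
mask_operators = (dates >= start) & (dates <= end)
mask_between = dates.between(start, end)

print(mask_between.tolist())  # [False, True, True, True, False]
```

Which form to use is a matter of taste. The explicit comparison makes the inclusive limits more visible.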


In [17]:
# The last step, although not necessary, is to reset the index
df_season.reset_index(inplace=True)
df_season.head()

Unnamed: 0,index,TIMESTAMP,PRESSUREAVG,TEMP2MMIN,TEMP2MMAX,RELHUM2MMAX,RELHUM2MMIN,PRECIP,SR,WSPD2MAVG,WDIR2M,SOILTMP5AVG,VWC5CM
0,287,2019-10-15,97.47,1.66,25.02,99.35,33.92,0.0,18.64,1.64,143.85,13.25,0.42474
1,288,2019-10-16,97.83,5.05,17.89,99.76,25.41,0.0,18.1,2.88,299.71,13.8,0.42305
2,289,2019-10-17,98.32,2.66,15.85,92.79,37.5,0.0,18.09,1.32,238.41,12.57,0.41737
3,290,2019-10-18,97.39,3.02,24.75,96.25,26.42,0.0,18.03,1.94,152.22,13.15,0.41356
4,291,2019-10-19,96.67,12.25,24.88,91.98,35.83,1.27,17.54,3.32,141.37,14.43,0.40949


In [18]:
# Plot air temperature for growing season only (NOTE: We are using an alternative plotting notation in Bokeh)
f2 = figure(x_axis_type='datetime', plot_width=800, plot_height=300, title='Ashland Bottoms, KS | 2019-2020 Wheat growing season')
f2.line(source=df_season, x='TIMESTAMP', y='TEMP2MMAX', legend_label='Tmax', color='tomato')
f2.line(source=df_season, x='TIMESTAMP', y='TEMP2MMIN', legend_label='Tmin', color='navy')
f2.yaxis.axis_label = 'Air Temperature (Celsius)'
f2.legend.location = 'bottom_right'
f2.legend.click_policy="hide"
show(f2)

In [19]:
# Find day of highest air temperature

T_max = df_season['TEMP2MMAX'].max()
idx_T_max = df_season['TEMP2MMAX'].idxmax()
T_max_day = df_season.loc[idx_T_max, 'TIMESTAMP'] # Add .date() to only display the date portion of the timestamp

print('The maximum air temperature was', T_max, 'Celsius on', T_max_day)


# Find day of lowest air temperature

T_min = df_season['TEMP2MMIN'].min()
idx_T_min = df_season['TEMP2MMIN'].idxmin()
T_min_day = df_season.loc[idx_T_min, 'TIMESTAMP']

print('The minimum air temperature was', T_min, 'Celsius on', T_min_day)

The maximum air temperature was 30.2 Celsius on 2020-04-08 00:00:00
The minimum air temperature was -16.75 Celsius on 2020-02-14 00:00:00


## Compute thermal units <a id='compute-thermal-units'>

Since there are several ways of computing thermal units, we will formally define our method (McMaster and Wilhelm, 1997). The average temperature is limited to the range between $T_{base}$ and $T_{upper}$ before subtracting $T_{base}$:

$$ if \ T_{avg} < T_{base}, \ then \ TU = 0 $$

$$ if \ T_{avg} > T_{upper}, \ then \ TU = T_{upper} - T_{base} $$

$$ if \ T_{avg} \geq T_{base} \ and \ T_{avg} \leq T_{upper}, \ then \ TU = T_{avg} - T_{base} $$
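
One way to package this rule as a reusable Python function is sketched below. The function name and default arguments are hypothetical, and the sketch assumes that thermal units are counted relative to the base temperature. With a base temperature of 0 Celsius this equals the clipped average temperature:

```python
import numpy as np

def compute_thermal_units(t_avg, t_base=0.0, t_upper=35.0):
    """Daily thermal units (C-day) from daily average air temperature."""
    # Limit the average temperature to the range [t_base, t_upper]
    t_clipped = np.clip(t_avg, t_base, t_upper)
    # Thermal units are the degrees above the base temperature
    return t_clipped - t_base

t_avg_example = np.array([-5.0, 10.0, 40.0])
tu_example = compute_thermal_units(t_avg_example)
print(tu_example)  # [ 0. 10. 35.]
```

Writing the rule as a function makes it easy to reuse with other crops, for example `compute_thermal_units(t_avg_example, t_base=4.0)`.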


In [20]:
# Define winter wheat cardinal temperatures
T_base = 0   # Celsius
T_upper = 35 # Celsius

In [21]:
# Compute average temperature
T_avg = (df_season['TEMP2MMIN'] + df_season['TEMP2MMAX'])/2

# Insert Tavg into DataFrame after Tmax using same notation as other temperature variables
df_season.insert(5, 'TEMP2MAVG', T_avg)

# Check our work
df_season.head()

Unnamed: 0,index,TIMESTAMP,PRESSUREAVG,TEMP2MMIN,TEMP2MMAX,TEMP2MAVG,RELHUM2MMAX,RELHUM2MMIN,PRECIP,SR,WSPD2MAVG,WDIR2M,SOILTMP5AVG,VWC5CM
0,287,2019-10-15,97.47,1.66,25.02,13.34,99.35,33.92,0.0,18.64,1.64,143.85,13.25,0.42474
1,288,2019-10-16,97.83,5.05,17.89,11.47,99.76,25.41,0.0,18.1,2.88,299.71,13.8,0.42305
2,289,2019-10-17,98.32,2.66,15.85,9.255,92.79,37.5,0.0,18.09,1.32,238.41,12.57,0.41737
3,290,2019-10-18,97.39,3.02,24.75,13.885,96.25,26.42,0.0,18.03,1.94,152.22,13.15,0.41356
4,291,2019-10-19,96.67,12.25,24.88,18.565,91.98,35.83,1.27,17.54,3.32,141.37,14.43,0.40949


In [22]:
# Now we can compute thermal units
TU_daily = np.minimum(np.maximum(df_season['TEMP2MAVG'], T_base), T_upper) - T_base # Subtracting T_base has no effect here because T_base = 0

# Cumulative thermal time
TU_cumulative = TU_daily.cumsum() # To directly insert into DataFrame use df_season['TU_cumulative'] = TU_daily.cumsum()

# Add the cumulative thermal units as the last column of the growing season DataFrame (just for completeness, but not really needed)
df_season.insert(len(df_season.columns), 'TU_cumulative', TU_cumulative)

In [23]:
# Plot cumulative thermal units
f3 = figure(x_axis_type='datetime', plot_width=800, plot_height=300, title='2019-2020 Wheat growing season')
f3.line(source=df_season, x='TIMESTAMP', y='TU_cumulative', name='Thermal Units', color='navy')
f3.yaxis.axis_label = 'Thermal Units (C-day)'
show(f3)

In [24]:
# Find total thermal units for growing season
TU_total = df_season['TU_cumulative'].iloc[-1] 

# Alternatively:
# TU_total = TU_cumulative.values[-1]
# This is possible because we already defined TU_cumulative (which is a Pandas series) 
# and because Pandas values are ultimately Numpy arrays.

print('The 2019-2020 winter wheat growing season had a total of', round(TU_total), 'thermal units')

The 2019-2020 winter wheat growing season had a total of 1656 thermal units


## Implement simple crop model <a id='model-potential-yield'>

This part consists of three steps:

1. Define model equations
2. Define model parameters and units
3. Compute potential leaf area index and above-ground plant biomass


The model is simple and assumes no water and nutrient limitations:

**Leaf Area Index (LAI)**

The LAI represents the total one-sided leaf area per unit of ground area.

$$ LAI_t = LAI_{max} \Bigg[ \frac{1}{1+e^{-\alpha(T_t - T_1)}} -e^{\beta(T_t-T_2)} \Bigg] $$



**Above-ground plant biomass**

$$ B_t = B_{t-1} + E_b \; E_{imax} \Bigg[ 1 - e^{-K \; LAI_t} \Bigg] PAR_t $$

where the parameter $T_2$ is defined by:

$$ T_2 = \frac{1}{\beta} \ln[1 + e^{(\alpha \; T_1)}]$$


$B$ is the above-ground plant dry biomass in $g \; m^{-2}$ (named `AGB` in the code)

$T$ is the cumulative thermal units (growing degree days)

$t$ is time in days

$LAI$ is the leaf area index

$LAI_{max}$ is the maximum leaf area index during the entire growing season.

$PAR$ is the photosynthetically active radiation

$K$ is the light extinction coefficient

$T_1$ is a growth threshold

$E_b$ is the intercepted radiation use efficiency

$E_{imax}$ is the maximal value of intercepted to incident solar radiation 

$\alpha$ and $\beta$ are empirical parameters
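
The LAI equation can be wrapped in a function, which is a convenient way to explore how the parameters shape the curve. This is a minimal sketch. The function name is hypothetical and the default values are simply the parameters used in this workshop. Note that the definition of $T_2$ makes the LAI equal to zero when the cumulative thermal units are zero:

```python
import numpy as np

def leaf_area_index(tu, lai_max=7.0, t1=700.0, alpha=0.005, beta=0.002):
    """Potential leaf area index as a function of cumulative thermal units."""
    # T2 is derived from the other parameters
    t2 = 1 / beta * np.log(1 + np.exp(alpha * t1))
    # Logistic growth term minus exponential senescence term
    lai = lai_max * (1 / (1 + np.exp(-alpha * (tu - t1))) - np.exp(beta * (tu - t2)))
    # LAI cannot be negative
    return np.maximum(lai, 0)

tu_values = np.array([0.0, 700.0, 1400.0])
lai_values = leaf_area_index(tu_values)
print(lai_values.round(2))
```

Changing one argument at a time, such as `t1` or `lai_max`, and plotting the result is a quick way to build intuition about the model.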

In [25]:
# Define model parameters

# Estimate Photosynthetically Active Radiation
PAR = df_season['SR'] * 0.48

# Define model parameters
Eb = 1.85 # g/MJ (radiation use efficiency)
Eimax = 0.95
K = 0.7
LAI_max = 7
T1 = 700 # Thermal units
alpha = 0.005
beta = 0.002
T2 = 1/beta * np.log(1 + np.exp(alpha*T1)) # Thermal units
HI = 0.45 # Approximate harvest index


In [27]:
# Compute LAI
LAI = LAI_max * ( 1/(1+np.exp(-alpha*(TU_cumulative-T1))) - np.exp(beta*(TU_cumulative-T2)) )
LAI = np.maximum(LAI, 0)

# Compute biomass
AGB = Eb * Eimax * (1-np.exp(-K*LAI)) * PAR
AGB_cumulative = AGB.cumsum()

# Print estimated total above-ground plant biomass
AGB_total = AGB_cumulative.iloc[-1]
print('Potential above-ground biomass was:', round(AGB_total), 'g/m^2' )

# Print estimated potential grain yield
grain_yield = AGB_total * 10 * HI
print('Potential grain yield was:', round(grain_yield), 'kg/ha')

Potential above-ground biomass was: 1623 g/m^2
Potential grain yield was: 7303 kg/ha
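
The factor of 10 in the yield calculation is a unit conversion: 1 g/m^2 equals 10,000 g/ha, which is 10 kg/ha. The sketch below isolates that step in a small function. The function name, the constant name, and the biomass value of 1000 g/m^2 are hypothetical:

```python
G_PER_M2_TO_KG_PER_HA = 10  # 1 g/m^2 = 10,000 g/ha = 10 kg/ha

def potential_grain_yield(biomass_g_m2, harvest_index=0.45):
    """Grain yield in kg/ha from above-ground biomass in g/m^2."""
    return biomass_g_m2 * G_PER_M2_TO_KG_PER_HA * harvest_index

example_yield = potential_grain_yield(1000.0)
print(round(example_yield))  # 4500
```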


In [28]:
# Plot LAI and biomass
f4 = figure(x_axis_type='datetime', title='2019-2020 Wheat growing season')
f4.line(df_season['TIMESTAMP'], LAI, name='Leaf Area Index', color='green')
f4.yaxis.axis_label = 'Leaf Area Index'

f5 = figure(x_axis_type='datetime', title='2019-2020 Wheat growing season')
f5.line(df_season['TIMESTAMP'], AGB_cumulative, name='Above-ground biomass', color='tomato')
f5.yaxis.axis_label = 'Above-ground plant biomass (g/m^2)'

grid = gridplot([[f4, f5]], plot_width=500, plot_height=300)

show(grid)


## Practice <a id="practice">

- Find the total amount of precipitation during the growing season
- Find the day with the largest daily rainfall event
- Generate a figure with two subplots: one showing daily rainfall and the other subplot showing the cumulative rainfall during the growing season.

## References <a id='references'>

Baret, F. 1986. Contribution au suivi radiométrique de cultures de céréales. Ph.D. Dissertation, Université Orsay, Orsay, France.

McMaster, G.S. and Wilhelm, W.W., 1997. Growing degree-days: one equation, two interpretations. Agricultural and forest meteorology, 87(4), pp.291-300.

Wallach, D., Makowski, D., and Jones, J.W. 2006. Working with dynamic crop models. Chapter 3: Uncertainty and sensitivity analysis for crop models by Monod, H., Naud, C., and Makowski, D.

Patrignani, A., Knapp, M., Redmond, C. and Santos, E., 2020. Technical overview of the Kansas Mesonet. Journal of Atmospheric and Oceanic Technology, 37(12), pp.2167-2183.

Van Rossum, G., Warsaw, B. and Coghlan, N., 2001. PEP 8: style guide for Python code. Python Software Foundation. Link: https://www.python.org/dev/peps/pep-0008/ Accessed: 25-April-2021.

## Q&A <a id='q&a'>

**Question:** I don't have a result on the screen

**Answer instructor:** A few things could be going on here:
   1. Make sure you run the code cell. You can do this by pressing `ctrl + Enter`
   2. Make sure your cell is a `code` cell (default) and not a `markdown` cell.
   3. It is possible that the result is saved into a variable. Use the `print()` function to display the content of the variable.

<br/>

**Question:** Can I choose DPI, size, format before exporting the figure? How to export?

**Answer instructor:** Bokeh does not have a way to specify the DPI. The resolution is determined by the width and the height of the plot. For a higher resolution, simply set up a larger figure. Read [this discussion](https://github.com/bokeh/bokeh/issues/8807) from one of the main Bokeh developers.

**Answer by another student:** If you use a plotting library called Matplotlib, the following code should save your figure `plt.savefig('weather.png', dpi=500, bbox_inches='tight')`. This code will not work in Bokeh.

<br/>

**Question:**
Is there a way to show a floating TOC on notebooks?

**Answer instructor:** A floating TOC is possible by installing an extension in Jupyter Lab. Look at the Extension Manager (icon of a puzzle piece to your left). Some extensions are useful and handy for improving the styling of the notebook. Many features that involve interactivity likely rely on another language called JavaScript, which means that you will end up installing several more dependencies. I added a TOC to this notebook using basic Markdown and HTML, which does not require additional installations. [Check this extension](https://github.com/jupyterlab/jupyterlab-toc). 

<br/>

**Question:** I think you used a dot to perform a sequence of math operations, but does it work for more complex functions? I'm thinking if there's an analog to the pipe in R (i.e. use the object at the left as the first argument of the function at the right). Thanks!!!

**Answer instructor:** Python has no built-in equivalent of the R pipe (`%>%` or `|>`). In Python the `|` symbol is the *bitwise OR* operator, which is widely used to perform vectorized boolean operations, notably when using the Numpy module or modules heavily based on Numpy. The closest analog is method chaining: Python, and particularly modules like Pandas, allow the user to chain multiple methods, and Pandas objects also provide a `.pipe()` method that passes the object on the left as the first argument of a function. I typically advise against chaining more than two operations to maintain code readability, but this is a personal choice. One-line statements are neat and force the programmer to test their skill and learn tricks; just make sure your code does not become cryptic.
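
A minimal sketch of both styles on a small made-up series of temperatures is shown below. The helper function `fahrenheit_to_celsius` is hypothetical and defined only for this example:

```python
import pandas as pd

temps_f = pd.Series([50.0, 68.0, 86.0])  # degrees Fahrenheit

def fahrenheit_to_celsius(series):
    """Convert a Series of temperatures from Fahrenheit to Celsius."""
    return (series - 32) * 5 / 9

# Method chaining: each method acts on the result of the previous one
mean_c_chained = temps_f.sub(32).mul(5 / 9).mean()

# .pipe() passes the object on the left as the first argument of the function
mean_c_piped = temps_f.pipe(fahrenheit_to_celsius).mean()

print(round(mean_c_chained, 1), round(mean_c_piped, 1))  # 20.0 20.0
```

Both versions give the same result, so the choice comes down to which one reads more clearly in context.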


