#### Pandas Tutorial - Part 33

This notebook covers:
- Advanced operations with Timedeltas
- Timedelta reductions
- Pandas options and settings

In [1]:
import pandas as pd
import numpy as np
import datetime

%matplotlib inline

##### Advanced Operations with Timedeltas

Continuing from Part 32, let's explore more operations with Timedeltas.

In [2]:
# Create a Series of dates
s = pd.Series(pd.date_range('2012-1-1', periods=3, freq='D'))
s

0   2012-01-01
1   2012-01-02
2   2012-01-03
dtype: datetime64[ns]

In [3]:
# Create a Series of timedeltas by subtracting the first date
y = s - s[0]
y

0   0 days
1   1 days
2   2 days
dtype: timedelta64[ns]

### Setting NaT in Timedeltas

Elements can be set to NaT by assigning np.nan, just as with datetimes:

In [4]:
# Set an element to NaT
y[1] = np.nan
y

0   0 days
1      NaT
2   2 days
dtype: timedelta64[ns]
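As a minimal sketch of the same idea, the block below assigns pd.NaT directly instead of np.nan and then locates the missing entry with isna. The variable names (dates, deltas, missing_mask) are illustrative and not part of the original notebook.

```python
import pandas as pd

# Rebuild a small timedelta Series (illustrative names)
dates = pd.Series(pd.date_range("2012-01-01", periods=3, freq="D"))
deltas = dates - dates.iloc[0]

# Assigning pd.NaT marks the element as missing without changing the dtype kind
deltas.iloc[1] = pd.NaT

# isna() reports which positions hold NaT
missing_mask = deltas.isna()
print(missing_mask.tolist())
```

Using pd.NaT makes the intent explicit when reading the code, while np.nan is converted to NaT by pandas in this context.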

### Reversed Order Operations

Operands can also appear in reversed order (a scalar on the left, a Series on the right):

In [5]:
# Subtract a Series from a scalar
s.max() - s

0   2 days
1   1 days
2   0 days
dtype: timedelta64[ns]

In [6]:
# Subtract a Series from a datetime
datetime.datetime(2011, 1, 1, 3, 5) - s

0   -365 days +03:05:00
1   -366 days +03:05:00
2   -367 days +03:05:00
dtype: timedelta64[ns]

In [7]:
# Add a timedelta to a Series
datetime.timedelta(minutes=5) + s

0   2012-01-01 00:05:00
1   2012-01-02 00:05:00
2   2012-01-03 00:05:00
dtype: datetime64[ns]
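The following sketch checks two properties that the cells above suggest: addition gives the same result regardless of operand order, while swapping the operands of a subtraction flips the sign. The reference timestamp and variable names are assumptions chosen for illustration.

```python
import datetime

import pandas as pd

dates = pd.Series(pd.date_range("2012-01-01", periods=3, freq="D"))
offset = datetime.timedelta(minutes=5)

# Addition is commutative: scalar + Series equals Series + scalar
left = offset + dates
right = dates + offset
same = left.equals(right)

# Subtraction is not: swapping the operands negates every element
ref = pd.Timestamp("2012-01-03")  # illustrative reference point
forward = ref - dates
backward = dates - ref
mirrored = forward.equals(-backward)

print(same, mirrored)
```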

### Min, Max and Index Operations

min, max, and the corresponding idxmin and idxmax operations are supported on DataFrames:

In [8]:
# Create two Series of timedeltas
A = s - pd.Timestamp('20120101') - pd.Timedelta('00:05:05')
B = s - pd.Series(pd.date_range('2012-1-2', periods=3, freq='D'))

# Create a DataFrame with these Series
df = pd.DataFrame({'A': A, 'B': B})
df

                  A       B
0 -1 days +23:54:55 -1 days
1   0 days 23:54:55 -1 days
2   1 days 23:54:55 -1 days


In [9]:
# Find the minimum value in each column
df.min()

A   -1 days +23:54:55
B   -1 days +00:00:00
dtype: timedelta64[ns]

In [10]:
# Find the minimum value in each row
df.min(axis=1)

0   -1 days
1   -1 days
2   -1 days
dtype: timedelta64[ns]

In [11]:
# Find the index of the minimum value in each column
df.idxmin()

A    0
B    0
dtype: int64

In [12]:
# Find the index of the maximum value in each column
df.idxmax()

A    2
B    0
dtype: int64
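Column B holds the same value in every row, yet idxmin and idxmax both report 0. The sketch below, using a hypothetical Series named ties, illustrates that when several elements tie, the label of the first occurrence is returned.

```python
import pandas as pd

# Three identical timedeltas, so every position is both the min and the max
ties = pd.Series(pd.to_timedelta(["-1 days", "-1 days", "-1 days"]))

# Both lookups return the label of the first tied element
first_min = ties.idxmin()
first_max = ties.idxmax()

print(first_min, first_max)
```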

min, max, idxmin, idxmax operations are supported on Series as well. A scalar result will be a Timedelta.

In [13]:
# Find the maximum of the minimum values in each column
df.min().max()

Timedelta('-1 days +23:54:55')

In [14]:
# Find the minimum of the minimum values in each row
df.min(axis=1).min()

Timedelta('-1 days +00:00:00')

In [15]:
# Find the column name with the maximum of the minimum values
df.min().idxmax()

'A'

In [16]:
# Find the row index with the minimum of the minimum values
df.min(axis=1).idxmin()

0

### Filling NaT Values

You can call fillna on a timedelta Series, passing a Timedelta as the value that replaces each NaT.

In [17]:
# Create a Series with NaT values
y = s - s.shift()
y

0      NaT
1   1 days
2   1 days
dtype: timedelta64[ns]

In [18]:
# Fill NaT values with 0 days
y.fillna(pd.Timedelta(0))

0   0 days
1   1 days
2   1 days
dtype: timedelta64[ns]

In [19]:
# Fill NaT values with 10 seconds
y.fillna(pd.Timedelta(10, unit='s'))

0   0 days 00:00:10
1   1 days 00:00:00
2   1 days 00:00:00
dtype: timedelta64[ns]

In [20]:
# Fill NaT values with a negative timedelta
y.fillna(pd.Timedelta('-1 days, 00:00:05'))

0   -1 days +00:00:05
1     1 days 00:00:00
2     1 days 00:00:00
dtype: timedelta64[ns]
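Besides a fixed replacement value, a gap can also be filled from a neighbouring element. The sketch below is one possible variation, not something the original notebook does: it builds the same kind of Series with diff (which should be equivalent to s - s.shift() here) and compares fillna with bfill.

```python
import pandas as pd

dates = pd.Series(pd.date_range("2012-01-01", periods=3, freq="D"))

# diff() leaves NaT in the first position, like s - s.shift()
gaps = dates.diff()

# Option 1: replace NaT with a fixed Timedelta
zero_filled = gaps.fillna(pd.Timedelta(0))

# Option 2: back-fill NaT from the next valid element
back_filled = gaps.bfill()

print(zero_filled.iloc[0], back_filled.iloc[0])
```

Which option is appropriate depends on what the missing interval means in the data; a zero fill and a back-fill give different totals.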

### Negation, Multiplication, and Absolute Value

You can also negate, multiply and use abs on Timedeltas:

In [21]:
# Create a negative timedelta
td1 = pd.Timedelta('-1 days 2 hours 3 seconds')
td1

Timedelta('-2 days +21:59:57')

In [22]:
# Multiply by -1
-1 * td1

Timedelta('1 days 02:00:03')

In [23]:
# Negate
- td1

Timedelta('1 days 02:00:03')

In [24]:
# Absolute value
abs(td1)

Timedelta('1 days 02:00:03')
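Two points from the cells above can be checked in a short sketch: the leading minus sign in the string applies to the whole duration, and the same operators work element-wise on a Series. The Series contents are made up for illustration.

```python
import pandas as pd

# The sign applies to the entire duration, not only to the days component
td1 = pd.Timedelta("-1 days 2 hours 3 seconds")
sign_applies_to_whole = td1 == -pd.Timedelta("1 days 2 hours 3 seconds")

# The same operations applied element-wise to a Series (illustrative values)
deltas = pd.Series(pd.to_timedelta(["-1 days", "2 hours", "-3 seconds"]))
negated = -deltas
magnitudes = deltas.abs()
doubled = deltas * 2

print(sign_applies_to_whole)
print(magnitudes.tolist())
```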

##### Timedelta Reductions

Numeric reduction operations for timedelta64[ns] will return Timedelta objects. As usual, NaT values are skipped during evaluation.

In [25]:
# Create a Series with timedeltas and NaT
y2 = pd.Series(pd.to_timedelta(['-1 days +00:00:05', 'nat',
                               '-1 days +00:00:05', '1 days']))
y2

0   -1 days +00:00:05
1                 NaT
2   -1 days +00:00:05
3     1 days 00:00:00
dtype: timedelta64[ns]

In [26]:
# Calculate the mean
y2.mean()

Timedelta('-1 days +16:00:03.333333334')

In [27]:
# Calculate the median
y2.median()

Timedelta('-1 days +00:00:05')

In [28]:
# Calculate the 10th percentile
y2.quantile(.1)

Timedelta('-1 days +00:00:05')
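To make the NaT-skipping behaviour concrete, the sketch below compares the default reduction with one that passes skipna=False. The Series named values is hypothetical and uses round numbers so the results are easy to verify by hand.

```python
import pandas as pd

# One missing value between two valid durations (illustrative data)
values = pd.Series(pd.to_timedelta(["1 days", "nat", "3 days"]))

# By default NaT is ignored: the sum and mean use the two valid entries
total = values.sum()
mean_default = values.mean()

# With skipna=False a single NaT makes the result NaT
mean_strict = values.mean(skipna=False)

print(total, mean_default, mean_strict)
```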

##### Pandas Options and Settings

Pandas has an options system that lets you customize some aspects of its behavior, with display-related options being those the user is most likely to adjust.

### Overview

Options have a full "dotted-style", case-insensitive name (e.g., display.max_rows). You can get or set an option directly as an attribute of the top-level pd.options object.

In [29]:
# Get the current value of display.max_rows
pd.options.display.max_rows

60

In [30]:
# Set a new value
pd.options.display.max_rows = 999
pd.options.display.max_rows

999

### Getting and Setting Options

The API consists of five functions, available directly from the pandas namespace:
- `get_option()` / `set_option()` - get/set the value of a single option.
- `reset_option()` - reset one or more options to their default value.
- `describe_option()` - print the descriptions of one or more options.
- `option_context()` - execute a codeblock with a set of options that revert to prior settings after execution.
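Of the functions listed above, describe_option prints its result rather than returning it. The sketch below captures that printed text with the standard library's redirect_stdout so it can be inspected as a string; the capturing approach is an assumption made for illustration, and calling pd.describe_option("display.max_rows") on its own is enough in a notebook.

```python
import contextlib
import io

import pandas as pd

# Capture what describe_option prints for a single, fully named option
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    pd.describe_option("display.max_rows")

description = buffer.getvalue()
print(description)
```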

In [31]:
# Get an option using get_option
pd.get_option("display.max_rows")

999

In [32]:
# Set an option using set_option
pd.set_option("display.max_rows", 101)
pd.get_option("display.max_rows")

101
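The attribute style and the function style read and write the same underlying setting. The sketch below checks this, and restores the previous value at the end so it does not disturb the rest of the session. The value 25 is arbitrary.

```python
import pandas as pd

# Remember the current setting so it can be restored afterwards
original = pd.get_option("display.max_rows")

# Write through the attribute interface, read through the function interface
pd.options.display.max_rows = 25
via_function = pd.get_option("display.max_rows")

# Write through the function interface, read through the attribute interface
pd.set_option("display.max_rows", original)
via_attribute = pd.options.display.max_rows

print(via_function, via_attribute == original)
```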

In [34]:
# Option names are matched as patterns, so an unambiguous substring is enough
pd.set_option("display.max_r", 102)
print("Current value of display.max_rows:", pd.get_option("display.max_rows"))

# The fully qualified name is the most reliable choice
pd.set_option("display.max_rows", 102)
print("Current value of display.max_rows:", pd.get_option("display.max_rows"))

# A shorter pattern such as "max_r" may match more than one option
# (for example, styler.render.max_rows in recent pandas versions),
# in which case pandas raises an error instead of guessing

Current value of display.max_rows: 102
Current value of display.max_rows: 102


In [35]:
# This will not work because it matches multiple option names
try:
    pd.get_option("column")
except KeyError as e:
    print(e)

Pattern matched multiple keys


In [36]:
# Get and set a different option
pd.get_option('mode.sim_interactive')

False

In [37]:
pd.set_option('mode.sim_interactive', True)
pd.get_option('mode.sim_interactive')

True
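When a setting is only needed temporarily, option_context can apply it inside a with block and revert it afterwards. The sketch below is a minimal illustration; the option values 10 and 5 are arbitrary.

```python
import pandas as pd

# Record the setting before entering the context
before = pd.get_option("display.max_rows")

# Options are passed as alternating name/value pairs
with pd.option_context("display.max_rows", 10, "display.max_columns", 5):
    inside_rows = pd.get_option("display.max_rows")
    inside_cols = pd.get_option("display.max_columns")

# On exit, the previous values are back in effect
after = pd.get_option("display.max_rows")

print(inside_rows, inside_cols, after == before)
```

This avoids having to remember to call set_option or reset_option by hand once the temporary display settings are no longer needed.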

### Resetting Options

All options have a default value, and you can use reset_option to revert to that default:

In [38]:
# Check current value
pd.get_option("display.max_rows")

102

In [39]:
# Set to a new value
pd.set_option("display.max_rows", 999)
pd.get_option("display.max_rows")

999

In [40]:
# Reset to default
pd.reset_option("display.max_rows")
pd.get_option("display.max_rows")

60

##### Conclusion

In this notebook, we've explored:

1. Advanced operations with Timedeltas, including:
   - Setting NaT values
   - Reversed order operations
   - Min, max, and index operations
   - Filling NaT values
   - Negation, multiplication, and absolute value

2. Timedelta reductions like mean, median, and quantile

3. Pandas options and settings system, including:
   - Getting and setting options
   - Using the options API functions
   - Resetting options to defaults

These features provide powerful tools for working with time-related data and customizing pandas behavior.