# Calendar construction and properties

This tutorial covers calendar construction and offers a walkthrough of `ExchangeCalendar` properties (for methods, see the [calendar methods](./calendar_methods.ipynb) tutorial).

NB properties that _define_ a calendar (`open_times`, `special_closes_adhoc`, etc.) are not covered by this tutorial (see the [How can I create a new calendar](https://github.com/gerrymanoim/exchange_calendars/tree/master#how-can-i-create-a-new-calendar) section of the [README](https://github.com/gerrymanoim/exchange_calendars/tree/master)).

In [1]:
# set up
import exchange_calendars as xcals
import pandas as pd

hkg = xcals.get_calendar("XHKG")  # Hong Kong Stock Exchange
nys = xcals.get_calendar("XNYS")  # New York Stock Exchange

### Calendar construction

The **`default_start`** and **`default_end`** class methods return the default calendar bounds, usually '20 years ago' and '1 year from now' respectively. These bounds represent the limits within which sessions and minutes can be interrogated and queried via the calendar's properties and methods.

In [2]:
nys.default_start(), nys.default_end()

(Timestamp('2002-06-27 00:00:00'), Timestamp('2023-06-27 00:00:00'))
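To make '20 years ago' and '1 year from now' concrete, the following is a minimal pandas-only sketch of how such bounds could be derived. It assumes the bounds are anchored on today's date floored to midnight, which is consistent with the output above but is not taken from the library's source. The names `approx_default_start` and `approx_default_end` are hypothetical.

```python
import pandas as pd

# Assumption: bounds are relative to today's date at midnight.
today = pd.Timestamp.now().floor("D")

# Hypothetical names, for illustration only.
approx_default_start = today - pd.DateOffset(years=20)  # '20 years ago'
approx_default_end = today + pd.DateOffset(years=1)  # '1 year from now'

print(approx_default_start, approx_default_end)
```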

The `sessions` property returns all sessions covered by a calendar. The New York calendar created above can be seen to cover sessions over the period defined by the default bounds.

In [3]:
nys.sessions

DatetimeIndex(['2002-06-27', '2002-06-28', '2002-07-01', '2002-07-02',
               '2002-07-03', '2002-07-05', '2002-07-08', '2002-07-09',
               '2002-07-10', '2002-07-11',
               ...
               '2023-06-13', '2023-06-14', '2023-06-15', '2023-06-16',
               '2023-06-20', '2023-06-21', '2023-06-22', '2023-06-23',
               '2023-06-26', '2023-06-27'],
              dtype='datetime64[ns]', length=5286, freq='C')

Alternatively the calendar bounds can be defined by passing the `start` and/or `end` parameters to `get_calendar` (or directly to a calendar class).

In [4]:
cal = xcals.get_calendar("XNYS", start="2020-02-19", end="2025-12-31")
cal.sessions

DatetimeIndex(['2020-02-19', '2020-02-20', '2020-02-21', '2020-02-24',
               '2020-02-25', '2020-02-26', '2020-02-27', '2020-02-28',
               '2020-03-02', '2020-03-03',
               ...
               '2025-12-17', '2025-12-18', '2025-12-19', '2025-12-22',
               '2025-12-23', '2025-12-24', '2025-12-26', '2025-12-29',
               '2025-12-30', '2025-12-31'],
              dtype='datetime64[ns]', length=1477, freq='C')

Some calendars have min and/or max bounds outside of which a calendar cannot be created. The **`bound_min`/`bound_max`** class methods return the earliest/latest date from/to which a calendar class can be constructed, or None if there is no limit:

In [5]:
hkg.bound_min(), hkg.bound_max()

(Timestamp('1960-01-01 00:00:00'), Timestamp('2049-12-31 00:00:00'))

In [6]:
nys.bound_min() is None and nys.bound_max() is None

True
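Because either bound may be `None`, code that chooses a `start` or `end` on a user's behalf has to handle both cases. The sketch below shows one way to do that with a hypothetical helper, `clamp_to_bounds`, which is not part of `exchange_calendars`. The bound values are copied from the XHKG output above.

```python
import pandas as pd


def clamp_to_bounds(requested, bound_min=None, bound_max=None):
    """Return `requested` as a Timestamp, limited to the given bounds.

    Hypothetical helper for illustration. A bound of None means 'no limit'.
    """
    ts = pd.Timestamp(requested)
    if bound_min is not None and ts < bound_min:
        return bound_min
    if bound_max is not None and ts > bound_max:
        return bound_max
    return ts


# Bounds as returned by hkg.bound_min() / hkg.bound_max() above.
hkg_min = pd.Timestamp("1960-01-01")
hkg_max = pd.Timestamp("2049-12-31")

print(clamp_to_bounds("1950-06-01", hkg_min, hkg_max))  # limited to bound_min
print(clamp_to_bounds("1950-06-01"))  # no bounds, returned unchanged
```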

**`valid_sides`** is a class method that returns all side values that can be passed to the constructor's `side` parameter:

In [7]:
nys.valid_sides()

['both', 'left', 'right', 'neither']

The side values available to 24 hour calendars are necessarily more limited.

In [8]:
open24 = xcals.get_calendar("24/7", side="right")
open24.valid_sides()

['left', 'right']

NB the actual calendar side is returned by the **`side`** property (the default side for all calendars is "left").

In [9]:
nys.side, open24.side

('left', 'right')

See the [minutes](./minutes.ipynb) tutorial for an explanation of how a calendar's side determines which minutes are treated as trading minutes.
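As a rough illustration of the idea only, the sketch below builds the minutes of a single session for each side value using plain pandas. The helper `trading_minutes` is hypothetical and is not the library's implementation. It assumes that "left" includes the open minute and excludes the close minute, which matches the `last_minute` of 19:59 against a close of 20:00 shown later in this tutorial.

```python
import pandas as pd


def trading_minutes(open_, close, side="left"):
    """Minutes of one session for a given side (hypothetical helper)."""
    minutes = pd.date_range(open_, close, freq="min")  # includes both ends
    if side in ("left", "neither"):
        minutes = minutes[:-1]  # drop the close minute
    if side in ("right", "neither"):
        minutes = minutes[1:]  # drop the open minute
    return minutes


open_ = pd.Timestamp("2023-06-27 13:30", tz="UTC")
close = pd.Timestamp("2023-06-27 20:00", tz="UTC")

for side in ("both", "left", "right", "neither"):
    mins = trading_minutes(open_, close, side)
    print(side, len(mins), mins[0], mins[-1])
```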

### General Calendar Properties

In [10]:
nys.name

'XNYS'

In [11]:
nys.tz  # timezone

<DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>

In [12]:
nys.side

'left'

**`day`** returns a `CustomBusinessDay` for the calendar:

In [13]:
nys.day

<CustomBusinessDay>

In [14]:
# for reference
nys.sessions_in_range("2021-12-29", "2022-01-05")

DatetimeIndex(['2021-12-29', '2021-12-30', '2021-12-31', '2022-01-03',
               '2022-01-04', '2022-01-05'],
              dtype='datetime64[ns]', freq='C')

In [15]:
pd.Timestamp("2021-12-31") + (nys.day * 2)

Timestamp('2022-01-04 00:00:00')
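For readers unfamiliar with `CustomBusinessDay`, the sketch below shows the same kind of arithmetic using an offset built directly with pandas. The holiday list is hypothetical and deliberately differs from the XNYS calendar, so the result differs from the cell above.

```python
import pandas as pd

# Hypothetical holiday list, for illustration only (not the XNYS calendar).
holidays = [pd.Timestamp("2022-01-03")]
day = pd.offsets.CustomBusinessDay(holidays=holidays)

# 2021-12-31 is a Friday. Monday 2022-01-03 is skipped as a holiday.
result = pd.Timestamp("2021-12-31") + day * 2
print(result)  # 2022-01-05 00:00:00

# The offset can also serve as a frequency when generating a range of dates.
dates = pd.date_range("2021-12-29", "2022-01-05", freq=day)
print(dates)
```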

**`has_break`** indicates whether _any_ calendar session has a break:

In [16]:
nys.has_break

False

### Sessions and Minutes

**NOTE:** This section does little more than catalogue the calendar properties concerned with sessions or minutes. See the [sessions](./sessions.ipynb) and [minutes](./minutes.ipynb) tutorials for notes on 'working with sessions/minutes'.

#### All of them

In [17]:
nys.sessions  # DatetimeIndex representing all calendar sessions

DatetimeIndex(['2002-06-27', '2002-06-28', '2002-07-01', '2002-07-02',
               '2002-07-03', '2002-07-05', '2002-07-08', '2002-07-09',
               '2002-07-10', '2002-07-11',
               ...
               '2023-06-13', '2023-06-14', '2023-06-15', '2023-06-16',
               '2023-06-20', '2023-06-21', '2023-06-22', '2023-06-23',
               '2023-06-26', '2023-06-27'],
              dtype='datetime64[ns]', length=5286, freq='C')

In [18]:
nys.minutes  # DatetimeIndex representing every calendar trading minute

DatetimeIndex(['2002-06-27 13:30:00+00:00', '2002-06-27 13:31:00+00:00',
               '2002-06-27 13:32:00+00:00', '2002-06-27 13:33:00+00:00',
               '2002-06-27 13:34:00+00:00', '2002-06-27 13:35:00+00:00',
               '2002-06-27 13:36:00+00:00', '2002-06-27 13:37:00+00:00',
               '2002-06-27 13:38:00+00:00', '2002-06-27 13:39:00+00:00',
               ...
               '2023-06-27 19:50:00+00:00', '2023-06-27 19:51:00+00:00',
               '2023-06-27 19:52:00+00:00', '2023-06-27 19:53:00+00:00',
               '2023-06-27 19:54:00+00:00', '2023-06-27 19:55:00+00:00',
               '2023-06-27 19:56:00+00:00', '2023-06-27 19:57:00+00:00',
               '2023-06-27 19:58:00+00:00', '2023-06-27 19:59:00+00:00'],
              dtype='datetime64[ns, UTC]', length=2053440, freq=None)

#### Calendar Bounds

In [19]:
nys.first_session, nys.last_session

(Timestamp('2002-06-27 00:00:00', freq='C'),
 Timestamp('2023-06-27 00:00:00', freq='C'))

In [20]:
nys.first_minute, nys.last_minute

(Timestamp('2002-06-27 13:30:00+0000', tz='UTC'),
 Timestamp('2023-06-27 19:59:00+0000', tz='UTC'))

In [21]:
nys.first_session_open, nys.last_session_close

(Timestamp('2002-06-27 13:30:00+0000', tz='UTC'),
 Timestamp('2023-06-27 20:00:00+0000', tz='UTC'))

#### Schedule (open, close and break times)

In [22]:
sch = hkg.schedule
sch

|  | open | break_start | break_end | close |
| --- | --- | --- | --- | --- |
| 2002-06-27 | 2002-06-27 02:00:00+00:00 | 2002-06-27 04:00:00+00:00 | 2002-06-27 05:00:00+00:00 | 2002-06-27 08:00:00+00:00 |
| 2002-06-28 | 2002-06-28 02:00:00+00:00 | 2002-06-28 04:00:00+00:00 | 2002-06-28 05:00:00+00:00 | 2002-06-28 08:00:00+00:00 |
| 2002-07-02 | 2002-07-02 02:00:00+00:00 | 2002-07-02 04:00:00+00:00 | 2002-07-02 05:00:00+00:00 | 2002-07-02 08:00:00+00:00 |
| 2002-07-03 | 2002-07-03 02:00:00+00:00 | 2002-07-03 04:00:00+00:00 | 2002-07-03 05:00:00+00:00 | 2002-07-03 08:00:00+00:00 |
| 2002-07-04 | 2002-07-04 02:00:00+00:00 | 2002-07-04 04:00:00+00:00 | 2002-07-04 05:00:00+00:00 | 2002-07-04 08:00:00+00:00 |
| ... | ... | ... | ... | ... |
| 2023-06-20 | 2023-06-20 01:30:00+00:00 | 2023-06-20 04:00:00+00:00 | 2023-06-20 05:00:00+00:00 | 2023-06-20 08:00:00+00:00 |
| 2023-06-21 | 2023-06-21 01:30:00+00:00 | 2023-06-21 04:00:00+00:00 | 2023-06-21 05:00:00+00:00 | 2023-06-21 08:00:00+00:00 |
| 2023-06-23 | 2023-06-23 01:30:00+00:00 | 2023-06-23 04:00:00+00:00 | 2023-06-23 05:00:00+00:00 | 2023-06-23 08:00:00+00:00 |
| 2023-06-26 | 2023-06-26 01:30:00+00:00 | 2023-06-26 04:00:00+00:00 | 2023-06-26 05:00:00+00:00 | 2023-06-26 08:00:00+00:00 |


The schedule is a pandas DataFrame. It shows the dates the exchange is open and describes the bounds when the exchange is open for regular trading on each of those dates.

The index represents the calendar's sessions as a timezone-naive pandas `DatetimeIndex`. For each session, the columns offer the open, close and, if applicable, break-start and break-end time. **The times are defined in UTC terms** and columns have dtype as `datetime64[ns, UTC]`.

In [23]:
sch.open

2002-06-27   2002-06-27 02:00:00+00:00
2002-06-28   2002-06-28 02:00:00+00:00
2002-07-02   2002-07-02 02:00:00+00:00
2002-07-03   2002-07-03 02:00:00+00:00
2002-07-04   2002-07-04 02:00:00+00:00
                        ...           
2023-06-20   2023-06-20 01:30:00+00:00
2023-06-21   2023-06-21 01:30:00+00:00
2023-06-23   2023-06-23 01:30:00+00:00
2023-06-26   2023-06-26 01:30:00+00:00
2023-06-27   2023-06-27 01:30:00+00:00
Freq: C, Name: open, Length: 5182, dtype: datetime64[ns, UTC]

The break_start/break_end columns take `pd.NaT` in the event that a session does not have a break.

In [24]:
hkg.schedule.loc["2021-12-23":"2021-12-28"]

|  | open | break_start | break_end | close |
| --- | --- | --- | --- | --- |
| 2021-12-23 | 2021-12-23 01:30:00+00:00 | 2021-12-23 04:00:00+00:00 | 2021-12-23 05:00:00+00:00 | 2021-12-23 08:00:00+00:00 |
| 2021-12-24 | 2021-12-24 01:30:00+00:00 | NaT | NaT | 2021-12-24 04:00:00+00:00 |
| 2021-12-28 | 2021-12-28 01:30:00+00:00 | 2021-12-28 04:00:00+00:00 | 2021-12-28 05:00:00+00:00 | 2021-12-28 08:00:00+00:00 |
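The `pd.NaT` values matter when computing with the schedule. The sketch below rebuilds the three rows shown above as a standalone DataFrame (so that it runs without the library) and uses them to flag sessions without a break and to compute each session's trading time. The names `utc`, `no_break`, `break_length` and `trading_time` are illustrative only.

```python
import pandas as pd


def utc(values):
    """Parse strings to UTC timestamps; None becomes NaT."""
    return pd.to_datetime(values, utc=True)


# Values copied from the XHKG schedule rows shown above.
sch = pd.DataFrame(
    {
        "open": utc(["2021-12-23 01:30", "2021-12-24 01:30", "2021-12-28 01:30"]),
        "break_start": utc(["2021-12-23 04:00", None, "2021-12-28 04:00"]),
        "break_end": utc(["2021-12-23 05:00", None, "2021-12-28 05:00"]),
        "close": utc(["2021-12-23 08:00", "2021-12-24 04:00", "2021-12-28 08:00"]),
    },
    index=pd.DatetimeIndex(["2021-12-23", "2021-12-24", "2021-12-28"]),
)

# Sessions without a break have NaT in the break columns.
no_break = sch["break_start"].isna()

# Treat a missing break as zero length so the subtraction stays defined.
break_length = (sch["break_end"] - sch["break_start"]).fillna(pd.Timedelta(0))
trading_time = (sch["close"] - sch["open"]) - break_length

print(no_break)
print(trading_time)
```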


The calendar properties **`opens`**, **`closes`**, **`break_starts`** and **`break_ends`** return the corresponding column of the schedule as a `pd.Series`.

In [25]:
hkg.opens

2002-06-27   2002-06-27 02:00:00+00:00
2002-06-28   2002-06-28 02:00:00+00:00
2002-07-02   2002-07-02 02:00:00+00:00
2002-07-03   2002-07-03 02:00:00+00:00
2002-07-04   2002-07-04 02:00:00+00:00
                        ...           
2023-06-20   2023-06-20 01:30:00+00:00
2023-06-21   2023-06-21 01:30:00+00:00
2023-06-23   2023-06-23 01:30:00+00:00
2023-06-26   2023-06-26 01:30:00+00:00
2023-06-27   2023-06-27 01:30:00+00:00
Freq: C, Name: open, Length: 5182, dtype: datetime64[ns, UTC]

In [26]:
hkg.closes["2021-12-23":"2021-12-28"]

2021-12-23   2021-12-23 08:00:00+00:00
2021-12-24   2021-12-24 04:00:00+00:00
2021-12-28   2021-12-28 08:00:00+00:00
Freq: C, Name: close, dtype: datetime64[ns, UTC]

The **`late_opens`** and **`early_closes`** properties return a `DatetimeIndex` of those sessions that have a later open/earlier close than the prevailing regular open/close time.

In [27]:
nys.early_closes[-10:]

DatetimeIndex(['2018-07-03', '2018-11-23', '2018-12-24', '2019-07-03',
               '2019-11-29', '2019-12-24', '2020-11-27', '2020-12-24',
               '2021-11-26', '2022-11-25'],
              dtype='datetime64[ns]', freq=None)
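To illustrate what 'earlier than the prevailing regular close' means, the sketch below flags early closes in a small hand-built series of close times. This is not how the library determines early closes. The close times and the 16:00 local regular close are assumptions made for the example.

```python
import pandas as pd

# Illustrative UTC close times for three sessions (assumed values).
closes = pd.Series(
    pd.to_datetime(
        ["2021-11-24 21:00", "2021-11-26 18:00", "2021-11-29 21:00"], utc=True
    ),
    index=pd.DatetimeIndex(["2021-11-24", "2021-11-26", "2021-11-29"]),
)

# Compare in local time, since the regular close is defined locally.
local = closes.dt.tz_convert("America/New_York")
minutes_after_midnight = local.dt.hour * 60 + local.dt.minute

regular_close = 16 * 60  # assumed regular close of 16:00 local time
early = closes[minutes_after_midnight < regular_close].index

print(early)
```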

#### Bound Trading Minutes by Session

The **`first_*`** and **`last_*`** properties return a `pd.Series` indexed by `sessions`, with each value being the corresponding trading minute (UTC) for that session. NB as explained in the [minutes tutorial](./minutes.ipynb), these minutes are not necessarily the same as the session open/close/break times.

In [28]:
nys.first_minutes

2002-06-27   2002-06-27 13:30:00+00:00
2002-06-28   2002-06-28 13:30:00+00:00
2002-07-01   2002-07-01 13:30:00+00:00
2002-07-02   2002-07-02 13:30:00+00:00
2002-07-03   2002-07-03 13:30:00+00:00
                        ...           
2023-06-21   2023-06-21 13:30:00+00:00
2023-06-22   2023-06-22 13:30:00+00:00
2023-06-23   2023-06-23 13:30:00+00:00
2023-06-26   2023-06-26 13:30:00+00:00
2023-06-27   2023-06-27 13:30:00+00:00
Freq: C, Name: first_minutes, Length: 5286, dtype: datetime64[ns, UTC]

In [29]:
nys.last_minutes

2002-06-27   2002-06-27 19:59:00+00:00
2002-06-28   2002-06-28 19:59:00+00:00
2002-07-01   2002-07-01 19:59:00+00:00
2002-07-02   2002-07-02 19:59:00+00:00
2002-07-03   2002-07-03 19:59:00+00:00
                        ...           
2023-06-21   2023-06-21 19:59:00+00:00
2023-06-22   2023-06-22 19:59:00+00:00
2023-06-23   2023-06-23 19:59:00+00:00
2023-06-26   2023-06-26 19:59:00+00:00
2023-06-27   2023-06-27 19:59:00+00:00
Freq: C, Name: last_minutes, Length: 5286, dtype: datetime64[ns, UTC]

In [30]:
# in terms of local times...
nys.last_minutes.dt.tz_convert(nys.tz)

2002-06-27   2002-06-27 15:59:00-04:00
2002-06-28   2002-06-28 15:59:00-04:00
2002-07-01   2002-07-01 15:59:00-04:00
2002-07-02   2002-07-02 15:59:00-04:00
2002-07-03   2002-07-03 15:59:00-04:00
                        ...           
2023-06-21   2023-06-21 15:59:00-04:00
2023-06-22   2023-06-22 15:59:00-04:00
2023-06-23   2023-06-23 15:59:00-04:00
2023-06-26   2023-06-26 15:59:00-04:00
2023-06-27   2023-06-27 15:59:00-04:00
Freq: C, Name: last_minutes, Length: 5286, dtype: datetime64[ns, America/New_York]

In [31]:
# last pre-break minute of am subsession
hkg.last_am_minutes["2021-12-23":"2021-12-28"]

2021-12-23   2021-12-23 03:59:00+00:00
2021-12-24                         NaT
2021-12-28   2021-12-28 03:59:00+00:00
Freq: C, Name: last_am_minutes, dtype: datetime64[ns, UTC]

In [32]:
# first post-break minute of pm subsession
hkg.first_pm_minutes.loc["2021-12-23":"2021-12-28"]

2021-12-23   2021-12-23 05:00:00+00:00
2021-12-24                         NaT
2021-12-28   2021-12-28 05:00:00+00:00
Freq: C, Name: first_pm_minutes, dtype: datetime64[ns, UTC]

In [33]:
# in local time...
hkg.first_pm_minutes.loc["2021-12-23":"2021-12-28"].dt.tz_convert(hkg.tz)

2021-12-23   2021-12-23 13:00:00+08:00
2021-12-24                         NaT
2021-12-28   2021-12-28 13:00:00+08:00
Freq: C, Name: first_pm_minutes, dtype: datetime64[ns, Asia/Hong_Kong]

#### Nanos

Many session/minute properties that return a `DatetimeIndex` or `pd.Series` have a 'nanos' equivalent that returns a NumPy `ndarray` of integers (nanoseconds since the Unix epoch).

Why? Internally `ExchangeCalendar` uses these nano arrays because they are faster to operate on than `DatetimeIndex`. Indeed, for some operations, working with nanos can be **much** faster.
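The sketch below shows, using only pandas and NumPy, how such integer arrays relate to a `DatetimeIndex` and how a lookup can be done directly on the integers. It illustrates the idea and is not the library's internal code. The three dates are the first XHKG sessions shown in the schedule above.

```python
import numpy as np
import pandas as pd

sessions = pd.DatetimeIndex(["2002-06-27", "2002-06-28", "2002-07-02"])

# Cast to nanosecond resolution first so the integers are nanoseconds
# whatever resolution the index happens to hold.
sessions_nanos = sessions.values.astype("datetime64[ns]").astype("int64")
print(sessions_nanos)

# Look up a date with a binary search on plain integers.
target = np.datetime64("2002-06-28", "ns").astype("int64")
pos = int(np.searchsorted(sessions_nanos, target))
is_session = bool(sessions_nanos[pos] == target)
print(pos, is_session)
```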

In [34]:
hkg.sessions_nanos, hkg.minutes_nanos

(array([1025136000000000000, 1025222400000000000, 1025568000000000000, ...,
        1687478400000000000, 1687737600000000000, 1687824000000000000],
       dtype=int64),
 array([1025143200000000000, 1025143260000000000, 1025143320000000000, ...,
        1687852620000000000, 1687852680000000000, 1687852740000000000],
       dtype=int64))

In [35]:
hkg.opens_nanos, hkg.closes_nanos

(array([1025143200000000000, 1025229600000000000, 1025575200000000000, ...,
        1687483800000000000, 1687743000000000000, 1687829400000000000],
       dtype=int64),
 array([1025164800000000000, 1025251200000000000, 1025596800000000000, ...,
        1687507200000000000, 1687766400000000000, 1687852800000000000],
       dtype=int64))

In [36]:
hkg.break_starts_nanos, hkg.break_ends_nanos

(array([1025150400000000000, 1025236800000000000, 1025582400000000000, ...,
        1687492800000000000, 1687752000000000000, 1687838400000000000],
       dtype=int64),
 array([1025154000000000000, 1025240400000000000, 1025586000000000000, ...,
        1687496400000000000, 1687755600000000000, 1687842000000000000],
       dtype=int64))

In [37]:
hkg.first_minutes_nanos, hkg.last_minutes_nanos

(array([1025143200000000000, 1025229600000000000, 1025575200000000000, ...,
        1687483800000000000, 1687743000000000000, 1687829400000000000],
       dtype=int64),
 array([1025164740000000000, 1025251140000000000, 1025596740000000000, ...,
        1687507140000000000, 1687766340000000000, 1687852740000000000],
       dtype=int64))

In [38]:
hkg.last_am_minutes_nanos, hkg.first_pm_minutes_nanos

(array([1025150340000000000, 1025236740000000000, 1025582340000000000, ...,
        1687492740000000000, 1687751940000000000, 1687838340000000000],
       dtype=int64),
 array([1025154000000000000, 1025240400000000000, 1025586000000000000, ...,
        1687496400000000000, 1687755600000000000, 1687842000000000000],
       dtype=int64))