# Quickstart tutorial 

## Sections
* [Periods](#Periods)
* [Anchor](#Anchor)
* [Data availability](#Data-availability)
* [Multiple exchanges](#Multiple-exchanges)
* [.pt accessor](#.pt-accessor)
* [Specific query methods](#Specific-query-methods)
* [Further documentation](#Further-documentation)

#### Notes
* The cell **outputs** shown in this tutorial are based on executing the cells at **2022-05-12 13:11 UTC** (before the NYSE open). Simply rerun the cells to bring any dynamic output up to date.

## Setup

Run the following cell to import tutorial dependencies.

In [2]:
from market_prices import PricesYahoo
import pandas as pd
from zoneinfo import ZoneInfo
from market_prices.support import tutorial_helpers as th
from market_prices import helpers

Run the following cell to define values used in this tutorial.

In [3]:
_prices_mix = PricesYahoo("MSFT, 9988.HK")
xnys = _prices_mix.calendars["MSFT"]
xhkg = _prices_mix.calendars["9988.HK"]
_calendars = [xnys, xhkg]
_session_length = [
    pd.Timedelta(hours=6, minutes=30),
    pd.Timedelta(hours=6, minutes=30),
]
# get sessions for which price data is available at all base intervals
_sessions_range = th.get_sessions_range_for_bi(
    _prices_mix, _prices_mix.bis.T1
)
session = th.get_conforming_sessions(
    _calendars, _session_length, *_sessions_range, 5
)[-1]

_prices_us = PricesYahoo("MSFT")
start_T1, end_T1 = th.get_sessions_range_for_bi(_prices_us, _prices_us.bis.T1)
start_T5, end_T5 = th.get_sessions_range_for_bi(_prices_us, _prices_us.bis.T5)
start_H1 = th.get_sessions_range_for_bi(_prices_us, _prices_us.bis.H1)[0]
start_T5_oob = helpers.to_tz_naive(xnys.session_offset(start_T5, -2))
start_H1_oob = helpers.to_tz_naive(xnys.session_offset(start_H1, -2))

## Periods

Create a prices object for one or more symbols...

In [4]:
prices = PricesYahoo("MSFT, GOOG")

Get some prices...

In [5]:
# last 45 minutes of data at 5 minute intervals
df_intraday = prices.get(interval="5min", minutes=45)
df_intraday

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-05-11 15:15:00, 2022-05-11 15:20:00)",261.445007,261.950012,260.75,261.540009,513921,2295.77002,2300.60498,2291.817139,2298.800049,14237
"[2022-05-11 15:20:00, 2022-05-11 15:25:00)",261.519989,261.670013,260.859985,261.020111,465300,2297.850098,2298.870117,2292.320068,2293.300049,16273
"[2022-05-11 15:25:00, 2022-05-11 15:30:00)",261.01001,261.859985,260.920105,261.23999,432710,2296.0,2299.97998,2293.159912,2295.351074,16135
"[2022-05-11 15:30:00, 2022-05-11 15:35:00)",261.209991,261.670013,260.910004,261.100006,621908,2295.419922,2300.199951,2295.0,2296.25,18337
"[2022-05-11 15:35:00, 2022-05-11 15:40:00)",261.079987,261.140015,260.209991,260.304993,809542,2295.280029,2295.280029,2286.0,2286.0,28436
"[2022-05-11 15:40:00, 2022-05-11 15:45:00)",260.279999,260.575012,260.019989,260.154999,962406,2287.0,2291.429932,2287.0,2289.22998,22339
"[2022-05-11 15:45:00, 2022-05-11 15:50:00)",260.144989,260.440002,259.570007,259.700012,1259836,2286.23999,2290.52002,2283.709961,2287.199951,26622
"[2022-05-11 15:50:00, 2022-05-11 15:55:00)",259.709991,260.320007,259.299988,260.070007,1183760,2287.0,2287.0,2277.475098,2281.439941,70314
"[2022-05-11 15:55:00, 2022-05-11 16:00:00)",260.100006,260.790009,259.857697,260.660004,1666678,2281.580078,2282.870117,2274.189941,2279.219971,101702


Prices are returned as a pandas `DataFrame` with 'open', 'high', 'low', 'close' and 'volume' columns for each symbol. For intraday data, rows are indexed with an `IntervalIndex` that describes the period covered by each row's data. Intervals are closed on the 'left', such that the right side is NOT included in the period covered by a row.
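
As a minimal sketch of what 'closed on the left' means, the following uses plain `datetime` objects rather than a pandas `IntervalIndex`. The `covers` helper is hypothetical and is not part of `market_prices`. The timestamps are taken from the first two rows of the table above.

```python
from datetime import datetime


def covers(left, right, ts):
    """Return True if `ts` falls within the left-closed interval [left, right)."""
    return left <= ts < right


row = (datetime(2022, 5, 11, 15, 15), datetime(2022, 5, 11, 15, 20))
next_row = (datetime(2022, 5, 11, 15, 20), datetime(2022, 5, 11, 15, 25))
boundary = datetime(2022, 5, 11, 15, 20)

# the shared boundary belongs to the later row only
in_row = covers(*row, boundary)
in_next_row = covers(*next_row, boundary)
print(in_row, in_next_row)
```

A trade timestamped exactly 15:20 would therefore be reflected in the row starting 15:20, not the row ending 15:20.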

In [6]:
# last ten sessions at 1 hour intervals
prices.get("1H", days=10)

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-04-28 09:30:00, 2022-04-28 10:30:00)",285.184998,287.609985,281.456207,283.359985,8087499,2342.300049,2349.489990,2302.877686,2323.570068,573390
"[2022-04-28 10:30:00, 2022-04-28 11:30:00)",283.329987,284.975006,282.410004,284.290009,4022154,2322.139893,2351.780029,2319.720215,2344.899902,195341
"[2022-04-28 11:30:00, 2022-04-28 12:30:00)",284.359985,286.029999,282.920013,285.984985,2845340,2345.576416,2358.500000,2330.020020,2357.570068,178195
"[2022-04-28 12:30:00, 2022-04-28 13:30:00)",286.000000,289.329987,285.890015,289.079987,3322418,2358.500000,2401.033447,2356.193115,2390.770020,184522
"[2022-04-28 13:30:00, 2022-04-28 14:30:00)",289.100006,289.619995,288.429993,289.350006,2633731,2388.949951,2396.703857,2384.360107,2395.435059,124659
...,...,...,...,...,...,...,...,...,...,...
"[2022-05-11 11:30:00, 2022-05-11 12:30:00)",266.140015,267.519989,264.130005,264.359985,2959808,2310.979980,2324.000000,2292.879883,2293.239990,147100
"[2022-05-11 12:30:00, 2022-05-11 13:30:00)",264.329987,265.329987,262.269989,263.306000,4269418,2291.699951,2306.219971,2288.320068,2295.435059,214189
"[2022-05-11 13:30:00, 2022-05-11 14:30:00)",263.277191,263.839996,261.119995,262.579987,4538704,2294.669922,2317.060059,2286.219971,2307.704102,194410
"[2022-05-11 14:30:00, 2022-05-11 15:30:00)",262.579987,264.427887,260.750000,261.239990,5379009,2309.699951,2322.090088,2291.817139,2295.351074,166048


Daily data is indexed with a `DatetimeIndex`.

In [7]:
# last month of daily prices
df_daily = prices.get("1D", months=1)
df_daily

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
2022-04-12,289.23999,290.73999,280.48999,282.059998,30966700,2648.469971,2648.469971,2551.52002,2567.48999,1150200
2022-04-13,282.730011,288.579987,281.299988,287.619995,21907200,2572.530029,2613.11499,2568.771973,2605.719971,977100
2022-04-14,288.089996,288.309998,279.320007,279.829987,28221600,2612.98999,2614.205078,2542.22998,2545.060059,1174200
2022-04-18,278.910004,282.459991,278.339996,280.519989,20778000,2548.199951,2574.23999,2531.569092,2559.219971,745900
2022-04-19,279.380005,286.170013,278.410004,285.299988,22297700,2561.540039,2618.074951,2549.030029,2610.620117,1136000
2022-04-20,289.399994,289.700012,285.369995,286.359985,22906700,2625.679932,2638.469971,2557.881104,2564.909912,1130500
2022-04-21,288.579987,293.299988,280.059998,280.809998,29454600,2587.0,2606.149902,2493.0,2498.75,1507900
2022-04-22,281.679993,283.200012,273.380005,274.029999,29405800,2500.0,2509.040039,2382.810059,2392.280029,2320500
2022-04-25,273.290009,281.109985,270.769989,280.720001,35678900,2388.590088,2465.560059,2375.38501,2465.0,1726100
2022-04-26,277.5,278.359985,270.0,270.220001,46518400,2455.0,2455.0,2383.237061,2390.120117,2469700


In [8]:
# last two years of prices at 5 month intervals
prices.get("5M", years=2)

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2020-10-01, 2021-03-01)",213.490005,246.130005,199.619995,232.380005,2938861000.0,1484.27002,2152.679932,1436.0,2036.859985,168267200.0
"[2021-03-01, 2021-08-01)",235.899994,290.149994,224.259995,284.910004,2819937000.0,2056.52002,2800.219971,2010.0,2704.419922,141246200.0
"[2021-08-01, 2022-01-01)",286.359985,349.670013,280.25,336.320007,2596303000.0,2709.689941,3037.0,2623.330078,2893.590088,114825700.0
"[2022-01-01, 2022-06-01)",335.350006,338.0,259.299988,260.549988,3317905000.0,2889.51001,3042.0,2251.030029,2279.219971,140786100.0


Or pass `start` and/or `end` to define the period from/to/between fixed minutes or dates.

In [9]:
# get some values for start and end
start, end = df_intraday.index[-4].left, df_intraday.index[-2].right
start, end  # for reference

(Timestamp('2022-05-11 15:40:00-0400', tz='America/New_York'),
 Timestamp('2022-05-11 15:55:00-0400', tz='America/New_York'))

In [10]:
prices.get("3T", start, end)

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-05-11 15:42:00, 2022-05-11 15:45:00)",260.309998,260.575012,260.019989,260.154999,463166.0,2290.199951,2291.419922,2287.149902,2289.22998,13354.0
"[2022-05-11 15:45:00, 2022-05-11 15:48:00)",260.144989,260.190002,259.570007,259.779999,803120.0,2286.23999,2288.620117,2283.709961,2287.100098,14385.0
"[2022-05-11 15:48:00, 2022-05-11 15:51:00)",259.779999,260.440002,259.600006,259.700012,748407.0,2287.0,2290.52002,2283.080078,2283.219971,22378.0
"[2022-05-11 15:51:00, 2022-05-11 15:54:00)",259.690002,260.269989,259.355011,259.519989,602194.0,2282.534912,2286.100098,2278.5,2278.620117,41772.0


In [11]:
# get some alternative values that represent sessions for start and end
start_daily, end_daily = df_daily.index[4], df_daily.index[-4]
start_daily, end_daily  # for reference

(Timestamp('2022-04-19 00:00:00', freq='C'),
 Timestamp('2022-05-06 00:00:00', freq='C'))

In [12]:
# interval as 3 sessions (not 3 calendar days!)
prices.get("3D", start_daily, end_daily)

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-04-21, 2022-04-26)",288.579987,293.299988,270.769989,280.720001,94539300.0,2587.0,2606.149902,2375.38501,2465.0,5554500.0
"[2022-04-26, 2022-04-29)",277.5,290.980011,270.0,289.630005,143642700.0,2455.0,2455.0,2262.485107,2388.22998,7421100.0
"[2022-04-29, 2022-05-04)",288.609985,289.880005,276.220001,281.779999,98154700.0,2351.560059,2386.0,2267.98999,2362.590088,4258300.0
"[2022-05-04, 2022-05-07)",282.589996,290.880005,271.269989,274.730011,114608000.0,2360.070068,2462.860107,2282.860107,2313.199951,5580100.0


The above prices for the 'multiple sessions' interval are indexed with an `IntervalIndex`. Intervals are closed on the left, such that a row's data does not include any session represented by the right side of the indice (any such session will be represented in the following row).

In [13]:
end_time = start
start_date = end_daily
start_date, end_time # for reference

(Timestamp('2022-05-06 00:00:00', freq='C'),
 Timestamp('2022-05-11 15:40:00-0400', tz='America/New_York'))

In [14]:
# prices starting on a specific session ending on a specific minute
prices.get("1T", start_date, end_time)

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-05-06 09:30:00, 2022-05-06 09:31:00)",274.804993,275.640015,274.200012,274.500000,950286.0,2307.439941,2307.544922,2307.199951,2307.199951,50651.0
"[2022-05-06 09:31:00, 2022-05-06 09:32:00)",274.380005,276.100006,274.380005,276.100006,91778.0,2307.199951,2314.266602,2305.425049,2311.860107,7028.0
"[2022-05-06 09:32:00, 2022-05-06 09:33:00)",276.029999,276.049988,274.399994,274.510010,96677.0,2312.000000,2312.000000,2295.449951,2295.449951,14278.0
"[2022-05-06 09:33:00, 2022-05-06 09:34:00)",274.450012,274.649994,273.660004,274.019989,136213.0,2295.419922,2300.000000,2293.409912,2295.340088,7782.0
"[2022-05-06 09:34:00, 2022-05-06 09:35:00)",274.059998,274.751709,273.880096,273.950012,83052.0,2293.870117,2300.000000,2293.870117,2293.995117,6934.0
...,...,...,...,...,...,...,...,...,...,...
"[2022-05-11 15:35:00, 2022-05-11 15:36:00)",261.079987,261.140015,260.410004,260.619995,228377.0,2295.280029,2295.280029,2291.239990,2291.239990,4004.0
"[2022-05-11 15:36:00, 2022-05-11 15:37:00)",260.609985,260.989990,260.510010,260.750000,116746.0,2290.979980,2291.929932,2290.175049,2290.290039,7836.0
"[2022-05-11 15:37:00, 2022-05-11 15:38:00)",260.790009,261.059998,260.529999,260.609985,134740.0,2290.250000,2292.215088,2288.360107,2290.620117,8137.0
"[2022-05-11 15:38:00, 2022-05-11 15:39:00)",260.609985,260.769989,260.309998,260.709991,199713.0,2288.000000,2289.909912,2287.649902,2289.665039,3776.0


In [15]:
# Define period as a duration from a session open
prices.get("37T", start_date, hours=4)

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-05-06 09:30:00, 2022-05-06 10:07:00)",274.804993,276.100006,271.360291,274.006897,4965855.0,2307.439941,2314.266602,2282.860107,2305.439941,358940.0
"[2022-05-06 10:07:00, 2022-05-06 10:44:00)",274.029999,277.940002,271.790009,274.279999,4077356.0,2305.439941,2340.0,2288.889893,2314.0,191993.0
"[2022-05-06 10:44:00, 2022-05-06 11:21:00)",274.309998,278.73999,273.920013,278.119995,3684361.0,2310.100098,2333.48999,2304.77002,2328.477539,178208.0
"[2022-05-06 11:21:00, 2022-05-06 11:58:00)",278.130005,279.25,275.600006,276.269989,2576684.0,2332.405029,2349.969971,2315.94751,2322.600098,140844.0
"[2022-05-06 11:58:00, 2022-05-06 12:35:00)",276.23999,278.269989,275.380005,277.399994,1976187.0,2323.090088,2346.410645,2318.110107,2341.689941,77140.0
"[2022-05-06 12:35:00, 2022-05-06 13:12:00)",277.420013,278.790009,276.080109,277.940002,1928744.0,2342.409912,2348.580078,2333.830078,2340.409912,93734.0


In [16]:
# Define period as a duration to a minute
print(f"{end_time=}\n")  # for reference
prices.get("2T", end=end_time, days=1)

end_time=Timestamp('2022-05-11 15:40:00-0400', tz='America/New_York')



symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-05-10 15:40:00, 2022-05-10 15:42:00)",269.559998,269.632202,269.200012,269.519989,169139.0,2295.715088,2297.020020,2293.283936,2295.540039,8131.0
"[2022-05-10 15:42:00, 2022-05-10 15:44:00)",269.540009,269.690002,269.279999,269.584991,131122.0,2295.669922,2296.689941,2294.570068,2295.570068,6818.0
"[2022-05-10 15:44:00, 2022-05-10 15:46:00)",269.589996,269.644989,269.239990,269.549988,213282.0,2295.540039,2296.000000,2290.294922,2292.810059,7173.0
"[2022-05-10 15:46:00, 2022-05-10 15:48:00)",269.559998,270.220001,269.200012,270.110901,265996.0,2294.020020,2299.620117,2289.459961,2299.620117,11534.0
"[2022-05-10 15:48:00, 2022-05-10 15:50:00)",270.140015,270.239990,269.720001,269.729889,207759.0,2297.310059,2301.780029,2296.185059,2296.395020,9959.0
...,...,...,...,...,...,...,...,...,...,...
"[2022-05-11 15:30:00, 2022-05-11 15:32:00)",261.209991,261.429993,260.910004,261.179993,326958.0,2295.419922,2299.179932,2295.199951,2297.000000,8075.0
"[2022-05-11 15:32:00, 2022-05-11 15:34:00)",261.160004,261.670013,261.089996,261.332092,183539.0,2298.760010,2300.199951,2297.139893,2297.139893,7367.0
"[2022-05-11 15:34:00, 2022-05-11 15:36:00)",261.320007,261.359985,260.410004,260.619995,339788.0,2296.729980,2296.729980,2291.239990,2291.239990,6899.0
"[2022-05-11 15:36:00, 2022-05-11 15:38:00)",260.609985,261.059998,260.510010,260.609985,251486.0,2290.979980,2292.215088,2288.360107,2290.620117,15973.0


See the [periods](./periods.ipynb) tutorial for further examples and explanation of how the period over which prices are returned is evaluated.

## Anchor

By default indices are evaluated from ('anchored' on) each session's open. Alternatively they can be anchored on the period end and evaluated in terms of trading minutes. Compare the following.

In [17]:
end = end_time - pd.Timedelta(2, "T")
start = start_date
start, end  # for reference

(Timestamp('2022-05-06 00:00:00', freq='C'),
 Timestamp('2022-05-11 15:38:00-0400', tz='America/New_York'))

In [18]:
# default "open" anchor
prices.get("4H", start, end, anchor="open")

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-05-06 09:30:00, 2022-05-06 13:30:00)",274.804993,279.25,271.360291,276.720001,20043126.0,2307.439941,2349.969971,2282.860107,2334.0,1075458.0
"[2022-05-06 13:30:00, 2022-05-06 17:30:00)",276.76001,277.299988,271.269989,274.730011,12733310.0,2336.830078,2340.725098,2293.340088,2311.689941,527669.0
"[2022-05-09 09:30:00, 2022-05-09 13:30:00)",271.0,272.359985,264.549988,267.149994,24196295.0,2266.070068,2311.258057,2261.879883,2290.0,911007.0
"[2022-05-09 13:30:00, 2022-05-09 17:30:00)",267.195007,267.5,263.320007,264.589996,14108311.0,2290.810059,2293.219971,2251.030029,2261.659912,584935.0
"[2022-05-10 09:30:00, 2022-05-10 13:30:00)",272.75,273.559998,265.070007,269.701294,22119339.0,2320.810059,2333.080078,2267.665771,2303.5,835057.0
"[2022-05-10 13:30:00, 2022-05-10 17:30:00)",269.682587,273.75,268.73999,269.450012,11793026.0,2303.25,2333.820068,2287.139893,2291.070068,497866.0
"[2022-05-11 09:30:00, 2022-05-11 13:30:00)",265.679993,271.070007,262.269989,263.306,21432202.0,2274.209961,2333.419922,2273.0,2295.435059,940311.0


In [19]:
prices.get("4H", start, end, anchor="workback")

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-05-06 11:08:00, 2022-05-06 15:08:00)",276.779999,279.25,272.540009,272.619995,14305807.0,2314.409912,2349.969971,2300.580078,2301.580078,649269.0
"[2022-05-06 15:08:00, 2022-05-09 12:38:00)",272.570007,275.559998,264.549988,267.959991,28531203.0,2301.284912,2321.899902,2261.879883,2297.452148,1087998.0
"[2022-05-09 12:38:00, 2022-05-10 10:08:00)",267.929993,273.559998,263.320007,273.059998,24811954.0,2292.64502,2333.080078,2251.030029,2326.090088,950134.0
"[2022-05-10 10:08:00, 2022-05-10 14:08:00)",273.079987,273.75,265.070007,272.649994,17339630.0,2327.52002,2331.080078,2267.665771,2324.209961,684646.0
"[2022-05-10 14:08:00, 2022-05-11 11:38:00)",272.63501,273.609985,263.790314,267.019989,23028039.0,2323.0,2333.820068,2273.0,2321.324951,999015.0
"[2022-05-11 11:38:00, 2022-05-11 15:38:00)",267.059998,267.519989,260.410004,260.609985,17796049.0,2324.0,2324.0,2286.219971,2290.620117,728309.0

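The difference between the two anchors compared above can be sketched with the standard library alone. This is an illustration of the idea, not how `market_prices` implements it. The helpers `open_anchored` and `workback` are hypothetical, and regular NYSE hours of 09:30 to 16:00 local time are assumed for every session.

```python
from datetime import datetime, timedelta

OPEN, CLOSE = (9, 30), (16, 0)  # assumed regular NYSE hours, local time


def open_anchored(day, length):
    """Indices of `length` minutes counted forward from the session open."""
    left = datetime(*day, *OPEN)
    close = datetime(*day, *CLOSE)
    step = timedelta(minutes=length)
    indices = []
    while left < close:
        indices.append((left, left + step))  # final right side may pass the close
        left += step
    return indices


def workback(days, end, length, count):
    """`count` indices of `length` trading minutes counted back from `end`."""
    minutes = []
    for day in days:
        start = datetime(*day, *OPEN)
        n = int((datetime(*day, *CLOSE) - start).total_seconds() // 60)
        minutes += [start + timedelta(minutes=i) for i in range(n)]
    usable = [m for m in minutes if m < end]  # trading minutes before `end`
    indices, right = [], end
    for _ in range(count):
        left = usable[-length]  # `length` trading minutes before `right`
        indices.append((left, right))
        usable, right = usable[:-length], left
    return indices[::-1]


by_open = open_anchored((2022, 5, 11), 240)
by_end = workback(
    [(2022, 5, 10), (2022, 5, 11)], datetime(2022, 5, 11, 15, 38), 240, 3
)
for left, right in by_open + by_end:
    print(left, "->", right)
```

Under these assumptions the open-anchored indices start at 09:30 and 13:30, whilst the workback indices each span 240 trading minutes and so can straddle two sessions, as with the indice running from 14:08 on 2022-05-10 to 11:38 on 2022-05-11.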

Where sessions have a break, indices will respect that break where possible. This can be seen in the following prices for Alibaba's Hong Kong listing, where the exchange closes between 12:00 and 13:00. The prices cover a single session.

In [20]:
prices_hk = PricesYahoo("9988.HK")

In [21]:
print(f"{session=}\n")  # for reference
prices_hk.get("40T", session, session)

session=Timestamp('2022-04-25 00:00:00')



symbol,9988.HK,9988.HK,9988.HK,9988.HK,9988.HK
Unnamed: 0_level_1,open,high,low,close,volume
"[2022-04-25 09:30:00, 2022-04-25 10:10:00)",84.800003,84.949997,83.0,83.650002,3576365.0
"[2022-04-25 10:10:00, 2022-04-25 10:50:00)",83.650002,83.949997,83.099998,83.650002,2721809.0
"[2022-04-25 10:50:00, 2022-04-25 11:30:00)",83.699997,84.949997,83.5,83.550003,3234464.0
"[2022-04-25 11:30:00, 2022-04-25 12:10:00)",83.550003,83.650002,83.349998,83.550003,1437730.0
"[2022-04-25 13:00:00, 2022-04-25 13:40:00)",83.5,83.599998,82.800003,82.949997,4775242.0
"[2022-04-25 13:40:00, 2022-04-25 14:20:00)",82.900002,83.099998,82.150002,82.699997,4282124.0
"[2022-04-25 14:20:00, 2022-04-25 15:00:00)",82.699997,82.849998,81.199997,81.800003,4624993.0
"[2022-04-25 15:00:00, 2022-04-25 15:40:00)",81.800003,82.550003,81.650002,81.800003,6633358.0
"[2022-04-25 15:40:00, 2022-04-25 16:20:00)",81.800003,82.099998,80.949997,81.599998,5671481.0


The period between the last indice of the morning session (ending 12:10) and the first indice of the afternoon session (starting 13:00) is not covered by the data.

The `force` option can be used to force the final indices of each (sub)session to the (sub)session close...

In [22]:
prices_hk.get("40T", session, session, force=True)

symbol,9988.HK,9988.HK,9988.HK,9988.HK,9988.HK
Unnamed: 0_level_1,open,high,low,close,volume
"[2022-04-25 09:30:00, 2022-04-25 10:10:00)",84.800003,84.949997,83.0,83.650002,3576365.0
"[2022-04-25 10:10:00, 2022-04-25 10:50:00)",83.650002,83.949997,83.099998,83.650002,2721809.0
"[2022-04-25 10:50:00, 2022-04-25 11:30:00)",83.699997,84.949997,83.5,83.550003,3234464.0
"[2022-04-25 11:30:00, 2022-04-25 12:00:00)",83.550003,83.650002,83.349998,83.550003,1437730.0
"[2022-04-25 13:00:00, 2022-04-25 13:40:00)",83.5,83.599998,82.800003,82.949997,4775242.0
"[2022-04-25 13:40:00, 2022-04-25 14:20:00)",82.900002,83.099998,82.150002,82.699997,4282124.0
"[2022-04-25 14:20:00, 2022-04-25 15:00:00)",82.699997,82.849998,81.199997,81.800003,4624993.0
"[2022-04-25 15:00:00, 2022-04-25 15:40:00)",81.800003,82.550003,81.650002,81.800003,6633358.0
"[2022-04-25 15:40:00, 2022-04-25 16:00:00)",81.800003,82.099998,80.949997,81.599998,5671481.0


Notice that the last indice of the morning subsession now ends at 12:00, the morning subsession close, and the last indice of the session now ends at 16:00, the session close.
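
The effect of the break and of `force` can be reproduced with a short sketch. It assumes Hong Kong subsessions of 09:30 to 12:00 and 13:00 to 16:00, and the `indices_for_session` helper is hypothetical. It illustrates the behaviour shown in the two tables above and makes no claim about the library's internals.

```python
from datetime import datetime, timedelta


def indices_for_session(subsessions, length, force=False):
    """Left-closed indices anchored on the open of each (sub)session."""
    step = timedelta(minutes=length)
    indices = []
    for sub_open, sub_close in subsessions:
        left = sub_open
        while left < sub_close:
            right = left + step
            if force and right > sub_close:
                right = sub_close  # curtail the final indice at the close
            indices.append((left, right))
            left += step
    return indices


day = (2022, 4, 25)
# assumed Hong Kong hours: two subsessions either side of a lunch break
hk = [
    (datetime(*day, 9, 30), datetime(*day, 12, 0)),
    (datetime(*day, 13, 0), datetime(*day, 16, 0)),
]
unforced = indices_for_session(hk, 40)
forced = indices_for_session(hk, 40, force=True)
print(unforced[3], forced[3])
print(unforced[-1], forced[-1])
```

Only the final indice of each subsession differs between the two lists. All other indices are unaffected by `force`.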

See the [anchors](./anchors.ipynb) tutorial for further examples and explanation of options that determine how indices are evaluated.

## Data availability

`market_prices` gets price data from a data provider. If the provider makes data available from which a request for prices at a specific interval over a specific period can be fulfilled, then `get` will fulfil it.

However, if data is not available to fulfil a request then an error will be raised which advises why the request cannot be fulfilled and, if relevant, offers some options that will return prices given the data that is available.

Consider the following request over a period for which data is not available at a sufficiently low interval to evaluate prices at the requested 4 minute interval.

In [23]:
start, end = start_T5, end_T1
print(f"{start=}\n{end=}")  # for reference

start=Timestamp('2022-03-14 00:00:00', freq='C')
end=Timestamp('2022-05-11 00:00:00', freq='C')


In [None]:
prices.get("4T", start, end)

```
---------------------------------------------------------------------------
PricesIntradayUnavailableError            Traceback (most recent call last)
<ipython-input-24-c48503aa405d> in <module>
----> 1 prices.get("4T", start, end)

PricesIntradayUnavailableError: Data is unavailable at a sufficiently low base interval to evaluate prices at interval 0 days 00:04:00 anchored 'Anchor.OPEN'.
Base intervals that are a factor of 0 days 00:04:00:
	[<BaseInterval.T1: Timedelta('0 days 00:01:00')>, <BaseInterval.T2: Timedelta('0 days 00:02:00')>].
The earliest minute from which data is available at 0 days 00:02:00 is 2022-03-30 13:30:00+00:00, although at this base interval the requested period evaluates to (Timestamp('2022-03-14 13:30:00+0000', tz='UTC'), Timestamp('2022-05-11 20:02:00+0000', tz='UTC')).
Period evaluated from parameters: {'minutes': 0, 'hours': 0, 'days': 0, 'weeks': 0, 'months': 0, 'years': 0, 'start': Timestamp('2022-03-14 00:00:00', freq='C'), 'end': Timestamp('2022-05-11 00:00:00', freq='C'), 'add_a_row': False}.
Data is available from 2022-03-30 13:30:00+00:00 through to the end of the requested period. Consider passing `strict` as False to return prices for this part of the period.
```
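
The error message lists the base intervals that are 'a factor of' the requested interval. A rough sketch of that check follows. It assumes the intraday base intervals are those named in this tutorial's error messages (T1, T2, T5 and H1), and the `factors_of` helper is hypothetical, not a `market_prices` function.

```python
from datetime import timedelta

# intraday base intervals named in the error messages of this tutorial
BASE_INTERVALS = {
    "T1": timedelta(minutes=1),
    "T2": timedelta(minutes=2),
    "T5": timedelta(minutes=5),
    "H1": timedelta(hours=1),
}


def factors_of(interval):
    """Names of the base intervals that divide `interval` exactly."""
    return [
        name
        for name, base in BASE_INTERVALS.items()
        if interval % base == timedelta(0)
    ]


four_min = factors_of(timedelta(minutes=4))
fifteen_min = factors_of(timedelta(minutes=15))
print(four_min, fifteen_min)
```

On this reading, a 4 minute interval can only be built from 1 or 2 minute data, so the request fails when neither base interval has data reaching back to the requested start.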

Pass `strict` as False to get prices over the period for which data is available.

In [25]:
df = prices.get("4T", start, end, strict=False)
df[:2]  # only showing first two rows

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-03-30 09:30:00, 2022-03-30 09:34:00)",313.76001,314.799988,313.410004,314.529999,953401.0,2857.399902,2862.679932,2854.449951,2855.850098,53371.0
"[2022-03-30 09:34:00, 2022-03-30 09:38:00)",314.600006,314.959991,313.380005,313.529999,360997.0,2855.610107,2862.47998,2854.870117,2857.938721,17188.0


An error will also be raised if it's only possible to return prices EITHER over the full requested period OR that express the period end with the greatest possible accuracy.

In [26]:
end = xnys.session_close(end) - pd.Timedelta(13, "T")
print(f"{start=}\n{end=}")  # for reference

start=Timestamp('2022-03-14 00:00:00', freq='C')
end=Timestamp('2022-05-11 19:47:00+0000', tz='UTC')


In [None]:
prices.get(start=start, end=end)

```
---------------------------------------------------------------------------
LastIndiceInaccurateError                 Traceback (most recent call last)
<ipython-input-27-c5a0068ab121> in <module>
----> 1 prices.get(start=start, end=end)

LastIndiceInaccurateError: Full period available at the following intraday base intervals although these do not allow for representing the end indice with the greatest possible accuracy:
	[<BaseInterval.T5: Timedelta('0 days 00:05:00')>, <BaseInterval.H1: Timedelta('0 days 01:00:00')>].
The following base intervals could represent the end indice with the greatest possible accuracy although have insufficient data available to cover the full period:
	[<BaseInterval.T1: Timedelta('0 days 00:01:00')>].
The earliest minute from which data is available at 0 days 00:01:00 is 2022-04-12 13:30:00+00:00, although at this base interval the requested period evaluates to (Timestamp('2022-03-14 13:30:00+0000', tz='UTC'), Timestamp('2022-05-11 19:47:00+0000', tz='UTC')).
Period evaluated from parameters: {'minutes': 0, 'hours': 0, 'days': 0, 'weeks': 0, 'months': 0, 'years': 0, 'start': Timestamp('2022-03-14 00:00:00', freq='C'), 'end': Timestamp('2022-05-11 19:47:00+0000', tz='UTC'), 'add_a_row': False}.
Data that can express the period end with the greatest possible accuracy is available from 2022-04-12 13:30:00+00:00. Pass `strict` as False to return prices for this part of the period.
Alternatively, consider creating a composite table (pass `composite` as True) or passing `priority` as 'period'.
```

One option is to create a composite table comprised of two different intervals. The earlier part of the table has a higher interval at which data is available over the full period whilst the later part has a lower interval able to express the period end with the greatest possible accuracy.

In [28]:
df_comp = prices.get(start=start, end=end, composite=True)
df_comp

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-03-14 09:30:00, 2022-03-14 09:35:00)",280.250000,282.359985,280.010010,281.869995,1721154.0,2611.459961,2620.520020,2610.110107,2611.310059,51968.0
"[2022-03-14 09:35:00, 2022-03-14 09:40:00)",281.890015,282.850006,281.040009,281.880005,720010.0,2612.254883,2612.530029,2592.870117,2598.800049,37872.0
"[2022-03-14 09:40:00, 2022-03-14 09:45:00)",281.829987,284.769989,281.369995,284.290009,808963.0,2596.054932,2610.689941,2593.939941,2608.560059,20875.0
"[2022-03-14 09:45:00, 2022-03-14 09:50:00)",284.269989,284.329987,282.480011,283.109985,593177.0,2606.000000,2611.300049,2603.959961,2603.959961,18132.0
"[2022-03-14 09:50:00, 2022-03-14 09:55:00)",283.019989,283.699005,282.470001,283.109985,367264.0,2606.540039,2609.879883,2600.342041,2601.419922,22693.0
...,...,...,...,...,...,...,...,...,...,...
"[2022-05-11 15:42:00, 2022-05-11 15:43:00)",260.309998,260.575012,260.140015,260.184998,124295.0,2290.199951,2290.199951,2289.929932,2290.080078,2213.0
"[2022-05-11 15:43:00, 2022-05-11 15:44:00)",260.190002,260.309998,260.089996,260.212097,143383.0,2288.985107,2291.419922,2288.985107,2290.520020,6573.0
"[2022-05-11 15:44:00, 2022-05-11 15:45:00)",260.190002,260.269989,260.019989,260.154999,195488.0,2290.639893,2290.729980,2287.149902,2289.229980,4568.0
"[2022-05-11 15:45:00, 2022-05-11 15:46:00)",260.144989,260.190002,259.739990,259.789886,471234.0,2286.239990,2287.330078,2283.709961,2284.699951,3533.0


The [data availability](./data_availability.ipynb) tutorial covers other options, further examples and further explanation, including how `market_prices` defines base intervals from which all other price data is evaluated.

## Multiple exchanges

`market_prices` really comes into its own when combining symbols that trade on different exchanges with different opening hours over different timezones.

In [29]:
# Note: this cell might take a little while to execute given the number of
# different trading calendars that will be created.
symbols = [
    "MSFT",  # us stock
    "AZN.L",  # uk stock
    "9988.HK",  # hong kong stock
    "PETR3.SA",  # brazilian stock
    "^FTSE",  # equity index
    "ES=F",  # futures
    "CL=F",  # oil
    "GC=F",  # gold
    "GBPEUR=X",  # currency pair
    "BTC-USD",  # crypto
]
prices_mult = PricesYahoo(symbols)

The `lead_symbol` determines the exchange against which the period should be evaluated.

In [30]:
# 30 mins of data at 10min intervals ending on the most recent
# minute that the Hong Kong equity market was open
df_mult = prices_mult.get(
    "10min", minutes=30, lead_symbol="9988.HK", anchor="workback"
)
df_mult

symbol,9988.HK,9988.HK,9988.HK,9988.HK,9988.HK,AZN.L,AZN.L,AZN.L,AZN.L,AZN.L,...,PETR3.SA,PETR3.SA,PETR3.SA,PETR3.SA,PETR3.SA,^FTSE,^FTSE,^FTSE,^FTSE,^FTSE
Unnamed: 0_level_1,close,high,low,open,volume,close,high,low,open,volume,...,close,high,low,open,volume,close,high,low,open,volume
"[2022-05-12 15:30:00, 2022-05-12 15:40:00)",80.0,80.300003,79.900002,80.199997,2499259.0,9801.0,9821.0,9785.0,9799.0,55810.0,...,,,,,,7181.529785,7194.009766,7171.970215,7191.529785,0.0
"[2022-05-12 15:40:00, 2022-05-12 15:50:00)",80.300003,80.349998,79.949997,80.0,2462802.0,9862.0,9862.0,9794.0,9799.0,39472.0,...,,,,,,7220.209961,7220.77002,7176.339844,7181.540039,0.0
"[2022-05-12 15:50:00, 2022-05-12 16:00:00)",79.949997,80.449997,79.800003,80.349998,2616271.0,9824.0,9866.0,9822.0,9863.0,39555.0,...,,,,,,7208.950195,7221.839844,7204.810059,7220.169922,0.0


Note:
* prices will show as missing for any instruments not trading over the evaluated period.
* the default timezone for the index will be the timezone associated with `lead_symbol`, i.e. Hong Kong in the above example (pass `tzout` to change this).
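
To see why some instruments show no prices over this period, it can help to express the first indice of the table in another timezone. The sketch below uses fixed UTC offsets that are assumed to hold for 2022-05-12 only (Hong Kong at UTC+8, New York at UTC-4). It does not use `tzout` or any other part of `market_prices`.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets assumed for 2022-05-12 only: Hong Kong is UTC+8 and
# New York is UTC-4 (daylight saving time in force).
HONG_KONG = timezone(timedelta(hours=8))
NEW_YORK = timezone(timedelta(hours=-4))

first_ts_hk = datetime(2022, 5, 12, 15, 30, tzinfo=HONG_KONG)
first_ts_ny = first_ts_hk.astimezone(NEW_YORK)  # same instant, New York clock
print(first_ts_hk.isoformat(), first_ts_ny.isoformat())
```

Under these assumptions 15:30 in Hong Kong corresponds to 03:30 in New York, some hours before the 09:30 open of the New York equity market.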

The [periods](./periods.ipynb) and [anchor](./anchor.ipynb) tutorials offer further examples and explanation of getting prices for symbols trading on multiple exchanges.

## .pt accessor

The `.pt` accessor provides access to a wealth of functionality to interrogate and operate on the DataFrame returned by `get`.

This section offers examples of only a few of the properties and methods available. See the [.pt accessor](./pt_accessor.ipynb) tutorial for a comprehensive overview.

In [31]:
df_mult.pt.symbols

['9988.HK',
 'AZN.L',
 'BTC-USD',
 'CL=F',
 'ES=F',
 'GBPEUR=X',
 'GC=F',
 'MSFT',
 'PETR3.SA',
 '^FTSE']

In [32]:
df_mult.pt.first_ts

Timestamp('2022-05-12 15:30:00+0800', tz='Asia/Hong_Kong')

In [33]:
df_mult.pt.is_daily, df_mult.pt.is_intraday

(False, True)

In [34]:
df_mult.pt.interval

<TDInterval.T10: Timedelta('0 days 00:10:00')>

Query how many trading minutes are in each indice, with trading minutes evaluated against a specific calendar...

In [35]:
df_mult.pt.indices_trading_minutes(xnys)
# New York is closed over this period

[2022-05-12 15:30:00, 2022-05-12 15:40:00)    0
[2022-05-12 15:40:00, 2022-05-12 15:50:00)    0
[2022-05-12 15:50:00, 2022-05-12 16:00:00)    0
Name: trading_mins, dtype: int64

In [36]:
df_mult.pt.indices_trading_minutes(xhkg)

[2022-05-12 15:30:00, 2022-05-12 15:40:00)    10
[2022-05-12 15:40:00, 2022-05-12 15:50:00)    10
[2022-05-12 15:50:00, 2022-05-12 16:00:00)    10
Name: trading_mins, dtype: int64

Going back to the initial intraday dataframe...

In [37]:
df_intraday.pt.interval

<TDInterval.T5: Timedelta('0 days 00:05:00')>

In [38]:
# downsample the 5 minute data to 15 minute intervals
df_intraday.pt.downsample("15T")

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-05-11 15:15:00, 2022-05-11 15:30:00)",261.445007,261.950012,260.75,261.23999,1411931.0,2295.77002,2300.60498,2291.817139,2295.351074,46645.0
"[2022-05-11 15:30:00, 2022-05-11 15:45:00)",261.209991,261.670013,260.019989,260.154999,2393856.0,2295.419922,2300.199951,2286.0,2289.22998,69112.0
"[2022-05-11 15:45:00, 2022-05-11 16:00:00)",260.144989,260.790009,259.299988,260.660004,4110274.0,2286.23999,2290.52002,2274.189941,2279.219971,198638.0

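The first row of the downsampled table above can be reproduced by hand, which shows the aggregation that downsampling OHLCV data implies: first open, highest high, lowest low, last close and summed volume. The `aggregate` helper below is hypothetical. The input values are the MSFT columns of the first three 5 minute rows of `df_intraday`.

```python
# (open, high, low, close, volume) for MSFT, copied from the first three
# 5 minute rows of `df_intraday`
rows = [
    (261.445007, 261.950012, 260.75, 261.540009, 513921),
    (261.519989, 261.670013, 260.859985, 261.020111, 465300),
    (261.01001, 261.859985, 260.920105, 261.23999, 432710),
]


def aggregate(rows):
    """Combine consecutive OHLCV rows into a single row."""
    opens, highs, lows, closes, volumes = zip(*rows)
    return (opens[0], max(highs), min(lows), closes[-1], sum(volumes))


bar_15min = aggregate(rows)
print(bar_15min)
```

The result matches the MSFT values of the `[2022-05-11 15:15:00, 2022-05-11 15:30:00)` row returned by `downsample("15T")`.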

In [39]:
# Downsample the daily table so that each indice contains 4 sessions.
df_daily.pt.downsample("4D", xnys)

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
"[2022-04-13, 2022-04-20)",282.730011,288.579987,278.339996,285.299988,93204500.0,2572.530029,2618.074951,2531.569092,2610.620117,4033200.0
"[2022-04-20, 2022-04-26)",289.399994,293.299988,270.769989,280.720001,117446000.0,2625.679932,2638.469971,2375.38501,2465.0,6685000.0
"[2022-04-26, 2022-05-02)",277.5,290.980011,270.0,277.519989,180667700.0,2455.0,2455.0,2262.485107,2299.330078,9104600.0
"[2022-05-02, 2022-05-06)",277.709991,290.880005,274.339996,277.350006,137989400.0,2278.129883,2462.860107,2267.98999,2334.929932,6390900.0
"[2022-05-06, 2022-05-12)",274.809998,279.25,259.299988,260.549988,173705200.0,2310.379883,2349.969971,2251.030029,2279.219971,6870400.0


In [40]:
# map each indice to the session that it falls in
df_intraday.pt.sessions(xnys)

[2022-05-11 15:15:00, 2022-05-11 15:20:00)   2022-05-11
[2022-05-11 15:20:00, 2022-05-11 15:25:00)   2022-05-11
[2022-05-11 15:25:00, 2022-05-11 15:30:00)   2022-05-11
[2022-05-11 15:30:00, 2022-05-11 15:35:00)   2022-05-11
[2022-05-11 15:35:00, 2022-05-11 15:40:00)   2022-05-11
[2022-05-11 15:40:00, 2022-05-11 15:45:00)   2022-05-11
[2022-05-11 15:45:00, 2022-05-11 15:50:00)   2022-05-11
[2022-05-11 15:50:00, 2022-05-11 15:55:00)   2022-05-11
[2022-05-11 15:55:00, 2022-05-11 16:00:00)   2022-05-11
Name: session, dtype: datetime64[ns]

## Specific query methods

Aside from `get`, the `Prices` class has a few other methods that return more specific price data.

This section offers a quick example of each. Check out the [Specific query methods](./specific_query_methods.ipynb) tutorial for a comprehensive treatment.

`session_prices` will return prices for a specific session.

In [41]:
prices.session_prices("2022-04-26")

symbol,MSFT,MSFT,MSFT,MSFT,MSFT,GOOG,GOOG,GOOG,GOOG,GOOG
Unnamed: 0_level_1,open,high,low,close,volume,open,high,low,close,volume
2022-04-26,277.5,278.359985,270.0,270.220001,46518400,2455.0,2455.0,2383.237061,2390.120117,2469700


`close_at` returns the most recent close prices as at a specific date.

In [42]:
prices_mult.close_at("2021-12-25")

symbol,MSFT,AZN.L,9988.HK,PETR3.SA,^FTSE,ES=F,CL=F,GC=F,GBPEUR=X,BTC-USD
2021-12-25,334.690002,8611.0,113.0,30.440001,7372.100098,4715.75,73.790001,1811.199951,1.1834,50429.859375
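
The idea of a 'most recent close as at a date' can be sketched with plain pandas. This is an illustration on synthetic data, not the library's implementation: the closes are made up, and `Series.asof` stands in for whatever lookup `close_at` performs internally.

```python
import pandas as pd

# Synthetic daily closes (made-up values); there are no rows for the
# 25th and 26th, which fall on a weekend.
closes = pd.Series(
    [101.0, 102.5, 103.2, 104.1],
    index=pd.to_datetime(
        ["2021-12-22", "2021-12-23", "2021-12-24", "2021-12-27"]
    ),
)

# `asof` returns the last value at or before the requested timestamp.
date = pd.Timestamp("2021-12-25")
last_close = closes.asof(date)
print(last_close)  # 103.2
```

Because no close exists for the requested date, the lookup falls back to the last close before it. This is the same behaviour that lets `close_at` return a row for a date on which some or all of the markets were closed.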


`price_at` returns the most recent prices as at a specific minute.

In [43]:
minute = xnys.session_close(session) - pd.Timedelta(minutes=47)
print(f"{minute=}\n")  # for reference

prices_mult.price_at(minute, tz=ZoneInfo("UTC"))

minute=Timestamp('2022-04-25 19:13:00+0000', tz='UTC')



symbol,MSFT,AZN.L,9988.HK,PETR3.SA,^FTSE,ES=F,CL=F,GC=F,GBPEUR=X,BTC-USD
2022-04-25 19:13:00+00:00,276.769989,10244.0,81.599998,33.240002,7392.310059,4251.5,98.989998,1901.099976,1.18908,39835.207031


Called without a minute, `price_at` returns the most recent prices available as at 'now'.

In [44]:
prices_mult.price_at(tz="MSFT")

symbol,MSFT,AZN.L,9988.HK,PETR3.SA,^FTSE,ES=F,CL=F,GC=F,GBPEUR=X,BTC-USD
2022-05-12 09:12:00-04:00,260.549988,9826.0,80.0,36.439999,7200.560059,3898.25,104.480003,1843.199951,1.17387,28412.521484


`price_range` returns OHLCV data for a period evaluated from period parameters.

In [45]:
prices_mult.price_range(days=5, stack=True)

,symbol,open,high,low,close,volume
"(2022-05-05 14:00:00, 2022-05-12 14:00:00]",9988.HK,92.050003,92.849998,79.800003,79.949997,198434600.0
"(2022-05-05 14:00:00, 2022-05-12 14:00:00]",AZN.L,10612.0,10616.0,9764.0,9828.0,7804363.0
"(2022-05-05 14:00:00, 2022-05-12 14:00:00]",BTC-USD,39519.300781,39519.667969,26350.490234,28411.1875,187410500000.0
"(2022-05-05 14:00:00, 2022-05-12 14:00:00]",CL=F,109.300003,111.370003,98.199997,104.059998,1613489.0
"(2022-05-05 14:00:00, 2022-05-12 14:00:00]",ES=F,4264.0,4264.5,3884.75,3899.5,11113460.0
"(2022-05-05 14:00:00, 2022-05-12 14:00:00]",GBPEUR=X,1.17383,1.17612,1.16,1.17385,0.0
"(2022-05-05 14:00:00, 2022-05-12 14:00:00]",GC=F,1907.599976,1909.199951,1830.599976,1844.599976,1025028.0
"(2022-05-05 14:00:00, 2022-05-12 14:00:00]",MSFT,285.540009,286.350006,259.299988,260.660004,178991500.0
"(2022-05-05 14:00:00, 2022-05-12 14:00:00]",PETR3.SA,34.5,36.860001,33.349998,36.439999,99843800.0
"(2022-05-05 14:00:00, 2022-05-12 14:00:00]",^FTSE,7603.399902,7603.799805,7158.529785,7196.100098,0.0


## Further documentation

All documentation can be found [here](../tutorials_docs.md), including a wealth of tutorials that collectively cover all aspects of `get` and the specific query methods.

Each method's own documentation is also comprehensive...

In [None]:
# or prices.get?
help(prices.get)