**Requirement**: *"I want a script that can handle a set of data that includes the ST-DST transition, if needed."*

👉Time-zone-aware👈 input data can be either in UTC time (**CASE I**) or in any arbitrary time zone (**CASE II**).

**CASE I**: Input is in UTC time -- dataset DOES NOT SPAN any DST-ST threshold(s), so...
  * Perform your analyses in UTC time
  * Convert to local tz to present results

**CASE II**: Input is in a *non*-UTC time zone -- dataset MAY SPAN DST-ST threshold(s), so...
  * Convert to UTC to remove DST-ST mishigas
  * Perform your analyses in UTC time
  * Convert back to local tz to present results
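
The three **CASE II** steps can be sketched as below. This is a minimal illustration, not the project's actual analysis: the elapsed-time calculation is only a stand-in for "your analyses", and the names `utc_index`, `elapsed`, and `local_index` are assumptions made for this example.

```python
import pandas as pd

# Four instants straddling the autumn change in America/New_York
# (the offset switches from -04:00 to -05:00)
ts = [
    '2023-11-05T01:59:58.194000-04:00',
    '2023-11-05T01:59:59.197000-04:00',
    '2023-11-05T01:00:00.194000-05:00',
    '2023-11-05T01:00:01.197000-05:00',
]

# 1. Convert to UTC (utc=True also copes with the mixed offsets in the input)
utc_index = pd.to_datetime(ts, utc=True)

# 2. Analyse in UTC: seconds elapsed between consecutive readings
#    (stand-in for the real analysis; the first value is NaN)
elapsed = utc_index.to_series().diff().dt.total_seconds()

# 3. Convert back to the local time zone to present results
local_index = utc_index.tz_convert('America/New_York')

print(elapsed.tolist())
print(local_index)
```

In UTC the readings are roughly one second apart, even though the local wall-clock time jumps back by an hour between the second and third reading.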

**TODO**: Handle case when input data are *not* tz-aware. (**Q**: *Is that ever the case for this project?*)
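
One possible approach to the tz-naive case, offered as a sketch rather than a decision for this project: localize the naive timestamps with `tz_localize()` before converting to UTC. The sample values in `naive_ts` are hypothetical, and the sketch assumes the readings are wall-clock times recorded in chronological order in 'America/New_York'.

```python
import pandas as pd

# Hypothetical tz-naive wall-clock readings (assumed America/New_York).
# The 01:00-02:00 hour occurs twice on 2023-11-05, so every value is ambiguous.
naive_ts = [
    '2023-11-05 01:59:58.194',
    '2023-11-05 01:59:59.197',
    '2023-11-05 01:00:00.194',
    '2023-11-05 01:00:01.197',
]

naive_index = pd.to_datetime(naive_ts)  # no time zone attached yet

# ambiguous='infer' uses the order of the data: readings before the backward
# jump in wall-clock time are treated as DST, those after it as standard time
aware_index = naive_index.tz_localize('America/New_York', ambiguous='infer')

# From here on the workflow is the same as CASE II
utc_index = aware_index.tz_convert('UTC')

print(aware_index)
print(utc_index)
```

If the data are not in chronological order, `ambiguous` also accepts a boolean array (True for DST) or `'NaT'`. The default, `'raise'`, stops with an error on ambiguous times.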

--Nick/ (__aquacalc@gmail.com__)

In [27]:
# My environment: 
#  Python 3.11.3  -- pandas 2.2.1 
#  VS Code 1.87.2 -- MacOS 12.7.4
import pandas as pd

For this requirement, we only need to test a few dates, with those in **CASE II** straddling a seasonal time-zone change.

 **NB**: To run either **CASE I** (UTC input) or **CASE II** (*non*-UTC input), un-comment the appropriate pair of variables in the next code block:

  * `input_tz` (a string, either 'UTC' or 'America/New_York')
  * `ts` (a list of timestamp strings which, for the *non*-UTC case, annoyingly straddles a time-zone change. For 'America/New_York', the autumn tz offset switches from -04:00 to -05:00)

In [32]:
## CASE I: Input is in UTC time -- dataset DOES NOT SPAN any DST-ST threshold(s)
## --------------------------------------------- //
## // To run CASE I...
## //  1. UNCOMMENT the next input_tz and ts for CASE I   //
## //  2. COMMENT OUT input_tz and ts of CASE II (below CASE II comments) //
# input_tz = 'UTC'
# ts = ['2023-11-05 05:59:58.194000+00:00', '2023-11-05 05:59:59.197000+00:00', '2023-11-05 06:00:00.194000+00:00', '2023-11-05 06:00:01.197000+00:00']
## --------------------------------------------- //

## CASE II: Input is in a non-UTC time zone -- dataset MAY SPAN DST-ST threshold(s)
## --------------------------------------------- //
## // To run CASE II...
## //  1. UNCOMMENT the next input_tz and ts for CASE II   //
## //  2. COMMENT OUT input_tz and ts of CASE I (above) //
input_tz = 'America/New_York'
ts = ['2023-11-05T01:59:58.194000-04:00', '2023-11-05T01:59:59.197000-04:00', '2023-11-05T01:00:00.194000-05:00', '2023-11-05T01:00:01.197000-05:00']
## --------------------------------------------- //

# Your preferred time zone -- the one in which you want to present your data
localized_tz = 'America/New_York'

# Convert to UTC
my_utc = pd.to_datetime(ts, utc=True)
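## NB: When the input is already in UTC, utc=True does no conversion, but
##     pd.to_datetime() is still needed to parse the strings in ts.
## TODO: If the input ever arrives as an already-parsed UTC DatetimeIndex,
##       add an 'if' statement to check for that and direct flow accordingly.
##       Every branch must still assign my_utc; this draft covers only one:
## if input_tz != 'UTC':
##   my_utc = pd.to_datetime(ts, utc=True)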

# .................................................. #
# .................................................. #
# ... 👉👉 PERFORM YOUR ANALYSES IN UTC TIME 👈👈 ... #
# .................................................. #
# .................................................. #

# Convert to your local/preferred time zone to present your results
# my_localized = pd.to_datetime(my_utc).tz_convert(localized_tz)
my_localized = my_utc.tz_convert(localized_tz)

print('INPUT ', input_tz)
print(ts)
print(' ')
print('UTC')
print(my_utc)
print(' ')
print('--👉 Your analyses performed here 👈--')
print(' ')
print('LOCALIZED', localized_tz)
print(my_localized)

INPUT  America/New_York
['2023-11-05T01:59:58.194000-04:00', '2023-11-05T01:59:59.197000-04:00', '2023-11-05T01:00:00.194000-05:00', '2023-11-05T01:00:01.197000-05:00']
 
UTC
DatetimeIndex(['2023-11-05 05:59:58.194000+00:00',
               '2023-11-05 05:59:59.197000+00:00',
               '2023-11-05 06:00:00.194000+00:00',
               '2023-11-05 06:00:01.197000+00:00'],
              dtype='datetime64[ns, UTC]', freq=None)
 
--👉 Your analyses performed here 👈--
 
LOCALIZED America/New_York
DatetimeIndex(['2023-11-05 01:59:58.194000-04:00',
               '2023-11-05 01:59:59.197000-04:00',
               '2023-11-05 01:00:00.194000-05:00',
               '2023-11-05 01:00:01.197000-05:00'],
              dtype='datetime64[ns, America/New_York]', freq=None)
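
The printed output can also be checked programmatically. The following is a small sketch under the same **CASE II** inputs; the names `same_instants`, `offsets_h`, `wall_clock_sorted`, and `utc_sorted` are assumptions introduced for this check, not part of the original script.

```python
import pandas as pd

ts = [
    '2023-11-05T01:59:58.194000-04:00',
    '2023-11-05T01:59:59.197000-04:00',
    '2023-11-05T01:00:00.194000-05:00',
    '2023-11-05T01:00:01.197000-05:00',
]
my_utc = pd.to_datetime(ts, utc=True)
my_localized = my_utc.tz_convert('America/New_York')

# tz_convert changes only the displayed wall-clock time, not the instant
same_instants = bool((my_localized == my_utc).all())

# UTC offset, in hours, attached to each localized timestamp
offsets_h = [t.utcoffset().total_seconds() / 3600 for t in my_localized]

# With the time zone stripped, the wall-clock times run backwards across
# the change; the UTC times stay in order
wall_clock_sorted = my_localized.tz_localize(None).is_monotonic_increasing
utc_sorted = my_utc.is_monotonic_increasing

print(same_instants)
print(offsets_h)
print(wall_clock_sorted, utc_sorted)
```

The unordered wall-clock times are the reason for doing the analysis in UTC and converting only for presentation.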
