- working with time zones can be one of the most unpleasant parts of time series manipulation.
- as a result, many time series users choose to work with time series in coordinated universal time (UTC), which is the geography-independent international standard.
- time zones are expressed as offsets from UTC; for example:
    - new york is four hours behind UTC during daylight saving time (DST) and five hours behind the rest of the year

- in Python, time zone information comes from the third-party `pytz` library, which exposes the Olson database (also known as the IANA time zone database), a compilation of world time zone information.
- this is especially important for historical data because the DST transition dates (and even UTC offsets) have changed numerous times depending on regional laws.

- pandas wraps pytz's functionality, so you can ignore its API apart from the time zone names.
- since pandas has a hard dependency on pytz, it isn't necessary to install it separately. time zone names can be found interactively and in the docs.



### 🌍 **UTC – Coordinated Universal Time**

- **UTC** stands for **Coordinated Universal Time**.
- It is the **primary time standard** used across the world.
- **Does not change** with seasons, time zones, or daylight saving.
- All time zones are defined relative to UTC.  
  For example:
  - `UTC+05:30` → India Standard Time (IST)
  - `UTC-08:00` → Pacific Standard Time (PST)

🔹 **Example**:  
If it's `12:00 UTC`, then:
- In London (winter): it's `12:00`
- In India: it's `17:30`
- In New York (winter): it's `07:00`
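
The fixed offsets listed above can be reproduced with the standard library alone. This is only a minimal sketch: the date (2025-01-15) is an arbitrary winter day chosen for illustration, and the variable names are made up for this example.

```python
from datetime import datetime, timedelta, timezone

# An arbitrary instant: 12:00 UTC on a winter day (assumed date).
noon_utc = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)

# Fixed-offset zones built directly from their UTC offsets.
ist = timezone(timedelta(hours=5, minutes=30), name="IST")  # UTC+05:30
pst = timezone(timedelta(hours=-8), name="PST")             # UTC-08:00

# astimezone changes the wall-clock reading, not the instant itself.
in_ist = noon_utc.astimezone(ist)
in_pst = noon_utc.astimezone(pst)

print(in_ist.strftime("%H:%M %Z"))  # 17:30 IST
print(in_pst.strftime("%H:%M %Z"))  # 04:00 PST
```

Fixed offsets like these know nothing about DST, which is why named zones such as `America/New_York` are used in the rest of these notes.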

### 🌞 **DST – Daylight Saving Time**

- **DST** stands for **Daylight Saving Time**.
- It's a practice of **setting the clock forward by 1 hour** during summer to get more daylight in the evening.
- In regions that observe DST, the time changes **twice a year**. The exact dates vary by region; in the United States:
  - **Spring (second Sunday in March)** – clocks are moved **forward** (+1 hour)
  - **Fall (first Sunday in November)** – clocks are moved **backward** (-1 hour)

🔹 **Example**:  
- New York:
  - In winter: `UTC-05:00` (Eastern Standard Time – EST)
  - In summer: `UTC-04:00` (Eastern Daylight Time – EDT)
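
The New York offsets above can be checked with pandas. This is a rough sketch; the two dates are arbitrary mid-winter and mid-summer days picked for illustration.

```python
import pandas as pd

# Noon on an arbitrary winter day and summer day in New York (assumed dates).
winter = pd.Timestamp("2025-01-15 12:00", tz="America/New_York")
summer = pd.Timestamp("2025-07-15 12:00", tz="America/New_York")

# utcoffset() returns the offset from UTC as a timedelta.
winter_offset = winter.utcoffset()
summer_offset = summer.utcoffset()

# tzname() gives the abbreviation in effect at that moment.
print(winter.tzname(), winter_offset)
print(summer.tzname(), summer_offset)
```

Negative timedeltas print in a normalized form (for example `-1 day, 19:00:00` for minus five hours), so comparing against `pd.Timedelta(hours=-5)` is easier than reading the printed value.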


### ⚠️ Key Difference:

| Term | Full Form | Changes with Season? | Use |
|------|-----------|----------------------|-----|
| **UTC** | Coordinated Universal Time | ❌ No | Global time standard |
| **DST** | Daylight Saving Time | ✅ Yes | More daylight during summer |


### 🧠 In programming:
- Use `UTC` internally to **avoid errors** caused by DST changes.
- Convert to local time (with or without DST) only for **display** or **user input**.
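
One way to follow this advice is sketched below. The scenario is an assumption made up for illustration: a user in New York enters a local wall-clock time, it is stored in UTC, and it is later displayed to a reader in India.

```python
import pandas as pd

# User input arrives as a naive wall-clock time (assumed to be New York time).
local_input = pd.Timestamp("2025-07-04 09:00")

# Attach the zone the input was expressed in, then normalize to UTC for storage.
stored_utc = local_input.tz_localize("America/New_York").tz_convert("UTC")

# Convert only at display time, for whichever zone the reader is in.
shown_in_kolkata = stored_utc.tz_convert("Asia/Kolkata")

print(stored_utc)        # 2025-07-04 13:00:00+00:00
print(shown_in_kolkata)  # 2025-07-04 18:30:00+05:30
```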


In [31]:
import numpy as np 
import pandas as pd 
from datetime import datetime
import pytz

pytz.common_timezones[:]

['Africa/Abidjan',
 'Africa/Accra',
 'Africa/Addis_Ababa',
 'Africa/Algiers',
 'Africa/Asmara',
 'Africa/Bamako',
 'Africa/Bangui',
 'Africa/Banjul',
 'Africa/Bissau',
 'Africa/Blantyre',
 'Africa/Brazzaville',
 'Africa/Bujumbura',
 'Africa/Cairo',
 'Africa/Casablanca',
 'Africa/Ceuta',
 'Africa/Conakry',
 'Africa/Dakar',
 'Africa/Dar_es_Salaam',
 'Africa/Djibouti',
 'Africa/Douala',
 'Africa/El_Aaiun',
 'Africa/Freetown',
 'Africa/Gaborone',
 'Africa/Harare',
 'Africa/Johannesburg',
 'Africa/Juba',
 'Africa/Kampala',
 'Africa/Khartoum',
 'Africa/Kigali',
 'Africa/Kinshasa',
 'Africa/Lagos',
 'Africa/Libreville',
 'Africa/Lome',
 'Africa/Luanda',
 'Africa/Lubumbashi',
 'Africa/Lusaka',
 'Africa/Malabo',
 'Africa/Maputo',
 'Africa/Maseru',
 'Africa/Mbabane',
 'Africa/Mogadishu',
 'Africa/Monrovia',
 'Africa/Nairobi',
 'Africa/Ndjamena',
 'Africa/Niamey',
 'Africa/Nouakchott',
 'Africa/Ouagadougou',
 'Africa/Porto-Novo',
 'Africa/Sao_Tome',
 'Africa/Tripoli',
 'Africa/Tunis',
 'Africa/Wi

In [32]:
# get a time zone object for a given name with pytz.timezone
tz = pytz.timezone("Asia/Kolkata")
tz

<DstTzInfo 'Asia/Kolkata' LMT+5:53:00 STD>

> Methods in pandas will accept either time zone names or these objects
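
As a quick check of the note above, the sketch below localizes the same naive timestamp once with a time zone name and once with a `pytz` object. It assumes `pytz` is importable, as in the cells above; the variable names are illustrative.

```python
import pandas as pd
import pytz

naive = pd.Timestamp("2025-04-16 10:35")

# Pass the zone as a plain string...
by_name = naive.tz_localize("Asia/Kolkata")

# ...or as a pytz time zone object.
by_object = naive.tz_localize(pytz.timezone("Asia/Kolkata"))

# Both describe the same instant with the same UTC offset.
print(by_name == by_object)
print(by_name.utcoffset())
```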

## [ Time Zone Localization and Conversion ]

In [33]:
# by default, time series in pandas are time zone naive.
# for example, consider this time series

dates = pd.date_range("2025-03-09", periods=6)
ts = pd.Series(np.random.standard_normal(len(dates)), index=dates)

ts

2025-03-09   -0.537624
2025-03-10    1.171044
2025-03-11   -1.062123
2025-03-12    1.055754
2025-03-13   -0.001854
2025-03-14    1.599188
Freq: D, dtype: float64

In [34]:
# the index's tz (timezone) field is None
print(ts.index.tz)

None


In [35]:
# date ranges can be generated with a time zone set
pd.date_range("2025-03-09 9:30", periods=10, tz="UTC")

# tz="UTC" - setting the timezone of your datetime or date range to UTC
    # UTC (coordinated universal time) is the global time standard
    # it doesn't observe daylight saving time
    # often used in programming, APIs, and databases to avoid timezone issues

    # the +00:00 indicates UTC offset (i.e., no time difference)

DatetimeIndex(['2025-03-09 09:30:00+00:00', '2025-03-10 09:30:00+00:00',
               '2025-03-11 09:30:00+00:00', '2025-03-12 09:30:00+00:00',
               '2025-03-13 09:30:00+00:00', '2025-03-14 09:30:00+00:00',
               '2025-03-15 09:30:00+00:00', '2025-03-16 09:30:00+00:00',
               '2025-03-17 09:30:00+00:00', '2025-03-18 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

In [36]:
# conversion from naive to localized (reinterpreted as having been observed in a particular time zone) is handled by the tz_localize method

ts

2025-03-09   -0.537624
2025-03-10    1.171044
2025-03-11   -1.062123
2025-03-12    1.055754
2025-03-13   -0.001854
2025-03-14    1.599188
Freq: D, dtype: float64

In [37]:
ts_utc = ts.tz_localize("UTC")
ts_utc

2025-03-09 00:00:00+00:00   -0.537624
2025-03-10 00:00:00+00:00    1.171044
2025-03-11 00:00:00+00:00   -1.062123
2025-03-12 00:00:00+00:00    1.055754
2025-03-13 00:00:00+00:00   -0.001854
2025-03-14 00:00:00+00:00    1.599188
Freq: D, dtype: float64

In [38]:
ts_utc.index

DatetimeIndex(['2025-03-09 00:00:00+00:00', '2025-03-10 00:00:00+00:00',
               '2025-03-11 00:00:00+00:00', '2025-03-12 00:00:00+00:00',
               '2025-03-13 00:00:00+00:00', '2025-03-14 00:00:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

In [39]:
# once a time series has been localized to a particular time zone it can be converted to another time zone with tz_convert

ts_utc.tz_convert("America/New_York")

2025-03-08 19:00:00-05:00   -0.537624
2025-03-09 20:00:00-04:00    1.171044
2025-03-10 20:00:00-04:00   -1.062123
2025-03-11 20:00:00-04:00    1.055754
2025-03-12 20:00:00-04:00   -0.001854
2025-03-13 20:00:00-04:00    1.599188
Freq: D, dtype: float64

In [40]:
# If you have a time series that covers a DST change (like when clocks go forward or backward), and it's in a timezone that observes DST (e.g., America/New_York), then:

    # You can localize it — that is, assign a timezone to a naive (timezone-unaware) datetime.

    # Then, you can convert it to another timezone — like UTC or Europe/Berlin.

ts_eastern = ts.tz_localize("America/New_York")

ts_eastern.tz_convert("UTC")

2025-03-09 05:00:00+00:00   -0.537624
2025-03-10 04:00:00+00:00    1.171044
2025-03-11 04:00:00+00:00   -1.062123
2025-03-12 04:00:00+00:00    1.055754
2025-03-13 04:00:00+00:00   -0.001854
2025-03-14 04:00:00+00:00    1.599188
dtype: float64

In [41]:
ts_eastern.tz_convert("Europe/Berlin")

2025-03-09 06:00:00+01:00   -0.537624
2025-03-10 05:00:00+01:00    1.171044
2025-03-11 05:00:00+01:00   -1.062123
2025-03-12 05:00:00+01:00    1.055754
2025-03-13 05:00:00+01:00   -0.001854
2025-03-14 05:00:00+01:00    1.599188
dtype: float64
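
Localizing across a DST change has two edge cases: wall-clock times that were skipped and wall-clock times that occurred twice. The sketch below is a hedged illustration using the `nonexistent` and `ambiguous` parameters of `tz_localize`; the dates are the 2025 US transitions, and the variable names are made up for this example.

```python
import pandas as pd

# 02:30 on 2025-03-09 never occurred in New York: clocks jumped from 02:00 to 03:00.
skipped = pd.Timestamp("2025-03-09 02:30")

# By default tz_localize raises for such a time; "shift_forward" moves it
# to the first valid instant after the gap instead.
shifted = skipped.tz_localize("America/New_York", nonexistent="shift_forward")

# 01:30 on 2025-11-02 occurred twice: once in EDT and again in EST.
repeated = pd.Timestamp("2025-11-02 01:30")

# For a single Timestamp, ambiguous=True picks the DST reading, False the standard one.
first_pass = repeated.tz_localize("America/New_York", ambiguous=True)
second_pass = repeated.tz_localize("America/New_York", ambiguous=False)

print(shifted)      # 2025-03-09 03:00:00-04:00
print(first_pass)   # 2025-11-02 01:30:00-04:00
print(second_pass)  # 2025-11-02 01:30:00-05:00
```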

> `tz_localize` and `tz_convert` are also instance methods on DatetimeIndex

In [42]:
ts.index.tz_localize("Asia/Shanghai")

DatetimeIndex(['2025-03-09 00:00:00+08:00', '2025-03-10 00:00:00+08:00',
               '2025-03-11 00:00:00+08:00', '2025-03-12 00:00:00+08:00',
               '2025-03-13 00:00:00+08:00', '2025-03-14 00:00:00+08:00'],
              dtype='datetime64[ns, Asia/Shanghai]', freq=None)
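
The two methods are not interchangeable: `tz_localize` attaches a zone to naive data, while `tz_convert` re-expresses data that already has one. A small sketch of what happens when they are swapped (the `errors` list is just a helper for this illustration):

```python
import pandas as pd

naive = pd.date_range("2025-03-09", periods=3)
aware = naive.tz_localize("UTC")

errors = []

try:
    # Naive data has no zone to convert from.
    naive.tz_convert("Asia/Shanghai")
except TypeError as exc:
    errors.append(str(exc))

try:
    # Aware data already has a zone, so it cannot be localized again.
    aware.tz_localize("Asia/Shanghai")
except TypeError as exc:
    errors.append(str(exc))

# The supported path: localize once, then convert as often as needed.
shanghai = aware.tz_convert("Asia/Shanghai")

for message in errors:
    print(message)
print(shanghai[0])
```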

## [ Operations with Time Zone-Aware Timestamp Objects ]
A time zone-aware timestamp is a datetime object that includes both the date/time and the time zone information.

#### Why it's useful
- Correct time math — handles daylight saving changes automatically.
- Safe conversions — can convert between time zones without errors.
- Reliable across regions — timestamps make sense globally.

In [43]:
# like time series and date ranges, individual Timestamp objects can be localized from naive to time zone-aware and converted from one time zone to another

stamp = pd.Timestamp("2025-04-16 10:35")
stamp_utc = stamp.tz_localize("utc")

stamp_utc.tz_convert("Asia/Kolkata")

Timestamp('2025-04-16 16:05:00+0530', tz='Asia/Kolkata')

In [44]:
# we can also pass a time zone when creating the Timestamp
stamp_moscow = pd.Timestamp("2025-07-4 4:00", tz="Europe/Moscow")
stamp_moscow

Timestamp('2025-07-04 04:00:00+0300', tz='Europe/Moscow')


1. **Internally**, all time zone-aware timestamps are stored in **UTC (Coordinated Universal Time)**.
2. When you **change the time zone**, you're just **changing how the time is *displayed***, **not the actual point in time**.
3. The timestamp is stored as **nanoseconds since Unix epoch** (i.e., since 1970-01-01 00:00:00 UTC).

### 🧠 Analogy:

Imagine a **universal clock** that always runs in UTC.

When that clock reads 12:00 PM UTC, a local clock shows:
- New York (UTC-4): 8:00 AM
- London (UTC+0): 12:00 PM
- Tokyo (UTC+9): 9:00 PM

But it's **the same moment in time**, just shown differently depending on your timezone.
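
The analogy can be checked in pandas. The sketch below assumes the universal clock reads 12:00 UTC on 2025-03-20, a date chosen because New York is already at UTC-4 while London is still at UTC+0; the dictionary names are illustrative.

```python
import pandas as pd

# One instant on the "universal clock" (assumed date and time).
moment = pd.Timestamp("2025-03-20 12:00", tz="UTC")

zones = {
    "New York": "America/New_York",
    "London": "Europe/London",
    "Tokyo": "Asia/Tokyo",
}

# Each conversion changes the display only; the stored value stays the same.
views = {city: moment.tz_convert(zone) for city, zone in zones.items()}

for city, local in views.items():
    print(city, local.strftime("%I:%M %p"), local.value == moment.value)
```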



In [45]:
stamp_utc.value

1744799700000000000

In [46]:
stamp_utc.tz_convert("Asia/Shanghai").value

1744799700000000000

In [47]:
# when performing time arithmetic using pandas's DateOffset objects, pandas respects daylight saving time transitions where possible.

# Here we construct timestamps that occur right before DST transitions (forward and backward)

# First, 30 minutes before transitioning to DST
# (in 2025, US DST began on March 9 at 2:00 AM local time)
stamp = pd.Timestamp("2025-03-09 1:30", tz="US/Eastern")
stamp

Timestamp('2025-03-09 01:30:00-0500', tz='US/Eastern')

In [48]:
from pandas.tseries.offsets import Hour

stamp + Hour()

Timestamp('2025-03-09 03:30:00-0400', tz='US/Eastern')

In [49]:
# Then, 90 minutes before transitioning out of DST
stamp = pd.Timestamp("2012-11-04 00:30", tz="US/Eastern")
stamp

Timestamp('2012-11-04 00:30:00-0400', tz='US/Eastern')

In [50]:
stamp + 2 * Hour()

Timestamp('2012-11-04 01:30:00-0500', tz='US/Eastern')
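
The `Hour` offset above adds elapsed time. For day-sized steps the distinction between elapsed time and calendar time becomes visible, as the sketch below tries to show; the starting timestamp is an assumption, chosen as the day before the 2025 US spring transition.

```python
import pandas as pd

# Noon on the day before clocks spring forward in New York (assumed date).
before = pd.Timestamp("2025-03-08 12:00", tz="America/New_York")

# Timedelta adds 24 hours of elapsed time, so the wall clock lands on 13:00.
elapsed = before + pd.Timedelta(hours=24)

# DateOffset(days=1) keeps the wall-clock time, so only 23 hours actually pass.
calendar = before + pd.DateOffset(days=1)

print(elapsed)            # 2025-03-09 13:00:00-04:00
print(calendar)           # 2025-03-09 12:00:00-04:00
print(calendar - before)  # 0 days 23:00:00
```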

## [ Operations Between Different Time Zones ]
- if two time series with different time zones are combined, the result will be UTC
- since the timestamps are stored under the hood in UTC, this is a straightforward operation and requires no conversion

In [51]:
dates = pd.date_range("2012-03-07 09:30", periods=10, freq="B")

ts = pd.Series(np.random.standard_normal(len(dates)), index=dates)
ts

2012-03-07 09:30:00   -1.042971
2012-03-08 09:30:00   -0.863138
2012-03-09 09:30:00   -0.259794
2012-03-12 09:30:00    0.088674
2012-03-13 09:30:00   -1.729164
2012-03-14 09:30:00    1.269072
2012-03-15 09:30:00    0.720667
2012-03-16 09:30:00    0.898982
2012-03-19 09:30:00   -0.187890
2012-03-20 09:30:00    1.275884
Freq: B, dtype: float64

In [54]:
ts1 = ts[:7].tz_localize("Europe/London")
ts2 = ts1[2:].tz_convert("Europe/Moscow")

result = ts1 + ts2
result.index

DatetimeIndex(['2012-03-07 09:30:00+00:00', '2012-03-08 09:30:00+00:00',
               '2012-03-09 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq=None)

> Operations between time zone-naive and time zone-aware data are not supported and will raise an exception.
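
A minimal sketch of that failure and one possible fix follows. It assumes the naive data was recorded in UTC, which is an assumption made for illustration; in practice the series should be localized to whatever zone it was actually observed in.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2012-03-07 09:30", periods=3, freq="B")
naive = pd.Series(np.arange(3.0), index=dates)
aware = naive.tz_localize("Europe/London")

# Mixing naive and aware indexes fails during alignment.
try:
    naive + aware
    raised = False
except TypeError:
    raised = True

# Fix: give the naive series a zone first (assumed here to be UTC).
fixed = naive.tz_localize("UTC") + aware

print(raised)
print(fixed.index.tz)
```

Because London was at UTC+0 on these dates, the two indexes refer to the same instants and the values line up one to one.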