### Objective:

pandas contains extensive capabilities and features for working with time series data for all domains. Using the NumPy datetime64 and timedelta64 dtypes, pandas has consolidated a large number of features from other Python libraries like scikits.timeseries as well as created a tremendous amount of new functionality for manipulating time series data. In this notebook we implement these operations and gain some insight into how they behave.

This notebook works through the operations from the [Time series / date functionality](https://pandas.pydata.org/docs/user_guide/timeseries.html) guide in the official pandas documentation. They are performed on a dataset obtained from Kaggle: [Solar energy power generation dataset](https://www.kaggle.com/datasets/stucom/solar-energy-power-generation-dataset).

### Table of Contents

* [Timestamps vs. time spans](#Timestamps-vs-time-spans)
* [Converting to timestamps](#Converting-to-timestamps)
* [Generating ranges of timestamps](#Generating-ranges-of-timestamps)
* [Timestamp limitations](#Timestamp-limitations)
* [Indexing](#Indexing)
* [Time/date components](#Time/date-components)
* [DateOffset objects](#DateOffset-objects)
* [Time Series-related instance methods](#Time-Series-related-instance-methods)
* [Resampling](#Resampling)
* [Time span representation](#Time-span-representation)
* [Converting between representations](#Converting-between-representations)
* [Time zone handling](#Time-zone-handling)

#### DATA PREPROCESSING

In [1]:
import pandas as pd
import numpy as np
import datetime 

In [2]:
df=pd.read_csv('Data/spgdataset.csv')

In [3]:
df

Unnamed: 0,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,mean_sea_level_pressure_MSL,total_precipitation_sfc,snowfall_amount_sfc,total_cloud_cover_sfc,high_cloud_cover_high_cld_lay,medium_cloud_cover_mid_cld_lay,low_cloud_cover_low_cld_lay,shortwave_radiation_backwards_sfc,...,wind_direction_10_m_above_gnd,wind_speed_80_m_above_gnd,wind_direction_80_m_above_gnd,wind_speed_900_mb,wind_direction_900_mb,wind_gust_10_m_above_gnd,angle_of_incidence,zenith,azimuth,generated_power_kw
0,2.17,31,1035.0,0.0,0.0,0.0,0,0,0,0.00,...,312.71,9.36,22.62,6.62,337.62,24.48,58.753108,83.237322,128.33543,454.100950
1,2.31,27,1035.1,0.0,0.0,0.0,0,0,0,1.78,...,294.78,5.99,32.74,4.61,321.34,21.96,45.408585,75.143041,139.65530,1411.999400
2,3.65,33,1035.4,0.0,0.0,0.0,0,0,0,108.58,...,270.00,3.89,56.31,3.76,286.70,14.04,32.848282,68.820648,152.53769,2214.849300
3,5.82,30,1035.4,0.0,0.0,0.0,0,0,0,258.10,...,323.13,3.55,23.96,3.08,339.44,19.80,22.699288,64.883536,166.90159,2527.609200
4,7.73,27,1034.4,0.0,0.0,0.0,0,0,0,375.58,...,10.01,6.76,25.20,6.62,22.38,16.56,19.199908,63.795208,182.13526,2640.203400
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4208,8.69,66,1025.1,0.0,0.0,100.0,100,100,100,257.21,...,285.42,25.50,283.06,24.01,282.99,25.20,22.700907,64.952098,167.06794,173.410560
4209,7.57,90,1026.1,0.0,0.0,100.0,79,100,100,210.04,...,6.91,13.00,4.76,11.61,7.13,10.44,19.307487,63.895118,182.28247,403.792220
4210,7.27,90,1026.3,0.1,0.0,100.0,73,100,100,113.92,...,2.29,8.71,352.87,7.95,354.81,22.32,25.249506,65.827032,197.33868,158.367780
4211,8.25,81,1025.5,0.0,0.0,100.0,74,66,100,186.90,...,11.31,6.29,13.24,5.94,14.04,15.48,36.361161,70.496313,211.30027,126.953060


In [4]:
type(df)

pandas.core.frame.DataFrame

In [5]:
df.dtypes

temperature_2_m_above_gnd            float64
relative_humidity_2_m_above_gnd        int64
mean_sea_level_pressure_MSL          float64
total_precipitation_sfc              float64
snowfall_amount_sfc                  float64
total_cloud_cover_sfc                float64
high_cloud_cover_high_cld_lay          int64
medium_cloud_cover_mid_cld_lay         int64
low_cloud_cover_low_cld_lay            int64
shortwave_radiation_backwards_sfc    float64
wind_speed_10_m_above_gnd            float64
wind_direction_10_m_above_gnd        float64
wind_speed_80_m_above_gnd            float64
wind_direction_80_m_above_gnd        float64
wind_speed_900_mb                    float64
wind_direction_900_mb                float64
wind_gust_10_m_above_gnd             float64
angle_of_incidence                   float64
zenith                               float64
azimuth                              float64
generated_power_kw                   float64
dtype: object

In [6]:
df.columns

Index(['temperature_2_m_above_gnd', 'relative_humidity_2_m_above_gnd',
       'mean_sea_level_pressure_MSL', 'total_precipitation_sfc',
       'snowfall_amount_sfc', 'total_cloud_cover_sfc',
       'high_cloud_cover_high_cld_lay', 'medium_cloud_cover_mid_cld_lay',
       'low_cloud_cover_low_cld_lay', 'shortwave_radiation_backwards_sfc',
       'wind_speed_10_m_above_gnd', 'wind_direction_10_m_above_gnd',
       'wind_speed_80_m_above_gnd', 'wind_direction_80_m_above_gnd',
       'wind_speed_900_mb', 'wind_direction_900_mb',
       'wind_gust_10_m_above_gnd', 'angle_of_incidence', 'zenith', 'azimuth',
       'generated_power_kw'],
      dtype='object')

In [7]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4213 entries, 0 to 4212
Data columns (total 21 columns):
 #   Column                             Non-Null Count  Dtype  
---  ------                             --------------  -----  
 0   temperature_2_m_above_gnd          4213 non-null   float64
 1   relative_humidity_2_m_above_gnd    4213 non-null   int64  
 2   mean_sea_level_pressure_MSL        4213 non-null   float64
 3   total_precipitation_sfc            4213 non-null   float64
 4   snowfall_amount_sfc                4213 non-null   float64
 5   total_cloud_cover_sfc              4213 non-null   float64
 6   high_cloud_cover_high_cld_lay      4213 non-null   int64  
 7   medium_cloud_cover_mid_cld_lay     4213 non-null   int64  
 8   low_cloud_cover_low_cld_lay        4213 non-null   int64  
 9   shortwave_radiation_backwards_sfc  4213 non-null   float64
 10  wind_speed_10_m_above_gnd          4213 non-null   float64
 11  wind_direction_10_m_above_gnd      4213 non-null   float

#### Adding a column to the dataframe using date_range method

In [8]:
timeseries = pd.Series(pd.date_range("2015-05-02 00:00:00", periods = len(df), freq = "H"))

In [9]:
timeseries

0      2015-05-02 00:00:00
1      2015-05-02 01:00:00
2      2015-05-02 02:00:00
3      2015-05-02 03:00:00
4      2015-05-02 04:00:00
               ...        
4208   2015-10-24 08:00:00
4209   2015-10-24 09:00:00
4210   2015-10-24 10:00:00
4211   2015-10-24 11:00:00
4212   2015-10-24 12:00:00
Length: 4213, dtype: datetime64[ns]

In [10]:
df.insert(0,'Time', timeseries.values)

In [11]:
df

Unnamed: 0,Time,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,mean_sea_level_pressure_MSL,total_precipitation_sfc,snowfall_amount_sfc,total_cloud_cover_sfc,high_cloud_cover_high_cld_lay,medium_cloud_cover_mid_cld_lay,low_cloud_cover_low_cld_lay,...,wind_direction_10_m_above_gnd,wind_speed_80_m_above_gnd,wind_direction_80_m_above_gnd,wind_speed_900_mb,wind_direction_900_mb,wind_gust_10_m_above_gnd,angle_of_incidence,zenith,azimuth,generated_power_kw
0,2015-05-02 00:00:00,2.17,31,1035.0,0.0,0.0,0.0,0,0,0,...,312.71,9.36,22.62,6.62,337.62,24.48,58.753108,83.237322,128.33543,454.100950
1,2015-05-02 01:00:00,2.31,27,1035.1,0.0,0.0,0.0,0,0,0,...,294.78,5.99,32.74,4.61,321.34,21.96,45.408585,75.143041,139.65530,1411.999400
2,2015-05-02 02:00:00,3.65,33,1035.4,0.0,0.0,0.0,0,0,0,...,270.00,3.89,56.31,3.76,286.70,14.04,32.848282,68.820648,152.53769,2214.849300
3,2015-05-02 03:00:00,5.82,30,1035.4,0.0,0.0,0.0,0,0,0,...,323.13,3.55,23.96,3.08,339.44,19.80,22.699288,64.883536,166.90159,2527.609200
4,2015-05-02 04:00:00,7.73,27,1034.4,0.0,0.0,0.0,0,0,0,...,10.01,6.76,25.20,6.62,22.38,16.56,19.199908,63.795208,182.13526,2640.203400
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4208,2015-10-24 08:00:00,8.69,66,1025.1,0.0,0.0,100.0,100,100,100,...,285.42,25.50,283.06,24.01,282.99,25.20,22.700907,64.952098,167.06794,173.410560
4209,2015-10-24 09:00:00,7.57,90,1026.1,0.0,0.0,100.0,79,100,100,...,6.91,13.00,4.76,11.61,7.13,10.44,19.307487,63.895118,182.28247,403.792220
4210,2015-10-24 10:00:00,7.27,90,1026.3,0.1,0.0,100.0,73,100,100,...,2.29,8.71,352.87,7.95,354.81,22.32,25.249506,65.827032,197.33868,158.367780
4211,2015-10-24 11:00:00,8.25,81,1025.5,0.0,0.0,100.0,74,66,100,...,11.31,6.29,13.24,5.94,14.04,15.48,36.361161,70.496313,211.30027,126.953060


**Series** and **DataFrame** have extended data type support and functionality for **datetime**, **timedelta** and **Period** data when passed into those constructors. **DateOffset** data however will be stored as **object** data.

In [12]:
type(timeseries)

pandas.core.series.Series

In [13]:
timeseries #date_Range

0      2015-05-02 00:00:00
1      2015-05-02 01:00:00
2      2015-05-02 02:00:00
3      2015-05-02 03:00:00
4      2015-05-02 04:00:00
               ...        
4208   2015-10-24 08:00:00
4209   2015-10-24 09:00:00
4210   2015-10-24 10:00:00
4211   2015-10-24 11:00:00
4212   2015-10-24 12:00:00
Length: 4213, dtype: datetime64[ns]

#### As can be observed, the dtype here is datetime64[ns] (nanosecond resolution)

#### Similarly for **period_range**

In [14]:
pd.Series(pd.period_range("1/1/2022", freq="M", periods=3))

0    2022-01
1    2022-02
2    2022-03
dtype: period[M]

#### It can be observed that the datatype is period[M]

#### This shows that **datetime**, **Period** and **timedelta** data have extended data type support when passed into the Series and DataFrame constructors.

In [15]:
pd.Series([pd.DateOffset(1), pd.DateOffset(2)])

0         <DateOffset>
1    <2 * DateOffsets>
dtype: object

As can be observed above, however, **DateOffset** data is stored with the **object** dtype when passed through the Series or DataFrame constructors.
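As a compact check of the statement above, the following sketch builds one small Series per time-related type and prints the resulting dtypes. The values are made up for illustration and are not taken from the solar dataset.

```python
import pandas as pd

# One small Series per time-related type (illustrative values only)
s_datetime = pd.Series(pd.to_datetime(["2022-01-01", "2022-01-02"]))
s_timedelta = pd.Series(pd.to_timedelta(["1 days", "2 days"]))
s_period = pd.Series(pd.period_range("2022-01", periods=2, freq="M"))
s_offset = pd.Series([pd.DateOffset(1), pd.DateOffset(2)])

# The first three get dedicated dtypes; DateOffset falls back to object
for name, s in [("datetime", s_datetime), ("timedelta", s_timedelta),
                ("period", s_period), ("offset", s_offset)]:
    print(f"{name:<10} {s.dtype}")
```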

## Timestamps vs. time spans

#### Timestamp and **Period** can serve as an index. Lists of **Timestamp** and **Period** are automatically coerced to **DatetimeIndex** and **PeriodIndex** respectively.

#### For timestamp

In [16]:
ts = pd.Series(np.random.randn(len(df)), df.loc[:,"Time"])

In [17]:
ts

Time
2015-05-02 00:00:00   -0.879028
2015-05-02 01:00:00   -0.272703
2015-05-02 02:00:00   -1.731871
2015-05-02 03:00:00    0.143216
2015-05-02 04:00:00   -0.281630
                         ...   
2015-10-24 08:00:00    0.769954
2015-10-24 09:00:00    1.329114
2015-10-24 10:00:00   -0.729129
2015-10-24 11:00:00    1.236616
2015-10-24 12:00:00    0.042773
Length: 4213, dtype: float64

In [18]:
type(ts.index)

pandas.core.indexes.datetimes.DatetimeIndex

In [19]:
df.iat[1,0] #to access a single value from a dataframe

Timestamp('2015-05-02 01:00:00')

#### Similarly for period

In [20]:
ts1 = pd.Series(np.random.randn(3),pd.period_range("2022-10", periods = 3, freq = "M"))

In [21]:
type(ts1.index)

pandas.core.indexes.period.PeriodIndex

In [22]:
ts1

2022-10    1.141777
2022-11   -0.607684
2022-12   -0.927996
Freq: M, dtype: float64

In [23]:
k = pd.Period("2022-10")

In [24]:
type(k)

pandas._libs.tslibs.period.Period

#### pandas represents timestamps using instances of **Timestamp** and sequences of timestamps using instances of **DatetimeIndex**

#### For regular time spans, pandas uses **Period** objects for scalar values and **PeriodIndex** for sequences of spans.
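The automatic coercion described in this section can be sketched with plain Python lists. The dates and values below are hypothetical; the point is only that a list of **Timestamp** objects becomes a **DatetimeIndex** and a list of **Period** objects becomes a **PeriodIndex**.

```python
import pandas as pd

values = [1.0, 2.0, 3.0]

# A plain list of Timestamp objects used as an index
stamps = [pd.Timestamp("2022-10-01"), pd.Timestamp("2022-10-02"), pd.Timestamp("2022-10-03")]
by_stamp = pd.Series(values, index=stamps)

# A plain list of Period objects (monthly spans) used as an index
periods = [pd.Period("2022-10"), pd.Period("2022-11"), pd.Period("2022-12")]
by_period = pd.Series(values, index=periods)

print(type(by_stamp.index).__name__)
print(type(by_period.index).__name__)
```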

## Converting to timestamps 


In [25]:
randomdates = ["Jul 31, 2022", "2022-05-10", None]

# Pad with None so the list has one entry per DataFrame row
randomdates.extend([None] * (len(df) - 3))

randomdates

['Jul 31, 2022',
 '2022-05-10',
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 N

In [26]:
ser = pd.to_datetime(pd.Series(randomdates))

In [27]:
df.insert(1,'randTime', ser.values)

In [28]:
df

Unnamed: 0,Time,randTime,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,mean_sea_level_pressure_MSL,total_precipitation_sfc,snowfall_amount_sfc,total_cloud_cover_sfc,high_cloud_cover_high_cld_lay,medium_cloud_cover_mid_cld_lay,...,wind_direction_10_m_above_gnd,wind_speed_80_m_above_gnd,wind_direction_80_m_above_gnd,wind_speed_900_mb,wind_direction_900_mb,wind_gust_10_m_above_gnd,angle_of_incidence,zenith,azimuth,generated_power_kw
0,2015-05-02 00:00:00,2022-07-31,2.17,31,1035.0,0.0,0.0,0.0,0,0,...,312.71,9.36,22.62,6.62,337.62,24.48,58.753108,83.237322,128.33543,454.100950
1,2015-05-02 01:00:00,2022-05-10,2.31,27,1035.1,0.0,0.0,0.0,0,0,...,294.78,5.99,32.74,4.61,321.34,21.96,45.408585,75.143041,139.65530,1411.999400
2,2015-05-02 02:00:00,NaT,3.65,33,1035.4,0.0,0.0,0.0,0,0,...,270.00,3.89,56.31,3.76,286.70,14.04,32.848282,68.820648,152.53769,2214.849300
3,2015-05-02 03:00:00,NaT,5.82,30,1035.4,0.0,0.0,0.0,0,0,...,323.13,3.55,23.96,3.08,339.44,19.80,22.699288,64.883536,166.90159,2527.609200
4,2015-05-02 04:00:00,NaT,7.73,27,1034.4,0.0,0.0,0.0,0,0,...,10.01,6.76,25.20,6.62,22.38,16.56,19.199908,63.795208,182.13526,2640.203400
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4208,2015-10-24 08:00:00,NaT,8.69,66,1025.1,0.0,0.0,100.0,100,100,...,285.42,25.50,283.06,24.01,282.99,25.20,22.700907,64.952098,167.06794,173.410560
4209,2015-10-24 09:00:00,NaT,7.57,90,1026.1,0.0,0.0,100.0,79,100,...,6.91,13.00,4.76,11.61,7.13,10.44,19.307487,63.895118,182.28247,403.792220
4210,2015-10-24 10:00:00,NaT,7.27,90,1026.3,0.1,0.0,100.0,73,100,...,2.29,8.71,352.87,7.95,354.81,22.32,25.249506,65.827032,197.33868,158.367780
4211,2015-10-24 11:00:00,NaT,8.25,81,1025.5,0.0,0.0,100.0,74,66,...,11.31,6.29,13.24,5.94,14.04,15.48,36.361161,70.496313,211.30027,126.953060


#### When date-like strings are converted with **to_datetime**, they are parsed into standard datetime values, and missing entries such as **None** are stored as **NaT** (not a time), the datetime counterpart of np.nan.

In [29]:
pd.to_datetime(["04-01-2012 10:00"], dayfirst=True)

DatetimeIndex(['2012-01-04 10:00:00'], dtype='datetime64[ns]', freq=None)

#### If the dates start with the day first (i.e. European style), we can pass the **dayfirst** flag as done above
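To make the effect of the flag visible, this small sketch parses the same string with and without **dayfirst**. The variable names are chosen here for illustration.

```python
import pandas as pd

text = "04-01-2012 10:00"

# Without the flag the leading number is read as the month
month_first = pd.to_datetime(text)

# With dayfirst=True the leading number is read as the day
day_first = pd.to_datetime(text, dayfirst=True)

print(month_first)
print(day_first)
```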

In [30]:
pd.to_datetime("2010/11/12")

Timestamp('2010-11-12 00:00:00')

In [31]:
pd.Timestamp("2010/11/12")

Timestamp('2010-11-12 00:00:00')

#### If we pass a single string to **to_datetime**, it returns a single **Timestamp**. **Timestamp** can also accept string input, but it doesn’t accept string parsing options like **dayfirst** or **format**, so use **to_datetime** if these are required.

#### We can also use the **DatetimeIndex** constructor directly:

In [32]:
pd.DatetimeIndex(["2018-01-01", "2018-01-03", "2018-01-05"])

DatetimeIndex(['2018-01-01', '2018-01-03', '2018-01-05'], dtype='datetime64[ns]', freq=None)

#### The string ‘infer’ can be passed in order to set the frequency of the index as the inferred frequency upon creation:

In [33]:
pd.DatetimeIndex(["2018-01-01", "2018-01-03", "2018-01-05"], freq="infer")


DatetimeIndex(['2018-01-01', '2018-01-03', '2018-01-05'], dtype='datetime64[ns]', freq='2D')

### Providing a format argument

In [34]:
pd.to_datetime("2010/11/12", format="%Y/%m/%d")

Timestamp('2010-11-12 00:00:00')

In [35]:
pd.to_datetime("12-11-2010 00:00", format="%d-%m-%Y %H:%M")


Timestamp('2010-11-12 00:00:00')

### Assembling datetime from multiple DataFrame columns

In [36]:
df1 = pd.DataFrame(
    {"year": [2015, 2016], "month": [2, 3], "day": [4, 5], "hour": [2, 3]}
)

In [37]:
pd.to_datetime(df1)

0   2015-02-04 02:00:00
1   2016-03-05 03:00:00
dtype: datetime64[ns]

#### You can pass only the columns that you need to assemble.

In [38]:
pd.to_datetime(df1[["year", "month", "day"]])

0   2015-02-04
1   2016-03-05
dtype: datetime64[ns]

#### Required: **year, month, day**

#### Optional: **hour, minute, second, millisecond, microsecond, nanosecond**
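A minimal sketch of the required and optional columns, using a hypothetical frame named `parts`: an optional column such as `minute` is picked up automatically, while leaving out a required column such as `day` raises a `ValueError`.

```python
import pandas as pd

parts = pd.DataFrame(
    {"year": [2015, 2016], "month": [2, 3], "day": [4, 5], "minute": [30, 45]}
)

# Optional components are used when present
with_minutes = pd.to_datetime(parts)

# Dropping a required component ("day") is rejected
try:
    pd.to_datetime(parts[["year", "month"]])
    missing_day_error = None
except ValueError as exc:
    missing_day_error = exc

print(with_minutes)
print(missing_day_error)
```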

### Invalid data


#### Possible values of the **errors** parameter:

**errors = "raise"** (the default) raises an exception when a value cannot be parsed.

**errors = "ignore"** returns the input unchanged when a value cannot be parsed. This option is deprecated in recent pandas releases.

**errors = "coerce"** converts values that cannot be parsed to **NaT**.
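The difference between raising and coercing can be sketched on a tiny hypothetical Series, separate from the DataFrame used in this notebook. The `"ignore"` option is left out because it is deprecated in recent pandas releases.

```python
import pandas as pd

raw = pd.Series(["2022-05-10", "asd", None])

# errors="coerce": entries that cannot be parsed become NaT
coerced = pd.to_datetime(raw, errors="coerce")

# errors="raise" (the default): the unparseable entry raises an exception
try:
    pd.to_datetime(raw, errors="raise")
    raised = False
except ValueError:
    raised = True

print(coerced)
print("raised:", raised)
```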

In [39]:
df.iat[4,1] = "asd"

In [40]:
df

Unnamed: 0,Time,randTime,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,mean_sea_level_pressure_MSL,total_precipitation_sfc,snowfall_amount_sfc,total_cloud_cover_sfc,high_cloud_cover_high_cld_lay,medium_cloud_cover_mid_cld_lay,...,wind_direction_10_m_above_gnd,wind_speed_80_m_above_gnd,wind_direction_80_m_above_gnd,wind_speed_900_mb,wind_direction_900_mb,wind_gust_10_m_above_gnd,angle_of_incidence,zenith,azimuth,generated_power_kw
0,2015-05-02 00:00:00,2022-07-31 00:00:00,2.17,31,1035.0,0.0,0.0,0.0,0,0,...,312.71,9.36,22.62,6.62,337.62,24.48,58.753108,83.237322,128.33543,454.100950
1,2015-05-02 01:00:00,2022-05-10 00:00:00,2.31,27,1035.1,0.0,0.0,0.0,0,0,...,294.78,5.99,32.74,4.61,321.34,21.96,45.408585,75.143041,139.65530,1411.999400
2,2015-05-02 02:00:00,NaT,3.65,33,1035.4,0.0,0.0,0.0,0,0,...,270.00,3.89,56.31,3.76,286.70,14.04,32.848282,68.820648,152.53769,2214.849300
3,2015-05-02 03:00:00,NaT,5.82,30,1035.4,0.0,0.0,0.0,0,0,...,323.13,3.55,23.96,3.08,339.44,19.80,22.699288,64.883536,166.90159,2527.609200
4,2015-05-02 04:00:00,asd,7.73,27,1034.4,0.0,0.0,0.0,0,0,...,10.01,6.76,25.20,6.62,22.38,16.56,19.199908,63.795208,182.13526,2640.203400
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4208,2015-10-24 08:00:00,,8.69,66,1025.1,0.0,0.0,100.0,100,100,...,285.42,25.50,283.06,24.01,282.99,25.20,22.700907,64.952098,167.06794,173.410560
4209,2015-10-24 09:00:00,,7.57,90,1026.1,0.0,0.0,100.0,79,100,...,6.91,13.00,4.76,11.61,7.13,10.44,19.307487,63.895118,182.28247,403.792220
4210,2015-10-24 10:00:00,,7.27,90,1026.3,0.1,0.0,100.0,73,100,...,2.29,8.71,352.87,7.95,354.81,22.32,25.249506,65.827032,197.33868,158.367780
4211,2015-10-24 11:00:00,,8.25,81,1025.5,0.0,0.0,100.0,74,66,...,11.31,6.29,13.24,5.94,14.04,15.48,36.361161,70.496313,211.30027,126.953060


In [42]:
pd.to_datetime(df.iat[4,1], errors = 'raise')

ParserError: Unknown string format: asd

In [None]:
pd.to_datetime(df.iat[4,1], errors = 'ignore')

In [None]:
pd.to_datetime(df.iat[4,1], errors = 'coerce')

### Epoch timestamps

#### Pandas supports converting integer or float epoch times to **Timestamp** and **DatetimeIndex**. The default unit is nanoseconds, since that is how **Timestamp** objects are stored internally. However, epochs are often stored in another **unit** which can be specified. These are computed from the starting point specified by the **origin** parameter.

In [None]:
pd.to_datetime(
    [1349720105, 1349806505, 1349892905, 1349979305, 1350065705], unit="s"
)

In [None]:
pd.to_datetime(
    [1349720105100, 1349720105200, 1349720105300, 1349720105400, 1349720105500],
    unit="ms",
)

#### Constructing a **Timestamp** or **DatetimeIndex** with an epoch timestamp with the **tz** argument specified will raise a ValueError. If you have epochs in wall time in another timezone, you can read the epochs as timezone-naive timestamps and then localize to the appropriate timezone:

In [None]:
pd.Timestamp(1262347200000000000).tz_localize("US/Pacific")

In [None]:
pd.DatetimeIndex([1262347200000000000]).tz_localize("US/Pacific")

In [None]:
pd.to_datetime([1490195805.433, 1490195805.433502912], unit="s")

In [None]:
pd.to_datetime(1490195805433502912, unit="ns")

### From timestamps to epoch

**Unix time (also called POSIX time or a Unix timestamp) is the number of seconds that have elapsed since the Unix epoch, January 1, 1970 at midnight UTC (in ISO 8601: 1970-01-01T00:00:00Z), not counting leap seconds.**

In [None]:
stamps = df.loc[:,"Time"]

In [None]:
stamps

In [None]:
(stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")
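As a sanity check on the subtraction and floor division above, the sketch below converts a few hand-picked timestamps to epoch seconds and back again. The three timestamps are illustrative stand-ins for the `Time` column.

```python
import pandas as pd

# Stand-in for a few values of the "Time" column
sample = pd.Series(pd.to_datetime(
    ["2015-05-02 00:00:00", "2015-05-02 01:00:00", "2015-05-02 02:00:00"]
))

# Timestamps -> whole seconds since 1970-01-01
epoch_seconds = (sample - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")

# Seconds since the epoch -> timestamps again
restored = pd.to_datetime(epoch_seconds, unit="s")

print(epoch_seconds.tolist())
print((restored == sample).all())
```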

### Using the origin parameter


#### Using the **origin** parameter, one can specify an alternative starting point for creation of a **DatetimeIndex**. For example, to use 1960-01-01 as the starting date:

In [None]:
pd.to_datetime([1, 2, 3], unit="D", origin=pd.Timestamp("1960-01-01"))

#### The default is **origin='unix'**, which corresponds to **1970-01-01 00:00:00**, commonly called the 'unix epoch' or POSIX time.

In [None]:
pd.to_datetime([1, 2, 3], unit="D")
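Since the cells above are shown without output, here is a small sketch that puts the two origins side by side. The same integers land ten years apart, which is exactly the gap between the two starting points.

```python
import pandas as pd

# Same day counts, two different starting points
from_unix = pd.to_datetime([1, 2, 3], unit="D")
from_1960 = pd.to_datetime([1, 2, 3], unit="D", origin=pd.Timestamp("1960-01-01"))

# The element-wise difference equals the distance between the origins
shift = from_unix - from_1960

print(from_unix)
print(from_1960)
print(shift)
```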


## Generating ranges of timestamps

#### To generate an index with timestamps, you can use either the DatetimeIndex or Index constructor and pass in a list of datetime objects:

In [None]:
dates = [
    datetime.datetime(2012, 5, 1),
    datetime.datetime(2012, 5, 2),
    datetime.datetime(2012, 5, 3),
]

In [None]:
index = pd.DatetimeIndex(dates)

In [None]:
index

In [None]:
index = pd.Index(dates) #automatically converted into datetimeindex

In [None]:
index

#### Building an index this way is cumbersome, because in practice we often need a very long index with a large number of timestamps.

#### If we need timestamps at a regular frequency, we can use the **date_range()** and **bdate_range()** functions to create a **DatetimeIndex**. The default frequency for ***date_range*** is a **calendar day**, while the default for ***bdate_range*** is a **business day**.

In [None]:
start = datetime.datetime(2011, 1, 1)
end = datetime.datetime(2012, 1, 1)

In [None]:
index = pd.date_range(start, end)

In [None]:
index

In [None]:
index_b = pd.bdate_range(start, end) #business day index

In [None]:
index_b

#### There are many frequency aliases we can pass to the **date_range** and **bdate_range** functions.

In [None]:
pd.date_range(start, periods=1000, freq="M") # month end frequency

In [None]:
pd.bdate_range(start, periods=250, freq="BQS") #business quarter start frequency

In [None]:
pd.date_range(start, end, freq="BM") #business month end frequency

In [None]:
pd.date_range(start, end, freq="W") #weekly frequency

#### We can also generate a fixed number of dates counting forward from a start date or backward from an end date, or a given number of evenly spaced dates between two dates.

In [None]:
pd.bdate_range(end=end, periods=20)


In [None]:
pd.bdate_range(start=start, periods=20)


In [None]:
pd.date_range("2018-01-01", "2018-01-05", periods=5)


In [None]:
pd.date_range("2018-01-01", "2018-01-05", periods=10)
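When both endpoints and **periods** are given, the points are spread evenly across the interval, so the spacing depends on the count. A short sketch with counts chosen here for illustration:

```python
import pandas as pd

# 3 points over 4 days: one every 2 days
three_points = pd.date_range("2018-01-01", "2018-01-05", periods=3)

# 9 points over 4 days: one every 12 hours
nine_points = pd.date_range("2018-01-01", "2018-01-05", periods=9)

print(three_points)
print(nine_points)
```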


### Custom frequency ranges

#### The **weekmask** and **holidays** parameters customize which dates **bdate_range** produces. They are only used when a custom frequency string (such as **"C"**) is passed as **freq**.

In [None]:
weekmask = "Mon Wed Fri"
holidays = [datetime.datetime(2022, 1, 5), datetime.datetime(2022, 3, 14)]

In [None]:
pd.bdate_range(start, end, freq="C", weekmask=weekmask, holidays=holidays)
# C: custom business day frequency

In [None]:
pd.bdate_range(start, end, freq="CBMS", weekmask=weekmask)
# CBMS: custom business month start frequency
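Note that the holidays defined above fall in 2022, outside the 2011 to 2012 range, so they do not remove any dates there. The sketch below uses a short window and a holiday inside it (both chosen here for illustration) so that the effect of **weekmask** and **holidays** is visible.

```python
import datetime

import pandas as pd

# Two weeks in January 2011; only Mon/Wed/Fri count as business days,
# and Wednesday 2011-01-05 is declared a holiday
custom_days = pd.bdate_range(
    start="2011-01-01",
    end="2011-01-15",
    freq="C",
    weekmask="Mon Wed Fri",
    holidays=[datetime.datetime(2011, 1, 5)],
)

print(custom_days)
```

The holiday is dropped even though it falls on a Wednesday, and every remaining date is a Monday, Wednesday or Friday.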

## Timestamp limitations


#### Since pandas represents timestamps at nanosecond resolution, the time span that can be represented using a 64-bit integer is limited to approximately **584 years**.

In [43]:
pd.Timestamp.min

Timestamp('1677-09-21 00:12:43.145224193')

In [44]:
pd.Timestamp.max

Timestamp('2262-04-11 23:47:16.854775807')
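The figure of roughly 584 years follows from simple arithmetic: a 64-bit integer can count 2**64 distinct nanoseconds. The sketch below redoes that calculation, assuming an average year of 365.25 days, and checks that the largest timestamp sits at the top of the signed 64-bit range.

```python
import pandas as pd

# Nanoseconds in an average (365.25-day) year
NS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1_000_000_000

# Total span covered by 2**64 nanosecond ticks
span_years = 2**64 / NS_PER_YEAR

# Nanoseconds since the epoch for the largest representable timestamp
largest_ns = pd.Timestamp.max.value

print(round(span_years, 1))
print(largest_ns == 2**63 - 1)
```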

## Indexing

#### One of the main uses for **DatetimeIndex** is as an index for pandas objects. The **DatetimeIndex** class contains many time series related optimizations

#### DatetimeIndex objects have all the basic functionality of regular **Index** objects, and a variety of advanced time series specific methods for easy frequency processing.

In [45]:
df_Time = df.loc[:,"Time"]

In [46]:
df_ser = pd.Series(np.random.randn(len(df)), df.loc[:,"Time"])

In [47]:
df_ser

Time
2015-05-02 00:00:00    0.932816
2015-05-02 01:00:00    0.011310
2015-05-02 02:00:00    1.720286
2015-05-02 03:00:00    0.078407
2015-05-02 04:00:00   -1.229160
                         ...   
2015-10-24 08:00:00    1.745733
2015-10-24 09:00:00    0.558440
2015-10-24 10:00:00   -2.085374
2015-10-24 11:00:00    0.510149
2015-10-24 12:00:00    1.029666
Length: 4213, dtype: float64

In [48]:
df_ser.index

DatetimeIndex(['2015-05-02 00:00:00', '2015-05-02 01:00:00',
               '2015-05-02 02:00:00', '2015-05-02 03:00:00',
               '2015-05-02 04:00:00', '2015-05-02 05:00:00',
               '2015-05-02 06:00:00', '2015-05-02 07:00:00',
               '2015-05-02 08:00:00', '2015-05-02 09:00:00',
               ...
               '2015-10-24 03:00:00', '2015-10-24 04:00:00',
               '2015-10-24 05:00:00', '2015-10-24 06:00:00',
               '2015-10-24 07:00:00', '2015-10-24 08:00:00',
               '2015-10-24 09:00:00', '2015-10-24 10:00:00',
               '2015-10-24 11:00:00', '2015-10-24 12:00:00'],
              dtype='datetime64[ns]', name='Time', length=4213, freq=None)

In [49]:
df_ser[:5].index

DatetimeIndex(['2015-05-02 00:00:00', '2015-05-02 01:00:00',
               '2015-05-02 02:00:00', '2015-05-02 03:00:00',
               '2015-05-02 04:00:00'],
              dtype='datetime64[ns]', name='Time', freq=None)

In [50]:
df_ser[::2].index

DatetimeIndex(['2015-05-02 00:00:00', '2015-05-02 02:00:00',
               '2015-05-02 04:00:00', '2015-05-02 06:00:00',
               '2015-05-02 08:00:00', '2015-05-02 10:00:00',
               '2015-05-02 12:00:00', '2015-05-02 14:00:00',
               '2015-05-02 16:00:00', '2015-05-02 18:00:00',
               ...
               '2015-10-23 18:00:00', '2015-10-23 20:00:00',
               '2015-10-23 22:00:00', '2015-10-24 00:00:00',
               '2015-10-24 02:00:00', '2015-10-24 04:00:00',
               '2015-10-24 06:00:00', '2015-10-24 08:00:00',
               '2015-10-24 10:00:00', '2015-10-24 12:00:00'],
              dtype='datetime64[ns]', name='Time', length=2107, freq=None)

### Partial string indexing

In [51]:
df_ser["2015-10-24 05:00:00"]

0.8920374511071717

In [52]:
df_ser[datetime.datetime(2015, 10, 22):]

Time
2015-10-22 00:00:00    0.042364
2015-10-22 01:00:00    1.901362
2015-10-22 02:00:00   -0.002129
2015-10-22 03:00:00   -0.142479
2015-10-22 04:00:00    1.678754
                         ...   
2015-10-24 08:00:00    1.745733
2015-10-24 09:00:00    0.558440
2015-10-24 10:00:00   -2.085374
2015-10-24 11:00:00    0.510149
2015-10-24 12:00:00    1.029666
Length: 61, dtype: float64

In [53]:
df_ser["2015/10/22":"2015/10/23"]

Time
2015-10-22 00:00:00    0.042364
2015-10-22 01:00:00    1.901362
2015-10-22 02:00:00   -0.002129
2015-10-22 03:00:00   -0.142479
2015-10-22 04:00:00    1.678754
2015-10-22 05:00:00   -0.400665
2015-10-22 06:00:00    1.679410
2015-10-22 07:00:00    1.197448
2015-10-22 08:00:00    1.166137
2015-10-22 09:00:00   -1.244254
2015-10-22 10:00:00   -1.854270
2015-10-22 11:00:00    2.284689
2015-10-22 12:00:00   -0.236776
2015-10-22 13:00:00   -0.487805
2015-10-22 14:00:00    0.087282
2015-10-22 15:00:00    1.364226
2015-10-22 16:00:00   -0.099761
2015-10-22 17:00:00    1.443609
2015-10-22 18:00:00   -0.780816
2015-10-22 19:00:00    0.262277
2015-10-22 20:00:00    1.091570
2015-10-22 21:00:00    0.327569
2015-10-22 22:00:00   -1.298351
2015-10-22 23:00:00   -1.604891
2015-10-23 00:00:00    0.704172
2015-10-23 01:00:00   -1.880864
2015-10-23 02:00:00    0.521882
2015-10-23 03:00:00   -2.251269
2015-10-23 04:00:00    0.526005
2015-10-23 05:00:00   -0.514141
2015-10-23 06:00:00   -0.031147
201

#### To provide convenience for accessing longer time series, you can also pass in the year or year and month as strings

In [54]:
df_ser["2015"]

Time
2015-05-02 00:00:00    0.932816
2015-05-02 01:00:00    0.011310
2015-05-02 02:00:00    1.720286
2015-05-02 03:00:00    0.078407
2015-05-02 04:00:00   -1.229160
                         ...   
2015-10-24 08:00:00    1.745733
2015-10-24 09:00:00    0.558440
2015-10-24 10:00:00   -2.085374
2015-10-24 11:00:00    0.510149
2015-10-24 12:00:00    1.029666
Length: 4213, dtype: float64

In [55]:
df_ser["2015-10"]

Time
2015-10-01 00:00:00   -0.040547
2015-10-01 01:00:00    0.163139
2015-10-01 02:00:00   -0.308947
2015-10-01 03:00:00   -1.119768
2015-10-01 04:00:00   -0.258740
                         ...   
2015-10-24 08:00:00    1.745733
2015-10-24 09:00:00    0.558440
2015-10-24 10:00:00   -2.085374
2015-10-24 11:00:00    0.510149
2015-10-24 12:00:00    1.029666
Length: 565, dtype: float64

#### The indexing so far was done on a Series, but the same approach works on DataFrames as well

#### Since the partial string selection is a form of label slicing, the endpoints will be included. This would include matching times on an included date
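The inclusive endpoints can be seen on a small hourly series. This is a sketch on synthetic data (72 hourly values starting 2015-10-22, a range chosen here for illustration), using `.loc` for the label-based selection.

```python
import numpy as np
import pandas as pd

# Three full days of hourly values
hours = pd.date_range("2015-10-22", periods=72, freq=pd.Timedelta(hours=1))
demo = pd.Series(np.arange(72), index=hours)

# A day string selects every hour of that day
one_day = demo.loc["2015-10-22"]

# A slice includes all of the end date, up to its last hour
two_days = demo.loc["2015-10-22":"2015-10-23"]

print(len(one_day), len(two_days))
print(two_days.index[-1])
```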

#### Let's first set the **"Time"** column as the index of the dataframe

In [56]:
df = df.set_index("Time")

In [57]:
df

Unnamed: 0_level_0,randTime,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,mean_sea_level_pressure_MSL,total_precipitation_sfc,snowfall_amount_sfc,total_cloud_cover_sfc,high_cloud_cover_high_cld_lay,medium_cloud_cover_mid_cld_lay,low_cloud_cover_low_cld_lay,...,wind_direction_10_m_above_gnd,wind_speed_80_m_above_gnd,wind_direction_80_m_above_gnd,wind_speed_900_mb,wind_direction_900_mb,wind_gust_10_m_above_gnd,angle_of_incidence,zenith,azimuth,generated_power_kw
Time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2015-05-02 00:00:00,2022-07-31 00:00:00,2.17,31,1035.0,0.0,0.0,0.0,0,0,0,...,312.71,9.36,22.62,6.62,337.62,24.48,58.753108,83.237322,128.33543,454.100950
2015-05-02 01:00:00,2022-05-10 00:00:00,2.31,27,1035.1,0.0,0.0,0.0,0,0,0,...,294.78,5.99,32.74,4.61,321.34,21.96,45.408585,75.143041,139.65530,1411.999400
2015-05-02 02:00:00,NaT,3.65,33,1035.4,0.0,0.0,0.0,0,0,0,...,270.00,3.89,56.31,3.76,286.70,14.04,32.848282,68.820648,152.53769,2214.849300
2015-05-02 03:00:00,NaT,5.82,30,1035.4,0.0,0.0,0.0,0,0,0,...,323.13,3.55,23.96,3.08,339.44,19.80,22.699288,64.883536,166.90159,2527.609200
2015-05-02 04:00:00,asd,7.73,27,1034.4,0.0,0.0,0.0,0,0,0,...,10.01,6.76,25.20,6.62,22.38,16.56,19.199908,63.795208,182.13526,2640.203400
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2015-10-24 08:00:00,,8.69,66,1025.1,0.0,0.0,100.0,100,100,100,...,285.42,25.50,283.06,24.01,282.99,25.20,22.700907,64.952098,167.06794,173.410560
2015-10-24 09:00:00,,7.57,90,1026.1,0.0,0.0,100.0,79,100,100,...,6.91,13.00,4.76,11.61,7.13,10.44,19.307487,63.895118,182.28247,403.792220
2015-10-24 10:00:00,,7.27,90,1026.3,0.1,0.0,100.0,73,100,100,...,2.29,8.71,352.87,7.95,354.81,22.32,25.249506,65.827032,197.33868,158.367780
2015-10-24 11:00:00,,8.25,81,1025.5,0.0,0.0,100.0,74,66,100,...,11.31,6.29,13.24,5.94,14.04,15.48,36.361161,70.496313,211.30027,126.953060


In [58]:
df.loc["2015"]

Unnamed: 0_level_0,randTime,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,mean_sea_level_pressure_MSL,total_precipitation_sfc,snowfall_amount_sfc,total_cloud_cover_sfc,high_cloud_cover_high_cld_lay,medium_cloud_cover_mid_cld_lay,low_cloud_cover_low_cld_lay,...,wind_direction_10_m_above_gnd,wind_speed_80_m_above_gnd,wind_direction_80_m_above_gnd,wind_speed_900_mb,wind_direction_900_mb,wind_gust_10_m_above_gnd,angle_of_incidence,zenith,azimuth,generated_power_kw
Time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2015-05-02 00:00:00,2022-07-31 00:00:00,2.17,31,1035.0,0.0,0.0,0.0,0,0,0,...,312.71,9.36,22.62,6.62,337.62,24.48,58.753108,83.237322,128.33543,454.100950
2015-05-02 01:00:00,2022-05-10 00:00:00,2.31,27,1035.1,0.0,0.0,0.0,0,0,0,...,294.78,5.99,32.74,4.61,321.34,21.96,45.408585,75.143041,139.65530,1411.999400
2015-05-02 02:00:00,NaT,3.65,33,1035.4,0.0,0.0,0.0,0,0,0,...,270.00,3.89,56.31,3.76,286.70,14.04,32.848282,68.820648,152.53769,2214.849300
2015-05-02 03:00:00,NaT,5.82,30,1035.4,0.0,0.0,0.0,0,0,0,...,323.13,3.55,23.96,3.08,339.44,19.80,22.699288,64.883536,166.90159,2527.609200
2015-05-02 04:00:00,asd,7.73,27,1034.4,0.0,0.0,0.0,0,0,0,...,10.01,6.76,25.20,6.62,22.38,16.56,19.199908,63.795208,182.13526,2640.203400
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2015-10-24 08:00:00,,8.69,66,1025.1,0.0,0.0,100.0,100,100,100,...,285.42,25.50,283.06,24.01,282.99,25.20,22.700907,64.952098,167.06794,173.410560
2015-10-24 09:00:00,,7.57,90,1026.1,0.0,0.0,100.0,79,100,100,...,6.91,13.00,4.76,11.61,7.13,10.44,19.307487,63.895118,182.28247,403.792220
2015-10-24 10:00:00,,7.27,90,1026.3,0.1,0.0,100.0,73,100,100,...,2.29,8.71,352.87,7.95,354.81,22.32,25.249506,65.827032,197.33868,158.367780
2015-10-24 11:00:00,,8.25,81,1025.5,0.0,0.0,100.0,74,66,100,...,11.31,6.29,13.24,5.94,14.04,15.48,36.361161,70.496313,211.30027,126.953060


#### The slice below starts at the first timestamp of the start month and includes the last timestamp of the end month:

In [59]:
df["2015-06":"2015-08"]

Unnamed: 0_level_0,randTime,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,mean_sea_level_pressure_MSL,total_precipitation_sfc,snowfall_amount_sfc,total_cloud_cover_sfc,high_cloud_cover_high_cld_lay,medium_cloud_cover_mid_cld_lay,low_cloud_cover_low_cld_lay,...,wind_direction_10_m_above_gnd,wind_speed_80_m_above_gnd,wind_direction_80_m_above_gnd,wind_speed_900_mb,wind_direction_900_mb,wind_gust_10_m_above_gnd,angle_of_incidence,zenith,azimuth,generated_power_kw
Time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2015-06-01 00:00:00,,9.29,33,1015.8,0.0,0.0,0.0,0,0,0,...,162.12,12.76,163.61,12.52,161.57,11.88,30.834489,50.901216,221.13622,2651.586600
2015-06-01 01:00:00,,9.87,32,1015.5,0.0,0.0,0.0,0,0,0,...,165.62,15.54,166.61,14.49,165.62,15.12,45.768773,59.412694,236.23563,2153.084300
2015-06-01 02:00:00,,10.03,33,1014.6,0.0,0.0,0.0,0,0,0,...,168.18,17.16,170.34,16.17,168.44,16.92,60.713240,69.448275,248.62163,1179.081100
2015-06-01 03:00:00,,9.79,34,1014.6,0.0,0.0,0.0,0,0,0,...,169.16,19.35,170.36,17.94,169.59,18.36,75.632314,80.290352,259.32746,213.976410
2015-06-01 04:00:00,,9.08,36,1015.1,0.0,0.0,0.0,0,0,0,...,169.16,19.42,169.32,17.30,167.99,16.20,85.944587,87.943014,266.32075,3.745139
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2015-08-31 19:00:00,,24.08,56,1014.1,0.0,0.0,100.0,100,100,0,...,198.15,25.28,199.98,22.77,198.43,25.20,30.099626,40.160273,135.72332,305.316620
2015-08-31 20:00:00,,25.46,47,1013.7,0.0,0.0,100.0,100,100,0,...,208.37,22.10,210.32,20.46,208.37,25.92,18.061232,33.971865,158.23295,272.775280
2015-08-31 21:00:00,,23.47,45,1015.6,1.3,0.0,100.0,100,100,85,...,336.80,7.42,337.17,6.95,338.75,28.08,13.936486,32.345608,185.23239,1821.294800
2015-08-31 22:00:00,,23.07,48,1015.7,1.5,0.0,100.0,90,100,100,...,354.81,12.24,360.00,12.24,360.00,30.96,22.171194,35.904986,210.98474,561.156940


#### Here we are specifying the exact stop time

In [60]:
df["2015-06":"2015-08-31 00:00:00"]

Unnamed: 0_level_0,randTime,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,mean_sea_level_pressure_MSL,total_precipitation_sfc,snowfall_amount_sfc,total_cloud_cover_sfc,high_cloud_cover_high_cld_lay,medium_cloud_cover_mid_cld_lay,low_cloud_cover_low_cld_lay,...,wind_direction_10_m_above_gnd,wind_speed_80_m_above_gnd,wind_direction_80_m_above_gnd,wind_speed_900_mb,wind_direction_900_mb,wind_gust_10_m_above_gnd,angle_of_incidence,zenith,azimuth,generated_power_kw
Time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2015-06-01 00:00:00,,9.29,33,1015.8,0.0,0.0,0.0,0,0,0,...,162.12,12.76,163.61,12.52,161.57,11.88,30.834489,50.901216,221.13622,2651.586600
2015-06-01 01:00:00,,9.87,32,1015.5,0.0,0.0,0.0,0,0,0,...,165.62,15.54,166.61,14.49,165.62,15.12,45.768773,59.412694,236.23563,2153.084300
2015-06-01 02:00:00,,10.03,33,1014.6,0.0,0.0,0.0,0,0,0,...,168.18,17.16,170.34,16.17,168.44,16.92,60.713240,69.448275,248.62163,1179.081100
2015-06-01 03:00:00,,9.79,34,1014.6,0.0,0.0,0.0,0,0,0,...,169.16,19.35,170.36,17.94,169.59,18.36,75.632314,80.290352,259.32746,213.976410
2015-06-01 04:00:00,,9.08,36,1015.1,0.0,0.0,0.0,0,0,0,...,169.16,19.42,169.32,17.30,167.99,16.20,85.944587,87.943014,266.32075,3.745139
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2015-08-30 20:00:00,,31.20,23,1015.3,0.0,0.0,4.5,15,0,0,...,180.83,26.65,181.55,24.84,180.83,25.92,22.481375,35.190516,211.22663,2239.977200
2015-08-30 21:00:00,,31.87,21,1014.7,0.0,0.0,6.3,21,0,0,...,185.71,27.19,186.84,25.33,185.71,28.08,35.371488,42.727790,231.66136,1721.691900
2015-08-30 22:00:00,,31.83,20,1014.1,0.0,0.0,4.8,16,0,0,...,194.66,26.21,195.95,24.38,196.29,27.00,49.462203,52.491240,246.85338,1197.756700
2015-08-30 23:00:00,,31.59,19,1013.7,0.0,0.0,6.3,21,0,0,...,199.54,26.08,200.19,23.81,200.35,24.48,63.942621,63.334084,258.81408,466.430230


#### Partial string indexing works on a DataFrame with a single **DatetimeIndex**, as shown above, and also on a DataFrame with a **MultiIndex**

#### UTC Offset is also honored by string indexing

In [61]:
df3 = pd.DataFrame([0], index=pd.DatetimeIndex(["2019-01-01"], tz="US/Pacific"))

In [62]:
df3

Unnamed: 0,0
2019-01-01 00:00:00-08:00,0


In [63]:
df3["2019-01-01 12:00:00+04:00":"2019-01-01 13:00:00+04:00"]

Unnamed: 0,0
2019-01-01 00:00:00-08:00,0


### Slice vs. exact match


#### The same string used as an indexing parameter can be treated either as a slice or as an exact match depending on the resolution of the index. If the string is less accurate than the index, it will be treated as a slice, otherwise as an exact match.



#### Now, let us consider a **Series** object with a minute resolution index

In [64]:
series_minute = pd.Series(
    [1, 2, 3],
    pd.DatetimeIndex(
        ["2011-12-31 23:59:00", "2012-01-01 00:00:00", "2012-01-01 00:02:00"]
    ),
)

In [65]:
series_minute

2011-12-31 23:59:00    1
2012-01-01 00:00:00    2
2012-01-01 00:02:00    3
dtype: int64

In [66]:
series_minute.index.resolution

'minute'

#### Now if we want to access a timestamp with a timestamp string less accurate than a minute, it returns a **Series** object.

In [67]:
series_minute["2011-12-31 23"]


2011-12-31 23:59:00    1
dtype: int64

#### A timestamp string with minute resolution (or more accurate) gives a scalar instead, i.e. it is not cast to a slice.

In [68]:
series_minute["2011-12-31 23:59"]

1

In [69]:
series_minute["2011-12-31 23:59:00"]

1

#### If the index resolution is **"SECOND"**, then the minute-accurate timestamp gives a **Series**.

In [70]:
series_second = pd.Series(
    [1, 2, 3],
    pd.DatetimeIndex(
        ["2011-12-31 23:59:59", "2012-01-01 00:00:00", "2012-01-01 00:00:01"]
    ),
)

In [71]:
series_second

2011-12-31 23:59:59    1
2012-01-01 00:00:00    2
2012-01-01 00:00:01    3
dtype: int64

In [72]:
series_second.index.resolution

'second'

In [73]:
series_second["2011-12-31 23:59"]

2011-12-31 23:59:59    1
dtype: int64

#### If the timestamp string is treated as a slice, it can be used to index **DataFrame** with **.loc[]** as well.

In [74]:
dft_minute = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4, 5, 6]}, index=series_minute.index
)

In [75]:
dft_minute

Unnamed: 0,a,b
2011-12-31 23:59:00,1,4
2012-01-01 00:00:00,2,5
2012-01-01 00:02:00,3,6
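
#### A minimal sketch of the **.loc[]** case, rebuilding the same frame so that the cell is self-contained. An hour-accurate string is less accurate than the minute-resolution index, so it acts as a slice and returns a **DataFrame**; a minute-accurate string is an exact match and returns the row as a **Series**:

```python
import pandas as pd

idx = pd.DatetimeIndex(
    ["2011-12-31 23:59:00", "2012-01-01 00:00:00", "2012-01-01 00:02:00"]
)
dft_minute = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=idx)

# Hour-accurate string on a minute-resolution index: treated as a slice.
sliced = dft_minute.loc["2011-12-31 23"]

# Minute-accurate string: exact match, so .loc returns a single row.
row = dft_minute.loc["2011-12-31 23:59"]

print(type(sliced).__name__, type(row).__name__)
```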


#### A **Timestamp** string can be given with hour, minute, or second accuracy, as shown above.

#### Note also that the resolution of a **DatetimeIndex** cannot be less precise than day.

In [76]:
series_monthly = pd.Series(
    [1, 2, 3], pd.DatetimeIndex(["2011-12", "2012-01", "2012-02"])
)

In [77]:
series_monthly

2011-12-01    1
2012-01-01    2
2012-02-01    3
dtype: int64

In [78]:
series_monthly.index.resolution

'day'

In [79]:
series_monthly["2011-12"]  # returns Series

2011-12-01    1
dtype: int64

### Exact indexing

In [80]:
df_ser[datetime.datetime(2015,6,30):datetime.datetime(2015,8,31)]

Time
2015-06-30 00:00:00    0.262966
2015-06-30 01:00:00   -1.034580
2015-06-30 02:00:00   -1.418419
2015-06-30 03:00:00    1.165410
2015-06-30 04:00:00   -0.302133
                         ...   
2015-08-30 20:00:00    1.725081
2015-08-30 21:00:00    0.384302
2015-08-30 22:00:00   -1.123577
2015-08-30 23:00:00   -1.175073
2015-08-31 00:00:00    1.294957
Length: 1489, dtype: float64

In [81]:
df_ser[datetime.datetime(2015,6,30,10,0,0):datetime.datetime(2015,8,31,12,0,0)]

Time
2015-06-30 10:00:00    0.621268
2015-06-30 11:00:00   -1.018289
2015-06-30 12:00:00    0.716484
2015-06-30 13:00:00    0.639723
2015-06-30 14:00:00   -0.295461
                         ...   
2015-08-31 08:00:00    1.750968
2015-08-31 09:00:00   -1.361725
2015-08-31 10:00:00    0.715409
2015-08-31 11:00:00    3.007038
2015-08-31 12:00:00   -1.657371
Length: 1491, dtype: float64

### Truncating & fancy indexing

#### A **truncate()** convenience function is provided that is similar to slicing. Note that **truncate** assumes a **0** value for any unspecified date component in a **DatetimeIndex** in contrast to slicing which returns any partially matching dates

In [82]:
df_ser.truncate(before = "2015-07", after ="2015-09")

Time
2015-07-01 00:00:00    0.469198
2015-07-01 01:00:00    0.491168
2015-07-01 02:00:00   -0.996886
2015-07-01 03:00:00   -0.227539
2015-07-01 04:00:00   -0.758910
                         ...   
2015-08-31 20:00:00    1.087267
2015-08-31 21:00:00    0.112531
2015-08-31 22:00:00    0.202905
2015-08-31 23:00:00    0.147924
2015-09-01 00:00:00   -0.497387
Length: 1489, dtype: float64

In [83]:
df_ser["2015-07":"2015-09"]

Time
2015-07-01 00:00:00    0.469198
2015-07-01 01:00:00    0.491168
2015-07-01 02:00:00   -0.996886
2015-07-01 03:00:00   -0.227539
2015-07-01 04:00:00   -0.758910
                         ...   
2015-09-30 19:00:00   -0.190716
2015-09-30 20:00:00    1.490352
2015-09-30 21:00:00    2.357368
2015-09-30 22:00:00   -0.159341
2015-09-30 23:00:00    0.276011
Length: 2208, dtype: float64

#### With the same bounds, **truncate** and slicing return series of different lengths (1489 rows versus 2208 rows). **truncate** assumes a 0 value for any unspecified date component, so "2015-09" is read as 2015-09-01 00:00:00 and the result stops at the first timestamp of September. Slicing instead includes every timestamp that falls in September.
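
#### The difference between **truncate** and partial string slicing can be reproduced on a small synthetic series. This is only a sketch: the hourly index below is a hypothetical stand-in for **df_ser**, covering July through September 2015.

```python
import pandas as pd

# Hypothetical hourly series covering July through September 2015.
idx = pd.date_range("2015-07-01", "2015-09-30 23:00", freq=pd.offsets.Hour())
s = pd.Series(range(len(idx)), index=idx)

# truncate reads "2015-09" as 2015-09-01 00:00:00, so it stops there.
truncated = s.truncate(before="2015-07", after="2015-09")

# Partial string slicing keeps every timestamp that falls in September.
sliced = s.loc["2015-07":"2015-09"]

print(len(truncated), truncated.index[-1])
print(len(sliced), sliced.index[-1])
```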

## Time/date components


#### There are several time/date properties that one can access from **Timestamp** or a collection of timestamps like a **DatetimeIndex**.

#### It can be found here - https://pandas.pydata.org/docs/user_guide/timeseries.html#time-date-components:~:text=Property,a%20leap%20year

#### We can obtain the year, week and day components of the ISO year from the ISO 8601 standard:

In [84]:
idx = pd.date_range(start="2019-12-29", freq="D", periods=4)

In [85]:
idx

DatetimeIndex(['2019-12-29', '2019-12-30', '2019-12-31', '2020-01-01'], dtype='datetime64[ns]', freq='D')

In [86]:
idx.isocalendar()

Unnamed: 0,year,week,day
2019-12-29,2019,52,7
2019-12-30,2020,1,1
2019-12-31,2020,1,2
2020-01-01,2020,1,3


In [87]:
df_ser.index.isocalendar()

Unnamed: 0_level_0,year,week,day
Time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
2015-05-02 00:00:00,2015,18,6
2015-05-02 01:00:00,2015,18,6
2015-05-02 02:00:00,2015,18,6
2015-05-02 03:00:00,2015,18,6
2015-05-02 04:00:00,2015,18,6
...,...,...,...
2015-10-24 08:00:00,2015,43,6
2015-10-24 09:00:00,2015,43,6
2015-10-24 10:00:00,2015,43,6
2015-10-24 11:00:00,2015,43,6


## DateOffset objects


#### The frequency **"freq"** parameter that we use to generate times or dates with a particular frequency maps to a **DateOffset** object or one of its subclasses. **DateOffset** is similar to **Timedelta**, but it follows calendar duration rules.

#### However, all **DateOffset** subclasses that are an hour or smaller (Hour, Minute, Second, Milli, Micro, Nano) behave like **Timedelta** and respect absolute time.

In [88]:
# Select a day that contains a daylight saving time transition
tp = pd.Timestamp("2016-10-30 00:00:00", tz="Europe/Helsinki")

In [89]:
tp

Timestamp('2016-10-30 00:00:00+0300', tz='Europe/Helsinki')

In [90]:
tp + pd.Timedelta(days=1) #respects absolute time

Timestamp('2016-10-30 23:00:00+0200', tz='Europe/Helsinki')

In [91]:
tp + pd.DateOffset(days=1) # Respects calendar time

Timestamp('2016-10-31 00:00:00+0200', tz='Europe/Helsinki')
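
#### The statement that hour-or-smaller offsets behave like **Timedelta** can be checked on the same timestamp. A small sketch, assuming the time zone database is available: **Hour(24)** adds 24 hours of elapsed time, while **DateOffset(days=1)** keeps the wall-clock time on the next calendar day.

```python
import pandas as pd

tp = pd.Timestamp("2016-10-30 00:00:00", tz="Europe/Helsinki")

# Hour is a fixed-length offset: 24 hours of elapsed time, like Timedelta.
by_hours = tp + pd.offsets.Hour(24)
by_timedelta = tp + pd.Timedelta(hours=24)

# DateOffset(days=1) keeps the wall-clock time on the next calendar day.
by_calendar = tp + pd.DateOffset(days=1)

print(by_hours, by_timedelta, by_calendar, sep="\n")
```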

#### Most **DateOffsets** have associated frequencies strings, or offset aliases, that can be passed into **freq** keyword arguments. The available date offsets and associated frequency strings can be found here - https://pandas.pydata.org/docs/user_guide/timeseries.html#dateoffset-objects:~:text=Date%20Offset,one%20nanosecond

### Parametric offsets

#### Some of the offsets that are used can be parameterized when created to result in different behaviours

#### For example, the **Week** offset for generating weekly data accepts a **weekday** parameter which results in the generated dates always lying on a particular day of the week:

In [92]:
d = datetime.datetime(2008, 8, 18, 9, 0)

In [93]:
d

datetime.datetime(2008, 8, 18, 9, 0)

In [94]:
d + pd.offsets.Week()

Timestamp('2008-08-25 09:00:00')

In [95]:
d + pd.offsets.Week(weekday=4)

Timestamp('2008-08-22 09:00:00')

In [96]:
d - pd.offsets.Week()

Timestamp('2008-08-11 09:00:00')

#### The **normalize** option will be effective for addition and subtraction. Normalize will generally reset the time to midnight.

In [97]:
d + pd.offsets.Week(normalize=True)

Timestamp('2008-08-25 00:00:00')

In [98]:
d - pd.offsets.Week(normalize=True)

Timestamp('2008-08-11 00:00:00')

### Using offsets with Series / DatetimeIndex


#### Offsets can be used with either a **Series** or **DatetimeIndex** to apply the offset to each element.

In [99]:
df_ser

Time
2015-05-02 00:00:00    0.932816
2015-05-02 01:00:00    0.011310
2015-05-02 02:00:00    1.720286
2015-05-02 03:00:00    0.078407
2015-05-02 04:00:00   -1.229160
                         ...   
2015-10-24 08:00:00    1.745733
2015-10-24 09:00:00    0.558440
2015-10-24 10:00:00   -2.085374
2015-10-24 11:00:00    0.510149
2015-10-24 12:00:00    1.029666
Length: 4213, dtype: float64

In [100]:
df_ser.index + pd.DateOffset(months = 2)

DatetimeIndex(['2015-07-02 00:00:00', '2015-07-02 01:00:00',
               '2015-07-02 02:00:00', '2015-07-02 03:00:00',
               '2015-07-02 04:00:00', '2015-07-02 05:00:00',
               '2015-07-02 06:00:00', '2015-07-02 07:00:00',
               '2015-07-02 08:00:00', '2015-07-02 09:00:00',
               ...
               '2015-12-24 03:00:00', '2015-12-24 04:00:00',
               '2015-12-24 05:00:00', '2015-12-24 06:00:00',
               '2015-12-24 07:00:00', '2015-12-24 08:00:00',
               '2015-12-24 09:00:00', '2015-12-24 10:00:00',
               '2015-12-24 11:00:00', '2015-12-24 12:00:00'],
              dtype='datetime64[ns]', name='Time', length=4213, freq=None)

In [101]:
df_ser.index - pd.DateOffset(months = 2)

DatetimeIndex(['2015-03-02 00:00:00', '2015-03-02 01:00:00',
               '2015-03-02 02:00:00', '2015-03-02 03:00:00',
               '2015-03-02 04:00:00', '2015-03-02 05:00:00',
               '2015-03-02 06:00:00', '2015-03-02 07:00:00',
               '2015-03-02 08:00:00', '2015-03-02 09:00:00',
               ...
               '2015-08-24 03:00:00', '2015-08-24 04:00:00',
               '2015-08-24 05:00:00', '2015-08-24 06:00:00',
               '2015-08-24 07:00:00', '2015-08-24 08:00:00',
               '2015-08-24 09:00:00', '2015-08-24 10:00:00',
               '2015-08-24 11:00:00', '2015-08-24 12:00:00'],
              dtype='datetime64[ns]', name='Time', length=4213, freq=None)

#### The same applies to a **DatetimeIndex** created with **date_range**:

In [102]:
rng = pd.date_range("2012-01-01", "2012-01-03")

In [103]:
rng + pd.DateOffset(months=2)

DatetimeIndex(['2012-03-01', '2012-03-02', '2012-03-03'], dtype='datetime64[ns]', freq=None)
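
#### The examples above apply the offset to a **DatetimeIndex**. A minimal sketch of the **Series** case, where the datetimes are the values rather than the index, might look like this:

```python
import pandas as pd

rng = pd.date_range("2012-01-01", "2012-01-03")
s = pd.Series(rng)  # a Series whose values (not index) are datetimes

# The offset is applied element-wise to every value.
shifted_forward = s + pd.DateOffset(months=2)
shifted_back = s - pd.DateOffset(months=2)

print(shifted_forward)
print(shifted_back)
```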

### Custom business days


#### The CDay or CustomBusinessDay class provides a parametric BusinessDay class which can be used to create customized business day calendars which account for local holidays and local weekend conventions

#### The **weekmask** lists the working days. Let's assume a working week from Sunday to Friday, so that only Saturday is a weekend day

In [104]:
weekmask_ss = "Sun Mon Tue Wed Thu Fri"

In [105]:
holidays = [
    "2022-05-03",
    datetime.datetime(2022, 5, 1),
    np.datetime64("2022-05-01"),
]


In [106]:
bday_ss = pd.offsets.CustomBusinessDay(
    holidays=holidays,
    weekmask=weekmask_ss,
)

In [107]:
dt = datetime.datetime(2022, 4, 30) #saturday

In [108]:
dt + 2 * bday_ss  # Wednesday, because Sunday (May 1) and Tuesday (May 3) are holidays

Timestamp('2022-05-04 00:00:00')
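
#### To see which days this kind of calendar treats as business days, the offset can be passed as **freq** to **date_range**. The sketch below rebuilds the calendar with the same weekmask and holidays, written as plain strings:

```python
import pandas as pd

# Same calendar as above: working days Sunday to Friday, two holidays.
bday = pd.offsets.CustomBusinessDay(
    holidays=["2022-05-01", "2022-05-03"],
    weekmask="Sun Mon Tue Wed Thu Fri",
)

# First three business days on or after Saturday 2022-04-30.
days = pd.date_range("2022-04-30", periods=3, freq=bday)
print(days.strftime("%Y-%m-%d").tolist())
```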

#### Holiday calendars can also be used to provide the list of holidays

### Business hour

#### The **BusinessHour** class provides a business hour representation on **BusinessDay**, allowing you to use specific start and end times.

#### By default, BusinessHour uses 9:00 - 17:00 as business hours

In [109]:
bh = pd.offsets.BusinessHour()

In [110]:
bh

<BusinessHour: BH=09:00-17:00>

In [111]:
pd.Timestamp("2014-08-01 10:00").weekday() #it is a friday

4

In [112]:
pd.Timestamp("2014-08-01 10:00") + bh

Timestamp('2014-08-01 11:00:00')

#### The example below gives the same result as pd.Timestamp('2014-08-01 09:00') + bh, because a time before opening is first moved to the start of business hours

In [113]:
pd.Timestamp("2014-08-01 08:00") + bh

Timestamp('2014-08-01 10:00:00')

#### If the result falls on the end time, it is moved to the start of the next business day


In [114]:
pd.Timestamp("2014-08-01 16:00") + bh

Timestamp('2014-08-04 09:00:00')

#### Any remainder is carried over to the next business day

In [115]:
pd.Timestamp("2014-08-01 16:30") + bh

Timestamp('2014-08-04 09:30:00')

#### Adding 2 business hours

In [116]:
pd.Timestamp("2014-08-01 10:00") + pd.offsets.BusinessHour(2)

Timestamp('2014-08-01 12:00:00')

#### Subtracting 3 business hours

In [117]:
pd.Timestamp("2014-08-01 10:00") + pd.offsets.BusinessHour(-3)

Timestamp('2014-07-31 15:00:00')

#### We can also specify the **start** and **end** times by keyword. Each argument must be a **str** in **hour:minute** format or a **datetime.time** instance. Specifying seconds, microseconds, or nanoseconds in a business hour raises a **ValueError**.

In [118]:
bh = pd.offsets.BusinessHour(start="11:00", end=datetime.time(23, 0))

In [119]:
bh

<BusinessHour: BH=11:00-23:00>

In [120]:
df.index[1]

Timestamp('2015-05-02 01:00:00')

In [121]:
df.index[1] + bh

Timestamp('2015-05-04 12:00:00')

In [122]:
df.index[2]

Timestamp('2015-05-02 02:00:00')

In [123]:
df.index[2] + bh

Timestamp('2015-05-04 12:00:00')

In [125]:
bh1 = pd.offsets.BusinessHour(start="11:00:30", end=datetime.time(23, 0))

ValueError: time data must match '%H:%M' format

### Custom business hour


#### The **CustomBusinessHour** is a mixture of **BusinessHour** and **CustomBusinessDay** which allows you to specify arbitrary holidays. **CustomBusinessHour** works the same as **BusinessHour** except that it skips the specified custom holidays.

In [None]:
from pandas.tseries.holiday import USFederalHolidayCalendar

In [None]:
bday_us = pd.offsets.CustomBusinessDay(calendar=USFederalHolidayCalendar())

In [None]:
dt = datetime.datetime(2014, 1, 17) #friday

In [None]:
dt + bday_us #gives tuesday because monday is a national holiday in US
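
#### The cells above use **CustomBusinessDay**. A short sketch of the corresponding **CustomBusinessHour** usage, assuming the default 09:00 to 17:00 business hours and the same US federal holiday calendar:

```python
import datetime

import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

bhour_us = pd.offsets.CustomBusinessHour(calendar=USFederalHolidayCalendar())

# Friday 15:00, just before the Martin Luther King Jr. Day weekend.
dt = datetime.datetime(2014, 1, 17, 15)

one_hour = dt + bhour_us       # still Friday, one business hour later
two_hours = dt + bhour_us * 2  # skips the weekend and Monday's holiday

print(one_hour, two_hours, sep="\n")
```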

### Offset Aliases

#### A number of string aliases are given to useful common time series frequencies. We will refer to these aliases as offset aliases.

#### It can be found here - https://pandas.pydata.org/docs/user_guide/timeseries.html#using-offsets-with-series-datetimeindex:~:text=Alias,nanoseconds

### Combining aliases

#### As we have seen previously, the alias and the offset instance are fungible in most functions:



In [None]:
pd.date_range(start, periods=5, freq="B")

In [None]:
pd.date_range(start, periods=5, freq=pd.offsets.BDay())


#### After combining aliases

In [None]:
pd.date_range(start, periods=10, freq="2h20min")

In [None]:
pd.date_range(start, periods=10, freq="1D10U")
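
#### A self-contained sketch of the two ideas above. The value of **start** here is a hypothetical date chosen only for illustration; it is not taken from the notebook.

```python
import datetime

import pandas as pd

start = datetime.datetime(2011, 1, 1)  # hypothetical start date for illustration

# A frequency alias and the matching offset instance give the same range.
by_alias = pd.date_range(start, periods=5, freq="B")
by_offset = pd.date_range(start, periods=5, freq=pd.offsets.BDay())

# Combined alias: each step is 2 hours plus 20 minutes.
combined = pd.date_range(start, periods=3, freq="2h20min")

print(by_alias.equals(by_offset))
print(combined)
```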

### Anchored offsets

#### For some frequencies you can specify an anchoring suffix respective to the days you want:

#### It can be found here - https://pandas.pydata.org/docs/user_guide/timeseries.html#using-offsets-with-series-datetimeindex:~:text=an%20anchoring%20suffix%3A-,Alias,annual%20frequency%2C%20anchored%20end%20of%20November,-These%20can%20be

### Anchored offset semantics

#### For those offsets that are anchored to the start or end of a specific frequency (MonthEnd, MonthBegin, WeekEnd, etc), the following rules apply to rolling forward and backwards.

#### 1. When n is not 0, if the given date is not on an anchor point, it is snapped to the next (previous) anchor point, and moved |n|-1 additional steps forwards or backwards.
#### 2. If the given date is on an anchor point, it is moved |n| points forwards or backwards.
#### 3. For the case when n=0, the date is not moved if on an anchor point, otherwise it is rolled forward to the next anchor point.

In [None]:
pd.Timestamp("2014-01-02") + pd.offsets.MonthBegin(n=1)

In [None]:
pd.Timestamp("2014-01-02") - pd.offsets.MonthBegin(n=1)

In [None]:
pd.Timestamp("2014-01-02") + pd.offsets.MonthBegin(n=4)

In [None]:
pd.Timestamp("2014-01-02") + pd.offsets.MonthEnd(n=1)

In [None]:
pd.Timestamp("2014-01-02") - pd.offsets.MonthEnd(n=1)

#### On anchor points

In [None]:
pd.Timestamp("2014-01-01") + pd.offsets.MonthBegin(n=1)

In [None]:
pd.Timestamp("2014-01-01") - pd.offsets.MonthBegin(n=1)

In [None]:
pd.Timestamp("2014-01-01") + pd.offsets.MonthBegin(n=4)

In [None]:
pd.Timestamp("2014-01-31") + pd.offsets.MonthEnd(n=1)

In [None]:
pd.Timestamp("2014-01-31") - pd.offsets.MonthEnd(n=1)

#### n = 0

In [None]:
pd.Timestamp("2014-01-02") + pd.offsets.MonthBegin(n=0)

In [None]:
pd.Timestamp("2014-01-01") + pd.offsets.MonthBegin(n=0)
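
#### The three rolling rules can be checked together in one self-contained sketch. The dates in the comments are the values these rules imply for each case:

```python
import pandas as pd

MonthBegin, MonthEnd = pd.offsets.MonthBegin, pd.offsets.MonthEnd
off_anchor = pd.Timestamp("2014-01-02")  # neither a month start nor a month end
on_anchor = pd.Timestamp("2014-01-01")   # a month start

results = {
    # Rule 1: snap to the next/previous anchor, then move |n| - 1 more steps.
    "off + MonthBegin(1)": off_anchor + MonthBegin(n=1),  # 2014-02-01
    "off - MonthBegin(1)": off_anchor - MonthBegin(n=1),  # 2014-01-01
    "off + MonthBegin(4)": off_anchor + MonthBegin(n=4),  # 2014-05-01
    "off + MonthEnd(1)": off_anchor + MonthEnd(n=1),      # 2014-01-31
    "off - MonthEnd(1)": off_anchor - MonthEnd(n=1),      # 2013-12-31
    # Rule 2: already on an anchor, so move |n| full steps.
    "on + MonthBegin(1)": on_anchor + MonthBegin(n=1),    # 2014-02-01
    "on - MonthBegin(1)": on_anchor - MonthBegin(n=1),    # 2013-12-01
    # Rule 3: n=0 rolls forward only when the date is off the anchor.
    "off + MonthBegin(0)": off_anchor + MonthBegin(n=0),  # 2014-02-01
    "on + MonthBegin(0)": on_anchor + MonthBegin(n=0),    # 2014-01-01
}

for label, value in results.items():
    print(f"{label:>20} -> {value.date()}")
```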

## Time Series-related instance methods

### Shifting / lagging

#### We may want to shift or lag the values in a time series back and forward in time. The method for this is **shift()**.

In [126]:
df_ser

Time
2015-05-02 00:00:00    0.932816
2015-05-02 01:00:00    0.011310
2015-05-02 02:00:00    1.720286
2015-05-02 03:00:00    0.078407
2015-05-02 04:00:00   -1.229160
                         ...   
2015-10-24 08:00:00    1.745733
2015-10-24 09:00:00    0.558440
2015-10-24 10:00:00   -2.085374
2015-10-24 11:00:00    0.510149
2015-10-24 12:00:00    1.029666
Length: 4213, dtype: float64

In [127]:
df_ser1 = df_ser[:5]

In [128]:
df_ser1

Time
2015-05-02 00:00:00    0.932816
2015-05-02 01:00:00    0.011310
2015-05-02 02:00:00    1.720286
2015-05-02 03:00:00    0.078407
2015-05-02 04:00:00   -1.229160
dtype: float64

In [129]:
df_ser1.shift(1)

Time
2015-05-02 00:00:00         NaN
2015-05-02 01:00:00    0.932816
2015-05-02 02:00:00    0.011310
2015-05-02 03:00:00    1.720286
2015-05-02 04:00:00    0.078407
dtype: float64

#### If we want to change the date by 1 day then we have to specify the freq parameter.

In [130]:
df_ser1.shift(1, freq = "D")

Time
2015-05-03 00:00:00    0.932816
2015-05-03 01:00:00    0.011310
2015-05-03 02:00:00    1.720286
2015-05-03 03:00:00    0.078407
2015-05-03 04:00:00   -1.229160
dtype: float64

#### Here we can observe that the data associated with each time did not change. When **freq** is specified, the **shift** method changes all the dates in the index rather than changing the alignment of the data and the index.

### Frequency conversion

#### The primary function for changing frequencies is the **asfreq()** method. For a **DatetimeIndex**, this is basically just a thin, but convenient wrapper around **reindex()** which generates a **date_range** and calls **reindex**.

In [131]:
dr = pd.date_range("1/1/2010", periods=3, freq=3 * pd.offsets.BDay())

In [132]:
ts = pd.Series(np.random.randn(3), index=dr)

In [133]:
ts

2010-01-01   -1.066885
2010-01-06   -0.997949
2010-01-11    1.217910
Freq: 3B, dtype: float64

In [134]:
ts.asfreq(pd.offsets.BDay())

2010-01-01   -1.066885
2010-01-04         NaN
2010-01-05         NaN
2010-01-06   -0.997949
2010-01-07         NaN
2010-01-08         NaN
2010-01-11    1.217910
Freq: B, dtype: float64

#### **asfreq** also accepts a **method** argument that fills the gaps which may appear after frequency conversion, for example by padding with the last valid value. This is similar to the **fillna()** method.

In [135]:
ts.asfreq(pd.offsets.BDay(), method="pad")

2010-01-01   -1.066885
2010-01-04   -1.066885
2010-01-05   -1.066885
2010-01-06   -0.997949
2010-01-07   -0.997949
2010-01-08   -0.997949
2010-01-11    1.217910
Freq: B, dtype: float64
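
#### The description of **asfreq()** as a thin wrapper around **reindex()** can be illustrated by doing the two steps by hand. This is a rough sketch of the idea, not pandas' actual implementation, and it uses fixed values instead of random ones so that the result is reproducible:

```python
import pandas as pd

bday = pd.offsets.BDay()
dr = pd.date_range("2010-01-01", periods=3, freq=3 * bday)
ts = pd.Series([1.0, 2.0, 3.0], index=dr)  # fixed values instead of random ones

via_asfreq = ts.asfreq(bday)

# Roughly what asfreq does: build the full range, then reindex onto it.
full_range = pd.date_range(ts.index[0], ts.index[-1], freq=bday)
via_reindex = ts.reindex(full_range)

# method="pad" forward-fills the gaps created by the conversion.
padded = ts.asfreq(bday, method="pad")

print(via_asfreq.equals(via_reindex))
print(padded)
```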

## Resampling

#### resample() is like a time-based groupby, followed by a reduction method on each of its groups. 



In [136]:
rng = pd.date_range("1/1/2012", periods=100, freq="S")

In [137]:
rng

DatetimeIndex(['2012-01-01 00:00:00', '2012-01-01 00:00:01',
               '2012-01-01 00:00:02', '2012-01-01 00:00:03',
               '2012-01-01 00:00:04', '2012-01-01 00:00:05',
               '2012-01-01 00:00:06', '2012-01-01 00:00:07',
               '2012-01-01 00:00:08', '2012-01-01 00:00:09',
               '2012-01-01 00:00:10', '2012-01-01 00:00:11',
               '2012-01-01 00:00:12', '2012-01-01 00:00:13',
               '2012-01-01 00:00:14', '2012-01-01 00:00:15',
               '2012-01-01 00:00:16', '2012-01-01 00:00:17',
               '2012-01-01 00:00:18', '2012-01-01 00:00:19',
               '2012-01-01 00:00:20', '2012-01-01 00:00:21',
               '2012-01-01 00:00:22', '2012-01-01 00:00:23',
               '2012-01-01 00:00:24', '2012-01-01 00:00:25',
               '2012-01-01 00:00:26', '2012-01-01 00:00:27',
               '2012-01-01 00:00:28', '2012-01-01 00:00:29',
               '2012-01-01 00:00:30', '2012-01-01 00:00:31',
               '2012-01-

In [138]:
ts = pd.Series(np.random.randint(0, 500, len(rng)), index=rng)

In [139]:
ts

2012-01-01 00:00:00    294
2012-01-01 00:00:01    259
2012-01-01 00:00:02    467
2012-01-01 00:00:03     51
2012-01-01 00:00:04    227
                      ... 
2012-01-01 00:01:35    466
2012-01-01 00:01:36    163
2012-01-01 00:01:37    469
2012-01-01 00:01:38    197
2012-01-01 00:01:39    377
Freq: S, Length: 100, dtype: int32

In [140]:
ts.resample("5Min").sum()

2012-01-01    23324
Freq: 5T, dtype: int32

In [141]:
ts.resample("1Min").sum()

2012-01-01 00:00:00    13977
2012-01-01 00:01:00     9347
Freq: T, dtype: int32

In [142]:
ts.resample("5Min").mean()

2012-01-01    233.24
Freq: 5T, dtype: float64

In [143]:
ts.resample("5Min").ohlc()

Unnamed: 0,open,high,low,close
2012-01-01,294,498,6,377


In [144]:
ts.resample("5Min").max()

2012-01-01    498
Freq: 5T, dtype: int32
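
#### The "time-based groupby" view of **resample()** can be made concrete by comparing it with an explicit **groupby** on floored timestamps. A minimal sketch with deterministic values; it uses the lowercase aliases "s" and "min", which recent pandas versions prefer over "S" and "T":

```python
import numpy as np
import pandas as pd

rng = pd.date_range("2012-01-01", periods=100, freq="s")
ts = pd.Series(np.arange(100), index=rng)  # deterministic values for the sketch

# Downsample to one-minute bins and sum each bin.
resampled = ts.resample("1min").sum()

# The same reduction as a groupby on the timestamps floored to the minute.
grouped = ts.groupby(ts.index.floor("min")).sum()

print(resampled)
print(grouped)
```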

### Upsampling

#### For upsampling, we can specify a way to upsample and the **limit** parameter to interpolate over the gaps that are created:


In [145]:
ts[:2].resample("250L").asfreq() # from secondly to every 250 milliseconds

2012-01-01 00:00:00.000    294.0
2012-01-01 00:00:00.250      NaN
2012-01-01 00:00:00.500      NaN
2012-01-01 00:00:00.750      NaN
2012-01-01 00:00:01.000    259.0
Freq: 250L, dtype: float64

In [146]:
ts[:2].resample("250L").ffill()

2012-01-01 00:00:00.000    294
2012-01-01 00:00:00.250    294
2012-01-01 00:00:00.500    294
2012-01-01 00:00:00.750    294
2012-01-01 00:00:01.000    259
Freq: 250L, dtype: int32

In [147]:
ts[:2].resample("250L").ffill(limit=2)

2012-01-01 00:00:00.000    294.0
2012-01-01 00:00:00.250    294.0
2012-01-01 00:00:00.500    294.0
2012-01-01 00:00:00.750      NaN
2012-01-01 00:00:01.000    259.0
Freq: 250L, dtype: float64

### Aggregation

In [148]:
dfnew = df.drop("randTime", axis = 1)

In [149]:
dfnew

Unnamed: 0_level_0,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,mean_sea_level_pressure_MSL,total_precipitation_sfc,snowfall_amount_sfc,total_cloud_cover_sfc,high_cloud_cover_high_cld_lay,medium_cloud_cover_mid_cld_lay,low_cloud_cover_low_cld_lay,shortwave_radiation_backwards_sfc,...,wind_direction_10_m_above_gnd,wind_speed_80_m_above_gnd,wind_direction_80_m_above_gnd,wind_speed_900_mb,wind_direction_900_mb,wind_gust_10_m_above_gnd,angle_of_incidence,zenith,azimuth,generated_power_kw
Time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2015-05-02 00:00:00,2.17,31,1035.0,0.0,0.0,0.0,0,0,0,0.00,...,312.71,9.36,22.62,6.62,337.62,24.48,58.753108,83.237322,128.33543,454.100950
2015-05-02 01:00:00,2.31,27,1035.1,0.0,0.0,0.0,0,0,0,1.78,...,294.78,5.99,32.74,4.61,321.34,21.96,45.408585,75.143041,139.65530,1411.999400
2015-05-02 02:00:00,3.65,33,1035.4,0.0,0.0,0.0,0,0,0,108.58,...,270.00,3.89,56.31,3.76,286.70,14.04,32.848282,68.820648,152.53769,2214.849300
2015-05-02 03:00:00,5.82,30,1035.4,0.0,0.0,0.0,0,0,0,258.10,...,323.13,3.55,23.96,3.08,339.44,19.80,22.699288,64.883536,166.90159,2527.609200
2015-05-02 04:00:00,7.73,27,1034.4,0.0,0.0,0.0,0,0,0,375.58,...,10.01,6.76,25.20,6.62,22.38,16.56,19.199908,63.795208,182.13526,2640.203400
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2015-10-24 08:00:00,8.69,66,1025.1,0.0,0.0,100.0,100,100,100,257.21,...,285.42,25.50,283.06,24.01,282.99,25.20,22.700907,64.952098,167.06794,173.410560
2015-10-24 09:00:00,7.57,90,1026.1,0.0,0.0,100.0,79,100,100,210.04,...,6.91,13.00,4.76,11.61,7.13,10.44,19.307487,63.895118,182.28247,403.792220
2015-10-24 10:00:00,7.27,90,1026.3,0.1,0.0,100.0,73,100,100,113.92,...,2.29,8.71,352.87,7.95,354.81,22.32,25.249506,65.827032,197.33868,158.367780
2015-10-24 11:00:00,8.25,81,1025.5,0.0,0.0,100.0,74,66,100,186.90,...,11.31,6.29,13.24,5.94,14.04,15.48,36.361161,70.496313,211.30027,126.953060


In [150]:
rs = dfnew.resample("2H")

In [151]:
rs.mean()

Unnamed: 0_level_0,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,mean_sea_level_pressure_MSL,total_precipitation_sfc,snowfall_amount_sfc,total_cloud_cover_sfc,high_cloud_cover_high_cld_lay,medium_cloud_cover_mid_cld_lay,low_cloud_cover_low_cld_lay,shortwave_radiation_backwards_sfc,...,wind_direction_10_m_above_gnd,wind_speed_80_m_above_gnd,wind_direction_80_m_above_gnd,wind_speed_900_mb,wind_direction_900_mb,wind_gust_10_m_above_gnd,angle_of_incidence,zenith,azimuth,generated_power_kw
Time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2015-05-02 00:00:00,2.240,29.0,1035.05,0.00,0.0,0.0,0.0,0.0,0.0,0.890,...,303.745,7.675,27.680,5.615,329.480,23.22,52.080846,79.190181,133.995365,933.050175
2015-05-02 02:00:00,4.735,31.5,1035.40,0.00,0.0,0.0,0.0,0.0,0.0,183.340,...,296.565,3.720,40.135,3.420,313.070,16.92,27.773785,66.852092,159.719640,2371.229250
2015-05-02 04:00:00,8.210,28.0,1034.50,0.00,0.0,0.0,0.0,0.0,0.0,412.515,...,16.820,6.930,27.835,6.770,25.140,16.92,22.144037,64.748034,189.677940,2593.142500
2015-05-02 06:00:00,9.895,27.5,1034.05,0.00,0.0,0.0,0.0,0.0,0.0,453.455,...,19.880,7.120,24.325,6.980,21.635,11.88,42.635261,73.790139,217.435315,1667.075450
2015-05-02 08:00:00,7.960,39.5,1034.50,0.00,0.0,0.0,0.0,0.0,0.0,291.030,...,183.290,11.375,3.795,10.835,3.955,11.88,84.118850,103.524295,225.285320,48.242139
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2015-10-24 04:00:00,1.930,96.5,1026.60,0.00,0.0,40.0,23.0,40.0,0.0,1.335,...,230.145,18.140,249.420,14.730,225.050,25.74,65.417816,87.871359,123.485335,71.981529
2015-10-24 06:00:00,6.260,75.0,1026.70,0.00,0.0,100.0,61.0,51.0,100.0,123.710,...,279.215,13.830,281.890,9.395,274.935,12.06,39.045016,72.004143,146.268535,456.446805
2015-10-24 08:00:00,8.130,78.0,1025.60,0.00,0.0,100.0,89.5,100.0,100.0,233.625,...,146.165,19.250,143.910,17.810,145.060,17.82,21.004197,64.423608,174.675205,288.601390
2015-10-24 10:00:00,7.760,85.5,1025.90,0.05,0.0,100.0,73.5,83.0,100.0,150.410,...,6.800,7.500,183.055,6.945,184.425,18.90,30.805334,68.161673,204.319475,142.660420


In [152]:
rs["temperature_2_m_above_gnd"].mean()

Time
2015-05-02 00:00:00    2.240
2015-05-02 02:00:00    4.735
2015-05-02 04:00:00    8.210
2015-05-02 06:00:00    9.895
2015-05-02 08:00:00    7.960
                       ...  
2015-10-24 04:00:00    1.930
2015-10-24 06:00:00    6.260
2015-10-24 08:00:00    8.130
2015-10-24 10:00:00    7.760
2015-10-24 12:00:00    8.000
Freq: 2H, Name: temperature_2_m_above_gnd, Length: 2107, dtype: float64

In [153]:
rs[["temperature_2_m_above_gnd","relative_humidity_2_m_above_gnd"]].mean()

Unnamed: 0_level_0,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd
Time,Unnamed: 1_level_1,Unnamed: 2_level_1
2015-05-02 00:00:00,2.240,29.0
2015-05-02 02:00:00,4.735,31.5
2015-05-02 04:00:00,8.210,28.0
2015-05-02 06:00:00,9.895,27.5
2015-05-02 08:00:00,7.960,39.5
...,...,...
2015-10-24 04:00:00,1.930,96.5
2015-10-24 06:00:00,6.260,75.0
2015-10-24 08:00:00,8.130,78.0
2015-10-24 10:00:00,7.760,85.5


In [154]:
rs["temperature_2_m_above_gnd"].agg([np.sum, np.mean, np.std])

Unnamed: 0_level_0,sum,mean,std
Time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
2015-05-02 00:00:00,4.48,2.240,0.098995
2015-05-02 02:00:00,9.47,4.735,1.534422
2015-05-02 04:00:00,16.42,8.210,0.678823
2015-05-02 06:00:00,19.79,9.895,0.247487
2015-05-02 08:00:00,15.92,7.960,2.008183
...,...,...,...
2015-10-24 04:00:00,3.86,1.930,1.697056
2015-10-24 06:00:00,12.52,6.260,1.767767
2015-10-24 08:00:00,16.26,8.130,0.791960
2015-10-24 10:00:00,15.52,7.760,0.692965


In [155]:
rs[["temperature_2_m_above_gnd","relative_humidity_2_m_above_gnd"]].agg([np.sum, np.mean, np.std])

Unnamed: 0_level_0,temperature_2_m_above_gnd,temperature_2_m_above_gnd,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,relative_humidity_2_m_above_gnd,relative_humidity_2_m_above_gnd
Unnamed: 0_level_1,sum,mean,std,sum,mean,std
Time,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2
2015-05-02 00:00:00,4.48,2.240,0.098995,58,29.0,2.828427
2015-05-02 02:00:00,9.47,4.735,1.534422,63,31.5,2.121320
2015-05-02 04:00:00,16.42,8.210,0.678823,56,28.0,1.414214
2015-05-02 06:00:00,19.79,9.895,0.247487,55,27.5,0.707107
2015-05-02 08:00:00,15.92,7.960,2.008183,79,39.5,10.606602
...,...,...,...,...,...,...
2015-10-24 04:00:00,3.86,1.930,1.697056,193,96.5,3.535534
2015-10-24 06:00:00,12.52,6.260,1.767767,150,75.0,4.242641
2015-10-24 08:00:00,16.26,8.130,0.791960,156,78.0,16.970563
2015-10-24 10:00:00,15.52,7.760,0.692965,171,85.5,6.363961


In [156]:
rs.agg([np.sum, np.mean])

Unnamed: 0_level_0,temperature_2_m_above_gnd,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,relative_humidity_2_m_above_gnd,mean_sea_level_pressure_MSL,mean_sea_level_pressure_MSL,total_precipitation_sfc,total_precipitation_sfc,snowfall_amount_sfc,snowfall_amount_sfc,...,wind_gust_10_m_above_gnd,wind_gust_10_m_above_gnd,angle_of_incidence,angle_of_incidence,zenith,zenith,azimuth,azimuth,generated_power_kw,generated_power_kw
Unnamed: 0_level_1,sum,mean,sum,mean,sum,mean,sum,mean,sum,mean,...,sum,mean,sum,mean,sum,mean,sum,mean,sum,mean
Time,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2
2015-05-02 00:00:00,4.48,2.240,58,29.0,2070.1,1035.05,0.0,0.00,0.0,0.0,...,46.44,23.22,104.161693,52.080846,158.380363,79.190181,267.99073,133.995365,1866.100350,933.050175
2015-05-02 02:00:00,9.47,4.735,63,31.5,2070.8,1035.40,0.0,0.00,0.0,0.0,...,33.84,16.92,55.547570,27.773785,133.704184,66.852092,319.43928,159.719640,4742.458500,2371.229250
2015-05-02 04:00:00,16.42,8.210,56,28.0,2069.0,1034.50,0.0,0.00,0.0,0.0,...,33.84,16.92,44.288075,22.144037,129.496068,64.748034,379.35588,189.677940,5186.285000,2593.142500
2015-05-02 06:00:00,19.79,9.895,55,27.5,2068.1,1034.05,0.0,0.00,0.0,0.0,...,23.76,11.88,85.270522,42.635261,147.580277,73.790139,434.87063,217.435315,3334.150900,1667.075450
2015-05-02 08:00:00,15.92,7.960,79,39.5,2069.0,1034.50,0.0,0.00,0.0,0.0,...,23.76,11.88,168.237700,84.118850,207.048591,103.524295,450.57064,225.285320,96.484278,48.242139
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2015-10-24 04:00:00,3.86,1.930,193,96.5,2053.2,1026.60,0.0,0.00,0.0,0.0,...,51.48,25.74,130.835631,65.417816,175.742718,87.871359,246.97067,123.485335,143.963059,71.981529
2015-10-24 06:00:00,12.52,6.260,150,75.0,2053.4,1026.70,0.0,0.00,0.0,0.0,...,24.12,12.06,78.090031,39.045016,144.008286,72.004143,292.53707,146.268535,912.893610,456.446805
2015-10-24 08:00:00,16.26,8.130,156,78.0,2051.2,1025.60,0.0,0.00,0.0,0.0,...,35.64,17.82,42.008394,21.004197,128.847216,64.423608,349.35041,174.675205,577.202780,288.601390
2015-10-24 10:00:00,15.52,7.760,171,85.5,2051.8,1025.90,0.1,0.05,0.0,0.0,...,37.80,18.90,61.610667,30.805334,136.323345,68.161673,408.63895,204.319475,285.320840,142.660420


In [157]:
rs.agg({"temperature_2_m_above_gnd": np.sum, "relative_humidity_2_m_above_gnd": lambda x: np.std(x, ddof=1)})

Unnamed: 0_level_0,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd
Time,Unnamed: 1_level_1,Unnamed: 2_level_1
2015-05-02 00:00:00,4.48,2.828427
2015-05-02 02:00:00,9.47,2.121320
2015-05-02 04:00:00,16.42,1.414214
2015-05-02 06:00:00,19.79,0.707107
2015-05-02 08:00:00,15.92,10.606602
...,...,...
2015-10-24 04:00:00,3.86,3.535534
2015-10-24 06:00:00,12.52,4.242641
2015-10-24 08:00:00,16.26,16.970563
2015-10-24 10:00:00,15.52,6.363961
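The aggregation calls above can also be written with string names instead of NumPy functions, which some newer pandas versions prefer (they may warn when `np.sum` or `np.mean` is passed to `agg`). The sketch below uses a tiny made-up frame (`toy`, with hypothetical columns `temp` and `hum`) rather than `dfnew`, so the numbers are illustrative only.

```python
import pandas as pd

# Toy hourly frame with made-up values, standing in for dfnew
idx = pd.date_range("2015-05-02", periods=4, freq="h")
toy = pd.DataFrame(
    {"temp": [1.0, 3.0, 5.0, 7.0], "hum": [10, 20, 30, 50]},
    index=idx,
)

toy_rs = toy.resample("2h")  # two hourly rows fall into each 2-hour bin

# A list of string names applies several reductions to one column
temp_stats = toy_rs["temp"].agg(["sum", "mean"])

# A dict applies a different reduction to each column;
# "std" uses the sample standard deviation (ddof=1)
per_column = toy_rs.agg({"temp": "sum", "hum": "std"})

print(temp_stats)
print(per_column)
```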


#### To resample on a datetime-like column instead of the index, pass the column name with the **on** keyword.

In [158]:
df

Unnamed: 0_level_0,randTime,temperature_2_m_above_gnd,relative_humidity_2_m_above_gnd,mean_sea_level_pressure_MSL,total_precipitation_sfc,snowfall_amount_sfc,total_cloud_cover_sfc,high_cloud_cover_high_cld_lay,medium_cloud_cover_mid_cld_lay,low_cloud_cover_low_cld_lay,...,wind_direction_10_m_above_gnd,wind_speed_80_m_above_gnd,wind_direction_80_m_above_gnd,wind_speed_900_mb,wind_direction_900_mb,wind_gust_10_m_above_gnd,angle_of_incidence,zenith,azimuth,generated_power_kw
Time,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2015-05-02 00:00:00,2022-07-31 00:00:00,2.17,31,1035.0,0.0,0.0,0.0,0,0,0,...,312.71,9.36,22.62,6.62,337.62,24.48,58.753108,83.237322,128.33543,454.100950
2015-05-02 01:00:00,2022-05-10 00:00:00,2.31,27,1035.1,0.0,0.0,0.0,0,0,0,...,294.78,5.99,32.74,4.61,321.34,21.96,45.408585,75.143041,139.65530,1411.999400
2015-05-02 02:00:00,NaT,3.65,33,1035.4,0.0,0.0,0.0,0,0,0,...,270.00,3.89,56.31,3.76,286.70,14.04,32.848282,68.820648,152.53769,2214.849300
2015-05-02 03:00:00,NaT,5.82,30,1035.4,0.0,0.0,0.0,0,0,0,...,323.13,3.55,23.96,3.08,339.44,19.80,22.699288,64.883536,166.90159,2527.609200
2015-05-02 04:00:00,asd,7.73,27,1034.4,0.0,0.0,0.0,0,0,0,...,10.01,6.76,25.20,6.62,22.38,16.56,19.199908,63.795208,182.13526,2640.203400
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2015-10-24 08:00:00,,8.69,66,1025.1,0.0,0.0,100.0,100,100,100,...,285.42,25.50,283.06,24.01,282.99,25.20,22.700907,64.952098,167.06794,173.410560
2015-10-24 09:00:00,,7.57,90,1026.1,0.0,0.0,100.0,79,100,100,...,6.91,13.00,4.76,11.61,7.13,10.44,19.307487,63.895118,182.28247,403.792220
2015-10-24 10:00:00,,7.27,90,1026.3,0.1,0.0,100.0,73,100,100,...,2.29,8.71,352.87,7.95,354.81,22.32,25.249506,65.827032,197.33868,158.367780
2015-10-24 11:00:00,,8.25,81,1025.5,0.0,0.0,100.0,74,66,100,...,11.31,6.29,13.24,5.94,14.04,15.48,36.361161,70.496313,211.30027,126.953060


In [159]:
df_S = pd.DataFrame(pd.to_datetime(df.loc[:,"randTime"], errors = "coerce"))

In [160]:
df_k = df["temperature_2_m_above_gnd"]

In [161]:
df_S = df_S.join(df_k)

In [162]:
df_S = df_S.resample("3H", on="randTime")[["temperature_2_m_above_gnd"]].sum()

In [163]:
df_S

Unnamed: 0_level_0,temperature_2_m_above_gnd
randTime,Unnamed: 1_level_1
2022-05-10 00:00:00,2.31
2022-05-10 03:00:00,0.00
2022-05-10 06:00:00,0.00
2022-05-10 09:00:00,0.00
2022-05-10 12:00:00,0.00
...,...
2022-07-30 12:00:00,0.00
2022-07-30 15:00:00,0.00
2022-07-30 18:00:00,0.00
2022-07-30 21:00:00,0.00
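A smaller, self-contained sketch of the same idea may make the output easier to read. The frame `events` and its columns `when` and `value` are made up for illustration. Note how a 3-hour bin that receives no rows still appears in the result with a sum of 0, which is why the table above is mostly zeros.

```python
import pandas as pd

# Made-up events with a datetime column that is NOT the index
events = pd.DataFrame(
    {
        "when": pd.to_datetime(
            ["2022-05-10 00:30", "2022-05-10 01:15", "2022-05-10 07:00"]
        ),
        "value": [1, 2, 4],
    }
)

# Bin by the "when" column in 3-hour windows and sum each bin
binned = events.resample("3h", on="when")["value"].sum()
print(binned)
```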


## Time span representation


#### Regular intervals of time are represented by **Period** objects in pandas, while sequences of **Period** objects are collected in a **PeriodIndex**, which can be created with the convenience function **period_range**.

### Period

#### A **Period** represents a span of time; the length of the span is set with the **freq** keyword.

In [164]:
pd.Period("2012", freq="A-DEC")
# annual frequency, anchored at the end of December; same as "A"

Period('2012', 'A-DEC')

In [165]:
pd.Period("2012-1-1", freq="D") #D: calendar day frequency

Period('2012-01-01', 'D')

In [166]:
pd.Period("2012-1-1 19:00", freq="H") #H: hourly frequency

Period('2012-01-01 19:00', 'H')

In [167]:
pd.Period("2012-1-1 19:00", freq="5H") #5hour frequency

Period('2012-01-01 19:00', '5H')

#### Adding and subtracting integers from periods shifts the period by its own frequency. Arithmetic is not allowed between **Period** with different **freq** (span).

In [168]:
p = pd.Period("2012", freq="A-DEC")

In [169]:
p + 1

Period('2013', 'A-DEC')

In [170]:
p - 3

Period('2009', 'A-DEC')

In [171]:
p = pd.Period("2012-01", freq="2M")

In [172]:
p + 2

Period('2012-05', '2M')

In [173]:
p - 1

Period('2011-11', '2M')

In [174]:
p == pd.Period("2012-01", freq="3M")

False
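The rule that arithmetic is not allowed between periods of different frequency is not exercised by the cells above, so here is a minimal sketch of it. The variable names are illustrative. The error pandas raises in this case is a subclass of `ValueError`, which is what the `except` clause catches.

```python
import pandas as pd

monthly = pd.Period("2012-01", freq="M")
daily = pd.Period("2012-01-01", freq="D")

# Same frequency: the difference is a multiple of the shared frequency
same_freq_gap = pd.Period("2012-04", freq="M") - monthly
print(same_freq_gap.n)

# Different frequencies: pandas refuses to subtract
try:
    monthly - daily
    mixed_ok = True
except ValueError as exc:
    mixed_ok = False
    print(type(exc).__name__, exc)
```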

#### If a **Period** has a frequency such as monthly, only offsets that match that frequency can be added. Otherwise, an **IncompatibleFrequency** error, a subclass of **ValueError**, is raised.

In [175]:
p = pd.Period("2014-07", freq="M")

In [176]:
p + pd.offsets.MonthEnd(3)

Period('2014-10', 'M')

In [178]:
p + pd.offsets.MonthBegin(3)

IncompatibleFrequency: Input has different freq=3MS from Period(freq=M)

### PeriodIndex and period_range

#### Regular sequences of **Period** objects can be collected in a **PeriodIndex**, which can be constructed using the **period_range** convenience function:


In [None]:
prng = pd.period_range("1/1/2011", "1/1/2012", freq="M")

In [None]:
prng

In [None]:
pd.PeriodIndex(["2011-1", "2011-2", "2011-3"], freq="M")

In [None]:
prng.dtype

In [None]:
prng.astype("period[D]") #change monthly freq to daily freq

In [None]:
prng.astype("datetime64[ns]") # convert to DatetimeIndex

In [None]:
dti = pd.date_range("2011-01-01", freq="M", periods=3)
# a DatetimeIndex of three month-end timestamps; converted to a PeriodIndex with astype below

In [None]:
dti

In [179]:
dti.astype("period[M]")

NameError: name 'dti' is not defined
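Since the cells in this subsection were not executed in this export, the following self-contained sketch shows what the conversions are expected to produce. It uses its own variable names and `to_timestamp` in place of `astype("datetime64[ns]")`, which is an assumption made for portability across pandas versions.

```python
import pandas as pd

# 13 monthly periods: January 2011 through January 2012, inclusive
prng_demo = pd.period_range("2011-01", "2012-01", freq="M")

# Monthly -> daily periods: each month maps to its LAST day
as_days = prng_demo.astype("period[D]")

# Periods -> timestamps: each month maps to its FIRST instant
as_stamps = prng_demo.to_timestamp()

# Timestamps -> monthly periods again
back = as_stamps.to_period("M")

print(as_days[:2])
print(as_stamps[:2])
```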

### PeriodIndex partial string indexing

In [None]:
ps = pd.Series(np.random.randn(len(prng)), prng)

In [None]:
ps

In [None]:
ps["2011-01"]

In [None]:
ps[datetime.datetime(2011, 12, 25):]

In [None]:
ps["10/31/2011":"12/31/2011"]

#### Passing a string representing a lower frequency than **PeriodIndex** returns partial sliced data.

In [None]:
ps["2011"]

In [None]:
dfp = pd.DataFrame(
    np.random.randn(600, 1),
    columns=["A"],
    index=pd.period_range("2013-01-01 9:00", periods=600, freq="T"),
) #T: minutely frequency

In [None]:
dfp

In [None]:
dfp.loc["2013-01-01 10H"]
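The indexing cells above have no recorded output, so here is a small sketch with deterministic values instead of random ones. The series `s` is made up for illustration, and `.loc` is used throughout as a conservative choice.

```python
import numpy as np
import pandas as pd

months = pd.period_range("2011-01", "2012-01", freq="M")
s = pd.Series(np.arange(len(months)), index=months)

# A year string is coarser than the monthly index, so it selects all of 2011
year_2011 = s.loc["2011"]

# String slices are inclusive on both ends
spring = s.loc["2011-03":"2011-05"]

print(len(year_2011))
print(spring)
```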

## Converting between representations

#### Timestamped data can be converted to PeriodIndex-ed data using **to_period** and vice-versa using **to_timestamp**:


In [180]:
rng = pd.date_range("1/1/2012", periods=5, freq="M")

In [181]:
ts = pd.Series(np.random.randn(len(rng)), index=rng)

In [182]:
ts

2012-01-31    1.187791
2012-02-29    0.102287
2012-03-31    0.070065
2012-04-30   -0.544491
2012-05-31   -1.404998
Freq: M, dtype: float64

In [183]:
ps = ts.to_period()

In [184]:
ps

2012-01    1.187791
2012-02    0.102287
2012-03    0.070065
2012-04   -0.544491
2012-05   -1.404998
Freq: M, dtype: float64

In [185]:
ps.to_timestamp()

2012-01-01    1.187791
2012-02-01    0.102287
2012-03-01    0.070065
2012-04-01   -0.544491
2012-05-01   -1.404998
Freq: MS, dtype: float64
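Note that the round trip above does not return the original month-end stamps, because `to_timestamp` anchors each period at its start by default. A short sketch of the `how` argument, using a made-up series `ts_demo` built from explicit dates:

```python
import pandas as pd

month_ends = pd.to_datetime(["2012-01-31", "2012-02-29", "2012-03-31"])
ts_demo = pd.Series([1.0, 2.0, 3.0], index=month_ends)

periods = ts_demo.to_period("M")

starts = periods.to_timestamp()           # how="start" is the default
ends = periods.to_timestamp(how="end")    # last instant of each month

print(starts.index)
print(ends.index.normalize())  # drop the time of day to compare dates
```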

#### Converting between period and timestamp enables some convenient arithmetic functions to be used.

In [186]:
prng = pd.period_range("1990Q1", "2000Q4", freq="Q-NOV")
# quarterly frequency, year ends in November

In [187]:
prng

PeriodIndex(['1990Q1', '1990Q2', '1990Q3', '1990Q4', '1991Q1', '1991Q2',
             '1991Q3', '1991Q4', '1992Q1', '1992Q2', '1992Q3', '1992Q4',
             '1993Q1', '1993Q2', '1993Q3', '1993Q4', '1994Q1', '1994Q2',
             '1994Q3', '1994Q4', '1995Q1', '1995Q2', '1995Q3', '1995Q4',
             '1996Q1', '1996Q2', '1996Q3', '1996Q4', '1997Q1', '1997Q2',
             '1997Q3', '1997Q4', '1998Q1', '1998Q2', '1998Q3', '1998Q4',
             '1999Q1', '1999Q2', '1999Q3', '1999Q4', '2000Q1', '2000Q2',
             '2000Q3', '2000Q4'],
            dtype='period[Q-NOV]')

In [188]:
ts = pd.Series(np.random.randn(len(prng)), prng)

In [189]:
ts

1990Q1    0.016114
1990Q2    0.971515
1990Q3    0.626959
1990Q4   -0.459283
1991Q1   -1.076747
1991Q2    0.197575
1991Q3   -0.857014
1991Q4   -1.014429
1992Q1   -1.025144
1992Q2    0.028788
1992Q3   -0.823428
1992Q4   -0.334154
1993Q1   -1.073532
1993Q2    0.362944
1993Q3    0.465438
1993Q4   -0.942190
1994Q1   -0.765775
1994Q2    1.795015
1994Q3   -1.991257
1994Q4   -0.722639
1995Q1    0.055387
1995Q2    0.833438
1995Q3   -0.263488
1995Q4    0.266253
1996Q1   -0.031492
1996Q2   -1.065042
1996Q3   -0.897767
1996Q4   -0.756969
1997Q1   -0.559239
1997Q2    0.840356
1997Q3    0.068477
1997Q4    0.205219
1998Q1    0.064744
1998Q2    0.335258
1998Q3    0.134996
1998Q4    0.874107
1999Q1   -0.816085
1999Q2    0.746760
1999Q3   -0.678184
1999Q4   -0.689789
2000Q1    0.268591
2000Q2    0.530417
2000Q3    0.518674
2000Q4   -2.423959
Freq: Q-NOV, dtype: float64

In [190]:
ts.index = (prng.asfreq("M", "e") + 1).asfreq("H", "s") + 9

In [191]:
ts.head()

1990-03-01 09:00    0.016114
1990-06-01 09:00    0.971515
1990-09-01 09:00    0.626959
1990-12-01 09:00   -0.459283
1991-03-01 09:00   -1.076747
Freq: H, dtype: float64
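The one-line index conversion above packs four steps together. Applying them to a single **Period** may make the logic clearer; this is a sketch with illustrative variable names, and it spells the hourly frequency as `"h"`, which is the lowercase form of the `"H"` used above.

```python
import pandas as pd

# First quarter of a fiscal year that ends in November 1990
q = pd.Period("1990Q1", freq="Q-NOV")

last_month = q.asfreq("M", "e")           # last month of the quarter
next_month = last_month + 1               # month following the quarter end
first_hour = next_month.asfreq("h", "s")  # first hour of that month
nine_am = first_hour + 9                  # shift forward nine hours

print(last_month, next_month, first_hour, nine_am, sep="\n")
```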

## Time zone handling

In [192]:
rng = pd.date_range("3/6/2012 00:00", periods=15, freq="D")

In [193]:
rng.tz is None

True

**Typically, we first localize naive timestamps to UTC with tz_localize and then convert them to the desired time zone with tz_convert.**

In [194]:
dti = rng.tz_localize("UTC")

In [195]:
dti

DatetimeIndex(['2012-03-06 00:00:00+00:00', '2012-03-07 00:00:00+00:00',
               '2012-03-08 00:00:00+00:00', '2012-03-09 00:00:00+00:00',
               '2012-03-10 00:00:00+00:00', '2012-03-11 00:00:00+00:00',
               '2012-03-12 00:00:00+00:00', '2012-03-13 00:00:00+00:00',
               '2012-03-14 00:00:00+00:00', '2012-03-15 00:00:00+00:00',
               '2012-03-16 00:00:00+00:00', '2012-03-17 00:00:00+00:00',
               '2012-03-18 00:00:00+00:00', '2012-03-19 00:00:00+00:00',
               '2012-03-20 00:00:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

In [196]:
dti.tz_convert("US/Pacific")

DatetimeIndex(['2012-03-05 16:00:00-08:00', '2012-03-06 16:00:00-08:00',
               '2012-03-07 16:00:00-08:00', '2012-03-08 16:00:00-08:00',
               '2012-03-09 16:00:00-08:00', '2012-03-10 16:00:00-08:00',
               '2012-03-11 17:00:00-07:00', '2012-03-12 17:00:00-07:00',
               '2012-03-13 17:00:00-07:00', '2012-03-14 17:00:00-07:00',
               '2012-03-15 17:00:00-07:00', '2012-03-16 17:00:00-07:00',
               '2012-03-17 17:00:00-07:00', '2012-03-18 17:00:00-07:00',
               '2012-03-19 17:00:00-07:00'],
              dtype='datetime64[ns, US/Pacific]', freq='D')

In [197]:
dti

DatetimeIndex(['2012-03-06 00:00:00+00:00', '2012-03-07 00:00:00+00:00',
               '2012-03-08 00:00:00+00:00', '2012-03-09 00:00:00+00:00',
               '2012-03-10 00:00:00+00:00', '2012-03-11 00:00:00+00:00',
               '2012-03-12 00:00:00+00:00', '2012-03-13 00:00:00+00:00',
               '2012-03-14 00:00:00+00:00', '2012-03-15 00:00:00+00:00',
               '2012-03-16 00:00:00+00:00', '2012-03-17 00:00:00+00:00',
               '2012-03-18 00:00:00+00:00', '2012-03-19 00:00:00+00:00',
               '2012-03-20 00:00:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')
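The difference between the two calls is easy to miss: `tz_localize` attaches a zone to a naive wall-clock time, while `tz_convert` re-expresses the same instant in another zone. The unchanged `dti` above also shows that `tz_convert` returns a new object rather than modifying in place. A minimal sketch on a single timestamp, with illustrative variable names:

```python
import pandas as pd

naive = pd.Timestamp("2012-03-06 00:00")   # no time zone attached

utc = naive.tz_localize("UTC")             # same wall time, now tagged as UTC
pacific = utc.tz_convert("US/Pacific")     # same instant, different wall time

print(utc)
print(pacific)
print(utc == pacific)  # compares instants, not wall-clock readings
```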

In [199]:
rng = pd.date_range("3/6/2012 00:00", periods=15, freq="D")
rng.tz is None

True

In [200]:
import dateutil

In [201]:
rng_pytz = pd.date_range("3/6/2012 00:00", periods=3, freq="D", tz="Europe/London")
rng_pytz.tz
rng_dateutil = pd.date_range("3/6/2012 00:00", periods=3, freq="D")
rng_dateutil = rng_dateutil.tz_localize("dateutil/Europe/London")
rng_dateutil.tz

tzfile('Europe/Belfast')

In [204]:
rng_utc = pd.date_range(
    "3/6/2012 00:00",
    periods=3,
    freq="D",
    tz=datetime.timezone.utc,
)

In [205]:
rng_utc.tz

datetime.timezone.utc