
# Dates and Times in Python


## Native Python dates and times: datetime and dateutil

In [1]:
from datetime import datetime

In [2]:
datetime(year=2024, month=8, day=2)

datetime.datetime(2024, 8, 2, 0, 0)

In [3]:
# dateutil example: parse a date from a variety of string formats
from dateutil import parser
date = parser.parse("2nd of Sep, 2024")
date

datetime.datetime(2024, 9, 2, 0, 0)

In [14]:
date.strftime('%A')

'Monday'
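
As a complement to `dateutil`, here is a minimal sketch that uses only the standard library: it parses a string with an explicit format, formats it back out, and does simple date arithmetic. The variable names are illustrative and not part of the original notebook.

```python
from datetime import datetime, timedelta

# Parse a string using an explicit format instead of dateutil's guessing.
parsed = datetime.strptime("2024-09-02", "%Y-%m-%d")

# Format codes: %A is the full weekday name, %d %b %Y is day, month abbreviation, year.
weekday_name = parsed.strftime("%A")

# Date arithmetic uses timedelta objects.
next_week = parsed + timedelta(days=7)

print(weekday_name)
print(next_week.strftime("%d %b %Y"))
```

An explicit format is stricter than `parser.parse`, which can be useful when ambiguous inputs should fail rather than be guessed.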

## Typed arrays of times: NumPy's datetime64


NumPy's `datetime64` dtype encodes dates as 64-bit integers, so arrays of dates can be stored compactly and operated on efficiently.
Creating a `datetime64` from a string requires an ISO 8601-style format such as `YYYY-MM-DD`:

In [15]:
import numpy as np
date = np.array('2024-09-02',dtype=np.datetime64)
date

array('2024-09-02', dtype='datetime64[D]')

Once we have dates in this form, we can quickly do vectorized operations on them:

In [21]:
date + np.arange(21)

array(['2024-09-02', '2024-09-03', '2024-09-04', '2024-09-05',
       '2024-09-06', '2024-09-07', '2024-09-08', '2024-09-09',
       '2024-09-10', '2024-09-11', '2024-09-12', '2024-09-13',
       '2024-09-14', '2024-09-15', '2024-09-16', '2024-09-17',
       '2024-09-18', '2024-09-19', '2024-09-20', '2024-09-21',
       '2024-09-22'], dtype='datetime64[D]')
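
Vectorized operations are not limited to addition. The sketch below, which assumes a one-week array built the same way as above, uses `np.is_busday` to build a boolean mask and keep only the weekdays. The names `week`, `mask`, and `weekdays` are illustrative.

```python
import numpy as np

# One week of consecutive days starting on Monday, 2024-09-02.
week = np.datetime64('2024-09-02') + np.arange(7)

# Boolean mask: True for Monday through Friday (NumPy's default week mask).
mask = np.is_busday(week)

# Boolean indexing keeps only the business days.
weekdays = week[mask]

print(mask)
print(weekdays)
```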

In [22]:
np.datetime64('2024-09-02')
# day-based time unit

numpy.datetime64('2024-09-02')

In [24]:
np.datetime64('2024-09-02 12:00')
# minute-based time unit

numpy.datetime64('2024-09-02T12:00')

In [25]:
np.datetime64('2024-09-02 12:59:59.50','ns')

numpy.datetime64('2024-09-02T12:59:59.500000000')
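
The cells above show that NumPy infers the time unit from the input string unless one is given explicitly. The following sketch illustrates how units interact: a day-based value can be cast to a finer unit, and subtracting values with different units produces a `timedelta64` in the finer of the two. The variable names are illustrative.

```python
import numpy as np

day = np.datetime64('2024-09-02')            # unit inferred as days
minute = np.datetime64('2024-09-02T12:00')   # unit inferred as minutes

# Casting to nanoseconds keeps the same instant at a finer resolution.
day_as_ns = day.astype('datetime64[ns]')

# Mixed-unit subtraction is carried out in the finer unit (minutes here).
elapsed = minute - day

print(day.dtype, minute.dtype, day_as_ns.dtype)
print(elapsed)
```

Because every value is stored in 64 bits, a finer unit means a narrower range of representable dates, so the unit is worth choosing deliberately.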



## Dates and times in pandas


In [27]:
import pandas as pd
date = pd.to_datetime("2nd of sep, 2024")
date
# the date string has been converted to a Timestamp

Timestamp('2024-09-02 00:00:00')

In [29]:
date.strftime('%A')

'Monday'

In [31]:
date + pd.to_timedelta(np.arange(12),'D')

DatetimeIndex(['2024-09-02', '2024-09-03', '2024-09-04', '2024-09-05',
               '2024-09-06', '2024-09-07', '2024-09-08', '2024-09-09',
               '2024-09-10', '2024-09-11', '2024-09-12', '2024-09-13'],
              dtype='datetime64[ns]', freq=None)

# Pandas time series: Indexing by time

In [38]:
index = pd.DatetimeIndex(['2020-07-04', '2020-08-04',
                          '2021-07-04', '2021-08-04'])
data = pd.Series([0,1,2,3],index=index)
data

2020-07-04    0
2020-08-04    1
2021-07-04    2
2021-08-04    3
dtype: int64


Another way to build the same `Series`, which I tried while experimenting, is to generate the values with `list(range(...))`:

In [36]:
import pandas as pd

# Create a DatetimeIndex
index = pd.DatetimeIndex(['2020-07-04', '2020-08-04',
                          '2021-07-04', '2021-08-04'])

# Create a Series with the specified index
data = pd.Series(list(range(0, 4)), index=index)

# Display the Series
print(data)


2020-07-04    0
2020-08-04    1
2021-07-04    2
2021-08-04    3
dtype: int64


And now that we have this data in a `Series`, we can make use of any of the `Series` indexing patterns, passing values that can be coerced into dates:

In [41]:
data['2020-07-04':'2021-07-04']

2020-07-04    0
2020-08-04    1
2021-07-04    2
dtype: int64


In [43]:
data['2021']

2021-07-04    2
2021-08-04    3
dtype: int64
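
The same idea extends to other partial date strings. The sketch below rebuilds the `Series` from above and shows a month-level selection, an open-ended slice, and an exact lookup with a `Timestamp`. The result names are illustrative.

```python
import pandas as pd

index = pd.DatetimeIndex(['2020-07-04', '2020-08-04',
                          '2021-07-04', '2021-08-04'])
data = pd.Series([0, 1, 2, 3], index=index)

# A year-month string selects every entry in that month.
july_2020 = data.loc['2020-07']

# An open-ended slice selects everything from August 2020 onward.
from_aug_2020 = data.loc['2020-08':]

# An exact Timestamp returns the single matching value.
single = data.loc[pd.Timestamp('2021-07-04')]

print(july_2020)
print(from_aug_2020)
print(single)
```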


# Pandas time series data structures

This section will introduce the fundamental Pandas data structures for working with time series data:

- For *timestamps*, Pandas provides the `Timestamp` type. It is essentially a replacement for Python's native `datetime`, but it is based on the more efficient `numpy.datetime64` data type. The associated index structure is `DatetimeIndex`.
- For *time periods*, Pandas provides the `Period` type. This encodes a fixed-frequency interval based on `numpy.datetime64`. The associated index structure is `PeriodIndex`.
- For *time deltas* or *durations*, Pandas provides the `Timedelta` type. `Timedelta` is a more efficient replacement for Python's native `datetime.timedelta` type, and is based on `numpy.timedelta64`. The associated index structure is `TimedeltaIndex`.
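
As a quick illustration of the three types listed above, here is a minimal sketch that creates one object of each kind and combines them. The specific values are arbitrary examples.

```python
import pandas as pd

ts = pd.Timestamp('2024-09-02 12:00')     # a single instant in time
period = pd.Period('2024-09', freq='M')   # the whole month of September 2024
delta = pd.Timedelta(days=1, hours=6)     # a duration

# Adding a Timedelta to a Timestamp gives another Timestamp.
shifted = ts + delta

# A Period has a start and an end, so we can test whether an instant falls inside it.
inside = period.start_time <= ts <= period.end_time

print(shifted)
print(period.start_time, period.end_time)
print(inside)
```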

The most fundamental of these date/time objects are the `Timestamp` and `DatetimeIndex` objects.
While these classes can be instantiated directly, it is more common to use the `pd.to_datetime` function, which can parse a wide variety of formats.
Passing a single date to `pd.to_datetime` yields a `Timestamp`; passing a list of dates yields a `DatetimeIndex` by default, as you can see here:

In [45]:
# mix a datetime object with several differently formatted date strings
dates = pd.to_datetime([datetime(2024, 9, 2), '4th of august, 2024',
                        '2021-Jul-6', '07-07-2021', '20210708'])
dates

DatetimeIndex(['2024-09-02', '2024-08-04', '2021-07-06', '2021-07-07',
               '2021-07-08'],
              dtype='datetime64[ns]', freq=None)

Any `DatetimeIndex` can be converted to a `PeriodIndex` with the `to_period` function, with the addition of a frequency code; here we'll use `'D'` to indicate daily frequency:

In [46]:
dates.to_period('D')

PeriodIndex(['2024-09-02', '2024-08-04', '2021-07-06', '2021-07-07',
             '2021-07-08'],
            dtype='period[D]')
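
Other frequency codes work the same way. The sketch below, which assumes a small `DatetimeIndex` of ISO-formatted dates, converts to monthly periods and then back to timestamps with `to_timestamp`, which by default returns the start of each period.

```python
import pandas as pd

dates = pd.to_datetime(['2024-09-02', '2024-08-04', '2021-07-06'])

# Each date is mapped to the calendar month that contains it.
monthly = dates.to_period('M')

# Converting back gives the first instant of each period.
month_starts = monthly.to_timestamp()

print(monthly)
print(month_starts)
```

Note that the round trip is lossy: the day of the month is discarded when converting to monthly periods.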

In [48]:
# subtracting one date from the others yields a TimedeltaIndex
dates - dates[1]

TimedeltaIndex(['29 days', '0 days', '-1125 days', '-1124 days', '-1123 days'], dtype='timedelta64[ns]', freq=None)

## Regular Sequences: pd.date_range

In [49]:
pd.date_range('2015-07-03', '2015-07-10')

DatetimeIndex(['2015-07-03', '2015-07-04', '2015-07-05', '2015-07-06',
               '2015-07-07', '2015-07-08', '2015-07-09', '2015-07-10'],
              dtype='datetime64[ns]', freq='D')

In [53]:
pd.date_range('2024-09-02',periods=8)

DatetimeIndex(['2024-09-02', '2024-09-03', '2024-09-04', '2024-09-05',
               '2024-09-06', '2024-09-07', '2024-09-08', '2024-09-09'],
              dtype='datetime64[ns]', freq='D')

In [54]:
# hourly frequency; pandas 2.2 and later prefer the lowercase alias 'h'
pd.date_range('2024-09-02', periods=8, freq='H')

DatetimeIndex(['2024-09-02 00:00:00', '2024-09-02 01:00:00',
               '2024-09-02 02:00:00', '2024-09-02 03:00:00',
               '2024-09-02 04:00:00', '2024-09-02 05:00:00',
               '2024-09-02 06:00:00', '2024-09-02 07:00:00'],
              dtype='datetime64[ns]', freq='H')

In [55]:
pd.period_range('2024-09-02',periods=5, freq='M')

PeriodIndex(['2024-09', '2024-10', '2024-11', '2024-12', '2025-01'], dtype='period[M]')
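
The same pattern applies to other frequencies and to durations. As a hedged sketch, the example below uses the business-day code `'B'` with `pd.date_range` and builds a sequence of durations with `pd.timedelta_range`. The start values are chosen only for illustration.

```python
import pandas as pd

# Business-day frequency skips the weekend: Friday, then Monday and Tuesday.
business_days = pd.date_range('2024-09-06', periods=3, freq='B')

# A regular sequence of durations, 30 minutes apart, starting at zero.
half_hours = pd.timedelta_range(start='0 days', periods=4, freq='30min')

print(business_days)
print(half_hours)
```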