# Analyze and Transform Financial Market Data with Pandas

In this chapter we'll cover the following recipes:
1. Diving into index types
2. Building pandas Series and DataFrames
3. Manipulating and transforming DataFrames
4. Examining and selecting data from DataFrames
5. Calculating asset returns
6. Measuring the volatility of a return series
7. Resampling data from different time frames
8. Addressing missing data issues
9. Applying custom functions to analyze time series data

In [1]:
import pandas as pd

In [2]:
idx_1 = pd.Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [3]:
idx_1  # this index has dtype int64, meaning it is made up of 64-bit integers (pandas versions before 2.0 displayed it as Int64Index)

Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype='int64')

Pandas has several Index types to support many use cases, including those related to time series analysis. We'll cover examples of the most commonly used index types.
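As a quick orientation, the sketch below builds one small index of each kind covered in this recipe and records the class pandas returns for it. The values and the `examples` and `index_types` names are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# One small, made-up example per index type (values are placeholders).
examples = {
    "integer": pd.Index([0, 1, 2]),
    "datetime": pd.date_range("2016-01-01", periods=3, freq="D"),
    "period": pd.period_range("2016-01", periods=3, freq="M"),
    "multi": pd.MultiIndex.from_tuples([(1, "a"), (1, "b")]),
}

# Map each label to the name of the class pandas chose for it.
index_types = {name: type(idx).__name__ for name, idx in examples.items()}

for name, cls_name in index_types.items():
    print(f"{name}: {cls_name}")
```

The class name reported for the integer index depends on the installed pandas version, which is why no expected output is shown here.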

### DatetimeIndex

In [5]:
# extremely useful when dealing with time series data
days = pd.date_range("2016-01-01", periods=6, freq="D")
days # this creates an index with six incremental datetime objects

DatetimeIndex(['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04',
               '2016-01-05', '2016-01-06'],
              dtype='datetime64[ns]', freq='D')
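One reason a DatetimeIndex is so convenient is that rows can be selected with plain date strings. The sketch below attaches the same six-day index to a Series and slices it by date. The price values are hypothetical, made up purely for illustration.

```python
import pandas as pd

days = pd.date_range("2016-01-01", periods=6, freq="D")

# Hypothetical prices, invented for this example only.
prices = pd.Series([100.0, 101.5, 99.8, 102.3, 103.1, 102.7], index=days)

# Label-based slicing on a DatetimeIndex includes both endpoints.
window = prices.loc["2016-01-02":"2016-01-04"]
print(window)
```

Unlike positional slicing, the end label is included, so `window` holds three rows.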

In [7]:
# we can use different frequencies, including seconds
seconds = pd.date_range("2016-01-01", periods=6, freq="s")
seconds

DatetimeIndex(['2016-01-01 00:00:00', '2016-01-01 00:00:01',
               '2016-01-01 00:00:02', '2016-01-01 00:00:03',
               '2016-01-01 00:00:04', '2016-01-01 00:00:05'],
              dtype='datetime64[ns]', freq='S')

In [9]:
# by default DatetimeIndexes are "timezone naive". To localize:
seconds_utc = seconds.tz_localize("UTC")
seconds_utc # as we can see, localizing simply appends time zone information to the object

DatetimeIndex(['2016-01-01 00:00:00+00:00', '2016-01-01 00:00:01+00:00',
               '2016-01-01 00:00:02+00:00', '2016-01-01 00:00:03+00:00',
               '2016-01-01 00:00:04+00:00', '2016-01-01 00:00:05+00:00'],
              dtype='datetime64[ns, UTC]', freq='S')
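Once an index is timezone-aware, it can be expressed in another time zone with `tz_convert`. The sketch below assumes we want the same instants in New York time; the choice of `"America/New_York"` is an illustrative assumption.

```python
import pandas as pd

seconds = pd.date_range("2016-01-01", periods=6, freq="s")

# tz_localize attaches a time zone to naive timestamps.
seconds_utc = seconds.tz_localize("UTC")

# tz_convert changes the displayed wall-clock time but not the instant itself.
seconds_ny = seconds_utc.tz_convert("America/New_York")
print(seconds_ny[0])
```

Midnight UTC on 1 January falls on the previous evening in New York, yet `seconds_ny[0] == seconds_utc[0]` still evaluates to `True` because both refer to the same moment.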

### PeriodIndex

In [10]:
# it's possible to create ranges of periods, such as quarters, using the period_range function
prng = pd.period_range("1990Q1", "2000Q4", freq="Q-NOV")
prng

PeriodIndex(['1990Q1', '1990Q2', '1990Q3', '1990Q4', '1991Q1', '1991Q2',
             '1991Q3', '1991Q4', '1992Q1', '1992Q2', '1992Q3', '1992Q4',
             '1993Q1', '1993Q2', '1993Q3', '1993Q4', '1994Q1', '1994Q2',
             '1994Q3', '1994Q4', '1995Q1', '1995Q2', '1995Q3', '1995Q4',
             '1996Q1', '1996Q2', '1996Q3', '1996Q4', '1997Q1', '1997Q2',
             '1997Q3', '1997Q4', '1998Q1', '1998Q2', '1998Q3', '1998Q4',
             '1999Q1', '1999Q2', '1999Q3', '1999Q4', '2000Q1', '2000Q2',
             '2000Q3', '2000Q4'],
            dtype='period[Q-NOV]')
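The `Q-NOV` frequency means quarters of a fiscal year that ends in November, so the quarter labels do not line up with calendar quarters. The sketch below inspects the first period to make this visible and converts the whole range to timestamps. The variable names `first` and `starts` are illustrative.

```python
import pandas as pd

prng = pd.period_range("1990Q1", "2000Q4", freq="Q-NOV")

# Each element is a Period that spans a full quarter.
first = prng[0]
print(first.start_time)  # first instant of fiscal 1990Q1
print(first.end_time)    # last instant of fiscal 1990Q1

# Convert every period to the timestamp at which it starts.
starts = prng.to_timestamp(how="start")
print(starts[:4])
```

With this frequency, fiscal 1990Q1 runs from December 1989 through February 1990.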

### MultiIndex

In [11]:
# a MultiIndex, often referred to as a "hierarchical index", is a data structure that allows for complex data organization within pandas DataFrames and Series.
# to create a MultiIndex object, pass a list of tuples to the from_tuples method
tuples = [
    (pd.Timestamp("2023-07-10"), "WMT"),
    (pd.Timestamp("2023-07-10"), "JPM"),
    (pd.Timestamp("2023-07-10"), "TGT"),
    (pd.Timestamp("2023-07-11"), "WMT"),
    (pd.Timestamp("2023-07-11"), "JPM"),
    (pd.Timestamp("2023-07-11"), "TGT")
]

midx = pd.MultiIndex.from_tuples(tuples, names=("date", "symbol"))
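To see what the hierarchy buys us, the sketch below attaches the same MultiIndex to a DataFrame and selects rows by one level at a time with `xs`. The `close` column and its values are hypothetical placeholders, not real market prices.

```python
import pandas as pd

tuples = [
    (pd.Timestamp("2023-07-10"), "WMT"),
    (pd.Timestamp("2023-07-10"), "JPM"),
    (pd.Timestamp("2023-07-10"), "TGT"),
    (pd.Timestamp("2023-07-11"), "WMT"),
    (pd.Timestamp("2023-07-11"), "JPM"),
    (pd.Timestamp("2023-07-11"), "TGT"),
]
midx = pd.MultiIndex.from_tuples(tuples, names=["date", "symbol"])

# Placeholder values, invented for illustration only.
df = pd.DataFrame({"close": [10.0, 20.0, 30.0, 11.0, 21.0, 31.0]}, index=midx)

# All dates for one symbol: the "symbol" level is dropped from the result.
wmt = df.xs("WMT", level="symbol")

# All symbols for one date: the "date" level is dropped from the result.
first_day = df.xs(pd.Timestamp("2023-07-10"), level="date")

print(wmt)
print(first_day)
```

Selecting on a named level keeps the code readable and avoids depending on the order of the levels.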