# Financial Time Series
First, I want to give special thanks to **Yves Hilpisch**: this project is based on ideas from his book *"Python for Finance"*.

![image.png](attachment:eaa744f8-b67c-44c4-8d9e-6b76e00737b7.png)


*"Python for Finance"* is a practical guide that shows how to use Python for financial tasks like analysis, quantitative finance, and algorithmic trading. It's tailored for finance professionals such as analysts and traders who want to harness Python's capabilities. The book covers fundamental Python concepts and libraries, making it accessible for both Python beginners and finance experts. Readers will learn to handle financial data, perform time series analysis, create trading strategies, manage risk, model financial derivatives, and integrate with financial APIs. The book's hands-on approach with real-world examples makes it a valuable resource for those looking to apply Python in the finance industry.

**Financial Time Series**

Financial time series data is one of the most important types of data in finance. It is data indexed by date and/or time. For example, stock prices over time are financial time series data. Similarly, the EUR/USD exchange rate over time is a financial time series: the rate is quoted at short intervals, and the collection of such quotes forms a time series of exchange rates.
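As a minimal sketch of what "indexed by date" means in pandas, the snippet below builds a small price series on a `DatetimeIndex`. The three values are the first closing prices from the Bitcoin table shown further down in this notebook; the names `prices` and `first_two_days` are only illustrative.

```python
import pandas as pd

# Three daily closing prices indexed by calendar date.
dates = pd.to_datetime(["2021-05-10", "2021-05-11", "2021-05-12"])
prices = pd.Series([55816.14, 56670.02, 49631.32], index=dates, name="close")

# A DatetimeIndex allows label-based selection with date strings.
first_two_days = prices.loc["2021-05-10":"2021-05-11"]
print(type(prices.index).__name__)
print(first_two_days)
```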


## Table of contents

- **Financial Data**
    + This section covers the basics of working with financial time series data using pandas.
- **Rolling Statistics**
    + Rolling statistics are a concept in statistics and data analysis, typically used to assess the trends and variability of data over time.
- **Correlation Analysis**
- **High-Frequency Data**




## Financial Data
This section works with a locally stored financial data set in the form of a CSV file, *output_data.csv*, which was created by *CryptoData.py* (introduced later). The file contains the Bitcoin time series data set from *alphavantage.co*.

### Data Import

#### Initialization

In [1]:
from CryptoData import CryptoData
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd
import datetime

#### Dataset

In [2]:
BTC = CryptoData("BTC")
df = BTC.df
df.head()


file_path = 'output_data.csv'

# Save the DataFrame to a CSV file
df.to_csv(file_path, index=True)

print(f'DataFrame saved to {file_path}')

DataFrame saved to output_data.csv


In [10]:
# Specify the path to your CSV file
file_path = 'output_data.csv'

# Read the CSV file into a DataFrame
df = pd.read_csv(file_path)

# Display the DataFrame
df.head()

| | date | open | high | low | close | volume | cap |
|---|---|---|---|---|---|---|---|
| 0 | 2021-05-10 | 58240.83 | 59500.0 | 53400.0 | 55816.14 | 89586.34925 | 89586.34925 |
| 1 | 2021-05-11 | 55816.14 | 56862.43 | 54370.0 | 56670.02 | 64329.54055 | 64329.54055 |
| 2 | 2021-05-12 | 56670.02 | 58000.01 | 48600.0 | 49631.32 | 99842.789836 | 99842.789836 |
| 3 | 2021-05-13 | 49537.15 | 51367.19 | 46000.0 | 49670.97 | 147332.002121 | 147332.002121 |
| 4 | 2021-05-14 | 49671.92 | 51483.0 | 48799.75 | 49841.45 | 80082.204306 | 80082.204306 |
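The `date` column in the output above is read as plain strings. As a hedged alternative, the sketch below shows how `pd.read_csv` can parse the dates and use them as the index in one step. It assumes the file has a `date` column as in the output above; the in-memory `csv_text` is only a stand-in for *output_data.csv*, using a subset of the rows and columns shown.

```python
import io
import pandas as pd

# Stand-in for output_data.csv: the first rows shown above (subset of columns).
csv_text = """date,open,high,low,close
2021-05-10,58240.83,59500.0,53400.0,55816.14
2021-05-11,55816.14,56862.43,54370.0,56670.02
2021-05-12,56670.02,58000.01,48600.0,49631.32
"""

# index_col selects 'date' as the index; parse_dates=True converts it to datetimes.
sample = pd.read_csv(io.StringIO(csv_text), index_col="date", parse_dates=True)
print(type(sample.index).__name__)
print(sample.head())
```

With a real file, `io.StringIO(csv_text)` would be replaced by the file path.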


In [11]:
df.describe().round(2).T

| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| open | 1000.0 | 33297.15 | 11625.96 | 15781.29 | 23545.57 | 30287.08 | 42056.97 | 67525.82 |
| high | 1000.0 | 34052.84 | 11940.38 | 16315.0 | 24219.43 | 30743.66 | 42856.76 | 69000.0 |
| low | 1000.0 | 32457.28 | 11229.72 | 15476.0 | 23109.97 | 29713.68 | 40933.29 | 66222.4 |
| close | 1000.0 | 33282.23 | 11603.56 | 15781.29 | 23545.57 | 30287.08 | 42056.98 | 67525.83 |
| volume | 1000.0 | 106413.13 | 114305.81 | 1290.41 | 35714.5 | 54426.43 | 139773.34 | 760705.36 |
| cap | 1000.0 | 106413.13 | 114305.81 | 1290.41 | 35714.5 | 54426.43 | 139773.34 | 760705.36 |


#### Visualizations


Visualization plays a crucial role in understanding and analyzing financial data for several reasons:
- Clarity and Insight
- Identifying Patterns and Trends
- Communication
- Decision Making
- Detecting Anomalies and Errors
- Forecasting and Planning
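To make the points above concrete, here is a minimal static line plot with matplotlib. This is only a sketch: the data is a synthetic random walk (not real market data), and the names `close`, `fig`, and `ax` are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic random-walk "prices" on a daily DatetimeIndex (not real market data).
rng = np.random.default_rng(seed=0)
index = pd.date_range("2021-05-10", periods=100, freq="D")
close = pd.Series(50000 + 500 * rng.standard_normal(100).cumsum(), index=index)

# Plot the series against its dates and label the axes.
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(close.index, close.values, lw=1.2, label="close (synthetic)")
ax.set_xlabel("date")
ax.set_ylabel("price")
ax.legend()
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```

A plot like this makes trends and outliers visible at a glance, which is hard to achieve with a table of numbers.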

In [18]:
import pandas as pd
import cufflinks as cf 
import plotly.offline as plyo 
plyo.init_notebook_mode(connected=True) 

In [39]:
pip install chart_studio

Note: you may need to restart the kernel to use updated packages.





In [34]:
a = np.random.standard_normal((250, 5)).cumsum(axis=0) 
index = pd.date_range('2019-1-1', freq='B', periods=len(a))
df = pd.DataFrame(100 + 5 * a, columns=list('abcde'), index=index.astype(str))

type(df.index)
df

| | a | b | c | d | e |
|---|---|---|---|---|---|
| 2019-01-01 | 96.387278 | 99.588790 | 105.558888 | 101.082314 | 93.356400 |
| 2019-01-02 | 109.056825 | 100.169992 | 105.557841 | 105.932607 | 91.343223 |
| 2019-01-03 | 118.598392 | 100.849317 | 107.270437 | 93.256611 | 85.149650 |
| 2019-01-04 | 115.990622 | 97.011197 | 112.400307 | 97.431548 | 90.296649 |
| 2019-01-07 | 116.608723 | 97.909693 | 110.555028 | 101.604890 | 103.039320 |
| ... | ... | ... | ... | ... | ... |
| 2019-12-10 | 188.572002 | 155.799163 | 289.569657 | 134.884010 | 159.299837 |
| 2019-12-11 | 187.927600 | 163.913068 | 284.948377 | 125.027600 | 166.985034 |
| 2019-12-12 | 180.703376 | 169.819985 | 288.446767 | 124.821949 | 162.291690 |
| 2019-12-13 | 187.239217 | 169.423871 | 292.202985 | 125.588734 | 161.562418 |


In [41]:
df.iplot(asFigure=True)

Output (abridged): a plotly `Figure` with one line trace per column (`a`, `b`, ...), each with `mode='lines'` and `type='scatter'`, the date strings from '2019-01-01' to '2019-12-16' as `x`, and the column values as `y`.

In [44]:
plyo.iplot( 
 df.iplot(asFigure=True), 
)

#### Resampling

- Resampling is an important operation on financial time series data. Usually it takes the form of *downsampling*, i.e. converting the data to a lower frequency.
- For example, daily data is turned into weekly or monthly data, which is very important for long-term prediction in financial markets.
- Note that this kind of resampling is different from resampling in machine learning.
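As a small illustration of downsampling, the sketch below turns four weeks of daily values into weekly values. The data is synthetic (the numbers 1 to 28), and the choice of `last()` and `mean()` as aggregations is only an example; which aggregation is appropriate depends on the column.

```python
import numpy as np
import pandas as pd

# Synthetic daily values for four full weeks (Monday 2022-01-03 to Sunday 2022-01-30).
index = pd.date_range("2022-01-03", periods=28, freq="D")
daily = pd.DataFrame({"close": np.arange(1.0, 29.0)}, index=index)

# 'W' bins end on Sunday; label='right' labels each bin with that Sunday.
weekly_last = daily["close"].resample("W", label="right").last()
weekly_mean = daily["close"].resample("W", label="right").mean()
print(weekly_last)
print(weekly_mean)
```

Here `weekly_last` holds the last daily value of each week (7, 14, 21, 28) and `weekly_mean` the weekly averages (4, 11, 18, 25), each labelled with the Sunday that closes the week.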

In [None]:
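# Reload the Bitcoin data (df was overwritten by the random example above)
# and turn the 'date' column into a DatetimeIndex.
df = pd.read_csv('output_data.csv')
df['date'] = pd.to_datetime(df['date'])
df.set_index('date', inplace=True)
type(df.index)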

In [None]:
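# 'ME' = month end (pandas >= 2.2; use 'M' on older versions)
# 'W'  = weekly
df.resample('ME', label='right').last().head()

In [None]:
# A Resampler has to be aggregated before further calculations;
# taking the last open of each month is an assumption made here.
y = df.resample('ME', label='right')['open'].last()
plt.plot(y.cumsum())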

In [None]:
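start_date = '2022-01-01'
end_date = '2022-02-01'

# Filter the DataFrame to the specified date range (both end points are included)
filtered_df = df.loc[start_date:end_date]
# Aggregate to weekly values first (the last close of each week is an assumption), then accumulate
y = filtered_df.resample('W', label='right')['close'].last()
y.cumsum()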
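The date-range filter above relies on label-based slicing of a `DatetimeIndex`, which includes both end points. The following sketch checks this on a synthetic daily series; the names `series` and `window` are illustrative.

```python
import pandas as pd

# Synthetic daily series covering the first quarter of 2022.
index = pd.date_range("2022-01-01", "2022-03-31", freq="D")
series = pd.Series(range(len(index)), index=index)

# Label-based slicing on a DatetimeIndex includes both the start and the end date.
window = series.loc["2022-01-01":"2022-02-01"]
print(window.index.min(), window.index.max(), len(window))
```

The window contains the 31 days of January plus 1 February, i.e. 32 observations.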