### Title: Data Collection Using Alpha Vantage
Author: Tan Zhi Lun  
Contact: zhilun296@gmail.com

This short project maintains a CSV file that stores 1-minute OHLC (open, high, low, close) data for EUR/USD retrieved from the Alpha Vantage API.

The script works with **two pandas DataFrames**:  
1. The data extracted from Alpha Vantage (through the `alpha_vantage` Python wrapper)
2. The data already stored in the CSV file

**Further processing** is required, however: the extracted data arrives newest first, so it must be reversed into chronological order, and any **duplicates** must be removed when it is combined with the original DataFrame.

For example, if the original DataFrame has entries up to 08:30 am, while the data extracted from Alpha Vantage starts at 08:00 am, the script removes the duplicate OHLC rows from 08:00 am to 08:30 am.
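
The 08:00 to 08:30 overlap described above can be reproduced on a small synthetic example. This is a minimal sketch under stated assumptions: the prices are made up, `merge_ohlc` is a hypothetical helper that is not part of the original script, and both frames are assumed to use a datetime index named `date`.

```python
import pandas as pd

def merge_ohlc(stored, fetched):
    """Hypothetical helper: append fetched rows, keeping the stored row on overlap."""
    combined = pd.concat([stored, fetched])
    # Drop repeated timestamps before sorting so keep='first' prefers stored rows
    combined = combined.loc[~combined.index.duplicated(keep='first')]
    return combined.sort_index()

# Stored data: 08:00 to 08:30 (made-up values)
stored_idx = pd.date_range("2020-07-31 08:00", "2020-07-31 08:30", freq="1min", name="date")
stored = pd.DataFrame({"4. close": 1.18}, index=stored_idx)

# Fetched data: 08:00 to 09:00, delivered newest first
fetched_idx = pd.date_range("2020-07-31 08:00", "2020-07-31 09:00", freq="1min", name="date")
fetched = pd.DataFrame({"4. close": 1.19}, index=fetched_idx)[::-1]

merged = merge_ohlc(stored, fetched)
print(len(stored), len(fetched), len(merged))  # 31 61 61
```

The 31 overlapping timestamps appear only once in the result, and the values for those rows come from the stored frame because it is listed first in `pd.concat`.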

In [5]:
import pandas as pd
from alpha_vantage.foreignexchange import ForeignExchange

# Choosing the file for saving data
filename_eurusd = "EURUSD_data.csv"
auth_token = ''  # Insert authorization token from personal account here

# Using the wrapper API for Alpha Vantage
fx = ForeignExchange(key=auth_token, output_format='pandas')
data, meta_data = fx.get_currency_exchange_intraday('EUR', 'USD', '1min', 'full')

# Reverse the order such that the most recent entry is last
data = data[::-1]

# Read in csv file to prepare for combination, using date as the index for sorting and further combination
df1 = pd.read_csv(filename_eurusd, index_col='date')
df1.index = pd.to_datetime(df1.index)

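# Combine the data from Alpha Vantage and the original csv file
df1 = pd.concat([df1, data])
# Remove duplicate timestamps before sorting, so that keep='first' retains the rows already in the csv file
df1 = df1.loc[~df1.index.duplicated(keep='first')]
# Sort by the datetime index so that the most recent entry is last
df1 = df1.sort_index()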

# Save the combined data back into the same csv file
df1.to_csv(filename_eurusd, header=True)

print("EURUSD past day data")
print(df1)

EURUSD past day data
                     1. open  2. high  3. low  4. close
date                                                   
2020-07-30 22:30:00   1.1845   1.1845  1.1842    1.1844
2020-07-30 22:31:00   1.1843   1.1844  1.1842    1.1844
2020-07-30 22:32:00   1.1843   1.1845  1.1841    1.1844
2020-07-30 22:33:00   1.1845   1.1845  1.1842    1.1843
2020-07-30 22:34:00   1.1844   1.1845  1.1842    1.1844
...                      ...      ...     ...       ...
2020-07-31 21:55:00   1.1771   1.1771  1.1771    1.1771
2020-07-31 21:56:00   1.1771   1.1771  1.1771    1.1771
2020-07-31 21:57:00   1.1771   1.1771  1.1771    1.1771
2020-07-31 21:58:00   1.1771   1.1771  1.1771    1.1771
2020-07-31 21:59:00   1.1771   1.1771  1.1771    1.1771

[1410 rows x 4 columns]
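
Because the script rewrites the same CSV file on every run, it can be useful to confirm that the `date` index survives a save and reload. The following is a rough sketch rather than part of the original notebook: it uses made-up values and an in-memory `io.StringIO` buffer in place of `EURUSD_data.csv`, and it relies on `parse_dates=True` instead of a separate `pd.to_datetime` call.

```python
import io
import pandas as pd

# Small made-up frame shaped like the stored data
idx = pd.date_range("2020-07-31 21:55", periods=5, freq="1min", name="date")
df = pd.DataFrame({"1. open": 1.1771, "4. close": 1.1771}, index=idx)

# Write to an in-memory buffer (stand-in for the csv file), then read it back
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)
restored = pd.read_csv(buffer, index_col="date", parse_dates=True)

# The reloaded index should hold the same timestamps, unique and in order
same_timestamps = list(restored.index) == list(df.index)
well_formed = restored.index.is_unique and restored.index.is_monotonic_increasing
print(same_timestamps, well_formed)
```

Running the same two checks on `df1` right before `to_csv` is one way to catch a failed merge before it overwrites the file.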
