# Automated Coin Watch API Data Extraction Scheduling Tool

#### Website: LIVE_COIN_WATCH
#### API: https://www.livecoinwatch.com/tools/api

The script below builds a small tool that uses Python's Requests and asyncio libraries to call the Live Coin Watch API and extract specific data at set times of day.

The goal is to collect API data on a schedule and then derive analytical insights from it with standard analysis tooling.

Resources:

https://docs.python.org/3/library/datetime.html

https://www.slingacademy.com/article/python-asyncio-run-a-task-at-a-certain-time-every-day/

https://www.askpython.com/python-modules/pandas/update-the-value-of-a-row-dataframe#:~:text=With%20the%20Python%20iloc%20%28%29%20method%2C%20it%20is,the%20same.%20Syntax%3A%20dataframe.iloc%5Bindex%5D%20%3D%20value%20Example%3A%20data.iloc

In [None]:
import numpy as np # numerical computing
import pandas as pd # data analysis (DataFrames)

import asyncio # scheduling asynchronous tasks
import requests # HTTP requests to the API

import json # JSON encoding/decoding
import datetime # dates, times, and timedeltas

API Parameters

In [None]:
url = "https://api.livecoinwatch.com/coins/list"

payload = json.dumps({
  "currency": "USD",
  "sort": "rank",
  "order": "ascending",
  "offset": 0,
  "limit": 100,
  "meta": False
})
headers = {
  'content-type': 'application/json',
  'x-api-key': '98b16066-5b97-48bd-aa0c-66fae75a6db7'
}
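
The header above embeds the API key directly in the notebook. As a minimal sketch of an alternative, the key could be read from an environment variable so it never appears in the source. The variable name `LIVECOINWATCH_API_KEY` and the helper `build_headers` are assumptions made for this illustration; neither is defined by the API.

```python
import os

def build_headers(env=os.environ):
    """Build the request headers, reading the API key from the environment."""
    # LIVECOINWATCH_API_KEY is a hypothetical variable name chosen for this sketch
    api_key = env.get("LIVECOINWATCH_API_KEY")
    if not api_key:
        raise RuntimeError("Set LIVECOINWATCH_API_KEY before running the scraper")
    return {
        "content-type": "application/json",
        "x-api-key": api_key,
    }
```

With the variable set in the shell that launches the notebook, `headers = build_headers()` would stand in for the literal dictionary.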

Asyncio co-routine timer

In [None]:
# Wait until the next occurrence of `time`, then run the given coroutine
async def run_at_time(time, co_routine):
    print("coroutine_timer")
    # Get the current local date and time
    now = datetime.datetime.now()
    print("run_at_time_now")
    print(now)

    # Calculate the delay until the next occurrence of `time`;
    # the modulo wraps a time that has already passed today to the same time tomorrow
    delay = ((time - now) % datetime.timedelta(days=1)).total_seconds()
    print("run_at_time_delay")
    print(delay)

    # Sleep until then; `await` yields control to the event loop while waiting
    await asyncio.sleep(delay)

    # Run the coroutine
    return await co_routine
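
The delay calculation in `run_at_time` relies on `timedelta` modulo arithmetic. The sketch below isolates that one expression in a helper, `seconds_until` (a name introduced here for illustration), and evaluates it for a time that is still ahead today and one that has already passed. The date is arbitrary.

```python
import datetime

def seconds_until(target, now):
    """Seconds from `now` until the next daily occurrence of `target`'s time of day."""
    # timedelta % timedelta wraps a negative difference into the range [0, 1 day)
    return ((target - now) % datetime.timedelta(days=1)).total_seconds()

day = datetime.date(2024, 1, 1)  # arbitrary illustrative date
target = datetime.datetime.combine(day, datetime.time(13, 30))

before = datetime.datetime.combine(day, datetime.time(13, 0))
after = datetime.datetime.combine(day, datetime.time(14, 0))

print(seconds_until(target, before))  # 1800.0 (30 minutes ahead, same day)
print(seconds_until(target, after))   # 84600.0 (23.5 hours ahead, next day)
```

Because of the wrap-around, a `time` object built once for today's date keeps working on later days without being recomputed.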

Empty Data Structures 

In [None]:
cd, r, v, cp, time_stamp = ([] for _ in range(5)) # initialize five empty lists

# Create empty base DataFrames (declared globally); the API data from each scrape is merged into them
base_df = pd.DataFrame({'Coin':cd,'Rate':r,'Volume':v,'Cap':cp})
base_metadata_df = pd.DataFrame({'API-Request-Timestamp':time_stamp})

Asyncio co-routine tasks/jobs

In [None]:
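# This is the coroutine that will be run at each scheduled time
async def api_scraper():
    # Send the POST request to the API
    # (requests is synchronous, so this call blocks the event loop until it returns)
    response = requests.request("POST", url, headers=headers, data=payload)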
    
    # Record when the request completed (close to the time the server was hit)
    dt_request = datetime.datetime.now()
    time_stamp.append(dt_request) # append to the pre-defined global list
    print(dt_request)
        
    # Metadata: rebuild the timestamp frame and merge it with the base metadata frame
    global meta_dataset # declared global so it can be accessed outside the function
    metadata = pd.DataFrame({'API-Request-Timestamp':time_stamp})
    meta_dataset = pd.merge(base_metadata_df, metadata, how="outer")

    # Response -> JSON
    data = response.json()
    
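    # API data extraction: append each coin's fields to the global lists
    for item in data:
        cd.append(item['code'])
        r.append(item['rate'])
        v.append(item['volume'])
        cp.append(item['cap'])

    # Build the DataFrame once, after the loop, so it exists even if `data` is empty
    df = pd.DataFrame({'Coin':cd,'Rate':r,'Volume':v,'Cap':cp})
    print(df.tail(4)) # display the last four rows

    # Merge with the base frame; declared global so it can be accessed outside the function
    global dataset
    dataset = pd.merge(base_df, df, how="outer")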
    
    print(datetime.datetime.now())
    print("%-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+$")
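
Since `requests` is synchronous, the event loop cannot run anything else while `api_scraper` waits on the network. One possible way around this, sketched below, is to hand the blocking call to a worker thread with `asyncio.to_thread` (available from Python 3.9). The functions `blocking_fetch`, `heartbeat`, and `demo` are stand-ins written for this illustration, and the payload is made up; no real request is sent.

```python
import asyncio
import time

def blocking_fetch():
    """Stand-in for a synchronous call such as requests.request(...)."""
    time.sleep(0.2)  # simulates network latency
    return [{"code": "AAA", "rate": 1.0}]  # made-up sample payload

async def heartbeat(ticks):
    """Records three ticks, each one proof that the event loop got control."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(1)

async def demo():
    ticks = []
    # The blocking call runs in a worker thread while heartbeat runs on the loop
    data, _ = await asyncio.gather(asyncio.to_thread(blocking_fetch), heartbeat(ticks))
    return data, len(ticks)
```

In a notebook cell, `await demo()` runs the sketch; in a plain script, `asyncio.run(demo())` does the same. Applied to the scraper, the request line would become `response = await asyncio.to_thread(requests.request, "POST", url, headers=headers, data=payload)`.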

In [None]:
async def main():
    # Declare datetime objects for today's two run times (13:30 and 13:35)
    time1 = datetime.datetime.combine(datetime.date.today(), datetime.time(13,30))
    time2 = datetime.datetime.combine(datetime.date.today(), datetime.time(13,35))
    print("main()-->>task-->>creator")
    print(time1)
    print(time2)
    print("%-+-+-+-+-+-+-+-+-+-+-+$")

    # Run api_scraper at 13:30 and 13:35 every day
    while True:
        await run_at_time(time1, api_scraper())
        await run_at_time(time2, api_scraper())
        print("main()-close")
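
`main()` waits for `time1` and then `time2` in a fixed order, so if the notebook is started between the two times, the first wait runs until tomorrow and today's second slot is skipped. A minimal sketch of an alternative is to pick whichever scheduled time comes next. The helper `next_run` and the sample moment are assumptions made for this illustration.

```python
import datetime

def next_run(times_of_day, now):
    """Return the earliest upcoming datetime among several daily times."""
    one_day = datetime.timedelta(days=1)
    candidates = []
    for t in times_of_day:
        target = datetime.datetime.combine(now.date(), t)
        # the modulo rolls a time that has already passed over to tomorrow
        candidates.append(now + (target - now) % one_day)
    return min(candidates)

schedule = [datetime.time(13, 30), datetime.time(13, 35)]
now = datetime.datetime(2024, 1, 1, 13, 32)  # arbitrary illustrative moment
print(next_run(schedule, now))  # 2024-01-01 13:35:00
```

A loop built on this helper would sleep until `next_run(...)`, run the scraper, and repeat, regardless of when it was started.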

Create Task - Event Loop

In [None]:
# Get the running event loop (a Jupyter kernel already runs one;
# in a plain script with no running loop this call raises RuntimeError)
event_loop = asyncio.get_running_loop()

# Schedule main() as a task on the running loop
if event_loop.is_running(): # is_running() returns True if the event loop is running
    task = asyncio.create_task(main()) # create_task schedules the coroutine on the running loop
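
The cell above assumes an event loop is already running, which holds inside Jupyter but not in a plain Python script. The sketch below shows one way to support both cases. The helper `start` and the placeholder coroutine `demo_main` are names introduced for this illustration.

```python
import asyncio

async def demo_main():
    """Placeholder for main(); finishes immediately so the sketch terminates."""
    await asyncio.sleep(0)
    return "done"

def start(coro_func):
    """Run coro_func() whether or not an event loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running (plain script): start one and block until completion
        return asyncio.run(coro_func())
    # A loop is running (e.g. a notebook): schedule a task and return it at once
    return asyncio.create_task(coro_func())
```

In a script, `start(demo_main)` returns the coroutine's result; in a notebook it returns a `Task` that can be awaited or cancelled later.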

Conclusion

The asyncio module lets you run multiple tasks concurrently without blocking the main thread of execution. This can improve the performance and responsiveness of a program, especially for IO-bound operations such as network requests, file operations, or database queries. In this notebook, we used asyncio to schedule an API extraction job at fixed times each day. The code is simple, but the core idea carries over unchanged to larger applications.

DATA ENGINEERING
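
The plotting cell further down reads `TimestampUTC` and `Rate` columns from a frame named `analysis`, which this notebook does not build. As a rough sketch of one way to prepare such data, each API response could be flattened into rows tagged with the request time. The helper `extract_rows` and the sample payload are made up for this illustration; only the field names `code`, `rate`, `volume`, and `cap` come from the scraper above.

```python
import datetime

def extract_rows(data, fetched_at):
    """Flatten one API response into rows tagged with the request time."""
    return [
        {
            "TimestampUTC": fetched_at,
            "Coin": item["code"],
            "Rate": item["rate"],
            "Volume": item["volume"],
            "Cap": item["cap"],
        }
        for item in data
    ]

# made-up sample payload in the shape the scraper reads
sample = [
    {"code": "AAA", "rate": 1.5, "volume": 100, "cap": 1000},
    {"code": "BBB", "rate": 0.5, "volume": 50, "cap": 200},
]
fetched_at = datetime.datetime.now(datetime.timezone.utc)
rows = extract_rows(sample, fetched_at)

# keep one coin's rows to get a rate-over-time series
aaa_rows = [row for row in rows if row["Coin"] == "AAA"]
print(aaa_rows[0]["Rate"])  # 1.5
```

Rows accumulated over several scrapes could then be passed to `pd.DataFrame(rows)` to obtain a frame with the columns the plot expects.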


DATA VISUALIZATION

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

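# plot settings (pre-defined)
plt.style.use('ggplot')
fig = plt.figure(figsize=(20,7))
plt.title("Cryptocurrency Analytics")
plt.xlabel("Timestamp UTC")
plt.ylabel("Rate")

# generate line plot; `analysis` is not defined in this notebook and is assumed to be
# a DataFrame with 'TimestampUTC' and 'Rate' columns from the data engineering step
plt.plot(analysis['TimestampUTC'],analysis['Rate'])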
plt.show()