Main point of the paper:

For stable prediction performance in unseen price ranges, the paper employs a change point detection (CPD) technique. CPD identifies the points where the statistical properties of the time series change; the data is then split at those points and each segment is normalized separately. In addition, on-chain data, the unique records stored on the blockchain that are inherent to cryptocurrencies, are collected and used as input variables to predict prices. Finally, the paper proposes a self-attention-based multiple long short-term memory (SAM-LSTM) prediction model, which consists of multiple LSTM modules and an attention mechanism. Each LSTM module processes a different group of on-chain variables.
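
To make the "normalize each segment separately" idea concrete, here is a minimal sketch. It assumes min-max scaling and a hand-written list of breakpoints; the function name `normalize_by_segment`, the scaling choice, and the toy prices are illustrative assumptions, not taken from the paper.

```python
def normalize_by_segment(values, breakpoints):
    """Min-max scale each segment of `values` to [0, 1] independently.

    `breakpoints` lists the end index (exclusive) of every segment,
    with the last entry equal to len(values).
    """
    normalized = []
    start = 0
    for end in breakpoints:
        segment = values[start:end]
        low, high = min(segment), max(segment)
        span = high - low
        # A constant segment has no spread, so map it to 0.0
        normalized.extend((x - low) / span if span else 0.0 for x in segment)
        start = end
    return normalized


# Toy series with an obvious regime change after the third point
prices = [1.0, 2.0, 3.0, 100.0, 150.0, 200.0]
scaled = normalize_by_segment(prices, breakpoints=[3, 6])
```

With a single global normalization, the first three points would be squeezed close to zero; scaling per segment keeps the shape of each regime visible to the model.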

Why on-chain data?

On-chain data is information acquired from the blockchain. It contains valuable information about the blockchain network, including transactions, block size, and mining difficulty. Because these records are unique to cryptocurrencies, existing traditional asset classification criteria and indicators cannot be directly applied to cryptocurrency. A novel approach that reflects cryptocurrency’s distinct characteristics is therefore needed for successful applications.

Why change point detection to segment the data?

When the price moves into a range that has not been seen before, machine learning models constructed on the earlier data cannot predict future prices accurately. This problem is not specific to certain prediction algorithms; it can affect practically every prediction model built on price data within a moderate range. The paper therefore proposes to address this problem using a change point detection (CPD) technique.
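
The notes below name PELT as the CPD technique. The following is a minimal pure-Python sketch of the PELT recursion, assuming a squared-error (mean-shift) cost. The cost function, the `penalty` value, and the toy signal are illustrative assumptions, not the paper's settings; in practice a library implementation would likely be preferable.

```python
def pelt_mean_shift(signal, penalty):
    """Return segment end indices (exclusive) found by PELT.

    Cost of a segment = sum of squared deviations from its mean.
    The last returned index is always len(signal).
    """
    n = len(signal)
    # Prefix sums give each segment cost in O(1)
    s1, s2 = [0.0], [0.0]
    for x in signal:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(a, b):
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)   # best[t]: optimal penalized cost of signal[:t]
    best[0] = -penalty
    last = [0] * (n + 1)     # last[t]: start of the final segment ending at t
    candidates = [0]
    for t in range(1, n + 1):
        values = [(best[s] + cost(s, t) + penalty, s) for s in candidates]
        best[t], last[t] = min(values)
        # Pruning: drop starts that can never be optimal for later t
        candidates = [s for v, s in values if v - penalty <= best[t]]
        candidates.append(t)

    # Walk back through the stored segment starts
    breakpoints = []
    t = n
    while t > 0:
        breakpoints.append(t)
        t = last[t]
    return sorted(breakpoints)


# Toy signal: the level jumps from 0 to 10 at index 20
toy_signal = [0.0] * 20 + [10.0] * 20
toy_breakpoints = pelt_mean_shift(toy_signal, penalty=5.0)
```

A larger `penalty` yields fewer segments, so it controls how strongly the price history is split before normalization.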

Prediction Framework:

The proposed framework consists of five phases.
1. An extensive set of variables is collected from on-chain data.
2. Variables are selected from the acquired dataset based on significant statistical correlation.
3. The input data is segmented using a CPD technique (PELT).
4. The price prediction model, composed of LSTM modules and the attention mechanism, is constructed.
5. The experimental setup is defined, including data preprocessing, evaluation metrics, and implementation details. The complete algorithm for the proposed price prediction framework is presented in Algorithm 1 of the paper.
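
Phase 2 can be sketched as a simple correlation filter. This is only an illustration: the Pearson coefficient, the `threshold` of 0.5, and the toy numbers are assumptions, since the notes above do not state the exact selection criterion used in the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def select_variables(features, target, threshold=0.5):
    """Keep the feature names whose |correlation| with target meets threshold."""
    return [
        name
        for name, column in features.items()
        if abs(pearson(column, target)) >= threshold
    ]


# Toy data (made up): one feature tracks the price, the other does not
price = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "average_difficulty": [10.0, 20.0, 30.0, 40.0, 50.0],
    "average_nonce": [3.0, 1.0, 3.0, 1.0, 3.0],
}
selected = select_variables(features, price)
```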

*Dune API*

To get the API key:
1. Create an account on Dune
2. In the top left corner, click your username, then click the settings icon
3. Click create new key, and toggle all endpoints
4. Copy the API key immediately, because it won't be shown again
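
As an alternative to storing the key in a .py file, the key could be read from an environment variable so it never ends up in version control. This is a sketch only: the helper `load_api_key` and the variable name `DUNE_API_KEY` are assumptions chosen for illustration.

```python
import os


def load_api_key(env_var="DUNE_API_KEY"):
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Dune API key"
        )
    return key
```

The returned string can then be passed to `DuneClient(...)` in place of `dapi.API_KEY`.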

To create a query:
1. Click Create, then New Query
2. In the data explorer, click Raw blockchain data
3. You'll see the data tables for all the tokens
4. Write your SQL Query, then Save it
5. Click 'API', then 'Python', and copy the number that is passed as a parameter in the generated snippet. This is the query ID you'll need to fetch the results from Python.

In [None]:
#!pip install dune-client

In [None]:
from dune_client.client import DuneClient
import os
import dune_api as dapi #.py file storing the API key
import pandas as pd
import numpy as np

In [None]:
dune = DuneClient(dapi.API_KEY)
# 4771564 is the query ID copied from the 'API' tab of the saved query
query_result = dune.get_latest_result_dataframe(4771564)
query_result

Unnamed: 0,time,height,date,hash,bits,chainwork,difficulty,total_fees,total_reward,mint_reward,merkle_root,transaction_count,nonce,coinbase,previous_block_hash,size,stripped_size,version,weight
0,2009-01-12 07:16:40.000 UTC,187,2009-01-12,0x00000000b2cde2159116889837ecf300bd77d229d49b...,0x1d00ffff,0x00000000000000000000000000000000000000000000...,1,0,50,50,0x52478db532c594119845eb8260d640d78798e4b7fcb7...,2,8.537211e+08,0x04ffff001d011a,0x000000008b3ff2aaf3427f2a624cb9978e687d9fbba5...,414,414,1,1656
1,2009-09-30 16:39:14.000 UTC,24066,2009-09-30,0x00000000e27718aaaeaa09d7cd5ce2225ce3a68546bf...,0x1d00ffff,0x00000000000000000000000000000000000000000000...,1,0,50,50,0xbf7678ef9a133da17bbf5ea290c8e63fc064624fc3a5...,1,7.970128e+06,0x04ffff001d02d601,0x000000002d8d74ab18532a327f71888326d44a0423fb...,216,216,1,864
2,2009-08-02 00:13:44.000 UTC,20404,2009-08-02,0x0000000089ad3ca8b464c87f920fa3346bf97c6961e2...,0x1d00ffff,0x00000000000000000000000000000000000000000000...,1,0,50,50,0xaa7b5e1d47d4d2f434dd34f7ee779afae0237399df10...,1,1.010306e+09,0x04ffff001d02dd00,0x00000000b15d08bc4171c6ddc87d76c9a1daf852490a...,216,216,1,864
3,2009-09-30 16:23:58.000 UTC,24070,2009-09-30,0x000000004559012842720edfbc699322884f9a887c98...,0x1d00ffff,0x00000000000000000000000000000000000000000000...,1,0,50,50,0x7ce2f811fe5c8c0cc28f2d5fa5f93de9b79427960b82...,1,1.441022e+08,0x04ffff001d02ab00,0x000000004d22746fecd8bae3dfdb5ac67f91d567d878...,216,216,1,864
4,2009-09-30 16:24:57.000 UTC,24071,2009-09-30,0x00000000b0742187ae98b7bae07bf13519c39004ead5...,0x1d00ffff,0x00000000000000000000000000000000000000000000...,1,0,50,50,0xb4f52ba8537f71aede351faf44e6b8bc21784eab7983...,1,2.602025e+07,0x04ffff001d02ac00,0x000000004559012842720edfbc699322884f9a887c98...,216,216,1,864
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,2009-09-14 17:31:37.000 UTC,22887,2009-09-14,0x00000000ac5b3a80bd9109ba9c1b1c636bf18f149241...,0x1d00ffff,0x00000000000000000000000000000000000000000000...,1,0,50,50,0xd57952908ae4a4d16f34d6839e6fd2965c6902e0a4ee...,1,7.023074e+08,0x04ffff001d0152,0x00000000086d44e8888394d5e67e5a61dde6035270f0...,215,215,1,860
96,2009-09-14 19:15:30.000 UTC,22892,2009-09-14,0x000000005146198029e7c7686ecaad57cb4bb73b898d...,0x1d00ffff,0x00000000000000000000000000000000000000000000...,1,0,50,50,0x04f535e0923ad2c5469d24cffa6c886cbf2b8447260a...,1,3.267854e+09,0x04ffff001d024b04,0x00000000b9f2ba9023168fef2d51a4e5fbbfefbece09...,216,216,1,864
97,2009-09-14 20:42:25.000 UTC,22896,2009-09-14,0x00000000f5a765395ac44251cc796cec5a80f156f8c0...,0x1d00ffff,0x00000000000000000000000000000000000000000000...,1,0,50,50,0xc60fe5e11e0b9cc73ffa3d6ee5a3b6640a516a5aed78...,1,2.794212e+08,0x04ffff001d028404,0x00000000fe1407a8aa60ce22101536ed58d3a31f1a66...,216,216,1,864
98,2009-09-14 21:19:26.000 UTC,22898,2009-09-14,0x00000000d61c82291688a50348a88efcce7260d3c23d...,0x1d00ffff,0x00000000000000000000000000000000000000000000...,1,0,50,50,0x07b2be2923115fb2f194154875210525e750bc4e083c...,1,3.174085e+09,0x04ffff001d029204,0x00000000c7add2c251745b2529e3aca477356186333a...,216,216,1,864


^ This is a test query to get an idea of which columns there are. 

In [None]:
# Query used to generate this:

'''
SELECT * FROM bitcoin.blocks
LIMIT 100

'''

In [21]:
query_result_BTC = dune.get_latest_result_dataframe(4772504).sort_values('date')
query_result_BTC

Unnamed: 0,date,average_height,average_difficulty,average_total_fees,average_total_reward,average_mint_reward,average_transaction_count,average_nonce,average_size,average_stripped_size,average_version,average_weight
3171,2015-01-01,336943.0,4.064096e+10,0.048813,25.048813,25.000,359.660606,2.133365e+09,1.874837e+05,187483.715152,2.000000e+00,7.499349e+05
1861,2015-01-02,337108.0,4.064096e+10,0.071653,25.071653,25.000,480.527273,2.098411e+09,2.651373e+05,265137.327273,2.000000e+00,1.060549e+06
3361,2015-01-03,337271.5,4.064096e+10,0.072830,25.072830,25.000,507.574074,2.122466e+09,2.951487e+05,295148.691358,2.000000e+00,1.180595e+06
3100,2015-01-04,337438.5,4.064096e+10,0.126445,25.126445,25.000,498.220930,1.946171e+09,2.822432e+05,282243.232558,2.000000e+00,1.128973e+06
2270,2015-01-05,337602.5,4.064096e+10,0.103150,25.103150,25.000,612.724359,2.195475e+09,3.609607e+05,360960.737179,2.000000e+00,1.443843e+06
...,...,...,...,...,...,...,...,...,...,...,...,...
3107,2024-12-27,876590.5,1.085226e+14,0.043658,3.168658,3.125,2204.056962,2.036459e+09,1.546256e+06,815842.310127,6.622845e+08,3.993782e+06
726,2024-12-28,876741.0,1.085226e+14,0.039583,3.164583,3.125,2087.972028,2.069825e+09,1.685794e+06,759976.083916,6.450647e+08,3.965722e+06
1440,2024-12-29,876891.5,1.086103e+14,0.040919,3.165919,3.125,2652.607595,2.161659e+09,1.715176e+06,759595.803797,6.750958e+08,3.993964e+06
3647,2024-12-30,877050.0,1.097821e+14,0.044487,3.169487,3.125,2356.735849,1.973508e+09,1.759093e+06,744946.578616,6.526120e+08,3.993932e+06


In [None]:
# ^ Final data, 2015-2024. AVG only works on numeric datatypes. To include the other columns (such as bits),
# we would first need to convert them to numeric values and then take the average

# Query used to generate this:

'''

SELECT date, AVG(height) AS average_height, AVG(difficulty) AS average_difficulty, AVG(total_fees) as average_total_fees,
AVG(total_reward) AS average_total_reward, AVG(mint_reward) AS average_mint_reward, AVG(transaction_count) AS average_transaction_count, 
AVG(nonce) AS average_nonce, AVG(size) AS average_size, AVG(stripped_size) AS average_stripped_size, 
AVG(version) AS average_version, AVG(weight) AS average_weight  
FROM bitcoin.blocks
WHERE date >= DATE '2015-01-01' AND date < DATE '2025-01-01'
GROUP BY date

'''
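
The comment above notes that columns such as bits must be converted to numeric values before averaging. A minimal sketch of that conversion in Python, assuming the values arrive as hex strings like those in the test query output (the helper name `hex_to_int` is hypothetical):

```python
def hex_to_int(value):
    """Convert a hex string such as '0x1d00ffff' to an integer."""
    return int(value, 16)


# 'bits' values as they appear in the first rows of bitcoin.blocks
bits_values = ["0x1d00ffff", "0x1d00ffff"]
average_bits = sum(hex_to_int(v) for v in bits_values) / len(bits_values)
```

On a dataframe of non-aggregated blocks, the same helper could be applied with `df['bits'].apply(hex_to_int)` before grouping by date. Values that are displayed truncated (ending in `...`) cannot be converted this way.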

In [None]:
# SQL Query for SOL
'''
SELECT 
  DATE_TRUNC('hour', time) AS hour,
  AVG(height) AS average_height, AVG(total_transactions) AS average_total_transactions, 
  AVG(successful_transactions) as average_successful_transactions, AVG(failed_transactions) as average_failed_transactions,
  AVG(total_vote_transactions) AS average_total_vote_transactions, AVG(total_non_vote_transactions) AS average_total_non_vote_transactions, 
  AVG(successful_vote_transactions) AS average_successful_vote_transactions, AVG(successful_non_vote_transactions) AS average_successful_non_vote_transactions, 
  AVG(failed_vote_transactions) AS average_failed_vote_transactions, AVG(failed_non_vote_transactions) AS average_failed_non_vote_transactions, 
  AVG(num_reward_partitions) AS average_num_reward_partitions  
FROM solana.blocks
WHERE time >= TIMESTAMP '2015-01-01 00:00:00' 
  AND time < TIMESTAMP '2025-02-24 00:00:00'
GROUP BY DATE_TRUNC('hour', time);
'''

In [None]:
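sol_query_id = None  # Replace with the actual query ID from Dune
# The SOL query aliases the truncated timestamp as 'hour', so sort on that column
query_result_SOL = dune.get_latest_result_dataframe(sol_query_id).sort_values('hour')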
query_result_SOL