<span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 01: Feature Backfill for Solana and Bitcoin</span>


## 🗒️ The tasks of this notebook
1. Download historical prices for Solana and Bitcoin as CSV files.
2. Update the paths of the CSV files in this notebook to point to the files you downloaded.
3. Create an account on www.hopsworks.ai and get your HOPSWORKS_API_KEY.
4. Run the notebook to upload the features to the Hopsworks feature store.
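
Steps 1 and 2 can be sanity-checked before anything is uploaded. A minimal sketch, assuming the two file names used later in this notebook; `missing_files` is a hypothetical helper, not part of Hopsworks or `utils`:

```python
from pathlib import Path

# File names as used later in this notebook; adjust them to wherever you saved the downloads.
CSV_PATHS = [
    "data/SOL_USD Binance Historical Data.csv",
    "data/BTC_USD Binance Historical Data.csv",
]

def missing_files(paths):
    """Return the subset of `paths` that does not point to an existing file."""
    return [p for p in paths if not Path(p).is_file()]

missing = missing_files(CSV_PATHS)
if missing:
    print("Missing CSV files:", missing)
else:
    print("All CSV files found.")
```

Running this first gives a clearer message than the `FileNotFoundError` that `pd.read_csv` would raise partway through the notebook.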



### <span style='color:#ff5f27'> 📝 Imports</span>

In [1]:
import pandas as pd
import hopsworks
from utils import *
import json
import os
import warnings
from dotenv import load_dotenv

warnings.filterwarnings("ignore")



### IF YOU WANT TO WIPE OUT ALL OF YOUR FEATURES AND MODELS, run the cell below

In [2]:
# If you haven't set the env variable 'HOPSWORKS_API_KEY', uncomment the next two lines to read your API key from a file
# with open('../../data/hopsworks-api-key.txt', 'r') as file:
#     os.environ["HOPSWORKS_API_KEY"] = file.read().rstrip()
# To purge the project, also uncomment the next two lines
# proj = hopsworks.login()
# util.purge_project(proj)
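
The fallback described in the comments above (use the environment variable if it is set, otherwise read the key from a file) can be sketched as a small helper. `resolve_api_key` is a hypothetical name, and the key-file path is whatever you pass in:

```python
import os

def resolve_api_key(key_file, env=os.environ, var="HOPSWORKS_API_KEY"):
    """Return the API key from `env` if set, otherwise the contents of `key_file`
    with trailing whitespace removed."""
    key = env.get(var)
    if key:
        return key
    with open(key_file, "r") as file:
        return file.read().rstrip()
```

It could then be used as `os.environ["HOPSWORKS_API_KEY"] = resolve_api_key("../../data/hopsworks-api-key.txt")` before calling `hopsworks.login()`.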

### Connect to Hopsworks and upload historical data

---

In [3]:
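load_dotenv()  # load HOPSWORKS_API_KEY from a local .env file, if one exists
if not os.getenv("HOPSWORKS_API_KEY"):
    raise RuntimeError("HOPSWORKS_API_KEY is not set; add it to .env or export it first")
project = hopsworks.login()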

2024-12-21 11:57:17,456 INFO: Initializing external client
2024-12-21 11:57:17,456 INFO: Base URL: https://c.app.hopsworks.ai:443
2024-12-21 11:57:19,140 INFO: Python Engine initialized.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/1164448


### Add historical data to the Hopsworks feature store

#### Add historical Solana prices

In [19]:
hist_data_sol = pd.read_csv("data/SOL_USD Binance Historical Data.csv")
hist_data_sol.columns = ['date', 'price', 'open', 'high', 'low', 'vol', 'change']
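
Renaming the columns does not change their types. Exports of this kind often store numbers as text, for example `97,123.4` for prices, `1.52M` for volume and `-2.31%` for the daily change. That format is an assumption here, so check `hist_data_sol.dtypes` first. If it applies, a hypothetical `parse_number` helper along these lines could convert the values:

```python
# Multipliers for the magnitude suffixes assumed to appear in the volume column.
SUFFIXES = {"K": 1e3, "M": 1e6, "B": 1e9}

def parse_number(text):
    """Convert strings such as '97,123.4', '1.52M' or '-2.31%' to float.

    Returns None for empty values or a bare '-' placeholder.
    """
    # Drop surrounding whitespace, thousands separators and a trailing percent sign
    cleaned = str(text).strip().replace(",", "").rstrip("%")
    if cleaned in ("", "-"):
        return None
    multiplier = 1.0
    if cleaned[-1].upper() in SUFFIXES:
        multiplier = SUFFIXES[cleaned[-1].upper()]
        cleaned = cleaned[:-1]
    return float(cleaned) * multiplier
```

Applied to a column, this would look like `hist_data_sol["vol"] = hist_data_sol["vol"].map(parse_number)`.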

In [20]:
fs = project.get_feature_store() 
solana_fg = fs.get_or_create_feature_group(
    name='solana',
    description='Solana price',
    version=2,
    primary_key=["date"])

solana_fg.insert(hist_data_sol)


Uploading Dataframe: 100.00% |██████████| Rows 31/31 | Elapsed Time: 00:01 | Remaining Time: 00:00


(Job('solana_2_offline_fg_materialization', 'SPARK'), None)

In [21]:
solana_fg.update_feature_description("date", "Date")
solana_fg.update_feature_description("price", "The price of Solana")
solana_fg.update_feature_description("open", "The opening price of Solana")
solana_fg.update_feature_description("high", "The highest price of Solana")
solana_fg.update_feature_description("low", "The lowest price of Solana")
solana_fg.update_feature_description("vol", "Volume")
solana_fg.update_feature_description("change", "Change in price")


<hsfs.feature_group.FeatureGroup at 0x15f21f3a0>
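
The seven calls above differ only in the feature name and its text, so they can be driven from a dictionary. A sketch, where `apply_descriptions` is a hypothetical helper that relies only on the `update_feature_description(name, description)` method already used in this notebook:

```python
def apply_descriptions(feature_group, descriptions):
    """Call update_feature_description once per (feature name, description) pair."""
    for name, description in descriptions.items():
        feature_group.update_feature_description(name, description)

# Same names and texts as in the cell above
SOLANA_DESCRIPTIONS = {
    "date": "Date",
    "price": "The price of Solana",
    "open": "The opening price of Solana",
    "high": "The highest price of Solana",
    "low": "The lowest price of Solana",
    "vol": "Volume",
    "change": "Change in price",
}
```

With the feature group from the earlier cell, the call would be `apply_descriptions(solana_fg, SOLANA_DESCRIPTIONS)`. The same helper would work for the Bitcoin feature group with its own dictionary.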

#### Add historical data for Bitcoin

In [7]:
hist_data_btc = pd.read_csv("data/BTC_USD Binance Historical Data.csv")
hist_data_btc.columns = ['date', 'price', 'open', 'high', 'low', 'vol', 'change']

In [8]:
fs = project.get_feature_store() 
bitcoin_fg = fs.get_or_create_feature_group(
    name='bitcoin',
    description='Bitcoin price',
    version=1,
    primary_key=["date"])

bitcoin_fg.insert(hist_data_btc)

Uploading Dataframe: 100.00% |██████████| Rows 31/31 | Elapsed Time: 00:01 | Remaining Time: 00:00


Launching job: bitcoin_1_offline_fg_materialization
Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai:443/p/1164448/jobs/named/bitcoin_1_offline_fg_materialization/executions


(Job('bitcoin_1_offline_fg_materialization', 'SPARK'), None)

In [9]:
bitcoin_fg.update_feature_description("date", "Date")
bitcoin_fg.update_feature_description("price", "The price of Bitcoin")
bitcoin_fg.update_feature_description("open", "The opening price of Bitcoin")
bitcoin_fg.update_feature_description("high", "The highest price of Bitcoin")
bitcoin_fg.update_feature_description("low", "The lowest price of Bitcoin")
bitcoin_fg.update_feature_description("vol", "Volume")
bitcoin_fg.update_feature_description("change", "Change in price")


<hsfs.feature_group.FeatureGroup at 0x105bee4d0>

#### Add historical data for the Fear and Greed Index

In [14]:
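import requests
import pandas as pd

# URL of the Fear and Greed Index API, requesting CSV-formatted rows
url = "https://api.alternative.me/fng/?limit=0&format=csv"

# Fetch data from the API; the timeout keeps the cell from hanging indefinitely
response = requests.get(url, timeout=30)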

if response.status_code == 200:
    content = response.text
    if "data" in content:
        # The CSV rows sit between the first "[" and the next "]" of the response
        start_idx = content.find("[") + 1
        end_idx = content.find("]", start_idx)
        raw_data = content[start_idx:end_idx].strip()
        
        # Remove single quotes and braces so only comma-separated values remain
        raw_data = raw_data.replace("'", "").replace("{", "").replace("}", "")
        
        # Debug: Print raw_data to check its format
        #print("Raw data:", raw_data)
        
        # Split into rows
        rows = raw_data.split("\n")
        structured_data = []
        
        for row in rows:
            # Debug: print each row to check its format
            print("Processing row:", row)
            if row == "fng_value,fng_classification,date":
                # Skip the header row; its column order does not match the data rows
                continue
            
            # Each data row holds three comma-separated fields: date, value, classification
            fields = [field.strip() for field in row.split(",")]
            if len(fields) == 3:
                structured_data.append(fields)
            else:
                print("Skipping malformed row:", row)
        
        # Create DataFrame
        fng_df = pd.DataFrame(structured_data, columns=["date", "fng_value", "fng_classification"])
        print(fng_df.head())
    else:
        print("Data field not found in response.")
else:
    print(f"Failed to fetch data: {response.status_code}")


Processing row: fng_value,fng_classification,date
Processing row: 21-12-2024,73,Greed
Processing row: 20-12-2024,74,Greed
Processing row: 19-12-2024,75,Greed
Processing row: 18-12-2024,81,Extreme Greed
Processing row: 17-12-2024,87,Extreme Greed
Processing row: 16-12-2024,83,Extreme Greed
Processing row: 15-12-2024,80,Extreme Greed
Processing row: 14-12-2024,83,Extreme Greed
Processing row: 13-12-2024,76,Extreme Greed
Processing row: 12-12-2024,83,Extreme Greed
Processing row: 11-12-2024,74,Greed
Processing row: 10-12-2024,78,Extreme Greed
Processing row: 09-12-2024,78,Extreme Greed
Processing row: 08-12-2024,79,Extreme Greed
Processing row: 07-12-2024,75,Greed
Processing row: 06-12-2024,72,Greed
Processing row: 05-12-2024,84,Extreme Greed
Processing row: 04-12-2024,78,Extreme Greed
Processing row: 03-12-2024,76,Extreme Greed
Processing row: 02-12-2024,80,Extreme Greed
Processing row: 01-12-2024,81,Extreme Greed
Processing row: 30-11-2024,84,Extreme Greed
Processing row: 29-11-2024,78,
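
The same extraction can be written more compactly with the standard `csv` module. This is a sketch under the assumptions the cell above already makes: the CSV rows sit between the first `[` and the following `]`, and each data row is `date,value,classification`. `parse_fng_rows` and the sample text are hypothetical, shaped after the output shown above, and the sketch omits the quote and brace stripping:

```python
import csv
import io

# Header row as it appears in the response; its order differs from the data rows
HEADER = ["fng_value", "fng_classification", "date"]

def parse_fng_rows(content):
    """Return [date, fng_value, fng_classification] lists found between '[' and ']'."""
    start = content.find("[") + 1
    end = content.find("]", start)
    body = content[start:end].strip()
    records = []
    for fields in csv.reader(io.StringIO(body)):
        fields = [field.strip() for field in fields]
        # Skip blank lines, the header row, and anything without exactly three fields
        if len(fields) != 3 or fields == HEADER:
            continue
        records.append(fields)
    return records

# Hypothetical sample shaped like the rows printed above
sample = """{
"data": [
fng_value,fng_classification,date
21-12-2024,73,Greed
20-12-2024,74,Greed
]
}"""
print(parse_fng_rows(sample))
```

The returned list can be passed straight to `pd.DataFrame(records, columns=["date", "fng_value", "fng_classification"])`.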

In [15]:
fng_df

| | date | fng_value | fng_classification |
|---|---|---|---|
| 0 | 21-12-2024 | 73 | Greed |
| 1 | 20-12-2024 | 74 | Greed |
| 2 | 19-12-2024 | 75 | Greed |
| 3 | 18-12-2024 | 81 | Extreme Greed |
| 4 | 17-12-2024 | 87 | Extreme Greed |
| ... | ... | ... | ... |
| 2507 | 05-02-2018 | 11 | Extreme Fear |
| 2508 | 04-02-2018 | 24 | Extreme Fear |
| 2509 | 03-02-2018 | 40 | Fear |
| 2510 | 02-02-2018 | 15 | Extreme Fear |
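
The `date` column holds `DD-MM-YYYY` strings, which sort by day of month rather than chronologically. If chronological ordering matters, one option is to normalise the dates to ISO format before inserting. `to_iso_date` is a hypothetical helper, and the default format is taken from the rows shown above:

```python
from datetime import datetime

def to_iso_date(text, fmt="%d-%m-%Y"):
    """Convert a date string in `fmt` (default DD-MM-YYYY) to ISO YYYY-MM-DD."""
    return datetime.strptime(text.strip(), fmt).date().isoformat()

print(to_iso_date("21-12-2024"))  # 2024-12-21
```

On the DataFrame this would be `fng_df["date"] = fng_df["date"].map(to_iso_date)`. Whether the price CSVs use the same date format is not shown in this notebook, so check before reusing the default there.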


In [16]:
fs = project.get_feature_store() 
fng_fg = fs.get_or_create_feature_group(
    name='f_n_g_index',
    description='fear_and_greed_index',
    version=1,
    primary_key=["date"])

fng_fg.insert(fng_df)

Uploading Dataframe: 100.00% |██████████| Rows 2512/2512 | Elapsed Time: 00:02 | Remaining Time: 00:00


Launching job: f_n_g_index_1_offline_fg_materialization
Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai:443/p/1164448/jobs/named/f_n_g_index_1_offline_fg_materialization/executions


(Job('f_n_g_index_1_offline_fg_materialization', 'SPARK'), None)

#### Enter a description for each feature in the Feature Group

In [17]:
fng_fg.update_feature_description("date", "Date of the Fear and Greed Index")
fng_fg.update_feature_description("fng_value", "Fear and Greed Index value")
fng_fg.update_feature_description("fng_classification", "Fear and Greed Index classification")


<hsfs.feature_group.FeatureGroup at 0x14947cfa0>